/* This program selects the sample size so that the probability that
   R squared falls in a target interval is .95. The program accepts the
   first n for which that probability lies between .95 and .96.

   As the program is set up, the interval is rho squared (the population
   squared multiple correlation) plus and minus c.
   To make the interval rho squared minus c to 1.00, set switch=1.
   To make the interval .00 to rho squared plus c, set switch=2.

   The number of predictors is k and ranges from 2 to 20. These values
   can be changed in the first do statement.
   The population squared multiple correlation is rhosq and ranges from
   .05 to .90 in steps of .05. These values can be changed in the second
   do statement.
   Twice the target accuracy (2c, the full interval width) is acc and
   ranges from .10 to .40 in steps of .10. These values can be changed
   in the if statements inside the third do loop. */

options ps=65;

data gen;
  switch=0;
  array ll{4}  ll1-ll4;    /* lower limit of the R squared interval */
  array med{4} med1-med4;  /* median of R squared at the selected n */
  array uu{4}  uu1-uu4;    /* upper limit of the R squared interval */
  array ns{4}  nn1-nn4;    /* selected sample size                  */
  array ac{4}  acc1-acc4;  /* interval width; 0 means no n was found */

  do k=2 to 20;
    do rhosq=.05 to .90 by .05;
      do kk=1 to 4;
        if kk=1 then acc=.10;
        if kk=2 then acc=.20;
        if kk=3 then acc=.30;
        if kk=4 then acc=.40;

        /* Reset results so values from the previous row do not carry over. */
        ll{kk}=0; med{kk}=0; uu{kk}=0; ns{kk}=0; ac{kk}=0;
        start=k+2;

        /* Interval limits on the R squared scale, clipped to [0, 1]. */
        rsqul=rhosq+acc/2;
        if switch=1 then rsqul=1.00;
        rsqll=rhosq-acc/2;
        if switch=2 then rsqll=0.00;  /* was rsqul=0.00, which zeroed the upper limit */
        if rsqul>1.0 then rsqul=1.0;
        if rsqll<0.0 then rsqll=0.0;
        rhotidsq=rhosq/(1-rhosq);

        do nn=start to 10000;
          n=nn;
          df=n-1;
          df2=n-k-1;
          * rsqul=((n-k-1)/(n-1))*(rsqul)+(k/(n-1));
          * rsqll=((n-k-1)/(n-1))*(rsqll)+(k/(n-1));
          if rsqul=1.0 then rsqul=.99999;  /* avoid division by zero below */
          rtsqul=rsqul/(1-rsqul);
          rtsqll=rsqll/(1-rsqll);

          /* Parameters of the scaled noncentral F approximation. */
          gamma=sqrt(1+rhotidsq);
          phi1=df*(gamma**2-1)+k;
          phi2=df*(gamma**4-1)+k;
          phi3=df*(gamma**6-1)+k;
          g=(phi2-sqrt(phi2**2-phi1*phi3))/phi1;
          nu=(phi2-2*rhotidsq*gamma*(sqrt(df*df2)))/(g**2);
          lambda=(rhotidsq*gamma*(sqrt(df*df2)))/(g**2);

          /* Probability that R squared lies between the two limits. */
          ul=((df2*rtsqul)/(nu*g));
          sl=((df2*rtsqll)/(nu*g));
          pu=probf(ul,nu,df2,lambda);
          pl=probf(sl,nu,df2,lambda);
          pd=pu-pl;

          if .95 <= pu-pl <= .96 then do;
            /* Median of R squared and its adjusted counterpart. */
            rtisq50=((nu*g)/(df2))*(finv(.5000,nu,df2,lambda));
            rsq50=rtisq50/(1+rtisq50);
            arsq50=rsq50*(df/df2)-(k/df2);
            rsqtop=rtsqul/(1+rtsqul);
            rsquu=rsqtop;
            arsquu=(df/df2)*rsqtop-(k/df2);
            rsqbot=rtsqll/(1+rtsqll);
            rsqll=rsqbot;
            arsqll=(df/df2)*rsqbot-(k/df2);
            ll{kk}=round(rsqll,.001);
            med{kk}=round(rsq50,.001);
            uu{kk}=round(rsquu,.001);
            ns{kk}=n;
            ac{kk}=acc;
            nn=10000;  /* force the nn loop to stop after the first hit */
          end;
        end;
      end;
      output;
    end;
  end;
run;

proc print;
  var k rhosq acc1 acc2 acc3 acc4 nn1 nn2 nn3 nn4
      ll1 med1 uu1 ll2 med2 uu2 ll3 med3 uu3 ll4 med4 uu4;
run;
quit;
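
/* Usage sketch, not part of the original program: the full listing has
   one row for every combination of k and rhosq, so it can help to print
   only the row for one design. The values k=5 and rhosq=.50 below are
   arbitrary illustrations. rhosq is rounded before the comparison
   because it is built up in steps of .05 and may not equal .50 exactly
   in floating point. A row where acc1-acc4 is 0 means no n in the
   search range gave a probability between .95 and .96. */
proc print data=gen noobs;
  where k=5 and round(rhosq,.01)=.50;
  var k rhosq acc1-acc4 nn1-nn4;
run;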