Copyright © 2014. All Rights Reserved.
StatsToDo : Sample Size for Pearson's Correlation Coefficient Explained and Table
Introduction

This page provides a sample size table for Pearson's Correlation Coefficient rho (ρ), as calculated using the Sample Size for Pearson's Correlation Coefficient Program Page. The table provides sample sizes for:

Power (1-β) of 0.8, 0.9, and 0.95
Probability of Type I Error (α) of 0.1, 0.05, 0.01, and 0.001
One or two tail models
Correlation Coefficient (ρ) at 0.02 intervals

Sample size calculation is based on the algorithm described by Machin et al., where the correlation coefficient ρ is first transformed to Fisher's Z, and the estimation is based on Z, assuming that Z is normally distributed.

Fisher's Z: Z = log((1+ρ) / (1-ρ)) / 2, where log is the natural logarithm
Sample size: n = ((zα+zβ) / Z)^2 + 3 for the one tail model

For the two tail model the α value is halved, e.g. for α=0.05, zα is 1.64 for one tail and 1.96 for two tails.

Power is calculated by reversing the same formula, so that zβ = Z(sqrt(n-3)) - zα, then converting zβ to a probability, which gives the power (1-β). Again, the value of zα depends on α and on whether the one or two tail model is used.

The confidence interval is also calculated by reversing the formula, where

Z = (zα+zβ) / sqrt(n-3)
SE = 1 / sqrt(n-3)
Confidence interval = Z ± zα(SE)

Z and the confidence limits are then converted back to correlation coefficients, where ρ = (exp(2Z) - 1) / (exp(2Z) + 1). The intervals are not symmetrical, being wider towards 0 and narrower towards 1 and -1.

On whether to use a one or two tail model

If the user intends to establish a 95% confidence interval for comparison with other correlation coefficients, or to use the results in a future meta-analysis, then a two tail model is appropriate.

More commonly, however, the interest is whether a significant correlation exists in the data observed, that is, whether the tail nearer to the null value of zero (0) overlaps it. In this case the one tail model suffices, and it requires a smaller sample size. In most publications, therefore, when the one or two tail model is not mentioned, the default is usually the one tail model.
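The formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the program page: the function names (`z_alpha`, `sample_size`, `achieved_power`, `confidence_interval`) are hypothetical, the standard normal deviates come from the standard library's `statistics.NormalDist`, the sample size is rounded up to the next whole number, and the confidence interval is taken around a correlation coefficient `r` supplied by the caller.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def z_alpha(alpha, tails=1):
    """Normal deviate for alpha; alpha is halved for the two tail model."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return _STD_NORMAL.inv_cdf(1.0 - alpha / tails)


def sample_size(rho, alpha=0.05, power=0.8, tails=1):
    """n = ((z_alpha + z_beta) / Z)^2 + 3, rounded up (rounding is an assumption)."""
    if not 0.0 < abs(rho) < 1.0:
        raise ValueError("rho must be non-zero and strictly between -1 and 1")
    z = math.atanh(abs(rho))  # Fisher's Z = log((1+rho)/(1-rho)) / 2
    z_beta = _STD_NORMAL.inv_cdf(power)
    return math.ceil(((z_alpha(alpha, tails) + z_beta) / z) ** 2 + 3)


def achieved_power(rho, n, alpha=0.05, tails=1):
    """Reverse the formula: z_beta = Z * sqrt(n-3) - z_alpha, then power = Phi(z_beta)."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z_beta = math.atanh(abs(rho)) * math.sqrt(n - 3) - z_alpha(alpha, tails)
    return _STD_NORMAL.cdf(z_beta)


def confidence_interval(r, n, alpha=0.05, tails=2):
    """Interval Z +/- z_alpha * SE on the Fisher scale, converted back to rho."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)
    half_width = z_alpha(alpha, tails) / math.sqrt(n - 3)  # SE = 1 / sqrt(n-3)
    # tanh(Z) equals (exp(2Z) - 1) / (exp(2Z) + 1)
    return math.tanh(z - half_width), math.tanh(z + half_width)


if __name__ == "__main__":
    print(sample_size(0.6, alpha=0.05, power=0.8, tails=1))  # 16
    print(sample_size(0.6, alpha=0.05, power=0.8, tails=2))  # 20
    print(round(achieved_power(0.6, 16), 3))
    print(confidence_interval(0.6, 16))
```

Using `math.atanh` and `math.tanh` for the transform and its inverse avoids writing out the logarithm and exponential forms by hand; they are algebraically the same expressions as those given in the text.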
Example

To detect an expected correlation coefficient (ρ) of 0.6 at a significance level of α = 0.05 with a power (1-β) of 0.8 requires 16 pairs of x/y values, taking the default position of a one tail model.
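The arithmetic behind this figure can be traced step by step. The working below is a sketch that assumes the standard normal deviates quoted to four decimals (1.6449 for a one tail α of 0.05 and 0.8416 for a power of 0.8) and assumes the result is rounded up to the next whole number; neither detail is stated on the page.

```latex
\begin{aligned}
Z &= \tfrac{1}{2}\ln\frac{1+0.6}{1-0.6} = \tfrac{1}{2}\ln 4 \approx 0.6931 \\
z_\alpha &\approx 1.6449 \quad (\alpha = 0.05,\ \text{one tail}), \qquad
z_\beta \approx 0.8416 \quad (1-\beta = 0.8) \\
n &= \left(\frac{z_\alpha + z_\beta}{Z}\right)^{2} + 3
   \approx \left(\frac{2.4865}{0.6931}\right)^{2} + 3
   \approx 12.87 + 3 = 15.87 \;\Rightarrow\; 16
\end{aligned}
```

For comparison, the two tail model with the same α uses zα ≈ 1.9600, which gives n ≈ 19.34 and therefore 20 pairs.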