New Relevant Tool: https://irthoughts.wordpress.com/2018/09/14/regression-correlation-calculator-updates-and-improvements/
Today I updated my Tutorial on Correlation Coefficients to include a new section on the effect of sample size on the significance of correlation coefficients. This was motivated by some comments from search engine marketers on correlation strengths (http://searchenginewatch.com/3641002). The new material, given below, might help those interested in learning whether a reported correlation coefficient is statistically different from zero. Enjoy it.
The problem with correlation strength scales is that these say nothing about how the size of a sample impacts the significance of a correlation coefficient. This is a very important issue that is now addressed.
Consider three different correlation coefficients: 0.50, 0.35, and 0.17. Assume that we want to test that there is no significant relationship between the two variables at hand. The null hypothesis (H0) to be tested is that these r values are not statistically different from zero (rho = 0). How to proceed?
As recommended by Stevens (17), for rho = 0, H0 can be tested using a two-tailed (i.e., two-sided) t-test at a given confidence level, usually 95%. If t_calculated ≥ t_table, H0 is rejected. However, if t_calculated < t_table, H0 is not rejected and the correlation between the variables is not statistically significant.
Here t_calculated is computed as r/SEr = r*SQRT[(n – 2)/(1 – r^2)], where SEr = SQRT[(1 – r^2)/(n – 2)] is the standard error of r, while t_table values are obtained from the literature (http://en.wikipedia.org/wiki/Student%27s_t-distribution#Table_of_selected_values). Table 2 summarizes the results of testing the null hypothesis at different sample sizes.
Table 2. H0 tests at different sample sizes; two-tailed, 95% confidence.
n | df = n – 2 | r | SEr | t(calc) | t (0.95) | Reject (H0 : rho = 0)? |
5 | 3 | 0.50 | 0.50 | 1.000 | 3.182 | don’t reject |
10 | 8 | 0.50 | 0.31 | 1.633 | 2.306 | don’t reject |
12 | 10 | 0.50 | 0.27 | 1.826 | 2.228 | don’t reject |
14 | 12 | 0.50 | 0.25 | 2.000 | 2.179 | don’t reject |
20 | 18 | 0.50 | 0.20 | 2.449 | 2.101 | reject |
30 | 28 | 0.50 | 0.16 | 3.055 | 2.048 | reject |
40 | 38 | 0.50 | 0.14 | 3.559 | 2.024 | reject |
50 | 48 | 0.50 | 0.13 | 4.000 | 2.011 | reject |
5 | 3 | 0.35 | 0.54 | 0.647 | 3.182 | don’t reject |
10 | 8 | 0.35 | 0.33 | 1.057 | 2.306 | don’t reject |
12 | 10 | 0.35 | 0.30 | 1.182 | 2.228 | don’t reject |
14 | 12 | 0.35 | 0.27 | 1.294 | 2.179 | don’t reject |
20 | 18 | 0.35 | 0.22 | 1.585 | 2.101 | don’t reject |
30 | 28 | 0.35 | 0.18 | 1.977 | 2.048 | don’t reject |
40 | 38 | 0.35 | 0.15 | 2.303 | 2.024 | reject |
50 | 48 | 0.35 | 0.14 | 2.589 | 2.011 | reject |
5 | 3 | 0.17 | 0.57 | 0.299 | 3.182 | don’t reject |
10 | 8 | 0.17 | 0.35 | 0.488 | 2.306 | don’t reject |
12 | 10 | 0.17 | 0.31 | 0.546 | 2.228 | don’t reject |
14 | 12 | 0.17 | 0.28 | 0.598 | 2.179 | don’t reject |
20 | 18 | 0.17 | 0.23 | 0.732 | 2.101 | don’t reject |
30 | 28 | 0.17 | 0.19 | 0.913 | 2.048 | don’t reject |
40 | 38 | 0.17 | 0.16 | 1.063 | 2.024 | don’t reject |
50 | 48 | 0.17 | 0.14 | 1.195 | 2.011 | don’t reject |
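Any row of Table 2 can be reproduced with a few lines of Python. The sketch below is illustrative only: the function names (standard_error_r, t_calculated, check_h0) are hypothetical, and the critical values are simply copied from the t (0.95) column of the table rather than computed, so it only works for the degrees of freedom listed there.

```python
import math

# Two-tailed 95% critical t values copied from Table 2, keyed by df = n - 2.
# Assumption: only the df values that appear in the table are supported.
T_TABLE_95 = {
    3: 3.182, 8: 2.306, 10: 2.228, 12: 2.179,
    18: 2.101, 28: 2.048, 38: 2.024, 48: 2.011,
}

def standard_error_r(r, n):
    """Standard error of r: SQRT[(1 - r^2) / (n - 2)]."""
    return math.sqrt((1 - r ** 2) / (n - 2))

def t_calculated(r, n):
    """t statistic for H0: rho = 0, computed as r / SEr."""
    return r / standard_error_r(r, n)

def check_h0(r, n):
    """Rebuild one row of Table 2 for a given r and sample size n."""
    df = n - 2
    t_calc = t_calculated(r, n)
    t_table = T_TABLE_95[df]  # raises KeyError for df not in the table
    return {
        "n": n,
        "df": df,
        "r": r,
        "se_r": round(standard_error_r(r, n), 2),
        "t_calc": round(t_calc, 3),
        "t_table": t_table,
        "reject": abs(t_calc) >= t_table,
    }

if __name__ == "__main__":
    print(check_h0(0.50, 14))
    print(check_h0(0.50, 20))
```

For r = 0.50 this gives t(calc) = 2.000 at n = 14 (don't reject) and t(calc) = 2.449 at n = 20 (reject), matching the corresponding rows of the table.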
The table shows at which sample size an r value becomes large enough to be statistically significant.
For n = 14, all three r values (0.50, 0.35, and 0.17) are not statistically different from zero.
For n = 30, r = 0.50 is statistically different from zero while r = 0.35 and r = 0.17 are not.
Conversely, among the sample sizes tested, r = 0.50 is not statistically different from zero when n is equal to or less than 14, while r = 0.35 is not different from zero when n is equal to or less than 30.
Finally, r = 0.17 is not statistically different from zero at any of the sample sizes tested.
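These observations can be checked mechanically. The following is a minimal sketch, assuming only the eight sample sizes and critical values listed in Table 2; the helper name smallest_significant_n is hypothetical. It scans the tabulated sample sizes in increasing order and reports the first one at which H0 is rejected.

```python
import math

# Two-tailed 95% critical t values from Table 2, keyed here by sample size n.
# Assumption: only the sample sizes tested in the table are considered.
CRITICAL_T_BY_N = {
    5: 3.182, 10: 2.306, 12: 2.228, 14: 2.179,
    20: 2.101, 30: 2.048, 40: 2.024, 50: 2.011,
}

def t_calculated(r, n):
    """t statistic for H0: rho = 0, i.e. r * SQRT[(n - 2) / (1 - r^2)]."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def smallest_significant_n(r):
    """Smallest tabulated n at which r is significant, or None if there is none."""
    for n in sorted(CRITICAL_T_BY_N):
        if abs(t_calculated(r, n)) >= CRITICAL_T_BY_N[n]:
            return n
    return None

if __name__ == "__main__":
    for r in (0.50, 0.35, 0.17):
        print(r, smallest_significant_n(r))
```

With the table's values, this reports n = 20 for r = 0.50, n = 40 for r = 0.35, and None for r = 0.17. These are thresholds among the tested sizes only; the true cutoffs may fall between tabulated values.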
Related post: https://irthoughts.wordpress.com/2016/04/18/virus-evolution-citation/
Can you tell me what is SEr and how is it calculated?
Standard Error of r, where r is a correlation coefficient. BTW all tutorials are offline but are accessible from the Vault section of the site.