I made a typo in the last issue of the newsletter. It was corrected right away, but many readers still got the uncorrected version. Instead of writing C(X,Y) = 0 for uncorrelation, I wrote P(X,Y) = 0. This is a nontrivial error: C(X,Y) = 0 means that the covariance between the variables X and Y is zero (uncorrelation), while P(X,Y) = 0 is often used in reference to disjoint (mutually exclusive) events. Ah, the power of a single-letter typo… That kind of stuff happens when you are pressed by so many deadlines. My apologies for that. I’ll repeat this clarification in the next issue of IRW since it covers the second part of the subject on correlation. The purpose of this post is to give you a sneak preview of what to expect from the June issue.

Variables or events are said to be independent, correlated, uncorrelated, orthogonal, or unrelated depending on the type of scenario.

Independence means that P(X,Y) = P(X)*P(Y), where P is probability; i.e., the probability that X and Y occur together equals the product of their individual probabilities.
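
To make the product rule concrete, here is a minimal sketch that checks it by brute-force enumeration. The two-dice sample space and the events x, y, and z are my own hypothetical choices for illustration, not something taken from the newsletter.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two fair six-sided dice, 36 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate over (die1, die2)."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

# Hypothetical events, chosen only for illustration.
x = lambda o: o[0] % 2 == 0       # first die is even
y = lambda o: o[1] > 4            # second die shows 5 or 6
z = lambda o: o[0] + o[1] == 12   # the sum of both dice is 12

# x and y involve different dice, so P(X,Y) should equal P(X)*P(Y).
p_xy = prob(lambda o: x(o) and y(o))
independent_xy = p_xy == prob(x) * prob(y)

# z constrains the first die, so the product rule should fail for x and z.
p_xz = prob(lambda o: x(o) and z(o))
independent_xz = p_xz == prob(x) * prob(z)

print(independent_xy, independent_xz)
```

Exact fractions are used instead of floats so that the equality test is not affected by rounding.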

Correlation is a measure of linear association. Variables can be correlated without having any causal relationship, or can have a causal relationship and yet be uncorrelated.

Uncorrelation implies that C(X,Y) = 0; i.e., the covariance between the variables is zero. This implies that their correlation coefficient, r, is also zero since r = C(X,Y)/(S(X)*S(Y)), where S stands for standard deviation. An equivalent statement uses expectation values: X and Y are uncorrelated if and only if E(XY) = E(X)*E(Y), since C(X,Y) = E(XY) - E(X)*E(Y).
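
As a rough illustration of these formulas, the sketch below computes the covariance and r for a small made-up data set in which Y is fully determined by X (Y is X squared) and yet the two are uncorrelated. The data and the helper names covariance and correlation are assumptions of mine; population (divide-by-n) statistics are used throughout.

```python
from statistics import mean, pstdev

def covariance(xs, ys):
    """Population covariance: mean of the products of deviations."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

def correlation(xs, ys):
    """Correlation coefficient r = C(X,Y) / (S(X) * S(Y))."""
    return covariance(xs, ys) / (pstdev(xs) * pstdev(ys))

# Made-up data: ys depends entirely on xs, but not linearly.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x for x in xs]

cov_xy = covariance(xs, ys)
r_xy = correlation(xs, ys)

# Expectation form: uncorrelated means the mean of the product
# equals the product of the means.
e_product = mean([x * y for x, y in zip(xs, ys)])
product_of_e = mean(xs) * mean(ys)

# For contrast, a perfectly linear relationship.
r_linear = correlation(xs, [2 * x + 1 for x in xs])

print(cov_xy, r_xy, e_product, product_of_e, r_linear)
```

The first pair also shows the point made earlier: a relationship can be deterministic and still have zero correlation, because r only measures linear association.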

Describing variables as being either orthogonal or uncorrelated is easier to understand in terms of vectors.

If vectors representing raw variables are perpendicular (dot product is zero) then the variables are orthogonal.

Now, if the variables are centered (the mean is subtracted from each value) and represented as vectors, and their dot product is zero, they are uncorrelated.

In other words, orthogonal denotes that the raw (non-centered) variables are perpendicular, while uncorrelated means that the centered variables are perpendicular. Since C(X,Y) is proportional to the dot product of the centered vectors, a zero dot product gives C(X,Y) = 0.
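
The distinction can be sketched with two small made-up pairs of vectors: one pair that is orthogonal but correlated, and one that is uncorrelated but not orthogonal. The numbers and the helper names dot and center are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def center(v):
    """Subtract the mean from every component."""
    m = sum(v) / len(v)
    return [a - m for a in v]

# Pair 1: raw vectors are perpendicular, centered vectors are not.
x1 = [1.0, 0.0, 1.0, 0.0]
y1 = [0.0, 1.0, 0.0, 1.0]
raw_dot_1 = dot(x1, y1)                        # zero: orthogonal
centered_dot_1 = dot(center(x1), center(y1))   # nonzero: correlated

# Pair 2: centered vectors are perpendicular, raw vectors are not.
x2 = [1.0, 2.0, 3.0, 4.0, 5.0]
y2 = [4.0, 1.0, 0.0, 1.0, 4.0]
raw_dot_2 = dot(x2, y2)                        # nonzero: not orthogonal
centered_dot_2 = dot(center(x2), center(y2))   # zero: uncorrelated

# Covariance is the centered dot product divided by n
# (or by n - 1 for the sample version).
cov_2 = centered_dot_2 / len(x2)

print(raw_dot_1, centered_dot_1, raw_dot_2, centered_dot_2, cov_2)
```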

There is an old but still relevant article, Linearly Independent, Orthogonal, and Uncorrelated Variables, written by Rodgers et al., which explains all this.