Oxford mathematician Dr Tom Crawford explains the process of Principal Component Analysis (PCA) and how it is used to find patterns in data. This video is part of the ‘Oxford Statistics Lecture Series’, which is based on the first-year undergraduate mathematics course that Tom teaches at the University of Oxford.
The video begins with a discussion of the purpose of PCA and why it is helpful for finding patterns in data. We then look at a simple example to determine what a principal component is, and the properties that we want it to have.
Having defined a principal component as the linear combination of the variables that maximises the variance, we express this variance in vector notation using the covariance matrix.
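As a rough sketch of that step, the variance of a linear combination can be written as a quadratic form in the covariance matrix. The symbols here are assumptions chosen for illustration and may differ from the notation used in the video: x is the vector of p variables, a is the vector of weights, and Σ is the covariance matrix of x.

```latex
\[
  y = a^{\mathsf{T}} x = \sum_{j=1}^{p} a_j x_j ,
  \qquad
  \operatorname{Var}(y)
    = \sum_{j=1}^{p} \sum_{k=1}^{p} a_j a_k \operatorname{Cov}(x_j, x_k)
    = a^{\mathsf{T}} \Sigma \, a .
\]
```

Without a constraint on a, this variance could be made arbitrarily large simply by scaling the weights, so the weights are usually restricted to unit length, a^T a = 1.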
We then set up a Lagrange multiplier problem, whose solution is the eigenvector equation of the covariance matrix. The eigenvalues are found from the characteristic polynomial, and the eigenvector corresponding to the largest eigenvalue gives the first principal component. Finally, the scores matrix is introduced as the data matrix multiplied by the matrix of eigenvectors, ordered by decreasing eigenvalue.
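The steps above can be sketched numerically in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the method as presented in the video: the toy data are made up, the data matrix is centred before the scores are computed, and the eigenvalues are found with a numerical routine (np.linalg.eigh) rather than by solving the characteristic polynomial by hand.

```python
import numpy as np

# Hypothetical toy data: 6 observations (rows) of 2 variables (columns).
X = np.array([
    [2.5, 2.4],
    [0.5, 0.7],
    [2.2, 2.9],
    [1.9, 2.2],
    [3.1, 3.0],
    [1.1, 0.9],
])

# Centre each variable so every column has mean zero (an assumption;
# the description above does not say whether the data are centred).
Xc = X - X.mean(axis=0)

# Sample covariance matrix of the variables (columns).
S = np.cov(Xc, rowvar=False)

# Maximising a^T S a subject to a^T a = 1 leads to S a = lambda a.
# eigh is intended for symmetric matrices and returns the eigenvalues
# in ascending order, with unit-length eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(S)

# Reorder so that the largest eigenvalue comes first.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# First principal component: eigenvector of the largest eigenvalue.
first_pc = eigvecs[:, 0]

# Scores matrix: (centred) data matrix times the ordered eigenvectors.
scores = Xc @ eigvecs

print("eigenvalues:", eigvals)
print("first principal component:", first_pc)
print("scores:\n", scores)
```

The sign of each eigenvector is arbitrary, so a different routine may return the same components multiplied by -1. The sample variance of the first column of scores equals the largest eigenvalue, which is one way to check that the first principal component really is the direction of maximum variance.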
