# How Principal Are Your Components?

This post is an offshoot from Chapter 1 of Probably Overthinking It, which is available for pre-order now!

In a previous post I explored the correlations between measurements in the ANSUR-II dataset, which includes 93 measurements from a sample of U.S. military personnel. I found that measurements of the head were weakly correlated with measurements from other parts of the body – and in particular the protrusion of the ears is almost entirely uncorrelated with anything else.

A friend of mine, and co-developer of the Modeling and Simulation class I taught at Olin, asked whether I had tried running principal component analysis (PCA). I had not, but now I have. Let’s look at the results.

The ANSUR data is available from The OPEN Design Lab.

## Explained Variance

Here’s a visualization of explained variance versus number of components.

With one component, we can capture 44% of the variation in the measurements. With two components, we’re up to 62%. After that, the gains are smaller (as we expect), but with 10 components, we get up to 78%.
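For readers who want to reproduce this kind of curve, here is a minimal sketch of how cumulative explained variance can be computed with NumPy. It is not the code behind the numbers above: the data is a synthetic stand-in for the ANSUR measurements, the function name `explained_variance_ratio` is made up for illustration, and standardizing each column before the decomposition is an assumption about preprocessing.

```python
import numpy as np

def explained_variance_ratio(data):
    """Return the fraction of total variance captured by each principal component."""
    # Assumption: standardize each column so measurements on different
    # scales contribute equally.
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    # The squared singular values of the standardized data are proportional
    # to the variance along each principal component, largest first.
    singular_values = np.linalg.svd(z, compute_uv=False)
    variances = singular_values**2
    return variances / variances.sum()

# Synthetic stand-in for the measurements: 500 rows, 6 correlated columns
# that all share one underlying "overall size" factor.
rng = np.random.default_rng(0)
size = rng.normal(size=(500, 1))
data = size + 0.5 * rng.normal(size=(500, 6))

ratios = explained_variance_ratio(data)
cumulative = np.cumsum(ratios)
for k, total in enumerate(cumulative, start=1):
    print(f"{k} components: {total:.0%}")
```

Plotting `cumulative` against the number of components gives a curve of the kind described here. Because the synthetic columns share a single factor, the first component dominates much more strongly than it does in the real data.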

Looking at the loadings, we can see which measurements contribute the most to each of the components, so we can get a sense of which characteristics each component captures.

I won’t explain all of the measurements, but if there are any you are curious about, you can look them up in The Measurer’s Handbook, which includes details on “sampling strategy and measuring techniques” as well as descriptions and diagrams of the landmarks and measurements between them.

```
Principal Component 1:
0.135 	 suprasternaleheight
0.134 	 cervicaleheight
0.134 	 buttockkneelength
0.134 	 acromialheight
0.133 	 kneeheightsitting

Principal Component 2:
0.166 	 waistcircumference
-0.163 	 poplitealheight
0.163 	 abdominalextensiondepthsitting
0.161 	 waistdepth
0.159 	 buttockdepth

Principal Component 3:
0.338 	 elbowrestheight
0.31 	 eyeheightsitting
0.307 	 sittingheight
0.228 	 waistfrontlengthsitting

Principal Component 4:
0.247 	 balloffootcircumference
0.212 	 sittingheight

Principal Component 5:
0.319 	 interscyeii
0.275 	 shoulderlength
0.273 	 interscyei
0.184 	 shouldercircumference

Principal Component 6:
0.316 	 shoulderlength

Principal Component 7:
0.374 	 crotchlengthposterioromphalion
-0.298 	 earlength
-0.284 	 waistbacklength
0.253 	 crotchlengthomphalion

Principal Component 8:
0.472 	 earprotrusion
0.346 	 earlength
0.215 	 crotchlengthposterioromphalion
-0.202 	 wristheight

Principal Component 9:
0.294 	 crotchlengthposterioromphalion
-0.228 	 shoulderlength
0.189 	 neckcircumferencebase

Principal Component 10:
0.356 	 earprotrusion
-0.269 	 waistfrontlengthsitting
0.239 	 earlength
-0.228 	 waistbacklength
```
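A listing in this format can be produced by sorting each component's loadings by absolute value and printing the largest ones. The sketch below shows one way to do that with NumPy. The function `top_loadings`, the column names, and the synthetic data are all hypothetical, and it does not try to reproduce whatever rule selected the entries shown above.

```python
import numpy as np

def top_loadings(data, names, n_components=2, top=3):
    """For each component, return the `top` (loading, name) pairs
    with the largest absolute loading."""
    # Assumption: columns are standardized before the decomposition.
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    # Each row of vt is one principal axis; its entries are the loadings
    # of the original columns on that component.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    result = []
    for component in vt[:n_components]:
        order = np.argsort(-np.abs(component))[:top]
        result.append([(float(component[i]), names[i]) for i in order])
    return result

# Synthetic data: three columns driven by a "height" factor and two
# driven by an independent "girth" factor. Names are hypothetical.
rng = np.random.default_rng(1)
n = 500
height = rng.normal(size=n)
girth = rng.normal(size=n)
names = ["stature", "kneeheight", "armlength", "waist", "chest"]
data = np.column_stack(
    [height + 0.3 * rng.normal(size=n) for _ in range(3)]
    + [girth + 0.3 * rng.normal(size=n) for _ in range(2)]
)

for i, pairs in enumerate(top_loadings(data, names), start=1):
    print(f"Principal Component {i}:")
    for loading, name in pairs:
        print(f"{loading:.3} \t {name}")
    print()
```

The sign of each component is arbitrary, so a whole row of loadings can flip sign between runs or libraries without changing the interpretation. What matters is which measurements have large loadings and whether they share a sign.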

Here’s my interpretation of the first few components.

• Not surprisingly, the first component is loaded with measurements of height. If you want to predict someone’s measurements, and can only use one number, choose height.
• The second component is loaded with measurements of girth. No surprises so far.
• The third component seems to capture torso length. That makes sense — once you know how tall someone is, it helps to know how that height is split between torso and legs.
• The fourth component seems to capture hand and foot size (with sitting height thrown in just to remind us that PCA is not obligated to find components that align perfectly with the axes we expect).
• Component 5 is all about the shoulders.