Using and interpreting data requires storing and manipulating sets of numbers in conceptually and computationally helpful ways. The language of linear algebra provides basic vocabulary, visualizations, and mathematical results for understanding the structure of a dataset.
Consider a spreadsheet whose rows correspond to individuals and whose three columns correspond to weight in kilograms, height in centimeters, and height in inches. Are any of the columns redundant?
Solution. Yes, the third column is redundant. If we know a person's height in centimeters, we can work out their height in inches by dividing their height in centimeters by 2.54.
Alternatively, we could say that the second column is redundant, since we could obtain it by multiplying the numbers in the third column by 2.54. So there are two ways to trim the number of columns from 3 to 2 without losing information.
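To make the two ways of trimming concrete, here is a minimal sketch in plain Python. The data values and all variable names (`rows`, `trimmed_a`, `rebuilt_in`, and so on) are made up for illustration; the only fact used is that one inch is 2.54 centimeters.

```python
CM_PER_INCH = 2.54  # exact by definition of the inch

# Hypothetical measurements for three individuals.
weights_kg = [55.0, 72.5, 90.0]
heights_cm = [152.4, 177.8, 190.5]

# Full spreadsheet: (weight in kg, height in cm, height in inches).
rows = [(w, h, h / CM_PER_INCH) for w, h in zip(weights_kg, heights_cm)]

# Option A: drop the inches column, then rebuild it from centimeters.
trimmed_a = [(w, h_cm) for w, h_cm, _ in rows]
rebuilt_in = [h_cm / CM_PER_INCH for _, h_cm in trimmed_a]

# Option B: drop the centimeters column, then rebuild it from inches.
trimmed_b = [(w, h_in) for w, _, h_in in rows]
rebuilt_cm = [h_in * CM_PER_INCH for _, h_in in trimmed_b]

# Both reconstructions agree with the original columns up to rounding error.
print(all(abs(a - r[2]) < 1e-9 for a, r in zip(rebuilt_in, rows)))
print(all(abs(c - r[1]) < 1e-9 for c, r in zip(rebuilt_cm, rows)))
```

Either two-column version carries the same information as the original three columns, since the dropped column can be recomputed whenever it is needed.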
Is it possible to have numbers populating three columns in a spreadsheet such that any one of the three columns can be recovered from the other two, yet no column can be recovered from any other single column?
Solution. Yes! If the third column is the sum of the first two, then any column can be recovered from the other two (either by adding to get the third from the first and second, or by subtracting to get the first from the third and second or the second from the third and first). However, if the first two columns contain unrelated data, then no single column determines either of the others: you need at least two columns to figure out the rest.
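The sum construction can be checked on a small example. The sketch below uses made-up integer columns, and the helper `proportional` is a hypothetical name. It reads "recovered from a single column" as "obtained by rescaling that column by one constant", which is an assumption made for this illustration rather than a definition taken from the text.

```python
# Hypothetical columns; the third is the entrywise sum of the first two.
first = [3, 8, 1, 6]
second = [4, 2, 9, 5]
third = [a + b for a, b in zip(first, second)]

# Any one column can be rebuilt from the other two.
first_again = [c - b for c, b in zip(third, second)]
second_again = [c - a for c, a in zip(third, first)]


def proportional(xs, ys):
    """Return True if one column is a constant multiple of the other.

    Uses the cross-multiplication test xs[i] * ys[j] == xs[j] * ys[i]
    for every pair of rows, which is exact for integer data.
    An all-zero column counts as proportional to anything.
    """
    n = len(xs)
    return all(xs[i] * ys[j] == xs[j] * ys[i]
               for i in range(n) for j in range(i + 1, n))


print(first_again == first, second_again == second)
# No pair of columns is related by a single scale factor.
print(proportional(first, second),
      proportional(first, third),
      proportional(second, third))
```

By contrast, `proportional([1, 2, 3], [2, 4, 6])` returns `True`, which matches the centimeters and inches situation from the first exercise.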
In this course, we will develop a more general and mathematically rigorous version of the idea of redundancy explored in the two exercises above.