How to convert a matrix to a table in Google SheetsJune 19, 2021
I came across an interesting problem while performing my data engineering duties at work.
I had to import a spreadsheet's content into a relational database. It may sound like a trivial task, but there was a catch: the data was defined in a matrix format.
It looked something like this:
It's a matrix that defines the monthly budget of each department for the first half of 2021.
To import this data into an RDBMS, we need to convert it to a table first.
The transformation should be dynamic, so when the matrix is updated, the changes are automatically reflected in the table.
I haven't found any articles on the subject, so I think my solution is worth sharing.
To make our lives easier, let's define two helper tables - one for the matrix rows and one for the matrix columns.
Let's make them truly dynamic, so whenever a new row or column is added to the matrix, it automatically appears in the helper table.
Counting the matrix rows/columns
First, we need a way to count the number of rows/columns in the matrix.
Luckily for us, the COUNTA function will cover 100% of our counting needs.
It's a function that counts all non-empty cells inside a given range:
counta(matrix!$A:$A)- count the number of rows (i.e. non-empty cells in column A)
counta(matrix!$1:$1)- count the number of columns (i.e. non-empty cells in row 1)
In its current state, the matrix has 9 rows and 6 columns.
Copying the matrix rows
To copy the matrix rows, we can use the ARRAYFORMULA function.
Builds the string
"matrix!A2:A10". This string represents the cell range of the matrix rows.
If a new row is added to the matrix, the range is automatically increased by one row (because of the
- Then, it converts the cell range string to an actual cell range reference with the INDIRECT function.
- Finally, it copies the cell range reference to the current position with the ARRAYFORMULA function.
Generating ID numbers
To generate ID numbers, we can use ARRAYFORMULA combined with the ROW function.
The idea is similar to copying cells.
First, it builds the cell range string
"B2:B10". If a row is added/deleted, this range is automatically adjusted.
- Then, it converts the string to a cell range reference with INDIRECT.
- Next, it takes the row number of each cell (with ROW) and subtracts two from all of them, so the resulting sequence is zero-based. We'll see why this is important later.
- Finally, it copies the sequence to the given position with ARRAYFORMULA.
Calculating the last matrix column
To calculate the last matrix column, we can use the ADDRESS function.
The ADDRESS function takes a row and a column number, and returns a cell reference string.
So, in the formula above:
1is the row number.
counta(matrix!$1:$1) + 1is the column number. It will evaluate to
7in this example. Column 7 is G.
4is the flag for returning the address in the format
Copying the matrix columns
Copying the matrix columns is pretty similar to copying the rows.
That's quite a mouthful! Here's what it does:
First, it builds the
"matrix!B1:G1"cell range string with the help of the ADDRESS function.
If a new column is added, the range is automatically adjusted.
For example, if we added the column
Jul-21, the cell range string would be
- Then, it transforms the cell range string to a cell range reference with INDIRECT, and copies the values to the position with ARRAYFORMULA (as discussed above).
- To make sure Google Sheets handles the values as texts (not as dates), all values are converted to text with the TO_TEXT function.
- Note that ARRAYFORMULA will output the cell range horizontally. To change it to vertical, we need to use the TRANSPOSE function.
Flattening the matrix values
Now that both helper tables are ready, we can start adding columns to the main table.
To make things easier, let's use the FLATTEN function.
The FLATTEN function takes a range of values and flattens them into a single column.
We can use it to extract the matrix values into the main table:
Conceptually, this formula is pretty similar to the previous ones:
First, it builds the
"matrix!B2:G10"cell range string with the help of the ADDRESS function. This range includes all matrix values (but not the row/column names). If someone adds a new column or row to the matrix, it is automatically adjusted.
- Then, the cell range string is transformed to a cell range reference with INDIRECT.
- Finally, FLATTEN flattens the matrix values into a column and outputs them (similarly to ARRAYFORMULA).
Calculating the row IDs
To calculate which matrix row a given item belongs to, divide the ID with the width of the matrix and discard the remainder (keep the integer part only).
- 0, 1, 2 and 3 divided by 4 is 0
- 4, 5, 6 and 7 divided by 4 is 1
- 8, 9, 10 and 11 divided by 4 is 2
The QUOTIENT function is the natural choice for this calculation.
The formula is pretty straightforward:
First, it builds the
"A2:A55"cell range string (referring to the id column) and creates a cell range reference with the INDIRECT function.
- Then, the values are transformed with the QUOTIENT function (according to the rules above).
- Finally, the values are copied to the given cell with ARRAYFORMULA.
Calculating the column IDs
To calculate which matrix column a given item belongs to, divide the ID with the width of the matrix and discard the integer part (keep the remainder only).
- 0, 4 and 8 modulo 4 is 0
- 1, 5 and 9 modulo 4 is 1
- 2, 6 and 10 modulo 4 is 2
- 3, 7 and 11 modulo 4 is 3
For this calculation, the appropriate function is MOD.
Looking up the rows/columns by their IDs
We're almost done!
We already have the row and column IDs. All that's left to do is to look up the actual values from the helper tables... and when it comes to looking up things, the best in the business is: VLOOKUP.
To put it simply, it's a function for performing a linear search in a table. For example:
Translates to: "find the row name where the id is 0".
The function takes 3 parameters:
0- the id to search for
rows!A2:B10- the table to search in
2- the column to return
And it can also be combined with ARRAYFORMULA:
It looks a bit complicated, but there's actually nothing new here:
First, it builds the
"rows!A2:B10"cell range references. The first one refers to the row id column, the second one refers to the rows helper table.
- We pass 3 arguments to VLOOKUP - the ids to search for, the table to search in and the column to return.
- Finally, VLOOKUP finds the row name for each id and the results are displayed with ARRAYFORMULA.
We transformed a matrix into a table using nothing but Google Sheets formulas. Not as easy as it looks!
I hope this post was helpful.