The Horrors of Excel for Data Management
Locus has been preaching on the pitfalls of Excel for a long time. It’s no surprise that one of the worst imaginable errors in Excel that could’ve happened, did. Almost 16,000 COVID-19 cases in England went unreported because Public Health England hit the maximum row count in their version of Excel.
This is not the only example of Excel being misused or being the wrong tool entirely for the job. Excel is not in any way a data management system for complex or vital data. When it comes to sustainability reporting and environmental data management, the evils of the grid are a force to be reckoned with. We have highlighted a few examples that will have you shivering.
Case 1: The Evils of Autofill
Take a look at this harmless-looking chart. It shows monthly electricity consumption for a facility set to report:
Month | Monthly Electricity Consumption (MWh) |
January 2019 | 133,500 |
February 2019 | 122,400 |
March 2019 | 138,900 |
April 2019 | 141,600 |
May 2019 | 141,601 |
June 2019 | 141,602 |
July 2019 | 141,603 |
August 2019 | 141,604 |
September 2019 | 141,605 |
October 2019 | 141,606 |
November 2019 | 141,607 |
December 2019 | 141,608 |
During review, the auditor notices a distinct trend from April to December, indicating false data overwritten by a stray double-click. Eventually, the auditor required re-entering all invoice data for dozens of facilities to correct the issue. Where the original data went and how autofill went astray remains a mystery.
Case 2: The Phantom File Editor
Imagine using a massive spreadsheet with lots of linked calculations for your annual sustainability report. One of the team engineers works on the file to input more data and get it ready for presentation. But in the final steps, they accidentally delete one of the formulas that sum up the indicators. The annual total looks great for the presentation since you’ve effectively removed a portion of your resource consumption, but afterwards you discover the conclusions were incorrectly calculated. How did that error get introduced? The spreadsheet has no auditing capabilities on the individual values, so you may never know.
Excel supports multiple users editing one document simultaneously, but not well. Multiple records are saved, edits are lost, and vital data vanishes, or at best is very hard to recover. The Track Changes feature is not infallible, and over reliance on it will cause hardship.
Case 3: Date of the Dead
Excel has a frustrating insistence of changing CAS numbers into dates, even if they are something like “7440-09-7″ turning into September 7, 7400. If you’re not explicit in your cell formatting, Excel isn’t happy leaving values as they are.
Case 4: Imposter Numerical Values
You meant to type 1.5, but you typed “1..5” or “.1.5”. Does Excel reject these imposter numbers or let you know of a potential error? No, it’s stored in Text format. This can throw off any averages or sums you may be tracking. This minor identity theft can cause a real headache.
Other Significant Cases:
Other data quality issues with using Excel include, but are not limited to:
- Locations with multiple variations of the same ID/name (e.g., MW-1, MW-01, MW 1, MW1, etc.)
- Use of multiple codes for the same entity (e.g., SW and SURFW for surface water samples)
- Loss of significant figures for numeric data
- Special characters (such as commas) that may cause cells to break unintentionally over rows when moving data into another application
- Bogus dates like “November 31” in columns that do not have date formats applied to them
- Loss of leading zeros associated with cost codes and projects numbers (e.g., “005241”) that have only numbers in them but must be stored as text fields
- The inability to enforce uniqueness, leading to duplicate entries
- Null values in key fields (because entries cannot be marked as required)
- Hidden rows and/or columns that can cause data to be shifted unintentionally or modified erroneously
- Inconsistent use of lab qualifiers— in some cases, these appear concatenated in the same Excel column (e.g., “10U, <5”) while in other cases they appear in separate columns
As you can see, the horrors of Excel are common, and terrifying. Without a proper system of record, auditing features, and the ability for data to vanish into the ephemera, Excel offers little in the way of data security and quality for organizations managing vital environmental and compliance data. Many are learning firsthand the superiority of database management systems over spreadsheets when it comes to managing data. Now is the time to examine the specific shortcomings of your current system and consider your options.