Notes from the field on data mapping, reference data, registries, and the spreadsheets that quietly cost teams their weekends.
Spreadsheets are fine for two people. Around five, they start drifting in ways most teams do not notice until the drift is costing them weeks.
Hunting missing data does not show up on a sprint board. Nobody assigns it. Nobody estimates it. When you price it out, the numbers get uncomfortable.
Eight fifteen, Monday morning. The dashboard is wrong. Almost every time, the root cause is the same: a mapping was edited late Friday by someone who needed to add one new value before going home.
You have one customer table. The EU team should see only EU customers. The US team should see only US customers. Two ways to do this. Most teams pick the wrong one first.
A bank we worked with took three days to answer one auditor's question about a single cell change. The remediation memo took another week. Reconstruction is not the same as recording.
You ask three departments for the customer list. Three files arrive, none interchangeable. The reason is structural, and the fix is not a master file.
Files have one persistent problem: they are always slightly out of date. And they have a class of failures that an API does not.
Customer asks for a slice of your data. The instinct is custom infrastructure per customer. A few customers in, the maintenance cost exceeds the value of the data. A cheaper pattern exists.
In normal operations a wrong mapping costs hours. During an ERP cutover it costs days, sometimes a quarter. We see the same seven errors repeatedly, and each one is preventable.
The dashboard does not break. It becomes wrong in a way that nothing on the screen flags. The only way to catch it is to know what should not be there.
Mid-sized companies typically sit between 200 and 1,000 reference files. The temptation is to consolidate everything at once. That almost always fails.
Engineers have had real version control for decades. Then they move to a data team and discover the mapping files are in Excel on SharePoint, and version control means "the file has fifty versions and you can hopefully open one of the older ones".
The finance team wants to add a new customer at 10am and have the afternoon report use it. The CSV refreshes at midnight. Here is how to switch without breaking anything live.
MDM and RDM get confused regularly. Vendor marketing blurs them. They solve different problems and the wrong purchase is expensive.