Article
April 25, 2025 · 4 min read
It appears enterprise software has its own laws, one of them being the inevitable entanglement with Excel and Word. From timestamp quirks to formatting nightmares, here’s a field report from two veteran developers on the strange gravity these tools exert on every enterprise system, and how our computer science courses never prepared us for any of it.
The other day, a new Law of Nature was coined at Nitor. It is called Saalo's Law of Software Development, and it reads as follows:
Every piece of enterprise software eventually evolves into either reading or writing Excel files.
According to our experience, the validity of this Law is beyond dispute. This always happens down the road. Usually, your piece of enterprise software manages data crucial to your enterprise. Eventually, business users will want that data exported so that it can be analysed further. Perhaps they will want to read in some product data from an Excel file exported from another program. Hey presto, we need Excel export or import!
Don’t get us wrong, there is nothing wrong with this. Excel is a great spreadsheet, and Word is the go-to word processor for many Mac users, too. From a software developer’s point of view, exporting to Excel is a natural choice, since the modern Office file formats are based on an open standard called Office Open XML (championed, of course, by Microsoft). It is both an ECMA and an ISO standard.
In practice, this means that you can, for example, generate Excel and Word files to your heart’s content, and all will be well in the world. This also means that there are plenty of libraries available for the job in various languages.
Murky waters and erroneous timestamps
However, this is also where we find ourselves in murky waters: some of the libraries are quite awful indeed. All libraries can do a straightforward data export quite easily, with headers and data types correctly set. The fun starts if you want to have formulas, graphs, tables and the like in your spreadsheet. And trust us on this one, business users will want these.
Just as an example: if you want to have an Excel Table with AutoFilter on your sheet, you need to create an AreaReference, then apply AutoFilter on your Table with the AreaReference, without forgetting to specify that your table should use “TableStyleMedium9” styling. That is, you are typically directly exposed to Excel internals, which are… interesting, as we shall see. Enter Excel dates and timestamps.
Excel dates differ from the dates we use in the real world. Plus, the numbers are not always correct. In “Excel dates” the year 1900 is a leap year, which is not correct. Excel is not really to blame here, since this was purposefully built into Excel to be backwards compatible with Lotus 1-2-3. This matters because, internally, Excel stores a date as a serial number: a count of days in which January 1st, 1900 is day 1. The nonexistent February 29th, 1900 takes up a number of its own, so the “dates” in your Excel files before March 1st, 1900 are off by one relative to everything after it! Oh, and did we mention that Mac Excel used a different date numbering in the past from Windows Excel? It counted from 1904 instead of 1900, so the two were not off by one day, but by 1462 days!
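To make the off-by-one concrete, here is a minimal Python sketch of turning a whole-number Excel serial into a calendar date. The function name `excel_serial_to_date` and the `date1904` flag are our own illustrative choices, not part of any library, and the sketch assumes the serial really is a date in one of the two numbering schemes mentioned above.

```python
from datetime import date, timedelta

# Offset between the 1900 and 1904 numbering schemes, as quoted in the text.
DATE_SYSTEM_OFFSET_DAYS = 1462


def excel_serial_to_date(serial: int, date1904: bool = False) -> date:
    """Convert a whole-number Excel date serial to a real calendar date."""
    if date1904:
        # 1904 numbering: serial 0 is January 1st, 1904, with no phantom leap day.
        return date(1904, 1, 1) + timedelta(days=serial)
    if serial == 60:
        # Serial 60 is Excel's February 29th, 1900, a day that never existed.
        raise ValueError("serial 60 is the fictitious 1900-02-29")
    if serial < 60:
        # Before the phantom leap day: serial 1 is January 1st, 1900.
        return date(1899, 12, 31) + timedelta(days=serial)
    # After the phantom leap day every serial is shifted by one.
    return date(1899, 12, 30) + timedelta(days=serial)
```

With this sketch, `excel_serial_to_date(59)` is February 28th, 1900 and `excel_serial_to_date(61)` is March 1st, 1900: two consecutive real days that sit two serial numbers apart. A date stored with the 1904 numbering has a serial exactly `DATE_SYSTEM_OFFSET_DAYS` smaller than the same date in the 1900 numbering.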
In Excel, a timestamp is “naturally” a floating-point number: the integer part is the date and the fractional part is the time of day. As an example, in Excel lingo, 44561.75 means December 31st, 2021, 18:00. Dates and timestamps being internally numbers in Excel might explain some of the strange behaviour you may have witnessed.
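As a rough sketch of how such a floating-point timestamp can be decoded, the Python below splits a serial into whole days and a day fraction. The helper names are hypothetical, the epoch constant is chosen so that it already compensates for the phantom leap day, and the code assumes the Windows-style 1900 numbering and ignores time zones entirely.

```python
from datetime import datetime, timedelta

# Epoch that lines 1900-style serials from 61 upwards up with the real calendar.
EXCEL_EPOCH = datetime(1899, 12, 30)
SECONDS_PER_DAY = 86400


def excel_serial_to_datetime(serial: float) -> datetime:
    """Convert a 1900-style Excel serial to a naive datetime."""
    if serial < 61:
        raise ValueError("serials below 61 are affected by the 1900 leap-year quirk")
    days = int(serial)
    # Round the day fraction to whole seconds to absorb floating-point noise.
    seconds = round((serial - days) * SECONDS_PER_DAY)
    return EXCEL_EPOCH + timedelta(days=days, seconds=seconds)


def datetime_to_excel_serial(moment: datetime) -> float:
    """Inverse conversion, at whole-second precision."""
    delta = moment - EXCEL_EPOCH
    return delta.days + delta.seconds / SECONDS_PER_DAY
```

A handy anchor for sanity checks is serial 25569, which is January 1st, 1970, so `excel_serial_to_datetime(25569.5)` comes out as noon on that day. Rounding to whole seconds is a deliberate simplification; sub-second precision would need a different treatment of the fraction.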
In addition to the less-than-user-friendly libraries and conventions, collecting the data required for the export can be quite an exercise in itself, turning a simple “Excel export” into an interesting array of solutions. Not so long ago, one of us was faced with exporting a data table in the user interface as an Excel table. For various reasons, this seemingly simple task involved five different systems communicating with each other, using three different communication protocols! Next time you wonder why this Excel export is so slow – wonder no more.
The stuff of nightmares
If you are down on your luck, you will need to write something that imports Word files. A “template”, perhaps. This is one of the worst nightmares in software development. The Word file format is not “interesting”, it is plain awful. In Word, you can have styling on anything. You even have styling in words that seemingly have no styling. And styling means that you don’t get just the text, you get the styling, too. To clarify, a four-letter word may include styling, two letters, then some additional styling, then another two letters, and finally some more styling. Word doesn’t just preserve text; it preserves its internal chaos.
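To illustrate the split-word problem, here is a small Python sketch using only the standard library. The XML fragment is a hand-written, hypothetical example of how a paragraph can look inside a Word document (real files carry far more markup), and the helper names are ours. It shows a four-letter word stored as two separately formatted runs, and the joining you have to do to get the plain text back.

```python
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as w:p, w:r and w:t.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Hypothetical fragment: the word "Word" split into two runs,
# each carrying its own run properties (w:rPr).
SAMPLE_PARAGRAPH = f"""
<w:p xmlns:w="{W_NS}">
  <w:r><w:rPr><w:b/></w:rPr><w:t>Wo</w:t></w:r>
  <w:r><w:rPr><w:lang w:val="en-GB"/></w:rPr><w:t>rd</w:t></w:r>
</w:p>
"""


def paragraph_runs(paragraph_xml: str) -> list[str]:
    """Return the text of each run in a paragraph, one string per run."""
    root = ET.fromstring(paragraph_xml.strip())
    return [
        "".join(t.text or "" for t in run.findall("w:t", NS))
        for run in root.findall("w:r", NS)
    ]


def paragraph_text(paragraph_xml: str) -> str:
    """Join all runs to recover the text a reader actually sees."""
    return "".join(paragraph_runs(paragraph_xml))
```

Here `paragraph_runs(SAMPLE_PARAGRAPH)` gives `["Wo", "rd"]`, while `paragraph_text(SAMPLE_PARAGRAPH)` gives `"Word"`. Any template placeholder you search for has to be matched against the joined text, not against individual runs.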
Importing all this amounts to an exercise in pure frustration. Back in the day, there was a word processor named WordPerfect (well, it still exists!), which could show you absolutely everything your document had, including all formatting. Not so with the Word OOXML format.
Despite everything we learned in computer science, nothing prepared us for the reality of Excel and Word in enterprise software. Schools and universities should perhaps prepare their students better for these eventualities. You will be writing Excel functionality at some point. If you are lucky, that is. If you are not, you are importing Word. There should perhaps be a mandatory course called “Excel import/export 101”, covering Excel operations in TypeScript, Python and Java. During our studies, we took some courses whose lessons we’ve never needed in our long careers in enterprise software development. This one we surely would have needed!