Information Lifecycle Management and the cost of forgetfulness
Maxwell’s demon is a classic thought experiment that appears to violate the second law of thermodynamics. Ludwig Boltzmann, whose statistical account of entropy underlies the puzzle, took his own life in 1906, by some accounts worn down by the hostile reception of his ideas. Leo Szilard, a contemporary and friend of Einstein, and one of the first proponents of the atomic bomb, provided the first refutation in 1929: Maxwell’s demon appears to extract useful work for free, but what it is really doing is transferring entropy to the outside world.
In his analysis, Szilard considered alternative demons that would overcome his objection. One of them is now known as the Szilard engine, and the interesting conclusion, made precise decades later by Rolf Landauer and Charles Bennett, is that it cannot work because forgetting information from memory in itself incurs thermodynamic costs: erasing a single bit dissipates at least kT ln 2 of heat. To make a real-world analogy: you may pay to get information in the form of your daily newspaper, but disposing of all that paper also incurs real costs in the form of garbage hauling taxes, even if you are not aware of them. In the cosmic order, getting rid of data is as important as acquiring it in the first place.
One of the buzzwords of the day in IT is Information Lifecycle Management. This basically means using a fancy database to track information assets and how they are stored, backed up and disposed of in accordance with retention policies and various legislative mandates like the Sarbanes-Oxley Act. Companies like Microsoft discovered to their dismay the consequences of having incriminating information dragged into court under subpoena.
It seems the price of forgetfulness is eternal vigilance…
A side note: one of the things that seems consistently forgotten when designing a database is archiving and deleting old historical data. The data just keeps accumulating, usually until the database becomes obsolete and is decommissioned, or the original designers have moved on to other jobs. In large-scale databases, the efficient archiving of data requires partitioning, and is several orders of magnitude harder if the partitioning was poorly designed in the original data model. For instance, if some classes of historical data have to be held for a longer retention period than others, make sure they are stored in different partitions as well, otherwise separating them will require lengthy batch jobs. If you are specifying a database today, for your successors’ sake, plan for the orderly disposal of data once it is no longer relevant.
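To make the partitioning advice concrete, here is a minimal sketch in Python rather than in any particular database's DDL. It models a table as a dictionary of partitions keyed by retention class and month. Everything in it is an assumption made for illustration: the `PartitionedStore` class, the `RETENTION_MONTHS` policy and the two data classes are hypothetical names, not taken from a real product.

```python
from collections import defaultdict
from datetime import date

# Hypothetical retention policy: data class -> number of months to keep.
RETENTION_MONTHS = {"clickstream": 3, "invoices": 84}


def partition_key(data_class, created):
    """Partition by retention class AND month, so that a single partition
    never mixes rows that expire at different times."""
    return (data_class, created.year, created.month)


def months_between(year, month, today):
    """Whole calendar months from the given year/month to today's month."""
    return (today.year - year) * 12 + (today.month - month)


class PartitionedStore:
    """Toy stand-in for a partitioned table (illustrative only)."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, data_class, created, payload):
        self.partitions[partition_key(data_class, created)].append(payload)

    def purge(self, today):
        """Drop every expired partition whole; no individual row is inspected."""
        expired = [
            key for key in self.partitions
            if months_between(key[1], key[2], today) > RETENTION_MONTHS[key[0]]
        ]
        for key in expired:
            del self.partitions[key]
        return sorted(expired)


store = PartitionedStore()
store.insert("clickstream", date(2024, 1, 15), "page view")
store.insert("invoices", date(2024, 1, 20), "invoice 1001")
store.insert("clickstream", date(2024, 5, 2), "page view")

dropped = store.purge(today=date(2024, 6, 1))
print(dropped)  # [('clickstream', 2024, 1)]
```

Because the retention class is part of the partition key, the January invoices survive the purge even though the January clickstream data is dropped. Had both classes shared one monthly partition, the purge would have had to scan and delete row by row, which is the lengthy batch job described above.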