Sharing Small Data
© 26 Apr 2012 Luther Tychonievich
Licensed under Creative Commons: CC BY-NC-ND 3.0


A common approach to reusing information, as opposed to reusing algorithms.


In Windows operating systems there’s something called the “Registry”. In Gnome there’s the “GConf”. But there’s nothing like these in KDE or OS X. What are these things, why do they exist, and why do only some systems have them?

The key idea behind the registry/gconf is that sometimes data needs to be shared between many programs. If I change the language on my computer screen, I’d like all of my programs to know that. I’d also like all of them to know what program I like to use to open .png files, and how large my screen is, and so on.

Shared data can be roughly divided into two kinds. The first is options-style data: relatively short values stored under a potentially large number of keys in an associative array. The second is data that can be lengthy and is stored in individual files. An example of the former is the preferred application for a particular file format; an example of the latter is the definition of a particular font. I’ll address only the former in this post.

How should such data be stored? Windows and Gnome say “it’s an associative array, so we’ll store it as a single associative array that every program can access.” That associative array is called the registry in Windows, and GConf in Gnome; I’ll call it simply the AA in this post. The decision to have an actual shared AA has both good and bad consequences. On the good side, all programs know where to find data, and any advances in the implementation of the AA benefit all programs. On the bad side, programs can interfere with each other through accidental interactions in the AA, and the AA often fills up with leftover entries: some placed by programs that are no longer installed, others written by a program but never read by anyone.
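To make both consequences concrete, here is a minimal sketch of a shared AA as a plain Python dictionary. It is a toy model, not the real Registry or GConf interface; the key names, the `set_value` and `get_value` helpers, and the two programs ("viewer" and "editor") are all made up for illustration.

```python
# Toy model of a shared associative array (AA).
# This is NOT the Registry or GConf API; every name here is hypothetical.
shared_aa = {}

def set_value(key, value):
    """Any program may write any key; nothing enforces ownership."""
    shared_aa[key] = value

def get_value(key, default=None):
    """Any program may read any key."""
    return shared_aa.get(key, default)

# A hypothetical image viewer registers itself as the handler for .png files.
set_value("filetypes/png/open_with", "viewer")

# A hypothetical editor, installed later, overwrites the same key.
set_value("filetypes/png/open_with", "editor")

# The editor also stores a setting that no other program ever reads.
set_value("editor/window_width", 800)

# Every program now sees the editor as the .png handler.
png_handler = get_value("filetypes/png/open_with")

# If the editor is uninstalled without cleaning up, its entries linger.
leftovers = sorted(k for k in shared_aa if k.startswith("editor/"))
```

The overwrite of `filetypes/png/open_with` is the accidental interaction, and `leftovers` is the clutter: nothing in the AA itself records who owns a key or whether anyone still needs it.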

Systems that lack a single AA fall back on a set of files in fixed locations in the directory tree that contain the same information in one form or another. This requires more disk space and wastes effort as multiple programs read and re-read the same files, but it makes the sharing of data more transparent to the user.
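The file-based alternative can be sketched with Python's standard `configparser` module. This is only an illustration under stated assumptions: the temporary directory stands in for a fixed, well-known location, and the file name `mimeapps.ini`, the `open_with` section, and both helper functions are invented for the example rather than taken from any real system.

```python
import configparser
import tempfile
from pathlib import Path

def write_settings(path, section, values):
    """Write one section of key/value pairs to an INI-style text file."""
    parser = configparser.ConfigParser()
    parser[section] = values
    with open(path, "w") as f:
        parser.write(f)

def read_setting(path, section, key):
    """Re-read the whole file on every lookup, as each program would."""
    parser = configparser.ConfigParser()
    parser.read(path)
    return parser.get(section, key)

# A temporary directory stands in for a fixed location in the directory tree.
config_dir = Path(tempfile.mkdtemp())
settings_file = config_dir / "mimeapps.ini"  # hypothetical file name

# One program records the preferred application for .png files...
write_settings(settings_file, "open_with", {"png": "viewer"})

# ...and any other program that knows the path can look it up.
png_handler = read_setting(settings_file, "open_with", "png")

# The stored data is plain text that a user can open and inspect.
file_text = settings_file.read_text()
```

Note that `read_setting` parses the file from scratch each time, which is the repeated effort mentioned above, while `file_text` is ordinary text the user can read or edit by hand, which is the transparency.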

There are other approaches to sharing data, of course. Command-line shells introduce something called an environment, which is sort of like a set of globally-visible variables. Graphical state is often accessed through subroutines designed for that purpose and provided by a particular graphical toolkit.
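The environment is only "sort of" global, and a short sketch shows why: a process hands a copy of its environment to the processes it starts, and changes made by a child do not flow back. The variable name `DEMO_LANGUAGE` is made up for this example.

```python
import os
import subprocess
import sys

# Set a variable in this process's environment (hypothetical name).
os.environ["DEMO_LANGUAGE"] = "fr"

# A child process inherits a copy of the parent's environment.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['DEMO_LANGUAGE'])"],
    capture_output=True,
    text=True,
    check=True,
)
seen_by_child = child.stdout.strip()

# A child can change its own copy, but the parent never sees the change.
subprocess.run(
    [sys.executable, "-c", "import os; os.environ['DEMO_LANGUAGE'] = 'de'"],
    check=True,
)
still_in_parent = os.environ["DEMO_LANGUAGE"]
```

Sharing here flows one way, from parent to child, which limits how much one program can disturb another compared with a single shared AA.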

As with all reuse, the main problem with all of these approaches is that the same feature that makes sharing useful makes it a potential source of conflict between programs: what one program does, another can see. The Registry, GConf, configuration files, specialized subroutines, environment variables… all are attempts to control this potential danger without removing the benefit of sharing. And, for the most part, they all work pretty well.
