IT

Donating old computers

I recently upgraded my laptop and donated my old (but still functional) one to StreetTech, a group that trains disadvantaged youths so they can obtain certifications that will get them jobs in IT. I found them through the Cristina Foundation, an organization that matches donors to groups like StreetTech.

If you are a compulsive computer shopper like me, with functional but not-quite-bleeding-edge equipment lying around gathering dust, or an IT manager in a company looking to upgrade its computer fleet, please consider donating the old machines this way rather than putting them on eBay. It’s certainly a much better way of disposing of old computers than this one in China.

Windows configuration management

The key to running a reasonably reliable Windows system is configuration management. A typical Windows installation will have tens of thousands of files and hundreds of software components installed. It’s a numbers game: the more components interacting on the system, the greater the probability that two of them will conflict.

Windows gets a lot of heat from Unix zealots (I am one myself) for being unreliable, but any operating system that attempts to comprehensively support the wide variety of oddball peripherals and software out there is going to experience the same integration problems; certainly, Linux is converging towards Windows in terms of the number of security advisories released. Of course, using an obsolete version like 98 or ME without modern protected memory is a prescription for disaster, but the NT-based versions, i.e. 2000 and XP, can have reasonable reliability, at least for desktop usage.

The rest of this article describes my strategy for minimizing entropy in my Windows systems.

Separation

The way I approach my Windows configuration is to establish a clear separation between Operating System/Applications and Data.

The OS and Applications do not mean anything special to me other than the amount of work required to reinstall. Data represent actual productive work on my part and must be protected. I separate OS/Applications from data clearly, and make regular checkpoints of OS/Apps after installation and every now and then before I make major changes like installing an application or OS service packs. If my system becomes unstable at some point in time, I can easily revert to a known stable configuration.

The specific tools used to provide this backup of system configuration are a question of personal choice. A number of commercial programs like Roxio GoBack or Powerquest SecondChance (since discontinued) purport to do this, as does Windows XP.

I personally don’t trust these programs all that much, and prefer to make a total backup of my system using Norton Ghost. To ensure my data is not erased when I restore from a Ghost image, I have at least two partitions on each of my systems:

  1. C: for Windows and applications (NTFS)
  2. D: for my personal data (NTFS on desktops, FAT32 on laptops)
  3. I: for my Ghost images on desktops (FAT32), on a different drive than C: so I can survive a drive failure

That way I can destroy C: at my leisure. In the worst case, I will have to reinstall a couple of applications and reapply some settings that were lost since the last release. My data sits safely on the D: partition (and backups).

Backup strategy

I don’t trust CD-R media or removable drive cartridges for backup purposes, and tape is either too slow or too expensive in the case of DLT. I keep full duplicates of my data partition and some Ghost images on a pair of 100GB external FireWire drives, one I keep at home and one at work. I rotate them weekly so even if my house burns down I will have lost in the worst case only a week of work or photos.

Limitations of this method

This technique doesn’t work very well if the underlying hardware configuration changes too often, and assumes a linear install history. If I install software A, then B, then C, I can go back from A+B+C to A+B or A but not to A+C.
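The linear-history constraint can be illustrated with a minimal Python sketch. The function name and the representation of an install history as a list are assumptions made for illustration; they are not part of Ghost or any other imaging tool.

```python
def restorable_states(install_history):
    """Return the configurations reachable by restoring a saved image.

    Assumes (hypothetically) that one image was taken after each install,
    so every restorable state is a prefix of the install history.
    """
    return [
        tuple(install_history[:count])
        for count in range(1, len(install_history) + 1)
    ]


history = ["A", "B", "C"]
states = restorable_states(history)

# Every state is a prefix of the history: A, then A+B, then A+B+C.
print(states)

# A+C was never a state of the machine, so no image of it exists.
print(("A", "C") in states)
```

Getting to A+C means restoring the image taken after A and installing C again by hand.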

How-to

This section shows how to extricate the data from the OS and applications, which Windows and most apps usually try to commingle. The TweakUI utility from Microsoft is an absolute must-have. It is a control panel that allows you to change the behavior of the OS in vital ways that are not accessible otherwise short of editing the Registry directly.

Outlook

Outlook files are the single largest data files on my system (Ghost images do not count). By default, Outlook will create its PST file in the Documents and Settings directory. You can either relocate this directory to the D: partition, or create a new PST file in a location of your choice and use Advanced properties in the properties dialog for the PST in Outlook to make it the default location for POP delivery, after which you can close the old one and delete it.

My Documents

My Documents and its derivatives, like “My Pictures”, can be relocated to D: using TweakUI.

Favorites

For IE users, TweakUI allows you to relocate the Favorites directory to another place than the default. This way, you will not lose your favorites when you have to restore your system.

For Netscape/Mozilla users, the Profile manager utility allows you to set up a new profile with files that are stored where you want, e.g. on the D: partition.

Peer-to-peer collaborative spam filtering

An interesting product from a young company called Cloudmark addresses the spam explosion.

It works as an Outlook add-in that allows you to flag a message as spam and uploads a signature to the network, and thus helps other Cloudmark users to block the spam, in effect acting as a distributed peer-to-peer Brightmail.

It remains to be seen whether this system will be resistant to denial of service or poisoning attacks.

Update (2002-07-10): I’ve been trying it out for three weeks, and so far it looks pretty good. Out of 353 spam messages I’ve received, it successfully blocked 233. It also gave 3 false positives from permission marketing companies (Art.com and MyPoints), which is not absurd as they have very poor opt-out management. But it also flagged an IEEE newsletter as spam, which seems a little bit excessive. So, use with caution.
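As a quick check on the figures above, here is the catch rate worked out in Python. It counts only the spam messages and leaves the false positives aside, since the total number of legitimate messages received is not given.

```python
spam_received = 353  # spam messages received over the three weeks
spam_blocked = 233   # of those, the number the filter caught

catch_rate = spam_blocked / spam_received
print(f"Catch rate: {catch_rate:.1%}")
```

That is roughly two spam messages out of three.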

Rules of thumb

Typography

The optimal line length for readability is around 10-12 words (Source: Ruari McLean)

Telecoms, networking and IT

The ratio of peak load to average load in a service with diurnal activity variations is approximately 3 to 1. Source: my own empirical observation from Wanadoo access logs and France Telecom telephone call usage logs.
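As a minimal sketch of how this rule of thumb could be used for capacity planning: the function names and the daily request figure below are hypothetical, and the 3-to-1 ratio is the empirical estimate quoted above, not a guarantee.

```python
PEAK_TO_AVERAGE = 3.0  # rule-of-thumb ratio from the text


def average_from_daily_total(total_per_day, seconds_per_day=86_400):
    """Average load per second, given a total count over one day."""
    return total_per_day / seconds_per_day


def peak_load(average_load, ratio=PEAK_TO_AVERAGE):
    """Estimated peak load for a service with diurnal variations."""
    return average_load * ratio


daily_requests = 8_640_000  # hypothetical traffic figure
average = average_from_daily_total(daily_requests)
print(f"average: {average:.0f} req/s, estimated peak: {peak_load(average):.0f} req/s")
```

Sizing for the daily average alone would leave such a service short by a factor of three at its busiest hour.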

Probability of a Web page having X incoming links referring to it: P is proportional to X ^ -2.1 (Source)
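Read as a proportionality rather than an exact equality, this power law can be turned into a proper probability distribution by normalizing it numerically. The sketch below does that. Cutting the distribution off at a maximum link count and leaving out pages with zero incoming links are assumptions made here to keep the sum finite; they are not part of the cited result.

```python
EXPONENT = 2.1  # exponent quoted in the text


def link_count_distribution(max_links, exponent=EXPONENT):
    """Normalized P(X = x) for x = 1..max_links, with P proportional to x ** -exponent."""
    weights = [x ** -exponent for x in range(1, max_links + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


probs = link_count_distribution(10_000)  # hypothetical cutoff
print(f"P(1 link)    = {probs[0]:.3f}")
print(f"P(10 links)  = {probs[9]:.5f}")
print(f"P(1) / P(10) = {probs[0] / probs[9]:.1f}")
```

The ratio between any two link counts depends only on the exponent, so a page with 10 incoming links is 10 ^ 2.1 times rarer than a page with one, whatever the cutoff.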

When specifying computers, for balanced performance provision one gigabyte of RAM per gigahertz per core/thread.

Any standard making use of ASN.1 is a piece of junk.

You only get the benefits of statistical multiplexing or compression once, and it should only be done in one layer. Any other layers attempting to do the same only add cost, complexity, brittleness, overhead and latency.

When designing high-availability systems, fail-over is not the hard part, falling back is.

Photography

For most ordinary lenses, optimal sharpness is around f/8. For high-quality lenses, it is one or two stops below full aperture. Only the very best lenses are diffraction-limited and offer optimal performance at full aperture.

Camera light meters are calibrated for 12% gray. Common gray cards are 18% gray, so if you use one for metering, you should open up one half stop to compensate. (Source)
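The half-stop figure follows from the fact that one stop is a factor of two in light, so the difference between two reflectances in stops is the base-2 logarithm of their ratio. A short Python check (the function name is illustrative):

```python
import math


def stops_between(reflectance_a, reflectance_b):
    """Exposure difference in stops between two reflectances."""
    return math.log2(reflectance_a / reflectance_b)


# 18% gray card versus the 12% gray the meter is calibrated for.
print(f"{stops_between(0.18, 0.12):.2f} stops")
```

The result is a little under 0.6 stops, which rounds to the half stop quoted above.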

The human eye is a 6-7 megapixel sensor. The monocular field of view is 180°, the binocular field of view is 120-140°, and the normal focus of attention spans a 45° field of view.

Avoid Kodak products like the plague. Those products they make that are actually decent (i.e. the engineers managed to sneak them past the bean counters) soon get adulterated (like Tri-X) or discontinued (like PhotoCD or their medium-format digital backs). Prefer Fuji, Agfa or Ilford.

Fighting spam

Spam has become a global scourge: its sheer volume is reducing the signal-to-noise ratio of email. While being careful with email addresses (using throwaway Yahoo or Hotmail accounts, for instance, when posting online) goes some way to minimizing the volume of spam, it doesn’t remove it altogether.

Bob Metcalfe, the inventor of Ethernet, postulated what is now known as Metcalfe’s law: the value of a network is proportional to the square of the number of users. The flaw in this law is that it does not take into account a law of diminishing returns: once all of your acquaintances are on the network, each additional user adds only very little value, whereas each additional bad apple destroys a constant value due to the time they waste, and thus, even if bad apples are a small minority, they will eventually drive the value of the network down in a sort of tragedy of the commons. So the value of an email network is going to be some constant times the number of your acquaintances minus the number of spammers. At some tipping point, the rising number of spammers will make this value negative.
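The argument above can be written down as a small Python model. This is only a sketch: the two constants (the value one acquaintance adds and the cost one spammer imposes) are hypothetical parameters, and the sample figures are made up for illustration.

```python
def network_value(acquaintances, spammers,
                  value_per_acquaintance=1.0, cost_per_spammer=1.0):
    """Value of the email network to one user under the linear model above."""
    return value_per_acquaintance * acquaintances - cost_per_spammer * spammers


def tipping_point(acquaintances,
                  value_per_acquaintance=1.0, cost_per_spammer=1.0):
    """Number of spammers at which the network's value drops to zero."""
    return value_per_acquaintance * acquaintances / cost_per_spammer


# Hypothetical figures: 200 acquaintances worth 10 units each,
# and each spammer costing 0.5 units of wasted time.
print(tipping_point(200, 10.0, 0.5))
print(network_value(200, 5000, 10.0, 0.5))
```

Because the number of acquaintances is bounded while the number of spammers is not, the value eventually turns negative for any choice of positive constants.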

What can be done?

Brightmail

Brightmail is a company that sells spam filtering services to ISPs and large corporations. They basically set up unused email addresses and spread them around where they can be picked up by spammers’ email address gathering robots: search engines, newsgroups. Any email that goes to such an address is bound to be spam. Brightmail monitors these mailboxes and whenever they find a new piece of spam, they create a filter specifically for it. If this is done sufficiently quickly, they can nip a mass mailing in the bud before it has had the time to hit too many mailboxes. The system is also very reliable and unlikely to cause false positives (a legitimate email being flagged as spam). Unfortunately, this is very labor intensive and thus costly, and will be limited to those with deep pockets.
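The decoy-address idea can be sketched in a few lines of Python. This is not Brightmail’s actual implementation, which the text does not describe: the class and function names are hypothetical, and an exact SHA-256 hash of the whitespace-normalized body stands in for whatever filters their staff write by hand.

```python
import hashlib


def fingerprint(body):
    """Hash of the message body after lowercasing and collapsing whitespace."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DecoyFilter:
    """Learns spam from mail sent to decoy addresses nobody legitimately uses."""

    def __init__(self, decoy_addresses):
        self.decoy_addresses = set(decoy_addresses)
        self.spam_fingerprints = set()

    def observe(self, recipient, body):
        """Record the message as spam if it was sent to a decoy address."""
        if recipient in self.decoy_addresses:
            self.spam_fingerprints.add(fingerprint(body))

    def is_spam(self, body):
        """True if an identical message was already seen at a decoy address."""
        return fingerprint(body) in self.spam_fingerprints


decoys = DecoyFilter({"decoy@example.invalid"})  # hypothetical decoy address
decoys.observe("decoy@example.invalid", "Make money FAST!!!")
print(decoys.is_spam("make   money fast!!!"))
print(decoys.is_spam("Lunch on Thursday?"))
```

An exact hash is trivially defeated by a spammer who varies each copy of a message, which is one reason a real service needs more robust, and more labor-intensive, filters.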

A company called Cloudmark, founded by an ex-employee of Napster, offers what is essentially a peer-to-peer distributed version of Brightmail. It remains to be seen how resistant that system can be to denial of service attacks.

Legislation

Legislation against spam should be introduced, but is only a long-term solution as spammers will simply relocate to countries without anti-spam laws. Even laws against common crimes like theft are not that well enforced across borders due to the cumbersome procedures involved with Interpol or international judiciary cooperation.

Pricing

The reason spammers can blast away hundreds of thousands or even millions of emails is that the marginal cost to them is practically nil. Some people have advocated putting a per-email charge to make spamming economically no longer viable. I have been responsible for building large-scale billing systems at Wanadoo, France’s largest ISP, and I can tell you building a billing system on the scale of the whole Internet is simply not feasible from a project management point of view.

Even if it were, it would not be desirable because in many ways it would be throwing the baby out with the bathwater. Internet email is successful because it is so cheap, unlike the price-gouging of earlier messaging systems like EDI or X.400. Andrew Odlyzko has written a series of very persuasive papers that show how usage pricing stunts the development of networks and thus prevents society from realizing their full benefits: http://www.dtc.umn.edu/~odlyzko/doc/networks.html

Certification

The main problem with spammers is they are anonymous, and that Internet email with its limited support for cryptographically strong authentication makes it easy for them to hide. S/MIME or OpenPGP signatures are not very commonly deployed because they are cumbersome and this outweighs their advantages (national security agencies also dislike anything that makes crypto more commonplace, but that is another story).

Spammers, however, make digital signatures more attractive by increasing the cost of not using them. I believe when the tipping point I mentioned above is reached, people will only accept email that is signed by someone they already know (someone who is already in their address book) or by someone whose signature is certified by a trusted third party not to be a spammer (probably the same companies that sell SSL certificates that make electronic commerce possible, Verisign being the most commonly known of them).
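The acceptance rule described above can be sketched as a small Python policy function. Everything here is an illustrative assumption: the type and field names are hypothetical, and the cryptographic verification of the S/MIME or OpenPGP signature is assumed to have happened already, so the code only decides what to do with an already verified identity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SignedMessage:
    signer: Optional[str]     # verified signer address, or None if unsigned
    certifier: Optional[str]  # third party vouching for the signer, if any


def accept(message, address_book, trusted_certifiers):
    """Accept mail signed by a known contact or by a certified stranger."""
    if message.signer is None:
        return False  # unsigned mail is rejected outright
    if message.signer in address_book:
        return True   # signed by someone already known
    return message.certifier in trusted_certifiers


address_book = {"alice@example.invalid"}     # hypothetical contact
trusted_certifiers = {"Example Trusted CA"}  # hypothetical certifier

print(accept(SignedMessage("alice@example.invalid", None),
             address_book, trusted_certifiers))
print(accept(SignedMessage("stranger@example.invalid", "Example Trusted CA"),
             address_book, trusted_certifiers))
print(accept(SignedMessage(None, None), address_book, trusted_certifiers))
```

The policy itself is simple; the hard parts are deploying signatures widely and deciding which certifiers to trust.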