Mylos

Effective anti-spam enforcement

The European Union E-Privacy directive of 2002, the US CAN-SPAM act of 2003 and other anti-spam laws allow legal action against spammers. Only official authorities can initiate action (although there are proposals to set up a bounty system in the US), but enforceability of these statutes is a problem, as investigations and prosecutions are prohibitively expensive, and both law enforcement and prosecutors have other pressing priorities contending for finite resources. Financial investigative techniques (following the money trail) that can be deployed against terrorists, drug dealers and money launderers are overkill for spammers, and would probably raise civil liberties issues.

There is an option that could dramatically streamline anti-spam enforcement, however. Spammers have to find a way to get paid, and payment is usually tendered using a credit card. Visa and Mastercard both have systems by which a temporary, one-time use credit card number can be generated. This service is used mostly to assuage the fears of online shoppers, but it also provides a solution.

Visa and Mastercard could offer an interface that would allow FTC investigators and their European counterparts to generate “poisoned” credit card numbers. Any merchant account that attempts a transaction using such a number would be immediately frozen and its balance forfeited. Visa and Mastercard’s costs could be defrayed by giving them a portion of the confiscated proceeds.

Of course, proper judicial oversight would have to be provided, but this is a relatively simple way to nip the spam problem in the bud, by hitting spammers where it hurts most – in the pocketbook.

Shutterbabe

Deborah Copaken Kogan

Random House, ISBN: 0375758682

I picked up the hardcover edition of this book from the sale bin at Stacey’s Booksellers, as the Leica on the cover just beckoned to me.

This is an autobiography by an American woman, almost a girl, who moved to Paris, fresh out of college, to break into the tightly-knit (and not a little macho) community of photojournalists. Who knows, I might even have crossed paths with her when I studied in Paris. She was certainly not the first female war correspondent (Margaret Bourke-White springs to mind, even though she is not referred to anywhere in the book), but women were still a rarity, especially one as young and inexperienced. She started as a freelancer and eventually ended up working for the Gamma agency, one of the few independent photo agencies left.

For some unknown reason, many of the prestigious photo press agencies are based in Paris, starting with Magnum, founded simultaneously in Paris and New York by Robert Capa (the man who took the best-known photographs of the D-Day landings), Henri Cartier-Bresson, George Rodger and David “Chim” Seymour. Others like Gamma, Sygma and Sipa followed, but most have been acquired since by large media conglomerates like Bill Gates’ Corbis. The move to digital, with the corresponding explosion in equipment costs, is one reason: the independent agencies simply couldn’t compete with wire services like Reuters or Agence France-Presse (AFP), the latter being government-subsidized. Saturation is probably another, and press photographers struggle to make a living in a world with no shortage of wannabes. Just read the Digital Journalist if you are not convinced.

Shutterbabe is not a mere feminist screed, however. Engagingly written, with very candid (sometimes too candid) descriptions of the sexual hijinks and penurious squalor behind her trade, this book is a pleasurable read and features a varied rogues’ gallery ranging from the cad (her first partner) to the tragically earnest (her classmate who is executed by Iraqi soldiers while covering Kurdish refugees). It only touches in passing on photographic technique, as the general public was clearly the intended audience, but more surprisingly, does not include that many of her photos either. The main thread reads like a coming-of-age story, with the young (25-year-old at the time) woman moving on from her thrill-seeking ways and discovering true love and marriage in a life marked by death: deaths of friends and colleagues, victims of strife and war in Afghanistan or Russia, but also orphans dying of neglect in Romania.

A photojournalist is always in a rush to get to the next assignment, and she recognizes her involvement with her subjects’ culture as superficial, unlike that of her locally based correspondent colleagues or those who would nowadays be called photoethnographers. There is more humanity in a single frame by Karen Nakamura or Dorothea Lange than in all of Deborah Copaken’s work. Much like her idol Cartier-Bresson’s work, there is a certain glib coldness, perhaps even callousness to her attitude. On her first war coverage, an Afghan who is escorting her (so she can make her ablutions in privacy) has his leg blown off by a landmine, and she hardly shows any concern for the poor soul. Granted, this is the “Shutterbabe”, not the reborn Mom, but it is hard to imagine one’s fundamental personality changing that much.

The author is not uncontroversial. She featured in a nasty spat with Jim Nachtwey, one of the most famous photographers alive, who is obliquely referred to in Shutterbabe’s Romanian chapter (where she implies she found out first about the terrible situation in the orphanages, and nobly tipped him so the story could come out). The follow-ups are here and here.

Her observations of the one culture she is immersed in, the French one, seldom go beyond the realm of cliché. Glamorous but feckless and chauvinistic Frenchmen! Sexpot Frenchwomen! Narcissistic French intellectuals!

In the end, she returns to the United States with her husband, and moves into an equally short-lived career in TV production to support her family. A happy ending? One hopes. I for one am curious about how her children will react to the book when they are old enough to read it.

Why IPv6 will not loosen IP address allocation

The current version of Internet Protocol (IP), the communications protocol underlying the Internet, is version 4. In IPv4, the address of any machine on the Internet, whether a client or a server, is encoded in 4 bytes. Due to various overheads, the total number of addresses available for use is far lower than the theoretical 4 billion possible. This is leading to a worldwide crunch in the availability of addresses, and rationing is in effect, especially in Asia, which came late to the Internet party and has a short allocation (Stanford University has more IPv4 addresses allocated to it than the whole of China).

Internet Protocol version 6, IPv6, quadrupled the size of the address field to 16 bytes, i.e. unlimited for all practical purposes, and made various other improvements. Unfortunately, its authors severely underestimated the complexity of migrating from IPv4 to IPv6, which is why it hasn’t caught on as quickly as it should have, even though the new protocol is almost a decade old now. Asian countries are leading in IPv6 adoption, simply because they have no choice. Many people make do today with Network Address Translation (NAT), where a box (like a DSL router) allows several machines to share a single global IP address, but this is not an ideal solution, and it only postpones the inevitable (but not imminent) reckoning.
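To put the two address sizes in perspective, here is a quick back-of-the-envelope calculation in Python. It only counts the raw combinations of a 4-byte and a 16-byte field and ignores the overheads mentioned earlier, so the usable figures are lower. The constant and variable names are purely illustrative.

```python
# Raw address-space sizes implied by the two address field widths.
IPV4_BYTES = 4
IPV6_BYTES = 16

ipv4_total = 2 ** (8 * IPV4_BYTES)   # 32-bit addresses
ipv6_total = 2 ** (8 * IPV6_BYTES)   # 128-bit addresses

print(f"IPv4: {ipv4_total:,} addresses")      # IPv4: 4,294,967,296 addresses
print(f"IPv6: {ipv6_total:.3e} addresses")    # IPv6: 3.403e+38 addresses

# Quadrupling the field width does not quadruple the space: it multiplies it by 2**96.
growth_bits = 8 * (IPV6_BYTES - IPV4_BYTES)
print(f"IPv6 is 2**{growth_bits} times larger than IPv4")
```

The last line is the point worth remembering: every IPv4 address could be replaced by an address space 2**96 times its size.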

One misconception, however, is that the slow pace of the migration is somehow related to the fact that you get your IP addresses from your ISP, and don’t “own” them or have the option to port them the way you now can with your fixed or mobile phone numbers. While IPv6 greatly increases the number of addresses available for assignment, this will not change the way addresses are allocated, for reasons unrelated to the address space crunch.

First of all, nothing precludes anyone from requesting an IPv4 address directly from the registry in charge of their continent:

  • ARIN in North America and Africa south of the Equator
  • LACNIC for Latin America and the Caribbean
  • RIPE (my former neighbors in Amsterdam) for Europe, Africa north of the Equator, and Central Asia
  • APNIC for the rest of Asia and the Pacific.

That said, these registries take the IP address shortage seriously and will require justification to grant the request. Apart from ISPs, the other main kind of allocation recipients are large organizations that require significant numbers of IP addresses (e.g. for a corporate Intranet) and that will use multiple ISPs for their Internet connectivity.

The reason why IP addresses are allocated mostly through ISPs is the stability of the routing protocols used by ISPs to provide global IP connectivity. The Internet is a federation of independent networks that agree to exchange traffic, either for free (peering) or for a fee (transit). Each of these networks is called an “Autonomous System” (AS) and has an AS number (ASN) assigned to it. ASNs are coded in 16 bits, so there are only 65536 available to begin with.

When your IP packets go from your machine to their destination, they will first go through your ISP’s routers to your ISP’s border gateway that connects to other transit or final destination ISPs leading to your destination. There usually are an order of magnitude or two fewer border routers than interior routers. The interior routers do not need much intelligence: all they need to know is how to get their packets to the border. The border routers, on the other hand, need to have a map of the entire Internet. For each block of possible destination IP addresses, they need to know which next-hop ISP to forward the packet on to. Border routers exchange routing information using the Border Gateway Protocol, version 4 (BGP4).
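As a rough illustration of what "knowing which next-hop ISP to forward to" means, here is a minimal sketch using Python's standard ipaddress module. The routing table, the neighbor names and the next_hop function are all hypothetical, and the addresses come from ranges reserved for documentation. The sketch applies the usual IP forwarding rule that the most specific (longest) matching prefix wins. A real router uses specialized data structures and hardware rather than a linear scan, and this is not how BGP4 itself exchanges routes.

```python
import ipaddress

# Hypothetical forwarding table: destination block -> next-hop neighbor.
ROUTES = {
    ipaddress.ip_network("0.0.0.0/0"): "transit-provider",   # default route
    ipaddress.ip_network("198.51.100.0/24"): "peer-a",
    ipaddress.ip_network("198.51.100.128/25"): "peer-b",      # more specific block
}

def next_hop(destination):
    """Return the neighbor for the longest prefix that contains destination."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in ROUTES if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]

for dest in ("198.51.100.5", "198.51.100.200", "203.0.113.9"):
    print(dest, "->", next_hop(dest))
```

Every entry in such a table costs memory and lookup time on each border router, which is why the number of entries matters so much in what follows.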

BGP4 is in many ways black magic. Any mistake in BGP configuration can break connectivity or otherwise impair the stability of vast swathes of the Internet. Very few vendors know how to make reliable and stable implementations of BGP4 (Cisco and Juniper are the only two really trusted to get it right), and very few network engineers have real-world experience with BGP4, learned mostly through apprenticeship. BGP4 in the real scary world of the Internet is very different from the safe and stable confines of a Cisco certification lab. The BGP administrators worldwide are a very tightly knit cadre of professionals, who gather in organizations like NANOG and shepherd the Net.

The state of the art in exterior routing protocols like BGP4 has not markedly improved in recent years, and the current state of the art in core router technology just barely keeps up with the fluctuations in BGP. One of the control factors is the total size of BGP routing tables, which has been steadily increasing as the Internet expands (but no longer exponentially, as was the case in the early days). The bigger the routing tables, the more memory has to be added to each and every border router on the planet, and the slower route lookups will be. For this reason, network engineers are rightly paranoid about keeping routing tables small. Their main weapon consists of aggregating blocks of IP addresses that should be forwarded the same way, so they take up only one slot.

Now assume every Internet user on the planet has his own IP address that is completely portable. The size of the routing tables would explode from 200,000 or so today to hundreds of millions. Every time someone logged on to a dialup connection, every core router on the planet would have to be informed, and they would simply collapse under the sheer volume of routing information overhead, and not have the time to forward actual data packets.

This is the reason why IP addresses will continue to be assigned by your ISP: doing it this way allows your ISP to aggregate all its IP addresses in a single block, and send a single route to all its partners. Upstream transit ISPs do even more aggregation, and keep the routing tables to a manageable size. The discipline introduced by the regional registries and ISPs is precisely what changed the exponential trend in routing table growth (one which even Moore’s law would not be able to keep up with) to a linear one.
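The effect of aggregation can be sketched with the standard library's ipaddress.collapse_addresses function. The address blocks below are made up for illustration (private and documentation ranges), as are the variable names. The sketch shows eight contiguous customer blocks collapsing into a single announced route, and what happens when one customer brings along a "portable" block from somewhere else.

```python
import ipaddress

# Hypothetical ISP with eight contiguous customer blocks.
customer_blocks = [ipaddress.ip_network(f"10.20.{i}.0/24") for i in range(8)]

# collapse_addresses merges adjacent networks into the fewest covering prefixes.
announced = list(ipaddress.collapse_addresses(customer_blocks))
print(f"{len(customer_blocks)} internal blocks -> {len(announced)} announced route(s)")
for net in announced:
    print("  ", net)

# A customer who keeps a portable block from elsewhere cannot be aggregated:
portable = ipaddress.ip_network("192.0.2.0/24")
with_portable = list(ipaddress.collapse_addresses(customer_blocks + [portable]))
print(f"with one portable block -> {len(with_portable)} announced route(s)")
```

Each portable block adds one more entry that every border router has to carry, which is the scaling problem described above in miniature.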

This requirement is not anti-competitive, unlike telcos dragging their feet on number portability: the DNS was created precisely so that users would not have to deal with IP addresses, and DNS records can easily be updated to point to new addresses when the IP addresses change.

Threadframe: multithreaded stack frame extraction for Python

Note: threadframe is obsolete. Python 2.5 and later include a function sys._current_frames() that does the same thing. Threadframe is only useful for Python 2.2 through 2.4.

Rationale

I was encountering deadlocks in a multi-threaded CORBA server (implemented using omniORB). Debugging using GDB gave me information that was too low-level, and what I needed was an equivalent of the GDB command “info threads”. There was no such facility available from within Python’s standard library, so I rolled my own.

David Beazley added advanced debugging functions to the Python interpreter, and they have been folded into the 2.2 release.

I used these hooks to build a debugging module that is useful when you are looking for deadlocks in a multithreaded application. It basically has a single function that will return a list of the stack frames for all Python interpreter threads in the process.

In Python 2.3, Guido van Rossum added the thread ID to the interpreter state structure, which allows us to produce a dictionary mapping thread IDs to frames.

This functionality is now integrated in Python 2.5’s batteries-included sys._current_frames() function.
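For readers on a current interpreter, here is a minimal sketch of the same idea built on sys._current_frames(). It is not the threadframe module's code: the dump_thread_stacks helper and the thread names are hypothetical, and the example assumes Python 3 (for f-strings and threading.get_ident).

```python
import sys
import threading
import traceback

def dump_thread_stacks():
    """Map each thread ID to a (thread name, formatted stack lines) pair."""
    names = {t.ident: t.name for t in threading.enumerate()}
    report = {}
    for thread_id, frame in sys._current_frames().items():
        stack = traceback.format_stack(frame)
        report[thread_id] = (names.get(thread_id, "unknown"), stack)
    return report

if __name__ == "__main__":
    # Simulate a stuck thread by having it wait on an event that is never set.
    blocker = threading.Event()
    worker = threading.Thread(target=blocker.wait, name="stuck-worker", daemon=True)
    worker.start()

    for thread_id, (name, stack) in dump_thread_stacks().items():
        print(f"Thread {thread_id} ({name}):")
        print("".join(stack))

    blocker.set()  # release the worker so the program exits cleanly
```

When hunting a deadlock, the interesting part of the output is the last frame of each thread, which shows the lock or call it is blocked on.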

Of course, I disclaim any liability if this code should crash your system, erase your homework, eat your dog (who also ate your homework) or otherwise have any undesirable effect.

Building and installing

Python 2.2 or later is required. Thread ID to frame dictionary extraction is only available in Python 2.3 and later, and will generate a NotImplementedError if used from 2.2.

Download the source tarball threadframe-0.2.tar.gz. You can build it using the Makefile or directly with the setup.py script. I have built and tested this only on Solaris 8/x86 and Windows 2000, but the code should be pretty portable. There is a small test program test.py that illustrates how to use this module to dump stack frames of all the Python interpreter threads. A sample run is available for your perusal.

For Windows users, pre-compiled binaries are available, built using Mingw32 and GCC 2.95.2. Just copy the file threadframe.pyd to any location on your Python path and you should be able to run the test script test.py.

Windows binaries

Python version    Download
2.2.1             threadframe.pyd
2.3.4             threadframe.pyd
2.4.x             threadframe.pyd

License

This code is licensed under the same terms as Python itself.

Change history

Release 0.2 (2004-06-10)

Distutils based setup.py contributed by Bob Ippolito. Bob also noticed that thread_id was added to the Python interpreter state, and contributed a patch to get a dictionary mapping thread_ids to frames instead of a list.

Release 0.1 (2002-10-11)

Initial release for Python 2.2: threadframe-0.1.tar.gz