Just Set up a Tor Relay

June 20th, 2009

I’ve just set up a Tor relay in the hope that every extra little bit of bandwidth will help the protesters in Iran preparing for today’s big rally in Enghelab Square, Tehran. If you’re technically minded and can spare some bandwidth then please consider doing the same. If you’re not technically minded and want a simple, fictionalised and very readable introduction to this kind of technology and why it matters, consider Cory Doctorow’s book Little Brother.

The Manuscript Found in Saragossa

March 1st, 2009

I’ve recently finished Jan Potocki’s “The Manuscript Found in Saragossa”, a fantastical novel written between 1800 and 1815 that consists of a series of stories within stories told over a period of sixty-six days in the manner of Arabian Nights or The Decameron.

The principal narrator, Alphonse van Worden, is the son of a man so pathologically obsessed with the finer points of honour manifested in the “tribunal of blood” that he thinks nothing of fighting a dozen duels in a day and punctiliously records the history of each in his notebook. His mother takes aristocratic traits to a similarly absurd degree. Having decided that the French are beneath her, she endures her stay in Paris by maintaining an absolute disdain: “She made it a rule not only not to learn French but also never to listen to it when it was spoken.”

Abandoned by his valet and muleteer while traversing the mountain range of Sierra Morena, a land rumoured to be inhabited by smugglers, bandits, murderous gypsies and terrifying ghosts, Alphonse finds himself bewitched by a pair of beautiful women who may be his cousins or may be succubi whose mysterious appearances and disappearances are woven into a narrative composed of encounters with a wide range of characters united in their love of storytelling.

Stories interrupt stories in Tristram Shandy-esque digressions as each narrative introduces further characters who in turn narrate their own tales, the stories recursing back in on themselves until narratives are four or five levels deep and the listeners at the outer level announce themselves as confused as the reader would be in danger of becoming had Potocki not exercised considerable skill in managing the various threads.

The book’s end is slightly disappointing, perhaps inevitably since the beauty of the book lies in the digressions not in the forward impulse of the underlying plot, but the charm and humour of the stories carry the day.

Information is Not Knowledge

February 4th, 2009

Spent all day in meetings about the proposed ‘Bad Bank’ and kept wondering what their uniform might look like.

Alistair Darling

havent signed tonys card yet, can’t think of anything funny to put

Gordon Brown

Maybe it’s an age thing; perhaps I’ve already crossed that subtle threshold after which you become unable to understand the appeal of new fads - but I don’t understand the purpose of Twitter. The spoof tweets allegedly from the men currently crash landing the British economy are funny because of the frighteningly plausible, Pooteresque banality of their thoughts. The tweets of a nobody, however, lack such saving irony.

There’s a fine line between spontaneous and knee-jerk, between wit and bigotry, between the simple statement of fact and the simplification that distorts the truth.

We live at a time when politicians clash horns using sound bites and real policy is rarely debated, when newspapers and other media channels uncritically repeat information known to be false (“we only use ten percent of our brains”, “hair and fingernails continue to grow after death”, “house prices always go up in value”). It’s an era uniquely rich in data about the natural world and yet culturally we lack the critical faculties to evaluate this information, making us ripe pickings for every charlatan who appears on television dressed as an expert.

In such an age, do we really want our thoughts to be restricted to what can be expressed in 140 characters?

C++ Exception Safety Guarantee

January 14th, 2009

After stumbling on this one in a recent telephone interview, I thought I’d refresh my memory.

The following from Anthony Williams’s ACCU Overload Journal article last August provides a succinct summary of the topic.

The Abrahams Exception Safety Guarantee

These guarantees were first documented by Dave Abrahams when the C++ Standards committee were working on the 1998 C++ Standard. The idea is that code should provide one of the three guarantees - if it doesn’t, then an exception occurring in your code will result in leaked resources or corrupt data structures or both. The guarantees are:

The no-fail (or no-throw) guarantee

This is the strongest of all guarantees. A function that provides this guarantee will not throw any exceptions, and will not fail. All destructors should provide this guarantee, as should important operations like swap which provide the building blocks for the code that uses them to provide suitable exception safety guarantees.

The strong guarantee

A function that provides this guarantee is all or nothing: if it fails, then any effects are rolled back so the state of the data structure is the same as it was on entry. This requires that the function doesn’t do anything irreversible (like perform I/O), and that there are suitable operations that provide the no-fail guarantee which can be used to commit or roll back the changes.

The basic guarantee

This is the basic level you should strive for in all code: if a function fails, then it must leave the data structures in a valid state, even if that state differs from the original. For example, failure to insert a new item into a container must leave the container in a valid state, even if all the existing items have been deleted.

Any code that doesn’t provide even the basic guarantee is not exception safe.

Exceptions Make for Elegant Code, ACCU Overload Journal #86, August 2008

Writing exception safe code is hard. If you have any doubts about that statement consider the following challenge from Herb Sutter: Guru of the Week 8. In Exceptional C++, Sutter expands on his earlier post and provides the following guidelines -

Observe the canonical exception-safety rules: (1) Never allow an exception to escape from a destructor or from an overloaded operator delete() or operator delete[](); write every destructor and deallocation function as though it had an exception specification of “throw()”. (2) Always use the “resource acquisition is initialization” idiom to isolate resource ownership and management. (3) In each function, take all the code that might emit an exception and do all that work safely off to the side. Only then, when you know that the real work has succeeded, should you modify the program state (and clean up) using only nonthrowing operations.

The C++ Standard guarantees that no destructor defined in the Standard Library itself will throw an exception. It does not, however, enforce this requirement for all C++ code, so your compiler will not prevent you from creating a class that breaks the golden rule: never throw exceptions from a destructor.
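Sutter’s third guideline is essentially a recipe for the strong guarantee, and the idea is not specific to C++. As a rough illustration only (in Python rather than C++, with a made-up Inventory class that appears in none of the sources quoted above), all the work that might raise is done on a private copy, and the object’s state is replaced only once that work has succeeded:

```python
class Inventory:
    """Toy container (hypothetical) illustrating the strong guarantee."""

    def __init__(self):
        self._items = {}

    def add_all(self, new_items):
        # Do everything that might raise "off to the side", on a copy.
        staged = dict(self._items)
        for name, quantity in new_items:
            if quantity <= 0:
                raise ValueError("quantity must be positive: %r" % (name,))
            staged[name] = staged.get(name, 0) + quantity
        # Commit step: a single rebinding, so no partial update is visible.
        self._items = staged

    def snapshot(self):
        # Return a copy so callers cannot modify the internal state.
        return dict(self._items)
```

If add_all raises part way through a batch, the inventory is left exactly as it was on entry; updating self._items in place inside the loop would offer only the basic guarantee.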

Anyone looking for more information should consult the links above and also look at Abrahams’s essay: Exception-Safety in Generic Components: Lessons Learned from Specifying Exception-Safety for the C++ Standard Library.

Editing and Removing Pages from PDF Documents

January 11th, 2009

I’ve been using Google Docs to store and edit my CV, but I’ve recently run into a number of problems when exporting the file.

Word documents are invariably mangled. This is a common problem because Word format is a de facto standard by virtue of the number of companies using Microsoft Word but formatting varies from version to version, platform to platform so it’s not really a standard at all. It’s more frustrating than web development at times. You spend hours laying out your CV on a Mac only to find that it looks like hell when opened and printed on a PC.

Recruiters and employers that use text-processing algorithms to assess candidates for positions hate PDF documents but for anyone who cares about presentation and wants to guarantee that their potential employers see their CV exactly as they intended, there is no other choice.

Unfortunately, Google Docs insists on adding a line feed to the last line of every document and when you export the file as a PDF this can result in a blank page being appended to the end.

Fortunately, you can edit PDFs on Linux using pdftk.

For example, to create a new two page PDF from the first two pages of an original try:

pdftk originalCV.pdf cat 1-2 output editedCV.pdf

The application enables many more useful ways to manipulate PDF documents. Read the man pages for further details.
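If the trimming needs to happen as part of a larger script, the same command can be driven from Python. This is only a sketch: the function names are my own invention, it assumes pdftk is installed and on the PATH, and it uses nothing beyond the cat and output operations shown above.

```python
import subprocess


def pdftk_cat_command(source, first_page, last_page, destination):
    """Build the argument list for: pdftk SOURCE cat FIRST-LAST output DEST."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    page_range = "%d-%d" % (first_page, last_page)
    return ["pdftk", source, "cat", page_range, "output", destination]


def keep_pages(source, first_page, last_page, destination):
    """Run pdftk; raises CalledProcessError if pdftk reports a failure."""
    command = pdftk_cat_command(source, first_page, last_page, destination)
    subprocess.run(command, check=True)
```

Calling keep_pages("originalCV.pdf", 1, 2, "editedCV.pdf") should be equivalent to the command line above.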

Connecting to Ubuntu from iBook G4 Using NxMachine

January 3rd, 2009

I’ve been using my girlfriend’s iBook recently and am very impressed by it.

The Good
Things I love include the fact that it feels like unix, the build quality of the hardware itself and the failsafe reliability of its sleep/resume.

(I’ve long forgotten the number of hours I spent a couple of years back disassembling then recompiling the buggy DSDT on my old IBM Thinkpad T20 to fix all the warnings and errors before linking it against a patched kernel in order to get ACPI working. Sure it gave me a taste of the days “when men were men and wrote their own device drivers” but sometimes it’s nice when things Just Work.)

The Bad
Things that niggle include the lack of right-mouse button, the unfamiliar keyboard layout and the absence anywhere on the keyboard of a hash/pound key which makes writing bash scripts a little tricky (it’s ALT+3 but for some reason this isn’t printed on the key itself).

The Ugly
Things that seriously annoy include the monolithic, closed nature of the operating system that requires you to upgrade the whole damn thing in order to use a more recent version of Java.

Early observations aside, connecting to my Ubuntu box using NxMachine was pretty straightforward.

The Mac client is easy to download and install. Apt-get makes setting up the server on the Linux box utterly painless. The instructions on the site are more than sufficient for getting the connection up and running.

Getting the key mapping right takes a little longer - out of the box several keys did not behave as expected.

Anyone looking to save a little time is welcome to use my keyboard settings. To apply them use xmodmap:


xmodmap keyboardsettings

Retrieving Rapidshare Files with Python

January 2nd, 2009

A cursory google search will reveal several scripts for retrieving rapidshare files using python, but each one I’ve seen delegates the actual retrieval to wget.

This is not necessary.

Rapidshare uses basic authentication to identify logged in members and urllib2 can handle this easily.

The following method would do the trick without the need to call external executables:

import base64
import urllib2

def rapidget(url, login, password):
    "Retrieve files from rapidshare using only python"
    request = urllib2.Request(url)
    # Build the Basic auth header by hand; encodestring appends a newline, so strip it.
    base64string = base64.encodestring('%s:%s' % (login, password))[:-1]
    request.add_header("Authorization", "Basic %s" % base64string)
    # Save the file under the last path component of the URL.
    i = url.rfind('/')
    filename = url[i+1:]
    print url, "->", filename
    file = open(filename, 'wb')
    handle = urllib2.urlopen(request)
    buffersize = 1024*1024
    # Stream the response to disk in 1 MB chunks, printing a dot per chunk.
    while True:
        buffer = handle.read(buffersize)
        if not buffer:
            break
        file.write(buffer)
        print '.',
    handle.close()
    file.close()

This assumes, of course, that you have an account at rapidshare.
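For anyone on Python 3, where urllib2 has become urllib.request and base64.encodestring no longer exists, a rough equivalent might look like the following. Treat it as an untested-against-Rapidshare sketch: the function names are hypothetical, and it assumes the server still accepts a plain Basic Authorization header as described above.

```python
import base64
import shutil
import urllib.request


def build_request(url, login, password):
    """Return a Request carrying a hand-built Basic auth header."""
    credentials = ("%s:%s" % (login, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic %s" % token)
    return request


def filename_from_url(url):
    """Use the last path component of the URL as the local file name."""
    return url.rsplit("/", 1)[-1]


def rapidget3(url, login, password):
    """Download url into the current directory, streaming in 1 MB chunks."""
    request = build_request(url, login, password)
    with urllib.request.urlopen(request) as handle, \
            open(filename_from_url(url), "wb") as out:
        shutil.copyfileobj(handle, out, 1024 * 1024)
```

Splitting the header construction out of the download makes it possible to check the authentication logic without touching the network.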

Is the iPlayer a Trojan Horse?

December 31st, 2008

I won’t be joining the celebrations around the launch of the BBC’s iPlayer on Mac and Linux.

The encroachment into the network of broadcasting corporations such as the BBC should be vigorously resisted both as a tremendous waste of bandwidth by a company that already enjoys a monopoly on huge swathes of the spectrum and as a step towards the licensing of internet access.

UK readers sensible enough not to own a television will have first-hand experience of the Gestapo-like tactics of the BBC licensing authorities, whose regular, nasty, intimidatory letters misleadingly and illegally threaten to prosecute anyone found using equipment capable of receiving a television signal, including “computers connected to the internet.” The more organisations like the BBC pollute the web with their output, the stronger the calls to extend the licence to cover access to the internet.

Already the iPlayer is being tested as a justification for bringing a tiered internet into place.

Combine that with the quixotic and sinister plan under consideration to introduce cinema-style ratings for websites, and we have all the makings of Chinese-style censorship.

Paranoid? Perhaps. But I do live under a government planning on tracking everyone’s calls, emails, texts and internet use.

Governments are a Conspiracy of the Rich

December 27th, 2008

What justice is there in this: that a nobleman, a goldsmith, a banker, or any other man, that either does nothing at all, or, at best, is employed in things that are of no use to the public, should live in great luxury and splendour upon what is so ill acquired, and a mean man, a carter, a smith, or a ploughman, that works harder even than the beasts themselves, and is employed in labours so necessary, that no commonwealth could hold out a year without them, can only earn so poor a livelihood and must lead so miserable a life, that the condition of the beasts is much better than theirs? For as the beasts do not work so constantly, so they feed almost as well, and with more pleasure, and have no anxiety about what is to come, whilst these men are depressed by a barren and fruitless employment, and tormented with the apprehensions of want in their old age; since that which they get by their daily labour does but maintain them at present, and is consumed as fast as it comes in, there is no overplus left to lay up for old age.

“Is not that government both unjust and ungrateful, that is so prodigal of its favours to those that are called gentlemen, or goldsmiths, or such others who are idle, or live either by flattery or by contriving the arts of vain pleasure, and, on the other hand, takes no care of those of a meaner sort, such as ploughmen, colliers, and smiths, without whom it could not subsist? But after the public has reaped all the advantage of their service, and they come to be oppressed with age, sickness, and want, all their labours and the good they have done is forgotten, and all the recompense given them is that they are left to die in great misery. The richer sort are often endeavouring to bring the hire of labourers lower, not only by their fraudulent practices, but by the laws which they procure to be made to that effect, so that though it is a thing most unjust in itself to give such small rewards to those who deserve so well of the public, yet they have given those hardships the name and colour of justice, by procuring laws to be made for regulating them.

“Therefore I must say that, as I hope for mercy, I can have no other notion of all the other governments that I see or know, than that they are a conspiracy of the rich, who, on pretence of managing the public, only pursue their private ends, and devise all the ways and arts they can find out; first, that they may, without danger, preserve all that they have so ill-acquired, and then, that they may engage the poor to toil and labour for them at as low rates as possible, and oppress them as much as they please; and if they can but prevail to get these contrivances established by the show of public authority, which is considered as the representative of the whole people, then they are accounted laws

Sir Thomas More, Utopia, 1516

Deleting Messages from an IMAP Folder Using Python

August 16th, 2008

I was asked to help delete 16,980 messages from an IMAP spam folder the other day. No email client could handle it without crashing. Even mutt choked after several hours of valiantly struggling.

Python to the rescue. Rather than write a script to do this I ran each command from the python shell. It’s a very addictive way of working because you get instant feedback.

import getpass, imaplib
M = imaplib.IMAP4_SSL("imap.gmail.com")
M.login("yourusername@gmail.com", getpass.getpass())

Now we’re in. Let’s see what directories exist.

M.list()

Pick the offending directory.

M.select("[Gmail]/Spam")

View the messages.

typ, data = M.search(None, 'ALL')
for num in data[0].split():
    typ, data = M.fetch(num, '(RFC822)')
    print 'Message %s\n%s\n' % (num, data[0][1])

Now delete them all and close the connection to the mailserver. To delete a message in IMAP, you need to set the delete flag on it then expunge the folder.

typ, data = M.search(None, 'ALL')
for num in data[0].split():
    M.store(num, '+FLAGS', '\\Deleted')
M.expunge()
M.close()
M.logout()
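With nearly 17,000 messages, the loop above sends one STORE command per message. IMAP message sets may list several ids separated by commas, so the flags can be set in batches instead. A sketch in Python 3, where the function name and the batch size of 500 are my own assumptions and M is any connected imaplib client with a folder already selected:

```python
def delete_all(conn, batch_size=500):
    """Flag every message in the selected folder as deleted, then expunge.

    Returns the number of messages that were flagged.
    """
    typ, data = conn.search(None, "ALL")
    ids = data[0].split()  # message sequence numbers, as bytes
    for start in range(0, len(ids), batch_size):
        # Join one batch into a comma-separated message set, e.g. "1,2,3".
        batch = b",".join(ids[start:start + batch_size]).decode("ascii")
        conn.store(batch, "+FLAGS", "\\Deleted")
    # Sequence numbers only change on expunge, so do it once at the end.
    conn.expunge()
    return len(ids)
```

Usage would be delete_all(M) after M.select(...), followed by M.close() and M.logout() as before.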

(To be fair to Google I must point out that Gmail was not the offending mailserver although I’ve used them in the sample code above.)