• Movable Type's Export File Format

    Here is a short list of things that possess more elegance than Movable Type’s export file format:

    • XML,
    • SMTP,
    • the C string API,
    • the C multibyte string API (mbsinit, wcrtomb, mbsnrtowcs, etc),
    • the C++ grammar specification,
    • C++ template error messages,
    • the BIND zone file format,
    • Bourne shell parameter expansion involving spaces,
    • PHP,
    • CSV,
    • GNU libtool,
    • wGetGUI,
    • POSIX regular expressions,
    • MPEG-7,
    • the mplayer code base,
    • the Cisco VPN client,
    • the ld(1) manpage on the UNIX system of your choice,
    • the sudoers(5) manpage,
    • Makefiles generated by GNU autogoats,
    • Eric S. Raymond,
    • ICCCM,
    • pretty much everything.

    Feel free to extend this list in the comments.

  • UCS-2 vs UTF-16

    I always used to get confused between UCS-2 and UTF-16. Which one’s the fixed-width encoding and which one’s the variable-length encoding that supports surrogate pairs?

    Then, I learnt this simple little mnemonic: you know that UTF-8 is variable-length encoded1. UTF = variable-length. Therefore UTF-16 is variable-length encoded, and therefore UCS-2 is fixed-length encoded. (Just don’t extend this mnemonic to UTF-32.)

    Just thought I’d pass that trick on.
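
    If you want to see the variable-length business in action, here’s a little Cocoa-flavoured sketch (my own contrived example, nothing official): NSString’s -length counts UTF-16 code units, so a character outside the Basic Multilingual Plane shows up as two of them, courtesy of a surrogate pair.

    #import <Foundation/Foundation.h>

    int main(void)
    {
        NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];

        // U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the Basic Multilingual
        // Plane, so UTF-16 encodes it as a surrogate pair: two 16-bit code
        // units.  UCS-2 can't represent it at all.
        NSString* clef = [NSString stringWithUTF8String:"\xF0\x9D\x84\x9E"];

        // -length counts UTF-16 code units, not characters, so this prints 2.
        NSLog(@"length = %lu", (unsigned long)[clef length]);

        [pool release];
        return 0;
    }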

    1 I’m assuming you know what UTF-8 is, anyway. If you don’t, and you’re a programmer, you should probably learn sometime…

  • Computing Heroes

    I was chatting to a mate of mine about a remarkable book that I found the other day: John von Neumann’s The Computer and the Brain.

    One of the greatest intellectuals of our century writes about computing systems and fundamental aspects of the brain. What’s there not to like here? I’m only halfway through the book, and it’s already got so much worthy material in it that I will recommend it to any other computing folks. It’s worth it for the Foreword alone.

    Alas, von Neumann passed on a while ago. Right after our discussion, I find out that John Backus passed away last Saturday. Phil Windley comments that “Computer Science has always been a discipline where the founders were still around. That’s changing.”

    Arguably computing’s most famous face-person right now is Bill Gates. I don’t see Gates being famous as bad: after all, the guy is a multi-billionaire, which naturally gives him a little bit of a reputation, and his philanthropic acts are to be admired even if one despises his business tactics. However, what does the greater public know about our real heroes? Alan Turing? John von Neumann? Grace Hopper? Alan Kay? John Backus? Donald Knuth? Edsger Dijkstra? Doug Engelbart?

    I remember when John Shepherd taught me CS2041 at university, he spent 5 minutes at the start of each lecture talking about “famous geeks” and what they did for our industry. We need to educate ourselves as an industry and learn and respect what these folks did; go back to our roots; respect our elders. I’d wager that a lot more mathematicians know about Bertrand Russell and Leonhard Euler than self-described programmers and computing geeks know about Alan Turing and Edsger Dijkstra.

    If you’re a programmer (or even if you’re not), go to Wikipedia’s list of Turing Award winners sometime and just start reading about people you don’t know, starting with the man who the Turing award’s named after. (I’m ashamed to say that I only recognise a mere 22 out of 51 names of the Turing Award winners, and I’m scared to think that I’m probably doing a lot better than a lot of other folks.)

    I understand that people such as Knuth and Dijkstra made specialised contributions to our field, and that the greater public won’t particularly care for them (in the same way that a lot of the general public won’t know about Bertrand Russell or even Euler, but they’re known by pretty much every single mathematician). However, there are lots of computing legends who we can talk about at dinner with all our non-geek friends and family. Go get Doug Engelbart’s Mother of All Demos or Ivan Sutherland’s Sketchpad demo and show it to your friends. Tell your family about the role that Turing played in World War II, and the amusing story of Grace Hopper finding an actual bug inside her computer.

    As Dijkstra said, “in their capacity as a tool, computers will be but a ripple on the surface of our culture. In their capacity as intellectual challenge, they are without precedent in the cultural history of mankind.” Computing is one of the most important things to emerge from this entire century. I hope that in twenty years’ time, at least Alan Turing will be a household name alongside Bill Gates. Let’s do our part to contribute to that.

  • Objective-C Accessors

    I like Objective-C. It’s a nice language. However, having to write accessor methods all day is boring, error-prone, and a pain in the ass:

    - (NSFoo*) foo
    {
        return foo;
    }

    - (void) setFoo:(NSFoo*)newFoo
    {
        [foo autorelease];
        foo = [newFoo retain];
    }

    I mean, c’mon. This is Objective-C we’re talking about, not Java or C++. However, until Objective-C 2.0’s property support hits the streets (which, unfortunately, will only be supported on Mac OS X 10.5 and later as far as I know), you really have to write these dumb-ass accessors to, well, access properties in your objects correctly. You don’t need to write accessors thanks to the magic of Cocoa’s Key-Value Coding, but it just feels wrong to access instance variables using strings as keys. I mean, ugh—one typo in the string and you’ve got yourself a problem. Death to dynamic typing when it’s totally unnecessary.
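
    To make that concrete, here’s a quick sketch (with a made-up Person class, purely for illustration) of the KVC route versus plain accessors:

    #import <Foundation/Foundation.h>

    // Hypothetical KVC-compliant class with a single instance variable.
    @interface Person : NSObject
    {
        NSString* name;
    }
    - (NSString*) name;
    - (void) setName:(NSString*)aName;
    @end

    @implementation Person
    - (NSString*) name
    {
        return name;
    }
    - (void) setName:(NSString*)aName
    {
        [name autorelease];
        name = [aName copy];
    }
    @end

    int main(void)
    {
        NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
        Person* person = [[Person alloc] init];

        // Key-Value Coding: the key is just a string.
        [person setValue:@"Grace" forKey:@"name"];      // works fine
        // [person setValue:@"Grace" forKey:@"nmae"];   // typo: compiles happily,
        //                                              // raises at runtime

        // Plain accessors: the same typo would be caught at compile time.
        [person setName:@"Grace"];
        NSLog(@"%@", [person name]);

        [person release];
        [pool release];
        return 0;
    }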

    As such, I got totally fed up with this and wrote a little script to generate accessor methods. I’m normally not a fan of code generation, but in this case, the code generation’s actually designed to be one-shot, and it doesn’t alter the ever-picky build process. It’s meant to be used in Xcode, although you can run it via the commandline too if you like.

    Given the following input:

    int integerThing;
    NSString* _stringThing;
    IBOutlet NSWindow* window;

    It will spit out the following:

    #pragma mark Accessors

    - (int) integerThing;
    - (void) setIntegerThing:(int)anIntegerThing;

    - (NSString*) stringThing;
    - (void) setStringThing:(NSString*)aStringThing;

    - (NSWindow*) window;
    - (void) setWindow:(NSWindow*)aWindow;

    %%%{PBXSelection}%%%
    #pragma mark Accessors

    - (int) integerThing
    {
        return integerThing;
    }

    - (void) setIntegerThing:(int)anIntegerThing
    {
        integerThing = anIntegerThing;
    }

    - (NSString*) stringThing
    {
        return _stringThing;
    }

    - (void) setStringThing:(NSString*)aStringThing
    {
        [_stringThing autorelease];
        _stringThing = [aStringThing copy];
    }

    - (NSWindow*) window
    {
        return window;
    }

    - (void) setWindow:(NSWindow*)aWindow
    {
        [window autorelease];
        window = [aWindow retain];
    }

    There’s a couple of dandy features in the script that I find useful, all of which are demonstrated in the above output:

    1. It will detect whether your instance variables start with a vowel, and write out anInteger instead of aInteger as the parameter names for the methods.
    2. It will copy rather than retain value classes such as NSStrings and NSNumbers, as God intended.
    3. For all you gumbies who prefix your instance variables with a leading underscore, it will correctly recognise that and not prefix your accessor methods with an underscore.1
    4. IBOutlet and a few other type qualifiers (__weak, __strong, volatile etc) are ignored correctly.
    5. It will emit Xcode-specific #pragma mark directives to make the method navigator control a little more useful.
    6. It will emit Xcode-specific %%%{PBXSelection}%%% markers so that the accessor methods meant to go into your .m implementation file are automatically selected, ready for a cut-and-paste.

    Download the objc-make-accessors script and throw it into your “~/Library/Application Support/Apple/Developer Tools/Scripts” folder. If you don’t have one yet:

    mkdir -p ~/Library/"Application Support/Apple/Developer Tools/Scripts/10-Scripts"
    ln -sf "/Library/Application Support/Apple/Developer Tools/Scripts/10-User Scripts/99-resetMenu.sh" ~/Library/"Application Support/Apple/Developer Tools/Scripts/10-Scripts/"
    cp ~/Desktop/objc-make-accessors ~/Library/"Application Support/Apple/Developer Tools/Scripts/10-Scripts/"

    Done. You should now have a Scripts menu in Xcode with a new menu item named “IVars to Accessor Methods”. Have fun.

    1 Note that older versions of the Cocoa Coding Guidelines specified that prefixing instance variables with underscores is an Apple-only convention and you should not do this in your own classes. Now the guidelines just don’t mention anything about this issue, but I still dislike it because putting underscores every time you access an instance variable really lowers code readability.

  • Cocoa Users Group in Sydney

    To all the Mac users out there: would you be interested in a Cocoa Users’ Group in Sydney? If so, please drop me an email—my address is at the bottom of the page—and if there’s enough numbers, perhaps we can organise something. The idea’s to have a local forum for geekups/meetups, random presentations, mailing lists, and all that sort of fun stuff.

    Oh yeah, and please also let me know your self-described level of expertise: none, novice, intermediate, expert.

    (For those who closely track the Cocoa scene in Australia: yep, this is the same call for interest that Duncan Campbell has initiated.)

  • Oldskool!

    I was cleaning up my room the other day, and lo and behold, look what I found…

    That, sir, would be the entire 147-page printed manual for FrontDoor 1.99b, an endearing piece of software for all of us who used to run BBSs1. FidoNet, SIGNet and AlterNet indeed. (For all you BinkleyTerm chumps, yeah, I ran that as well, with the holy trinity of BinkleyTerm, X00 and Maximus all under OS/2. Don’t even get me started on ViSiON-X, Oblivion/2, Echo forums and all that stuff… oh, I dread to think the number of hours I must’ve spent looking at every single BBS package under the sun.)

    Ah, but the BBS as we know it is dead, Jim. Long live the Internet!

    1 And hey Mister Ryan Verner: yes, I ran it with the BNU FOSSIL driver for a while ;-).

  • Blister Packs

    Now I know why blister packs are called blister packs. It’s because you get fucking blisters whenever you try to open them.

  • Static and Dynamic Typing: Fight!

    It’s rare that I find a good, balanced article on the (dis)advantages of static vs dynamic typing, mostly because people on each side are too religious (or perhaps just stubborn) to see the benefits of the other. Stevey’s blog rant comparing static vs dynamic typing is one of the most balanced ones that I’ve seen, even if I think half his other blog posts are on crack.

    I lean pretty far toward the static typing end of the spectrum, but I also think that dynamic typing not only has its uses, but is absolutely required in some applications. One of my favourite programming languages is Objective-C, which seems to be quite unique in its approach: the runtime system is dynamically typed, but you get a reasonable amount of static checking at compile-time by using type annotations on variables. (Survey question: do you know of any Objective-C programmers who simply use id types everywhere, rather than the more specific types such as NSWindow* and NSArray*? Yeah, I didn’t think so.) Note that I think Objective-C could do with a more powerful type system: some sort of parameterised type system similar in syntax to C++ templates/Java generics/C# generics would be really useful just for the purposes of compile-time checking, even though it’s all dynamically typed at runtime.
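
    Here’s a rough sketch of what I mean (my own toy example): the object is dynamically typed either way, but the type annotation is what lets the compiler catch the dumb mistake.

    #import <Cocoa/Cocoa.h>

    void typeAnnotationDemo(void)
    {
        // Same kind of object, two different compile-time views of it.
        id anything = [[NSMutableArray alloc] init];
        NSMutableArray* array = [[NSMutableArray alloc] init];

        // -setStringValue: is a real selector (NSControl declares it), so the
        // compiler happily accepts it for an id receiver.  At runtime, though,
        // NSMutableArray doesn't respond to it, and you get an exception.
        [anything setStringValue:@"oops"];

        // With the static annotation, the very same mistake is flagged at
        // compile time, because NSMutableArray declares no such method:
        // [array setStringValue:@"oops"];    // compiler warning

        [array addObject:@"fine"];            // checked statically, dispatched dynamically

        [anything release];
        [array release];
    }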

    One common thread in both Stevey’s rant and what I’ve personally experienced is that dynamic typing is the way to go when your program really needs to be extensible: if you have any sort of plugin architecture or long-lived servers with network protocols that evolve (hello Erlang), it’s really a lot more productive to use a dynamic typing system. However, I get annoyed every time I do some coding in Python or Erlang: it seems that 50% of the errors I make are type errors. While I certainly don’t believe that static type systems guarantee that “if it compiles, it works”, it’s foolish to say that they don’t help catch a large class of errors (especially if your type system’s as powerful as Haskell’s or OCaml’s), and it’s also foolish to say that unit tests are a replacement for a type system.

    So, the question I want to ask is: why are programming languages today so polarised into either the static or the dynamic camp? The only languages I know of that strive to accommodate the benefits of both are Objective-C, Perl (though I’d say that writing Perl without use strict is an exercise in pain, since its only three types are scalars, arrays and hashes), and (gasp) Visual Basic. Programming languages and programming language research should’ve looked at integrating static and dynamic typing a long time ago. C’mon guys, it’s obvious to me that both approaches have good things to offer, and I ain’t that smart. I think a big reason they haven’t is largely religious: people on both sides are too blinded to even attempt to see each other’s point of view. How many academic papers have there been that address this question?

    I hope that in five years, we’ll at least have one mainstream programming language that we can write production desktop and server applications in, and that offers the benefits of both static and dynamic typing. (Somebody shoot me, now I’m actually agreeing with Erik Meijer.) Perhaps a good start is for the current generation of programmers to actually admit that both approaches have their merit, rather than simply get defensive whenever one system is critiqued. It was proved a long time ago that dynamic typing is simply staged type inference and can be subsumed as part of a good-enough static type system: point to static typing. However, dynamic typing is also essential for distributed programming and extensibility. Point to dynamic typing. Get over it, type zealots.

    P.S. Google Fight reckons that dynamic typing beats static typing. C’mon Haskell and C++ guys, unite! You’re on the same side! Down with those Pythonistas and Rubymongers! And, uhh, down with Smalltalk and LISP too, even though they rule! (Now I’m just confusing myself.)

  • All Hail the DeathStation 9000

    While learning about the immaculate DeathStation 9000, I came across the homepage of Armed Response Technologies. That page nearly had me snort coffee out of my nose on my beautiful new 30” LCD monitor. Make very sure you see their bloopers page.

    (This is quite possibly the best thing to ever come out of having so much stupidit… uhh, I mean, undefined behaviour, in C.)

  • The Problem with Threads

    If you haven’t had much experience with the wonderful world of multithreading and don’t yet believe that threads are evil1, Edward A. Lee has an excellent essay named “The Problem with Threads”, which challenges you to solve a simple problem: write a thread-safe Observer design pattern in Java. Good luck. (Non-Java users who scoff at Java will often fare even worse, since Java is one of the few languages with some measure of in-built concurrency control primitives—even if those primitives still suck.)

    His paper’s one of the best introductory essays I’ve read about the problems with shared state concurrency. (I call it an essay since it really reads a lot more like an essay than a research paper. If you’re afraid of academia and its usual jargon and formal style, don’t be: this paper’s an easy read.) For those who aren’t afraid of a bit of formal theory and maths, he presents a simple, convincing explanation of why multithreading is an inherently complex problem, using the good ol’ explanation of computational interleavings of sets of states.
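
    For the impatient, here’s the textbook illustration of why interleavings hurt, as a little Objective-C sketch of my own (not Lee’s Java example): two threads bump a shared counter with no locking, and the final count depends entirely on how the increments happen to interleave.

    #import <Foundation/Foundation.h>

    // Shared, unprotected state: the whole problem in one variable.
    static int sharedCounter = 0;

    @interface Incrementer : NSObject
    - (void) bump:(id)unused;
    @end

    @implementation Incrementer
    - (void) bump:(id)unused
    {
        NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
        int i;
        for (i = 0; i < 1000000; i++)
            sharedCounter++;    // read-modify-write: not atomic
        [pool release];
    }
    @end

    int main(void)
    {
        NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
        Incrementer* worker = [[Incrementer alloc] init];

        [NSThread detachNewThreadSelector:@selector(bump:) toTarget:worker withObject:nil];
        [NSThread detachNewThreadSelector:@selector(bump:) toTarget:worker withObject:nil];

        // Crude, but enough for a demo: wait for both threads to finish,
        // then see how far short of 2,000,000 we fall.
        [NSThread sleepUntilDate:[NSDate dateWithTimeIntervalSinceNow:2.0]];
        NSLog(@"sharedCounter = %d (expected 2000000)", sharedCounter);

        [worker release];
        [pool release];
        return 0;
    }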

    His essay covers far more than just the problem of inherent complexity, however: Lee then discusses how bad threading actually is in practice, along with some software engineering improvements such as OpenMP, Tony Hoare’s idea of Communicating Sequential Processes2, Software Transactional Memory, and Actor-style languages such as Erlang. Most interestingly, he discusses why programming languages aimed at concurrency, such as Erlang, won’t succeed in the mainstream marketplace.

    Of course, how can you refuse to read a paper that has quotes such as these?

    • “… a folk definition of insanity is to do the same thing over and over again and to expect the results to be different. By this definition, we in fact require that programmers of multithreaded systems be insane. Were they sane, they could not understand their programs.”
    • “I conjecture that most multi-threaded general-purpose applications are, in fact, so full of concurrency bugs that as multi-core architectures become commonplace, these bugs will begin to show up as system failures. This scenario is bleak for computer vendors: their next generation of machines will become widely known as the ones on which many programs crash.”
    • “Syntactically, threads are either a minor extension to these languages (as in Java) or just an external library. Semantically, of course, they thoroughly disrupt the essential determinism of the languages. Regrettably, programmers seem to be more guided by syntax than semantics.”
    • “… non-trivial multi-threaded programs are incomprehensible to humans. It is true that the programming model can be improved through the use of design patterns, better granularity of atomicity (e.g. transactions), improved languages, and formal methods. However, these techniques merely chip away at the unnecessarily enormous non-determinism of the threading model. The model remains intrinsically intractable.” (Does that “intractable” word remind you of anyone else?)
    • “… adherents to… [a programming] language are viewed as traitors if they succumb to the use of another language. Language wars are religious wars, and few of these religions are polytheistic.”

    If you’re a programmer and aren’t convinced yet that shared-state concurrency is evil, please, read the paper. Please? Think of the future. Think of your children.

    1 Of course, any non-trivial exposure to multithreading automatically implies that you understand they are evil, so the latter part of that expression is somewhat superfluous.

    2 Yep, that Tony Hoare—you know, the guy who invented Quicksort?

  • svk and the Psychological Effect of Fast Commits

    svk—a distributed Subversion client by Chia Liang Kao and company—is now an essential part of my daily workflow. I’ve been using it almost exclusively for the past year on the main projects that I work with, and it’s fantastic being able to code when you’re on the road and do offline commits, syncing back to the main tree when you’re back online. Users of other distributed revision control systems do, of course, get these benefits, but svk’s ability to work with existing Subversion repositories is the killer reason to use it. (I’m aware that Bazaar has some Subversion integration now, but it’s still considered alpha, whereas svk has been very solid for a long time now.)

    The ability to do local checkins with a distributed revision control client has a nice side-effect: commits are fast. They typically take around two seconds with svk. A checkin from a non-distributed revision control client such as Subversion requires a round-trip to the server. This isn’t too bad on a LAN, but even for a small commit, it can take more than 10 or 15 seconds to a server on the Internet. The key point is that these fast commits have a psychological effect: having short commit times encourages you to commit very regularly. I’ve found that since I’ve switched to svk, not only can I commit offline, but I commit much more often: sometimes half a dozen times inside of 10 minutes. (svk’s other cool feature of dropping files from the commit by deleting them from the commit message also helps a lot here.) Regular commits are always better than irregular commits, because either (1) you’re committing small patches that are easily reversible, or (2) you’re working very prolifically. Both of these are a win!

    So, if you’re still using Subversion, try svk out just to get the benefits of this and its other nifty features. The svk documentation is quite sparse, but there are some excellent tutorials that are floating around the ‘net.
