Archive for June 2007

Abject oriented programming and some serious stuff

Just can’t help wanting to point people to this description of abject oriented programming, a little-known but widely used programming paradigm.

Over to something more serious. I want a zooming desktop that uses the power of language. Something for GNOME 3 maybe?

Decorator pattern in Python

The other day I was talking to a mate and former colleague of mine. He’s done a lot of Java and C# before, but recently he got hired by a small company to do Python work. Anyway, he related a funny part of the interview where he said he’d done design patterns and they asked him to explain one that he’d used. He chose Decorator. After he was done explaining, the interviewer commented that surely he meant Proxy. The interviewer was wrong, and my mate suspects this might be something that’s common in the Python world due to the built-in support for function/method decorators in the language. I suspect he’s right. Anyway, he showed me what he was playing with and I couldn’t help but play a bit on my own afterwards.

Here’s the class of the core object, a simple self-explanatory piece of code:

class Writer(object):
    def write(self, s):
        print(s)

Here’s a not very exciting example of using it:

> w = Writer()
> w.write('hello')
hello

We want to decorate it by modifying the string passed to write in different ways. First here’s a base decorator class:

class WriterDecorator(object):
    def __init__(self, wrappee):
        self.wrappee = wrappee

    def write(self, s):
        self.wrappee.write(s)

Using it is straightforward, and still not very exciting:

> wd = WriterDecorator(w)
> wd.write('hello')
hello

The constructor requires a wrappee object and the implementation of write is straightforward. Strictly speaking this class is unnecessary, but it’s convenient once we implement “real” decorators. Here’s the first one; it converts the string to upper case before passing it on down the chain:

class UpperWriter(WriterDecorator):
    def write(self, s):
        self.wrappee.write(s.upper())

This is where it gets a little more exciting, not much though:

> uw = UpperWriter(w)
> uw.write('hello')
HELLO

Here’s a nice detail about Python that I’ve never reflected on myself: constructors are inherited in Python. Here’s another decorator, one that makes the string “shouty”:

class ShoutWriter(WriterDecorator):
    def write(self, s):
        self.wrappee.write('!'.join([t for t in s.split(' ') if t]) + '!')

Now it’s getting a little more interesting, because the decorators can be combined:

> sw1 = ShoutWriter(w)
> sw1.write('hello again')
hello!again!
> sw2 = ShoutWriter(uw)
> sw2.write('hello again')
HELLO!AGAIN!

Some of these combinations are more useful than others, and if they’re used very often then it might be worth creating a convenience class for them. Here’s one that I imagine could be useful if you’re a writer for The Register:

class YahooWriter(WriterDecorator):
    def __init__(self, wrappee):
        self.wrappee = UpperWriter(ShoutWriter(wrappee))

Using it is simple:

> yw = YahooWriter(w)
> yw.write('hello again')
HELLO!AGAIN!

Well, so far it’s been child’s play and I wouldn’t have bothered writing about this unless I took this a little further. I thought something was familiar about how the convenience class worked. I vaguely remembered reading something about super being harmful and there seemed to be similarities between behaviour described there and the desired behaviour when nesting decorators. Rewriting the basic decorator classes using super like this retains their behaviour:

class UpperWriter(WriterDecorator):
    def write(self, s):
        super(UpperWriter, self).write(s.upper())


class ShoutWriter(WriterDecorator):
    def write(self, s):
        super(ShoutWriter, self).write('!'.join([t for t in s.split(' ') if t]) + '!')

What this does though is allow implementing YahooWriter like this:

class YahooWriter(UpperWriter, ShoutWriter):
    pass

I think that’s pretty cute.
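
The reason the one-line YahooWriter works can be seen from the method resolution order (MRO) of the class. Below is a minimal sketch that repeats the classes from above. The RecordingWriter class and its lines attribute are hypothetical additions of mine, used only so that the result can be inspected instead of printed.

```python
class RecordingWriter(object):
    """Hypothetical stand-in for Writer: stores output instead of printing it."""
    def __init__(self):
        self.lines = []

    def write(self, s):
        self.lines.append(s)


class WriterDecorator(object):
    def __init__(self, wrappee):
        self.wrappee = wrappee

    def write(self, s):
        self.wrappee.write(s)


class UpperWriter(WriterDecorator):
    def write(self, s):
        super(UpperWriter, self).write(s.upper())


class ShoutWriter(WriterDecorator):
    def write(self, s):
        shouted = '!'.join([t for t in s.split(' ') if t]) + '!'
        super(ShoutWriter, self).write(shouted)


class YahooWriter(UpperWriter, ShoutWriter):
    pass


# super() follows the MRO of the instance's class, so inside a YahooWriter
# the class after UpperWriter is ShoutWriter, not WriterDecorator.
mro_names = [cls.__name__ for cls in YahooWriter.__mro__]

rec = RecordingWriter()
YahooWriter(rec).write('hello again')
```

Here mro_names lists YahooWriter, UpperWriter, ShoutWriter, WriterDecorator and object in that order, which is exactly the order in which write is called. One limitation this makes visible: the order of the base classes fixes the order of the transformations when the class is defined, and a class can only appear once in the list of bases, whereas the wrapping approach can stack the same decorator several times at run time.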

Here’s where I have to stop though. I don’t know if this is even useful. Is it? Maybe it has some serious drawbacks that my inexperience and ignorance prevent me from seeing. Does it? Has super been used like this somewhere? I’d love pointers to that code :-)

[Edited 16-06-2007 00:34 BST] Bloody hell, can’t believe I had a spelling error in the title all this time. Embarrassing really!

Random stuff, 2007-06-14

Just after being added to Planet Haskell I changed the theme of WP [1], but as always with themes there were things I didn’t really like. I was happy to notice that this time I’d chosen a theme written by someone who knew English, which was a relief since the previous theme was commented in Spanish and even contained id names in Spanish. Still, modifying the theme, especially the style sheet, is a pain. Then I found Firebug. Let’s just say I’m never going to bother looking through the style sheet for a theme again without first having found the exact line number by using Firebug. It’s simply a brilliant add-on for Firefox.

After talking to a mate I put enable-ssh-support in ~/.gnupg/gpg-agent.conf and stopped using ssh-agent altogether. The Debian developers seem to have anticipated this and there’s full support for it in the scripts in /etc/X11/Xsession.d/. Excellent!

I’ve finally taken the time to look into getting the webcam that I bought from OpenForEveryone working. After a false start with spca5xx (it doesn’t build on recent kernels) I built a kernel module for gspca. Firing up Camorama revealed that the cam was indeed working; however, colours, contrast and brightness were all screwed up and couldn’t be changed. Later that turned out to be a problem with Camorama rather than with the cam itself; it works perfectly well in Ekiga.

I’ve also signed up for a SIP account at ekiga.net. I can now be reached on sip:magnus.therning@ekiga.net.

  1. I had made a manual change to the old theme that really didn’t belong on a Planet. It can only be described as discrimination against IE users.

Russel: putting HTML/XML in blog entries

Russel, I’m glad I’m using WP’s markdown plugin. I’ve never experienced any of the problems you are having relating to <pre> tags or <code>. It just works! :-)

LRL—registered and paid

I just finished registering and paying for LRL. Let me know if you want to meet up for a beer, GPG key signing, or whatever…

Unicode in URIs makes my head hurt

I’ve read very little about Unicode before but today I had the questionable pleasure of delving a bit deeper into it. Mind you, it still feels like I’ve just dipped a foot in the water, but before today I had only dipped a single toe.

I was especially interested in URI encoding (“percent-encoding”) and Unicode. According to RFC 3986:

When a new URI scheme defines a component that represents textual data consisting of characters from the Universal Character Set [UCS], the data should first be encoded as octets according to the UTF-8 character encoding [STD63]; then only those octets that do not correspond to characters in the unreserved set should be percent-encoded.

Of course this particular document is fairly new (January 2005), so I bet there are quite a few URI codecs out there that don’t behave this way yet. Another interesting detail is that Microsoft has long supported a special URI encoding especially suited for dealing with UCS-2 [1], which takes the form %uhhhh. E.g. the character ‘A’ would be %41 according to the standard encoding, while using Microsoft’s encoding it looks like %u0041.

So far it’s quite straightforward, but then something strange in Unicode enters the picture: compatibility characters. They do make a certain sense when they are combinations of a base character and some sort of marker (I’m not sure I’m using the right terminology here), e.g. the character ‘å’ can be constructed in two ways, either using the code point U+00E5 or by combining an ‘a’ (U+0061) and the “combining diacritical mark” ‘ ̊’ (U+030A). Of course, comparing these two characters, which are encoded completely differently while still having exactly the same semantics, is a bit of a problem. That’s solved by canonicalisation, for which there are two standards.

I didn’t bother going further into that, because my real problem, the reason why I started all of this, was that there are compatibility characters for something called “Halfwidth and Fullwidth Forms” (block FF00–FFEF). This block contains some non-Latin characters, and there it makes sense, but for some strange reason all printable characters in the Basic Latin block (0000–007F) are present as “fullwidth forms” as well. The reason for this is unclear to me and I’d really love an explanation. The result is that there apparently is some confusion about just what to do with these “fullwidth forms” when decoding them; in some cases they are treated just like their “halfwidth form” cousins in the Basic Latin block. The end result is that on Microsoft products ‘A’ can also be encoded as %uff21.
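
To make the cases above concrete, here is a small sketch, assuming Python 3 and only its standard library (urllib.parse and unicodedata). The decode_percent_u helper is a hypothetical function of mine that handles only the non-standard %uhhhh form; it is not taken from any Microsoft product or existing codec.

```python
import re
import unicodedata
from urllib.parse import quote, unquote

# RFC 3986 style: encode to UTF-8 first, then percent-encode the octets
# that do not correspond to unreserved characters.
encoded_a_ring = quote('\u00e5')   # precomposed a-with-ring
encoded_plain_a = quote('A')       # unreserved, so left untouched
decoded_41 = unquote('%41')


def decode_percent_u(s):
    """Hypothetical helper: decode the non-standard %uhhhh form only."""
    return re.sub(r'%u([0-9a-fA-F]{4})',
                  lambda m: chr(int(m.group(1), 16)), s)


fullwidth_a = decode_percent_u('%uff21')   # U+FF21, not U+0041

# The two spellings of a-with-ring differ as strings, but canonical
# normalisation (NFC here) makes them compare equal.
composed = '\u00e5'
combining = 'a\u030a'
same_after_nfc = unicodedata.normalize('NFC', combining) == composed

# Compatibility normalisation (NFKC) folds the fullwidth form into
# Basic Latin; canonical normalisation (NFC) leaves it alone.
folded = unicodedata.normalize('NFKC', fullwidth_a)
unfolded = unicodedata.normalize('NFC', fullwidth_a)
```

In this sketch ‘å’ becomes %C3%A5, the two UTF-8 octets of U+00E5, while %uff21 decodes to the fullwidth character and only turns into a plain ‘A’ if the decoder also applies compatibility normalisation. Whether a given decoder does that is exactly the confusion described above.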

While reading about Unicode I always have to remind myself that “for every complex problem, there is a solution that is simple, neat, and wrong”. I simply can’t help but think “this is so complicated, there must be an easier solution”…

Re-reading this post I realise there isn’t much of a point to it, besides possibly that writing (or talking) about something always helps my understanding of it. Please let me know if my understanding of Unicode or URI encoding is wrong…

  1. I suspect this is connected to Microsoft’s love for UCS-2 in other areas of their operating system.

Adventures in parsing, part 4

I received a few comments on part 3 of this little mini-series and I just wanted to address them. While doing this I still want the main functions of the parser, parseXxx, to read like the maps file itself. That means I want to avoid “reversing order” like thenChar and thenSpace did in part 2. I also don’t want to hide things, e.g. I don’t want to introduce a function that turns (a <* char ' ') <*> b into a <#> b.

So, first up is to do something about hexStr2Int <$> many1 hexDigit which appears all over the place. I made it appear in even more places by moving around a few parentheses; the following two functions are the same:

foo = a <$> (b <* c)
bar = (a <$> b) <* c

Then I scrapped hexStr2Int completely and instead introduced hexStr:

hexStr = Prelude.read . ("0x" ++) <$> many1 hexDigit

This means that parseAddress can be rewritten to:

parseAddress = Address <$>
    hexStr <* char '-' <*>
    hexStr

Rather than, as Conal suggested, introduce an infix operation that addresses the pattern (a <* char ' ') <*> b I decided to do something about a <* char c. I feel Conal’s suggestion, while shortening the code more than my solution, goes against my wish to not hide things. This is the definition of <##>:

(<##>) l r = l <* char r

After this I rewrote parseAddress into:

parseAddress = Address <$>
    hexStr <##> '-' <*>
    hexStr

The pattern (== c) <$> anyChar appears three times in parsePerms, so it got a name and was moved down into the where clause. I also modified cA to use pattern matching. I haven’t spent much time considering error handling in the parser, so I didn’t introduce a pattern matching everything else.

parsePerms = Perms <$>
    pP 'r' <*>
    pP 'w' <*>
    pP 'x' <*>
    (cA <$> anyChar)

    where
        pP c = (== c) <$> anyChar
        cA 'p' = Private
        cA 's' = Shared

The last change I made was to remove a bunch of parentheses. I’m always a little hesitant about removing parentheses and relying on precedence rules, and I find I’m even more hesitant doing it when programming Haskell, probably due to Haskell having a lot of infix operators that I’m unused to.

The rest of the parser now looks like this:

parseDevice = Device <$>
    hexStr <##> ':' <*>
    hexStr

parseRegion = MemRegion <$>
    parseAddress <##> ' ' <*>
    parsePerms <##> ' ' <*>
    hexStr <##> ' ' <*>
    parseDevice <##> ' ' <*>
    (Prelude.read <$> many1 digit) <##> ' ' <*>
    (parsePath <|> string "")

    where
        parsePath = (many1 $ char ' ') *> (many1 anyChar)
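
To make the field order concrete, here is a rough Python sketch of the line layout that parseRegion walks through. It isn’t a translation of the Haskell code; the regular expression, the parse_region name and the sample line are all made up for illustration, and the permission check is stricter than the anyChar used above.

```python
import re

# One named group per field, in the same order as parseRegion:
# address, permissions, offset, device, inode, optional path.
MAPS_LINE = re.compile(
    r'(?P<start>[0-9a-fA-F]+)-(?P<end>[0-9a-fA-F]+) '
    r'(?P<perms>[r-][w-][x-][ps]) '
    r'(?P<offset>[0-9a-fA-F]+) '
    r'(?P<major>[0-9a-fA-F]+):(?P<minor>[0-9a-fA-F]+) '
    r'(?P<inode>[0-9]+)'
    r' *(?P<path>.*)$'
)


def parse_region(line):
    """Hypothetical helper: split one maps-style line into a dict."""
    m = MAPS_LINE.match(line)
    if m is None:
        raise ValueError('not a maps line: %r' % line)
    perms = m.group('perms')
    return {
        'address': (int(m.group('start'), 16), int(m.group('end'), 16)),
        'read': perms[0] == 'r',
        'write': perms[1] == 'w',
        'execute': perms[2] == 'x',
        'shared': perms[3] == 's',
        'offset': int(m.group('offset'), 16),
        'device': (int(m.group('major'), 16), int(m.group('minor'), 16)),
        'inode': int(m.group('inode')),
        'path': m.group('path'),
    }


# Made-up sample line in the maps layout.
region = parse_region(
    '08048000-08056000 r-xp 00000000 03:0c 64593      /usr/sbin/gpm')
```

As in the Haskell version, the hexadecimal fields are converted to integers, the inode is read as a decimal number, and a missing path simply becomes the empty string.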

I think these changes address most of the comments Conal and Twan made on the previous part. Where they don’t I hope I’ve explained why I decided not to take their advice.

Adventures in parsing, part 3

I got a great many comments, at least by my standards, on my earlier two posts on parsing in Haskell. Especially on the latest one. Conal posted a comment on the first pointing me towards liftM and its siblings, without telling me that it would only be the first step towards “applicative style”. So, here I go again…

First off, importing Control.Applicative. Apparently <|> is defined in both Applicative and in Parsec. I do use <|> from Parsec so preventing importing it from Applicative seemed like a good idea:

import Control.Applicative hiding ( (<|>) )

Second, Cale pointed out that I need to make GenParser an instance of Control.Applicative.Applicative. He was nice enough to point out how to do that, leaving syntax as the only thing I had to struggle with:

instance Applicative (GenParser c st) where
    pure = return
    (<*>) = ap

I decided to take baby-steps and I started with parseAddress. Here’s what it used to look like:

parseAddress = let
        hexStr2Int = Prelude.read . ("0x" ++)
    in do
        start <- liftM hexStr2Int $ thenChar '-' $ many1 hexDigit
        end <- liftM hexStr2Int $ many1 hexDigit
        return $ Address start end

On Twan’s suggestion I rewrote it using where rather than let ... in and since this was my first function I decided to go via the ap function (at the same time I broke out hexStr2Int since it’s used in so many places):

parseAddress = do
    start <- return hexStr2Int `ap` (thenChar '-' $ many1 hexDigit)
    end <- return hexStr2Int `ap` (many1 hexDigit)
    return $ Address start end

Then on to applying some functions from Applicative:

-- start and end are parsers, so Address has to be lifted over them
parseAddress = liftA2 Address start end
    where
        start = hexStr2Int <$> (thenChar '-' $ many1 hexDigit)
        end = hexStr2Int <$> (many1 hexDigit)

By now the use of thenChar looks a little silly, so I changed that part into many1 hexDigit <* char '-' instead. Finally I removed the where part altogether and used <*> to string it all together:

parseAddress = Address <$>
    (hexStr2Int <$> many1 hexDigit <* char '-') <*>
    (hexStr2Int <$> (many1 hexDigit))

From here on I skipped the intermediate steps and went straight for the last form. Here’s what I ended up with:

parsePerms = Perms <$>
    ( (== 'r') <$> anyChar) <*>
    ( (== 'w') <$> anyChar) <*>
    ( (== 'x') <$> anyChar) <*>
    (cA <$> anyChar)

    where
        cA a = case a of
            'p' -> Private
            's' -> Shared

parseDevice = Device <$>
    (hexStr2Int <$> many1 hexDigit <* char ':') <*>
    (hexStr2Int <$> (many1 hexDigit))

parseRegion = MemRegion <$>
    (parseAddress <* char ' ') <*>
    (parsePerms <* char ' ') <*>
    (hexStr2Int <$> (many1 hexDigit <* char ' ')) <*>
    (parseDevice <* char ' ') <*>
    (Prelude.read <$> (many1 digit <* char ' ')) <*>
    (parsePath <|> string "")

    where
        parsePath = (many1 $ char ' ') *> (many1 anyChar)

I have to say I’m fairly pleased with this version of the parser. It reads about as easily as the first version and there’s none of the “reversing” that thenChar introduced.

Metacity joy…

As I wrote a while back, the development version of Metacity now has support for _NET_MOVERESIZE_WINDOW. I’ve just compiled a bleeding-edge version checked out of Subversion and it’s all working fine. Excellent news for Jendela.

More sex is safer sex…

I just listened to episode 10 of the Get Illuminated audio cast where Steven E. Landsburg is interviewed (I found the link on Boing Boing). It sounds like a very interesting book; I really love that sort of provocative writing, the sort that challenges your common sense.

The only argument I can raise against the author’s reasoning (and bear in mind that I haven’t actually read the book yet; this is all based on the interview) is that it’s “how not to be part of the problem”, but it’s not “how to solve the problem”. I suppose it really highlights the difference between “do no evil” and “do good”.