Posts tagged ‘haskell’

Rotonyms in Haskell

I have to say rotonyms seem kind of pointless, but since there is a post with a Python solution I thought I’d make an attempt at a solution in Haskell. Of course I didn’t manage on my own, I received some much needed help from Cale and dmwit on #haskell.

First some rotation functions, as far as I know there’s no built-in rot13 encoding in Haskell:

rot13Char c = chr ( (ord c - ord 'a' + 13) `mod` (ord 'z' - ord 'a' + 1) + ord 'a')
rot13 = map rot13Char

In order to ease working with rotonyms I introduced a type to represent one. This turned out to pay off later on.

data Rotonym = R String

It would be easy to let the compiler derive Show, but I wanted to mimick the output produced by Will’s solution in Python:

instance Show Rotonym where
    show (R w) =
        let r = rot13 w
        in (min w r) ++ "\t" ++ (max w r)

Now I’m ready for the meat of the matter, reading the word list, pick out the rotonyms and finally print them. Following Will’s cue I represent the word list as a set of strings (importing Data.Set as S), this also eliminates duplicates.

main = do
    words <- liftM lines $ readFile "WORD.LST"
    let wordset = S.fromList words
        isRotonym w = S.member (rot13 w) wordset
        rots = S.fromList [R w | w <- words, isRotonym w]
    mapM_ (putStrLn . show) (S.elems rots)

Sticking Rotonym instances in a set requires them to be instances of Ord and Eq:

instance Eq Rotonym where
    (R w1) == (R w2) =
        let r1 = rot13 w1
            r2 = rot13 w2
        in (w1 == w2 && r1 == r2) || (w1 == r2 && w2 == r1)

instance Ord Rotonym where
    compare (R w1) (R w2) =
        let r1 = rot13 w1
            r2 = rot13 w2
        in compare (length w1, (min w1 r1)) (length w2, (min w2 r2))

dmwit pointed out that what is actually going on here is taking an intersection of the the words and their rotated counterparts. This means the main could be written

main = readFile "WORD.LST" >>=
        mapM_ (putStrLn . show) . S.toList . S.map R . findRotonyms . S.fromList . lines
    where findRotonyms ws = ws `S.intersection` (S.map rot13 ws)

Another idea that dmwit came up with was to introduce a class for rot13:

class Rot13 a where
    rot13 :: a -> a

Char is an instance of that class with the expected implementation of rot13:

instance Rot13 Char where
    rot13 c = chr ((ord c - ord 'a' + 13) `mod` (ord 'z' - ord 'a' + 1) + ord 'a')

If a is an instance of Rot13 then list of a is as well, again with the expected implementation:

instance (Rot13 a) => Rot13 [a] where
    rot13 = map rot13

It is possible to intersect lists (Data.List.intersect) but it turned out to be painfully slow, so back to sets again. A set of a is an instance of Rot13:

instance (Ord a, Rot13 a) => Rot13 (S.Set a) where
    rot13 = S.map rot13

Finding all rotonyms is then straight forward:

main = readFile "WORD.LST" >>=
        mapM_ (putStrLn . show) . S.toList . S.map R . findRotonyms . S.fromList . lines
    where findRotonyms ws = ws `S.intersection` (rot13 ws)

A somewhat surprising catch

Let’s get straight to it. Here’s an example of something that I found somewhat surprising in Haskell. First a bit of setup:

printNum n = putStrLn $ "num: " ++ (show n)

handleE e = do
    putStrLn $ "Caught something: " ++ (show e)
    return (-1)

errorP = fromMaybe (error "Got Nothing")

And here it is, in ghci:

> CE.catch (return $ errorP Nothing ) handleE >>= printNum
num: *** Exception: Got Nothing

And to show that things are lazy we change handleE:

handleE e = do
    putStrLn $ "Caught something: " ++ (show e)
    return []

Then we can map errorP onto a list like this:

> CE.catch (return $ map errorP [Just 17, Just 42, Nothing, Just 666]) handleE
[17,42,*** Exception: Got Nothing

In neither cas I saw the behaviour I was expecting. A chat on IRC showed that others see this as natural behaviour, explained by laziness. It wasn’t until a night’s sleep that I realised that there still was something that bothered me about that explanation. Another explanation would be that catch isn’t special. At first I didn’t realise I expected it to be special; I expected it to somehow wrap the evaluation of its first argument so that no matter when it was evaluated any exception raised would be caught. As far as I can see this would be no small feat if laziness is to be preserved. It would require catch to be special.

So, what does this mean in practice? Well, here are my thoughts. One needs to think carefully about where using catch makes sense in a program. It has to be inside IO, a well-documented fact, but it also has to cover something that isn’t lazy since it then will have no effect at all. My gut feeling is that catch is useful “in the large”, with that I mean as a sort of catch-all “high up” in the program, e.g. in main. That means its usefulness in libraries is limited (except for IO heavy libs, like FFI bindings to C), it should also probably not be used like try-catch often is used in Python.

I’ve since taken a look at the ErrorT monad transformer and it seems it behaves like I expected catch to. That’s for another post though.

Mail!

Yes, I think there’s an opening for an email client, it doesn’t have to be written in Haskell of course, but if I ever get my arse into gear it will be. Especially now that mutt-ng seems to have stalled in development, and even though mutt still is being developed I’ve gotten too spoilt by using mutt-ng exclusively for the last year or so. I’ve looked extensively at sup and it’s sweet, but due to some recent changes in what kind of machines I have access to on the web it doesn’t quite fit me anymore.

What would my ideal email client look like?

  • GMail as backend, accessed through Python’s libgmail. Hopefully it won’t be too difficult to write a Haskell module to deliver this part, but using Python to begin with is absolutely acceptable.
  • Haskell for the frontend, specifically using vty for the UI.
  • Vim, Yi, or any other old terminal-based editor to write email.

I’m confident vty has enough functionality. I’ve played with libgmail enough to be confident it offers enough access to mail data to do most everything I want from a client (the missing part is sending PGP/MIME messages, but sending mail through Google’s SMTP server is a workable solution). I’ve just started looking at options for mail parsing in Haskell, identifying three and already discarding one…

Yes, it’s all vapourware at this point, but hey, I can dream, right?

N-Queens in Haskell

After reading the chapter on options, exceptions, and failure continuations from Programming in Standard ML I thought spending a few minutes on translating the code to Haskell might be fun.

I should start with a disclaimer, I’ve never coded in SML, never even read any article or book about the language or even looked at code written in it. Nevertheless I seem to have produced runnable, and presumably correct, code in a fairly short amount of time. I’ve tried to stay as close to the original code as possible, there are undoubtably better and more effective way of writing the same code in Haskell. Feel free to provide comments with improvements :-)

First the representation of the board:

data Board = Board Int Int Int [(Int, Int)]
    | NoBoard
    deriving (Show)

Then some functions to manipulate a board:

new n = Board n 1 0 []

size (Board n _ _ _) = n

complete (Board n _ k _) = k == n

positions (Board _ _ _ qs) = qs

place (Board n i k qs) j = Board n (i + 1) (k + 1) ( (i, j) : qs)

threatens (i, j) (i’, j’)
    | i == i’ = True
    | j == j’ = True
    | i + j == i’ + j’ = True
    | i - j == i’ - j’ = True
    | otherwise = False

conflicts q [] = False
conflicts q (q’:qs) = threatens q q’ || conflicts q qs

safe (Board _ i _ qs) j = not $ conflicts (i, j) qs

These are all straight-forward translations, within minimal “haskell-ifications”.

The first solution using Maybe and explicit checks for failure:

addqueenM bd =
    let try j = if j > size bd
            then Nothing
            else if safe bd j
                then case addqueenM (place bd j) of
                    Nothing -> try $ j + 1
                    Just bd' -> Just bd'
                else try $ j + 1
    in
        if complete bd
            then Just bd
            else try 1

queensM = addqueenM . new

The second uses exceptions to avoid the explicit check (note that I’ve imported Control.Exception as CE to avoid the clash with Prelude’s catch; see the documentation of catch if you don’t know why that’s necessary):

addqueenE bd =
    let try j = if j > size bd
            then error "no more space on board"
            else if safe bd j
                then CE.catch (addqueenE (place bd j)) (\ e -> try $ j + 1)
                else try $ j + 1
    in
        if complete bd
            then return bd
            else try 1

queensE n = CE.catch (addqueenE $ new n) (\ e -> return NoBoard)

The last translated version uses an explicit failure continuation to avoid both checking results and dealing with exceptions:

addqueenC bd fc =
    let try j = if j > size bd
            then fc
            else if safe bd j
                then addqueenC (place bd j) (try $ j + 1)
                else try $ j + 1
    in
        if complete bd
            then Just bd
            else try 1

queensC n = addqueenC (new n) Nothing

So far so good, but since this is Haskell, what about monads? :-) Minor changes results in the following somewhat general version of addqueenX:

addqueenH bd =
    let try j = if j > size bd
            then mzero
            else if safe bd j
                then addqueenH (place bd j) `mplus` (try $ j + 1)
                else try $ j + 1
    in
        if complete bd
            then return bd
            else try 1

With that it’s possible to lean on the type system to get something that’s similar to queenM from above:

queensHM :: Int -> Maybe Board
queensHM = addqueenH . new

It’s also easy to write a version that calculates all solutions given the size of the board:

queensHL :: Int -> [Board]
queensHL = addqueenH . new

Slightly embarrassed about continuations…

Recently I’ve spent some time trying to understand continuations better. First off I have to admit to being a bit daft because it was only recently that I realised that call-with-current-continuation (call/cc in Scheme and callCC in Haskell) actually means “call foo with the current execution point as continuation” rather than “call foo with the continuation passed to the current function”. Well, there’s no way to explain it besides daftness I guess, even though I’d like to think that my decision to look at continuations in Haskell (callCC) to learn about them (the Cont monad just might have confused me somewhat). I really should have started by looking at continuations in Scheme (call/cc). The best description I’ve come across so far is Applications of Continuations.

I still find it somewhat confusing, as can be seen in my latest use of hpaste, but it’s somewhat clearer now than before. At least I seem to be able to almost reason my way through code I write myself, but doesn’t do what I expect :-)

Aha-one-liners

I’ve been working with computers for more than seven years now and before that I spent five and a half years in university studying computer science and engineering. During these years I’ve learnt a lot and had many aha-moments. However, since starting learning Haskell I’ve had aha-moments that manifest themselves in a single line of code. This has never happened before. Ever! I’m not sure what that says about Haskell. Maybe it’s computer science distilled, who knows?

Anyway, my latest aha-one-liner is this:

liftM dec $ sequence $ map (flip M.lookup decodeMap) s

I’ll try to recount how I got to it.

When writing dataenc I wrote a function decode :: String -> [Word8]. In a discussion on the Haskell libraries list apfelmus pointed out the necessity of allowing decode to fail. The obvious solution is to change the type to String -> Maybe [Word8].

The decoding is done using a look-up table, with some twiddling of bits to get the original data back from the encoded string. The first version used ! to do the lookup, but when this fails it throws an exception. There have been many emails on Haskell Cafe and several articles written on exceptions in Haskell, and as anyone who’s read them know: it’s better to avoid ever getting exceptions than dealing with them (I suspect this is true for any language). The function lookup offers a nice way of looking up a value without risking getting an exception. By mapping it over the string I end up with a list of possible values ([Maybe Word8]). The next step was to find a function that converts this list of possible values to possibly a list (Maybe [Word8]). sequence is just that function. The most important thing is that a single Nothing in the list of possible values results in a result of Nothing from sequence too. At this point I have a list of values, e.g. Just [1,2,3], or Nothing. All that’s left to do now is the bit twiddling part, and that’s of course what lifting dec does, and of course lifting dec into Nothing produces Nothing.

So, what was my aha-moment in all this?

Well, there were actually two things. First, it was the realisation that instead of using a function that can throw an exception (!) I could use a function that allows for failure in a controlled way (lookup). Second, by using the right type (Maybe) I can write my function as if failure doesn’t exist.

Why doesn’t all programming languages make it this easy?

[Edited 24-10-2007]: Spelling and grammar.

My first trip to the phantom zone

Based on apfelmus’ reply on the libraries mailing list it seems I got phantom types a bit wrong. So yesterday I set out to try to understand a bit better.

I got my inspiration from two blog posts in particular, Neil Mitchell’s and Lennart Augustsson’s. I am not sure I’m adding very much beyond the latter of the two but just in case it’ll help someone I’m documenting this anyway.

This is a rather contrived example but it’s short and simple. I have a type that can have two types of values:

type MyType = Either Int String

Ints can be added and Strings can be concatenated:

add :: MyType -> MyType -> MyType
add (Left x) (Left y) = Left $ x + y

cat :: MyType -> MyType -> MyType
cat (Right a) (Right b) = Right $ a ++ b

The problem with this of course is that nothing prevents me from trying to pass a Left and a Right to either of the functions. Both would result in errors as it stands. One option would of course be to explicitly deal with the error cases in code, but wouldn’t it be much nicer if it would somehow be possible to catch those error cases already at compile time?

This is exactly what phantom types allow, and it does it within “regular” Haskelli.

First I need a new type:

newtype MyPh a = MyPh MyType deriving (Show)

Then I rename the two functions above, add and cat, as add' and cat' and introduce two “wrapper functions”:

add :: MyPh Int -> MyPh Int -> MyPh Int
add (MyPh x) (MyPh y) = MyPh $ add' x y

cat :: MyPh String -> MyPh String -> MyPh String
cat (MyPh a) (MyPh b) = MyPh $ cat' a b

The type signatures on these functions are the important part. Without them Haskell would infer the type MyPh a -> MyPh b -> MyPh c for both functions and I gain nothing by that. What is left to do is to add constructors:

phInt :: Int -> MyPh Int
phInt = MyPh . Left

phString :: String -> MyPh String
phString = MyPh . Right

Again the type signatures are crucial.

Now, to make sure that the user of this API doesn’t mess with the type safety that I’ve introduced I must take care to export the phantom type but not its constructor (MyPh), the two manually written constructors (phInt and phString), and the safe functions for adding and concatenating (add and cat).

[Edited 17-10-2007] Corrected the inferred type after Derek’s comment.

  1. As Lennart A points out in his post GADT’s offer a very elegant way of achieving the same results. However, GADT’s aren’t part of Haskell 98.[back]

Phantom type problems

The last few days i’ve been hacking on a data encoding library in Haskell. Haskell has long been lacking in this area and receiving a double-encoded email about a week ago alerted me to this (you might have had to actually be in my head to follow that train of thought :) ). Yesterday I put up a Wiki page and posted an email to the Haskell library list. I also logged into #Haskell and asked for feedback. This was when Saizan came with an interesting suggestion:

What about using phantom types to annotate the encoding and use a typeclass to unify the interfaces?

That’s an excellent idea. One thought has been bugging me since beginning implementing the library, each module exposes the same methods, with the same type signatures, but there is no type-safe polymorphism in the API. I’ve never really understood phantom types so Saizan’s idea was a bit of an eye-opener for me. I think they can help solve the problem. Unfortunately I ran into another problem when playing.

I picked base16 encoding to play a bit with this because the implementation is fairly short and simple. I began with creating a new typeclass:

class DataEncoding d where
    encode :: [Word8] -> d String
    decode :: d String -> [Word8]
    chop :: Int -> d String -> d [String]
    unchop :: d [String] -> d String
    liberate :: d [String] -> [String]
    incarcerate :: [String] -> d [String]

I added the last two functions since it’s no good to have the encoding locked up and inaccessible all the time, at some point we might actually want to do something with the data.

The implementation in Base16 looks like this:

data B16Enc p = B16EncS String
    | B16EncLS [String]
    deriving (Eq, Show)

instance DataEncoding B16Enc where
    encode os = B16EncS $ b16Encode os
    decode (B16EncS s) = b16Decode s
    chop n (B16EncS s) = B16EncLS $ b16Chop n s
    unchop (B16EncLS ss) = B16EncS $ b16Unchop ss
    liberate (B16EncLS ss) = ss
    incarcerate ss = B16EncLS ss

Then I realised that it might be useful to be able to liberate not only the result of a chop but also the result of a plain encode. That is I want something like

    liberate :: d a -> a
    incarcerate :: a -> d a

Where ideally a can be String or [String] and nothing else (I’ll settle for a weaker type if I have to though). What I want to avoid, if possible is the need for two functions for both liberate and incarcerate:

    liberateS :: d String -> String
    incarcerateS :: String -> d String
    liberateSL :: d [String] -> [String]
    incarcerateSL :: [String] -> d [String]

I’ve tried just introducing the a like above, but that causes a failure to match types, and declaring the class on DataEncoding d a (and enabling Glasgow extensions) causes arity problems in the types that confuse me to no end.

So, introducing phantom types and a typeclass is still ongoing. Any tip on solving the immidiate issue is welcome. I’ll be back as soon as I’ve come to any conclusion as to whether it’s worth the trouble or not :-)

Haskell and C—structs

When creating Haskell bindings for a C library one will sooner or later have to deal with ’structs’. Let’s look at how it can be done using hsc2hs.

Here’s a somewhat silly struct that’ll do for this example:

typedef struct {
    int a;
    int b;
} Foo;

It goes the file foo.h. I need some way of making sure that the data was passed properly from Haskell to C so here’s a declaration for a function to print an instance of Foo:

void print_foo(Foo *);

The actual implementation of print_foo goes into foo.c:

void
print_foo(Foo *f)
{
    printf("%s\n", __FUNCTION__);
    printf("f->a: %i\n", f->a);
    printf("f->b: %i\n", f->b);
}

No surprises so far. Now onto the Haskell side of things. First some basic setup:

{-# OPTIONS -ffi #-}
module Main
    where

import Foreign
import Foreign.C.Types

That makes sure I don’t forget to tell GHC that I’m using the foreign function interface (FFI) and imports everything I need for the rest. hsc2hs needs to know about the struct so I simply include the header file. I also need the Haskell representation of Foo, called Bar here, and for convenience I add a type for a pointer to Bar:

#include "foo.h"

data Bar = Bar { a :: Int, b :: Int }
type BarPtr = Ptr (Bar)

Now I’m ready to add the declaration of the “foreign” function:

foreign import ccall "static foo.h print_foo"
    f_print_foo :: BarPtr -> IO ()

Looking through the standard modules it isn’t completely obvious how to create a BarPtr for passing to f_print_foo. Whit the help of people on #haskell I found with, which has the type Storable a => a -> (Ptr a -> IO b) -> IO b. That means I have to make Bar a Storable. According to the documentation on Storable and some experimentation I found that for this particular example I only need full implementations of sizeOf, alignment and poke:

instance Storable Bar where
    sizeOf _ = (#size Foo)
    alignment _ = alignment (undefined :: CInt)
    peek _ = error "peek is not implemented"
    poke ptr (Bar a' b') = do
        (#poke Foo, a) ptr a'
        (#poke Foo, b) ptr b'

([Edited 09-08--2007 07:43 BST] See DeeJay’s comment below for an explanation of the definition of alignment.)

Using with every time I have to call f_print_foo will get tiring so here’s a function that’s a bit more convenient to use:

printFoo b = with b f_print_foo

Now I can write a small main function to test it all:

main = printFoo $ Bar { a=17, b=47 }

It works beautifully. However it’s very limited since it only covers the cases when a C function takes a pointer to a struct as a pure in argument. What about inout? (It’ll be easy to see how to deal with out arguments once the inout case is covered.) So, here’s a C function that adds 1 to one of the members in the struct:

void
add_a(Foo *f)
{
    printf("%s\n", __FUNCTION__);
    f->a++;
}

Of course a declaration in the header file is needed as well, but that’s pretty obvious so I’ll skip it here. The declaration of the foreign function is as simple as for f_print_foo:

foreign import ccall "static foo.h add_a"
    f_add_a :: BarPtr -> IO ()

Now comes the interesting part. Writing a convenience function for f_add_a isn’t as straight forward as for f_print_foo. I think something with the type Bar -> IO Bar would be useful. with creates a temporary BarPtr for the duration of the call which means I have to have an inner function that takes the Bar part out of the BarPtr and returns it inside IO. Luckily peek does exactly that. Adding an implementation of it means that Bar as a Storable is implemented like this:

instance Storable Bar where
    sizeOf _ = (#size Foo)
    alignment _ = 1 -- ???
    peek ptr = do
        a' <- (#peek Foo, a) ptr
        b' <- (#peek Foo, b) ptr
        return Bar { a=a', b=b' }
    poke ptr (Bar a' b') = do
        (#poke Foo, a) ptr a'
        (#poke Foo, b) ptr b'

And the convenience function for f_add_a can be written like this:

addA b = with b $ \ p -> f_add_a p >> peek p

Then I can modify the main function to test this as well:

main = do
    b <- return $ Bar { a=17, b=47 }
    printFoo b
    d <- addA b
    printFoo b
    printFoo d

Indeed, produces the expected out put:

print_foo
f->a: 17
f->b: 47
add_a
print_foo
f->a: 17
f->b: 47
print_foo
f->a: 18
f->b: 47

Pure out arguments can be handled using alloca with a function similar to the one I pass to with in addA above.

OSCON videos are available

I’m quite likely the last one to realise this but the OSCON 2007 videos are available on blip.tv. Search for “oscon” to find them. I’ve heard Simon Peyton-Jones’s tutorial on Haskell was very well received, I just downloaded the two parts to watch in the weekend.