Posts tagged ‘phantom types’

My first trip to the phantom zone

Based on apfelmus’ reply on the libraries mailing list it seems I got phantom types a bit wrong. So yesterday I set out to try to understand a bit better.

I got my inspiration from two blog posts in particular, Neil Mitchell’s and Lennart Augustsson’s. I am not sure I’m adding very much beyond the latter of the two but just in case it’ll help someone I’m documenting this anyway.

This is a rather contrived example but it’s short and simple. I have a type that can have two types of values:

type MyType = Either Int String

Ints can be added and Strings can be concatenated:

add :: MyType -> MyType -> MyType
add (Left x) (Left y) = Left $ x + y

cat :: MyType -> MyType -> MyType
cat (Right a) (Right b) = Right $ a ++ b

The problem with this of course is that nothing prevents me from trying to pass a Left and a Right to either of the functions. Both would result in errors as it stands. One option would of course be to explicitly deal with the error cases in code, but wouldn’t it be much nicer if it would somehow be possible to catch those error cases already at compile time?

This is exactly what phantom types allow, and it does it within “regular” Haskelli.

First I need a new type:

newtype MyPh a = MyPh MyType deriving (Show)

Then I rename the two functions above, add and cat, as add' and cat' and introduce two “wrapper functions”:

add :: MyPh Int -> MyPh Int -> MyPh Int
add (MyPh x) (MyPh y) = MyPh $ add' x y

cat :: MyPh String -> MyPh String -> MyPh String
cat (MyPh a) (MyPh b) = MyPh $ cat' a b

The type signatures on these functions are the important part. Without them Haskell would infer the type MyPh a -> MyPh b -> MyPh c for both functions and I gain nothing by that. What is left to do is to add constructors:

phInt :: Int -> MyPh Int
phInt = MyPh . Left

phString :: String -> MyPh String
phString = MyPh . Right

Again the type signatures are crucial.

Now, to make sure that the user of this API doesn’t mess with the type safety that I’ve introduced I must take care to export the phantom type but not its constructor (MyPh), the two manually written constructors (phInt and phString), and the safe functions for adding and concatenating (add and cat).

[Edited 17-10-2007] Corrected the inferred type after Derek’s comment.

  1. As Lennart A points out in his post GADT’s offer a very elegant way of achieving the same results. However, GADT’s aren’t part of Haskell 98.[back]

Phantom type problems

The last few days i’ve been hacking on a data encoding library in Haskell. Haskell has long been lacking in this area and receiving a double-encoded email about a week ago alerted me to this (you might have had to actually be in my head to follow that train of thought :) ). Yesterday I put up a Wiki page and posted an email to the Haskell library list. I also logged into #Haskell and asked for feedback. This was when Saizan came with an interesting suggestion:

What about using phantom types to annotate the encoding and use a typeclass to unify the interfaces?

That’s an excellent idea. One thought has been bugging me since beginning implementing the library, each module exposes the same methods, with the same type signatures, but there is no type-safe polymorphism in the API. I’ve never really understood phantom types so Saizan’s idea was a bit of an eye-opener for me. I think they can help solve the problem. Unfortunately I ran into another problem when playing.

I picked base16 encoding to play a bit with this because the implementation is fairly short and simple. I began with creating a new typeclass:

class DataEncoding d where
    encode :: [Word8] -> d String
    decode :: d String -> [Word8]
    chop :: Int -> d String -> d [String]
    unchop :: d [String] -> d String
    liberate :: d [String] -> [String]
    incarcerate :: [String] -> d [String]

I added the last two functions since it’s no good to have the encoding locked up and inaccessible all the time, at some point we might actually want to do something with the data.

The implementation in Base16 looks like this:

data B16Enc p = B16EncS String
    | B16EncLS [String]
    deriving (Eq, Show)

instance DataEncoding B16Enc where
    encode os = B16EncS $ b16Encode os
    decode (B16EncS s) = b16Decode s
    chop n (B16EncS s) = B16EncLS $ b16Chop n s
    unchop (B16EncLS ss) = B16EncS $ b16Unchop ss
    liberate (B16EncLS ss) = ss
    incarcerate ss = B16EncLS ss

Then I realised that it might be useful to be able to liberate not only the result of a chop but also the result of a plain encode. That is I want something like

    liberate :: d a -> a
    incarcerate :: a -> d a

Where ideally a can be String or [String] and nothing else (I’ll settle for a weaker type if I have to though). What I want to avoid, if possible is the need for two functions for both liberate and incarcerate:

    liberateS :: d String -> String
    incarcerateS :: String -> d String
    liberateSL :: d [String] -> [String]
    incarcerateSL :: [String] -> d [String]

I’ve tried just introducing the a like above, but that causes a failure to match types, and declaring the class on DataEncoding d a (and enabling Glasgow extensions) causes arity problems in the types that confuse me to no end.

So, introducing phantom types and a typeclass is still ongoing. Any tip on solving the immidiate issue is welcome. I’ll be back as soon as I’ve come to any conclusion as to whether it’s worth the trouble or not :-)