Archive for October 2007

Computer security and liability—my thoughts

Almost three years ago Bruce Schneier posted a blog entry on Computer Security and Liability. Since then he has repeated his opinion several times; one of the more high-profile occasions was in front of the House of Lords. Some people agree, others disagree.

Until just a few days ago I disagreed with him on this particular issue. After the four learned hosts of LugRadio brought up the issue in episode 3 I had another think and I’ve now changed my opinion. I am now in favour of holding companies financially liable for damages resulting form security vulnerabilities in software products.

The software business is interesting because there’s a very obvious asymmetry in what is known about a product between the people who write and sell software and the people who use and buy software. Bruce Schneier has touched on that as well in his post on Security Lemons. Basically the buyer of software knows nothing of the ilities of what they are being sold, so there is very little to hang an informed decision on.

I think that introducing financial liability for software producers should take into consideration whether a buyer can make an informed decision before buying or not. This means that in cases where the buyer has full access to the sourcei there will be no financial liability on the developer. It would be enough to offer all source code under an NDA to a buyer before the deal is finalised. Basically liability would be the price a software vendor has to pay to keep the buyer in the dark regarding how secure the product is.

  1. Note that the source doesn’t have to be free as in having all four freedoms granted by e.g. the GPL.[back]

Aha-one-liners

I’ve been working with computers for more than seven years now and before that I spent five and a half years in university studying computer science and engineering. During these years I’ve learnt a lot and had many aha-moments. However, since starting learning Haskell I’ve had aha-moments that manifest themselves in a single line of code. This has never happened before. Ever! I’m not sure what that says about Haskell. Maybe it’s computer science distilled, who knows?

Anyway, my latest aha-one-liner is this:

liftM dec $ sequence $ map (flip M.lookup decodeMap) s

I’ll try to recount how I got to it.

When writing dataenc I wrote a function decode :: String -> [Word8]. In a discussion on the Haskell libraries list apfelmus pointed out the necessity of allowing decode to fail. The obvious solution is to change the type to String -> Maybe [Word8].

The decoding is done using a look-up table, with some twiddling of bits to get the original data back from the encoded string. The first version used ! to do the lookup, but when this fails it throws an exception. There have been many emails on Haskell Cafe and several articles written on exceptions in Haskell, and as anyone who’s read them know: it’s better to avoid ever getting exceptions than dealing with them (I suspect this is true for any language). The function lookup offers a nice way of looking up a value without risking getting an exception. By mapping it over the string I end up with a list of possible values ([Maybe Word8]). The next step was to find a function that converts this list of possible values to possibly a list (Maybe [Word8]). sequence is just that function. The most important thing is that a single Nothing in the list of possible values results in a result of Nothing from sequence too. At this point I have a list of values, e.g. Just [1,2,3], or Nothing. All that’s left to do now is the bit twiddling part, and that’s of course what lifting dec does, and of course lifting dec into Nothing produces Nothing.

So, what was my aha-moment in all this?

Well, there were actually two things. First, it was the realisation that instead of using a function that can throw an exception (!) I could use a function that allows for failure in a controlled way (lookup). Second, by using the right type (Maybe) I can write my function as if failure doesn’t exist.

Why doesn’t all programming languages make it this easy?

[Edited 24-10-2007]: Spelling and grammar.

My first trip to the phantom zone

Based on apfelmus’ reply on the libraries mailing list it seems I got phantom types a bit wrong. So yesterday I set out to try to understand a bit better.

I got my inspiration from two blog posts in particular, Neil Mitchell’s and Lennart Augustsson’s. I am not sure I’m adding very much beyond the latter of the two but just in case it’ll help someone I’m documenting this anyway.

This is a rather contrived example but it’s short and simple. I have a type that can have two types of values:

type MyType = Either Int String

Ints can be added and Strings can be concatenated:

add :: MyType -> MyType -> MyType
add (Left x) (Left y) = Left $ x + y

cat :: MyType -> MyType -> MyType
cat (Right a) (Right b) = Right $ a ++ b

The problem with this of course is that nothing prevents me from trying to pass a Left and a Right to either of the functions. Both would result in errors as it stands. One option would of course be to explicitly deal with the error cases in code, but wouldn’t it be much nicer if it would somehow be possible to catch those error cases already at compile time?

This is exactly what phantom types allow, and it does it within “regular” Haskelli.

First I need a new type:

newtype MyPh a = MyPh MyType deriving (Show)

Then I rename the two functions above, add and cat, as add' and cat' and introduce two “wrapper functions”:

add :: MyPh Int -> MyPh Int -> MyPh Int
add (MyPh x) (MyPh y) = MyPh $ add' x y

cat :: MyPh String -> MyPh String -> MyPh String
cat (MyPh a) (MyPh b) = MyPh $ cat' a b

The type signatures on these functions are the important part. Without them Haskell would infer the type MyPh a -> MyPh b -> MyPh c for both functions and I gain nothing by that. What is left to do is to add constructors:

phInt :: Int -> MyPh Int
phInt = MyPh . Left

phString :: String -> MyPh String
phString = MyPh . Right

Again the type signatures are crucial.

Now, to make sure that the user of this API doesn’t mess with the type safety that I’ve introduced I must take care to export the phantom type but not its constructor (MyPh), the two manually written constructors (phInt and phString), and the safe functions for adding and concatenating (add and cat).

[Edited 17-10-2007] Corrected the inferred type after Derek’s comment.

  1. As Lennart A points out in his post GADT’s offer a very elegant way of achieving the same results. However, GADT’s aren’t part of Haskell 98.[back]

Phantom type problems

The last few days i’ve been hacking on a data encoding library in Haskell. Haskell has long been lacking in this area and receiving a double-encoded email about a week ago alerted me to this (you might have had to actually be in my head to follow that train of thought :) ). Yesterday I put up a Wiki page and posted an email to the Haskell library list. I also logged into #Haskell and asked for feedback. This was when Saizan came with an interesting suggestion:

What about using phantom types to annotate the encoding and use a typeclass to unify the interfaces?

That’s an excellent idea. One thought has been bugging me since beginning implementing the library, each module exposes the same methods, with the same type signatures, but there is no type-safe polymorphism in the API. I’ve never really understood phantom types so Saizan’s idea was a bit of an eye-opener for me. I think they can help solve the problem. Unfortunately I ran into another problem when playing.

I picked base16 encoding to play a bit with this because the implementation is fairly short and simple. I began with creating a new typeclass:

class DataEncoding d where
    encode :: [Word8] -> d String
    decode :: d String -> [Word8]
    chop :: Int -> d String -> d [String]
    unchop :: d [String] -> d String
    liberate :: d [String] -> [String]
    incarcerate :: [String] -> d [String]

I added the last two functions since it’s no good to have the encoding locked up and inaccessible all the time, at some point we might actually want to do something with the data.

The implementation in Base16 looks like this:

data B16Enc p = B16EncS String
    | B16EncLS [String]
    deriving (Eq, Show)

instance DataEncoding B16Enc where
    encode os = B16EncS $ b16Encode os
    decode (B16EncS s) = b16Decode s
    chop n (B16EncS s) = B16EncLS $ b16Chop n s
    unchop (B16EncLS ss) = B16EncS $ b16Unchop ss
    liberate (B16EncLS ss) = ss
    incarcerate ss = B16EncLS ss

Then I realised that it might be useful to be able to liberate not only the result of a chop but also the result of a plain encode. That is I want something like

    liberate :: d a -> a
    incarcerate :: a -> d a

Where ideally a can be String or [String] and nothing else (I’ll settle for a weaker type if I have to though). What I want to avoid, if possible is the need for two functions for both liberate and incarcerate:

    liberateS :: d String -> String
    incarcerateS :: String -> d String
    liberateSL :: d [String] -> [String]
    incarcerateSL :: [String] -> d [String]

I’ve tried just introducing the a like above, but that causes a failure to match types, and declaring the class on DataEncoding d a (and enabling Glasgow extensions) causes arity problems in the types that confuse me to no end.

So, introducing phantom types and a typeclass is still ongoing. Any tip on solving the immidiate issue is welcome. I’ll be back as soon as I’ve come to any conclusion as to whether it’s worth the trouble or not :-)

Are all languages leaky?

Dolio wrote an excellent comment on space leaks for the previous post. I’ve read posts on haskell-cafe before that mentions the concept but I’ve never bothered to ask. Thanks for clarifying the term!

Dolio’s comment made me think of something I heard during a lecture on OO at university, with the risk of paraphrasing slightly:

It helps a lot to know how a C compiler works while programming C.

In my mind this means that the language is “leaky” in a similar sense to how Joel’s law of leaky abstractions. I suppose a programming language is little more than an abstraction of the machine underneath or, in the case of most languages, of the compiler/interpreter of the language.

Now of course I’m left wondering if “leakiness” can be measured, can languages be ordered based on it? Maybe there are two dimensions to “leakiness”, how early you need knowledge of the layer below and how deep knowledge that knowledge has to be. My gut tells me you can’t program in C for very long without needing some compiler knowledge but I’m not sure just how deep that knowledge needs to be. My gut also tells me I can blissfully hack along in Haskell for quite a while before needing to know how things like laziness actually works, and again I don’t know if the knowledge needs to be deep or noti.

Yes, I’ve been in a somewhat philosophical mood lately (some would say it’s a procrastinating mood :) ).

  1. My problem is really that I feel I know a fair bit about how compilers and interpreters work for imperative languages like C, while I’m new to lazy functional languages like Haskell.[back]

I wonder…

After reading some posts in my haskell category of blogs I read I found myself wondering the following: