On mocks and stubs in python (free monad or interpreter pattern)

A few weeks ago I watched a video where Ken Scambler talks about mocks and stubs. In particular he talks about how to get rid of them.

One part is about coding IO operatioins as data and using the GoF interpreter pattern to

What he’s talking about is of course free monads, but I feel he’s glossing over a lot of details. Based on some of the questions asked during the talk I think I share that feeling with some people in the audience. Specifically I feel he skipped over the following:

  • How does one actually write such code in a mainstream OO/imperative language?
  • What’s required of the language in order to allow using the techniques he’s talking about?
  • Errors tend to break abstractions, so how does one deal with error (i.e. exceptions)?

Every time I’ve used mocks and stubs for unit testing I’ve had a feeling that “this can’t be how it’s supposed to be done!” So to me, Ken’s talk offered some hope, and I really want to know how applicable the ideas are in mainstream OO/imperative languages.

The example

To play around with this I picked the following function (in Python):

def count_chars_of_file(fn):
    fd = os.open(fn, os.O_RDONLY)
    text = os.read(fd, 10000)
    n = len(text)
    os.close(fd)
    return n

It’s small and simple, but I think it suffices to highlight a few important points. So the goal is to rewrite this function such that calls to IO operations (actions) (e.g. os.read) are replaced by data (an instance of some data type) conveying the intent of the operation. This data can later be passed to an interpreter of actions.

Thoughts on the execution of actions and the interpreter pattern

When reading the examples in the description of the interpreter pattern what stands out to me is that they are either

  1. a list of expressions, or
  2. a tree of expressions

that is passed to an interpreter. Will this do for us when trying to rewrite count_chars_of_file?

No, it won’t! Here’s why:

  • A tree of actions doesn’t really make sense. Our actions are small and simple, they encode the intent of a single IO operation.
  • A list of actions can’t deal with interspersed non-actions, in this case it’s the line n = len(text) that causes a problem.

The interpreter pattern misses something that is crucial in this case: the running of the interpreter must be intermingled with running non-interpreted code. The way I think of it is that not only the action needs to be present and dealt with, but also the rest of the program, that latter thing is commonly called a continuation.

So, can we introduce actions and rewrite count_chars_of_file such that we pause the program when interpretation of an action is required, interpret it, and then resume where we left off?

Sure, but it’s not really idiomatic Python code!

Actions and continuations

The IO operations (actions) are represented as a named tuple:

Op = collections.namedtuple('Op', ['op', 'args', 'k'])

and the functions returning actions can then be written as

def cps_open(fn, k):
    return Op('open', [fn], k)

def cps_read(fd, k):
    return Op('read', [fd], k)

def cps_close(fd, k):
    return Op('close', [fd], k)

The interpreter is then an if statement checking the value of op.op with each branch executing the IO operation and passing the result to the rest of the program. I decided to wrap it directly in the program runner:

def runProgram(prog):
    def runOp(op):
        if op.op == 'open':
            fd = os.open(*op.args, os.O_RDONLY)
            return op.k(fd)
        elif op.op == 'read':
            text = os.read(*op.args, 10000)
            return op.k(text)
        elif op.op == 'close':
            os.close(*op.args)
            return op.k()

    while isinstance(prog, Op):
        prog = runOp(prog)

    return prog

So far so good, but what will count_char_of_file all of this do to count_chars_of_file?

Well, it’s not quite as easy to read any more (basically it’s rewritten in CPS):

def count_chars_of_file(fn):

    def cont_1(text, fd):
        n = len(text)
        return cps_close(fd, lambda n=n: n)

    def cont_0(fd):
        return cps_read(fd, lambda text, fd=fd: cont_1(text, fd))

    return cps_open(fn, cont_0)

Generators to the rescue

Python does have a notion of continuations in the form of generators.1 By making count_char_of_file into a generator it’s possible to remove the explicit continuations and the program actually resembles the original one again.

The type for the actions loses one member, and the functions creating them lose an argument:

Op = collections.namedtuple('Op', ['op', 'args'])

def gen_open(fn):
    return Op('open', [fn])

def gen_read(fd):
    return Op('read', [fd])

def gen_close(fd):
    return Op('close', [fd])

The interpreter and program runner must be modified to step the generator until its end:

def runProgram(prog):
    def runOp(op):
        if op.op == 'open':
            fd = os.open(op.args[0], os.O_RDONLY)
            return fd
        elif op.op == 'read':
            text = os.read(op.args[0], 10000)
            return text
        elif op.op == 'close':
            os.close(op.args[0])
            return None

    try:
        op = prog.send(None)
        while True:
            r = runOp(op)
            op = prog.send(r)
    except StopIteration as e:
        return e.value

Finally, the generator-version of count_chars_of_file goes back to being a bit more readable:

def count_chars_of_file(fn):
    fd = yield gen_open(fn)
    text = yield gen_read(fd)
    n = len(text)
    yield gen_close(fd)
    return n

Generators all the way

Limitations of Python generators mean that we have either have to push the interpreter (runProgram) down to where count_char_of_file is used, or make all intermediate layers into generators and rewrite the interpreter to deal with this. It could look something like this then:

def runProgram(prog):
    def runOp(op):
        if op.op == 'open':
            fd = os.open(op.args[0], os.O_RDONLY)
            return fd
        elif op.op == 'read':
            text = os.read(op.args[0], 10000)
            return text
        elif op.op == 'close':
            os.close(op.args[0])
            return None

    try:
        op = prog.send(None)
        while True:
            if isinstance(op, Op):
                r = runOp(op)
                op = prog.send(r)
            elif isinstance(op, types.GeneratorType):
                r = runProgram(op)
                op = prog.send(r)

    except StopIteration as e:
        return e.value

Final thoughts

I think I’ve shown one way to achieve, at least parts of, what Ken talks about. The resulting code looks almost like “normal Python”. There are some things to note:

  1. Exception handling is missing. I know of no way to inject an exception into a generator in Python so I’m guessing that exceptions from running the IO operations would have to be passed in via generator.send as a normal value, which means that exception handling code would have to look decidedly non-Pythonic.
  2. Using this approach means the language must have support for generators (or some other way to represent the rest of the program). I think this rules out Java, but probably it can be done in C#.
  3. I’ve only used a single interpreter here, but I see no issues with combining interpreters (to combine domains of operations like file operations and network operations). I also think it’d be possible to use it to realize what De Goes calls Onion Architecture.

Would I ever advocate this approach for a larger Python project, or even any project in an OO/imperative language?

I’m not sure! I think that testing using mocks and stubs has lead to a smelly code base each an every time I’ve used it, but this approach feels a little too far from how OO/imperative code usually is written. I would love to try it out and see what the implications really are.


  1. I know, I know, coroutines! I’m simplifying and brushing over details here, but I don’t think I’m brushing over any details that are important for this example.

That's a large project

From LWN October 6, 2016:

Over the course of the last year, the project accepted about eight changes per hour — every hour — from over 4,000 developers sponsored by over 400 companies.

Free, take 2

The other day I read a blog post on monads and stuff after which I felt rather silly about my earlier posts on Free.

I think this is probably the post I should have written instead :)

I’ll use three pieces of code, each one builds on the one before:

  • Free1.hs - Uses a free monad for a single algebra/API (full code here).
  • Free2.hs - Uses a free monad for a two algebras/APIs, where one decorates the other (full code here).
  • Free3.hs - Uses a free monad for a three algebras/APIs, where two are used in the program and the remaing one decorates the other two (full code here).

The first - one algebra

I’m re-using the algebras from my previous posts, but I believe it makes it easier to follow along if the amount of jumping between posts is minimized so here is the first one once again:

data SimpleFileF a
  = LoadFile FilePath (String -> a)
  | SaveFile FilePath String a
  deriving(Functor)

type SimpleFileAPI = Free SimpleFileF

loadFile :: FilePath -> SimpleFileAPI String
loadFile fp = liftF $ LoadFile fp id

saveFile :: FilePath -> String -> SimpleFileAPI ()
saveFile fp d = liftF $ SaveFile fp d ()

It’s a ridiculously small one, but I believe it’s good enough to work with. In the previous posts I implemented the Functor instances manually. I couldn’t be bothered this time around; I think I pretty much know how to do that for this kind of types now.

Having a type for the algebra, SimpleFileAPI, is convenient already now, even more so in the other examples.

The two convenience functions on the end makes it straight forward to write functions using the algebra:

withSimpleFile :: (String -> String) -> FilePath -> SimpleFileAPI ()
withSimpleFile f fp = do
  d <- loadFile fp
  let result = f d
  saveFile (fp ++ "_new") result

This is simple, straight forward monadic code. Easy to read and work with. Of course it doesn’t actually do anything at all yet. For that I need need an interpreter, something that translates (reduces) the algebra, the API, the commands, call them what you will, into the (side-)effects I want. For Free that is foldFree together with a suitable function f :: SimpleFileF a -> IO a.

I want LoadFile to translate to a file being read and SaveFile to some data being saved to a file. That makes it pretty obvious how that f needs to be written:

runSimpleFile :: SimpleFileAPI a -> IO a
runSimpleFile = foldFree f
  where
    f (LoadFile fp f') = f' <$> readFile fp
    f (SaveFile fp d r) = writeFile fp d >> return r

At this point it might be good to explain the constructors of SimpleFileF a bit more. At first I thought they looked a bit funny. I mean, why does SaveFile have an a at all since it obviously always should result in ()? And what’s up with that function in Loadfile?

It did become a little clearer to me after some thought and having a look at Free:

data Free f a = Pure a | Free (f (Free f a))

I personally find the latter constructor a bit mind-bending. I can handle recursive functions fairly well, but recursive types have a tendency to confuse me. From what I understand one can think of Free as similar to a list. Pure ends the list (Nil) and Free one instance of f to the rest of the list (Cons). Since Free f a is a monad one can think of a as the result of the command.

If I were to write saveFile explicitly it’d look like this

saveFile fp d = Free (SaveFile fp d (Pure ()))

and for loadFile:

loadFile fp = Free (LoadFile fp (\ s -> Pure s))

But let’s get back to the type and why ‘a’ occurs like it does in the two constructors. As Gabriel G shows in his post Why free monads matter a constructor without a would result in termination. In other words, if SaveFile didn’t hold an a I’d not be able to write, in a natural way, a function that saves two files.

Another limiting factor is that foldFree of the Free implementation I’m using has the type Monad m => (forall x. f x -> m x) -> Free f a -> m a. This sets a requirement on what the function translating from my API into effects may look like, i.e. what f in runSimpleFile may look like. If SaveFile had no a to return what would f (SaveFile {}) return, how could it ever satisfy the required type?

The reason for LoadFile having a function String -> a is simply that there is no data yet, but I still want to be able to manipulate it. Using a function and function composition is the ticket then.

I think that’s all there is to say about to the first piece of code. To run it take a look at the comment at the end of the file and then play with it. If you want to turn all characters of a file foo into upper case you can use

runSimpleFile $ withSimpleFile (map toUpper) "foo"

The second - two algebras, one decorating the other

The second piece of code almost only adds to the first one. There is one exception though, the function runSimpleFile is removed. Instead I’ve taken the transformation function, which used to be called f and was internal to runSimpleFile and moved it out. It’s called stepSimpleFile:

stepSimpleFile :: SimpleFileF a -> IO a
stepSimpleFile (LoadFile fp f') = f' <$> readFile fp
stepSimpleFile (SaveFile fp d r) = writeFile fp d >> return r

The logging API, LogAPI, follows the same pattern as SimpleFileAPI and I’m counting on the description above being clear enough to not have to repeat myself. For completeness I include the code:

data LogF a = Log String a
          deriving(Functor)

type LogAPI = Free LogF

stepLog :: LogF a -> IO a
stepLog (Log s r) = putStrLn s >> return r

I intend the LogAPI to be used as embellishments on the SimpleFileAPI, in other words I somehow have to turn an operation of SimpleFileAPI into an operation of LogAPI, i.e. I need a transformation. I called it logSimpleFileT and let it turn operations in SimpleFileF (i.e. not exactly SimpleFileAPI) into LogAPI (if you are wondering about my choice of type I hope it’ll become clear below, just trust me for now that this is a good choice):

logSimpleFileT :: SimpleFileF a -> LogAPI ()
logSimpleFileT (LoadFile fp _) = liftF $ Log ("** load file " ++ fp) ()
logSimpleFileT (SaveFile fp _ _) = liftF $ Log ("** save file " ++ fp) ()

So far everything is hopefully very straight forward and unsurprising. Now I need to combine the two APIs, I need to add them, in other words, I need a sum type:

data S a1 a2 t = A1 (a1 t) | A2 (a2 t)
              deriving(Functor)

type SumAPI = Free (S LogF SimpleFileF)

The next question is how to turn my two original APIs, SimpleFileAPI and LogAPI, into SumAPI. Luckily that’s already solved by the function hoistFree:

hoistFree :: Functor g => (forall a. f a -> g a) -> Free f b -> Free g b 

With this and logSimpleFileT from above I can use foldFree to decorate each operation with a logging operation like this:

logSimpleFile :: SimpleFileAPI a -> SumAPI a
logSimpleFile = foldFree f
where
  f op = hoistFree A1 (logSimpleFileT op) *> hoistFree A2 (liftF op)

This is where the type of logSimpleFileT hopefully makes sense!

Just as in the first section of this post, I need an interpreter for my API (SumAPI this time). Once again it’s written using foldFree, but this time I provide the interpreters for the sub-algegras (what I’ve chosen to call step functions):

runSum :: Monad m => (forall a. LogF a -> m a) -> (forall a. SimpleFileF a -> m a) -> SumAPI b -> m b
runSum f1 f2 = foldFree f
where
  f (A1 op) = f1 op
  f (A2 op) = f2 op

The file has a comment at the end for how to run it. The same example as in the previous section, but now with logging, looks like this

runSum stepLog stepSimpleFile $ logSimpleFile $ withSimpleFile (map toUpper) "foo"

The third - three algebras, one decorating the other two

To combine three algebras I simply take what’s in the previous section and extend it, i.e. a sum type with three constructors:

data S a1 a2 a3 t = A1 (a1 t) | A2 (a2 t) | A3 (a3 t)
                deriving(Functor)

type SumAPI = Free (S LogF SimpleFileF StdIoF)

runSum :: Monad m => (forall a. LogF a -> m a)
      -> (forall a. SimpleFileF a -> m a)
      -> (forall a. StdIoF a -> m a)
      -> SumAPI b -> m b
runSum f1 f2 f3 = foldFree f
where
  f (A1 op) = f1 op
  f (A2 op) = f2 op
  f (A3 op) = f3 op

With this I’ve already revealed that my three APIs are the two from previous sections, LogAPI (for decorating the other APIs), SimpleFileAPI and a new one, StdIoAPI.

I want to combine them in such a wat that I can write functions using both APIs at the same time. Then I modify withSimpleFile into

withSimpleFile :: (String -> String) -> FilePath ->  SumAPI ()
withSimpleFile f fp = do
  d <- loadFile fp
  let result = f d
  saveFile (fp ++ "_new") result

and I can add another function that uses it with StdIoAPI:

prog :: FilePath -> SumAPI ()
prog fn = do
  stdioPut "About to start"
  withSimpleFile (map toUpper) fn
  stdioPut "Done!"

The way to allow the APIs to be combined this way is to bake in S already in the convenience functions. This means the code for SimpleFileAPI has to change slightly (note the use of A2 in loadFile and saveFile):

data SimpleFileF a
= LoadFile FilePath (String -> a)
| SaveFile FilePath String a
deriving(Functor)

loadFile :: FilePath -> SumAPI String
loadFile fp = liftF $ A2 $ LoadFile fp id

saveFile :: FilePath -> String -> SumAPI ()
saveFile fp d = liftF $ A2 $ SaveFile fp d ()

stepSimpleFile :: SimpleFileF a -> IO a
stepSimpleFile (LoadFile fp f') = f' <$> readFile fp
stepSimpleFile (SaveFile fp d r) = writeFile fp d >> return r

The new API, StdIoAPI, has only one operation:

data StdIoF a = PutStrLn String a
              deriving(Functor)

stdioPut :: String -> SumAPI ()
stdioPut s = liftF $ A3 $ PutStrLn s ()

stepStdIo :: StdIoF b -> IO b
stepStdIo (PutStrLn s a) = putStrLn s >> return a

The logging API, LogAPI, looks exactly the same but I now need two transformation functions, one for SimpleFileAPI and one for StdIoAPI.

data LogF a = Log String a
            deriving(Functor)

type LogAPI = Free LogF

stepLog :: LogF a -> IO a
stepLog (Log s r) = putStrLn s >> return r

logSimpleFileT :: SimpleFileF a -> LogAPI ()
logSimpleFileT (LoadFile fp _) = liftF $ Log ("** load file " ++ fp) ()
logSimpleFileT (SaveFile fp _ _) = liftF $ Log ("** save file " ++ fp) ()

logStdIoT :: StdIoF a -> LogAPI ()
logStdIoT (PutStrLn s _) = liftF $ Log ("** on stdio " ++ s) ()

The new version of logT needs to operate on S in order to decorate both APIs.

logT :: SumAPI a -> SumAPI a
logT = foldFree f
  where
    f (A2 op) = hoistFree A1 (logSimpleFileT op) *> hoistFree A2 (liftF op)
    f (A3 op) = hoistFree A1 (logStdIoT op) *> hoistFree A3 (liftF op)
    f a@(A1 _) = liftF a

This file also has comments on how to run it at the end. This time there are two examples, one on how to run it without logging

runSum undefined stepSimpleFile stepStdIo $ prog "foo"

and one with logging

runSum stepLog stepSimpleFile stepStdIo (logT $ prog "foo")

CMake, ExternalData, and custom fetch script

I failed to find a concrete example on how to use the CMake module ExternalData with a custom fetch script. Since I finally manage to work out how to use it I thought I’d try to help out the next person who needs to go down this route.

Why ExternalData?

I thought I’d start with a short justification of why I was looking at the module at all.

At work I work with a product that processes images and video. When writing tests we often need some rather large files (from MiB to GiB) as input. The two obvious options are:

  1. Check the files into our Git repo, or
  2. Put them on shared storage

Neither of these are very appealing. The former just doesn’t feel quite right, these are large binary files that rarely, if ever, change, why place them under version control at all? And if they do change the Git repo is likely to balloon in size and impact cloning times negatively. The latter makes it difficult to run our tests on a machine that isn’t on the office network and any changes to the files will break older tests, unless we always only add files, never modify any in place. On the other hand, the former guarantees that the files needed for testing are always available and it is possible to modify the files without breaking older tests. The pro of the latter is that we only download the files needed for the current tests.

ExternalData is one option to address this. On some level it feels like it offers a combination of both options above:

  • It’s possible to use the shared storage
  • When the shared storage isn’t available it’s possible to fall back on downloading the files via other means
  • The layout of the storage is such that modifying in place is much less likely
  • Only the files needed for the currest tests will be downloaded when building off-site

The object store

We do our building in docker images that do have our shared storage mapped in, so I’d like them to take advantage of that. At the same time I want the builds performed off-site to download the files. To get this behaviour I defined two object stores:

set(ExternalData_OBJECT_STORES
  ${CMAKE_BINARY_DIR}/ExternalData/Objects
  /mnt/shared/over/nfs/Objects
  )

The module will search each of these for the required files and download only if they aren’t found. Downloaded files will be put into the first of the stores. Oh, and it’s very important that the first store is given with an absolute path!

The store on the shared storage looks something like this:

/mnt/share/over/nfs/Objects
└── MD5
    ├── 94ed17f9b6c74a732fba7b243ab945ff
    └── a2036177b190fbee6e9e038b718f1c20

I can then drop a file MyInput.avi.md5 in my source tree with the md5 of the real file (e.g. a2036177b190fbee6e9e038b718f1c20) as the content. Once that is done I can follow the example found in the introduction of the reference documentation.

curl vs sftp

So far so good. This now works on-site, but for off-site use I need to fetch the needed files. The last section of the reference documentation is called Custom Fetch Scripts. It mentions that files are normally downloaded using file(DOWNLOAD). Neither there, nor in the documentation for file is there a mention of what is used under the hood to fetch the files. After asking on in #cmake I found out that it’s curl. While curl does handle SFTP I didn’t get it to work with my known_hosts file, nor with my SSH agent (both from OpenSSH). On the other hand it was rather easy to configure sftp to fetch a file from the internet-facing SSH server we have. Now I just had to hook it into CMake somehow.

Custom fetch script

As the section on “Custom Fetch Scripts” mention three things are needed:

  1. Specify the script via the ExternalDataCustomScript:// protocol.
  2. Tell CMake where it can find the fetch script.
  3. The fetch script itself.

The first two steps are done by providing a URL template and pointing to the script via a special variable:

set(ExternalData_URL_TEMPLATES "ExternalDataCustomScript://sftp/mnt/shared/over/nfs/Objects/%(algo)/%(hash)")
set(ExternalData_CUSTOM_SCRIPT_sftp ${CMAKE_SOURCE_DIR}/cmake/FetchFromSftp.cmake)

It took me a ridiculous amount of time to work out how to write a script that turns out to be rather short. This is an experience that seems to repeat itself when using CMake; it could say something about me, or something about CMake.

get_filename_component(FFS_ObjStoreDir ${ExternalData_CUSTOM_FILE} DIRECTORY)
get_filename_component(FFS_InputFilename ${ExternalData_CUSTOM_LOCATION} NAME)
get_filename_component(FFS_OutputFilename ${ExternalData_CUSTOM_FILE} NAME)

execute_process(COMMAND sftp sftp.company.com:/${ExternalData_CUSTOM_LOCATION}
  RESULT_VARIABLE FFS_SftpResult
  OUTPUT_QUIET
  ERROR_VARIABLE FFS_SftpErr
  )

if(FFS_SftpResult)
  set(ExternalData_CUSTOM_ERROR "Failed to fetch from SFTP - ${FFS_SftpErr}")
else(FFS_SftpResult)
  file(MAKE_DIRECTORY ${FFS_ObjStoreDir})
  file(RENAME ${FFS_InputFilename} ${FFS_ObjStoreDir}/${FFS_OutputFilename})
endif(FFS_SftpResult)

This script is run with cmake -P in the binary dir of the CMakeLists.txt where the test is defined, which means it’s oblivious about the project it’s part of. PROJECT_BINARY_DIR is empty and CMAKE_BINARY_DIR is the same as CMAKE_CURRENT_BINARY_DIR. This is the reason why the first store in ExternalData_OBJECT_STORES has to be an absolute path – it’s very difficult, if not impossible, to find the correct placement of the object store otherwise.

Qt5+D-Bus+CMake, a complete example

Yesterday I started digging into Qt5 and D-Bus. I never found a complete example, so I put one together myself: https://gist.github.com/magthe/2cf7220655bd8bf431259cc7dee99f64.

Final version of JSON to sum type

After some feedback on my previous post I ended up with the following implementation.

instance FromJSON V.VersionRange where
  parseJSON = withObject "VersionRange" go
    where
      go o =
        V.thisVersion <$> o .: "ThisVersion" <|>
        V.laterVersion <$> o .: "LaterVersion" <|>
        V.earlierVersion <$> o .: "EarlierVersion" <|>
        V.WildcardVersion <$> o .: "WildcardVersion" <|>
        nullaryOp V.anyVersion <$> o .: "AnyVersion" <|>
        binaryOp V.unionVersionRanges <$> o .: "UnionVersionRanges" <|>
        binaryOp V.intersectVersionRanges <$> o .: "IntersectVersionRanges" <|>
        V.VersionRangeParens <$> o .: "VersionRangeParens"

      nullaryOp :: a -> Value -> a
      nullaryOp = const

      binaryOp f [a, b] = f a b

Thanks David for your suggestions.

From JSON to sum type

For a while I’ve been planning to take full ownership of the JSON serialisation and parsing in cblrepo. The recent inclusion of instances of ToJSON and FromJSON for Version pushed me to take the first step by writing my own instances for all external types.

When doing this I noticed that all examples in the aeson docs use a product

data Person = Person {
      name :: Text
    , age  :: Int
    }

whereas I had to deal with quite a few sums, e.g. VersionRange. At first I struggled a little with how to write an instance of FromJSON. After quite a bit of thinking I came up with the following, which I think is fairly nice, but I’d really like to hear what others think about it. Maybe I’ve just missed a much simpler way of implementing parseJSON:

instance FromJSON V.VersionRange where
  parseJSON = withObject "VersionRange" go
    where
      go o = do
        lv <- (o .:? "LaterVersion") >>= return . fmap V.laterVersion
        tv <- (o .:? "ThisVersion") >>= return . fmap V.thisVersion
        ev <- (o .:? "EarlierVersion") >>= return . fmap V.earlierVersion
        av <- (o .:? "AnyVersion") >>= \ (_::Maybe [(Int,Int)]) -> return $ Just V.anyVersion
        wv <- (o .:? "WildcardVersion") >>= return . fmap V.WildcardVersion
        uvr <- (o .:? "UnionVersionRanges") >>= return . fmap toUvr
        ivr <- (o .:? "IntersectVersionRanges") >>= return . fmap toIvr
        vrp <- (o .:? "VersionRangeParens") >>= return . fmap V.VersionRangeParens
        maybe (typeMismatch "VersionRange" $ Object o)
          return
          (lv <|> tv <|> ev <|> uvr <|> ivr <|> wv <|> vrp <|> av)

      toUvr [v0, v1] = V.unionVersionRanges v0 v1
      toIvr [v0, v1] = V.intersectVersionRanges v0 v1

Any and all comments and suggestions are more than welcome!

Freer play with effects

In the previous posts on my playing with free I got stuck at combining APIs. I recalled reading a paper on extensible effects as an alternatve to monad transformers. I have to admit to not having finished the paper, and not quite understanding the part I did read. When looking it up again I found that the work had continued and that there is a paper on more extensible effects. (I got to it via http://okmij.org/ftp/Haskell/extensible/.)

A quick search of Hackage revealed the package extensible-effects with an implementation of the ideas, including the stuff in the latter paper. So, what would the examples from my previous posts look like using extensible effects?

Opening

The examples require a few extensions and modules:

{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeOperators #-}

and

import Control.Eff
import Control.Eff.Lift
import Control.Eff.Operational
import Data.Typeable

Just an API

This part was greatly helped by the fact that there is a example in extensible-effects.

I start with defining the SimpleFile API using GADTs

data SimpleFileAPI a where
  LoadFile :: FilePath -> SimpleFileAPI String
  SaveFile :: FilePath -> String -> SimpleFileAPI ()

The usage of the constructors need to be wrapped up in singleton. To remember that I create two convenience functions

loadFile :: Member (Program SimpleFileAPI) r => FilePath -> Eff r String
loadFile = singleton . LoadFile

saveFile :: Member (Program SimpleFileAPI) r => FilePath -> String -> Eff r ()
saveFile fp = singleton . SaveFile fp

For withSimpleFile I only have to modify the type

withSimpleFile :: Member (Program SimpleFileAPI) r => (String -> String) -> FilePath -> Eff r ()
withSimpleFile f fp = do
  d <- loadFile fp
  let result = f d
  saveFile (fp ++ "_new") result

Now for the gut of it, the interpreter.

runSimpleFile :: (Member (Lift IO) r, SetMember Lift (Lift IO) r) => Eff (Program SimpleFileAPI :> r) a -> Eff r a
runSimpleFile = runProgram f
  where
    f :: (Member (Lift IO) r, SetMember Lift (Lift IO) r) => SimpleFileAPI a -> Eff r a
    f (LoadFile fp) = lift $ readFile fp
    f (SaveFile fp s) = lift $ writeFile fp s

Runnnig it is fairly simple after this

> :! cat test.txt 
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec a diam lectus.
Sed sit amet ipsum mauris. Maecenas congue ligula ac quam viverra nec
consectetur ante hendrerit.
> runLift $ runSimpleFile $ withSimpleFile (map toUpper) "test.txt"
> :! cat test.txt_new 
LOREM IPSUM DOLOR SIT AMET, CONSECTETUR ADIPISCING ELIT. DONEC A DIAM LECTUS.
SED SIT AMET IPSUM MAURIS. MAECENAS CONGUE LIGULA AC QUAM VIVERRA NEC
CONSECTETUR ANTE HENDRERIT.

Now, that was pretty easy. It looks almost exactly like when using Free, only without the Functor instance and rather more complicated types.

Combining two APIs

Now I get to the stuff that I didn’t manage to do using Free; combining two different APIs.

I start with defining another API. This one is truly a play example, sorry for that, but it doesn’t really matter. The type with convenience function looks like this

data StdIoAPI a where
  WriteStrLn :: String -> StdIoAPI ()

writeStrLn :: Member (Program StdIoAPI) r => String -> Eff r ()
writeStrLn = singleton . WriteStrLn

The interpreter then is straight forward

runStdIo :: (Member (Lift IO) r, SetMember Lift (Lift IO) r) => Eff (Program StdIoAPI :> r) a -> Eff r a
runStdIo = runProgram f
  where
    f :: (Member (Lift IO) r, SetMember Lift (Lift IO) r) => StdIoAPI a -> Eff r a
    f (WriteStrLn s) = lift $ putStrLn s

Now I just need a program that combines the two APIs

verboseWithSimpleFile :: (Member (Program StdIoAPI) r, Member (Program SimpleFileAPI) r) =>
                         (String -> String) -> String -> Eff r ()
verboseWithSimpleFile f fp = writeStrLn ("verboseWithSimpleFile on " ++ fp) >> withSimpleFile f fp

That type is surprisingly clear I find, albeit a bit on the long side. Running it is just a matter of combining runStdIo and runSimpleFile.

> :! cat test.txt 
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec a diam lectus.
Sed sit amet ipsum mauris. Maecenas congue ligula ac quam viverra nec
consectetur ante hendrerit.
> runLift $ runSimpleFile $ runStdIo $ verboseWithSimpleFile (map toUpper) "test.txt"
verboseWithSimpleFile on test.txt
> :! cat test.txt_new 
LOREM IPSUM DOLOR SIT AMET, CONSECTETUR ADIPISCING ELIT. DONEC A DIAM LECTUS.
SED SIT AMET IPSUM MAURIS. MAECENAS CONGUE LIGULA AC QUAM VIVERRA NEC
CONSECTETUR ANTE HENDRERIT.

Oh, and it doesn’t matter in what order the interpreters are run!

At this point I got really excited about Eff because now it’s obvious that I’ll be able to write the logging “decorator”, in fact it’s clear that it’ll be rather simple too.

The logging

As before I start with a data type and a convenience function

data LoggerAPI a where
  Log :: String -> LoggerAPI ()

logStr :: Member (Program LoggerAPI) r => String -> Eff r ()
logStr = singleton . Log

For the decorating I can make use of the fact that APIs can be combined like I did above. That is, I don’t need to bother with any coproduct (Sum) or anything like that, I can simply just push in a call to logStr before each use of SimpleFileAPI

logSimpleFileOp :: (Member (Program SimpleFileAPI) r, Member (Program LoggerAPI) r) => SimpleFileAPI a -> Eff r a
logSimpleFileOp op@(LoadFile fp) = logStr ("LoadFile " ++ fp) *> singleton op
logSimpleFileOp op@(SaveFile fp _) = logStr ("SaveFile " ++ fp) *> singleton op

Of course an interpreter is needed as well

runLogger :: (Member (Lift IO) r, SetMember Lift (Lift IO) r) => Eff (Program LoggerAPI :> r) a -> Eff r a
runLogger = runProgram f
  where
    f :: (Member (Lift IO) r, SetMember Lift (Lift IO) r) => LoggerAPI a -> Eff r a
    f (Log s) = lift $ putStrLn s

Running is, once again, a matter of stacking interpreters

> :! cat test.txt
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec a diam lectus.
Sed sit amet ipsum mauris. Maecenas congue ligula ac quam viverra nec
consectetur ante hendrerit.
> runLift $ runLogger $ runSimpleFile $ runProgram logSimpleFileOp $ withSimpleFile (map toUpper) "test.txt"
LoadFile test.txt
SaveFile test.txt_new
> :! cat test.txt_new 
LOREM IPSUM DOLOR SIT AMET, CONSECTETUR ADIPISCING ELIT. DONEC A DIAM LECTUS.
SED SIT AMET IPSUM MAURIS. MAECENAS CONGUE LIGULA AC QUAM VIVERRA NEC
CONSECTETUR ANTE HENDRERIT.

Closing thoughts

With Eff I’ve pretty much arrived where I wanted, I can

  • define APIs of operations in a simple way (simpler than when using Free even).
  • write a definitional interpreter for the operations.
  • combine two different APIs in the same function.
  • translate from one API to another (or even to a set of other APIs).

On top, I can do this without having to write a ridiculous amount of code.

I’m sure there are drawbacks as well. There’s a mention of some of them in the paper. However, for my typical uses of Haskell I haven’t read anything that would be a deal breaker.

Free play, part three

The code in the previous post can do with a bit of cleaning up. I start with introducing a type class for an API that can be run

class RunnableF f where
  runOp :: f a -> IO a

and a function that actually runs iteFile

runF :: (RunnableF o) => Free o a -> IO a
runF = foldFree runOp

A coproduct (Sum) of two runnable APIs can itself be runnable

instance (RunnableF f, RunnableF g) => RunnableF (Sum f g) where
  runOp (InL op) = runOp op
  runOp (InR op) = runOp op

After this all I have to do is to move the guts of runSimpleF and runLogging into implementations for SimpleFil and LogF respectively

instance RunnableF SimpleFileF where
  runOp (LoadFile fp f') = liftM f' $ readFile fp
  runOp (SaveFile fp d r) = writeFile fp d >> return r

instance RunnableF LogF where
  runOp (Log s a)= putStrLn s >> return a

The rest is left as is. Running now looks like this

> :! cat test.txt
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec a diam lectus.
Sed sit amet ipsum mauris. Maecenas congue ligula ac quam viverra nec
consectetur ante hendrerit.
> runF $ withSimpleFile (map toUpper) "test.txt"
> :! cat test.txt_new
LOREM IPSUM DOLOR SIT AMET, CONSECTETUR ADIPISCING ELIT. DONEC A DIAM LECTUS.
SED SIT AMET IPSUM MAURIS. MAECENAS CONGUE LIGULA AC QUAM VIVERRA NEC
CONSECTETUR ANTE HENDRERIT.
> runF $ foldFree loggingSimpleFileI $ withSimpleFile (map toLower) "test.txt"
** load file test.txt
** save file test.txt_new
> :! cat test.txt_new
lorem ipsum dolor sit amet, consectetur adipiscing elit. donec a diam lectus.
sed sit amet ipsum mauris. maecenas congue ligula ac quam viverra nec
consectetur ante hendrerit.

What else?

With these changes it becomes slightly esier to create new basic API types. but there are still a few things I think could be useful:

  • More generic decorators. logSimpleFileI is tied to decorating only SimpleFileF. I’m fairly sure this could be dealt with by a type class LoggableF and have APIs implement it. I also think there’s a rather natural implementation of LoggableF for Sum f g.
  • Combining decorators (and other interpreters). I’m guessing this is what John refers to as “generalizing interpreters.”
  • Combining APIs. I’d prefer making small APIs of related operations, but then that also means I need a nice way of combining APIs. I had a go at using FreeT but simply gave up on bending the types to my will. In any case I’m not completely sure that a stack of various FreeT is a good direction to go.

Free play, part two

This post on Free play builds on my previous one. It was John A De Goes’ post that finally got me started on playing with Free and his post contian the following:

Moreover, not only do algebras compose (that is, if you have algebras f and g, you can compose them into a composite algebra Coproduct f g for some suitable definition of Coproduct), but interpreters also compose — both horizontally and vertically.

And a little bit later he offers a few types and some details, but not all details. How could something like that look in Haskell?

Starting from the code in the previous post I first created a new type for a logging action

data LogF a = Log String a

This type has to be a Functor

instance Functor LogF where
  fmap f (Log s a) = Log s (f a)

The logging should basically decorate a SimpleFileF a action, so I need a function to map one into a Free LogF a

logSimpleFileI :: SimpleFileF a -> Free LogF ()
logSimpleFileI (LoadFile fp _) = liftF $ Log ("** load file " ++ fp) ()
logSimpleFileI (SaveFile fp _ _) = liftF $ Log ("** save file " ++ fp) ()

Now I needed a Coproduct for Functor. Searching hackage only offered up one for Monoid (in monoid-extras) so I first translated one from PureScript, but later I got some help via Twitter and was pointed to two in Haskell, Data.Functor.Coproduct from comonad and Data.Functor.Sum from transformers, I decided on the one from transformers because of its shorter name and the fact that it was very different from my translated-from-PureScript version.

Following John’s example I use Applicative to combine the logging with the file action

loggingSimpleFileI :: SimpleFileF a -> Free (Sum LogF SimpleFileF) a
loggingSimpleFileI op = toLeft (logSimpleFileI op) *> toRight (liftF op)

with toLeft and toRight defined like this

toLeft :: (Functor f, Functor g) => Free f a -> Free (Sum f g) a
toLeft = hoistFree InL
toRight :: (Functor f, Functor g) => Free g a -> Free (Sum f g) a
toRight = hoistFree InR

With all of this in place I can decorate the program from the last post like this foldFree loggingSimpleFileI (withSimpleF toUpper "FreePlay.hs"). What’s left is a way to run it. The function for that is a natural extension of runsimpleFile


runLogging :: Free (Sum LogF SimpleFileF) a -> IO a
runLogging = foldFree f
  where
    f :: (Sum LogF SimpleFileF) a -> IO a
    f (InL op) = g op
    f (InR op) = h op

    g :: LogF a -> IO a
    g (Log s a)= putStrLn s >> return a

    h :: SimpleFileF a -> IO a
    h (LoadFile fp f') = liftM f' $ readFile fp
    h (SaveFile fp d r) = writeFile fp d >> return r

Running the decorated program

runLogging $ foldFree loggingSimpleFileI (withSimpleF toUpper "FreePlay.hs")

does indeed result in the expected output

** load file FreePlay.hs
** save file FreePlay.hs_new

and the file FreePlay.hs_new contains only uppercase letters.

My thoughts

This ability to decorate actions (or compose algebras) is very nice. There’s probably value in the “multiple interpreters for a program” in some domains, but I have a feeling that it could be a hard sell. However, combining it with this kind of composability adds quite a bit of value in my opinion. I must say I don’t think my code scales very well for adding more decorators (composing more algebras), but hopefully some type wizard can show me a way to improve on that.

The code above is rather crude though, and I have another version that cleans it up quite a bit. That’ll be in the next post.