QuickCheck on a REST API

Since I’m working with web stuff nowadays I thought I’d play a little with translating my old post on using QuickCheck to test C APIs to the web.

The goal and how to reach it

I want to use QuickCheck to test a REST API, just like in the case of the C API the idea is to

  1. generate a sequence of API calls (a program), then
  2. run the sequence against a model, as well as
  3. run the sequence against the web service, and finally
  4. compare the resulting model against reality.

The REST API

I’ll use a small web service I’m working on, and then concentrate on only a small part of the API to begin with.

The parts of the API I’ll use for the programs at this stage are

Method  Route       Example in                                Example out
POST    /users      {"userId": 0, "userName": "Yogi Berra"}   {"userId": 42, "userName": "Yogi Berra"}
DELETE  /users/:id

The following API calls will also be used, but not in the programs

Method  Route       Example in  Example out
GET     /users                  [0,3,7]
GET     /users/:id              {"userId": 42, "userName": "Yogi Berra"}
POST    /reset

Representing API calls

Given the information about the API above it seems the following is enough to represent the two calls of interest together with a constructor representing the end of a program
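A minimal sketch of such a type (the constructor names are my own):

    data ApiCall
      = AddUser Text      -- POST /users
      | DeleteUser Int    -- DELETE /users/:id
      | EndProgram        -- marks the end of a program
      deriving (Show)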

and a program is just a sequence of calls, so a list of ApiCall will do. However, since I want to generate sequences of calls, i.e. implement Arbitrary, I’ll wrap it in a newtype
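Something like this:

    newtype Program = Program [ApiCall]
      deriving (Show)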

Running against a model (simulation)

First of all I need to decide what model to use. Based on the part of the API I’m using I’ll use an ordinary dictionary of Int and Text
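A sketch of the model, assuming Data.Map:

    import           Data.Map  (Map)
    import qualified Data.Map  as Map
    import           Data.Text (Text)

    type Model = Map Int Text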

Simulating execution of a program is simulating each call against a model that’s updated with each step. I expect the final model to correspond to the state of the real service after the program is run for real. The simulation begins with an empty dictionary.

The simulation of the API calls must then be a function taking a model and a call, returning an updated model
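Roughly like this (a sketch, matching the assumptions spelled out in the next paragraph):

    simulateCall :: Model -> ApiCall -> Model
    simulateCall m (AddUser name) = Map.insert newId name m
      where
        newId = if Map.null m then 1 else fst (Map.findMax m) + 1
    simulateCall m (DeleteUser i) = Map.delete i m
    simulateCall m EndProgram     = m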

Here I have to make a few assumptions. First, I assume the indices for the users start at 1. Second, that the next index used is always the successor of the highest currently used index. We’ll see how well this holds up to reality later on.

Running against the web service

Running the program against the actual web service follows the same pattern, but here I’m dealing with the real world, so it’s a little more messy, i.e. IO is involved. First the running of a single call
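A sketch of running a single call, assuming http-client and aeson and a service listening on localhost:3000 (the helper name runCall, the port and the JSON body follow the table above but are otherwise my own choices):

    {-# LANGUAGE OverloadedStrings #-}
    import Data.Aeson          (encode, object, (.=))
    import Network.HTTP.Client

    runCall :: Manager -> ApiCall -> IO ()
    runCall mgr (AddUser name) = do
      req <- parseRequest "http://localhost:3000/users"
      let body = encode $ object ["userId" .= (0 :: Int), "userName" .= name]
      _ <- httpLbs req { method = "POST"
                       , requestHeaders = [("Content-Type", "application/json")]
                       , requestBody = RequestBodyLBS body
                       } mgr
      return ()
    runCall mgr (DeleteUser i) = do
      req <- parseRequest ("http://localhost:3000/users/" ++ show i)
      _ <- httpLbs req { method = "DELETE" } mgr
      return ()
    runCall _ EndProgram = return ()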

The running of a program is slightly more involved. Of course I have to set up the Manager needed for the HTTP calls, but I also need to

  1. ensure that the web service is in a well-known state before starting, and
  2. extract the state of the web service after running the program, so I can compare it to the model

The call to POST /reset resets the web service. I would have liked to simply restart the service completely, but I failed in automating it. I think I’ll have to take a closer look at the implementation of scotty to find a way.

Extracting the web service state and packaging it in a Model is a matter of calling GET /users and then repeatedly calling GET /users/:id with each id gotten from the first call
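Putting it together, a sketch of the state extraction and the full runner (the User wrapper, getState and runProgram are names I made up for the sketch):

    import Data.Aeson
    import Data.Maybe (fromMaybe)

    data User = User Int Text

    instance FromJSON User where
      parseJSON = withObject "User" $ \ o ->
        User <$> o .: "userId" <*> o .: "userName"

    getState :: Manager -> IO Model
    getState mgr = do
      idsReq <- parseRequest "http://localhost:3000/users"
      ids <- fromMaybe [] . decode . responseBody <$> httpLbs idsReq mgr
      users <- mapM getUser (ids :: [Int])
      return $ Map.fromList [(i, n) | Just (User i n) <- users]
      where
        getUser i = do
          req <- parseRequest ("http://localhost:3000/users/" ++ show i)
          decode . responseBody <$> httpLbs req mgr

    runProgram :: Program -> IO Model
    runProgram (Program calls) = do
      mgr <- newManager defaultManagerSettings
      resetReq <- parseRequest "http://localhost:3000/reset"
      _ <- httpLbs resetReq { method = "POST" } mgr    -- well-known state
      mapM_ (runCall mgr) calls
      getState mgr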

Generating programs

My approach to generating a program is based on the idea that given a certain state there is only a limited number of possible calls that make sense. Given a model m it makes sense to make one of the following calls:

  • add a new user
  • delete an existing user
  • end the program

Based on this, writing genProgram is rather straightforward
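A sketch of what it might look like (the weights and the example names are arbitrary):

    import Test.QuickCheck

    genProgram :: Gen Program
    genProgram = Program <$> go Map.empty
      where
        go m = frequency $
            [ (4, genAdd m) ] ++
            [ (4, genDelete m) | not (Map.null m) ] ++
            [ (1, return [EndProgram]) ]

        genAdd m = do
          name <- elements ["Yogi Berra", "Tom Waits", "Ada Lovelace"]
          let i = if Map.null m then 1 else fst (Map.findMax m) + 1
          (AddUser name :) <$> go (Map.insert i name m)

        genDelete m = do
          i <- elements (Map.keys m)
          (DeleteUser i :) <$> go (Map.delete i m)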

Armed with that the Arbitrary instance for Program can be implemented as1
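i.e. something as simple as:

    instance Arbitrary Program where
      arbitrary = genProgram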

The property of an API

The steps in the first section can be used as a recipe for writing the property
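A sketch of the property, using Test.QuickCheck.Monadic and the helpers from the sketches above (the property name is my own):

    import Test.QuickCheck.Monadic (assert, monadicIO, run)

    prop_progCorrectness :: Program -> Property
    prop_progCorrectness prog@(Program calls) = monadicIO $ do
      let model = foldl simulateCall Map.empty calls   -- step 2: run against the model
      real <- run $ runProgram prog                    -- step 3: run against the web service
      assert (model == real)                           -- step 4: compare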

What next?

There are some improvements that I’d like to make:

  • Make the generation of Program better in the sense that the programs become longer. I think this is important as I start tackling larger APIs.
  • Write an implementation of shrink for Program. With longer programs it’s of course more important to actually implement shrink.

I’d love to hear if others are using QuickCheck to test REST APIs in some way, if anyone has suggestions for improvements, and of course ideas for how to implement shrink in a nice way.


  1. Yes, I completely skip the issue of shrinking programs at this point. This is OK for now though, because the generated Programs do end up being very short indeed.

A simple zaw widget for jumping to git projects

A colleague at work showed me a script he put together to quickly jump to the numerous git projects we work with. It’s based on dmenu and looks rather nice. However, I’d rather have something based on zsh, but when looking around I didn’t find anything that really fit. So, instead I ended up writing a simple widget for zaw.

I then bound it like this

On mocks and stubs in python (free monad or interpreter pattern)

A few weeks ago I watched a video where Ken Scambler talks about mocks and stubs. In particular he talks about how to get rid of them.

One part is about coding IO operations as data and using the GoF interpreter pattern to interpret them.

What he’s talking about is of course free monads, but I feel he’s glossing over a lot of details. Based on some of the questions asked during the talk I think I share that feeling with some people in the audience. Specifically I feel he skipped over the following:

  • How does one actually write such code in a mainstream OO/imperative language?
  • What’s required of the language in order to allow using the techniques he’s talking about?
  • Errors tend to break abstractions, so how does one deal with errors (i.e. exceptions)?

Every time I’ve used mocks and stubs for unit testing I’ve had a feeling that “this can’t be how it’s supposed to be done!” So to me, Ken’s talk offered some hope, and I really want to know how applicable the ideas are in mainstream OO/imperative languages.

The example

To play around with this I picked the following function (in Python):
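Roughly a function like this (a sketch; the buffer size is a guess):

    import os

    # Open a file, read a chunk of it, count the characters, close the file,
    # and return the count.
    def count_chars_of_file(fn):
        fd = os.open(fn, os.O_RDONLY)
        text = os.read(fd, 10000)
        n = len(text)
        os.close(fd)
        return n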

It’s small and simple, but I think it suffices to highlight a few important points. So the goal is to rewrite this function such that calls to IO operations (actions) (e.g. os.read) are replaced by data (an instance of some data type) conveying the intent of the operation. This data can later be passed to an interpreter of actions.

Thoughts on the execution of actions and the interpreter pattern

When reading the examples in the description of the interpreter pattern what stands out to me is that they are either

  1. a list of expressions, or
  2. a tree of expressions

that is passed to an interpreter. Will this do for us when trying to rewrite count_chars_of_file?

No, it won’t! Here’s why:

  • A tree of actions doesn’t really make sense. Our actions are small and simple, they encode the intent of a single IO operation.
  • A list of actions can’t deal with interspersed non-actions, in this case it’s the line n = len(text) that causes a problem.

The interpreter pattern misses something that is crucial in this case: the running of the interpreter must be intermingled with running non-interpreted code. The way I think of it is that not only does the action need to be present and dealt with, but also the rest of the program; the latter is commonly called a continuation.

So, can we introduce actions and rewrite count_chars_of_file such that we pause the program when interpretation of an action is required, interpret it, and then resume where we left off?

Sure, but it’s not really idiomatic Python code!

Actions and continuations

The IO operations (actions) are represented as a named tuple:
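Something along these lines (the field names besides op are my own):

    from collections import namedtuple

    # An action carries the name of the operation, its arguments, and the
    # continuation (the rest of the program) that receives the result.
    Op = namedtuple('Op', ['op', 'args', 'k'])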

and the functions returning actions can then be written as
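for example (hypothetical helper names; each one takes the operation's arguments plus the continuation k):

    def open_op(fn, k):
        return Op('open', (fn,), k)

    def read_op(fd, length, k):
        return Op('read', (fd, length), k)

    def close_op(fd, k):
        return Op('close', (fd,), k)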

The interpreter is then an if statement checking the value of op.op with each branch executing the IO operation and passing the result to the rest of the program. I decided to wrap it directly in the program runner:
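A sketch of such a runner (runProgram is the name referred to later on):

    import os

    # Keep interpreting as long as we are handed actions; anything that isn't
    # an action is the final result of the program.
    def runProgram(prog):
        while isinstance(prog, Op):
            if prog.op == 'open':
                res = os.open(prog.args[0], os.O_RDONLY)
            elif prog.op == 'read':
                res = os.read(*prog.args)
            elif prog.op == 'close':
                res = os.close(prog.args[0])
            else:
                raise ValueError('unknown operation: %s' % prog.op)
            prog = prog.k(res)  # pass the result to the rest of the program
        return prog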

So far so good, but what will all of this do to count_chars_of_file?

Well, it’s not quite as easy to read any more (basically it’s rewritten in CPS):
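Something like this (a sketch): each action's continuation builds the next action, and the pure len() call lives inside a continuation.

    def count_chars_of_file(fn):
        return open_op(
            fn, lambda fd: read_op(
                fd, 10000, lambda text: close_op(
                    fd, lambda _: len(text))))

    # runProgram(count_chars_of_file('foo.txt'))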

Generators to the rescue

Python does have a notion of continuations in the form of generators.1 By making count_chars_of_file into a generator it’s possible to remove the explicit continuations and the program actually resembles the original one again.

The type for the actions loses one member, and the functions creating them lose an argument:
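i.e. something like this — no explicit continuation any more, the paused generator is the continuation:

    from collections import namedtuple

    Op = namedtuple('Op', ['op', 'args'])

    def open_op(fn):
        return Op('open', (fn,))

    def read_op(fd, length):
        return Op('read', (fd, length))

    def close_op(fd):
        return Op('close', (fd,))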

The interpreter and program runner must be modified to step the generator until its end:
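A sketch of the modified runner, assuming Python 3 (a generator's return value travels in StopIteration):

    import os

    def runProgram(prog):
        try:
            op = next(prog)
            while True:
                if op.op == 'open':
                    res = os.open(op.args[0], os.O_RDONLY)
                elif op.op == 'read':
                    res = os.read(*op.args)
                elif op.op == 'close':
                    res = os.close(op.args[0])
                else:
                    raise ValueError('unknown operation: %s' % op.op)
                op = prog.send(res)  # resume the generator with the result
        except StopIteration as stop:
            return stop.value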

Finally, the generator-version of count_chars_of_file goes back to being a bit more readable:
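Something like this — each IO intent is yielded, and the result comes back via send():

    def count_chars_of_file(fn):
        fd = yield open_op(fn)
        text = yield read_op(fd, 10000)
        n = len(text)
        yield close_op(fd)
        return n

    # runProgram(count_chars_of_file('foo.txt'))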

Generators all the way

Limitations of Python generators mean that we either have to push the interpreter (runProgram) down to where count_chars_of_file is used, or make all intermediate layers into generators and rewrite the interpreter to deal with this. It could look something like this then:
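One way it could look, using Python 3's yield from for the delegation (report_file_size is a made-up caller; with yield from the runner above can even be reused unchanged):

    def report_file_size(fn):
        n = yield from count_chars_of_file(fn)
        return '%s contains %d characters' % (fn, n)

    # runProgram(report_file_size('foo.txt'))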

Final thoughts

I think I’ve shown one way to achieve, at least parts of, what Ken talks about. The resulting code looks almost like “normal Python”. There are some things to note:

  1. Exception handling is missing. I know of no way to inject an exception into a generator in Python so I’m guessing that exceptions from running the IO operations would have to be passed in via generator.send as a normal value, which means that exception handling code would have to look decidedly non-Pythonic.
  2. Using this approach means the language must have support for generators (or some other way to represent the rest of the program). I think this rules out Java, but probably it can be done in C#.
  3. I’ve only used a single interpreter here, but I see no issues with combining interpreters (to combine domains of operations like file operations and network operations). I also think it’d be possible to use it to realize what De Goes calls Onion Architecture.

Would I ever advocate this approach for a larger Python project, or even any project in an OO/imperative language?

I’m not sure! I think that testing using mocks and stubs has led to a smelly code base each and every time I’ve used it, but this approach feels a little too far from how OO/imperative code usually is written. I would love to try it out and see what the implications really are.


  1. I know, I know, coroutines! I’m simplifying and brushing over details here, but I don’t think I’m brushing over any details that are important for this example.

That's a large project

From LWN October 6, 2016:

Over the course of the last year, the project accepted about eight changes per hour — every hour — from over 4,000 developers sponsored by over 400 companies.

Free, take 2

The other day I read a blog post on monads and stuff after which I felt rather silly about my earlier posts on Free.

I think this is probably the post I should have written instead :)

I’ll use three pieces of code, each one builds on the one before:

  • Free1.hs - Uses a free monad for a single algebra/API (full code here).
  • Free2.hs - Uses a free monad for two algebras/APIs, where one decorates the other (full code here).
  • Free3.hs - Uses a free monad for three algebras/APIs, where two are used in the program and the remaining one decorates the other two (full code here).

The first - one algebra

I’m re-using the algebras from my previous posts, but I believe it makes it easier to follow along if the amount of jumping between posts is minimized so here is the first one once again:
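A sketch of it, with the Functor instance derived this time (the layout of the constructors follows the discussion further down):

    {-# LANGUAGE DeriveFunctor #-}

    import Control.Monad.Free (Free (..), foldFree, hoistFree, liftF)

    data SimpleFileF a
      = LoadFile FilePath (String -> a)
      | SaveFile FilePath String a
      deriving (Functor)

    type SimpleFileAPI = Free SimpleFileF

    loadFile :: FilePath -> SimpleFileAPI String
    loadFile fp = liftF $ LoadFile fp id

    saveFile :: FilePath -> String -> SimpleFileAPI ()
    saveFile fp d = liftF $ SaveFile fp d ()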

It’s a ridiculously small one, but I believe it’s good enough to work with. In the previous posts I implemented the Functor instances manually. I couldn’t be bothered this time around; I think I pretty much know how to do that for this kind of type now.

Having a type for the algebra, SimpleFileAPI, is convenient already now, even more so in the other examples.

The two convenience functions at the end make it straightforward to write functions using the algebra:
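for instance a function like this (a sketch; the _new suffix for the output file is a guess):

    withSimpleFile :: (String -> String) -> FilePath -> SimpleFileAPI ()
    withSimpleFile f fp = do
      d <- loadFile fp
      saveFile (fp ++ "_new") (f d)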

This is simple, straightforward monadic code. Easy to read and work with. Of course it doesn’t actually do anything at all yet. For that I need an interpreter, something that translates (reduces) the algebra, the API, the commands, call them what you will, into the (side-)effects I want. For Free that is foldFree together with a suitable function f :: SimpleFileF a -> IO a.

I want LoadFile to translate to a file being read and SaveFile to some data being saved to a file. That makes it pretty obvious how that f needs to be written:
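i.e. something like:

    runSimpleFile :: SimpleFileAPI a -> IO a
    runSimpleFile = foldFree f
      where
        f :: SimpleFileF x -> IO x
        f (LoadFile fp g)   = g <$> readFile fp
        f (SaveFile fp d r) = writeFile fp d >> return r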

At this point it might be good to explain the constructors of SimpleFileF a bit more. At first I thought they looked a bit funny. I mean, why does SaveFile have an a at all since it obviously always should result in ()? And what’s up with that function in LoadFile?

It did become a little clearer to me after some thought and having a look at Free:
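whose two constructors look like this:

    data Free f a
      = Pure a
      | Free (f (Free f a))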

I personally find the latter constructor a bit mind-bending. I can handle recursive functions fairly well, but recursive types have a tendency to confuse me. From what I understand one can think of Free as similar to a list. Pure ends the list (Nil) and Free adds one instance of f to the rest of the list (Cons). Since Free f a is a monad one can think of a as the result of the command.

If I were to write saveFile explicitly it’d look like this
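    saveFile :: FilePath -> String -> SimpleFileAPI ()
    saveFile fp d = Free (SaveFile fp d (Pure ()))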

and for loadFile:
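    loadFile :: FilePath -> SimpleFileAPI String
    loadFile fp = Free (LoadFile fp Pure)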

But let’s get back to the type and why ‘a’ occurs like it does in the two constructors. As Gabriel G shows in his post Why free monads matter a constructor without a would result in termination. In other words, if SaveFile didn’t hold an a I’d not be able to write, in a natural way, a function that saves two files.

Another limiting factor is that foldFree of the Free implementation I’m using has the type Monad m => (forall x. f x -> m x) -> Free f a -> m a. This sets a requirement on what the function translating from my API into effects may look like, i.e. what f in runSimpleFile may look like. If SaveFile had no a to return what would f (SaveFile {}) return, how could it ever satisfy the required type?

The reason for LoadFile having a function String -> a is simply that there is no data yet, but I still want to be able to manipulate it. Using a function and function composition is the ticket then.

I think that’s all there is to say about the first piece of code. To run it take a look at the comment at the end of the file and then play with it. If you want to turn all characters of a file foo into upper case you can use
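something like this in GHCi (toUpper comes from Data.Char):

    runSimpleFile $ withSimpleFile (map toUpper) "foo"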

The second - two algebras, one decorating the other

The second piece of code almost only adds to the first one. There is one exception though: the function runSimpleFile is removed. Instead I’ve taken the transformation function, which used to be called f and was internal to runSimpleFile, and moved it out. It’s called stepSimpleFile:
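i.e. the old helper given a name of its own:

    stepSimpleFile :: SimpleFileF a -> IO a
    stepSimpleFile (LoadFile fp g)   = g <$> readFile fp
    stepSimpleFile (SaveFile fp d r) = writeFile fp d >> return r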

The logging API, LogAPI, follows the same pattern as SimpleFileAPI and I’m counting on the description above being clear enough to not have to repeat myself. For completeness I include the code:
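A sketch of it (LogAPI is from the text, the names LogF, Log and logStr are my guesses):

    data LogF a = Log String a
      deriving (Functor)

    type LogAPI = Free LogF

    logStr :: String -> LogAPI ()
    logStr s = liftF $ Log s ()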

I intend the LogAPI to be used as embellishments on the SimpleFileAPI, in other words I somehow have to turn an operation of SimpleFileAPI into an operation of LogAPI, i.e. I need a transformation. I called it logSimpleFileT and let it turn operations in SimpleFileF (i.e. not exactly SimpleFileAPI) into LogAPI (if you are wondering about my choice of type I hope it’ll become clear below, just trust me for now that this is a good choice):
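A sketch of it (the exact log messages are guesses):

    logSimpleFileT :: SimpleFileF a -> LogAPI ()
    logSimpleFileT (LoadFile fp _)   = logStr $ "LoadFile " ++ fp
    logSimpleFileT (SaveFile fp _ _) = logStr $ "SaveFile " ++ fp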

So far everything is hopefully very straightforward and unsurprising. Now I need to combine the two APIs, I need to add them; in other words, I need a sum type:
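something like this (S1 and S2 are my names for the constructors):

    data SumF a
      = S1 (SimpleFileF a)
      | S2 (LogF a)
      deriving (Functor)

    type SumAPI = Free SumF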

The next question is how to turn my two original APIs, SimpleFileAPI and LogAPI, into SumAPI. Luckily that’s already solved by the function hoistFree:
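which has the type

    hoistFree :: Functor g => (forall a. f a -> g a) -> Free f b -> Free g b

so wrapping every operation in the right constructor is all that's needed, e.g. hoistFree S1 :: SimpleFileAPI a -> SumAPI a.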

With this and logSimpleFileT from above I can use foldFree to decorate each operation with a logging operation like this:
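A sketch of it (presumably this is the logT that the third section refers back to):

    logT :: SimpleFileF a -> SumAPI a
    logT op = hoistFree S2 (logSimpleFileT op) *> hoistFree S1 (liftF op)

A whole program is then decorated with foldFree logT.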

This is where the type of logSimpleFileT hopefully makes sense!

Just as in the first section of this post, I need an interpreter for my API (SumAPI this time). Once again it’s written using foldFree, but this time I provide the interpreters for the sub-algebras (what I’ve chosen to call step functions):
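A sketch, with a step function for the logging part as well (runSum and stepLog are my names):

    stepLog :: LogF a -> IO a
    stepLog (Log s r) = putStrLn s >> return r

    runSum :: SumAPI a -> IO a
    runSum = foldFree f
      where
        f :: SumF x -> IO x
        f (S1 op) = stepSimpleFile op
        f (S2 op) = stepLog op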

The file has a comment at the end for how to run it. The same example as in the previous section, but now with logging, looks like this
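i.e. something like

    runSum $ foldFree logT $ withSimpleFile (map toUpper) "foo"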

The third - three algebras, one decorating the other two

To combine three algebras I simply take what’s in the previous section and extend it, i.e. a sum type with three constructors:
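something like this (A2 is the constructor mentioned below; which of A1 and A3 wraps which of the other two functors is a guess):

    data S a
      = A1 (LogF a)
      | A2 (SimpleFileF a)
      | A3 (StdIoF a)
      deriving (Functor)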

With this I’ve already revealed that my three APIs are the two from previous sections, LogAPI (for decorating the other APIs), SimpleFileAPI and a new one, StdIoAPI.

I want to combine them in such a way that I can write functions using both APIs at the same time. Then I modify withSimpleFile into
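the same body as before, but now living in Free S:

    withSimpleFile :: (String -> String) -> FilePath -> Free S ()
    withSimpleFile f fp = do
      d <- loadFile fp
      saveFile (fp ++ "_new") (f d)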

and I can add another function that uses it with StdIoAPI:
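for instance something like this (a made-up example; stdIoPut is defined together with StdIoAPI below):

    verboseWithSimpleFile :: (String -> String) -> FilePath -> Free S ()
    verboseWithSimpleFile f fp = do
      stdIoPut $ "verboseWithSimpleFile on " ++ fp
      withSimpleFile f fp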

The way to allow the APIs to be combined this way is to bake S into the convenience functions from the start. This means the code for SimpleFileAPI has to change slightly (note the use of A2 in loadFile and saveFile):
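A sketch of the changed parts:

    loadFile :: FilePath -> Free S String
    loadFile fp = liftF $ A2 $ LoadFile fp id

    saveFile :: FilePath -> String -> Free S ()
    saveFile fp d = liftF $ A2 $ SaveFile fp d ()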

The new API, StdIoAPI, has only one operation:
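A sketch of it (the functor, constructor and helper names are my own):

    data StdIoF a = StdIoPut String a
      deriving (Functor)

    stdIoPut :: String -> Free S ()
    stdIoPut s = liftF $ A3 $ StdIoPut s ()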

The logging API, LogAPI, looks exactly the same but I now need two transformation functions, one for SimpleFileAPI and one for StdIoAPI.
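Sketches of both (the message strings are guesses):

    logSimpleFileT :: SimpleFileF a -> LogAPI ()
    logSimpleFileT (LoadFile fp _)   = logStr $ "LoadFile " ++ fp
    logSimpleFileT (SaveFile fp _ _) = logStr $ "SaveFile " ++ fp

    logStdIoT :: StdIoF a -> LogAPI ()
    logStdIoT (StdIoPut s _) = logStr $ "StdIoPut " ++ s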

The new version of logT needs to operate on S in order to decorate both APIs.
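A sketch of how that could look; decorating a whole program is again foldFree logT:

    logT :: S a -> Free S a
    logT op@(A1 _) = liftF op    -- don't log the log operations themselves
    logT op@(A2 o) = hoistFree A1 (logSimpleFileT o) *> liftF op
    logT op@(A3 o) = hoistFree A1 (logStdIoT o) *> liftF op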

This file also has comments on how to run it at the end. This time there are two examples, one on how to run it without logging

and one with logging

CMake, ExternalData, and custom fetch script

I failed to find a concrete example of how to use the CMake module ExternalData with a custom fetch script. Since I finally managed to work out how to use it I thought I’d try to help out the next person who needs to go down this route.

Why ExternalData?

I thought I’d start with a short justification of why I was looking at the module at all.

At work I work with a product that processes images and video. When writing tests we often need some rather large files (from MiB to GiB) as input. The two obvious options are:

  1. Check the files into our Git repo, or
  2. Put them on shared storage

Neither of these is very appealing. The former just doesn’t feel quite right: these are large binary files that rarely, if ever, change, so why place them under version control at all? And if they do change the Git repo is likely to balloon in size and impact cloning times negatively. The latter makes it difficult to run our tests on a machine that isn’t on the office network, and any changes to the files will break older tests, unless we always only add files, never modify any in place. On the other hand, the former guarantees that the files needed for testing are always available and it is possible to modify the files without breaking older tests. The pro of the latter is that we only download the files needed for the current tests.

ExternalData is one option to address this. On some level it feels like it offers a combination of both options above:

  • It’s possible to use the shared storage
  • When the shared storage isn’t available it’s possible to fall back on downloading the files via other means
  • The layout of the storage is such that modifying in place is much less likely
  • Only the files needed for the current tests will be downloaded when building off-site

The object store

We do our building in docker images that do have our shared storage mapped in, so I’d like them to take advantage of that. At the same time I want the builds performed off-site to download the files. To get this behaviour I defined two object stores:
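Something along these lines (the local store path is an assumption; the shared one is the store shown below):

    set(ExternalData_OBJECT_STORES
      ${CMAKE_SOURCE_DIR}/.ExternalData    # local store, downloads end up here (absolute once expanded)
      /mnt/share/over/nfs/Objects          # the shared storage
      )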

The module will search each of these for the required files and download only if they aren’t found. Downloaded files will be put into the first of the stores. Oh, and it’s very important that the first store is given with an absolute path!

The store on the shared storage looks something like this:

/mnt/share/over/nfs/Objects
└── MD5
    ├── 94ed17f9b6c74a732fba7b243ab945ff
    └── a2036177b190fbee6e9e038b718f1c20

I can then drop a file MyInput.avi.md5 in my source tree with the md5 of the real file (e.g. a2036177b190fbee6e9e038b718f1c20) as the content. Once that is done I can follow the example found in the introduction of the reference documentation.

curl vs sftp

So far so good. This now works on-site, but for off-site use I need to fetch the needed files. The last section of the reference documentation is called Custom Fetch Scripts. It mentions that files are normally downloaded using file(DOWNLOAD). Neither there, nor in the documentation for file, is there a mention of what is used under the hood to fetch the files. After asking in #cmake I found out that it’s curl. While curl does handle SFTP I didn’t get it to work with my known_hosts file, nor with my SSH agent (both from OpenSSH). On the other hand it was rather easy to configure sftp to fetch a file from the internet-facing SSH server we have. Now I just had to hook it into CMake somehow.

Custom fetch script

As the section on “Custom Fetch Scripts” mentions, three things are needed:

  1. Specify the script via the ExternalDataCustomScript:// protocol.
  2. Tell CMake where it can find the fetch script.
  3. The fetch script itself.

The first two steps are done by providing a URL template and pointing to the script via a special variable:
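Something like this (the key sftp and the script name are my own choices; the %(algo) and %(hash) placeholders are filled in by the module):

    set(ExternalData_URL_TEMPLATES
      "ExternalDataCustomScript://sftp/%(algo)/%(hash)"
      )
    set(ExternalData_CUSTOM_SCRIPT_sftp ${CMAKE_SOURCE_DIR}/cmake/FetchFromSftp.cmake)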

It took me a ridiculous amount of time to work out how to write a script that turns out to be rather short. This is an experience that seems to repeat itself when using CMake; it could say something about me, or something about CMake.
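A sketch of what such a script can look like (FetchFromSftp.cmake from above; the host and remote path are placeholders). ExternalData invokes it with ExternalData_CUSTOM_LOCATION holding the %(algo)/%(hash) part of the URL and ExternalData_CUSTOM_FILE holding the destination; on failure the script is expected to set ExternalData_CUSTOM_ERROR:

    execute_process(
      COMMAND sftp ssh.example.com:/storage/Objects/${ExternalData_CUSTOM_LOCATION} ${ExternalData_CUSTOM_FILE}
      RESULT_VARIABLE _fetch_result
      ERROR_VARIABLE _fetch_error
      )
    if(NOT _fetch_result EQUAL 0)
      set(ExternalData_CUSTOM_ERROR "sftp failed: ${_fetch_error}")
    endif()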

This script is run with cmake -P in the binary dir of the CMakeLists.txt where the test is defined, which means it’s oblivious about the project it’s part of. PROJECT_BINARY_DIR is empty and CMAKE_BINARY_DIR is the same as CMAKE_CURRENT_BINARY_DIR. This is the reason why the first store in ExternalData_OBJECT_STORES has to be an absolute path – it’s very difficult, if not impossible, to find the correct placement of the object store otherwise.

Qt5+D-Bus+CMake, a complete example

Yesterday I started digging into Qt5 and D-Bus. I never found a complete example, so I put one together myself: https://gist.github.com/magthe/2cf7220655bd8bf431259cc7dee99f64.

Final version of JSON to sum type

After some feedback on my previous post I ended up with the following implementation.

Thanks David for your suggestions.

From JSON to sum type

For a while I’ve been planning to take full ownership of the JSON serialisation and parsing in cblrepo. The recent inclusion of instances of ToJSON and FromJSON for Version pushed me to take the first step by writing my own instances for all external types.

When doing this I noticed that all examples in the aeson docs use a product
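i.e. something like the classic Person example from the aeson documentation:

    data Person = Person { name :: Text, age :: Int }

    instance FromJSON Person where
      parseJSON (Object v) = Person <$> v .: "name" <*> v .: "age"
      parseJSON _          = mzero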

whereas I had to deal with quite a few sums, e.g. VersionRange. At first I struggled a little with how to write an instance of FromJSON. After quite a bit of thinking I came up with the following, which I think is fairly nice, but I’d really like to hear what others think about it. Maybe I’ve just missed a much simpler way of implementing parseJSON:
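The general shape of it, shown here on a small made-up sum type rather than on the actual VersionRange code: one parser per constructor, tried in order.

    import Data.Aeson
    import Data.Foldable (asum)

    data Shape
      = Circle Double
      | Rectangle Double Double

    instance FromJSON Shape where
      parseJSON = withObject "Shape" $ \ o -> asum
        [ Circle <$> o .: "radius"
        , Rectangle <$> o .: "width" <*> o .: "height"
        ]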

Any and all comments and suggestions are more than welcome!

Freer play with effects

In the previous posts on my playing with free I got stuck at combining APIs. I recalled reading a paper on extensible effects as an alternative to monad transformers. I have to admit to not having finished the paper, and not quite understanding the part I did read. When looking it up again I found that the work had continued and that there is a paper on more extensible effects. (I got to it via http://okmij.org/ftp/Haskell/extensible/.)

A quick search of Hackage revealed the package extensible-effects with an implementation of the ideas, including the stuff in the latter paper. So, what would the examples from my previous posts look like using extensible effects?

Opening

The examples require a few extensions and modules:
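probably something along these lines (the exact set may differ):

    {-# LANGUAGE FlexibleContexts #-}
    {-# LANGUAGE GADTs            #-}
    {-# LANGUAGE TypeOperators    #-}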

and
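the modules from extensible-effects plus a little standard library (a sketch):

    import Control.Eff
    import Control.Eff.Lift
    import Control.Eff.Operational

    import Data.Char (toUpper)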

Just an API

This part was greatly helped by the fact that there is an example in extensible-effects.

I start with defining the SimpleFile API using GADTs
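roughly:

    data SimpleFileAPI a where
      LoadFile :: FilePath -> SimpleFileAPI String
      SaveFile :: FilePath -> String -> SimpleFileAPI ()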

The usage of the constructors needs to be wrapped up in singleton. To remember that I create two convenience functions
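something like this (the exact constraints, e.g. Typeable, depend on the version of extensible-effects):

    loadFile :: Member (Program SimpleFileAPI) r => FilePath -> Eff r String
    loadFile = singleton . LoadFile

    saveFile :: Member (Program SimpleFileAPI) r => FilePath -> String -> Eff r ()
    saveFile fp = singleton . SaveFile fp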

For withSimpleFile I only have to modify the type
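the body stays the same as in the Free posts:

    withSimpleFile :: Member (Program SimpleFileAPI) r => (String -> String) -> FilePath -> Eff r ()
    withSimpleFile f fn = do
      d <- loadFile fn
      saveFile (fn ++ "_new") (f d)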

Now for the guts of it, the interpreter.
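A sketch of it: runProgram from Control.Eff.Operational takes a function mapping every operation to an effect, here lifted IO (again, the exact constraint syntax depends on the library version):

    simpleFileOp :: SetMember Lift (Lift IO) r => SimpleFileAPI a -> Eff r a
    simpleFileOp (LoadFile fp)   = lift $ readFile fp
    simpleFileOp (SaveFile fp s) = lift $ writeFile fp s

    runSimpleFile = runProgram simpleFileOp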

Running it is fairly simple after this

> :! cat test.txt 
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec a diam lectus.
Sed sit amet ipsum mauris. Maecenas congue ligula ac quam viverra nec
consectetur ante hendrerit.
> runLift $ runSimpleFile $ withSimpleFile (map toUpper) "test.txt"
> :! cat test.txt_new 
LOREM IPSUM DOLOR SIT AMET, CONSECTETUR ADIPISCING ELIT. DONEC A DIAM LECTUS.
SED SIT AMET IPSUM MAURIS. MAECENAS CONGUE LIGULA AC QUAM VIVERRA NEC
CONSECTETUR ANTE HENDRERIT.

Now, that was pretty easy. It looks almost exactly like when using Free, only without the Functor instance and rather more complicated types.

Combining two APIs

Now I get to the stuff that I didn’t manage to do using Free; combining two different APIs.

I start with defining another API. This one is truly a play example, sorry for that, but it doesn’t really matter. The type with convenience function looks like this
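A sketch (the operation and helper names are made up):

    data StdIoAPI a where
      PutStrLine :: String -> StdIoAPI ()

    putStrLine :: Member (Program StdIoAPI) r => String -> Eff r ()
    putStrLine = singleton . PutStrLine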

The interpreter then is straightforward
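in the same style as above:

    stdIoOp :: SetMember Lift (Lift IO) r => StdIoAPI a -> Eff r a
    stdIoOp (PutStrLine s) = lift $ putStrLn s

    runStdIo = runProgram stdIoOp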

Now I just need a program that combines the two APIs
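for instance (the message matches the session output below):

    verboseWithSimpleFile :: (Member (Program StdIoAPI) r, Member (Program SimpleFileAPI) r)
                          => (String -> String) -> FilePath -> Eff r ()
    verboseWithSimpleFile f fn = do
      putStrLine $ "verboseWithSimpleFile on " ++ fn
      withSimpleFile f fn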

That type is surprisingly clear I find, albeit a bit on the long side. Running it is just a matter of combining runStdIo and runSimpleFile.

> :! cat test.txt 
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec a diam lectus.
Sed sit amet ipsum mauris. Maecenas congue ligula ac quam viverra nec
consectetur ante hendrerit.
> runLift $ runSimpleFile $ runStdIo $ verboseWithSimpleFile (map toUpper) "test.txt"
verboseWithSimpleFile on test.txt
> :! cat test.txt_new 
LOREM IPSUM DOLOR SIT AMET, CONSECTETUR ADIPISCING ELIT. DONEC A DIAM LECTUS.
SED SIT AMET IPSUM MAURIS. MAECENAS CONGUE LIGULA AC QUAM VIVERRA NEC
CONSECTETUR ANTE HENDRERIT.

Oh, and it doesn’t matter in what order the interpreters are run!

At this point I got really excited about Eff because now it’s obvious that I’ll be able to write the logging “decorator”, in fact it’s clear that it’ll be rather simple too.

The logging

As before I start with a data type and a convenience function
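A sketch (the type and constructor names are guesses; logStr is used below):

    data LogAPI a where
      LogStr :: String -> LogAPI ()

    logStr :: Member (Program LogAPI) r => String -> Eff r ()
    logStr = singleton . LogStr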

For the decorating I can make use of the fact that APIs can be combined like I did above. That is, I don’t need to bother with any coproduct (Sum) or anything like that, I can simply just push in a call to logStr before each use of SimpleFileAPI
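i.e. a translation like this (logSimpleFileOp is the name used in the session below):

    logSimpleFileOp :: (Member (Program LogAPI) r, Member (Program SimpleFileAPI) r)
                    => SimpleFileAPI a -> Eff r a
    logSimpleFileOp op@(LoadFile fp)   = logStr ("LoadFile " ++ fp) >> singleton op
    logSimpleFileOp op@(SaveFile fp _) = logStr ("SaveFile " ++ fp) >> singleton op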

Of course an interpreter is needed as well
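same pattern as the other interpreters (runLogger appears in the session below):

    logOp :: SetMember Lift (Lift IO) r => LogAPI a -> Eff r a
    logOp (LogStr s) = lift $ putStrLn s

    runLogger = runProgram logOp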

Running is, once again, a matter of stacking interpreters

> :! cat test.txt
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec a diam lectus.
Sed sit amet ipsum mauris. Maecenas congue ligula ac quam viverra nec
consectetur ante hendrerit.
> runLift $ runLogger $ runSimpleFile $ runProgram logSimpleFileOp $ withSimpleFile (map toUpper) "test.txt"
LoadFile test.txt
SaveFile test.txt_new
> :! cat test.txt_new 
LOREM IPSUM DOLOR SIT AMET, CONSECTETUR ADIPISCING ELIT. DONEC A DIAM LECTUS.
SED SIT AMET IPSUM MAURIS. MAECENAS CONGUE LIGULA AC QUAM VIVERRA NEC
CONSECTETUR ANTE HENDRERIT.

Closing thoughts

With Eff I’ve pretty much arrived where I wanted, I can

  • define APIs of operations in a simple way (simpler than when using Free even).
  • write a definitional interpreter for the operations.
  • combine two different APIs in the same function.
  • translate from one API to another (or even to a set of other APIs).

On top of that, I can do this without having to write a ridiculous amount of code.

I’m sure there are drawbacks as well. There’s a mention of some of them in the paper. However, for my typical uses of Haskell I haven’t read anything that would be a deal breaker.