Posts tagged ‘cabal’

Maintaining Haskell packages for a Linux distribution—cblrepo

Maintaining a large set of Haskell packages for a Linux distribution is quite a chore, especially if one wants to track Hackage as closely as possible. Several distributions have tools to automatically convert Cabal-based packages into distribution packages, e.g. cabal2arch for ArchLinux and cabal-rpm. They are just conversion tools though, and the most time-consuming activity in maintaining Haskell packages is resolving and verifying dependencies.

At least that was my experience when I was actively involved in ArchHaskell. I only saw two options when adding or upgrading a package: either I worked out the dependencies manually, or I simply tried it out. Neither was very appealing, and both were very time-consuming. It seemed obvious that I needed some tool to help out.

Enter cblrepo!

It allows me to maintain a database of specific versions of packages, and when I want to upgrade a package, or add a new one, it’ll verify that all dependencies can be satisfied. In other words, it helps me maintain a buildable set of packages at all times.

The tool also has some functionality that helps in tracking Hackage as packages are updated there.

Something about how it works

At the moment I maintain a small repository of Arch packages, mostly just to try out cblrepo and convince myself that it works. The work environment contains a database and a directory of patches:

% ls
cblrepo.db  patches/
%

The database is a cleartext file containing the information on the packages. It’s basically just a dump of the related Haskell datatype, encoded in JSON. The patches directory holds patches for Cabal files and PKGBUILD files. They must be named patch.cabal.<hackage name> or patch.pkgbuild.<hackage name> in order to be picked up by cblrepo.
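
The naming rule for patches can be written down as a pair of small functions. This is only a sketch of the convention described above, not code taken from cblrepo; the names PatchKind, patchFileName and classifyPatch are made up for the illustration.

```haskell
import Data.List (stripPrefix)

-- | The two kinds of patches described above (hypothetical type).
data PatchKind = CabalPatch | PkgbuildPatch
  deriving (Eq, Show)

-- | File name under patches/ for a given kind and Hackage package name.
patchFileName :: PatchKind -> String -> FilePath
patchFileName CabalPatch    pkg = "patch.cabal." ++ pkg
patchFileName PkgbuildPatch pkg = "patch.pkgbuild." ++ pkg

-- | The inverse: recognise a file name, or Nothing if the name does not
-- follow the convention and would therefore not be picked up.
classifyPatch :: FilePath -> Maybe (PatchKind, String)
classifyPatch f =
  case (stripPrefix "patch.cabal." f, stripPrefix "patch.pkgbuild." f) of
    (Just pkg, _) | not (null pkg) -> Just (CabalPatch, pkg)
    (_, Just pkg) | not (null pkg) -> Just (PkgbuildPatch, pkg)
    _                              -> Nothing
```

A file such as test-framework-th.patch would be classified as Nothing here, which mirrors the "must be named" rule.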

There’s also an application directory (~/.cblrepo) for caching info about the packages available on Hackage:

% ls ~/.cblrepo
00-index.tar.gz
%

How to use it

A session with cblrepo looks something like this. First we update the information about what packages are available on Hackage:

% cblrepo idxsync
%

After that it’s possible to see what packages are out-of-date:

% cblrepo updates
cmdargs: 0.6.8 (0.6.9)
test-framework-th: 0.1.3 (0.2.0)
xml: 1.3.7 (1.3.8)
language-haskell-extract: 0.1.2 (0.2.0)
blaze-builder: 0.2.1.4 (0.3.0.0)
%

Let’s check whether cmdargs can be updated:

% cblrepo add -n cmdargs,0.6.9
%

It generates no output, so that means it’s possible to update. When attempting to add all the packages we run into a problem:

% cblrepo add -n cmdargs,0.6.9 \
> test-framework-th,0.2.0 \
> xml,1.3.8 \
> language-haskell-extract,0.2.0 \
> blaze-builder,0.3.0.0
Adding blaze-builder 0.3.0.0 would break:
  haxr : blaze-builder ==0.2.*
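
To make the “would break” report concrete, here is a minimal sketch of such a check. It assumes a deliberately tiny model of version ranges (the Range type, satisfies and wouldBreak are hypothetical names for this illustration) and is not how cblrepo itself is implemented.

```haskell
import Data.List (isPrefixOf)
import Data.Version (Version, makeVersion, versionBranch)

-- | A small subset of Cabal-style version ranges (hypothetical type).
data Range
  = Wildcard [Int]   -- ^ e.g. ==0.2.* is Wildcard [0,2]
  | AtLeast Version  -- ^ >= v
  | Below Version    -- ^ <  v
  | Both Range Range -- ^ r1 && r2
  deriving (Show)

-- | Does a concrete version fall inside a range?
satisfies :: Version -> Range -> Bool
satisfies v (Wildcard pre) = pre `isPrefixOf` versionBranch v
satisfies v (AtLeast lo)   = v >= lo
satisfies v (Below hi)     = v < hi
satisfies v (Both a b)     = satisfies v a && satisfies v b

-- | (dependent package, package it depends on, accepted range)
type Constraint = (String, String, Range)

-- | Dependents that would no longer build if 'pkg' moved to version 'new'.
wouldBreak :: [Constraint] -> String -> Version -> [String]
wouldBreak cs pkg new =
  [ user | (user, dep, range) <- cs, dep == pkg, not (satisfies new range) ]

-- | The single constraint from the session above.
example :: [Constraint]
example = [("haxr", "blaze-builder", Wildcard [0, 2])]
```

With this model, wouldBreak example "blaze-builder" (makeVersion [0,3,0,0]) names haxr, while the currently packaged 0.2.1.4 breaks nothing.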

We’ll leave blaze-builder at the current version for now:

% cblrepo add cmdargs,0.6.9 \
> test-framework-th,0.2.0 \
> xml,1.3.8 \
> language-haskell-extract,0.2.0
%

After these updates we also need to make sure that all packages that depend on these ones are rebuilt; that is, we need to bump their release version:

% cblrepo bump -n cmdargs \
> test-framework-th \
> xml \
> language-haskell-extract
Would bump:
test-framework
test-framework-hunit
test-framework-quickcheck2
%

Just re-run that without the -n to actually perform the bump. Now that all this is done we need to generate the files necessary to build the Arch packages. We can easily check what packages need re-building, and get a good order for building them:

% cblrepo build cmdargs \
> test-framework-th \
> xml \
> language-haskell-extract
cmdargs
xml
test-framework
test-framework-quickcheck2
test-framework-hunit
language-haskell-extract
test-framework-th
%
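
A “good order for building” is a topological order of the dependency graph. The following is a minimal sketch of that idea using Data.Graph from the containers package; it is not cblrepo’s code, and the dependency data in exampleDeps is invented for the illustration.

```haskell
import Data.Graph (flattenSCCs, stronglyConnComp)
import qualified Data.Map as Map

-- | Package name mapped to the names of the packages it depends on.
type Deps = Map.Map String [String]

-- | Order packages so that every package comes after its dependencies.
-- 'stronglyConnComp' returns components in reverse topological order,
-- which for "package -> dependency" edges means dependencies come first.
-- Dependencies that are not keys of the map are ignored. Cycles are not
-- reported here; their members simply end up next to each other.
buildOrder :: Deps -> [String]
buildOrder deps =
  flattenSCCs (stronglyConnComp [ (p, p, ds) | (p, ds) <- Map.toList deps ])

-- | Invented dependency data, loosely modelled on the session above.
exampleDeps :: Deps
exampleDeps = Map.fromList
  [ ("xml", [])
  , ("language-haskell-extract", [])
  , ("test-framework", ["xml"])
  , ("test-framework-hunit", ["test-framework"])
  , ("test-framework-th", ["test-framework", "language-haskell-extract"])
  ]
```

Several orders can be valid for the same input, so the only property worth relying on is that each package appears after everything it depends on.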

And generating the required files is also easy:

% cblrepo pkgbuild $(!!)
% tree
.
|-- cblrepo.db
|-- haskell-cmdargs
|   |-- haskell-cmdargs.install
|   `-- PKGBUILD
|-- haskell-language-haskell-extract
|   |-- haskell-language-haskell-extract.install
|   `-- PKGBUILD
|-- haskell-test-framework
|   |-- haskell-test-framework.install
|   `-- PKGBUILD
|-- haskell-test-framework-hunit
|   |-- haskell-test-framework-hunit.install
|   `-- PKGBUILD
|-- haskell-test-framework-quickcheck2
|   |-- haskell-test-framework-quickcheck2.install
|   `-- PKGBUILD
|-- haskell-test-framework-th
|   |-- haskell-test-framework-th.install
|   `-- PKGBUILD
|-- haskell-xml
|   |-- haskell-xml.install
|   `-- PKGBUILD
`-- patches

8 directories, 15 files
%

Now all that’s left is running makepkg in each of the directories, in the order indicated by cblrepo build above.

Unfortunately they won’t all build: generating the Haddock docs for test-framework-th fails. That is, however, fairly easy to remedy by patching the PKGBUILD to disable the generation of docs.

I’ll get back to that in a later post though.

Your comments, please

Please leave comments and suggestions. I’m planning on uploading the source to GitHub shortly.

On maintaining Haskell packages for a Linux distro

When trying to maintain a set of binary packages of Haskell libraries for a Linux distribution there are a few issues that come up:

  1. The set of packages must be compilable at all times, and
  2. Updating one package requires all packages that depend on it, in one or more steps, to be re-compiled.

The first requires keeping track of all dependencies of the packages in the set and making sure that they are satisfiable at all times. For a while I was doing this by simply attempting to compile all updated packages and checking for breakages, which was both time-consuming and painful when build failures had to be resolved.

The second requires bumping the package release number for all packages that are reachable when following the dependencies in the reverse direction. Doing this manually is tedious and very error-prone in my experience.
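
Following the dependencies in the reverse direction amounts to a reachability search on the inverted dependency graph. Here is a minimal sketch under that reading, using Data.Map and Data.Set from containers; the names reverseDeps and needsBump and the sample data are assumptions made for the illustration, not part of the tool.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- | Package name mapped to the names of the packages it depends on.
type Deps = Map.Map String [String]

-- | Invert the relation: dependency -> packages that use it directly.
reverseDeps :: Deps -> Map.Map String (Set.Set String)
reverseDeps deps = Map.fromListWith Set.union
  [ (d, Set.singleton p) | (p, ds) <- Map.toList deps, d <- ds ]

-- | All packages reachable from the updated ones via reverse dependencies,
-- in one or more steps, excluding the updated packages themselves.
needsBump :: Deps -> [String] -> Set.Set String
needsBump deps updated =
  go Set.empty updated `Set.difference` Set.fromList updated
  where
    rdeps = reverseDeps deps
    go seen [] = seen
    go seen (p : ps)
      | p `Set.member` seen = go seen ps
      | otherwise =
          let users = Set.toList (Map.findWithDefault Set.empty p rdeps)
          in go (Set.insert p seen) (users ++ ps)

-- | Invented sample: yesod uses persistent, and both use neither.
sampleDeps :: Deps
sampleDeps = Map.fromList
  [ ("neither", ["monad-peel"])
  , ("persistent", ["neither"])
  , ("yesod", ["persistent", "neither"])
  ]
```

For this sample, needsBump sampleDeps ["neither"] contains exactly persistent and yesod.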

Of course it ought to be possible to make this a lot easier with the help of a tool. The last few days I’ve been writing such a tool. This is how I’ve been using it so far.

Building the initial database

GHC in ArchLinux ships with a few Haskell libraries and ArchLinux also has a few Haskell packages in its base repositories. Since I don’t need to maintain any of those packages I decided to treat these as a sort of base. Adding those is as simple as this:

% head base-pkgs
base,4.2.0.2
array,0.3.0.1
bytestring,0.9.1.7
Cabal,1.8.0.6
containers,0.3.0.0
directory,1.0.1.1
extensible-exceptions,0.1.1.1
filepath,1.1.0.4
haskell98,1.0.1.1
hpc,0.5.0.5
% cblrepo addbasepkg $(cat base-pkgs)
Success

Then I need to add the packages of the binary repo provided by ArchHaskell. I wrote a little script that extracts the package name and version from the ArchHaskell HABS tree (get-ah-cabals):

#! /bin/bash
 
habsdir=$1
 
for d in ${habsdir}/habs/*; do
    . ${d}/PKGBUILD
    case $_hkgname in
        (datetime|haskell-platform)
            ;;
        (*)
            echo ${_hkgname},${pkgver}
            ;;
    esac
done
 
echo http://hackage.haskell.org/platform/2010.2.0.0/haskell-platform.cabal

Since haskell-platform isn’t on Hackage it requires special handling. The reason why datetime is excluded is slightly different. It’s the only package that requires old base (version <4). GHC in Arch does ship with both old and new base so datetime can be built, but cblrepo can’t deal with two versions of the same package. This is a limitation, but I’m not sure it’s worth fixing since base is the only library that comes in two versions, and datetime is the only package that hasn’t been updated to use new base.

Knowing this it’s easy to add all the ArchHaskell packages to the database:

% cblrepo idxupdate
% cblrepo add $(get-ah-cabals path/to/habs)
Success

Attempting an update

Now it’s possible to attempt an update:

% cblrepo add neither,0.2.0
Failed to satisfy the following dependencies for neither:
  monad-peel >=0.1 && <0.2
Adding neither 0.2.0 would break:
  yesod : neither >=0.1.0 && <0.2
  persistent : neither >=0.1 && <0.2

The way to read this is that, first of all, neither itself has a dependency that can’t be satisfied, and second, there are two packages, yesod and persistent, that wouldn’t be buildable if neither were updated.

Now if it were possible to update neither, what packages would require a bump?

% cblrepo bump neither
persistent
yesod

Even more Cabal fun, visualising in 3d

avsm pointed me to ubigraph. So after a bit of XML-RPC hacking (haxr really rocks, it is just so amazingly easy to use) I now have code that sends the graph data over to a ubigraph server.

Of course I had to create a video of it. The video shows the dependencies of just two packages (dataenc and datetime). I tried creating a second video of the dependencies of more packages, but xvidcap isn’t very reliable it seems. One irritating thing is that I run out of filehandles fairly soon, so I couldn’t create a 3d graph of more than about 75 packages (all packages starting with ‘a’ and ‘b’ on Hackage).

More fun with Cabal, visualising dependencies

It wasn’t why I started playing with Cabal, but after extracting dependencies from a single package the thought struck me that I could extract dependencies from many packages, e.g. all of Hackage, and draw a dependency graph of the result.

The basic idea is to use the code from my earlier post and accumulate dependency information by mapping it over several cabal files, then convert that information into nodes and edges suitable for building a graph (Data.Graph.Inductive, from the fgl package). That graph is then “graphviz’ed” using Data.Graph.Inductive.Graphviz. Note that since this is performed on Debian Sid I’m using some rather old versions of packages.[1]

First off a shortcut for reading cabal files:

readCabalFile = readPackageDescription silent

Once I have a GenericPackageDescription I want to collapse it into a regular PackageDescription (see the comments in my previous post for some details regarding this). Then I extract the package name and its dependencies and package them into a tuple. By mapping this function over a list of GenericPackageDescription values I end up with an association list where the key is the package and the value is a list of all its dependencies.

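processFile gpd = let
        -- Collapse the generic description for a fixed platform and compiler.
        finPkgDesc = finalizePackageDescription [] Nothing
            "Linux" "X86_64" ("GHC", Version [6, 8, 2] [])
        -- Partial pattern: fails at run time if finalisation returns a Left.
        (Right (pd, _)) = finPkgDesc gpd
        getPackageName (Dependency name _) = name
        -- Pair the package name with its de-duplicated dependency names.
        nameNDeps = (pkgName . package) &&& (nub . map getPackageName . buildDepends)
    in
        nameNDeps pd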

In order to create the graph later on I need a complete list of all nodes. To do this I take all the keys and all the values in the association list, collapse them into a single list, and remove duplicates. To turn this resulting list of packages into a list of LNode I then zip it with a list of integers.

getNodes = let
        assocKeys = map fst
        assocVals = concat . map snd
    in zip [1..] . nub . uncurry (++) . (assocKeys &&& assocVals)

Building the edges is straightforward, but a bit more involved. An edge is a tuple of two integers and a label; I don’t need to label the edges, so in my case it is (Int, Int, ()). The list of nodes is basically an association list (an integer for key and a string for value), but I need to flip keys and values since I know the package name and need its node number.

getEdges deps = let
        nodes = getNodes deps
        nodesAssoc = map (\ (a, b) -> (b, a)) nodes
        buildEdges (name, dep) = let
                getNode n = fromJust $ lookup n nodesAssoc
                fromNode = getNode name
            in map (\ t -> (fromNode, getNode t, ())) dep
    in concat $ map buildEdges deps
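
To see what the two functions produce, here is a self-contained restatement with type signatures, run on a tiny made-up association list. It assumes package names are plain strings and leaves out everything Cabal-related; the sample data is invented.

```haskell
import Control.Arrow ((&&&))
import Data.List (nub)
import Data.Maybe (fromJust)

-- | Package name paired with the names of its dependencies.
type DepList = [(String, [String])]

-- | Number every package that occurs as a key or as a dependency.
getNodes :: DepList -> [(Int, String)]
getNodes = zip [1 ..] . nub . uncurry (++) . (map fst &&& concatMap snd)

-- | One unlabelled edge per (package, dependency) pair.
getEdges :: DepList -> [(Int, Int, ())]
getEdges deps = concatMap buildEdges deps
  where
    -- Flip (number, name) into (name, number) for lookups by name.
    nodesAssoc = map (\(n, name) -> (name, n)) (getNodes deps)
    -- fromJust cannot fail: every name in deps is also in getNodes deps.
    getNode name = fromJust (lookup name nodesAssoc)
    buildEdges (name, ds) = [ (getNode name, getNode d, ()) | d <- ds ]

-- | Made-up input, not real Hackage data.
sample :: DepList
sample = [("a", ["b", "c"]), ("b", ["c"])]

-- getNodes sample == [(1,"a"),(2,"b"),(3,"c")]
-- getEdges sample == [(1,2,()),(1,3,()),(2,3,())]
```

Note that “c” gets a node even though it never appears as a key, which is why the values of the association list have to be included when collecting nodes.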

Now that that’s done I can put it all together in a main function. The trickiest bit of that was to find the size of A4 in inches :-) (Strictly, the 11.7 × 16.5 used below is A3; A4 is 8.3 × 11.7.)

main = do
    files <- getArgs
    gpds <- mapM readCabalFile files
    let deps = map processFile gpds
    let (nodes, edges) = (getNodes &&& getEdges) deps
    let depGraph = mkGraph nodes edges :: Gr String ()
    putStrLn $ graphviz depGraph "Dependency_Graph" (11.7, 16.5) (1,1) Landscape

I downloaded all cabal files from Hackage and ran this code over them. I bumped into a few that use Cabal features not supported in the ancient version I’m using. I was a bit disappointed that Cabal wouldn’t let me handle that kind of error myself (as I already expressed in an earlier post), so I was forced to delete them manually.

Here’s the output, completely unreadable, I know, but still sort of cool.

  1. Yes, I’d be really happy if the transition to GHC 6.10 would finish soon.

A no-no in my book (found in Cabal)

A recent “find” in Cabal made me think of this:

In my opinion the final decision of what to do in case of an error belongs in the application, not in libraries.

I’m sure there are exceptions to this, but I believe it’s good as a guiding principle when defining APIs.

Here’s the code in Cabal that breaks this rule. Basically, if you write an application and call readPackageDescription on a cabal file with errors then your application is terminated (the call to dieWithLocation ends up calling exitWith). I would much rather it returned an error (e.g. based on Either) so I can decide what to do with bad cabal files myself.
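
As an illustration of the principle, here is a minimal sketch of a library-style function that reports failure as a value and an application-side caller that decides what to do with it. It is not Cabal’s API; parsePkgSpec, renderSpec and loadSpecs are hypothetical names, and the input format is the name,version form used on the cblrepo command line.

```haskell
import Data.Either (partitionEithers)
import Data.Version (Version, parseVersion, showVersion)
import Text.ParserCombinators.ReadP (readP_to_S)

-- | Library side: report problems as values instead of exiting.
parsePkgSpec :: String -> Either String (String, Version)
parsePkgSpec s =
  case break (== ',') s of
    (name, ',' : ver)
      | not (null name) ->
          -- Keep only parses that consumed the whole version string.
          case [ v | (v, "") <- readP_to_S parseVersion ver ] of
            (v : _) -> Right (name, v)
            []      -> Left ("unparsable version in " ++ show s)
    _ -> Left ("expected <name>,<version>, got " ++ show s)

-- | Render a parsed spec back into the name,version form.
renderSpec :: (String, Version) -> String
renderSpec (name, v) = name ++ "," ++ showVersion v

-- | Application side: this caller chooses to collect the errors and carry
-- on with the good entries; another caller could just as well abort.
loadSpecs :: [String] -> ([String], [(String, Version)])
loadSpecs = partitionEithers . map parsePkgSpec
```

The point is only that the choice between skipping, reporting and exiting is made in loadSpecs, on the application side, rather than inside the parser.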

I’ve raised a ticket for it of course ;-)

Simple Cabal parsing

This is going to be a silly post, simply because it is so amazingly simple to do some basic parsing of Cabal files. That turns out to be true even for old versions of Cabal; the code below is for 1.2.3.0, since that’s the version still found in Debian Unstable.[1]

First import the required modules:

import Distribution.PackageDescription
import Data.Version
import Distribution.Verbosity
import Distribution.Version

Then a simplistic shortcut (I don’t care about setting flags or pretty much anything else)

finPkgDesc = finalizePackageDescription [] Nothing "Linux" "X86_64" ("GHC", Version [6, 8, 2] [])

and I’m only interested in listing the package names of dependencies

getPackageName (Dependency name _) = name

After this it’s possible to do the following in ghci (ex01.cabal is a copy of the Cabal file for dataenc):

> gpd <- readPackageDescription silent "ex01.cabal"
...
> let pd = finPkgDesc gpd
> let deps = either (\_ -> []) (\ (p, _) -> buildDepends p) pd
> map getPackageName deps
["array","base","containers"]

  1. That’ll change when GHC 6.10 makes it onto my system. It’s already in Debian, just not on my system since some Xmonad-related packages haven’t been rebuilt yet.

Experience with cabal-debian

So, after receiving several pointers to seereason’s cabal-debian tool I thought I’d take it for a spin.[1]

After about 30 minutes of browsing through HackageDB and seereason’s source repos, building and installing, I had finally satisfied all dependencies and the build of cabal-debian succeeded. (Oh, BTW, seereason people, it’s really confusing that you have a package called debian among your source repos, when there already is a different debian package on HackageDB. Please consider renaming it!) I decided to take the tool for a spin on my own package, dataenc.

The result was fairly good. It seems the generated files depend on some packages that either aren’t in Debian or that the people at seereason have modified. With the following changes to the generated files I was happy with the contents of ./debian and I successfully built an (almost) autogenerated Debian package:

  1. Download hlibrary.mk into ./debian.
  2. Modify ./debian/rules so that hlibrary.mk is loaded from $(CURDIR)/debian/hlibrary.mk.
  3. Delete ./debian/libghc6-dataenc-doc.post*.
  4. Remove cabal-debian and haskell-devscripts-cdbs from the build dependencies in ./debian/control.

I’ll try to find the time to put those changes into cabal-debian itself. Then I’d have a rather nice tool for building Debian packages automatically.

  1. I did pull down autobuilder as well, but didn’t feel I had enough time to explore it tonight.

C and Haskell sitting in a tree…

A few days ago I thought I’d take a look at calling C functions from Haskell. I wrote up the following set of files:

foo.h:

int foo(int i);

foo.c:

int
foo(int i)
{
    return i * i;
}

Foo.hs:

module Main where

import Foreign.C.Types

main = do
    r <- foo 2
    putStrLn $ show r

foreign import ccall safe "foo.h foo" foo :: CInt -> IO CInt

Compiling the C file was of course no problem:

% gcc -c foo.c

The Haskell file offered some resistance:

% ghc -c Foo.hs
Foo.hs:9:8: parse error on input `import'

It took me a round on haskell-cafe before I found out that ghc needs to be told to use the foreign function interface, -ffi or -fffi:

% ghc -c -fffi Foo.hs

Linking is a snap after that:

% ghc -o foo foo.o Foo.o
% ./foo
4

It’s also possible to build and link it all in one go:

% ghc --make -fffi -o foo foo.c Foo.hs

Now, that’s pretty nice, but it’d be even nicer to use Cabal to do the building. At the same time I decided to put c2hs to use. It seemed to be a lot easier than having to create the import statements manually. I ended up with the following:

csrc/foo.h:

#ifndef _FOO_H_
#define _FOO_H_

int foo(int);

#endif

csrc/foo.c:

#include "foo.h"

int
foo(int i)
{
    return i * i;
}

I couldn’t get cabal to accept Foo.chs as the file containing the Main module in my project. So I ended up putting all the relevant code in Foo and then have a dummy Main.

src/Foo.chs:

module Foo where

#include "foo.h"

import Foreign.C.Types

main = do
    r <- {# call foo #} 2
    putStrLn $ show r

Here’s the dummy Main.

src/Main.hs:

module Main where

import qualified Foo

main = Foo.main

The cabal file is rather straightforward. It took me a round on haskell-cafe to find out how to let the compiler know that I need the foreign function interface without putting compiler directives in the source file.

cnh.cabal:

name: cnh
version: 0.1
build-depends: base

executable: cnh
main-is: Main.hs
hs-source-dirs: src
include-dirs: csrc
c-sources: csrc/foo.c
extensions: ForeignFunctionInterface
other-modules: Foo
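
For comparison, the in-source alternative to the extensions field is a LANGUAGE pragma at the top of the module. The sketch below is an assumption-laden stand-in rather than part of the cnh project: to stay self-contained it binds abs from the C standard library instead of the foo function above. On compilers that default to Haskell 2010 the pragma should be redundant, since the FFI is part of that standard.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CInt)

-- | Bind abs(3) from the C standard library. It has no side effects,
-- so it can be imported as a pure function rather than in IO.
foreign import ccall unsafe "stdlib.h abs"
  c_abs :: CInt -> CInt
```

In GHCi, c_abs (-4) evaluates to 4. A function with side effects, or one that may call back into Haskell, would be imported with an IO result type and the safe calling convention, as in Foo.hs above.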

Nothing special is needed in the Setup.hs:

#! /usr/bin/env runhaskell

import Distribution.Simple
main = defaultMain

Make it executable and you can build in two easy steps:

% ./Setup.hs configure && ./Setup.hs build