Dealing with life in Haskell

I bet you have at least one silly little thing at work that, whenever it happens, you let out a sigh, maybe roll your eyes and whish that everyone would use a proper operating system. A few days I finally decided to do something about one of my things like that. At work, Windows users will at times for some strange reason, manually create directories inside their work area, even though the directories actually are under version control. Invariably they get the case wrong and due to an onfortunate combination of case insensitive filesystem on the client (Windows) and a version control system the cares about case (Perforce). This results in files ending up all over the place even though they belong in the same directory. The Windows users are none the wiser, they simply don’t see the problem. Since I use a sane system (Linux) I do notice, and when I see it I sigh and roll my eyes.

Here’s my take on solving the problem:

module Main where

import Control.Monad
import Data.Char
import System.Directory
import System.Environment
import System.FilePath
import System.IO.HVFS.Utils
import System.Posix.Files

data IFile = IFile
    { iFileIPath :: FilePath
    , iFileIName :: FilePath
    , iFileFull :: FilePath
    } deriving (Show)

toIFile :: FilePath -> IFile
toIFile fp =  IFile path file fp
    where
        path = map toLower $ takeDirectory fp
        file = map toLower $ takeFileName fp

listFilesR :: FilePath -> IO [IFile]
listFilesR path =
    recurseDir SystemFS path >>=
    filterM doesFileExist >>=
    mapM (return . toIFile)

linkFile :: FilePath -> IFile -> IO ()
linkFile dest ifile = do
        createDirectoryIfMissing True newDir
        createLink (iFileFull ifile) newFile
    where
        newDir = normalise $ dest </> (iFileIPath ifile)
        newFile = newDir </> (iFileIName ifile)

main :: IO ()
main = do
    args <- getArgs
    listFilesR (args !! 0) >>= mapM_ (linkFile $ args !! 1)

Yes, this is the code I wrote in Literate Haskell, but I think I’d better not disclose my rant against clueless Windows users publically ;-)

Rubber and lhs?

Another dear lazyweb post.

Yesterday night I hacked up a silly little tool for use at work. Nothing spectaculari but I thought I’d try this Literate Haskell thing. I decided to go the LaTeX route and created a .lhs file. To my amazement rubber has a module for handling Literate Haskell, unfortunately it passes everything ending in .lhs through lhs2Tex. I didn’t really want to use lhs2Texii but I haven’t been able to find any way of changing rubber’s behaviour through its command line arguments. What I’d like is a way to disable a module for a single invocation. So, dear lazyweb, is there a way to do this from the command line?

So far I’ve worked around this by creating a symbolic link named MyTool.tex that points to the Haskell source, then I run rubber on the link. Not to much of a hazzle, but it’d be nicer to pass an argument to rubber.

  1. I’ll post the source here shortly.[back]
  2. I’ll probably have a closer look at it later but for now I wanted to keep things as easy as possible.[back]

Debian, lighttpd and MediaWiki

I just went through the exercise of setting up MediaWiki on a Debian Etch box, but instead of serving it using the common choice of Apache I wanted to use lighttpd. It seems the combination MediaWiki and Apache is well supported on Debian. There was even an install-time question whether I wanted the configuration files set up for me. Well, it’s boring to follow the common path, right?

It seems the combination MediaWiki and Apache is well supported on Debian. Luckily that doesn’t mean it’s difficult to serve it with lighttpd. Here’s how I configured things, and luckily it all seems to work just fine ;-)

Installation

First off, I installed mediawiki1.7, lighttpd, php5-cgi, and mysql-server. I made sure that aptitude only pulled in required packages and then I started un-choosing all Apache-related packages. By the time you get to installing MySQL you’ll be asked for a root password, that is if your debconf is set to ask medium-level questions.

Setting up MediaWiki

The Debian package for MediaWiki creates a site in /var/lib/mediawiki1.7 consisting mostly of symbolic links to the actual location of its files. I kept lighttpd’s default document root of /var/www and linked the two by creating a symbolic link /var/www/mw pointing to the MediaWiki site (/var/lib/mediawiki1.7).

Configuration of lighttpd

First enable FastCGI by linking /etc/lighttpd/conf-available/10-fastgi.conf into /etc/lighttpd/conf-enabled/. Then go in and change the bin-path to point to /usr/bin/php5-cgi. To prepare for the MediaWiki configuration, enable mod_rewrite in /etc/lighttpd/lighttpd.conf and finally create the file /etc/lighttpd/conf-enabled/20-mediawiki.conf with the following contents:

url.rewrite-once = (
    "^/$" => "/mw/index.php",
    "^/wiki/([^?]*)(?:\?(.*))?” => “/mw/index.php?title=$1&$2″
  )

Now it’s time to restart lighttpd.

Configuration of MediaWiki

Point a browser to http://your.server/mw/config/index.php to configure the site settings. When that’s done copy the file /var/lib/mediawiki1.7/config/LocalSettings.php one layer up.

That should be it! Enjoy!

Debian: pbuilder tip

I’m writing this mostly so that I’ll remember myself :-)

If pbuilder --create fails when building a base image for Sid then it’s worth trying to build a base image for the stable release and then upgrade to Sid afterwards. The very manual way is pbuild --login --save-after-login.

Packages for omnicodec and dataenc

As promised I’ve now created Debian packages for omnicodec. Since dataenc is a pre-requisite for building from source there are packages for it as well.

I’ve uploaded the source to Debian Mentors here and here. As usual pre-built binaries are available from my own APT repo.

ctypes and flexible (zero-length) arrays

Dear lazyweb, I’ve been racking my brain trying to get information out of the following C struct in a nice way when using ctypes:

struct Foo
{
    unsigned int length;
    char name[];
};

I wrote a little C function to get instance of the struct easily into python:

struct Foo *
put_name(char *name)
{
    struct Foo *foo = malloc(sizeof(struct Foo) + strlen(name));
    foo->length = strlen(name);
    memcpy(foo->name, name, foo->length);
    return(foo);
}

Now, how do I actually get the full contents of Foo.name in python? The only way I could think of that actually works is to create a dummy-struct type in order to get the length out and then use it to dynamically create a new sub-class of ctypes.Structure, then create an instance based on the address of what was returned. I think the following shows it pretty clearly:

class FOO(Structure):
    _fields_ = [('length', c_uint), ('name', c_char * 0)]

liblib = cdll.LoadLibrary(’./_build/liblib.so’)
c_put_name = liblib.put_name
c_put_name.argtypes = [c_char_p]
c_put_name.restype = POINTER(FOO)

def put_name(str):
    f = c_put_name(str)
    K = type(’FOO%s’ % f.contents.length, (Structure,),
                        {’_fields_’ : [('length', c_uint), ('name', c_char * f.contents.length)]})
    return K.from_address(addressof(f.contents))

I still think there ought to be some other way that ‘feels’ nicer. I mean the use of “short arrays” (declared as here with [], or zero size as supported by GCC, or the more portable array of size one) seems to be common enough to warrant some support from ctypes, right?

Any suggestions for improvements?

Kid’s play with HTML in Haskell

In my ever-continuing attempts to replace Python by Haskell as my language of first choice I’ve finally managed to dip a toe in the XML/HTML sea. I decided to use the Haskell XML Toolkit (HXT) even though it’s not packaged for Debian (something I might look into doing one day). HXT depends on tagsoup which also isn’t packaged for Debian. Both packages install painlessly thanks to Cabal.

As the title suggests my itch wouldn’t require anything complicated, but when I’ve previously have looked at any Haskell XML library I’ve always shied away. It all just looks so complicated. It turns out it looks worse than it is, and of course the documentation is poor when it comes to simple, concrete examples with adequate explanations. HXT would surely benefit from documentation at a level similar to what’s available for Parsec. I whish I were equipped to write it.

Anyway, this was my problem. I’ve found an interesting audio webcast. The team behind it has published around 90 episodes already and I’d like to listen to all of them. Unfortunately their RSS feed doesn’t include all episodes so I can’t simply use the trusted hpodder to get all episodes. After manually downloading about 20 of them I thought I’d better write some code to make it less labour-intensive. Here’s the complete script:

module Main where

import System.Environment
import Text.XML.HXT.Arrow

isMp3Link = (==) "3pm." . take 4 . reverse

myReadDoc = readDocument [(a_parse_html, "1"), (a_encoding, isoLatin1),
    (a_issue_warnings, "0"), (a_issue_errors, "0")]

myProc src = (runX $ myReadDoc src >>> deep selectMp3Links >>> getAttrValue “href”)
    >>= mapM_ putStrLn

selectMp3Links = hasName “a” >>> hasAttrValue “href” isMp3Link

main = do
    [src] <- getArgs
    myProc src

The thing that took by far the most time was finding out that hasAttrValue exists. I’m currently downloading episodes using the following command line:

curl -L $(for h in $(runhaskell get_mp3links.hs 'http://url.of.page/'); do \
    echo '-O' $h; done)

Yet another set of itches where Haskell has displaced Python as the utensil used for scratching. :-)

Saddle, two SDDL related tools

I’ve just uploaded two small tools that makes it a little easier to deal with SDDL (Security Descriptor Description Language, this is a good resource for SDDL):

  1. saddle-ex - “extract” the security descriptor for a number of different kinds of objects
  2. saddle-pp - a pretty printer for an SDDL string

Full build instructions are included. Download the archive here.

The tools are written (mostly) in Haskell and I’ll probably upload it all to hackage at some point in the future. I’m holding out for a fix to ticket 2097 since a fix for it will allow me to add a third tool which I feel belong in the package.

Suggestions for improvements are always welcome, patches even more so :-)

Omnicodec released

Four days ago I uploaded omnicodec to Hackage. It’s a collection of two simple command-line utilities for encoding and decoding data built on the dataenc package. I’m planning on building a Debian package for it as soon as I find the time.

Haskell on Windows

Recently I’ve had reason to write a few tools for Windows. Nothing complicated, and since there was nothing “mission critical” (there hardly ever is for me :-) ) I decided to try to write them in Haskell rather than Python or C/C++. This post contains some sort of recollection of my experience so far.

First off I should probably disclose that I don’t particularly like Windows. Besides feeling more at home on a Unix (Linux) system I find that Windows also upsets me on a very basic level. If I spend a whole day at work in Windows I tend to go home more stressed and irritated than after a day in Linux. Part of this is probably due to my familiarity with Linux. My hope was that the combination of the immense joy of programming Haskell and my loath of the Windows platform I would end up with something that at least was bearable. It turns out I did.

My setup

Besides installing the pre-built GHC I also installed Visual Studio Express more later to explain the need for it. I don’t like the Visual Studio environment so I installed CMake and jEditi. Last but not least I installed Console, it addresses the fact that Windows’ “DOS-box” is an amazing piece of crap.

FFI on Windows

First, GHC comes with a number of Windows-specific modules and they cover an impressive number of the Win32 API functions. The Win32 API is simply huge and one will sooner or later come find the need to add to what those modules offer. Second, GHC is able to compile C code by delegation to MinGW, however MinGW doesn’t completely cover the Win32 API either and if one is unlucky MinGW isn’t enough for the task at hand. Of course that’s exactly what happened to me. So, I installed Visual Studio Express, but since I don’t like VS that much and I didn’t want VS solutions in my darcs repository I decided to use CMake to build a library that I then instruct GHC, through a Cabal file, to link with my Haskell code. The end result is that building requires a few extra steps before the familiar configure/build sequence. It works remarkably well really, and I’ll definately use that strategy again if I need to.

There is a benign warning thrown when linking the final executable though:

Warning: .drectve `/manifestdependency:"type='win32' name='Microsoft.VC90.CRT'
version='9.0.21022.8' processorArchitecture='x86' publicKeyToken='1fc8b3b9a1e18e3b'"
/DEFAULTLIB:"uuid.lib" /DEFAULTLIB:"uuid.lib" /DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" '
unrecognized

It seems that it might be possible to skip the extra steps in the future if Cabal ticket #229 is addressed.

Windows System modules

Admittedly I haven’t been using much of the functionality in these modules, but it happens that the only function I needed had a bug that results in a segmentation fault, sometimes. See GHC ticket #2097 for some details. I guess this confirms my impression of most Haskell hackers being Linux/BSD/Unix users.

Conclusions

I won’t drop Haskell on Linux anytime soon, but I’ve found Haskell on Windows to be possible. As I suspected when I started out, Haskell does make Windows somewhat easier to bear.

  1. ordinarily I would install Vim but I thought I’d try out another editor. Vim on Windows is somewhat “leaky”, i.e. it sometimes shows behaviour that betrays its Unix roots.[back]