I received an interesting comment from Conal Elliott on my previous post on parsing. I have to admit I wasn’t sure I understood him at first, and I’m still not sure I do, but I think I have an idea of what he means.
Basically my code is very sequential in that I use the do construct everywhere in the parsing code. Personally I thought that makes the parser very easy to read since the code very much mimics the structure of the maps file. I do realise the code isn’t very “functional” though, so I thought I’d take Conal’s comments to heart and see what the result would be.
Let’s start with the observation that every entity in a line is separated by a space. However, some things are separated by other characters. So the first thing I did was write a higher-order function that first reads something, then reads a character, and returns the first thing that was read:
thenChar c f = f >>= (\ r -> char c >> return r)
Since space is used as a separator so often I added a short-cut for that:
thenSpace = thenChar ' '
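To see what these two combinators do in isolation, here is a minimal sketch. It assumes the current `Text.Parsec` module names and adds type signatures that are not in my original code; `hexThenDash` is a made-up example parser, not part of the maps parser.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Run parser f, then consume the character c, and return f's result.
thenChar :: Char -> Parser a -> Parser a
thenChar c f = f >>= (\r -> char c >> return r)

-- Shortcut for the common case where the separator is a space.
thenSpace :: Parser a -> Parser a
thenSpace = thenChar ' '

-- Hypothetical example: one or more hex digits followed by a dash.
-- The dash is consumed but does not show up in the result.
hexThenDash :: Parser String
hexThenDash = thenChar '-' (many1 hexDigit)
```

In GHCi, `parse hexThenDash "" "08048000-0804c000"` gives `Right "08048000"`: the digits are returned, the dash is consumed, and the rest of the input is left for the next parser.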
Then I put that to use in parseAddress:
parseAddress = let
        hexStr2Int = Prelude.read . ("0x" ++)
    in do
        start <- thenChar '-' $ many1 hexDigit
        end <- many1 hexDigit
        return $ Address (hexStr2Int start) (hexStr2Int end)
Modifying the other parsing functions to use thenSpace is straightforward.
I’m not entirely sure I understand what Conal meant with the part about liftM in his comment. I suspect he’s referring to the fact that I first read characters and then convert them in the “constructors”. By using liftM I can move the conversion “up in the code”. Here’s parseAddress after I’ve moved the calls to hexStr2Int:
parseAddress = let
        hexStr2Int = Prelude.read . ("0x" ++)
    in do
        start <- liftM hexStr2Int $ thenChar '-' $ many1 hexDigit
        end <- liftM hexStr2Int $ many1 hexDigit
        return $ Address start end
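For a parser, liftM does the same job as fmap (also spelled `<$>`), so the conversion can be pushed one step further into applicative style. The following is only a sketch of where that might lead, not necessarily what Conal had in mind. It assumes a Parsec version whose parsers have an Applicative instance, guesses at the shape of the Address type (not shown in this post), and introduces the hypothetical names `hexNumber` and `parseAddress'`.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Assumed shape of Address; its real definition is not shown in this post.
data Address = Address Integer Integer
  deriving (Eq, Show)

-- Same conversion as in the post, with an explicit (assumed) result type.
hexStr2Int :: String -> Integer
hexStr2Int = read . ("0x" ++)

-- Hypothetical helper: one or more hex digits, converted with (<$>),
-- which plays the role liftM plays in the version above.
hexNumber :: Parser Integer
hexNumber = hexStr2Int <$> many1 hexDigit

-- Applicative-style variant: (<*) keeps the left result and drops the dash,
-- (<*>) feeds both numbers to the Address constructor.
parseAddress' :: Parser Address
parseAddress' = Address <$> (hexNumber <* char '-') <*> hexNumber
```

There is no do block and no intermediate names here; whether that reads better than the sequential version is a matter of taste.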
After modifying the other parsing functions in a similar way I ended up with this:
parsePerms = let
        cA a = case a of
            'p' -> Private
            's' -> Shared
    in do
        r <- liftM (== 'r') anyChar
        w <- liftM (== 'w') anyChar
        x <- liftM (== 'x') anyChar
        a <- liftM cA anyChar
        return $ Perms r w x a

parseDevice = let
        hexStr2Int = Prelude.read . ("0x" ++)
    in do
        maj <- liftM hexStr2Int $ thenChar ':' $ many1 hexDigit
        min <- liftM hexStr2Int $ many1 hexDigit
        return $ Device maj min

parseRegion = let
        hexStr2Int = Prelude.read . ("0x" ++)
        parsePath = (many1 $ char ' ') >> (many1 $ anyChar)
    in do
        addr <- thenSpace parseAddress
        perm <- thenSpace parsePerms
        offset <- liftM hexStr2Int $ thenSpace $ many1 hexDigit
        dev <- thenSpace parseDevice
        inode <- liftM Prelude.read $ thenSpace $ many1 digit
        path <- parsePath <|> string ""
        return $ MemRegion addr perm offset dev inode path
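To try the parsers above on a line from a maps file, something like the following self-contained module should do. It is a sketch under a few assumptions: the data types are guesses at definitions that this post does not show, the type signatures are added, hexStr2Int is lifted to the top level, and the hypothetical `parseAccess` replaces the case expression in cA so that an unexpected character gives a parse error rather than a pattern match failure.

```haskell
import Control.Monad (liftM)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Assumed data types; the post uses these constructors without showing them.
data Address = Address Integer Integer deriving (Eq, Show)
data Access = Private | Shared deriving (Eq, Show)
data Perms = Perms Bool Bool Bool Access deriving (Eq, Show)
data Device = Device Integer Integer deriving (Eq, Show)
data MemRegion = MemRegion Address Perms Integer Device Integer String
  deriving (Eq, Show)

-- Run parser f, then consume the character c, and return f's result.
thenChar :: Char -> Parser a -> Parser a
thenChar c f = f >>= (\r -> char c >> return r)

thenSpace :: Parser a -> Parser a
thenSpace = thenChar ' '

hexStr2Int :: String -> Integer
hexStr2Int = read . ("0x" ++)

parseAddress :: Parser Address
parseAddress = do
  start <- liftM hexStr2Int $ thenChar '-' $ many1 hexDigit
  end <- liftM hexStr2Int $ many1 hexDigit
  return $ Address start end

-- Variation from the post: only 'p' or 's' is accepted here.
parseAccess :: Parser Access
parseAccess = (char 'p' >> return Private) <|> (char 's' >> return Shared)

parsePerms :: Parser Perms
parsePerms = do
  r <- liftM (== 'r') anyChar
  w <- liftM (== 'w') anyChar
  x <- liftM (== 'x') anyChar
  a <- parseAccess
  return $ Perms r w x a

parseDevice :: Parser Device
parseDevice = do
  major <- liftM hexStr2Int $ thenChar ':' $ many1 hexDigit
  minor <- liftM hexStr2Int $ many1 hexDigit
  return $ Device major minor

-- The path is whatever follows a run of spaces, up to the end of the input.
parsePath :: Parser String
parsePath = many1 (char ' ') >> many1 anyChar

parseRegion :: Parser MemRegion
parseRegion = do
  addr <- thenSpace parseAddress
  perm <- thenSpace parsePerms
  offset <- liftM hexStr2Int $ thenSpace $ many1 hexDigit
  dev <- thenSpace parseDevice
  inode <- liftM read $ thenSpace $ many1 digit
  path <- parsePath <|> string ""
  return $ MemRegion addr perm offset dev inode path
```

In GHCi, `parse parseRegion "" "08048000-0804c000 r-xp 00000000 08:01 1234567    /bin/cat"` is one way to exercise it. One thing to keep in mind: thenSpace has already consumed the first space after the inode, so parsePath only matches when at least one more space precedes the path.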
Is this code more “functional”? Is it easier to read? You’ll have to be the judge of that…
Conal, if I got the intention of your comment completely wrong then feel free to tell me I’m an idiot.