More Visitor (in Python)

Right after writing the previous post on the Visitor pattern in Python I picked up another paper on the same topic, Visitor Combination and Traversal Control. Of course this one also used Java for its examples, so once again I decided to use Python to explore the ideas presented.

The first part of the paper is all about writing small visitors that are then combined into more complicated ones. This part is nice but not that exciting. The interesting bit is when it gets to controlling traversal, which means it’s possible to remove the traversal code that usually appears in the accept method each visited type has to implement. Let’s see how that can look in Python.

The full code in this post is available at https://gist.github.com/magthe/beddad5c627946f28748.

First we need a structure to traverse; a simple tree will do.

class Tree(Visitable):
    def __init__(self, left, right):
        self.left = left
        self.right = right
 
class Leaf(Visitable):
    def __init__(self, value):
        self.value = value
 
def build_tree():
    l0 = Leaf(0)
    l1 = Leaf(1)
    t0 = Tree(l0, l1)
    l2 = Leaf(2)
    t1 = Tree(t0, l2)
    l3 = Leaf(3)
    l4 = Leaf(4)
    t2 = Tree(l3, l4)
    return Tree(t1, t2)

But before this we really should define Visitor, the base class for visitors, and Visitable, the base class of everything that can be visited.

class Visitor:
    def visit(self, obj):
        getattr(self, 'visit_' + obj.__class__.__name__)(obj)
 
    def visit_Tree(self, t):
        pass
 
    def visit_Leaf(self, l):
        pass
 
class Visitable:
    def accept(self, visitor):
        visitor.visit(self)

We’ll throw in a visitor for printing the whole tree too:

class Print(Visitor):
    def visit_Tree(self, t):
        print('Tree (%s)' % hex(id(t)))
 
    def visit_Leaf(self, l):
        print('Leaf (%s): %i' % (hex(id(l)), l.value))

Due to the lack of traversal in the accept methods it’s easy to be underwhelmed by the Print visitor:

In [32]: build_tree().accept(Print())
Tree (0x7f1680681a90)

To address this we first need a visitor combinator that runs two visitors in sequence. Unsurprisingly we’ll call it Sequence. Its constructor takes two visitors, and for each node in the tree it passes each one in turn to the node’s accept method.

class Sequence(Visitor):
    def __init__(self, first, then):
        self.first = first
        self.then = then
 
    def visit_Tree(self, t):
        t.accept(self.first)
        t.accept(self.then)
 
    def visit_Leaf(self, l):
        l.accept(self.first)
        l.accept(self.then)

The next building block is a visitor that descends one level down from a Tree node.

class All(Visitor):
    def __init__(self, v):
        self.v = v
 
    def visit_Tree(self, t):
        t.left.accept(self.v)
        t.right.accept(self.v)

At this point it’s worth noting that the name All probably isn’t very well chosen, since we don’t really get all nodes:

In [33]: build_tree().accept(All(Print()))
Tree (0x7f1680681278)
Tree (0x7f1680681be0)

We only descend one level, but we still keep the name since that’s the name they use in the paper.

With this in place it becomes possible, and indeed rather simple, to create combinations that do traverse the full tree. Traversing top-down is a matter of using a sequence that ends with All, and bottom-up is a matter of using a sequence that starts with All.

class TopDown(Sequence):
    def __init__(self, v):
        Sequence.__init__(self, v, All(self))
 
 
class BottomUp(Sequence):
    def __init__(self, v):
        Sequence.__init__(self, All(self), v)

First top-down:

In [34]: build_tree().accept(TopDown(Print()))
Tree (0x7f1680681ef0)
Tree (0x7f16806814a8)
Tree (0x7f16806813c8)
Leaf (0x7f1680681278): 0
Leaf (0x7f1680681550): 1
Leaf (0x7f1680681a90): 2
Tree (0x7f1680681f28)
Leaf (0x7f1680681ba8): 3
Leaf (0x7f1680681a20): 4

Then bottom-up:

In [35]: build_tree().accept(BottomUp(Print()))
Leaf (0x7f1680681ba8): 0
Leaf (0x7f16806814a8): 1
Tree (0x7f1680681a90)
Leaf (0x7f16806813c8): 2
Tree (0x7f1680681550)
Leaf (0x7f1680681278): 3
Leaf (0x7f16806a1048): 4
Tree (0x7f16806a1198)
Tree (0x7f16806a1390)
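
Just to show that the combinators aren’t tied to Print, here’s a small extra example of my own (not from the paper): a visitor that collects the leaf values, run through TopDown:

class CollectLeaves(Visitor):
    def __init__(self):
        self.values = []

    def visit_Leaf(self, l):
        self.values.append(l.value)

collector = CollectLeaves()
build_tree().accept(TopDown(collector))
print(collector.values)  # expected: [0, 1, 2, 3, 4]
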

That’s all rather cute I think.


Dealing with Microsoft Products, or Battling Loss of Gumption

I was slightly frustrated and irritated with a situation at work today, which caused me to think about the word “gumption” as it’s used in Pirsig’s Zen and the Art of Motorcycle Maintenance. That led me to Wikipedia’s article on gumption trap which in turn led me to learn about the concept of learned helplessness.

So, what was the situation and how is it connected to learned helplessness?

The rest is just slightly tongue-in-cheek ;)

What to standardise on

I’m in a situation where the powers-that-be have standardised on applications. Not on open formats or open protocols, but on specific applications that use proprietary formats and proprietary protocols. Of course these applications suck. That’s what a lack of competition does: it removes any will for a company to actually improve its applications! Some of these applications have captured such a large market share that reverse engineering of the formats was inevitable. Yay! That means I can use a sane OS and vastly better applications. However, one protocol has not been reverse engineered yet and I’m forced to use the standard application. This application is painful to use and only runs on a crap OS.

How bad can it be? you ask. The application is Outlook, the OS is Windows! Yes! It’s that bad. Hence the thoughts of gumption, or rather the loss of it. Which is exactly what starting Outlook causes. Every time!

Connection to learned helplessness

It continues to amaze me that companies standardise on Windows and applications that only run on Windows. There are better alternatives, especially in this day and age with fast networks and powerful and fast execution environments that completely sidestep the whole question of which OS to run. Still there seems to be very little will to upgrade to Linux, or to standardise on web-based applications. Why is that? In the past I’ve thought it might be the network effect. Most often I’ve come to the conclusion that it most likely is simple inertia. What’s the explanation for the inertia though?

This is where learned helplessness can offer an explanation. People have been conditioned and have grown so used to Windows and other Microsoft products that they simply don’t recognise that there now is a way out. No matter how many escape routes become available, people simply won’t see them.

What to do about it

As the experiments on dogs showed, there is hope (from the Wikipedia page):

To change their expectation and to recover the dogs from helplessness, experimenters had to physically pick up the dogs and move the legs in a close replication of the physical actions the dogs needed to take to remove themselves from the electrified grid. This had to be replicated at least 2 times before the dogs would exhibit the functional response of jumping over the barrier to get away from the electrified grid. Threats, rewards, and observed demonstrations had no observed effect in helping the dogs to independently move away from the shocks.

Oh how I wish I could pull off the direct translation to my workplace: re-install my co-workers’ computers and replace servers and services. Too bad that’s not a realistic plan. What I can do though is civil disobedience (or maybe it should be called something like civil disobedience in the workplace instead). By simply not conforming, and at the same time showing that there are better ways of getting the job done, others will hopefully notice and either adopt my way, or come up with something that suits them better (which I then can learn from). Even if that doesn’t happen at least I’ll keep my gumption at healthy levels :)

What I’m doing at the moment

This is what I’m doing at work right now to avoid loss of gumption:

  • Use Linux as my main OS.
  • Run Windows in a VM.
  • Use pandoc to generate MSWord docs.
  • Use LibreOffice.

Finally, for Outlook. The decision of the powers-that-be to disable IMAP forces me to:

  • Limit my mail reading to twice per day.
  • Be logged into Skype to make up for not reading mail more often.

Visitor and Walkabout (in Python)

A couple of weeks ago I found a link to a stackoverflow question I’d saved away a long time ago. I’d saved it because I’d asked myself the exact same question and was curious about the papers mentioned in the answer. Over the last few weeks I’ve made my way through those papers and, as so often happens, I found a couple of references to other papers that sound interesting. One such paper was The Essence of the Visitor Pattern by Jens Palsberg and C. Barry Jay. There was however one little thing that bugged me with the Walkabout pattern, and I thought I’d try to work that out for myself. Of course I’m using Python rather than Java ;)

The full code can be found at https://gist.github.com/magthe/ad6e23fb560a8a494fd2.

Visitor

The Visitor pattern separates the traversal from the operation. It does this by using an accept method in the classes making up the data structure; this method takes a Visitor instance which implements the operation to be carried out. This enables adding new operations without modifying the data structure.

First we need a simple structure to play with: a tree where each node can have zero or more sub-trees.

class Tree:
    def __init__(self, value, children=None):
        self.value = value
        # avoid a mutable default argument; give each tree its own list
        self.children = children if children is not None else []

The implementation of accept for this type is rather obvious:

    def accept(self, visitor):
        visitor.visit(self)
        for c in self.children:
            c.accept(visitor)

Next we need an implementation of a Visitor that all visitors can derive from. Since Python’s dispatch doesn’t depend at all on the type of the argument we’ll have to implement the necessary behaviour ourselves, i.e. inspect the type and then pick the correct method to call.

class Visitor:
    def visit(self, obj):
        func_name = 'visit_' + obj.__class__.__name__
        visit_func = getattr(self, func_name)
        visit_func(obj)

In order to visit a Tree type it also needs the appropriately named method:

    def visit_Tree(self, obj):
        pass

Now it’s easy to create a visitor. Here’s a very simple one:

class TreePrintVisitor(Visitor):
    def visit_Tree(self, obj):
        print('Tree (%s): %i' % (hex(id(obj)), obj.value))

Finally here’s a function that exercises what we’ve just put together:

def test_visitor():
    leaves = [Tree(42), Tree(17)]
    tree = Tree(1, leaves)
    printer = TreePrintVisitor()
    tree.accept(printer)

Looking at this it’s easy to see the objections Palsberg and Jay present in their paper:

  1. A data type is only ‘visitable’ if it has an accept method, and
  2. we must know the types of all objects we want to visit, so changes to the class structure require changes to the visitors.

The authors then introduce Walkabout in order to remove these limitations.

Walkabout

To remove these limitations the authors use reflection to find out whether the visitor has a method to carry out the operation on the type of the current object. If no such method exists they use reflection to find all members of the object and then visit them. The Walkabout class and its visit method then look something like this:

class Walkabout:
    def visit(self, obj):
        func_name = 'visit_%s' % obj.__class__.__name__
        if hasattr(self, func_name):
            visit_func = getattr(self, func_name)
            visit_func(obj)
        elif hasattr(obj, '__dict__'):
            for m in obj.__dict__.keys():
                self.visit(getattr(obj, m))

The accept method can be removed from Tree and the visitor is changed to include code to continue the traversal:

class TreePrintWalkabout(Walkabout):
    def visit_Tree(self, tree):
        print('Tree (%s): %i' % (hex(id(tree)), tree.value))
        for c in tree.children:
            self.visit(c)

The function for exercising this only changes slightly:

def test_walkabout():
    leaves = [Tree(42), Tree(17)]
    tree = Tree(1, leaves)
    printer = TreePrintWalkabout()
    printer.visit(tree)

This is where Palsberg and Jay stop, but I think there is one little detail in Walkabout that’s worth looking a little closer at.

Walkabout Two

I personally find it a little strange that the authors first note that the visitor pattern suffers from the limitation that one has to know up front the full set of types to visit, but then don’t recognise that their own solution instead requires one to know the shape of (parts of) the structure to operate on. In other words, the classes deriving from Walkabout are in charge of carrying on the traversal (the last two lines of visit_Tree above).

This little detail is of course easy to work around: we just modify visit to always visit all members, irrespective of whether there is a method to handle the current object. There may be cases where we want to stop the walking about for efficiency reasons; we can address that at the same time, and WalkaboutTwo then looks like this:

class WalkaboutTwo:
    def visit(self, obj):
        func_name = 'visit_%s' % obj.__class__.__name__
        descend = True
        if hasattr(self, func_name):
            visit_func = getattr(self, func_name)
            descend = visit_func(obj)
        if descend and hasattr(obj, '__dict__'):
            for m in obj.__dict__.keys():
                self.visit(getattr(obj, m))
        if descend and hasattr(obj, '__iter__'):
            for o in obj:
                self.visit(o)

One little detail above is that if we use only reflection to traverse the Tree we won’t actually find the sub-trees as immediate members since they are contained in a list. We address this by checking whether the object has an iterator as well, and if it does we visit all items.
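
The paper has no usage example for this variant, so here’s a small sketch of my own: since WalkaboutTwo drives the traversal itself, a visit_ method no longer traverses the children; it just returns True if descent should continue:

class TreePrintWalkaboutTwo(WalkaboutTwo):
    def visit_Tree(self, tree):
        print('Tree (%s): %i' % (hex(id(tree)), tree.value))
        return True  # keep descending into the members of this node

def test_walkabout_two():
    leaves = [Tree(42), Tree(17)]
    tree = Tree(1, leaves)
    TreePrintWalkaboutTwo().visit(tree)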

2014-06-12 Fixed minor typo, thanks to Joe for pointing it out.


Localised configuration in Vim: localcfg

For quite a while I’ve been using a small Vim plugin that lets me write configuration that is specific to a system; it loads a config file based on the system’s host name. Unfortunately I can’t seem to find that plugin anywhere now, so I’ve put it in a snippet. It allowed me to easily create a single clean Vim configuration and check it into version control, while still allowing for settings that are unique to a system.

Lately I’ve found it slightly limited though; I really wanted to use other things to trigger the loading of some piece of configuration. So I wrote my first ever Vim plugin: localcfg.

Hopefully someone will find it useful.


Phabricator on ArchLinux

At work we’ve been using Trac for quite a while now, but it’s always interesting to look at other options. When listening to a recent episode of git Minutes on Blender’s move to using git I heard of Phabricator for the first time. There are good instructions for how to install it on Ubuntu, but I couldn’t find any for ArchLinux.

These are my notes for installing Phabricator on ArchLinux. Hopefully someone will find them useful.

Basic setup

I performed this install in a virtual machine (using VirtualBox). The virtual machine is configured with a bridged network and I gave it the FQDN of phabarch.vbox.net.

Packages

Beyond the packages installed as part of the basic install I needed the following packages:

  • lighttpd
  • git
  • mariadb
  • php
  • php-cgi
  • php-apcu
  • php-gd

Setup of mariadb

Following the instructions on the ArchLinux wiki page on mariadb I first started the service and then finished the installation:

# systemctl start mysqld.service
# mysql_secure_installation

After this I restarted the service, and made sure it’ll restart on system start:

# systemctl restart mysqld.service
# systemctl enable mysqld.service

I picked the root password mariaroot.

Setup of lighttpd

Modify /etc/lighttpd/lighttpd.conf to look something like this:

server.modules = (
  "mod_rewrite",
  "mod_fastcgi",
)
 
server.port = 80
server.username  = "http"
server.groupname = "http"
server.document-root = "/srv/http"
server.errorlog  = "/var/log/lighttpd/error.log"
dir-listing.activate = "enable"
index-file.names = ( "index.php", "index.html" )
static-file.exclude-extensions = ( ".php" )
mimetype.assign  = (
  ".html" => "text/html",
  ".txt" => "text/plain",
  ".css" => "text/css",
  ".js" => "application/x-javascript",
  ".jpg" => "image/jpeg",
  ".jpeg" => "image/jpeg",
  ".gif" => "image/gif",
  ".png" => "image/png",
  "" => "application/octet-stream"
  )
 
fastcgi.server += ( ".php" =>
  ((
    "bin-path" => "/usr/bin/php-cgi",
    "socket" => "/var/run/lighttpd/php.socket",
    "max-procs" => 1,
    "bin-environment" => (
      "PHP_FCGI_CHILDREN" => "4",
      "PHP_FCGI_MAX_REQUESTS" => "10000",
    ),
    "bin-copy-environment" => (
      "PATH", "SHELL", "USER",
    ),
    "broken-scriptfilename" => "enable",
  ))
)
 
$HTTP["host"] =~ "phabarch(\.vbox\.net)?" {
  server.document-root = "/srv/http/phabricator/webroot"
  url.rewrite-once = (
    "^(/rsrc/.*)$" => "$1",
    "^(/favicon.ico)$" => "$1",
    # This simulates QSA (query string append) mode in Apache
    "^(/[^?]*)\?(.*)" => "/index.php?__path__=$1&$2",
    "^(/.*)$" => "/index.php?__path__=$1",
  )
}

Setup of php

Modify /etc/php/php.ini and enable the following extensions:

  • mysqli.so
  • openssl.so
  • iconv.so
  • apcu.so
  • gd.so
  • posix.so

Also disable the open_basedir setting.

Getting and setting up Phabricator

Checking it out

I placed it in /srv/http:

# cd /srv/http
# git clone git://github.com/facebook/libphutil.git
# git clone git://github.com/facebook/arcanist.git
# git clone git://github.com/facebook/phabricator.git

Database configuration

# cd /srv/http/phabricator
# ./bin/storage upgrade --user root --password mariaroot
# ./bin/config set mysql.user root
# ./bin/config set mysql.pass mariaroot

Set the base URI

# cd /srv/http/phabricator
# ./bin/config set phabricator.base-uri 'http://phabarch.vbox.net/'

Diffusion configuration

# mkdir /var/repo
# ./bin/config set diffusion.allow-http-auth true

At this point I started lighttpd and used the web interface to configure environment.append-paths to include the path of git-core, /usr/lib/git-core.
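
It should also be possible to set this from the command line with bin/config; I haven’t verified the exact form, but list-valued options are given as JSON, so I’d expect something like this to work:

# cd /srv/http/phabricator
# ./bin/config set environment.append-paths '["/usr/lib/git-core"]'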

phd configuration

First create the daemon user:

# useradd -r -M -d /tmp phabd

Then create the phd log dir and set its owner and group:

# mkdir /var/tmp/phd/log
# chown -R phabd:phabd /var/tmp/phd/

Also make the daemon user the owner of the repo folder used by Diffusion:

# chown phabd:phabd /var/repo

Configure sudo

Create the file /etc/sudoers.d/phabricator with this content:

http ALL=(phabd) SETENV: NOPASSWD: /usr/lib/git-core/git-http-backend

User configuration

In the web interface set a VCS password for the user.

Play

Now the system should be ready to be played with :)


How do professional Windows programmers stand Visual Studio?

I have a new assignment at work and now find myself at yet another Windows shop. They are making embedded systems, but have for some strange reason decided that Windows is the only development platform to use. After only a few weeks here I’m noting a growing irritation with the tools offered for my use. The amount of forced mouse usage is astounding and the main tool, Visual Studio, is turning out to be the main culprit. After a week or so of exposure to VS I’ve found what I consider to be a serious flaw with a tool for developers: it doesn’t scale.

  1. No hierarchical structure. It doesn’t scale very well with the size of a project. The concept of having a solution with a number of projects is not bad, but the implementation doesn’t scale. A project can’t have sub-projects, which means I can’t layer the code in the IDE the same way I do on disk. The only thing I can do is organise the viewing of files through the concept of filters.
  2. All configuration is manual, part 1. MS seems to have optimised for small projects. All configuration is kept in XML files, and the only interface to them is a set of property dialogues (some of which can be resized, others not) requiring an amazing amount of pointing and clicking to get anything done.
  3. All configuration is manual, part 2. MS have optimised for beginning new projects. Getting started is amazingly quick, but once you reach a size of about 10 projects it becomes daunting to fix anything that requires configuration in all projects. Making sure that all configurations are correct is a major undertaking, and requires an insane amount of mouse use. Some earlier versions of VS even seem to have made it impossible to edit the common settings of configurations properly; a simple mistake and the value in one configuration is lost.
  4. Experience. There are no shortcuts to discover. The configuration is all in XML, which means it’s not really possible to jump out of VS and use a text editor for fixing up semi-broken configurations (like dealing with someone’s decision to place all intermediate files and final results in the top-level solution directory).

So, how do Windows developers cope with this? Don’t they use VS? Do they police the configurations diligently to ensure no silliness creeps in? Or are they all using tools to address these points (like CMake)?


TCLAP for command line argument parsing in C++

A long while ago I was looking for a way to handle command line arguments in a C++ program I was writing for Windows. At the time I only found Boost.Program_options. After a bit of experimenting I found that the pre-built Boost libs I found back then had some issues, and after a bit of thinking I decided to write the program in Python instead :)

Now I once again find that I need to handle command line arguments in C++ on Windows, but this time I found quite a few options: gflags, ezOptionParser, and TCLAP. They are all stand-alone, meaning that I don’t need to pull in a huge dependency like Boost or Qt, and liberally licensed (BSD3 and MIT) so usage at work is no problem. After a bit of playing with all three I found that TCLAP is most to my liking, but gflags does have one nice little feature: it allows putting command line option definitions pretty much anywhere in the code. This would solve one of the problems I’m facing; the tool shall consist of a main program and pluggable modules, where each module must be able to add command line arguments of its own. However, it turns out that the TCLAP API easily allows implementing such a scheme. Here’s how I implemented it in the experiment code I’ve been playing with this weekend.

How it works

The basic idea is to create a central instance of TCLAP::CmdLine that options can be registered with, both in the main program and in the pluggable modules. By declaring the option instances as top-level variables in the compilation units it’s enough to just load a pluggable module for its constructors to run, and by passing the central TCLAP::CmdLine to the constructors the options register themselves properly. This seems to be the same idea used in gflags.

Registering a command line option

The following is the code my pluggable module uses to register an option:

TCLAP::ValueArg<int> value("v",
                           "value",
                           "ONE: the value to return",
                           false,
                           42,
                           "integer",
                           argparse::argparser);
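
After the central parser has run, the module reads the option through the normal TCLAP accessor, ValueArg::getValue(). A minimal sketch (the accessor function below is my own invention, just for illustration):

int module_value()
{
    return value.getValue();  // 42 unless -v/--value was given
}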

The parser

I put the parser in its own namespace, and due to the API of TCLAP::CmdLine I found that I needed to subclass it.

#include <tclap/CmdLine.h>
 
namespace argparse {
 
class ArgParser : public TCLAP::CmdLine {
public:
    ArgParser() : TCLAP::CmdLine("No message set.", ' ', "No version set.") {}
 
    void setMessage(std::string msg) { _message = msg; }
    void setVersion(std::string version) { _version = version; }
};
 
extern ArgParser argparser;
 
}

The main function

After this I just need to instantiate the central parser instance and set it up.

argparse::ArgParser argparse::argparser;
 
int main(int ac, char **av)
{
    argparse::argparser.setMessage("The message.");
    argparse::argparser.setVersion("0.0.0");
 
    // load the plugin modules
 
    argparse::argparser.parse(ac, av);
 
    // do the work
 
    return 0;
}

Limitations

The main limitation I can see with this approach, and AFAICS that would be true with gflags as well, is that it’s not very easy to find the plugin modules to load by passing them on the command line. If the TCLAP API allowed parsing without acting on --help and --version, as well as being forgiving with arguments it doesn’t know how to handle, it would be possible to use an option in the main application to find the modules, load them, and then re-parse the command line. In my particular case it doesn’t matter much; the plugin modules will all be found in a well-known place and all available modules should be loaded every time. It does make testing a bit more cumbersome though.


Strachey, referential transparency, Haskell

This is my input into the recent discussion on referential transparency (RT). I’m nowhere near as well versed in the subject as others, but how am I ever to learn anything unless I put my thoughts out there for them to be laughed at and ridiculed? ;)

It all started with a post on stackoverflow.com, which received several very long and detailed responses, in particular from Uday Reddy (here and here). His answers were also linked to from Reddit. His second response contains a link to an excellent paper by Strachey, Fundamental concepts in programming languages. I’d go as far as saying that, despite it being lecture notes rather than a fully worked paper, it ought to be required reading for all software developers.

The rest of what I write here hinges on me actually understanding what Strachey writes in his paper. Of course I’m looking forward to comments/corrections/etc that help me correct my understanding.

What Strachey says about RT

In section 3.2.1 he introduces RT like this:

One of the most useful properties of expressions is that called by Quine referential transparency. In essence this means that if we wish to find the value of an expression which contains a sub-expression, the only thing we need to know about the sub-expression is its value. Any other features of the sub-expression, such as its internal structure, the number and nature of its components, the order in which they are evaluated or the colour of the ink in which they are written, are irrelevant to the value of the main expression.

There is however a crucial bit just before that:

Like the rest of mathematics, we shall be concerned only with R-values.

That is, he starts out with a very limited subset of what most people would consider a usable imperative programming language.

He then dives into some more details in section 3.2.2 by adding the concept of an environment, which is handled through the use of a where-clause, or alternatively using let-statements (this ought to make any Haskell developer feel right at home). After a few interesting sections on things like applicative structure, evaluation, and conditional expressions he finally tackles the issue of variables in section 3.3.1. There are two pieces to the trick: the first is to take advantage of his earlier insight that led to a split of values into L-values and R-values:

If we consider L-values as well as R-values, however, we can preserve referential transparency as far as L-values are concerned. This is because L-values, being generalised addresses, are not altered by assignment commands. Thus the command x := x+1 leaves the address of the cell representing x (L-value of x) unchanged although it does alter the contents of this cell (R-value of x). So if we agree that the values concerned are all L-values, we can continue to use where-clauses and lambda-expressions for describing parts of a program which include assignments.

The cost of this is that the entire theory constructed earlier for operations taking R-values now has to be revised to incorporate L-values. The outline for this is in the rest of section 3.3, and it basically comes down to including an abstract store in the environment. However, before doing that he mentions that:

I think these problems are inevitable and although much of the work remains to be done, I feel hopeful that when completed it will not seem so formidable as it does at present, and that it will bring clarification to many areas of programming language study which are very obscure today. In particular the problems of side effects will, I hope, become more amenable.

He does reach his goal, but it’s a bit unfortunate that he stops short of considering the wider problem of side effects. My assumption is that this would have to be dealt with in a similar way to assignment, but that would mean that rather than just adding a store to the environment, the world, or a subset of it, would need to be added.

An open question (to me) is if anyone has built on Strachey’s work in this area and thought of the details of RT and general side effects?

RT in Haskell

The original question described RT as

it means you can replace equals with equals

which I actually think is a rather good, and very short, description of it. It’s not the full story, there are further important details, but it’s a good first intuition. Also, it’s a description usable in Haskell. Well, to be slightly more nuanced, it’s good for Haskell without IO (Haskell-IO). However, this is where the strict type system of Haskell really starts to shine because (here I’m probably a bit imprecise) we only have R-values in Haskell-IO. If we want to use assignment we add the use of a state monad, and we do that explicitly.

A former colleague of mine said that in Haskell we need to build up our own computational models ourselves. For instance, if we need assignment we use State, if we need to record progress we use Writer, etc. In other languages the language designer has already made all those choices for us; we don’t get to make them ourselves. For RT it means that Haskell is more explicit about what the environment of a function is.
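
As a small illustration of my own (assuming Control.Monad.State from the standard mtl/transformers packages), the Strachey-style command x := x + 1 becomes an explicit State computation:

import Control.Monad.State

-- The "assignment" is explicit in the type: the computation reads and
-- writes an Int state, and nothing else.
increment :: State Int ()
increment = modify (+ 1)

-- Running it also makes the threading of the store explicit.
example :: Int
example = execState increment 41  -- 42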

Moving on to general side effects, those are also more explicit in Haskell since they have to happen inside the IO monad. That alone is a great boon for RT in Haskell since it becomes explicit where RT as worked out by Strachey applies directly, and where there are (hopefully amenable) problems of side effects left. Even further, in Haskell it’s possible to make subsets of IO (by wrapping IO, see e.g. my own posts on wrapping IO, part 1 and wrapping IO, part 2). I’m sure that if including the world in the environment is the way to achieve RT with general side effects, then it’s highly beneficial to be able to create subsets of the world.
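
A minimal sketch of what such a subset can look like (my own example, using GeneralizedNewtypeDeriving; the linked posts go into more detail):

{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- A newtype around IO that only exposes file reading; code in FileIO
-- cannot perform arbitrary side effects, only what we choose to wrap.
newtype FileIO a = FileIO (IO a)
  deriving (Functor, Applicative, Monad)

readConfig :: FilePath -> FileIO String
readConfig path = FileIO (readFile path)

runFileIO :: FileIO a -> IO a
runFileIO (FileIO act) = act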

RT in Haskell vs. RT in (common) imperative languages

Uday writes in his first answer that:

But, today, functional programmers claim that imperative programming languages are not referentially transparent. Strachey would be turning in his grave.

This may well be true, but I think that when a Haskell programmer says it, he’s only twitching slightly. The reason? Strachey writes:

Any departure of R-value referential transparency in a R-value context should either be eliminated by decomposing the expression into several commands and simpler expressions, or, if this turns out to be difficult, the subject of a comment.

Which is something that Haskell programmers do naturally by use of IO. That is, in Haskell you either have an R-value, and you clearly see that you do, or you put in a comment, which is encoded in the type of the function.

This rather lengthy post basically arrives at the following, which is what I suspect the user pacala is saying about RT on Reddit:

Imperative languages may well be RT, but when trying to understand a code base the environment of each function is so large that understanding it is an intractable problem. I don’t have this problem in Haskell.


Compiling boost for Windows, with MinGW on Linux

Just in case you see the utter logic in developing for Windows on Linux :)

In the root of the unpacked Boost:

  1. Run ./bootstrap.sh --with-python=$(which python2) --prefix=${HOME}/opt/boost-win --without-icu
  2. Modify project-config.jam like this:
         # file.
         if ! gcc in [ feature.values  ]
         {
        -    using gcc ;
        +    using gcc : : i486-mingw32-g++ ;
         }
        
         project : default-build gcc ;
  3. Compile and install by running ./bjam --layout=system variant=release threading=multi link=shared runtime-link=shared toolset=gcc target-os=windows threadapi=win32 install

Limit what is built by adding e.g. --with-program_options to that last command.
