Saturday 7 September 2013

Choosing a Linux Distro for an Embedded System

Many years ago, when my old employer Digital Equipment Corporation was trying to stave off the advance of Unix, it came up with the slogan "Unix ain't Unix". In other words, there were a lot of systems at the time (around 1988) that all called themselves Unix, but which in fact were all different - in management, in commands, and in the APIs they supported.

Fast forward to 2013, and Linux is really the only Unix that matters any more. (Yes, I know about BSD - more later on that - and I stand by my statement). But the statement "Linux ain't Linux" applies with just as much truth.

I've been using Linux as my main development environment for a couple of years now. I started with Ubuntu 11.04, for no better reason than I happened to have a DVD of it. It was a pretty decent system - it stayed up for weeks at a time and had a usable Windows-ish GUI. If only that were still true. Recent releases of Ubuntu rarely stay up for more than a day or two at a time, typically before the window manager dies leaving the rest of the system ticking away but completely unusable. And of course they made the incomprehensible decision to replace Gnome, which is dull but functional, with Unity. (There's an old - c. 1780 - and rather delicious quote that "Englishwomen's shoes seem to have been made by someone who has heard shoes described but never actually seen any". Ditto with Unity and the Mac). The first thing I do whenever I bring up a new Ubuntu system is to replace it with "Gnome Classic". (The latest version of Gnome in turn seems to have been developed by someone who has heard Unity described but never actually seen it).

For the last year I've been developing an embedded system for Internet traffic management and monitoring. From the beginning we've taken for granted that it would run on Linux. The question is, which Linux distro should we use? There are numerous choices: Ubuntu, Fedora, Red Hat, Centos, Arch, Gentoo - and those are just the well-known ones.

For sure Ubuntu is a poor choice. It's desperately trying to be a replacement for Microsoft Windows, and has way too much clutter and extra stuff for an embedded system. We're trying to keep the footprint small, both memory and virtual disk, and we really don't need to have three different GUIs, LibreOffice, three different database systems... you get the picture. So Ubuntu was out from the beginning.

I worked with a company that had selected Gentoo. The advantage of Gentoo is that you get to choose absolutely everything about the system, down to the tiniest details like which implementation of cron you use. The disadvantage of Gentoo is that you have to... do all that. It's true that it will give you an absolutely minimal system, tailored exactly as you need it, but it's a lot of effort, not to mention the learning curve. It might make sense when we're bigger, but right now we need everyone focussed on stuff that will really differentiate us.

Somewhere along the line we looked at BSD - we were using something at the time whose support was much better there than on Linux. What a nightmare! Everything has to be built from source - they have a repository system but it is 'temporarily out of service'. It's truly a system for hobbyists, like Xen.

I looked at Centos, and got as far as installing it on a system. Then I realised that it is rooted so far in the past that I'd almost have to dig out my stock of IBM punch-cards. In particular, it supports a truly ancient version of GCC (4.4 I think). We make extensive use of features from C++11, which means we need at least 4.7. There are ways to have a development environment which is distinct from the system's own build environment, but they look pretty terrifying and weren't something I wanted to try and get my head around - for the same reason I didn't want to become a Gentoo expert.

That left Arch. I'd heard good things about it, and it also tries to be minimalist, so it seemed to be the way to go. I installed an Arch system without too much trouble, and got our system up and running on it. The only problem was log4cxx, which isn't available as a package and which wouldn't build from source either. Like much Linux software out there, it has a bunch of outdated assumptions about implicit include files which don't work with recent versions of gcc. But the changes were simple and we quickly had a version that would build.

Networking in Arch is very quirky. It starts with ethernet devices, which instead of being called eth0 and so on, have names which reflect the PCI hierarchy like 'ep5d3'. It's a nuisance but not a major problem. But then it turns out they selected a completely different way to manage networking than other Linuxes. User administration is completely different, too. I'm sure the answer would be "but you can always build whatever you want and do it your own way." True, but not especially helpful.

Anyway, we persevered with Arch, and got our systems running. It took the passage of time to realise that Arch is constantly changing - as in, every day. An Arch system installed and configured today won't be the same as one installed tomorrow. Anything and everything can change - the kernel, the utilities, the drivers. When Boost 1.53 came out, Arch had it a few days later. Switching Boost versions is not something to be undertaken lightly, and indeed our system wouldn't build - some incompatible change involving locales, themselves a completely incomprehensible feature of Linux.

Our biggest problem came from trying to integrate the Intel DPDK package for high-performance user-space networking. Now, DPDK is essential to what we're doing. But  it is hardly a model of stability either, with a new version coming out practically every week. The combination of this with the ever-changing sands of Arch, especially kernel changes, just made it impossible to keep up. If we got things working on Monday, they'd be broken again on Tuesday.

We looked into somehow selecting our own stable snapshot of Arch. In a VM environment, it's easy enough to build a master VM and just use that. But our system also has to run on bare metal, which is not so easy. There is, supposedly, a way to take a snapshot and make a private repository. But once again, the investment in time is just not something a tiny group like ours can afford to make if we are to ship a product in a reasonable time.

And so, with great reluctance, I made the decision last week that we will ship our product on Ubuntu. I know that it is really not the right choice for an embedded system. But it works, and it doesn't change on a daily basis. We're used to its quirks, like yet another gratuitously incompatible set of network configuration tools. Hopefully we'll have the luxury of re-examining this later on when we have more people and more time to look at it.

Saturday 10 August 2013

How to identify a string using regular expressions

A problem that crops up from time to time is to identify which of several possible strings matches a given target. For example, the list might be apple, banana, cherry, date, elderberry, fruit, grapefruit, ... Of course for a short list, just about anything will work fine, including just serially comparing with each of them. If the strings are an exact match, a tree or a hash table will work. It gets more complicated, though, if only a partial match is required, for example to identify which of the list of fruits is contained in the target. The general case is where each of the possible strings is an arbitrary regular expression, i.e. '.*apple.*', '.*banana.*' and so on.

There are well-known algorithms for constructing a suitable search tree, but they don't form part of any conveniently available library that I'm aware of. And they are extremely subtle and complex, especially if you want to accommodate the full power of regular expressions - which means that you'll probably get it wrong in some subtle and difficult to test way if you try to code it yourself.

On the other hand, every system has regular expressions - even C++ in the latest version. It seems fairly obvious to concatenate all the matching strings into one giant regex, separated by '|', and let the regex engine do the work. It surely implements one of the well-known algorithms and has been fully tested. What could be simpler?

Well, except that no regex system I'm aware of gives you an easy way to find out which substring you matched. It will tell you that one of the fruit names is somewhere inside the string, but not which one. It would be nice if there was some way to tag each of the alternatives and then retrieve the tag of the one that matched - but there isn't. You can retrieve the pattern that matched, or part of it, but that just gets you back to where you started.

This problem cropped up again recently for me, in this case matching URLs described by regular expressions. And this time, I thought about it long enough to find a solution, which is to say a way of getting the regex engine to tell me which substring it had matched.

Let's suppose, just to keep the examples simple, that there are fewer than 100 possible strings. As you will see, the technique can be extended to any possible number, but you do need to know an upper bound when you start.

First, append the string '#0123456789,0123456789,' to the target string. '#' can be any character or longer string that you are sure will not appear in any of the possible matches.

Next, append to each of the match targets a string like '#.*?(2).*?,.*?(6).*'. The '2' and the '6' in the example should be replaced in each case by a number uniquely identifying the string in question. This is a mildly tricky regex (compared to some of them!) which functions as follows.

  • the leading '#' just ensures that this won't accidentally match anything it shouldn't.
  • .*? will match anything until it finds a '2'. The '?' says that this string should match as little as possible. Without it, the engine would try to match to the furthest '2'. If there are several digits, this would require it to back up several times during the matching process, and would be seriously time-consuming.
  • (2) will match only the digit '2', and, most importantly, will capture it as a substring which can subsequently be retrieved. We know the '2' will be present, because we put it there ourselves.
  • The following '.*?,' matches everything up to and including the following comma. 
  • The following unit repeats the process to capture the '6' from the second group of digits.
For practical use there would be more digits, each repeating the first unit from the example.

Once the string has matched, all regex systems provide a way to retrieve the captured digits, which can then be assembled to get the identifier of the matched substring. In C++, a 'smatch' structure holds the result including the matched substrings. The following code snippet shows how it is done in C++.

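int identify(const string &target)
{
    static string digits("0123456789,");
    string t(target);
    t += '#';                    // separator that cannot appear in a match
    for (int i=0; i<digit_count; ++i) {
        t += digits;             // one group of digits per digit of the id
    }
    smatch m;                    // structure that holds parse result
    if (regex_match(t, m, master_regex)) {
        string id_str;
        // Capture groups are numbered across the whole master regex, so
        // scan them all; groups in alternatives that didn't match are empty.
        for (size_t i=1; i<m.size(); ++i) {
            id_str += m[i].str();    // retrieve and append successive digits
        }
        return lexical_cast<int>(id_str);
    } else {
        return 0;                // no match, so identifiers must start at 1
    }
}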

Et voila!
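
For completeness, here is a minimal sketch of how the master regex itself might be assembled and used. It assumes two-digit identifiers running from 1 to 99, std::regex with its default ECMAScript grammar, and patterns that contain no capture groups of their own. The names tag_pattern, build_master and identify_tagged are made up for this illustration and are not from my actual code.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: wraps one pattern and appends the tag that captures
// the two digits of its identifier (assumed to be in the range 1..99).
std::string tag_pattern(const std::string &pattern, int id)
{
    std::string tens(1, char('0' + id / 10));
    std::string units(1, char('0' + id % 10));
    return "(?:" + pattern + "#.*?(" + tens + ").*?,.*?(" + units + ").*)";
}

// Joins all the tagged alternatives with '|' into one master regex.
// Pattern number i (counting from zero) gets identifier i+1.
std::regex build_master(const std::vector<std::string> &patterns)
{
    std::string all;
    for (std::size_t i = 0; i < patterns.size(); ++i) {
        if (i) all += '|';
        all += tag_pattern(patterns[i], int(i) + 1);
    }
    return std::regex(all);
}

// Returns the identifier of the pattern that matched, or 0 if none did.
int identify_tagged(const std::regex &master, const std::string &target)
{
    std::string t = target + "#0123456789,0123456789,";
    std::smatch m;
    if (!std::regex_match(t, m, master))
        return 0;
    std::string id_str;
    for (std::size_t i = 1; i < m.size(); ++i)
        id_str += m[i].str();   // groups from unmatched alternatives are empty
    return std::stoi(id_str);
}
```

With the patterns '.*apple.*', '.*banana.*' and '.*cherry.*', a target containing 'banana' should come back as 2. If more than one pattern could match the same target, the engine settles on one of them, so the identifiers are only unambiguous when the patterns are mutually exclusive.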

Tuesday 18 June 2013

The short but happy (and rapid) life of Hercule "Purr-o-matic" Poirot, RIP

Adieu Hercule. It was a short life but a happy one. Born, maybe early May 2012. Died some time in the night of 17/18th June 2013, instantly and in the only too common way for cats, hit by a car.

When our first cat, Lewis, died in February 2012 after a long illness, we were so sad that we didn't really think about getting another cat.  Then in July, we learned of someone who had rescued a litter of feral kittens and their mother. We visited, and saw five beautiful kittens. Finally we chose two of them, a black male with a tiny white patch exactly where a bowtie would go, and a three-colour female. We brought them home - they were so tiny that both fitted comfortably in one cat bag. Once home, they cowered in a corner of a bookshelf in the room we'd set aside for them.

It took us a long time to find names for them. We wanted something that was a he-and-she pair, but nothing we came up with really suited them. It was weeks before we thought of Hercule Poirot and Miss Marple, always abbreviated to Missy, names which seemed to suit them very well.

Hercule was always the bold one. Missy spent her first days with us cowering on the bookshelf, or hiding behind the books - she developed an amazing talent for hiding herself which she retains even now, despite being several times bigger.

But Hercule was soon exploring everywhere in the house. He grew fast - after our experience with Lewis I weighed them regularly and kept a chart, so I could see how quickly he was gaining weight.

It took only the slightest thing to make him purr. Just looking at him was enough. We called him "Hercule Purr-o-matic Pussycat".

As he grew up, he became more and more like Lewis. It was as though, from somewhere in pussycat paradise, Lewis was advising him. We fed them both slices of beef, calling it rat in the hope they'd get a taste for it. But soon Hercule would eat nothing but prawns, Lewis's favourite treat - well, and his usual dried catfood of course. In so many ways he seemed to be copying Lewis.

He was a hunter - his sister too, and we mostly didn't know who had done the hunting. They often showed signs of their feral background - Hercule with his hunting, Missy with her ability to hide.

At the back of our house is a stream, usually dry. It's possible, though tricky, for humans to cross it, but nothing for a cat. Just the other side is the garage of a neighbor's house, and under the floor there is a space easily big enough for cats to hide in. It became their second home. Usually when we came home in the evenings we'd call and they'd come running across our back fence.

Sometimes, especially lately, they waited for us to come to them. My fondest memory of Hercule is him sitting on the opposite bank, looking at me curiously, his eyes bright, his pretty white bowtie shining through the leaves. I never did manage to take a picture of him there, he'd run off before I could reach for my camera.

He ran everywhere, all the time. He never walked. Even going through his catflap he ran - we worried that if ever it was locked, he'd hurt himself. While we were eating breakfast, he'd run to the little dish of cream, sniff it, run somewhere else, run back, lap up a little, run off, run back... we joked that he wasn't really black, it's just he ran so fast that the photons couldn't keep up.

Just two weeks ago, Missy came home one evening with what turned out to be a broken femur. She had surgery and as I write this she's still limping with a huge shaved patch on her left flank.

He was always too busy to be really affectionate - sometimes he'd accept a cuddle for a few seconds but then he'd be off somewhere on an urgent mission. But when he was snoozing during the day he was always very happy to have his tummy tickled. He'd roll over and stretch out, purring noisily as I ran my hand up and down his belly. He'd even make a Moebius, his tail pointing the opposite way to his ears - another sign of Lewis's influence. That's my last memory of him, late yesterday afternoon. Then he remembered an important appointment and ran outside. The next time I saw him - broke my heart.

Last night I saw neither of them all night long - which is unusual. This morning Missy was around, but when I called "Hercule, cream" - which always brings him running - there was nobody else. I found his poor dead body in one of the streets the other side of the stream. It's not a busy street, but I suppose a black cat, at night - and of course he was certainly running.

We have no idea what this will mean for Missy, his inseparable sister, together since they were born and rarely far from each other. We hope she won't be too sad, but nobody understands the social life of cats.

So, adieu, Hercule, Purr-o-matic Pussycat. We hope at least that you've found Lewis up there in pussycat paradise.

Sunday 12 May 2013

On typename - and why C++ is a parser's nightmare

If you've done any significant C++ programming using templates, you'll certainly have run into the annoying rule requiring you to write "typename" before constructs of the form "class::member" if the class is actually a template parameter. For example:

template <class C>
class foo
{
     typedef typename C::value_type value_type;
};

If you miss out "typename" the compiler will complain. What's more, if it's GCC it will complain in a completely mysterious fashion, giving you no clue as to what the actual problem is.

And yet, surely it's obvious that this must be a typename? Why require all those extra keystrokes and visual clutter for something which is obvious? Every now and then I'd wonder about this, and read something which described the obscure situations where it isn't obvious. But I'd promptly forget, until the next time I spent ages pondering over unhelpful error messages until the little light came on - "aha, it wants a 'typename'!".

There's a good explanation of why actually it isn't obvious, to the compiler at least, here. But it took me trying to explain C++ parsing to someone for me to really get it.
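
As a small illustration of the classic case (my own sketch, not taken from that explanation): the same spelling 'C::thing * x' can be a multiplication or the start of a pointer declaration, depending on what 'thing' turns out to be once C is known. The names with_type, with_value, as_value and as_type are made up for the example.

```cpp
struct with_type  { typedef int thing; };           // 'thing' names a type
struct with_value { static const int thing = 6; };  // 'thing' names a value

template <class C>
int as_value(int x)
{
    // No 'typename': the compiler assumes C::thing is a value,
    // so this is parsed as a multiplication.
    return C::thing * x;
}

template <class C>
int as_type(int x)
{
    // With 'typename': C::thing is a type, so this declares a pointer p.
    typename C::thing *p = &x;
    return *p;
}
```

The compiler has to choose one parse when it first reads the template, long before it knows which class C will be, which is why it insists on being told.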

C++ teeters on the hairy edge of total ambiguity the whole time, without you even realising it as a user. And one of the worst culprits is the apparently innocent reuse of the less-than and greater-than symbols as template brackets. Consider the perfectly clear snippet:

foo<bah> foobah;

It's blindingly obvious that this is declaring 'foobah' to be an object of class 'foo' instantiated with 'bah' (presumably another class) as its template parameter.

Well, except that if all three names are actually ints, those template brackets suddenly turn into relational operators. It's not a very useful code snippet, but it is syntactically correct. First compare foo with bah, creating a bool result. Then compare that with foobah, having first promoted the bool to int. Then throw the (pretty meaningless) result away.

You don't even need templates. The reuse of '*' for both multiplication and dereferencing can also lead to ambiguity. Combining the two can get exciting:

foo<bah> *fbptr;

Obviously a declaration of a pointer to a 'foo<bah>'. Unless of course foo and bah are both numeric and fbptr is a pointer to a numeric. Then it's a replay of the previous example.
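
To see that both snippets really are legal expressions, here is a small sketch with everything declared as numeric. The function names and the values are made up for the illustration, and I return the result rather than throwing it away so there is something to look at. Expect a compiler warning or two about the odd chain of comparisons.

```cpp
// 'foo<bah> foobah' when all three names are ints.
bool brackets_as_operators()
{
    int foo = 1, bah = 2, foobah = 0;
    return foo<bah> foobah;     // parsed as (foo < bah) > foobah
}

// 'foo<bah> *fbptr' when foo and bah are ints and fbptr points to an int.
bool star_as_dereference()
{
    int foo = 1, bah = 2, target = 5;
    int *fbptr = &target;
    return foo<bah> *fbptr;     // parsed as (foo < bah) > (*fbptr)
}
```

With these values the first returns true (1 > 0) and the second false (1 > 5).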

This is all made worse because C++ (necessarily) allows you to refer to class members before they're defined. Consider the following example:

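class c1
{
    template<class C> class d1 {};
    class e1 {};
};

class c2 : public c1
{
    void fn()
    {
        d1<e1> f1;          // looks like a declaration at this point...
    }
    int d1, e1, f1;         // ...but these make it an expression
};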

When the parser sees the apparent declaration of f1, everything it knows tells it that this is indeed a declaration. Only later does it come across other declarations that completely change the interpretation. I  wonder how compilers deal with this - it would seem necessary to hold two completely different parse trees and hope that one of them will make sense later. Just to make it more interesting, the class could also go on to redefine the '<' and '>' operators, so they apply to different classes.

Even lowly Fortran IV wasn't immune to this kind of problem. It had no reserved words, and gave no significance to spaces. So when the compiler saw:

DO 100 I=1

which is obviously the beginning of a DO statement (equivalent to 'for' in C++), everything hinges on the next character. If it's a comma, this must indeed be a DO statement. But if it's say '+', or there is no next character on this line, then it's an assignment to a variable called 'DO100I' - and of course Fortran also didn't require variables to be declared, so they could just pop up like this.

I'm glad I don't write compilers for a living!

Saturday 11 May 2013

The End of The Russian Triode Saga

Today I sent my last package of Russian triodes, ending a saga which began 15 years ago at an audio tradeshow in London.

In 1997 I bought my flat in London - the best investment I ever made. I looked for an audio system, and rather to my surprise discovered that everything they said about valve (vacuum tube) amplifiers was true - they really did sound a lot better. I bought a valve system for London, and was then so dissatisfied with the system I had at my main home in France that I set about creating a system there too. That led me to build the extraordinary and wonderful Atma-Sphere MA60 transformerless (OTL) valve amplifiers, and to spend a lot of time investigating all the possibilities of this new-to-me but very old technology.

Somewhere along the line, I became fascinated by the massive Russian 6C33C-B power triode. This was built like the proverbial tank - with good reason, since that's almost where they were used. Supposedly they were built as voltage stabiliser valves for use in Mig jet fighters. Somewhere on the web I indeed saw a picture of some Russian avionics with one of them in it, so I suppose it's true. They're huge, and can pass a continuous current of over half an Amp. This is amazing for a valve - they are really high-voltage, low-current devices (high-impedance in electrical terms). Getting one to pass this kind of current, at a fairly low voltage, requires some very special engineering. For driving loudspeakers directly, without a transformer to turn volts into amps, they seem like the perfect device.

They've been used in a few commercial designs. Atma-Sphere at one time sold an amplifier using them. A company called BAT - which I think is still in business - had another. But - so the story goes - it's difficult to get them to work reliably. This must have been very reassuring to the Mig pilots.

I went to a high-end audio show in London, and got talking to the UK importer for the Sovtek Russian valves. He offered me a great price on the 6C33C if I bought a box of 50 of them. As it happened, I'd just seen somewhere that the factory in Russia had just realised the value of what they were making and were planning to increase the price - like by a factor of ten. At that time you could buy them direct from Russia for about $10 each.

"Aha," I thought. "I can get enough for my own projects, and make a tidy profit selling the rest when the price goes up". We talked some more and eventually I agreed to buy his whole stock of 400 valves, for a really excellent price. I wrote him a check, shook hands and that was that.

Several weeks later, a delivery truck showed up at my home in France. I couldn't believe my eyes! I had no idea that these things were so enormous! Each one came in a cardboard box about eight inches long and four inches square, beautifully protected by foam spacers and more cardboard inside. There were eight huge cartons, each one about two feet on each side. I stacked them all in the garage - where fortunately I had plenty of space - and started to wonder what on earth I was going to do with them.

I looked around at the pricing on the web, and decided I could sell them at a decent profit and still be cheaper than anyone else. I wrote an advert and put it on my own website - in 1997 running a website was a lot more complicated than it is today.

The response was modest, to say the least. I sold a handful during the remaining couple of years I lived in France. So when I moved to California in 2001, I still had my eight huge cartons. Well, mostly. They'd been tucked away in a back corner of the garage, where a combination of humidity and mice had reduced one of the cartons to crumbs, and several of the boxes inside as well.

Once in the US, I had more success. I was selling them way cheaper than the only other supplier outside Russia, and I had a special deal on a whole carton of 50. This made practically no profit, but my main goal was to get rid of them, not to make money. In fact any possible profit was wiped out by the storage I ended up renting so they didn't take up all the available space in my garage.

I think I sold four complete cartons like this. One led to the best pasta I've ever eaten. A guy in Japan bought them, and since I happened to be in Tokyo around that time we got together. He took me to a tiny place in a basement somewhere near Ebisu station where I had the most marvellous tagliatelle carbonara ever, a melt-in-the-mouth creaminess that I've never experienced anywhere else.

It may seem surprising that the best Italian food should be in Tokyo, but it isn't really. The best French meal I've ever eaten (including living in France for ten years) was there too, as well as the best French patisserie. When the Japanese copy something, they do it extremely well!

I was amazed by the number of people who'd inquire about them, often sending several emails, and then just not place an order. But I guess it's the same as when you sell a car, and people spend ages on the phone asking every possible detail, and don't show up as arranged. Some people must just have far too much time on their hands.

The pile of boxes in the storage gradually shrank. Then one day I took my car for service and got a huge pickup truck as a rental. (This was quite common, the local Enterprise offloaded whatever they had for these one-day rentals. I had a Cadillac deVille once, too. Verdict: interesting but don't sell the Audi). I realised the pile had shrunk enough that it would fit in the garage, and emptied the storage.

The pile continued to dwindle, until finally there was just one carton left, plus a few mouse-nibbled boxes that I was keeping for myself. Then, on May 5th 2013, someone ordered eight of them. This left only a dozen, which I'm keeping just in case I ever want to build a mega-transformerless amplifier with them. It's not likely, but then that's true for an awful lot of stuff I keep just in case. And I'd be annoyed if I did want some, and had to buy them at the current retail price of $75.

So that's it, the end of the Russian triode saga. Now, does anybody want any 6CW4 Nuvistors by any chance? They do take a lot less space, but I have 100 of them...

Wednesday 1 May 2013

Living without boost::has_member_function

I needed to write a function that would do one thing if its parameter class had a particular member function, and another if it didn't. It seemed like this should be straightforward with a bit of Template Metaprogramming (MPL). But it wasn't. Here's the story of how I found a solution.

In more detail... our software has lots of collections whose contents are all derived from the same class, named_object. There's a bunch of infrastructure which will dump objects and collections of them into various formats including highly abbreviated text for debugging, and JSON for our REST API. There are also a few collection classes which need to be used in the same way but aren't (and can't be) built out of the same class and collection structure.

The main collection base class provides a get_by_name function that does an efficient lookup, using the boost intrusive set class, and either returns the desired object or throws an exception if it isn't there. The other classes are typically held as linked lists, aren't derived from a common base, and don't provide a get_by_name function of their own since the details vary. However, they have enough in common - in particular, they provide a get_name function - that the std::find_if function can be used to scan through the collection item by item, in O(n) time.

So what I wanted was a way to write a function that would see if the collection class had a get_by_name function, and use that if present. If not, it would use the fallback of std::find_if.

Boost provides an extensive library called type_traits which can be used to do all sorts of clever things like this. For example, it's easy to provide a function which does something different if its parameter is a pointer, or supports ranking. But I looked in vain for the has_member_function metafunction. And a web search didn't find anything obviously helpful either. So... how to do it all by myself?

The key to much of metaprogramming is something called SFINAE (Substitution Failure Is Not An Error). This means that if an error occurs while evaluating the eligibility of a function (or a class partial-specialization), it isn't an error, it just removes that particular function overload from the candidate list. The Wikipedia article gives some simple examples. Using SFINAE, it's easy to define a function which will only be eligible if the required member function is present. One way is to define a template argument with a default which uses the member function in question:

    template<class C,
             typename C::value_type(C::*FN)(const string&) const = &C::get_by_name>
    typename C::value_type __get_by_name(const C *c, const string &name)
    {
        // get_by_name has to be a const member, since c points to a const C
        return c->get_by_name(name);
    }

Note that the type of get_by_name is spelled out in excruciating detail. This wasn't a problem in this case, so I didn't try very hard to avoid it. It seems to me that it ought to be possible, using a list of classes where every parameter is a distinct template argument, but I didn't look into it. And maybe not, which may explain why there isn't a boost::has_member_function. Anyway, this function will use get_by_name if it exists, and be deselected if it doesn't.

Now we just need the fallback function:

    template<class C>
    typename C::value_type __get_by_name(const C *c, const string &name)
    {
       typedef typename C::value_type V;
       // c is a pointer to the collection, so use -> to reach begin() and end()
       return *std::find_if(c->begin(), c->end(), [=](V v)->bool
                                                  { return v->get_name()==name; });
    }

This uses find_if with a simple lambda function to extract the right object. (For clarity I've omitted dealing with what happens if it isn't present. I've also assumed that the container deals in pointers, not references - which isn't the case for the Boost intrusive classes. That led to quite a lot of complication in my actual code).

Unfortunately though, it doesn't work. It's fine if C doesn't have get_by_name - the first function is deselected, leaving only the fallback version. But if it does have get_by_name, both functions are eligible and the compiler (gcc at least) declares the call ambiguous.

The rules for deciding how to order functions when more than one is available are complex, and probably only fully understood by people who write C++ compilers. Basically, a function is preferred if it is "more specific", i.e. if there is a set of parameters that will satisfy the other function but not this one. On that basis, it looks as though the first function should have preference over the second. But it doesn't.

At this point in my explorations I gave up. It wasn't that hard to write a get_by_name function for the other handful of collection classes. But it annoyed me and I kept thinking about the problem.

The very least specific function is one with completely unspecified arguments: "foo(...)". However such a function can't do anything, since it knows nothing about its arguments. Also, such a function only accepts POD arguments, not including references. Something a bit less drastic is required.

The solution was obvious when I saw it: add another, completely unused, parameter, which has explicit type for the preferred function, and is a template for the other one:

    template<class C,
             typename C::value_type(C::*FN)(const string&) const = &C::get_by_name>
    typename C::value_type __get_by_name(const C *c, int, const string &name)
    {
        // preferred: the unused second parameter is an explicit int
        return c->get_by_name(name);
    }

    template<class C, class I>
    typename C::value_type __get_by_name(const C *c, I, const string &name)
    {
       // fallback: the unused second parameter is a template argument
       typedef typename C::value_type V;
       return *std::find_if(c->begin(), c->end(), [=](V v)->bool
                                                  { return v->get_name()==name; });
    }

A wrapper function just inserts a suitable argument:

    template<class C>
    typename C::value_type get_by_name(const C *c, const string &name)
    {
        return __get_by_name(c, 0, name);
    }

Et voila! This selects the function using the class's get_by_name if it exists, and the fallback function only if it doesn't exist. Problem solved.
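For anyone who wants to try this, here is a minimal self-contained sketch of the whole trick. Item, Indexed and the detail namespace are invented for the illustration and are not taken from my real code. I've also used the name get_by_name_impl, since names containing a double underscore are strictly speaking reserved, and the fallback here returns a null pointer when nothing matches rather than dereferencing end().

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

using std::string;

// Hypothetical element type, for illustration only.
struct Item {
    string name;
    const string &get_name() const { return name; }
};

// Hypothetical collection with its own fast lookup. It deliberately has no
// begin()/end(), so the fallback below could not even compile for it.
struct Indexed {
    typedef Item *value_type;
    std::map<string, Item *> index;
    Item *get_by_name(const string &n) const
    {
        std::map<string, Item *>::const_iterator it = index.find(n);
        return it == index.end() ? nullptr : it->second;
    }
};

namespace detail {

// Preferred overload: the default template argument only substitutes if
// C::get_by_name exists with this signature; the unused int makes this
// overload more specialized than the fallback.
template<class C,
         typename C::value_type (C::*FN)(const string &) const = &C::get_by_name>
typename C::value_type get_by_name_impl(const C *c, int, const string &name)
{
    return c->get_by_name(name);
}

// Fallback overload: the unused parameter is a template type, so this one
// only wins when the preferred overload has been deselected.
template<class C, class I>
typename C::value_type get_by_name_impl(const C *c, I, const string &name)
{
    typedef typename C::value_type V;
    auto it = std::find_if(c->begin(), c->end(),
                           [&](V v) { return v->get_name() == name; });
    return it == c->end() ? nullptr : *it;
}

} // namespace detail

// Wrapper: supplies the dummy argument that steers overload resolution.
template<class C>
typename C::value_type get_by_name(const C *c, const string &name)
{
    assert(c != nullptr);
    return detail::get_by_name_impl(c, 0, name);
}
```

With these definitions, a call on an Indexed goes through Indexed::get_by_name, while a call on a std::vector<Item*> uses the find_if fallback.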

Tuesday 2 April 2013

Pete Aguereberry's Car

One of the great characters in the history of Death Valley is Pete Aguereberry. His story is truly a dream come true. Born in the Basque region of southern France (as you can tell from his name), he was fascinated by the stories of the Californian gold rush. At 16, he emigrated to California. By 1907 he was installed in what's now the Death Valley National Park, with his very own gold mine. It hadn't been easy - there were several years of disputed claims and lawsuits.

He stayed there until he died in 1945, working the mine by himself. It must have been a very lonely and hard existence. He lived in a small house a short way from the mine, a long way from the nearest town even by car. Even the nearest other human presence was an hour or more away, and much more before cars were available.

His name lives on today. Aguereberry Point is a 6500 foot peak overlooking Furnace Creek and the northern end of the Badwater valley, with magnificent views in all directions. He loved this view so much that he built the road to it, still in use today, so others could appreciate it too.

Not far from the house is the rusted shell of an old car. There are many pictures on the web, generally entitled "Pete Aguereberry's Car". But to me it looked like a late 1940s model, too late for it to have belonged to him. I was intrigued to know whether it could in fact have been his daily drive.

The wreck is in a sorry state. Everything that could possibly be taken, has been - the wheels, the seats, the instruments, nearly all the engine parts. Even the cylinder head has been taken. The body has been used for target practice and is full of holes. Vandals have evidently jumped on the roof, partly collapsing it. There is nothing that positively identifies the make or model.

However there is one strong clue. The engine is a very unusual type, with eight cylinders arranged in one long straight line. Nearly all eight-cylinder engines use the V8 configuration. "Straight eights" have always been a rarity. They take up a lot of space and the length of the crankshaft means that the front and back of the engine don't necessarily agree with each other, or with the camshaft, about exactly where they are in the four-stroke cycle. They've only ever been used in very high-end luxury models.

A quick Google search showed that in this period, there were only a handful of suitable models. Another helpful feature of the wreck is that it has only two doors - unusual in a luxury sedan. It was fairly easy to narrow it down to a Buick Sport from 1947 or 1948. All of the features compare, including things like the detailed line of the front wings flowing to the rear of the car and the remaining "fangs" of the grille. As further confirmation, the headless engine is clearly the overhead-valve type, like the Buick. Most engines in the late 40s were still of the "flathead" type, with the valves in the cylinder block.

So it couldn't have been Pete's - he'd been dead and buried in Lone Pine for at least two years before this car was built. Whose was it then? The story relates that towards the end of his life, he was helped with the mine by a nephew. Could this have been the nephew's car? In any case the mine must have produced well to finance a car like this, the equivalent of a high-end Mercedes Benz today.

Tuesday 19 February 2013

Raw Sockets, Raw Brain

I've been working for a while on a networking app that involves moving packets at great speed from one interface to another, while doing a little processing on them. Key to the performance we need is to avoid copying data or making kernel calls for each packet. We already did some testing a while back with Netmap, which provides direct access from user-mode code to hardware buffers and buffer rings - perfect for our application. Intel now has something similar called DPDK, but we chose Netmap because it was there when we needed it and also because, being open source, it avoids having to do a secret handshake with Intel.

However Netmap has some limitations so we have to modify it before we can fully integrate with it. In the meantime I wanted to do some functional testing, where ultimate performance isn't essential. So it seemed obvious to code a variation which uses sockets, accepting the performance penalty for now.

Little did I know. Getting raw sockets to work, and then getting my code to work correctly with them, was really a major pain. Since I did eventually get it to work, I hope these notes may help anyone else who runs into the same requirement.

What I wanted was a direct interface to the ethernet device driver for a specific interface. Thus I'd get and send raw ethernet packets directly to or from that specific interface. But sockets are really designed to sit on top of the internal IP network stack in the Linux kernel (or Unix generally) - for which they work very nicely. It's pretty much an unnatural act to get them to deal only with one specific interface. It's also, I guess, a pretty uncommon use case. The documentation is scrappy and incomplete, and the common open source universal solution of "just Google it" didn't come up with much either.

Just getting hold of raw packets is quite well documented. You create a socket using:

    my_socket = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));

and you will get, and be able to send, raw packets, with everything from the ethernet header upwards created by you. (You'll need to run this as root to get it to work).

Since my application runs as "bump in the wire", picking up all traffic regardless of the MAC address, I also needed to listen promiscuously. This also is reasonably well documented - it's done by reading the interface flags, setting the requisite bit, and setting them again:

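    ifreq ifr;
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, sys_name.c_str(), IFNAMSIZ-1);  // say which interface
    ioctl(my_socket, SIOCGIFFLAGS, &ifr);                 // read the current flags
    ifr.ifr_flags |= IFF_PROMISC;                         // add the promiscuous bit
    ioctl(my_socket, SIOCSIFFLAGS, &ifr);                 // write them back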

Production code should check for errors, of course.

I got my code running like this, but it was still getting packets destined for all interfaces. This is where it got tricky and had me tearing my hair out. Some documentation suggests using bind for this, while others suggest using an IOCTL whose name I forget. It's all very unclear. Eventually, bind did the trick. First you have to translate the interface name to an interface id, then use that for the call to bind:

    sockaddr_ll sa;
    memset(&sa, 0, sizeof(sa));                           // unused fields must be zero
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, sys_name.c_str(), IFNAMSIZ-1);
    ioctl(my_socket, SIOCGIFINDEX, &ifr);                 // interface name -> index
    sa.sll_ifindex = ifr.ifr_ifindex;
    sa.sll_family = AF_PACKET;
    sa.sll_protocol = htons(ETH_P_ALL);
    sa.sll_halen = ETH_ALEN;
    bind(my_socket, (sockaddr*)&sa, sizeof(sockaddr_ll));

A couple of wrinkles here. It may seem intuitively obvious that sockaddr_ll is in effect a subclass of sockaddr, but it isn't documented that way anywhere that I could find. And finding the header files that define these things, and then the header files they depend upon, and so (almost) ad infinitum, is a nightmare. In the end the best solution I could come up with was to run just the preprocessor on my source files, and look at the resulting C code. And note the ugly cast in the call to bind, because in the world of C there is no such thing as inheritance - the common superclass is actually a macro, and as far as the compiler is concerned sockaddr and sockaddr_ll are completely unrelated.
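Putting the three steps together with the error checks, a minimal sketch might look like this. It is not my production code: the function names make_packet_address and open_bound_raw_socket are invented for the illustration, and I've swapped the SIOCGIFINDEX ioctl for the equivalent library call if_nametoindex. The header list follows the packet(7) man page. Creating the socket still needs root, so for an ordinary user it simply returns -1.

```cpp
#include <arpa/inet.h>        // htons
#include <cassert>
#include <cstring>
#include <linux/if_packet.h>  // sockaddr_ll
#include <net/ethernet.h>     // ETH_P_ALL
#include <net/if.h>           // ifreq, IFNAMSIZ, IFF_PROMISC, if_nametoindex
#include <string>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>           // close

// Build the link-layer address that bind() needs for one interface index.
// Zeroing the struct first matters: stray bytes would be taken as field values.
sockaddr_ll make_packet_address(int ifindex)
{
    assert(ifindex > 0);
    sockaddr_ll sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sll_family   = AF_PACKET;
    sa.sll_protocol = htons(ETH_P_ALL);
    sa.sll_ifindex  = ifindex;
    return sa;
}

// Open a raw socket bound to if_name, optionally promiscuous.
// Returns the descriptor, or -1 if any step fails.
int open_bound_raw_socket(const std::string &if_name, bool promiscuous)
{
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return -1;                        // typically: not running as root

    unsigned ifindex = if_nametoindex(if_name.c_str());
    if (ifindex == 0) {                   // no interface with that name
        close(fd);
        return -1;
    }

    // The cast is needed because sockaddr_ll and sockaddr are unrelated types.
    sockaddr_ll sa = make_packet_address(static_cast<int>(ifindex));
    if (::bind(fd, reinterpret_cast<sockaddr *>(&sa), sizeof(sa)) < 0) {
        close(fd);
        return -1;
    }

    if (promiscuous) {
        ifreq ifr;
        std::memset(&ifr, 0, sizeof(ifr));
        std::strncpy(ifr.ifr_name, if_name.c_str(), IFNAMSIZ - 1);
        if (ioctl(fd, SIOCGIFFLAGS, &ifr) < 0) {   // read the current flags
            close(fd);
            return -1;
        }
        ifr.ifr_flags |= IFF_PROMISC;              // add the promiscuous bit
        if (ioctl(fd, SIOCSIFFLAGS, &ifr) < 0) {   // write them back
            close(fd);
            return -1;
        }
    }
    return fd;
}
```

A caller would do something like open_bound_raw_socket("eth1", true), where the interface name is just an example, and then use read and write on the returned descriptor.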

Another wrinkle is the bind function itself. I use boost::bind all the time, far too much to want to type or read the fully qualified name, so my common header file contains "using boost::bind". That absolutely wipes out any attempt to use the socket function of the same name. The only way round it is to define a trivial function called socket_bind (or whatever you prefer), whose definition in its .cpp file studiously avoids using the common header file. It's only a nuisance, but it did take a little thought to come up with a reasonable workaround when I first ran into the problem.
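A minimal sketch of such a wrapper, assuming it lives in its own source file (socket_bind.cpp is just my suggested name) with a one-line declaration somewhere the rest of the code can see it:

```cpp
// socket_bind.cpp (hypothetical file name). This file must NOT include the
// common header containing "using boost::bind", so that the only bind
// visible here is the socket one.
#include <cassert>
#include <sys/socket.h>

// Thin forwarding wrapper around the socket library's bind().
int socket_bind(int fd, const sockaddr *addr, socklen_t len)
{
    assert(addr != nullptr);
    return ::bind(fd, addr, len);   // global-scope bind from <sys/socket.h>
}
```

Everywhere else, the call becomes socket_bind(my_socket, (sockaddr*)&sa, sizeof(sa)) and the name clash never arises.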

So, with all this done, I was receiving raw ethernet frames, doing my thing with them, and sending them on through the paired egress interface. Wonderful.

Except actually, not. The frames I was receiving were way longer than ethernet frames. Since I'm using jumbo-frame sized buffers (9000 bytes), I'd receive them OK but not be able to send them. But sometimes, they were even too large for that, and I wouldn't receive anything at all. And this was where things got really frustrating.

The first move, of course, was to check the MTUs (maximum permitted frame size under TCP and IP) on all the relevant interfaces. They were fine. Then I found a suggestion that TCP will use the MTU of the loopback interface, relying on the driver to straighten things out. So I set that down to 1400 too. It still made no difference.

At that point, my code didn't send ICMP messages for too-large packets, which a router or host is supposed to do. I spent a whole Saturday in a distress-coding binge writing my ICMP implementation, and changing my super-slick multi-threaded lock-free infrastructure to accommodate it. It did make a very small difference. Instead of just blasting away with giant frames, it would send each packet initially as a giant frame, then retransmit it later in smaller, frame-sized chunks. The data did get through, but at a pitiful rate with all those retransmissions and timeouts.

Finally, after much Googling, I discovered the "tcp segmentation offload" parameter. That made no difference. With more Googling, I also discovered "generic segmentation offload". That made things better, though still far from good. I had Wireshark running on all four interfaces - the two test systems running iPerf, and both interfaces on my system in the middle. (All this is running as VMs under VMware, by the way - see my earlier rant, er, reasoned discourse, about the problems I had trying to get Xen to work). Wireshark clearly showed packets leaving the first system as correctly sized ethernet frames, yet when they showed up at the second system they'd magically coalesced into jumbo frames.

After much cursing I found the third thing I had to turn off, "generic receive offload". The design assumption here is that practically all network traffic is TCP - which after all is true. So the receive path (in software for generic receive offload, though real or emulated hardware can do much the same) combines smaller TCP packets into huge ones, to reduce the amount of work done in the network stack. It's an excellent idea, since much of the overhead of network processing is in handling each packet rather than the actual data bytes. But of course it completely broke my application.

This is not one of the better documented bits of Linux. There is - of course - a utility to manage all this stuff, but it's so obscure that it's not part of the standard Ubuntu distribution. You have to explicitly install it. So a summary of what is required to solve this problem is:

    sudo -s
    apt-get install ethtool
    ethtool -K ethn tso off
    ethtool -K ethn gso off
    ethtool -K ethn gro off

All of this requires root privileges. Whoever wrote ethtool had a sense of humor - the '-K' option sets parameters, the '-k' option (lower case) shows them. It would have been too hard, I suppose, to think of a different letter for such fundamentally different operations.

With that done, my code sees the packets at their normal (no greater than MTU) size. Finally, I could get on with debugging my own code.

Monday 18 February 2013

Wakkanai - Tales from the Frozen North

A few years ago we visited Hokkaido, taking the train across the island to Abashiri and then driving our rented car all around the southern coast. We spent an especially memorable night in the wind capital of Japan, Erimo Misaki. When I wrote about that trip, I ended with "we're looking forward to returning in winter when everything is covered in snow". And so when Isabelle had a meeting planned in Sapporo in December, we just had to make an adventure out of it.

Since we'd already seen a lot of the southern half of Hokkaido, we decided to go north. The truth is that there isn't much to visit in northern Hokkaido - it's pretty, but extremely empty. We aimed for the northernmost town in Japan, Wakkanai, which is just a few miles from the northernmost accessible place in the whole country, Soya Misaki. It's also the basis of a pun in Japanese, since it sounds very similar to "I don't know" - "Where is this train going?", "Wakkanai". "But you're the driver, surely you must know". Actually, that's probably about the most interesting thing about Wakkanai.

First, though, we had to get there. We flew with JAL from San Francisco to Haneda. This is just so much better than arriving in Narita. First, it's a short subway or monorail ride from the city, or an almost-affordable taxi ride - compared to a long and infrequent train ride or the interminable limo bus (forget a taxi, which will cost at least $250). Second, it just works amazingly well. There are never any lines - for immigration, for check-in, for baggage. If US airports worked even a tenth as well it would be a huge improvement.

The next day we had free time in Tokyo. We had lunch at our favourite kaiten sushi restaurant, then went for a walk in the neighbourhood of our hotel in Ebisu, and discovered Shirokanedai Park. This is a nature wilderness smack in the middle of Tokyo - we could hear the Yamanote line trains as we wandered around the gorgeous autumn foliage. Cats and dogs aren't allowed, but nobody had told the ginger cat I caught on camera racing through the undergrowth. Later we had a very pleasant evening with old friends from the days when I used to visit Japan several times a year.

Next morning, return to Haneda. It's the fifth-busiest airport in the world - in the US, only Chicago and Atlanta handle more passengers. Yet it always seems deserted, and once again there were no lines, at all. On takeoff there were fantastic views of Fuji - it was a very clear winter morning. Sapporo had a generous cover of snow, with a lot more falling during the afternoon. The view from our twelfth-floor hotel room of the constantly changing snowfall was magnificent. I ventured outside only to buy the Japan Railways timetable - as on every trip - which turned out later to be a very wise purchase.

Next day Isabelle was busy. I decided to take the train somewhere just for the pleasure of it, and ended up in Noboribetsu on the coast. The name means "special climb", except it doesn't because Kanji are just used for their phonetic value in Hokkaido place names. "Betsu" is actually the Ainu word for river, and appears everywhere - always written as 別 meaning "special".

Noboribetsu is a typical small rural town, which is to say there's absolutely nothing going on there and it has a faintly derelict, mouldy feeling to it. I walked down to the harbour, which was of no interest at all, then found the only place that was open for lunch. It was no surprise to see practically everyone from the restaurant waiting on the platform for the return train to Sapporo.

The next day our train, the Super Soya, departed at 7.48 precisely - Japanese trains always run to time. Well, almost always, as we found later. The train chugs along for five hours, mostly spent on a single track winding through snow-covered fields in narrow valleys as it heads generally northwards towards Wakkanai. The views were magnificent. Finally, after a mountain pass and tiny villages, we saw the sea, and soon after we arrived at Wakkanai station.

Here we had to hurry. Not having a car, the only way to Soya Misaki was by bus. There's exactly one in the afternoon, and it leaves 33 minutes after the train arrives. In that time, we had to get our bags to the hotel, and get back to the station. We made it, with a few minutes to spare - just as well, because the only alternative was a taxi at absolutely astronomical cost (around $150 for the round trip).

The bus trundled along through Wakkanai, all grey and cold and snowy. At the new shopping mall - which seems to be the happening place in town - a group of old ladies, their weekly (monthly?) trip to the big city over, boarded the bus to return to their tiny fishing villages.

Finally we arrived at Soya Misaki, several other people getting off with us. This is the northernmost point in Japan that you can visit - there's a tiny uninhabited island that's further north but you'd need your own boat to get there. The southern tip of Sakhalin, in Russia, is about 40km away. On a clear day you really can see it, or so they say. This day was cold - below freezing, and with a strong breeze. There's very little to see, just a couple of monuments. We looked for the monument to the victims of KAL007 - the plane that in 1983 inadvertently overflew the USSR and got shot down. But we couldn't find it. Only later did we discover that there are several more monuments on top of the hill, a short walk from the shore but not possible in the 25 minutes before the return bus. And you really don't want to miss that, because the next one isn't for another 3 hours.

Once you've scanned the horizon for Sakhalin, and taken each other's pictures in front of the various monuments, the only thing left is to visit the northernmost tacky tourist junk shop in Japan. Every attraction in Japan has these (though there were none in Noboribetsu), selling much the same variety of made-in-China local handicrafts. This one was special, though, as definitively the furthest north in Japan. We bought some postcards and some dried squid, of which more later.

The only thing left now was to wait for the bus. There's a shelter, but it's set back from the road. And we really didn't want to miss it. It was very cold and windy, and the idea of spending another three hours in the gift shop was enough to make me happy to brave the cold. The sight of the bus, when it arrived (precisely punctual of course), was truly the best part of the excursion.

We trundled back to Wakkanai, sharing our dried squid with a couple of Taiwanese girls who were on the same pilgrimage as us. It tasted funny. Later we discovered that this wasn't the classic Japanese snack, which is enjoyable in a chewy and getting-stuck-between-your-teeth kind of way, but an "improved" version made from New Zealand squid soaked in corn syrup. Yuck.

Once back in town, there really wasn't much to do. From our room we had a view over the generously-sized but completely empty harbour. In summer, Wakkanai is the departure point for the much-visited islands of Rebun and Rishiri, but in winter they're covered in snow and nobody goes there. There are also - maybe - boats that visit from Russia.

For dinner, we chose the most famous restaurant in town. Oddly, it's called kuruma-ya, written 車屋, which means "car dealer". We were almost the only people there, but we had an excellent meal including gigantic Hokkaido crab legs and much other seafood.

Our flight next day was at 1.30pm, so to occupy the morning we took a taxi out to Cape Noshabu, the northernmost point in the town itself. There's not much there, so after the obligatory snowy picture, we continued along the coast road. It is very bleak. There are a few houses in the snow, facing out over the cold, grey ocean, surrounded by rickety wooden structures for drying seaweed (kombu). You can see the two islands, grey mountains looming out of the mist and chill.

We still had time when we returned. We discovered the Russian shopping street. Even though the hotel clerk said there are no more tourist boats from Russia, there is this whole street of tourist-type stores, with everything labelled in Russian. Compared to Khabarovsk or Sakhalin, Wakkanai is the sunny south. But the street was almost deserted and looked very sad and run-down. Maybe the boats run in summer.

Finally it was time to go to the airport. It was sunny when we left, but when we arrived just 20 minutes later there was a blizzard and you could barely see across the runway. We waited, and waited - the display said something about "investigating the weather conditions" which wasn't very encouraging. Finally Isabelle had exhausted the possibilities of the tiny gift shop and we started to go through the security. Just then there was an announcement, and the guard held up his hands in a big "X". What had we done wrong?

It took only a moment to realise that the flight had finally been cancelled. It was very lucky that we hadn't gone airside, because it meant we were in the first few people in the queue at the desk. But what on earth to do now? There's exactly one flight a day to Tokyo, and even if we got on the next day's flight it could just as well be cancelled too. That would be a big problem, since our flight back home was that night. There are a couple of flights to Sapporo, but they're on little commuter turboprops - there were more than enough people waiting to fill the next two flights.

The alternative was the train, again. Luckily Isabelle had retrieved our bags, and with them the JR timetable (moral: never let it out of your hands). I realised we might just make the midday train, which left for Sapporo in half an hour. But how to get back to the town? The taxi rank, outside in the blizzard, was deserted.

Isabelle rushed outside and talked to a guy who had exactly the same reflex as us. Somehow, despite not having a word in common, she communicated with him and agreed to share a taxi, should one arrive. Which it did, in the nick of time. Constantly encouraged by our new friend, the driver went as quickly as he could on the snow and ice covered roads. We arrived at the station seven minutes before the departure time, enough to buy our tickets and see the blessed train arrive.

We had no idea what we'd do once the train took us back to Sapporo, but it was much better to be stuck in Sapporo than Wakkanai! The Tokyo-Sapporo air route is the busiest in the world, so getting back in time for our flight home would not be a problem. JR timetable to the rescue again - it showed that we should easily be in time for the last couple of flights.

This train journey was painful, unlike the previous day. The train was much older, and stank of diesel fumes. There was no food service, so we survived on a few nuts that we had with us. It was overheated and stuffy, and for much of the trip it was dark outside. But still, we weren't stranded in Wakkanai! And for part of the way, we sat at the front of the train getting a driver's-eye view which is much better than looking sideways. Tiny country stations flashed by, no more than a short platform and a lamppost - yet surely a lifeline to some remote community. The track was completely buried in the snow, only the shiny tops of the rails showing. Visibility was terrible, no more than a couple of hundred yards in the swirling blizzard, and you can certainly appreciate the driver's task, having to slow down for invisible bends, known only through his minute knowledge of every twist and turn in the mountainous track.

One thing we noticed is how polite Japanese trains are. American trains have incredibly loud horns that you can hear from miles away. Japanese trains just have a little high-pitched whistle that goes "pheeeep" - you can barely hear it from inside the train. It's as though the train is saying "shitsurei itashimasu" (the super-polite version of "excuse me") in a squeaky voice just like the shop ladies.

Eventually we arrived in Sapporo. And we were late! A full quarter hour behind schedule, which is practically unknown in Japan, and most likely due to the terrible conditions in the mountains.

Finally we got to the ANA desk at the airport - no queue of course. The lady was very helpful, took all our details, told us there was room on the last flight - and then showed us the price. It was over $1000! I'd been kind of expecting that - I'd managed to call ANA from the train and they weren't very helpful, with our "Explore Japan" tickets which are much cheaper than the published fare. Without much hope, I explained again that we hadn't chosen to travel this way, that the flight had been cancelled and all the rest. There was much "sooouu desu nee" and typing and phone calls. And then - mirabile dictu - she handed us two boarding passes. With my few words of Japanese I'd just saved over $1000!

The plane and then the subway delivered us to Shinagawa station at exactly midnight. We couldn't believe the crowds. Trains departing outbound were so full that people were left standing on the platform. It needed the white-gloved crowd-pushers from the Yamanote line. It was almost impossible to get through all the people to the taxi rank outside. It was the second-last Friday before the Christmas holiday, and evidently Tokyo's salary-men (and women) were partying to the full.

We'd barely eaten all day, so dinner was a priority. When we arrived from the US we'd discovered an Italian wine bar on the wrong side of the tracks at Ebisu station, practically in the catacombs down at street level. We rushed back there. Every few minutes one of us would say, with a sense of wonder, "We're not in Wakkanai!". It was such an incredible relief to be back in Tokyo.

The next day, Saturday, was our final day, but the JAL flight from Haneda doesn't leave until midnight. We went shopping in Ginza, which I haven't done for a very long time. It's an amazing place, in such a Japanese way. The quality of everything on sale is just astounding, whether it's food, household stuff, porcelain... anything. I could have spent the whole day just revelling in the place, admiring things that normally I'd never even pay attention to. I found a little etched brass model of the Toyota FJ, which I just couldn't resist (although it is still sitting wrapped, accusingly, on my desk, waiting to be built). It comes from a lovely series of architectural models, allowing you to build things like a typical Tokyo street scene. And of course I just had to visit Tenshodo, one of the world's greatest model train shops, three tiny floors crammed full of every make in the world (even Hornby from Britain, I wonder who on earth buys that in Japan), and all within a few paces of the Ginza Crossing.

And so, finally, to Haneda, and our flight home. Wakkanakatta.

Saturday 9 February 2013

Ubuntu + Xen = Disaster!

I'm writing this as I reinstall Ubuntu 12.10 on my new test machine. My idea - to test some networking software I'm working on - was to buy a new high-power desktop machine (done), install Ubuntu on it (done), then run Xen to create a simulated network.

Xen is really the open source software from hell. Nothing works the way it's apparently supposed to, and actually in the end nothing works at all. You search around on Google looking for people who've had the same problem, you try the fixes they suggest. Sometimes they work, taking you a few milliseconds further towards failure, other times they have no effect. It's just incredibly frustrating.

So now I'm reinstalling the system and I'll run VMware instead. It's a shame, I liked the idea of Xen, being free and open source and all that. But unless you just have way too much time on your hands, it's hopeless. For me, this whole business of building systems is a distraction from the important matter at hand, which is writing code for my own software. The less time I spend on it, the better.

The first problem was the book I bought. I was on a business trip for a week, so I thought the ten hours or so on the plane, and lonely nights in a hotel, would be the perfect occasion to get myself up to speed. I bought "Running Xen" by Matthews et al. The "et al" turns out to be extremely important. The book is terribly written, by Matthews' entire class of students, none of whom can write very well, and for sure can't write consistently. It never really quite tells you how to do anything, constantly distracting itself with dire warnings about all the bad things that can go wrong. Actually, given my experience with Xen, maybe that's not completely inappropriate.

I bought another book, "The Book of Xen". This is at least well written, and it probably isn't the authors' fault that none of the recipes they give for how to do things actually turn out to work.

Anyway, let's go back to when I didn't know how bad this all was. I installed Xen, made the necessary configuration changes, and rebooted. Woohoo! There I was running a hypervisor. I could type "xm list" and see it, complete with my one and only virtual machine (Dom0 in Xen-speak). So now, all I had to do was follow the recipe from the book, and I'd have my software - which was already running on the bare machine - running on a VM.

Everybody tells you that if you want your VM to use a file as its virtual disk, you absolutely shouldn't use the "loopback driver", you should use something called blktap. So naturally that's what I tried to do. I copied magic incantations into various configuration files, created my virtual disk, and tried to mount it.

Nothing worked. I tried all sorts of combinations of things, and of course I Googled all the error messages and the like. I found lots of suggestions, but nothing that actually helped. That was when the curtain started to open on the fire and brimstone of open source hell. There were loads of responses along the lines of "ah, but you need to install xxx then edit /etc/yyy". But xxx doesn't work on Ubuntu. Google. "xxx doesn't work on Ubuntu, but you can do xxxzzz instead". Well yes, but "xxxzzz" doesn't actually do what "xxx" does. And so on. One article even says "the great thing about Xen is that there are so many different ways to do the same thing." Maybe they had their tongue in their cheek. In any case they were wrong, because the truth is that there are so many ways to fail to do the same thing.

I was about to give up when I suddenly thought to try the dreaded loopback driver, which is supposed not to have good performance. (I've heard that Butler Lampson, one of the great minds in computer science, once said "Performance is a characteristic of a correctly functioning system" - in other words, it doesn't matter what the performance is, if it doesn't actually work. When I had the opportunity to ask him he denied having said it, but it's true anyway).

And... it worked! Replacing "tap:aio:" by "file:" suddenly had everything working correctly. I could mount my virtual device, treat it just like a disk, copy my host system onto it. Now I was ready for the next step, boot the VM with an exact copy of the Ubuntu 12.10 system that was running as the host.

I even tried going back to blktap for the guest system, figuring that maybe there was some kind of conflict with running it in the host. Well, that of course got nowhere. But I edited the config file and... it got another five whole lines further in the console log before failing in a different way.

I Googled the error messages, as you do with open source (open source would just never work at all without Google). The problem was that the root file system wasn't getting mounted. This may (or may not) have something to do with the way Xen deals with booting guests. Instead of doing what the hardware does, and executing files from the guest's disk image, it boots using the host kernel, then somehow flips over to the guest in mid-boot. Why it does this I don't know, but it creates whole chapters in the textbooks explaining how to deal with kernel incompatibilities and the like. And evidently I'd just been bitten by one of these - even though I was using the same system for the host and for the guest.

To cut a long story short, I tried a lot of things - Pygrub, virt-manager - and none of them would work. They all failed in some incomprehensible way, and Googling just produced confusing, conflicting advice, which typically started with "rebuild the kernel..." or "install this completely different toolset".

Yet Xen is in widespread use by cloud hosting companies - for example, by Amazon. I can only suppose that if you have the resources to experiment with different distros, kernel builds, toolsets and configurations, you can eventually find a combination which works. But it certainly isn't for the casual user like me, who just wants to get something working in a day or so. It may be that Ubuntu, or its latest version, is part of the problem. The Ubuntu web pages seem to say it should all work, but there are also a lot of references to things that don't quite work the way they're supposed to under Ubuntu.

I suppose I should have known, really. I worked for a while with a company which had a virtualized software product. They'd started with VMware as the base, and then customer pressure had forced them to port to Xen - just before I joined and found myself responsible for it. It was a nightmare - nothing worked, Citrix (the owners of Xen) were incapable of providing support, and the Xen open source community just laughed when we asked them for advice. "Oh, you're using the xyz toolset - nobody uses that any more, everyone is using pqr. The latest release is pretty good, it mostly works and there are quite a few patches for the stuff that isn't really there yet." The project over-ran by months and only "worked" thanks to numerous hacks and workarounds.

So, here I am reinstalling the machine. I'm annoyed about the time I've spent trying to understand Xen, and about the time I'll now have to spend getting to grips with all the utilities for VMware. But it surely can't be worse... can it?

Tuesday 5 February 2013

Tales from the cup shelf #1: Prague

For years now, whenever we travel (often), we try to buy a souvenir mug. Looking at the cup shelf the other day (well, there are three of them actually - I did say we travel often), I realised that collectively they amount to quite a story.

The first one comes from Prague, our first trip there together in 1993. The Communist era was barely over yet the city was already transformed, cleaned up and full of life. Music was everywhere. Walking around, we were constantly being given fliers for evening chamber concerts, while street performers gave excellent impromptu recitals on nearly every corner.

When I was a teenager - no mugs from then I'm afraid - I happened to buy an album of harpsichord recitals by an artist called Zuzana Ruzickova (give or take a few accents). I was enthralled by it; it's by far the best harpsichord album I've ever found. So you can imagine how I felt when one of the fliers was for a performance by none other than Ms Ruzickova herself. And it was extraordinary. Unlike a piano, a harpsichord can only produce notes of one intensity. No matter how gently or hard you hit the key, the sound is always the same. So the only way to make it louder is to play more notes. That's why harpsichord music always has passages with an incredible number of short notes - to make it louder. Just as I was wondering how she could possibly move her fingers that quickly, in a passage composed entirely of demi-semi-quavers, she doubled up again, to hemi-demi-semi-quavers. The effect was electrifying.

All of these concerts - we went to others too - were in delightful buildings, older than the music itself, with mysterious inner courtyards and staircases. These led to small rooms - whence chamber music - where you could literally reach out and touch the musicians (not that we did, though I did introduce myself to Ms Ruzickova after the concert, to thank her for such a wonderful introduction to what is often quite frankly a pretty boring instrument).

Of course we visited Wenceslas Square, site of Jan Palach's 1969 self-immolation. And the beautiful Vysehrad Park on top of the hill overlooking the Vltava, and Prazsky Hrad, the castle high up on the banks. Everyone does. But my memories are of other places, of the quiet cloistered side street leading to the unforgettable Blue Duck restaurant, where we spent a long and largely afternoon-destroying lunch. Or of the tram ride from our hotel, which was in the suburbs, rattling through bustling residential neighbourhoods to the city centre. And then there was the Tatra museum - a whole collection of those odd Czech luxury cars with their rear-mounted, air-cooled V8 engines.

The mugs, the perfect size for after-dinner coffee, were once six, but now only two remain. They've done better than the crystal glasses, though - of the twelve we bought, only one is left, and even that is chipped. (Quite how we brought all this booty back with us on the plane, I do not recollect.)

Wednesday 16 January 2013

My first Rest API

For several months I've been working on an application that I always knew would sooner or later need a Rest API added to it for configuration and monitoring. These days, if you don't have a Rest API, you're nobody - even Cisco has added one to IOS. This would be the first one I've created myself.

My goal was to make this as simple and as little work as possible. My only previous experience of Rest APIs was the exact opposite. The designers had gone out of their way to make the Rest API as complex and burdensome to implement as possible, spending whole days in minute reviews of parameter names and metadata like XSD files. They had also built an exceedingly complex implementation - a perfect illustration of the French expression une usine à gaz. The bits that needed to have high performance were in PHP, while the low performance string handling was done in C. Honest. Python was in the mix somewhere too, and you could probably find Cobol and APL if you looked hard. It was a big mess, and the exact opposite of what I wanted to achieve.

My main goal was to do as little work as possible, both in building the initial infrastructure and even more so when making additions later. I'm a one-man band on this project, 15,000 lines of code in the last six months, and the less work it takes to do a relatively peripheral thing like this, the better. That said, I'm always willing to spend a little longer getting the infrastructure right if it makes less work later on, and especially if it makes for less error-prone repetition.

I wanted to make it as "truly Restful" as possible, respecting the Rest orthodoxy. Though this turns out to be harder than you'd think. Most writings on the subject are distinctly obscurantist, leaving me with the same feeling as when I try to understand writings on philosophy or (worst of all) sociology - I know what all the words mean and I can pretty much figure out the sentences, but I have no clue what they are actually trying to say. A principle which is held in especially high esteem is called HATEOAS. People write pages and pages about how bad it is not to follow this principle, but nowhere have I found an example which illustrates what it actually means. I think it means that you should have lots of hyperlinks to associated resources, which is a good idea in human-oriented web stuff too. So that's what I did, although honestly it might mean something completely different.

My underlying code is all written in C++. The configuration interface is exposed through a singleton policy_manager class, which has functions like add_interface or show_acl that access the underlying C++ objects. I'm a big fan of using Python to do anything which isn't performance critical, so the first thing to do was expose this as a set of Python functions. Boost::python to the rescue for this part - it took only an hour or so to expose these functions in Python.

One thing I really did not want to do was to have to repeat each function declaration over and over for different parts of the interface. I already had the usual .h/.cpp files for the C++ code. It was a 15 minute task with emacs to transform the function declarations in the .h file into some macros that captured the essentials of the functions, i.e. the names and details of the parameters, in a way that could be expanded differently as required in the different places.

The C pre-processor (CPP) is very limiting in what it can do. To me the nec plus ultra of integrated macro facilities has always been the DEC assemblers. They didn't do much more than CPP, but the little there was made a huge difference to the power and flexibility. Of course there's always M4 - I used it for a big project once in the past and it is amazingly powerful, but I'd prefer to do without the extra build complication, not to mention remembering how to use it.

I came up with what amounts to an Application Specific Language to describe my functions. It's not especially pretty but it gets the job done. Oh, for a "shift" function (like the shell) to deal elegantly with variadic parameters.

All that done (it actually didn't take very long), I wanted an RPC so that my Rest server didn't have to be in the same process as the operational stuff. Pyro was the answer to that. It's a truly amazing little package, that lets you export a Python object with close to zero effort. My RPC server is about a dozen lines of code. It creates the Python object corresponding to my singleton policy_manager, and exports it. Problem solved, with amazingly little work.

The next step was to select a framework for the Rest server itself. I looked at Django, but it's really designed to do a lot more than just serving web pages, and is correspondingly rich and complex. I settled on Flask, which has a very intuitive and simple way of relating URLs to the code that serves them.

To minimize the amount of work needed, I made every class work the same way. For the POST method, there is an add_... function, for GET a show_... function and so on. So the heart of the server is a table which maps Rest prefixes to the family name of the function. As a trivial example, the prefix 'interfaces/' maps to "interface", so a GET to 'interfaces/' results in a call to show_interface. There's some generic code to do things like extracting names from URLs (e.g. 'interfaces/eth0/hosts/') and then, depending on the verb, constructing the corresponding function call.
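To give a flavour of the idea, here is a minimal sketch in plain Python. It is not the actual server code: the 'hosts/' entry, the helper names parse_url and function_name, and the assumption that URLs simply alternate collection and instance name are all mine, for illustration only.

```python
# Illustrative sketch only; apart from 'interfaces/' -> "interface" and the
# add_/show_ naming, every name here is an assumption.

# The mapping table: Rest prefix -> family name of the backend functions.
PREFIX_TO_FAMILY = {
    "interfaces": "interface",
    "hosts": "host",  # hypothetical second class
}

# HTTP verb -> function-name prefix. Other verbs would be added the same way.
VERB_TO_PREFIX = {
    "POST": "add",
    "GET": "show",
}


def parse_url(path):
    """Split e.g. 'interfaces/eth0/hosts/' into (family, instance names).

    Assumes the path alternates collection/name/collection/...; the last
    collection decides which family of functions gets called.
    """
    parts = [p for p in path.split("/") if p]
    collections = parts[0::2]
    names = parts[1::2]
    return PREFIX_TO_FAMILY[collections[-1]], names


def function_name(verb, path):
    """Build the backend function name and its name arguments for a request."""
    family, names = parse_url(path)
    return VERB_TO_PREFIX[verb] + "_" + family, names
```

With this, function_name("GET", "interfaces/") gives the name show_interface with no name arguments, and adding a class really is one more line in the table.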

One tricky point was the construction of links in the Rest output. The underlying C++ code knows nothing about Rest, and certainly not about the specific URL structure in use, and I want to keep it that way - so it can't generate explicit links. The solution is to pass a class name as well as the leaf instance name - e.g. {'class':'interface', 'name':'eth0'} - as the value for a link to another object. In the C++ code these attributes are pointers, so the generic attribute output code understands that pointers should be passed in this form. Then the Rest server code has a post-scanner that takes the output from the show_... functions and, using its class-to-URL mapping table, turns these into fully formed URLs.
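A post-scanner of this kind might look roughly like the sketch below. Again this is an illustration rather than the real code: the table contents, the names is_link and expand_links, and the flat '/interfaces/eth0/' URL shape are assumptions.

```python
# Illustrative sketch only; the table contents, the helper names and the
# flat URL layout are assumptions.

# Class name -> URL prefix (the reverse of the prefix-to-family table).
CLASS_TO_URL = {
    "interface": "interfaces",
    "host": "hosts",  # hypothetical second class
}


def is_link(value):
    """A link is a dict holding exactly a 'class' and a 'name' key."""
    return isinstance(value, dict) and set(value) == {"class", "name"}


def expand_links(value, base="/"):
    """Walk the backend output, replacing every link dict with a URL string."""
    if is_link(value):
        return base + CLASS_TO_URL[value["class"]] + "/" + value["name"] + "/"
    if isinstance(value, dict):
        return {k: expand_links(v, base) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_links(v, base) for v in value]
    return value  # plain strings, numbers and so on pass through untouched
```

The point of doing it this way round is that only the server's table knows the URL layout, so the layout can change without touching the C++ side.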

One nice thing is that the Python code in the Rest server has no idea what is and isn't available for each class. If a class doesn't support POST, for example, then the server will generate a call to add_<class> anyway. Pyro will eventually determine that there is no such function, returning a corresponding error, which the Rest server turns into a 405 (Method Not Allowed) error.
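The shape of that error handling can be sketched without Pyro at all. In the sketch below a local stand-in object (FakeBackend, my invention) plays the part of the remote policy_manager, and I assume the missing function shows up as an AttributeError; how the real remote call reports it may well differ.

```python
# Illustrative sketch only: FakeBackend stands in for the remote object,
# and the use of AttributeError is an assumption about how the missing
# function is reported.

class FakeBackend:
    """Hypothetical stand-in that supports GET (show_) but not POST (add_)."""

    def show_interface(self, *names):
        return {"name": names[0] if names else None}


def call_backend(backend, func_name, *args):
    """Call the named backend function, returning (HTTP status, body)."""
    try:
        # The server does not check what exists; it just tries the call.
        result = getattr(backend, func_name)(*args)
    except AttributeError:
        # No such function: the class does not support this verb.
        # (A real server would want to tell this apart from an
        # AttributeError raised inside the function itself.)
        return 405, {"error": "method not allowed"}
    return 200, result
```

Here call_backend(FakeBackend(), "add_interface", "eth0") comes back with status 405, with no per-class knowledge anywhere in the server.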

I made a decision early on to support only Json. This seems to be pretty normal now. XML generates a lot more work around metadata, for no added functional value. It wouldn't be hard to also generate XML, but it's hard to see why it would be useful.

In the end it took about three days of work, spread out over the Christmas and New Year break, to get a fully functioning Rest interface. There has been quite a bit more work since in the backend, as I've realised how information should be presented. The Rest server is about 250 lines of Python. Each new class adds exactly one more line, to the mapping table. And there's no PHP!