Monday 13 May 2013

On typename - and why C++ is a parser's nightmare

If you've done any significant C++ programming using templates, you'll certainly have run into the annoying rule requiring you to write "typename" before constructs of the form "class::member" if the class is actually a template parameter. For example:

template <class C>
class foo
{
     typedef typename C::value_type value_type;
};

If you miss out "typename" the compiler will complain. What's more, if it's GCC it will complain in a completely mysterious fashion, giving you no clue as to what the actual problem is.

And yet, surely it's obvious that this must be a typename? Why require all those extra keystrokes and visual clutter for something which is obvious? Every now and then I'd wonder about this, and read something which described the obscure situations where it isn't obvious. But I'd promptly forget, until the next time I spent ages pondering over unhelpful error messages until the little light came on - "aha, it wants a 'typename'!".

There's a good explanation of why actually it isn't obvious, to the compiler at least, here. But it took me trying to explain C++ parsing to someone for me to really get it.

C++ teeters on the hairy edge of total ambiguity the whole time, without you even realising it as a user. And one of the worst culprits is the apparently innocent reuse of the less-than and greater-than symbols as template brackets. Consider the perfectly clear snippet:

foo<bah> foobah;

It's blindingly obvious that this is declaring 'foobah' to be an object of class 'foo' instantiated with 'bah' (presumably another class) as its template parameter.

Well, except that if all three names are actually ints, those template brackets suddenly turn into relational operators. It's not a very useful code snippet, but it is syntactically correct. First compare foo with bah, creating a bool result. Then compare that with foobah, having first cast the latter to bool. Then throw the (pretty meaningless) result away.

You don't even need templates. The reuse of '*' for both multiplication and dereferencing can also lead to ambiguity. Combining the two can get exciting:

foo<bah> *fbptr;

Obviously a declaration of a pointer to a 'foo<bah>'. Unless of course foo and bah are both numeric and fbptr is a pointer to a numeric. Then it's a replay of the previous example.

This is all made worse because C++ (necessarily) allows you to refer to class members before they're defined. Consider the following example:

class c1
{
    template<class C> class d1 
    {
        ....
    };
    class e1
    {
        ....
    };
};

class c2 : public c1
{
    void fn()
    {
        d1<e1> f1;
    }
    ....
    int d1, e1, f1;
};

When the parser sees the apparent declaration of f1, everything it knows tells it that this is indeed a declaration. Only later does it come across other declarations that completely change the interpretation. I  wonder how compilers deal with this - it would seem necessary to hold two completely different parse trees and hope that one of them will make sense later. Just to make it more interesting, the class could also go on to redefine the '<' and '>' operators, so they apply to different classes.

Even lowly Fortran IV wasn't immune to this kind of problem. It had no reserved words, and gave no significance to spaces. So when the compiler saw:

DO 100 I=1

which is obviously the beginning of a DO statement (equivalent to 'for' in C++), everything hinges on the next character. If it's a comma, this must indeed be a DO statement. But if it's say '+', or there is no next character on this line, then it's an assignment to a variable called 'DO100I' - and of course Fortran also didn't requires variables to be declared, so they could just pop up like this.

I'm glad I don't write compilers for a living!

Saturday 11 May 2013

The End of The Russian Triode Saga

Today I sent my last package of Russian triodes, ending a saga which began 15 years ago at an audio tradeshow in London.

In 1997 I bought my flat in London - the best investment I ever made. I looked for an audio system, and rather to my surprise discovered that everything they said about valve (vacuum tube) amplifiers was true - they really did sound a lot better. I bought a valve system for London, and was then so dissatisfied with the system I had at my main home in France that I set about creating a system there too. That led me to build the extraordinary and wonderful Atma-Sphere MA60 transformerless (OTL) valve amplifiers, and to spend a lot of time investigating all the possibilities of this new-to-me but very old technology.

Somewhere along the line, I became fascinated by the massive Russian 6C33C-B power triode. This was built like the proverbial tank - with good reason, since that's almost where they were used. Supposedly they were built as voltage stabiliser valves for use in Mig jet fighters. Somewhere on the web I indeed saw a picture of some Russian avionics with one of them in it, so I suppose it's true. They're huge, and can pass a continuous current of over half an Amp. This is amazing for a valve - they are really high-voltage, low-current devices (high-impedance in electrical terms). Getting one to pass this kind of current, at a fairly low voltage, requires some very special engineering. For driving loudspeakers directly, without a transformer to turn volts into amps, they seem like the perfect device.

They've been used in a few commercial designs. Atma-Sphere at one time sold an amplifier using them. A company called BAT - which I think is still in business - had another. But - so the story goes - it's difficult to get them to work reliably. This must have been very reassuring to the Mig pilots.

I went to a high-end audio show in London, and got talking to the UK importer for the Sovtek Russian valves. He offered me a great price on the 6C33C if I bought a box of 50 of them. As it happened, I'd just seen somewhere that the factory in Russia had just realised the value of what they were making and were planning to increase the price - like by a factor of ten. At that time you could buy them direct from Russia for about $10 each.

"Aha," I thought. "I can get enough for my own projects, and make a tidy profit selling the rest when the price goes up". We talked some more and eventually I agreed to buy his whole stock of 400 valves, for a really excellent price. I wrote him a check, shook hands and that was that.

Several weeks later, a delivery truck showed up at my home in France. I couldn't believe my eyes! I had no idea that these things were so enormous! Each one came in a cardboard box about eight inches long and four inches square, beautifully protected by foam spacers and more cardboard inside. There were eight huge cartons, each one about two feet on each side. I stacked them all in the garage - where fortunately I had plenty of space - and started to wonder what on earth I was going to do with them.

I looked around at the pricing on the web, and decided I could sell them at a decent profit and still be cheaper than anyone else. I wrote an advert and put it on my own website - in 1997 running a website was a lot more complicated than it is today.

The response was modest, to say the least. I sold a handful during the remaining couple of years I lived in France. So when I moved to California in 2001, I still had my eight huge cartons. Well, mostly. They'd been tucked away in a back corner of the garage, where a combination of humidity and mice had reduced one of the cartons to crumbs, and several of the boxes inside as well.

Once in the US, I had more success. I was selling them way cheaper than the only other supplier outside Russia, and I had a special deal on a whole carton of 50. This made practically no profit, but my main goal was to get rid of them, not to make money. In fact any possible profit was wiped out by the storage I ended up renting so they didn't take up all the available space in my garage.

I think I sold four complete cartons like this. One led to the best pasta I've ever eaten. A guy in Japan bought them, and since I happened to be in Tokyo around that time we got together. He took me to a tiny place in a basement somewhere near Ebisu station where I had the most marvellous tagliatella carbonara ever, a melt-in-the-mouth creaminess that I've never experienced anywhere else.

It may seem surprising that the best Italian food should be in Tokyo, but it isn't really. The best French meal I've ever eaten (including living in France for ten years) was there too, as well as the best French patisserie. When the Japanese copy something, they do it extremely well!

I was amazed by the number of people who'd inquire about them, often sending several emails, and then just not place an order. But I guess it's the same as when you sell a car, and people spend ages on the phone asking every possible detail, and don't show up as arranged. Some people must just have far too much time on their hands.

The pile of boxes in the storage gradually shrunk. Then one day I took my car for service and got a huge pickup truck as a rental. (This was quite common, the local Enterprise offloaded whatever they had for these one-day rentals. I had a Cadillac deVille once, too. Verdict: interesting but don't sell the Audi). I realised the pile had shrunk enough that it would fit in the garage, and emptied the storage.

The pile continued to dwindle, until finally there was just one carton left, plus a few mouse-nibbled boxes that I was keeping for myself. Then, on May 5th 2013, someone ordered eight of them. This left only a dozen, which I'm keeping just in case I ever want to build a mega-transformerless amplifier with them. It's not likely, but then that's true for an awful lot of stuff I keep just in case. And I'be annoyed if I did want some, and had to buy them at the current retail price of $75.

So that's it, the end of the Russian triode saga. Now, does anybody want any 6CW4 Nuvistors by any chance? They do take a lot less space, but I have 100 of them...

Wednesday 1 May 2013

Living without boost::has_member_function

I needed to write a function that would do one thing if its parameter class had a particular member function, and another if it didn't. It seemed like this should be straightforward with a bit of Template Metaprogramming (MPL). But it wasn't. Here's the story of how I found a solution.

In more detail... our software has lots of collections whose contents are all derived from the same class, named_object. There's a bunch of infrastructure which will dump objects and collections of them into various formats including highly abbreviated text for debugging, and Json for our Rest API. There are also a few collection classes which need to be used in the same way but aren't (and can't be) built out of the same class and collection structure.

The main collection base class provides a get_by_name function that does an efficient lookup, using the boost intrusive set class, and either returns the desired object or throws an exception if it isn't there. The other classes are typically held as linked lists, aren't derived from a common base, and don't provide a get_by_name function of their own since the details vary. However, they have enough in common - in particular, they provide a get_name function - that the stl::find_if function can be used to scan through the collection item by item, in O(n) time.

So what I wanted was a way to write a function that would see if the collection class had a get_by_name function, and use that if present. If not, it would use the fallback of stl::find.

Boost provides an extensive library called type_traits which can be used to do all sorts of clever things like this. For example, it's easy to provide a function which does something different if its parameter is a pointer, or supports ranking. But I looked in vain for the has_member_class metafunction. And a web search didn't find anything obviously helpful either. So... how to do it all by myself?

The key to much of metaprogramming is something called SFINAE (Substitution Failure Is Not An Error). This means that if an error occurs while evaluating the eligibility of a function (or a class partial-specialization), it isn't an error, it just removes that particular function overload from the candidate list. The Wikipedia article gives some simple examples. Using SFINAE, it's easy to define a function which will only be eligible if the required member function is present. One way is to define a template argument with a default which uses the member function in question:

    template<class C,
             typename C::value_type(C::*FN)(const string&)=C::get_by_name>
    typename C::value_type __get_by_name(const C *c, const string &name)       
    {
        return c->get_by_name(name);
    }    

Note that the type of get_by_name is spelled out in excruciating detail. This wasn't a problem in this case, so I didn't try very hard to avoid it. It seems to me that it ought to be possible, using a list of classes where every parameter is a distinct template argument, but I didn't look into it. And maybe not, which may explain why there isn't a boost::has_member_function. Anyway, this function will use get_by_name if it exists, and be deselected if it doesn't.

Now we just need the fallback function:

    template<class C>
    typename C::value_type __get_by_name(const C *c, const string &name)
    {
       typedef typename C::value_type V;
       return *std::find_if(c.begin(), c.end(), [=](V v)->bool
                                                  { return v->get_name()==name; });
    }

This uses find_if with a simple lambda function to extract the right object. (For clarity I've omitted dealing with what happens if it isn't present. I've also assumed that the container deals in pointers, not references - which isn't the case for the Boost intrusive classes. That led to quite a lot of complication in my actual code).

Unfortunately though, it doesn't work. It's fine if C doesn't have get_by_name - the first function is deselected, leaving only the fallback version. But if it does have get_by_name, both functions are eligible and the compiler (gcc at least) declares the call ambiguous.

The rules for deciding how to order functions when more than one is available are complex, and probably only fully understood by people who write C++ compilers. Basically, a function is preferred if it is "more specific", i.e. if there is a set of parameters that will satisfy the other function but not this one. On that basis, it looks as though the first function should have preference over the second. But it doesn't.

At this point in my explorations I gave up. It wasn't that hard to write a get_by_name function for the other handful of collection classes. But it annoyed me and I kept thinking about the problem.

The very least specific function is one with completely unspecified arguments: "foo(...)". However such a function can't do anything, since it knows nothing about its arguments. Also, such a function only accepts POD arguments, not including references. Something a bit less drastic is required.

The solution was obvious when I saw it: add another, completely unused, parameter, which has explicit type for the preferred function, and is a template for the other one:


    template<class C,
             typename C::value_type(C::*FN)(const string&) C::get_by_name>
    typename C::value_type __get_by_name(const C *c, int, const string &name)       
    {
        return c->get_by_name(name);
    }    


    template<class C, class I>
    typename C::value_type __get_by_name(const C *c, I, const string &name)
    {
       typedef typename C::value_type V;
       return *std::find_if(c.begin(), c.end(), [=](V v)->bool
                                                  { return v->get_name()==name; });
    }

A wrapper function just inserts a suitable argument:


    template<class C>
    typename C::value_type get_by_name(const C *c, const string &name)
    {
       return __get_by_name(c, 0, name);
    }

Et voila! This selects the function using the class's get_by_name if it exists, and the fallback function only if it doesn't exist. Problem solved.