Wednesday 1 May 2013

Living without boost::has_member_function

I needed to write a function that would do one thing if its parameter class had a particular member function, and another if it didn't. It seemed like this should be straightforward with a bit of Template Metaprogramming (MPL). But it wasn't. Here's the story of how I found a solution.

In more detail... our software has lots of collections whose contents are all derived from the same class, named_object. There's a bunch of infrastructure which will dump objects and collections of them into various formats including highly abbreviated text for debugging, and Json for our Rest API. There are also a few collection classes which need to be used in the same way but aren't (and can't be) built out of the same class and collection structure.

The main collection base class provides a get_by_name function that does an efficient lookup, using the boost intrusive set class, and either returns the desired object or throws an exception if it isn't there. The other classes are typically held as linked lists, aren't derived from a common base, and don't provide a get_by_name function of their own since the details vary. However, they have enough in common - in particular, they provide a get_name function - that the stl::find_if function can be used to scan through the collection item by item, in O(n) time.

So what I wanted was a way to write a function that would see if the collection class had a get_by_name function, and use that if present. If not, it would use the fallback of stl::find.

Boost provides an extensive library called type_traits which can be used to do all sorts of clever things like this. For example, it's easy to provide a function which does something different if its parameter is a pointer, or supports ranking. But I looked in vain for the has_member_class metafunction. And a web search didn't find anything obviously helpful either. So... how to do it all by myself?

The key to much of metaprogramming is something called SFINAE (Substitution Failure Is Not An Error). This means that if an error occurs while evaluating the eligibility of a function (or a class partial-specialization), it isn't an error, it just removes that particular function overload from the candidate list. The Wikipedia article gives some simple examples. Using SFINAE, it's easy to define a function which will only be eligible if the required member function is present. One way is to define a template argument with a default which uses the member function in question:

    template<class C,
             typename C::value_type(C::*FN)(const string&)=C::get_by_name>
    typename C::value_type __get_by_name(const C *c, const string &name)       
    {
        return c->get_by_name(name);
    }    

Note that the type of get_by_name is spelled out in excruciating detail. This wasn't a problem in this case, so I didn't try very hard to avoid it. It seems to me that it ought to be possible, using a list of classes where every parameter is a distinct template argument, but I didn't look into it. And maybe not, which may explain why there isn't a boost::has_member_function. Anyway, this function will use get_by_name if it exists, and be deselected if it doesn't.

Now we just need the fallback function:

    template<class C>
    typename C::value_type __get_by_name(const C *c, const string &name)
    {
       typedef typename C::value_type V;
       return *std::find_if(c.begin(), c.end(), [=](V v)->bool
                                                  { return v->get_name()==name; });
    }

This uses find_if with a simple lambda function to extract the right object. (For clarity I've omitted dealing with what happens if it isn't present. I've also assumed that the container deals in pointers, not references - which isn't the case for the Boost intrusive classes. That led to quite a lot of complication in my actual code).

Unfortunately though, it doesn't work. It's fine if C doesn't have get_by_name - the first function is deselected, leaving only the fallback version. But if it does have get_by_name, both functions are eligible and the compiler (gcc at least) declares the call ambiguous.

The rules for deciding how to order functions when more than one is available are complex, and probably only fully understood by people who write C++ compilers. Basically, a function is preferred if it is "more specific", i.e. if there is a set of parameters that will satisfy the other function but not this one. On that basis, it looks as though the first function should have preference over the second. But it doesn't.

At this point in my explorations I gave up. It wasn't that hard to write a get_by_name function for the other handful of collection classes. But it annoyed me and I kept thinking about the problem.

The very least specific function is one with completely unspecified arguments: "foo(...)". However such a function can't do anything, since it knows nothing about its arguments. Also, such a function only accepts POD arguments, not including references. Something a bit less drastic is required.

The solution was obvious when I saw it: add another, completely unused, parameter, which has explicit type for the preferred function, and is a template for the other one:


    template<class C,
             typename C::value_type(C::*FN)(const string&) C::get_by_name>
    typename C::value_type __get_by_name(const C *c, int, const string &name)       
    {
        return c->get_by_name(name);
    }    


    template<class C, class I>
    typename C::value_type __get_by_name(const C *c, I, const string &name)
    {
       typedef typename C::value_type V;
       return *std::find_if(c.begin(), c.end(), [=](V v)->bool
                                                  { return v->get_name()==name; });
    }

A wrapper function just inserts a suitable argument:


    template<class C>
    typename C::value_type get_by_name(const C *c, const string &name)
    {
       return __get_by_name(c, 0, name);
    }

Et voila! This selects the function using the class's get_by_name if it exists, and the fallback function only if it doesn't exist. Problem solved.



No comments: