Tuesday 30 October 2018

Kotlin Part 2 - a real world example for Kotlin

In Part 1 I described my pleasure at finding what seemed to be, on the face it, an alternative to Python for larger programs where compile-time type safety is essential. And then the difficulties I ran into when I actually tried to use it. But in the end, I got a working program which could access our system's Rest API using the khttp package. It was time to move on and start building the pieces needed for a Kotlin replacement for our Python CLI.

Our system generates in real time the metadata for its Rest API, retrievable via another Rest call. This describes each object class, and each attribute of each class. The attributes of a class include its name, its datatype, and various properties such as whether it can be modified. The result of a Rest GET call is a Json string containing a tuple of (name, value) for each requested attribute. The value is always passed as a Json string. For display purposes that is all we need. But sometimes we would like to convert it to its native value, for example so we can perform comparisons or calculate an average across a sequence of historical values.

In Python, this is easy - a good consequence of the completely dynamic type structure. We keep an object for each datatype, which knows how to convert a string to a native value, and vice versa. When the conversion function is called, it returns a Python object of the correct type. As long as are careful never to mix values for different attributes (which we don't have a use case for), everything works fine. If we did happen to, say, try to add a string to a date, we will get an exception at runtime, which we can catch.

In C++ it's harder, because of course there is complete type checking. But our backend code, which is busily transforming data for tens of thousands of flows and millions of packets per second into Rest-accessible analytics, it is necessary.

The key is a C++ pure virtual base type called generic_variable. We can ask an attribute to retrieve from a C++ object (e.g. the representation of a user or an application) its current value, which it returns as a pointer to a generic variable. Later we can, for example, compare it with the value for another object, or perform arithmetic on it.

The owner of a generic variable knows nothing about the specific type of its content. But he does know that he can take two generic variables generated by the same attribute, and ask them to compare with each other, add to each other and so on. They can also be asked to produce their value as a string, or as a floating point number.

What happens if you try to perform an inappropriate operation, like adding two enums, or asking for the float value of a string? You simply get some sensible, if useless, default.

This is very easy to do in C++. The code looks something like this:

template<class C> class typed_generic_variable : public generic_variable
{
    public:
        typedef typed_generic_variable<C> my_type;
    private:
        C my_value = C();
    public:
        typed_generic_variable(const C &v) : my_value(v) { }
        string str() const { return lexical_cast<string>(my_value); }
        void set(const string &s) { my_value = lexical_cast<C>(s); }
        my_type *clone() const { return new my_type(my_value); }
        bool less(const generic_variable *other) const
        {
            my_type *other_typed = dynamic_cast<my_type*>(other);
            return other_typed ? my_value < other_typed->my_value : false;
        }
        bool add(const generic_variable *other) const
        {
            my_type *other_typed = dynamic_cast<my_type*>(other);
            if (other_typed) {
                my_value += other_typed->my_value;
            }
        }
        // and so on...
}

The point here is that in this declaration, we can use the template parameter type C exactly as though it was the name of a class. We can use it to create a new object, we can use it in arithmetic expressions, we can invoke static class functions ("companion objects" in Kotlin). When the compiler deals with the declaration of a class like this, it doesn't worry about the semantics. It only considers that when you instantiate an object of the class. In the above case, if I try to create a typed_generic_variable<foo> where the foo class does not define a += operator, then the compiler will complain.

Two very helpful C++ features here are dynamic_cast and lexical_cast. The former allows us to ask a generic variable whether it is in fact the same derived type as ourself, and to treat it as such if it is. The latter, originally introduced by Boost, makes it easy to convert to and from a string without worrying about the details.

I'll admit this looks quite complicated, but actually it's very simple to code and to figure out what is going on. The language doesn't require me to do anything special to make the type-specific class work. The code is no different than if I had explicitly coded variants for int, float, string and so on - except that I only had to write it once.

(In our actual implementation, we make extensive of template metaprogramming (MPL), so in fact if I do try to create such a variable, the add function will simply be defined as a no-op. But that's more detail than we need for the Kotlin comparison).

The goal in the Kotlin re-implementation was to use the same concept. I kind of assumed that its generic type feature, which uses the underlying Java machinery, would take care of things. But I was sadly disappointed. But this is already too long, so more in Part 3.

No comments: