Tuesday 30 October 2018

Kotlin Part 2 - a real world example for Kotlin

In Part 1 I described my pleasure at finding what seemed to be, on the face of it, an alternative to Python for larger programs where compile-time type safety is essential, and then the difficulties I ran into when I actually tried to use it. But in the end, I got a working program which could access our system's Rest API using the khttp package. It was time to move on and start building the pieces needed for a Kotlin replacement for our Python CLI.

Our system generates in real time the metadata for its Rest API, retrievable via another Rest call. This describes each object class, and each attribute of each class. The attributes of a class include its name, its datatype, and various properties such as whether it can be modified. The result of a Rest GET call is a Json string containing a tuple of (name, value) for each requested attribute. The value is always passed as a Json string. For display purposes that is all we need. But sometimes we would like to convert it to its native value, for example so we can perform comparisons or calculate an average across a sequence of historical values.

In Python, this is easy - a good consequence of the completely dynamic type structure. We keep an object for each datatype, which knows how to convert a string to a native value, and vice versa. When the conversion function is called, it returns a Python object of the correct type. As long as we are careful never to mix values for different attributes (which we don't have a use case for), everything works fine. If we did happen to, say, try to add a string to a date, we would get an exception at runtime, which we could catch.

In C++ it's harder, because of course there is complete type checking. But for our backend code, which is busily transforming data for tens of thousands of flows and millions of packets per second into Rest-accessible analytics, it is necessary.

The key is a C++ pure virtual base type called generic_variable. We can ask an attribute to retrieve from a C++ object (e.g. the representation of a user or an application) its current value, which it returns as a pointer to a generic variable. Later we can, for example, compare it with the value for another object, or perform arithmetic on it.

The owner of a generic variable knows nothing about the specific type of its content. But he does know that he can take two generic variables generated by the same attribute, and ask them to compare with each other, add to each other and so on. They can also be asked to produce their value as a string, or as a floating point number.

What happens if you try to perform an inappropriate operation, like adding two enums, or asking for the float value of a string? You simply get some sensible, if useless, default.

This is very easy to do in C++. The code looks something like this:

template<class C> class typed_generic_variable : public generic_variable
{
    public:
        typedef typed_generic_variable<C> my_type;
    private:
        C my_value = C();
    public:
        typed_generic_variable(const C &v) : my_value(v) { }
        string str() const { return lexical_cast<string>(my_value); }
        void set(const string &s) { my_value = lexical_cast<C>(s); }
        my_type *clone() const { return new my_type(my_value); }
        bool less(const generic_variable *other) const
        {
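            // other is a pointer to const, so the cast target must be const too
            const my_type *other_typed = dynamic_cast<const my_type*>(other);
            return other_typed ? my_value < other_typed->my_value : false;
        }
        // add changes my_value, so it cannot be a const member function;
        // it returns true if the addition was actually performed
        bool add(const generic_variable *other)
        {
            const my_type *other_typed = dynamic_cast<const my_type*>(other);
            if (other_typed) {
                my_value += other_typed->my_value;
            }
            return other_typed != nullptr;
        }
        // and so on...
};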

The point here is that in this declaration, we can use the template parameter type C exactly as though it was the name of a class. We can use it to create a new object, we can use it in arithmetic expressions, we can invoke static class functions ("companion objects" in Kotlin). When the compiler deals with the declaration of a class like this, it doesn't worry about the semantics. It only considers that when you instantiate an object of the class. In the above case, if I try to create a typed_generic_variable<foo> where the foo class does not define a += operator, then the compiler will complain.
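
To make the mechanics concrete, here is a minimal self-contained sketch of the same idea. It is not our production code. The shape of generic_variable is an assumption reconstructed from the description above, and to_text, make_variable and generic_variable_demo are names invented for this example. A small stream-based helper stands in for lexical_cast so that nothing beyond the standard library is needed.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>

// Assumed shape of the base type: the text only says it is pure virtual.
class generic_variable
{
    public:
        virtual ~generic_variable() { }
        virtual std::string str() const = 0;
        virtual bool less(const generic_variable *other) const = 0;
        virtual bool add(const generic_variable *other) = 0;
};

// Hypothetical stand-in for lexical_cast<string>, using stream output.
template<class C> std::string to_text(const C &v)
{
    std::ostringstream os;
    os << v;
    return os.str();
}

template<class C> class typed_generic_variable : public generic_variable
{
    public:
        typedef typed_generic_variable<C> my_type;
    private:
        C my_value = C();
    public:
        typed_generic_variable(const C &v) : my_value(v) { }
        std::string str() const override { return to_text(my_value); }
        bool less(const generic_variable *other) const override
        {
            // Null unless other really is the same derived type as us.
            const my_type *other_typed = dynamic_cast<const my_type*>(other);
            return other_typed ? my_value < other_typed->my_value : false;
        }
        bool add(const generic_variable *other) override
        {
            const my_type *other_typed = dynamic_cast<const my_type*>(other);
            if (other_typed) {
                my_value += other_typed->my_value;
            }
            return other_typed != nullptr;
        }
};

// Hypothetical helper: roughly what an attribute does to wrap a native value.
template<class C> std::unique_ptr<generic_variable> make_variable(const C &v)
{
    return std::make_unique<typed_generic_variable<C>>(v);
}

// Example use; call this from main().
void generic_variable_demo()
{
    auto a = make_variable(3);
    auto b = make_variable(4);
    auto s = make_variable(std::string("x"));
    assert(a->less(b.get()));   // 3 < 4
    assert(a->add(b.get()));    // same type, so the add happens
    assert(a->str() == "7");
    assert(!a->add(s.get()));   // int + string: refused, value untouched
    assert(a->str() == "7");
}
```

The caller only ever holds pointers to generic_variable. Asking an int variable to add a string variable simply returns false and leaves the value alone, which is the "sensible, if useless, default" behaviour described earlier.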

Two very helpful C++ features here are dynamic_cast and lexical_cast. The former allows us to ask a generic variable whether it is in fact the same derived type as ourself, and to treat it as such if it is. The latter, originally introduced by Boost, makes it easy to convert to and from a string without worrying about the details.
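
For anyone who hasn't met lexical_cast, the idea is roughly the following. This is only a sketch of the concept, assuming the types involved have stream operators. The name simple_lexical_cast and the choice of exception are mine, and the real boost::lexical_cast is considerably more careful (and throws its own bad_lexical_cast).

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for boost::lexical_cast: write the source value to a
// stream, read the target type back, and fail if any input is left over.
template<class To, class From> To simple_lexical_cast(const From &from)
{
    std::stringstream ss;
    To result = To();
    if (!(ss << from) || !(ss >> result)) {
        throw std::invalid_argument("simple_lexical_cast: bad conversion");
    }
    // peek() returns eof once the whole text has been consumed.
    if (ss.peek() != std::stringstream::traits_type::eof()) {
        throw std::invalid_argument("simple_lexical_cast: trailing characters");
    }
    return result;
}

// Example use; call this from main().
void lexical_cast_demo()
{
    assert(simple_lexical_cast<int>("42") == 42);
    assert(simple_lexical_cast<std::string>(3.5) == "3.5");
    bool threw = false;
    try {
        simple_lexical_cast<int>("12abc");
    } catch (const std::invalid_argument &) {
        threw = true;
    }
    assert(threw);
}
```

One limitation of this simplified version: converting to a string stops at the first space, so text containing spaces is rejected as having trailing characters.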

I'll admit this looks quite complicated, but actually it's very simple to code and to figure out what is going on. The language doesn't require me to do anything special to make the type-specific class work. The code is no different than if I had explicitly coded variants for int, float, string and so on - except that I only had to write it once.

(In our actual implementation, we make extensive use of template metaprogramming (MPL), so in fact if I do try to create such a variable, the add function will simply be defined as a no-op. But that's more detail than we need for the Kotlin comparison.)
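
For the curious, the no-op trick can be illustrated without MPL. The sketch below is not what our code does: it swaps in C++17's if constexpr and a small detection trait to get a similar effect, and the names has_plus_assign, add_or_ignore and colour are invented for the example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Detects at compile time whether "C += C" is a valid expression.
template<class C, class = void>
struct has_plus_assign : std::false_type { };

template<class C>
struct has_plus_assign<C,
    std::void_t<decltype(std::declval<C&>() += std::declval<const C&>())>>
    : std::true_type { };

// Adds other to target when C supports +=. For any other type the addition
// is never compiled at all, and the function just reports that it did nothing.
template<class C> bool add_or_ignore(C &target, const C &other)
{
    if constexpr (has_plus_assign<C>::value) {
        target += other;
        return true;
    } else {
        return false;
    }
}

// An enum has no += operator, so adding two of them is the no-op case.
enum class colour { red, green, blue };

// Example use; call this from main().
void add_or_ignore_demo()
{
    int n = 3;
    assert(add_or_ignore(n, 4));
    assert(n == 7);

    colour c = colour::red;
    assert(!add_or_ignore(c, colour::blue));
    assert(c == colour::red);
}
```

Without the compile-time test, instantiating the function for colour would fail to compile because of the += line, which is exactly the complaint from the compiler described above.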

The goal in the Kotlin re-implementation was to use the same concept. I kind of assumed that its generic type feature, which uses the underlying Java machinery, would take care of things. I was sadly disappointed. But this is already too long, so more in Part 3.

Kotlin, Part 1 - oh well, nice try guys

It amazes me that new programming languages continue to appear, if anything even faster than ever. In the last few years there have been Scala, D, R and recently I came across Kotlin. At first sight, it looked like a good type-safe alternative to Python. It is one of several "better Java than Java" languages, like Scala, optimised for economy of expression. It runs on the system's JVM, meaning that you can ship a Kotlin program with a very high probability that it will run just about anywhere.

To save you reading this whole blog, here's an executive summary:

  • Kotlin is a very neat toy programming language, great for teaching and such
  • Its apparent simplicity fades very quickly when you try to do any real-world programming
  • Many things which are simple and intuitive to do in Python or C++ require very convoluted coding in Kotlin
  • In particular, Kotlin "generics" - Java-speak for what C++ calls templates - are completely useless for any real-world programming
  • Overall, Kotlin is always just frustratingly short of usable for any actual problem
  • That said, I guess it's fine for GUI programming, since it is now the default language for Android development

Most of my code is written in either C++ or Python. There's no substitute for C++ when you need ultimate performance coupled with high reliability. Because it is strongly typed, you can pretty much turn the code upside down and shake it (formally known as "refactoring") and if it compiles, there's a good chance it will work.

Python is fantastic for writing short programs, and very convenient as they get larger. All our product's middleware that does things like managing the history database, and our CLI, are written in Python. It's easy to write, and as easy as can be hoped to understand. But refactoring is a nightmare. If function F used to take a P as an argument, but now it wants a Q, there is no way to be sure you've caught all the call sites and changed them. One day, in some obscure corner case, F will get called with a P, and the program will die. This means you absolutely cannot use it for anything where reliability is vital, like network software. It's OK if a failure just means a quiet curse from a human user, or if there is some automatic restart.

So for a long time, I have really wanted to see a language with the ease of use and breadth of library support that Python has, coupled with compile time type safety. When I read the overview of Kotlin, I thought YES! - this is it.

I downloaded both Kotlin and the Intellij IDE, to which it seems to be joined at the hip, and wrote a toy program - bigger than Hello World, but less than a page of code. The IDE did its job perfectly, Kotlin's clever constructs (like the "Elvis operator", ?:) were easy to understand and just right as a solution. I was very happy.

Our CLI and associated infrastructure has really got too big for Python, so it was the obvious candidate for transformation to Kotlin. Basically it is a translator from our Rest API to something a bit more human friendly, so the first thing needed is a Rest friendly HTTP library. Two minutes with Google found khttp, which is a Kotlin redo of the Python Requests package, which is exactly what we use. Perfect.

Well, except it doesn't form part of the standard Kotlin distribution. I downloaded the source and built it, with no problems. But there seems to be absolutely no way to make a private build like this known to the Kotlin compiler or to Intellij. I searched my whole computer for existing Java libraries, hoping I could copy it to the same place. Nothing I did worked.

The khttp website also included some mysterious invocations that can be given to Maven. Now, if Java programming is your day job, well, first you have my every sympathy. But second, you're probably familiar with Maven. It's an XML based (yuck!) redo of Make, that is at the heart of all Java development. (Well, it used to be, now apparently the up and coming thing is Gradle - why would you only have one obscure, incomprehensible build system when you can have two?)

So, all you have to do is plug this handful of lines into your Maven files, and everything will work!

Except... Intellij doesn't actually use Maven. I (once again) searched my whole computer for the Maven files I needed to modify, and they weren't there. After a lot of Googling, I finally found how to get it to export Maven files. Then I edited them according to the instructions, and ran Maven from the command line using these new files. And - amazingly - it worked. By some magic it downloaded hundreds of megabytes of libraries, then built my Kotlin program - which ran and did what I wanted. And if I ran it again, it found all the hundreds of megabytes already there, and just ran the compiler. When I ran my little program, it fired off Rest requests and turned the Json results into Kotlin data structures. Perfect, exactly what I wanted.

But as I said, Intellij doesn't actually use Maven. Goodness knows what it does use, under the covers. So now I had to create a brand new Maven-based project, using my existing source file and my precious Maven config. And now, with Maven having put all the libraries where the compiler is expecting to find them, Intellij's own build system would build my program. In theory there is a place where you can tell Intellij where to find packages on the web, which ought to have been perfect. But in practice, when you get to the right page, it shows an empty list of places, and has no way to add to it. I guess probably there's an undocumented configuration file you can edit.

That's a good point to break off. In Part 2, I'll talk about my experience trying to build a real-world application using Kotlin.