Wednesday, 16 January 2013

My first Rest API

For several months I've been working on an application that I always knew would sooner or later need a Rest API added to it for configuration and monitoring. These days, if you don't have a Rest API, you're nobody - even Cisco has added one to IOS. This would be the first one I've created myself.

My goal was to make this as simple and as little work as possible. My only previous experience of Rest APIs was the exact opposite. The designers had gone out of their way to make the Rest API as complex and burdensome to implement as possible, spending whole days in minute reviews of parameter names and metadata like XSD files. They had also built an exceedingly complex implementation - a perfect illustration of the French expression une usine à gaz (literally "a gasworks", meaning a needlessly convoluted contraption). The bits that needed to have high performance were in PHP, while the low performance string handling was done in C. Honest. Python was in the mix somewhere too, and you could probably find Cobol and APL if you looked hard. It was a big mess, and the exact opposite of what I wanted to achieve.

Above all, I wanted to do as little work as possible, both in building the initial infrastructure and even more so when making additions later. I'm a one-man band on this project, with 15,000 lines of code written in the last six months, and the less work it takes to do a relatively peripheral thing like this, the better. That said, I'm always willing to spend a little longer getting the infrastructure right if it makes less work later on, and especially if it makes for less error-prone repetition.

I wanted to make it as "truly Restful" as possible, respecting the Rest orthodoxy, though this turns out to be harder than you'd think. Most writings on the subject are distinctly obscurantist, leaving me with the same feeling as when I try to understand writings on philosophy or (worst of all) sociology - I know what all the words mean and I can pretty much figure out the sentences, but I have no clue what they are actually trying to say. A principle which is held in especially high esteem is called HATEOAS (Hypermedia As The Engine Of Application State). People write pages and pages about how bad it is not to follow this principle, but nowhere have I found an example which illustrates what it actually means. I think it means that you should have lots of hyperlinks to associated resources, which is a good idea in human-oriented web stuff too. So that's what I did, although honestly it might mean something completely different.

My underlying code is all written in C++. The configuration interface is exposed through a singleton policy_manager class, which has functions like add_interface or show_acl that access the underlying C++ objects. I'm a big fan of using Python to do anything which isn't performance critical, so the first thing to do was expose this as a set of Python functions. Boost.Python to the rescue for this part - it took only an hour or so to expose these functions in Python.

One thing I really did not want to do was repeat each function declaration over and over for different parts of the interface. I already had the usual .h/.cpp files for the C++ code. It was a 15 minute task with emacs to transform the function declarations in the .h file into some macros that captured the essentials of the functions, i.e. the names and details of the parameters, in a way that could be expanded differently as required in the different places.

The C pre-processor (CPP) is very limiting in what it can do. To me the nec plus ultra of integrated macro facilities has always been the DEC assemblers. They didn't do much more than CPP, but the little there was made a huge difference to the power and flexibility. Of course there's always M4 - I used it for a big project once in the past and it is amazingly powerful, but I'd prefer to do without the extra build complication, not to mention remembering how to use it.

I came up with what amounts to an Application Specific Language to describe my functions. It's not especially pretty but it gets the job done. Oh, for a "shift" function (like the shell) to deal elegantly with variadic parameters.

All that done (it actually didn't take very long), I wanted an RPC so that my Rest server didn't have to be in the same process as the operational stuff. Pyro was the answer to that. It's a truly amazing little package that lets you export a Python object with close to zero effort. My RPC server is about a dozen lines of code. It creates the Python object corresponding to my singleton policy_manager, and exports it. Problem solved, with amazingly little work.

The next step was to select a framework for the Rest server itself. I looked at Django, but it's really designed to do a lot more than just serving web pages, and is correspondingly rich and complex. I settled on Flask, which has a very intuitive and simple way of relating URLs to the code that serves them.

To minimize the amount of work needed, I made every class work the same way. For the POST method, there is an add_... function, for GET a show_... function and so on. So the heart of the server is a table which maps Rest prefixes to the family name of the function. As a trivial example, the prefix 'interfaces/' maps to "interface", so a GET to 'interfaces/' results in a call to show_interface. There's some generic code to extract the names from a URL (e.g. 'interfaces/eth0/hosts/1.2.3.4') and then, depending on the verb, to construct the corresponding function call.
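As a rough illustration of the idea (not my actual server code), here is a minimal sketch of such a table-driven dispatcher in plain Python, without Flask. The FakePolicyManager class, the "hosts" entry and the helper names parse_path and dispatch are made up for the example; only the POST/add and GET/show pairing and the 'interfaces/' mapping come from the description above.

```python
# Hypothetical stand-in for the policy_manager object reached over RPC.
class FakePolicyManager:
    def show_interface(self, *names):
        return ("show_interface", names)

    def add_interface(self, *names):
        return ("add_interface", names)


# Rest prefix -> function family name. One line per class.
# The "hosts" entry is an assumption made for this example.
FAMILIES = {
    "interfaces": "interface",
    "hosts": "host",
}

# HTTP verb -> function name prefix (only the two pairs named in the text).
VERBS = {"GET": "show", "POST": "add"}


def parse_path(path):
    """Split a URL path into (family, names).

    The path is read as alternating prefix/name pairs; the last prefix
    decides which function family gets called.
    """
    parts = [p for p in path.split("/") if p]
    family = None
    names = []
    for i in range(0, len(parts), 2):
        family = FAMILIES[parts[i]]  # KeyError for an unknown prefix
        if i + 1 < len(parts):
            names.append(parts[i + 1])
    return family, names


def dispatch(manager, verb, path):
    """Build the function name from verb and family, then call it."""
    family, names = parse_path(path)
    func = getattr(manager, VERBS[verb] + "_" + family)
    return func(*names)
```

With this sketch, dispatch(manager, "GET", "interfaces/") ends up calling show_interface with no names, and adding a new class means adding one line to FAMILIES.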

One tricky point was the construction of links in the Rest output. The underlying C++ code knows nothing about Rest, and certainly not about the specific URL structure in use, and I want to keep it that way - so it can't generate explicit links. The solution is to pass a class name as well as the leaf instance name - e.g. {'class':'interface', 'name':'eth0'} - as the value for a link to another object. In the C++ code these attributes are pointers, so the generic attribute output code understands that pointers should be passed in this form. Then the Rest server code has a post-scanner that takes the output from the show_... functions and, using its class-to-URL mapping table, turns these into fully formed URLs.
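A post-scanner of this kind could look something like the sketch below. It is a simplified guess at the shape of the thing rather than the real code: the URL_PREFIX table, the "acl" entry, the base URL and the function names are all assumptions, and it only handles top-level classes (a nested object such as a host under an interface would also need its parent's name).

```python
# Class name -> URL prefix. Hypothetical table for this example.
URL_PREFIX = {
    "interface": "interfaces",
    "acl": "acls",
}


def is_link(value):
    """A link is a dict with exactly the keys 'class' and 'name'."""
    return isinstance(value, dict) and set(value) == {"class", "name"}


def resolve_links(value, base="/"):
    """Walk the output recursively, replacing each link dict with a URL."""
    if is_link(value):
        return base + URL_PREFIX[value["class"]] + "/" + value["name"]
    if isinstance(value, dict):
        return {k: resolve_links(v, base) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_links(v, base) for v in value]
    return value  # plain strings, numbers and so on pass through untouched
```

The backend output stays free of URLs, and the only place that knows the URL layout is the table in the Rest server.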

One nice thing is that the Python code in the Rest server has no idea what is and isn't available for each class. If a class doesn't support POST, for example, then the server will generate a call to add_<class> anyway. Pyro will eventually determine that there is no such function, returning a corresponding error, which the Rest server turns into a 405 (Method Not Allowed) error.
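The pattern is roughly the following. This sketch uses a local object and Python's AttributeError in place of the remote Pyro call, which is an assumption made to keep the example self-contained; ReadOnlyManager and call_or_405 are invented names.

```python
# Hypothetical backend that supports GET (show_acl) but not POST (add_acl).
class ReadOnlyManager:
    def show_acl(self, *names):
        return {"acl": list(names)}


def call_or_405(manager, func_name, *names):
    """Call the named function blindly; map "no such function" to 405.

    Both the lookup and the call sit inside the try block because, with a
    remote proxy, the missing function may only be detected at call time.
    The trade-off is that an AttributeError raised inside the function
    itself would also be reported as 405.
    """
    try:
        func = getattr(manager, func_name)
        return 200, func(*names)
    except AttributeError:
        return 405, {"error": func_name + " is not supported"}
```

Here call_or_405(manager, "add_acl", "web") yields a 405 status without the server ever needing a list of which verbs each class supports.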


I made a decision early on to support only Json. This seems to be pretty normal now. XML generates a lot more work around metadata, for no added functional value. It wouldn't be hard to also generate XML, but it's hard to see why it would be useful.

In the end it took about three days of work, spread out over the Christmas and New Year break, to get a fully functioning Rest interface. There has been quite a bit more work since in the backend, as I've realised how information should be presented. The Rest server is about 250 lines of Python. Each new class adds exactly one more line, to the mapping table. And there's no PHP!