Thursday, 3 January 2019

Dr Larry Roberts, RIP - a personal retrospective

I learned yesterday that Dr Larry Roberts passed away on December 26th, at what counts these days as the relatively young age of 81. I had the good fortune to work closely with Larry at his company Anagran, and to keep working with him since then.

To me he was always just Larry, not Dr Roberts or Dr Larry. We were colleagues (though he was my boss) and we worked closely together. It was a privilege to know him this well. He was quite a humble man close up, though this wasn't at all the common perception of him. It fell to me to take his very visionary technical concepts, and turn them into something an engineer could go off and build. Let's just say there was sometimes quite a gulf between the two.

Larry created Anagran to pursue his conviction that flow-based routing was a superior technology that could change the way the whole Internet worked. That was fitting, because if any one person could be said to have created the Internet, it was Larry. Other people have made the claim, on the basis of having invented some core technology. But Larry was the person who convinced the US Government - which is to say the defense research agency, DARPA - to fund it. That made it accessible to every university and many private companies, long before most people had even heard of it: my then-employer was connected to the Arpanet, as it was called then, in the early 1980s.

A brief explanation of flow routing is as follows. Conventional (packet-based) routers look at every data packet - at the addresses and other information in it - to decide how to treat it and where to send it. It takes incredible ingenuity to do this fast enough to keep up with high-speed data transmission. Larry's idea was to do this work only once for each network connection (e.g. each web page), thereby amortizing it over, these days, hundreds of packets. It's not simple though, because for each packet you have to find the connection - the "flow" of flow routing - that it belongs to. This too requires considerable ingenuity. By 2000 or so, the engineering tradeoffs were such that flow routing was demonstrably cheaper to build. However the established vendors, especially Cisco, had invested huge amounts and large teams in the technology of packet routers, and weren't about to make the switch.
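To make the contrast concrete, here is a minimal, purely illustrative C++ sketch of the idea - every name in it is hypothetical, and it bears no resemblance to Caspian's or Anagran's actual hardware design. The expensive classification runs once, when a flow is first seen; every subsequent packet of that flow is a cheap hash-table lookup.

```cpp
#include <cstdint>
#include <functional>
#include <tuple>
#include <unordered_map>

// The classic 5-tuple identifying a flow.
struct flow_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t protocol;
    bool operator==(const flow_key &o) const {
        return std::tie(src_ip, dst_ip, src_port, dst_port, protocol) ==
               std::tie(o.src_ip, o.dst_ip, o.src_port, o.dst_port, o.protocol);
    }
};

struct flow_key_hash {
    size_t operator()(const flow_key &k) const {
        // Cheap mix of the fields; real hardware would use something stronger.
        size_t h = std::hash<uint32_t>()(k.src_ip);
        h = h * 31 + std::hash<uint32_t>()(k.dst_ip);
        h = h * 31 + std::hash<uint32_t>()((uint32_t(k.src_port) << 16) | k.dst_port);
        return h * 31 + k.protocol;
    }
};

struct forwarding_decision {
    int egress_port;
    int qos_class;
};

class flow_router {
    std::unordered_map<flow_key, forwarding_decision, flow_key_hash> flows;
public:
    int classifications = 0;   // counts how often the expensive path ran

    forwarding_decision route(const flow_key &k) {
        auto it = flows.find(k);
        if (it != flows.end())
            return it->second;              // fast path: flow already known
        ++classifications;                  // slow path: classify once per flow
        forwarding_decision d{ classify_egress(k), classify_qos(k) };
        flows.emplace(k, d);
        return d;
    }
private:
    // Stand-ins for the expensive per-packet work a conventional router
    // repeats every time: route lookup, policy, QoS classification.
    int classify_egress(const flow_key &k) { return int(k.dst_ip % 8); }
    int classify_qos(const flow_key &k) { return k.protocol == 6 ? 1 : 0; }
};
```

A real flow router also has to age flows out of the table when connections end, which this sketch omits - that bookkeeping is part of the "considerable ingenuity" mentioned above.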

In 1998 Larry created Caspian Networks to pursue this idea, attracting huge amounts of funding - over $400M during the life of the company. They did build a product, but the technology was barely ready for it. The result was large and expensive, and sold only to a handful of customers.

Larry realised this was the wrong approach. In 2004 he created Anagran, to apply the flow routing concept to a much smaller device. Thanks to a brilliant CTO from his past whom he enticed to join him again, this was achieved. The Anagran FR-1000 was about a quarter the size, power consumption and cost of its traditional equivalent, the Cisco 7600. Technically, it was a huge success.

I left Cisco to join Anagran as the head of engineering in 2006. It took us another year to get to a shippable product, and then we learned the sad truth. Networks were so critical to companies' business that they weren't about to take the risk of switching to an unknown vendor just to save a relatively tiny amount in their overall IT budget.

Larry was not just a visionary. Part of his concept for Anagran was a truly innovative way to manage traffic, based on completely new mathematics and algorithms. He had implemented this himself, as a proof of concept, in what must surely be the biggest spreadsheet ever created. It used every single one of the 32768 rows supported by Excel. If a single cell was changed, it took about 10 minutes to recalculate the sheet. The concept, once you understood it, was simple enough, but turning it into something that would deal with a real world traffic mix and that could be implemented in our hardware, was a big job. It occupied most of my time for over a year, and even today we are constantly improving it. The result is described in US Patent 8509074. It was through working on this together that we really got to know each other.

This turned out to be key to the survival of Anagran. We repurposed the hardware we had built, to use this algorithm to control users' traffic in a service provider network, and successfully sold it as such. The company's eventual demise was a result of being a hardware company: hardware has to be "refreshed", which is to say reinvented from scratch, every few years. And our revenue was not enough to sustain that. Software, on the other hand, just carries on, constantly under change but never needing to be started over. Cisco IOS probably still has lines in it from when it was first coded in 1983.

Larry was a genius and a visionary, but nobody can be everything at once. Some people found him overwhelming, and he could be brutally abrupt with people who didn't know what they were talking about. He was also a huge optimist when it came to Anagran's business prospects, which led to strained relationships with the investors.

Anagran finally closed down in 2011. I'm very pleased to say that Larry's brilliant flow management invention survives, since the company's assets - especially the patents - were purchased by the company I founded in 2013, Saisei Networks, and his work is very much still in use. We continued to work with Larry in his post-Anagran ventures and I saw him often.

We'll miss you, Larry, even - maybe especially - the times when your ability to see ahead of everyone else, and incomprehension that they couldn't see it too, made life challenging.

Rest in Peace.


Tuesday, 1 January 2019

Return to Anza Borrego

An Englishman can't really complain about the Bay Area weather, but it does get chilly and miserable around the end of the year. So we made a last-minute decision to escape to the desert for a few days. It was our fifth trip to Anza Borrego, and our second with our Toyota FJ - you can read about the first, exactly three years ago, here. In between we made one quick trip to see the wildflowers a couple of years ago, in a rented gigantic GMC Yukon XL, the only 4WD that Enterprise in San Diego could find for us. We nicknamed him Obelix, after the super-strong character in Asterix.

It's a long drive and it was dinner time when we arrived at our rented condo. The condo (really a house, joined by one wall to its neighbor) was very pleasant, thanks to vrbo.com - nice furniture, fantastic view, very comfortable enormous bed, to which we retired early. Just as well because we were awoken at 7am by the Grumpy Old Man next door, complaining that we were blocking his garage. We weren't, and as far as I could tell he didn't go out all day anyway, but Grumpy Gotta Grump. It was the perfect opportunity to make an early start, out at sunrise. But we went back to bed anyway, and it was after 11 by the time we started.

Day 1: Badlands, Truckhaven Trail


We wanted to revisit the badlands. It's an extraordinary place, visible from above at Font's Point. Driving through them is a completely different experience, only possible with a serious 4WD vehicle.

One frustration with the park is that there is no perfect map. The best paper map shows most trails, but not all of them - it would be too cluttered. The USGS 1:24,000 topo maps are amazingly detailed, showing trails and how they relate to other features. What's more, they're free to download to an excellent iPad app, which gives you GPS location and many other features. The only problem is that they are updated very infrequently for rural areas - maybe once every 50 years, if that. They show "Jeep Trails" which have long since been banished to wilderness areas or just disappeared, and they don't show trails which have been created lately - as in, within the living memory of most of the population. There are several important trails in Anza Borrego that fall into this category.

In this case we chose a trail which does appear on the topo map, though not on the paper map. It starts at the end of the dead-end road headed due north from where Yaqui Pass Road meets Borrego Springs Road and turns left. At first it seems like the driveway for a few houses and lots, but then it sets out confusingly eastbound, with several unmarked side trails. Eventually it joins Rainbow Wash, where you can turn left to the bottom of Font's Point, or right as we did. You need to turn left at the Cut Across Trail, which means keeping your eyes open because this is another that is not on the topo map. Most of it is just a sandy trail crossing several washes. At the end it enters the badlands, winding in and out of the landscape of low hills made of something between dried mud and sandstone. It had rained just before our arrival and there were green shoots everywhere - except in the badlands, which are truly lunar with not a plant in sight. The soil must be very alkaline.

Badlands in the setting sun
Winding through the badlands brings you eventually to Una Palma. At least, there used to be one palm - now its trunk lies on the ground. Five Palms, further along, has only four. We didn't count at 17 Palms.

The trail exits via Arroyo Salado onto the main road (S-22). There's a more interesting route, though, along the old Truckhaven Trail, which climbs out of the arroyo to the north-east. This road was built in the 1920s, the first road access to Borrego Springs. "Doc" Beaty led the effort by local ranchers, using mule-drawn scrapers to his own design.

I drove it on my last trip and found it mostly easy, climbing from one arroyo to another. There is one difficult stretch, bulldozed up the side of an arroyo to bypass a landslide further down. Even that, though steep and rocky, was easy enough if taken slowly and carefully. What a difference this time! The steep climb is very eroded and rocky. It requires very great care, constantly steering around and over big rocks. There is a second climb, part of the original 1920s road, which was just a steep dirt road before. Now it too is deeply rutted and full of big rocks. By chance I found the dashcam video of my 2015 trip, which shows the difference very clearly. Still, FJ made both climbs without a care in the world, using low gear but with no need for lockers.

Dinner on our second night was at Carlee's, the best bar in town (maybe the only one too), steak and ribs accompanied by margaritas and beer, followed by a few games of pool.

Day 2: Canyon Sin Nombre, Diablo Dropoff, Fish Creek


Today's goal was to drive through Canyon Sin Nombre (that's its name, No Name Canyon) then across to the Diablo Dropoff, a very steep one-way trail into Fish Creek Wash. We did this back in 2015, one of the classic Anza Borrego journeys. Sin Nombre is like a large-scale version of the badlands, with tall canyon walls made of similar crumbly almost-sandstone. There are lots of side canyons that you can hike into and explore.

The link to the dropoff is Arroyo Seco del Diablo, another long, twisty and spectacular canyon, and an easy drive. At least, it was last time. About half way through we came upon a stopped truck, whose crew of two were puzzling over how to traverse a large and very recent rockfall. There was no way either of our vehicles could climb over it. There was a possible bypass, which involved climbing onto and over a pile of soft sand about six feet tall. There were no tire tracks either over the rockfall, or over the sand pile, meaning we were the first people to try it.

We spent some time discussing possible tracks. Between us we were fully equipped, with shovels, jacks, traction boards and a winch. Still neither of us wanted to get stuck, and above all neither of us wanted to roll off the side of the sand pile.

Our new companion, Ryan, went first but didn't get far. He hadn't engaged lockers, and the wheels just spun in the deep sand as he tried to climb it. Worse, he slid alarmingly sideways. He backed down again, and we discussed some more, using the time to shovel the worst of the soft sand out of the way.

While he aired down to try again, I made my attempt. I'd already aired down to my usual 25 psi - not real airing down, like to 18 psi, but enough to make life easier for the tires over sharp rocks and such. I engaged low gear, turned on all the locking, made a running start at the hill... and bingo, there I was on top. I paused briefly but the car was at an awkward angle, way short of its rollover angle but still very uncomfortable. There was another tippy moment dropping off the hill and then... I was through!

Ryan followed shortly, after locking everything he could. Then we were off to the Diablo Dropoff. This is a pretty steep angle in a shallow canyon, in itself not too serious. But the trail has been very badly damaged by people trying to go up, their wheels spinning and making deep holes in the sandy surface. The challenge is to negotiate these without losing lateral control, which is to say sliding sideways to a bad conclusion. From within the vehicle it's not too bad, though the occasional slight sideslip as a wheel goes into a hole certainly gets your attention. It looks a lot worse from outside.

There's a second drop further down, a bit easier in my opinion, and a bit of moderate rock crawling at the bottom. And then you're in Fish Creek, which is an easy sandy wash. We drove upstream as far as Sandstone Canyon, which is like a smaller version of Titus Canyon in Death Valley, winding through the narrow gap between high sandstone walls. We got about half way in before encountering the rockfall which has blocked it for years. There are tire tracks over the rocks and deeper into the canyon, but neither we nor our new companions were ready to try that.

We'd been so absorbed by all these events that we hadn't eaten lunch, and now it was 4pm and the sun was setting fast. We found a place in the main canyon where we could catch the very last of the sun while we feasted on cheese and crackers. From there it's a long drive out to the hard road, taking nearly an hour, with continuous magnificent scenery.

Our final stop was the Iron Door, the dive bar in Ocotillo Wells which is a great place for a post-trail beer. And nothing else. The very first time we went there, my partner asked for tea. "We got beer" was the response. "OK, I'll have a beer" - a wise reaction.

Day 3


Our main goal for today was a repeat run up Rockhouse Road. We did this during our wildflower visit, with Obelix who for his size did a surprisingly good job on the narrow twisty upper part of the trail.

Inspiration Point and the Dump Trail


But first, I wanted to visit Inspiration Point. This is another viewpoint over the badlands, a little north of Font's Point, with its own trail from the main road. The paper map shows the trail continuing westwards towards the main road again, though none of this is depicted on the topo map. And indeed there's a short but steep dropoff which goes straight into a very narrow, twisty track between the low hills of the western badlands. There were plenty of tire tracks, which is always encouraging, especially when you don't have a good map to help you at ambiguous junctions, of which there were plenty.

Just after one of them, we came to an impassable rock fall in the bottom of the narrow canyon. Even if we could have climbed over or round it, the trail disappeared on the other side, replaced by a deep sand drift. We backed up to the last junction, and spotted some tracks that climbed out of the shallow wash. We followed these as they twisted around, the original canyon always in sight to the left, sometimes very close, sometimes further off. The other tracks gradually faded away until finally we were following the traces of just one vehicle, which had probably passed in the last 24 hours. We hoped he knew what he was doing.

Eventually his tracks did a long, shallow S-turn down into the floor of the canyon. From there it was a straightforward drive along what at this point has the picturesque name of the Dump Trail. The reason eventually becomes clear, at a crossroads on the corner of the county dump. The paper map shows the trail simply ending there, which seems improbable - and very annoying if true. By now there were lots of tracks again, so there must be some way out.

Eventually, after a few exploratory wanderings, we followed the dump's fence south and then west, ending up on its access road. From there it was a short drive to the main road.

Rockhouse Road


Rockhouse Road provides access to the eastern end of the cutely named Alcoholic Pass, leading over the ridge from Coyote Valley. We'd thought about hiking up it - we did it once in the opposite direction, on our very first visit, stopping at the ridge. But there was a strong, cold wind, so we drove on. We went further up the trail than we had with Obelix, onto the narrow part that eventually leads to Hidden Spring. It was very rocky and in poor condition, so we decided to stop and have our lunch. It was so windy that we ate inside FJ, something we normally never do. While we were eating we were passed by two FJs racing along the trail. I guess they made it to the end - we saw them again later on the main road.
Looking down from Rockhouse Road

The view from our lunch spot was spectacular, from several hundred feet above the valley floor and Clark Dry Lake. This time there was no carpet of wild flowers, but the ocotillos were just starting to bloom, with their bright red flowers contrasting with their deep green leaf-covered stems.

Font's Point and Vista del Malpais


There were still a couple of hours before sunset when we reached the main road. We've always visited Font's Point, the classic overview of the badlands, so that's where we went. It's an easy drive up a very wide sandy wash - I've done it a couple of times in 2WD rental cars. You just have to be careful to stay in the tire tracks and avoid any deep sand - though I understand rental cars routinely get stuck. Once we saw one that had barely made it off the highway before burying itself up to the hubs in sand.

Badlands, from Vista del Malpais
As we were coming back down the wash, I noticed a Jeep zip off into a side turning, Short Wash. I'd seen it on the map but never before managed to figure out where it was - the topo map doesn't show it. It's always interesting to drive a new trail, but this one had something else: a side trail to a place called Vista del Malpais (Badlands View). That seemed interesting, so we turned right. None of this is shown on the topo map, so finding the side trails was a challenge. We found the turnoff using clues from the bends shown on the paper map. A narrow, twisty trail led through the badlands, ending before a final short hike to the ridge. The view was breathtaking, much closer than at Font's Point. We soaked up the view, then turned back onto Short Wash.

We were a little surprised, maybe a quarter mile later, to see a sign for Vista del Malpais up another side track. We followed it, along a bigger trail that ended in a small parking lot on the ridge. The real Vista del Malpais was very impressive too, but we were happy to have found our very own one.

After that it was back to the house. Dinner that night was at La Casa del Zorro, Borrego Springs' only "fancy" restaurant, conveniently only a mile from our house. We've eaten there before and it was decent, but this time we were not so impressed. In future we'll probably stick to Carlee's and the other everyday places in the town. And then, next morning, up early for the long drive up I-5 back home.

Tuesday, 30 October 2018

Kotlin Part 2 - a real world example for Kotlin

In Part 1 I described my pleasure at finding what seemed to be, on the face of it, an alternative to Python for larger programs where compile-time type safety is essential - and then the difficulties I ran into when I actually tried to use it. But in the end, I got a working program which could access our system's Rest API using the khttp package. It was time to move on and start building the pieces needed for a Kotlin replacement for our Python CLI.

Our system generates in real time the metadata for its Rest API, retrievable via another Rest call. This describes each object class, and each attribute of each class. The attributes of a class include its name, its datatype, and various properties such as whether it can be modified. The result of a Rest GET call is a Json string containing a tuple of (name, value) for each requested attribute. The value is always passed as a Json string. For display purposes that is all we need. But sometimes we would like to convert it to its native value, for example so we can perform comparisons or calculate an average across a sequence of historical values.

In Python, this is easy - a good consequence of the completely dynamic type structure. We keep an object for each datatype, which knows how to convert a string to a native value, and vice versa. When the conversion function is called, it returns a Python object of the correct type. As long as we are careful never to mix values for different attributes (for which we have no use case), everything works fine. If we did happen to, say, try to add a string to a date, we would get an exception at runtime, which we can catch.

In C++ it's harder, because of course there is complete type checking. But for our backend code, which is busily transforming data for tens of thousands of flows and millions of packets per second into Rest-accessible analytics, it is necessary.

The key is a C++ pure virtual base type called generic_variable. We can ask an attribute to retrieve from a C++ object (e.g. the representation of a user or an application) its current value, which it returns as a pointer to a generic variable. Later we can, for example, compare it with the value for another object, or perform arithmetic on it.

The owner of a generic variable knows nothing about the specific type of its content. But he does know that he can take two generic variables generated by the same attribute, and ask them to compare with each other, add to each other and so on. They can also be asked to produce their value as a string, or as a floating point number.

What happens if you try to perform an inappropriate operation, like adding two enums, or asking for the float value of a string? You simply get some sensible, if useless, default.

This is very easy to do in C++. The code looks something like this:

// Sketch only: generic_variable is the pure virtual base class described
// above, and lexical_cast comes from Boost.
template<class C> class typed_generic_variable : public generic_variable
{
    public:
        typedef typed_generic_variable<C> my_type;
    private:
        C my_value = C();
    public:
        typed_generic_variable(const C &v) : my_value(v) { }
        string str() const { return lexical_cast<string>(my_value); }
        void set(const string &s) { my_value = lexical_cast<C>(s); }
        my_type *clone() const { return new my_type(my_value); }
        bool less(const generic_variable *other) const
        {
            // Only meaningful if the other value really has our type.
            const my_type *other_typed = dynamic_cast<const my_type*>(other);
            return other_typed ? my_value < other_typed->my_value : false;
        }
        bool add(const generic_variable *other)
        {
            const my_type *other_typed = dynamic_cast<const my_type*>(other);
            if (other_typed)
                my_value += other_typed->my_value;
            return other_typed != nullptr;
        }
        // and so on...
};

The point here is that in this declaration, we can use the template parameter type C exactly as though it were the name of a class. We can use it to create a new object, we can use it in arithmetic expressions, we can invoke static class functions ("companion objects" in Kotlin). When the compiler deals with the declaration of a class like this, it doesn't worry about the semantics. It only checks them when you instantiate an object of the class. In the above case, if I try to create a typed_generic_variable<foo> where the foo class does not define a += operator, then the compiler will complain.
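There is a subtlety worth spelling out: C++ instantiates template member functions lazily, so the complaint only actually arrives when the offending member function is used. A tiny self-contained sketch (all names hypothetical) shows the rule:

```cpp
// Minimal illustration of lazy template instantiation - not our real code.
template<class C> struct wrapper {
    C value = C();
    // This body is only type-checked for a given C if add() is actually
    // called on a wrapper<C>; otherwise a C with no += is perfectly legal.
    void add(const C &other) { value += other; }
    C get() const { return value; }
};

struct no_arithmetic { };   // defines no operators at all
```

Here `wrapper<no_arithmetic>` compiles and can be created freely; only a call like `w.add(...)` on one would make the compiler complain about the missing += operator.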

Two very helpful C++ features here are dynamic_cast and lexical_cast. The former allows us to ask a generic variable whether it is in fact the same derived type as ourself, and to treat it as such if it is. The latter, originally introduced by Boost, makes it easy to convert to and from a string without worrying about the details.
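To show why lexical_cast is so convenient, here is a stripped-down, purely illustrative version built on std::stringstream - Boost's real lexical_cast is far more careful about errors and performance, but the principle is the same: anything with stream operators can be converted to and from a string generically.

```cpp
#include <sstream>
#include <string>

// A toy stand-in for boost::lexical_cast, to illustrate the idea only.
template<class To, class From>
To simple_lexical_cast(const From &v)
{
    std::stringstream ss;
    ss << v;          // anything with operator<< can become text...
    To result;
    ss >> result;     // ...and anything with operator>> can be read back
    return result;
}
```

So `simple_lexical_cast<int>(std::string("17"))` gives 17, and `simple_lexical_cast<std::string>(42)` gives "42" - which is exactly the job the str() and set() functions above delegate to it.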

I'll admit this looks quite complicated, but actually it's very simple to code and to figure out what is going on. The language doesn't require me to do anything special to make the type-specific class work. The code is no different than if I had explicitly coded variants for int, float, string and so on - except that I only had to write it once.

(In our actual implementation, we make extensive use of template metaprogramming (MPL), so in fact if I do create such a variable for a class with no += operator, the add function will simply be defined as a no-op. But that's more detail than we need for the Kotlin comparison.)

The goal in the Kotlin re-implementation was to use the same concept. I kind of assumed that its generic type feature, which uses the underlying Java machinery, would take care of things. But I was sadly disappointed. This post is already too long, though, so more in Part 3.

Kotlin, Part 1 - oh well, nice try guys

It amazes me that new programming languages continue to appear, if anything even faster than ever. In the last few years there have been Scala, D, R and, recently, Kotlin, which I just came across. At first sight, it looked like a good type-safe alternative to Python. It is one of several "better Java than Java" languages, like Scala, optimised for economy of expression. It runs on the system's JVM, meaning that you can ship a Kotlin program with a very high probability that it will run just about anywhere.

To save you reading this whole blog, here's an executive summary:

  • Kotlin is a very neat toy programming language, great for teaching and such
  • Its apparent simplicity fades very quickly when you try to do any real-world programming
  • Many things which are simple and intuitive to do in Python or C++ require very convoluted coding in Kotlin
  • In particular, Kotlin "generics" - Java-speak for what C++ calls templates - are completely useless for any real-world programming
  • Overall, Kotlin is always just frustratingly short of usable for any actual problem
  • That said, I guess it's fine for GUI programming, since it is now the default language for Android development

Most of my code is written in either C++ or Python. There's no substitute for C++ when you need ultimate performance coupled with high reliability. Because it is strongly typed, you can pretty much turn the code upside down and shake it (formally known as "refactoring") and if it compiles, there's a good chance it will work.

Python is fantastic for writing short programs, and very convenient as they get larger. All our product's middleware that does things like managing the history database, and our CLI, are written in Python. It's easy to write, and as easy as can be hoped to understand. But refactoring is a nightmare. If function F used to take a P as an argument, but now it wants a Q, there is no way to be sure you've caught all the call sites and changed them. One day, in some obscure corner case, F will get called with a P, and the program will die. This means you absolutely cannot use it for anything where reliability is vital, like network software. It's OK if a failure just means a quiet curse from a human user, or if there is some automatic restart.

So for a long time, I have really wanted to see a language with the ease of use and breadth of library support that Python has, coupled with compile time type safety. When I read the overview of Kotlin, I thought YES! - this is it.

I downloaded both Kotlin and the Intellij IDE, to which it seems to be joined at the hip, and wrote a toy program - bigger than Hello World, but less than a page of code. The IDE did its job perfectly, Kotlin's clever constructs (like the "Elvis operator", ?:) were easy to understand and just right as a solution. I was very happy.

Our CLI and its associated infrastructure have really got too big for Python, so they were the obvious candidate for transformation to Kotlin. Basically the CLI is a translator from our Rest API to something a bit more human-friendly, so the first thing needed is a Rest-friendly HTTP library. Two minutes with Google found khttp, a Kotlin redo of the Python Requests package - which is exactly what we use. Perfect.

Well, except it doesn't form part of the standard Kotlin distribution. I downloaded the source and built it, with no problems. But there seems to be absolutely no way to make a private build like this known to the Kotlin compiler or to Intellij. I searched my whole computer for existing Java libraries, hoping I could copy it to the same place. Nothing I did worked.

The khttp website also included some mysterious invocations that can be given to Maven. Now, if Java programming is your day job, well, first you have my every sympathy. But second, you're probably familiar with Maven. It's an XML-based (yuck!) redo of Make that is at the heart of all Java development. (Well, it used to be; now apparently the up-and-coming thing is Gradle - why have only one obscure, incomprehensible build system when you can have two?)

So, all you have to do is plug this handful of lines into your Maven files, and everything will work!

Except... Intellij doesn't actually use Maven. I (once again) searched my whole computer for the Maven files I needed to modify, and they weren't there. After a lot of Googling, I finally found how to get it to export Maven files. Then I edited them according to the instructions, and ran Maven from the command line using these new files. And - amazingly - it worked. By some magic it downloaded hundreds of megabytes of libraries, then built my Kotlin program - which ran and did what I wanted. And if I ran it again, it found all the hundreds of megabytes already there, and just ran the compiler. When I ran my little program, it fired off Rest requests and turned the Json results into Kotlin data structures. Perfect, exactly what I wanted.

But as I said, Intellij doesn't actually use Maven. Goodness knows what it does use, under the covers. So now I had to create a brand new Maven-based project, using my existing source file and my precious Maven config. And now, with Maven having put all the libraries where the compiler is expecting to find them, Intellij's own build system would build my program. In theory there is a place where you can tell Intellij where to find packages on the web, which ought to have been perfect. But in practice, when you get to the right page, it shows an empty list of places, and has no way to add to it. I guess there's probably an undocumented configuration file you can edit.

That's a good point to break off. In Part 2, I'll talk about my experience trying to build a real-world application using Kotlin.




Monday, 27 August 2018

Enlightenment for Hartmut

There's something very romantic about a lighted passenger train passing through the night. Mysterious voyagers on their way to mysterious destinations, seen briefly as they cross the night-silent countryside.

It looks good on the garden railway too, a train chugging along in the dark garden. You can imagine yourself standing on a hillside, the rural tortillard trundling through the night, taking a few sleepy farmers home from the market. So it has been my goal since the beginning to have all the passenger trains illuminated.

Two of them I did a while back, but the third was still waiting. The engine is the LGB Saxon IV K 0-4-4-0 Mallet, which we christened Hartmut - alongside his bigger brother Helmut, the big 0-6-6-0 Mallet (2085D) of uncertain prototype. He has an authentic train of three coaches from the Royal Saxon Railway.

The first two trains are Thomas, with his coaches Annie and Clarabel, and Marcel with his two French coaches. In Thomas's case, it was partly driven by necessity. He has completely rigid wheels, with no vertical freedom of movement. He would constantly get stuck because often only one wheel is in contact with the rail on one side. Between dirty track and intentional dead sections - for example, on points - that can just never work. The solution was to adapt one of the coaches to collect power too, and run a cable to the engine. While I was at it, it seemed simple enough to add lighting as well.

So Thomas and each of his coaches have 4-pin JST connectors to form a link throughout the train. The first coach has LGB pickups on both axles, and metal wheels. In the roof are a couple of 6V grain-of-wheat bulbs in series, from Micromark.

Inside Thomas is a buck converter from eBay that reduces the 18V track power to something suitable to drive the lights. In real life there were just a couple of oil lamps, lit at dusk by the guard, very different from the bright fluorescent lamps in modern trains. You wouldn't be able to read, you'd just about be able to make out your neighbours' features - should you want to. Running the nominal 12V light chain on 9V gives just the right dim, yellowish gloom. The lights are controlled by the DCC decoder in the engine, meaning you have to remember to turn them on.

When I first installed the lights in Marcel's train, I didn't bother with the power pickup in the coaches, since he is an LGB engine with pick-up skates and some vertical flexibility to the wheels. But even so he sometimes has trouble, especially on the siding pointwork. When I restored him to operation after his lengthy service as a guinea pig for my intelligent locomotive experiments, I added power pickup to the first of his coaches. Apart from that, the setup is identical to Thomas's.

One thing I realised is that there's no point in being able to control the lights. During the day, and even quite a long way into twilight, they are invisible, so it's harmless to have them on. And at night, you always want them on. That means there's no need to connect to the locomotive, which simplifies things a lot. Instead I fitted power pickups on the first coach. This also meant I could use smaller 2-pin JST connectors, which are easier to connect between the coaches and less likely to get in the way of the couplings.

The grain-of-wheat bulbs work well enough, but they are a pain to install. On the web I saw some LED light strips for LGB coaches, so I bought three of them. In fact they are just short segments of readily-available LED ribbon, with six LEDs, together with some connectors and a big (8200µF) capacitor for each one, so they don't flicker on dirty track. It would have been a lot cheaper and just as simple to have bought a 6-foot length of ribbon.

Finishing the job needed a little eBay buck converter, hidden on the floor of the coach, to turn the track power into something suitable for the LEDs. It turned out that 9V was just right for them, too. I used one of the 8200µF capacitors, so dirty track has no effect. The lights stay on for about 10 seconds even when the track power is turned off completely.
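
As a back-of-the-envelope check, that 10 seconds is consistent with the energy stored in the capacitor. The converter's dropout voltage and the LEDs' power draw below are assumptions for illustration, not measurements:

```python
def holdup_seconds(cap_farads, v_start, v_min, load_watts):
    # Usable energy between the two voltages, divided by load power.
    # Assumes the buck converter is ~100% efficient and keeps the LEDs
    # lit until its input falls to v_min.
    energy = 0.5 * cap_farads * (v_start ** 2 - v_min ** 2)
    return energy / load_watts

# 8200uF charged to ~18V track power; a dropout around 10V and the
# dim LEDs drawing roughly 90mW are both guesses.
t = holdup_seconds(8200e-6, 18.0, 10.0, 0.09)
```

With those guesses it comes out at about ten seconds, which squares with what I see on the layout.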

The little circuit board with (left to right) the big reservoir
capacitor, the voltage reduction board, and the
bridge rectifier.
In real life all trains carried - and still do - a red tail lamp. This is very important because it makes it easy to see whether part of the train has gone missing. The signalmen had to watch each train carefully to be sure the light was still there, day or night. If not, it was time to send an urgent message - a bell code of 4-5 in UK practice - to the previous signal box, so another train wasn't cleared to run into whatever was left behind. It has been on my list for a while to add a tail lamp to the long goods train, so I thought Hartmut's train should have one too.

The under-body wiring, with the wheels on the left side
removed.
In my box of LGB bits and pieces I found an LGB tail lamp. It's designed to clip on to a vehicle, and fits perfectly onto the veranda of Hartmut's coach. Whether that is the prototypically correct position, I have no idea. I searched for a picture of the tail end of an old-fashioned German passenger train, but to no avail. Goods trains evidently carried two tail lamps, one on each side as high as possible. The lamp comes with a bulb in a huge brass holder that would be very conspicuous. I replaced it with a red LED, the wiring concealed behind the veranda. The wiring hides a 2K2 resistor which reduces the LED's brightness, though having watched the train at night, I think a higher value (lower current, less light) would have been better.
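
Choosing the resistor is just Ohm's law across the LED. The 9V carriage supply and a forward drop of about 2V for a red LED are assumed here for the sake of the sketch:

```python
def led_current_ma(supply_v, led_vf, resistor_ohms):
    # Ohm's law across the series resistor, in milliamps
    return (supply_v - led_vf) * 1000.0 / resistor_ohms

i_fitted = led_current_ma(9.0, 2.0, 2200)  # the 2K2 I fitted: about 3mA
i_dimmer = led_current_ma(9.0, 2.0, 4700)  # a 4K7 would give roughly half
```

Something around 1.5mA from a 4K7 would probably give a more convincing oil-lamp glow.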

The LED strip installed in the roof, secured in place with two
little bridges of Sugru.
One unexpected problem I ran into is that the LED strips kept falling off the roof. They have double-sided sticky tape on the back, but it clearly wasn't up to the job. I added a layer of double-sided sticky foam, and then made little bridges of Sugru to be doubly sure. If you haven't used it, Sugru is wonderful stuff. It's a bit like epoxy putty, but much simpler to use since it doesn't have to be mixed. You open up a little foil envelope, take out a piece the size of the end of your thumb, and mould it to whatever shape you need. It sets reasonably hard after a few hours, and completely after 24 hours or so. And it lasts for ever. The first thing I ever used it for was to hold a heavy soap rack in place in the shower. After several years it is as strong as ever. I used quite a bit more to hold wires in place, especially underneath the carriages.

I put Hartmut and his train back together, keen to see them trundling around in the twilight. But I was disappointed. The lights came on, but Hartmut showed no signs of mobility. On the bench I discovered that his Zimo DCC decoder, which he has had since I first got him nearly 20 years ago, had died. Luckily I had a spare, but on that one the light outputs seem to be non-functional. Since Hartmut only ever goes in one direction, I just hot-wired the front lights to be on all the time. Running at night, I noticed that the interior light in the cab is way too bright, so that will need a resistor added somewhere.

After all that, Hartmut and his train are lit up like the others. They look very good and slightly mysterious as he chuffs slowly round the layout in the dark.

Hartmut and his train by twilight

Thursday, 23 August 2018

Flowers for Marcel - the end of my intelligent locomotive experiment

This weekend I finally abandoned my attempt to build an intelligent, "internet of things" locomotive for my garden railway. It was a very disappointing decision, because it so nearly worked. It worked on the bench, but the realities of a garden railway with dirty track, poor connections and so on meant that it was never reliable outside.

I'd chosen the most reliable and simplest of all my LGB locomotives, the Corpet-Loubet 0-6-0. This was the first one I ever bought, back in 1999, and had given perfect, trouble-free service ever since, using a Zimo decoder for DCC.

Objectives


The objectives for the project were:
  1. 100% compatibility with DCC - the loco should work on my DCC-enabled layout exactly like any other
  2. tight feedback control over speed, so the selected speed would be maintained regardless of load or layout voltage variations
  3. "non stop" operation to keep the loco moving over dirty track, dead spots and so on, using a substantial super-capacitor
  4. constant status reporting over WiFi of several things including motor current, track voltage and distance travelled
  5. in addition to DCC control, the ability to control speed, accessories, CV settings and everything else over WiFi

Design


I selected the Particle Photon as the microprocessor. It is about the size of a USB memory stick, yet has a powerful ARM processor running at 120 MHz, and built-in WiFi. My earlier experiments with adding WiFi to the Arduino had been a painful failure, so this was extremely important. Another important advantage is that its libraries are a superset of the Arduino's. As it turns out, however, it isn't really suited to this job - more on that later.

To meet the first requirement, I needed an implementation of a DCC decoder. Fortunately there is a very nice one out there, NmraDcc. It's written for the Arduino, but needed only one small change to run on the Photon - because the Photon does not support hardware timer interrupts. You write functions to handle speed changes, accessory operations and so on, and it calls them when the corresponding DCC commands are received.

For a computer to be useful, it needs to communicate with the world around it. I needed the following inputs:
  1. Track voltage (analog)
  2. Internal power supply voltage (analog)
  3. Supercap voltage (analog)
  4. Motor current (analog)
  5. Motor back-EMF, used to measure speed and hence keep it constant (analog)
  6. Axle position sensor, used to measure absolute train position (digital)
  7. DCC signal, via opto-coupler from track voltage (digital)
  8. Motor control signal feedback (digital)
  9. Motor over-current (digital)
  10. Accessory over-current (digital)
and the following outputs, all digital:
  1. Control signal to motor control FET, operated via built-in pulse width modulation (PWM)
  2. Control to motor reversing relay
  3. Accessory outputs, as many as possible (limited by number of available GPIO pins)
The first thing to get right was the power supply. The non-stop feature is built using two 30F 2.7V supercaps in series, giving a 5.4V 15F capacitor. This is kept charged by a tiny buck regulator from eBay, adjusted to give a constant 5.3V output. It has built-in current limiting to 1A. When the loco is placed on the track it takes about a minute for the cap to charge. A 3A boost regulator, also from eBay, then turns the 5.3V back into 16V to run the motor. An arrangement of hefty Schottky diodes (chosen for their lower voltage drop) normally takes current direct from the track, but as soon as the track voltage drops below 16V, the supercap provides the power. This arrangement is also used on my LGB track cleaner, where it works perfectly.
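
The arithmetic behind this is worth a quick sketch. Two identical supercaps in series halve the capacitance while the voltage ratings add, and a constant-current charge from empty takes C·V/I seconds:

```python
def series_supercap(c_each_f, v_each, n=2):
    # Identical caps in series: capacitance divides, voltage ratings add
    return c_each_f / n, v_each * n

def charge_seconds(cap_f, v_target, i_limit):
    # Constant-current charge from empty: t = C * dV / I
    return cap_f * v_target / i_limit

c_total, v_rating = series_supercap(30.0, 2.7)  # two 30F, 2.7V caps
t = charge_seconds(c_total, 5.3, 1.0)           # regulator limited to 1A
```

That gives about 80 seconds from completely flat - close enough to the minute or so I observe, given that the cap is rarely fully discharged and the 1A limit is only nominal.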

The power supply to the microprocessor comes via a second eBay buck regulator, which drops the supply voltage to 12V for the accessories (lights etc), then a 7805 linear regulator that provides a stable 5V supply. At least, it is supposed to, though that turned out to be one of the weak spots of the design.

The motor is controlled via a power MOSFET, an IRF9540 that can happily switch 20 amps. Reversing uses a relay. The conventional way to control a motor these days is via an H-bridge (e.g. an L298), but I couldn't see how to get the back EMF that way, so I went for a relay instead. (I did think of a way later, but by then I had built the board.)

The analog inputs are first scaled by a couple of resistors, so they will never be more than the 3.3V input range of the CPU, then connected to it via a 4K7 resistor. This ought to protect the CPU, but experience showed that it doesn't.
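
The scaling itself is a plain two-resistor divider. These values are illustrative, not the ones on my board:

```python
def divider_out(v_in, r_top, r_bottom):
    # Unloaded two-resistor voltage divider
    return v_in * r_bottom / (r_top + r_bottom)

# A 68k/10k pair keeps anything up to about 25V within the
# Photon's 3.3V input range.
v_adc = divider_out(24.0, 68e3, 10e3)
v_full_scale = 3.3 * (68e3 + 10e3) / 10e3  # input that just reaches 3.3V
```

In theory nothing the track can do should ever push the pin over 3.3V; in practice, as I found out, spikes get through anyway.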

The digital outputs are buffered via 4K7 resistors to a ULN2804 8-way Darlington. The accessory outputs, of which there are 6, are taken to a terminal block.

Software


That just leaves the software. There are several components to this:
  1. DCC decoder, using NmraDcc
  2. Basic motor and accessory control, turning the intended speed into a PWM ratio to control the motor voltage, direction into the sense of the reversing relay, and setting the accessory outputs
  3. Feedback-based motor control, reading the actual speed and adjusting the motor voltage so it matches the intended speed
  4. WiFi interface, sending and receiving messages.
  5. Generating status messages
  6. Interpreting commands received by WiFi
I invented a log format, which is sent once per second and includes not only the items to be monitored but also various internal variables that show how the motor feedback calculations are working. These messages are sent to an IP multicast address, so the loco does not need to be configured for where to send them. They contain the IP address of the loco, which can be used for sending commands back to it. A simple Python program listens to the messages and logs them, and allows commands to be typed and sent to the loco.
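
I won't reproduce the actual log format here, but the listener amounts to joining a multicast group and parsing lines as they arrive. The comma-separated format, group address and port in this sketch are hypothetical stand-ins, just to show the shape of it:

```python
import socket
import struct

# The real log format is my own invention; this assumes a made-up
# comma-separated line like "ip,speed,motor_ma,track_v,distance_mm".
def parse_status(line):
    ip, speed, motor_ma, track_v, dist = line.strip().split(",")
    return {"ip": ip, "speed": int(speed), "motor_ma": int(motor_ma),
            "track_v": float(track_v), "distance_mm": int(dist)}

def listen(group="239.0.0.1", port=4900):
    # Join the multicast group (address and port are assumptions) and
    # yield parsed status messages as they arrive.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    mreq = struct.pack("4sl", socket.inet_aton(group), socket.INADDR_ANY)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    while True:
        data, _ = sock.recvfrom(1024)
        yield parse_status(data.decode())
```

The nice thing about multicast is that the loco needs no configuration at all: anything on the network that cares can just join the group and listen.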

Speed Control


The hard part of the software is the feedback based speed control. The concept is simple enough: measure the actual motor speed, and adjust the motor control so that it matches the desired speed as set by the user. It seems simple, but control systems are always a compromise between agility, i.e. responding quickly to changed circumstances, and stability, which most importantly means not oscillating. In this case there's no point if it takes say ten seconds to respond, since the changed circumstances - like going round a tight curve - will likely have gone away.

Motor speed is measured by back EMF, the voltage that all electric motors generate in the opposite direction to the supply. Since the motor is controlled by turning the power on and off, it's straightforward to measure the voltage while no power is applied. A complication is that it is not at all linear with the motor speed, so the software needs to understand the actual relationship.

I also implemented direct speed compensation in response to supply voltage variation. As the supply voltage drops (due for example to dirty track) the motor feed is directly increased in proportion. This sounds like a good idea, but it does lead to problems as described below.

In practice I never found a set of control parameters that really worked. Anything that gave enough agility also led, some of the time, to control oscillation, meaning that the loco sped up and slowed down as it moved along at a constant speed setting.
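
The tradeoff is easy to demonstrate with a toy model - nothing like the real motor dynamics, but it shows the shape of the problem. With simple proportional control, a gain below 1 converges smoothly but slowly, while a gain much above 1 rings around the target:

```python
def simulate(kp, steps=40, target=10.0):
    # Toy plant: the speed moves by the full control output each tick,
    # so the error shrinks by a factor of (1 - kp) per step.
    v, history = 0.0, []
    for _ in range(steps):
        v += kp * (target - v)   # proportional control on the speed error
        history.append(v)
    return history

def overshoots(history, target=10.0):
    return sum(1 for v in history if v > target)

slow = simulate(kp=0.2)     # stable but sluggish
twitchy = simulate(kp=1.8)  # agile, but oscillates about the target
```

The real loco piles measurement noise, motor nonlinearity and varying load on top of this, which is why no set of parameters ever satisfied both requirements at once.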

Photon - the Good, the Bad and the Ugly


The Photon seemed the perfect part to use for the CPU in this project. It has a lot of good points, but in the end the bad points outweighed them:
  • it takes a long time to start. By default, it not only has to find and log on to the WiFi network, but also set up a connection to its cloud-based server. This can take up to 10 seconds. In the situation where it is rebooting because it has briefly lost power, this is completely hopeless. It's possible to program it to avoid the second part, but it still takes a couple of seconds before it starts running its program again. The PIC processors typically used on DCC decoders are running code within milliseconds, meaning that the outage passes unnoticed.
  • it's impossible to run fine-grained (microsecond) timers. For some reason to do with the cloud server, again, the finest timer resolution you can get is one millisecond. It would have been good to control the output FET directly in software, but that's impossible. Instead you have to use the on-board PWM, which significantly complicates the software.
  • it's electrically very fragile. A momentary signal over 5V applied to its inputs, even via a 4K7 resistor - meaning the current is limited to a milliamp - instantly destroys the whole chip.
  • it loses its configuration pretty easily - the WiFi credentials and worse, the credentials needed to contact the cloud server. Of the five Photons that gave their lives to this project (see previous point), one in particular lost it every time the software crashed. Reinstalling it is a painful process, requiring a physical USB connection to a computer and a whole series of arcane commands.
  • it's very difficult to debug. This is normal for a small embedded processor or Arduino, but the Photon is trying to be one step above this. There's no interactive debugger support, and no way to do "debug with print statements" either. Programs bigger than you could run on an Arduino are just impossible to develop as a result.
The Photon is no doubt a good part to use if there is a rock-solid power supply, it's interfaced only to gentle things using the same stable supply, and the program is pretty simple. But none of those applied in this case.

Practical Experience



The Photon board, carefully trimmed to fit - just! - in the available space.
The CPU is to the right, with the power controller to the left.
I spent a long time - several months, on and off - trying to make this work. Getting all the necessary electronics onto a board that would fit inside the loco was quite a challenge. It looks as though there is loads of room, until you try squeezing all the components in. There is also a fairly severe height limitation, especially at the sides in the water tanks. It would be easy enough using tiny surface mount parts, but they don't lend themselves to prototyping. Instead I used 0.1" stripboard, and a lot of wires.

The first Photon was a victim even before the board was built, when I was testing the basic circuit ideas using plug-in breadboards. Two wires touched and pfff! - the end of the first Photon. Another one died when I foolishly ran the board without it being firmly screwed down in the locomotive, and short-circuited the traces underneath it on some tool on the bench. That did a lot of damage, and not just the CPU. Two more just mysteriously died, for no obvious reason.

The feed-forward in the speed control turns out to have an unfortunate effect on dirty track. As the voltage drops, the loco tries to pull more current, the voltage drops further, and so on. If the initial voltage drop is large enough - i.e. the track is dirty enough - the track voltage drops so low that the loco stops. In the end it's better just to accept the loco slowing down.
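
You can see why this happens by treating the loco as a constant-power load behind the track's contact resistance. Solving P = V·I with V = V0 - I·R gives a quadratic that simply has no real solution once the resistance is too high: there is no steady operating point, so the voltage collapses. The resistance and power figures here are made up for illustration:

```python
import math

def load_current(v_source, r_track, power):
    # Constant-power load behind series resistance: solve R*I^2 - V*I + P = 0.
    # Returns None when there is no real solution -- the runaway case where
    # drawing more current only drops the voltage further.
    disc = v_source ** 2 - 4 * r_track * power
    if disc < 0:
        return None
    return (v_source - math.sqrt(disc)) / (2 * r_track)

clean = load_current(18.0, 0.5, 10.0)  # modest contact resistance: fine
dirty = load_current(18.0, 9.0, 10.0)  # very dirty track: no operating point
```

A loco that just slows down on dirty track is, in effect, reducing its power demand and staying on the solvable side of that quadratic.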

There was a similar problem with the non-stop circuit, which is trying to keep its supercap fully charged even if the track voltage is lower than normal. That also leads to the same kind of evil feedback loop.

The biggest problem, though, was with the power supply to the CPU. On the bench, everything worked perfectly. But out on the garden track, no matter how carefully I isolated it, no matter how many smoothing caps I introduced, the CPU would unpredictably reset itself. This wouldn't matter, except that (see above) it then took several seconds before the train would start to move again, if at all.

Lessons Learned


The main thing I learned from this is not to do it again. The benefit is really not worth it. That said, if I did do it again I would:
  • use a CPU board that is a plain microcontroller, rather than an "internet of things" cloud device.
  • enforce total galvanic isolation of the CPU from all of the ugly electrical side of things. There are chips available that do this even for analog signals.
  • build a power supply for the CPU which is galvanically isolated - there are some neat chips that do this - and protected in every possible way from any kind of electrical ugliness: supercaps to keep it running for tens of seconds, beefy zeners to protect against over-voltage spikes, and so on.
My plan for now is to re-purpose the board I built. It's still useful to have real-time monitoring of the electrical conditions around the track. I plan to put it into one of Marcel's carriages, just as a passive WiFi reporter of track voltage.

The End of the Experiment



Marcel's new simplified electronics - just a Zimo decoder and
a tiny board to make the 9V for the carriage lighting.
I might have persevered for longer, but summer time came round, and Marcel's two French coaches looked lonely sitting in their siding. I really wanted to get him back on the layout and running. It took me just an hour to construct a carrier board for his old Zimo decoder, and a tiny eBay buck converter to run the lights. Very soon he was chugging happily round the layout with his coaches in tow. He seems extremely happy, and probably very glad to get off my workbench and back outdoors.



Marcel on the layout with his short train of two French coaches








Sunday, 20 May 2018

First Trip to the Grand Canyon, 1983

In 1983 I attended a big trade show and conference in Las Vegas, with my friend and colleague Kevin. The purpose of our trip was to demonstrate the very first implementation of the new standard for connecting computers, OSI - long since eclipsed by the Internet.

We'd both spent a lot of time in the Boston area, home of our employer, Digital Equipment Corporation (DEC) - then one of the biggest computer companies in the world, though now long since defunct. But neither of us had been to the vast expanses of the West before. We decided that since we were so close, we should explore it, including a visit to the Grand Canyon. We rented a car, and set off. The car was a Renault 11, during one of the occasional brief periods when French car manufacturers tried to sell in the US. Even though contemporary American cars weren't that great, the Renault did not compare favourably. They didn't last long.

The first leg of the journey goes across the desert to Kingman, of which more later. Just before getting there, we saw a side turning to a place called Chloride, so we decided to go and see it. After a mile or so down a paved but unkempt road, we arrived. It was my first sight of a genuine ghost town. Well, not quite - a handful of houses looked occupied, but mostly it was deserted. (I've been back there several times. In the last few years it has come back to life, and even has a general store - I have a tee-shirt to prove it). It was built as a dormitory for an enormous quarry, invisible from the town but vast when seen from the air.

Interstate I-40 heads straight east from Kingman towards Flagstaff and the turn-off for the Canyon. But US Route 66 takes a more leisurely path, and how could we resist that? It starts by heading north-east towards the western end of the Canyon, getting quite close in the Hualapai country at Peach Springs. It was on this stretch that we ran into the most impressive hailstorm I've ever seen. Visibility was quite literally zero. We pulled off the road, deafened by the noise of golf-ball sized hail hitting the car. Within a few minutes it was over. Surprisingly, the car was undamaged, and we continued on our way.

Shortly after Peach Springs and the turnoff to the Havasupai village in the Canyon itself, we came to Grand Canyon Caverns, which I believe is still there. This is a big underground cave system, with absolutely nothing to do with the Grand Canyon apart from borrowing its name. We opted to go for the tour. After descending a long, cold, damp staircase cut into the rock, we finally arrived at a huge underground cavern, full of the expected stalactites and stalagmites. The guide pointed to some very faint scratches high up on the wall, explaining that this was where a giant sloth had fallen through a hole in the roof, thousands of years ago. We couldn't help asking where the sloth had got to - after all, he could hardly have climbed out. We were assured that it was in a museum, but we didn't really believe there had ever been a giant sloth, just some scratch-like marks on the wall.

And yet... years later, we were passing through Price, Utah, and stopped at the museum there. And in pride of place is the skeleton of a giant sloth. It didn't say where they found it, but I couldn't help wondering whether it was the very same one.

It was dark when we arrived, so it was next day before we saw the Canyon. It's vast beyond belief the first time, and every time afterwards too. There is nothing to be said about it that hasn't already been said thousands of times. We visited all the lookout points, walked up and down the rim, visited the Visitor Center, and all the other things millions of tourists have done before and since.

Taking the road eastwards out of the park lets you stop at several more lookout points, and so we did. It eventually brings you to the Navajo trading post at Cameron, on US 89. It was the first time I'd ever seen Indian country, and I knew nothing at all about the Navajo culture. It all seemed very poor and depressing to me, a few scattered small houses and trailers here and there. It was only years later when, thanks to an excellent Navajo guide at Monument Valley and then to Tony Hillerman's novels, I learned that it is traditional to live spread out like this.

It was late when we started our journey back, direct on I-40 this time. We were treated to a spectacular thunderstorm, the lightning striking into the distant mountains almost continuously. Back then there were still towns that I-40 passed through, the two carriageways passing either side of a town centre consisting entirely of fast food outfits and motels. We stopped in one of these for an entirely forgettable dinner.

By the time we got to Kingman it was late and we decided to stop for the night. We found a motel on the outskirts and asked the guy on the desk where we could get some dessert. He directed us to a nearby Dairy Queen. We'd never heard of Dairy Queen, and even then in 1983 it was a bit past its heyday, a fast food joint with a distinct tendency towards over-sweetened desserts. In a small, isolated town like Kingman it was the night-life centre for the local youth population. The parking lot was full of pickups and teenagers, girls in high heels and boys in their best evening outfits. It was just like a scene from Grease.

Many years later we stopped for lunch in Kingman and I set out on a quest to find the Dairy Queen. It didn't take long. It had only recently closed down, and was exactly as I remembered it - though without the partying teenagers. In small towns, of which there are plenty in the United States, Dairy Queen is still popular. Arriving late one Friday evening in Globe, Arizona, it was the only place still open. And when we visited Prineville, Oregon for the total eclipse last year, it was the place to take my grandson for a slightly nostalgic ice-cream.

The following morning we returned to Vegas. By way of a change, we took the road westwards across the Colorado, joining US 95 northwards through Searchlight. It was the first time I'd seen one of these seemingly infinite long, straight roads. They're even more impressive in the mountains, where you cross one crest and see the next hour of your life stretching down into the valley then climbing up to the next ridgeline. We took our little Renault up to its maximum speed, unimpressive by today's standards but pretty scary considering its handling.

And soon we were back in Vegas, in plenty of time for our flight back to England. I've been back several times to the Canyon, which never loses its power to impress, and I've got to know a lot more of the vast American West. But the memories of that first trip remain vivid. There are no pictures, though - I guess I didn't have a camera with me, back in the days before cellphones had even been invented, much less become cameras and everything else.