Sunday 30 August 2020

Some Network History - Open Systems Interconnection (OSI)

The standards for Open Systems Interconnection (OSI) were a big part of my job from 1980 until 1991. This is a very personal view of what happened, and why it all went wrong.

Background

It's hard to remember now that computers were not always networked together. When you buy a $10 Raspberry Pi, or a $50K server, it's connected to the Internet as soon as you turn it on. Not only can you find cute kitten pictures, but it will load new software and all sorts of behind-the-scenes things you probably aren't even aware of.

It wasn't always so. In the 1970s, "computer" meant a giant mainframe, typically with a whole building or floor of one to itself. They cost a fortune, and they were self-contained - they didn't need to communicate with anything else. The nearest thing to networking was "Remote Job Entry" (RJE) - typically a card reader and a lineprinter, with a controller, connected over a high-speed data line. High speed as in 9600 bits/sec, or about a thousandth of typical WiFi bandwidth. It would take a long time to load even a single kitten picture at that speed. These were used in places that needed access to the computer, but couldn't justify the cost of one - branch offices, remote buildings on a campus and so on.
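
To put that line speed in perspective, here is a back-of-envelope calculation. The 500 KB picture size is an assumption chosen purely for illustration, and the sum ignores all protocol overhead, so treat the result as a rough lower bound.

```python
def transfer_seconds(size_bytes: int, bits_per_second: int) -> float:
    """Idealised transfer time: payload bits divided by line rate, no overhead."""
    return size_bytes * 8 / bits_per_second

PICTURE_BYTES = 500_000   # assumed size of one kitten picture (hypothetical)
RJE_LINE_BPS = 9_600      # the "high speed" line rate quoted in the text

seconds = transfer_seconds(PICTURE_BYTES, RJE_LINE_BPS)
print(f"{seconds / 60:.1f} minutes")   # 6.9 minutes
```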

Each of the mainframe companies - IBM and the "BUNCH" (Burroughs, Univac and others) - did RJE their own way. There were no standards or industry agreements, even though they were all doing exactly the same thing. Communication was over a "leased circuit" - a dedicated, and horribly expensive, telephone line directly between the two places. There was nothing that could be called a "network".

The company I worked for, DEC, was the pioneer for smaller computers - minicomputers. These were inexpensive enough that you could have several, which typically needed to share data - for example to run the machines in a factory. For this it had defined its own network architecture, called DECnet, which was the first peer-to-peer commercial network ever. It allowed DEC's VAXes and PDP-11s to communicate with each other, to share files, access applications and various other things.

They also needed to access data held on the mainframe. For this, we wrote software that pretended to be an RJE terminal. To get data, we would send a pretend card deck that ran a job to print the file, then intercept the "lineprinter" output. A similar ruse would send data in the other direction. At one point I was responsible for all these strange "emulation" products. There was one for the IBM 2780 terminal, and one for each of the other mainframe manufacturers. They were a nightmare to maintain, because none of these RJE protocols was documented. They had been worked out by reverse engineering the messages over the data link. So we were constantly running into special cases that the original code didn't know about.

X.25 - The First "Open" Networking

The first inkling of something better came along in the mid-70s. The world's phone companies - at that time still nationalised "PTT"s - had got together through CCITT, their standards body, and come up with something called X.25. This allowed computers to connect just like on the telephone or telex networks. No prior arrangement was needed, you just sent a message which was the equivalent of dialing a phone call, and then you could send and receive data.

My first networking job at DEC, in 1979, was to implement X.25 for the PDP-11 and the VAX. Just a few countries had networks - the UK, France, Germany, and the US, which had two incompatible ones. Although there was a "standard", it had so many options and variations that every network was different and needed its own variant of the software. It was also expensive to use, with a charge for every single byte of data. Getting a connection was a challenge, since the whole concept was such a novelty for the behemoth monopoly PTT organisations.

Apart from the technical difficulties of X.25, there was a much more fundamental problem. As one industry wit put it at the time, "Now I've taught my computers to talk to each other, I find they have nothing to say." There was no standard way to, say, exchange files or log in to a remote computer. Manufacturers could write their own, but that defeated the object of the "open" network in the first place.

There were a couple of efforts to improve this situation. In the US the Arpanet had been funded by the government in 1969, to connect research and government laboratories. It was this that ultimately led to the Internet, but that was a long way off in 1980. There was a similar effort in the UK, led by the universities, to develop standard protocols for common tasks. Each one was published with a different colour cover, so they were called the "Colour Book Protocols".

OSI is Invented

Having a different standard in every country wasn't a great idea either. International standards for all kinds of things have been produced by the International Organization for Standardization (ISO) since its creation in 1947 - everything from railway equipment to film standards (the ISO film speed for example). Their work included computers. ISO 646, also known as ASCII, was the first standard for character codes. It was the obvious place to put together standards that would be accepted worldwide.

The effort needed a name, and "Open Systems Interconnection" (OSI) was selected. 

By then, the concept of protocol "layers" was well established. X.25 had three layers: the physical layer that dealt with how bits were sent across the wire; layer 2 (data link) that got data reliably across a single connection; and layer 3 (network) that took it through the network via what are now called routers. The first task of the ISO effort was to come up with a formal model of protocol layering. This is probably the only piece of the effort that anyone has still heard of, the "seven layer model" published in 1979 as ISO 7498.
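
The layering idea can be sketched in a few lines. This is an illustration only: the header strings below are invented placeholders, not the real X.25 or OSI encodings. Each layer treats whatever the layer above hands it as opaque payload and wraps it with its own header, and the receiving side peels the headers off in the reverse order.

```python
# Hypothetical headers, listed from the top of the stack downwards.
LAYERS = [("transport", b"T4|"), ("network", b"N3|"), ("datalink", b"D2|")]

def encapsulate(payload: bytes) -> bytes:
    """Going down the stack: each layer prepends its own header."""
    for _name, header in LAYERS:
        payload = header + payload
    return payload

def decapsulate(frame: bytes) -> bytes:
    """Going up the stack: each layer strips only its own header."""
    for name, header in reversed(LAYERS):
        if not frame.startswith(header):
            raise ValueError(f"{name} layer: unexpected header")
        frame = frame[len(header):]
    return frame

frame = encapsulate(b"hello")
print(frame)               # b'D2|N3|T4|hello'
print(decapsulate(frame))  # b'hello'
```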

The first four layers of the model - as described above, plus the "transport" layer 4 - were already well accepted and not controversial, though the details of their implementation certainly were. The last three layers were however more or less invented out of nothing and weren't aligned at all with the way application protocols were built, then or now.

The "session layer" (layer 5) was conceptually imported from IBM's SNA architecture, though all the details were completely different. It was extremely complicated, reflecting things like the need to control half-duplex (one direction at a time) modems. There wasn't a single application protocol that used it to do anything except simple pass through.

The presentation layer's overall goals were never very clear. What it turned into was a universal scheme for describing and encoding data, called ASN.1. It was useful, in that it allowed message formats and such to be expressed in terms of datatypes rather than byte layouts. But it was vastly overcomplicated for what it did.

The OSI Transport Protocol

My own involvement with OSI started in 1980. Definition of the OSI transport protocol was taking place in an obscure Geneva-based group called ECMA. DEC wanted to be involved, and sent me along. My first meeting was at the Hotel La Pérouse in Nice. The work was already well advanced. To call it a dog's breakfast would be a big disservice to both dogs and breakfasts. There were groups who thought the transport protocol should rely entirely on the network for reliability, and others who thought it should be able to recover from a limited class of errors. Other arcane distinctions, including the need for alignment with CCITT - the telcos' standards club - meant that it had no fewer than four separate "classes", which in reality were distinct protocols having no more in common than a few parts of the encoding.

My task was to add a fifth. All of the work so far was intended to work in conjunction with X.25, which provided a "reliable" network service. If you sent a packet it would be delivered or, exceptionally, the network could tell you that it had been unable to deliver something. It would never (in theory anyway) just drop a packet without telling you, nor misorder them. DECnet, as well as the emerging Arpanet, made a different assumption. They kept the network layer as simple as possible, and relied on the transport layer to detect anything that went wrong, and fix it. That meant a more complex transport protocol. This incidentally is how the Internet works, with TCP as the transport protocol.
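
A minimal sketch of that division of labour, assuming a network that may drop, duplicate or reorder packets. It is not TP4, NSP or TCP: the class name and the (sequence number, data) packet shape are invented for illustration. The receiver uses sequence numbers to put the data back in order and returns a cumulative acknowledgement, which is what tells the sender that something needs retransmitting.

```python
class Receiver:
    """Toy transport receiver: reorders, drops duplicates, acks cumulatively."""

    def __init__(self):
        self.expected = 0      # next sequence number to hand to the application
        self.held = {}         # out-of-order packets waiting for a gap to be filled
        self.delivered = []    # data passed up to the application, in order

    def receive(self, seq, data):
        """Accept one packet; return the next sequence number expected (the ack)."""
        if seq >= self.expected:
            self.held.setdefault(seq, data)   # a repeat of a held packet is ignored
        # Deliver everything that is now contiguous with what came before.
        while self.expected in self.held:
            self.delivered.append(self.held.pop(self.expected))
            self.expected += 1
        return self.expected


rx = Receiver()
# The network delivers packet 1 before packet 0, duplicates 1, and loses 2.
for seq, data in [(1, b"B"), (0, b"A"), (1, b"B"), (3, b"D")]:
    ack = rx.receive(seq, data)
print(rx.delivered, ack)   # [b'A', b'B'] 2

ack = rx.receive(2, b"C")  # the sender retransmits the missing packet
print(rx.delivered, ack)   # [b'A', b'B', b'C', b'D'] 4
```

The network layer underneath needs no memory of the conversation at all, which is the simplicity the packet camp was after.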

I spent the next 18 months designing the "Class 4 Transport Protocol" (the others were numbered from 0 to 3, don't ask), TP4 for short. In operation it worked exactly the same as DECnet's equivalent protocol, NSP, and as TCP. The encoding, however, had to be compatible, as far as possible, with the other classes, even though their operation was completely different. Practically speaking, a complete implementation of the OSI transport protocol required five completely separate protocol implementations.

I got a lot of guidance and help within DEC, but at ECMA and later ISO I was on my own. Nobody else cared about TP4, nor understood it. That suited me perfectly. It was published in 1981 as ECMA-72.

Maybe because I was really the only one doing any technical work in the group, when the current chair was moved on to another project by his company, I was asked to take that on. It was quite an honour - I was only 28, in the world of standards which (as in politics) tends to be dominated by people towards the end of their careers. That also meant that I got to attend ISO meetings, representing ECMA, the beginning of a long involvement. 

ISO adopted the ECMA proposal for the transport protocol, all five incompatible classes of it, without any technical changes. It was later published as ISO 8073.

Around this time I took up DEC's offer to move to the US for a while, to lead a team building software to connect to IBM systems using their SNA architecture. At least, that was what I was told. In reality, they already had someone for the job, and I was just backup. That gave me plenty of time to work with the network architecture team there, the people responsible for the design of DECnet. The team was really smart and had a big influence on my career, at DEC and subsequently.

ISO meetings were held all around the world, hosted by the various national standards bodies (like BSI, ANSI and AFNOR) and their industry members like IBM and DEC. In those early days I went to meetings in Paris, London, California, Washington DC, Tokyo and others. 

The day before the California meeting, in Newport Beach, we had a very hush-hush meeting at DEC. It was the only time I was in the same room as the CEO and founder, Ken Olsen, along with our genius CTO, Gordon Bell, and our head of standards. The occasion was a meeting with the CEO of ICL, the British computer company which was still important then, and a high powered team on his side. ICL was convinced that IBM was trying to take over computer networking and impose SNA on the world. That would be a disaster for us, since SNA was very firmly oriented to the mainframe world and not designed for peer-to-peer computing at all. Ken was readily convinced that salvation lay in the creation of international standards that IBM would be obliged to follow, which is to say OSI.

This completely transformed my role in things. Until then, my standards work had been an interesting diversion, the kind of thing that large companies do pro bono for the good of the industry. I thoroughly enjoyed it but nobody at DEC really cared much. Suddenly, it was a key element of the company's strategy, with me and a handful of others at its heart.

In 1983 something extraordinary happened. We were invited by China to have our meeting there, the first international technical meeting that China ever hosted. That meeting, in Tianjin, deserves its own article.

The OSI Network Layer

Shortly after the Tianjin meeting there was a shake-up in the way the various working committees were structured, which left the chair of the network layer group (SC6/WG2) open. This was by far the most complex area of OSI. The meetings were routinely attended by nearly 100 people. It was also extremely controversial, and from DEC's point of view the most important area. I was astounded when I was asked if I'd be willing to chair it. I later learned some of the negotiations behind this from Gary Robinson, for many years DEC's head of standards and an extremely wily political operator. (He was responsible for the tricky compromises that allowed Ethernet and other LAN standards to go ahead despite enormous fundamental disagreement - Token Ring and Token Bus were still very much alive). In essence, the other possible candidates, all much more qualified and experienced than me, had too many enemies. I hadn't yet made any, so I became chair of what was officially ISO/IEC JTC1/SC6/WG2, the OSI network layer group, and went on to acquire plenty of my own enemies.

The problem with the network layer was a complete schism between the circuit view of things and the packet view. The telcos had built X.25, at great expense, and saw that as the model for the network. The user of the network established a "connection", and packets were delivered tidily and in order across the connection. The packet view, which included DEC, was that the network could only be trusted to deliver packets, and then not reliably, and should make no effort to do any more. It could safely be left to the transport layer to fix up the resulting errors.

In OSI-speak, these were respectively the "connection-oriented network service", or CONS, and the "connectionless network service", or CLNS. By the time I arrived there had already been years of debate and architectural hypothesis about how to somehow combine these two views. This had generated one of the most incomprehensible "standard" documents of all time, the "Internal Organisation of the Network Layer" (IONL, ISO 8648). The dust was just about beginning to settle on the only way forward, which was to allow the two to progress in parallel. There was no compromise possible.
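
The difference between the two services can be shown with a toy contrast. Nothing here reflects the real X.25 or CLNP formats: the class names, addresses and link names are all hypothetical. The point is only where the state lives. In the connectionless case every packet names its destination and the router remembers nothing, while in the connection-oriented case a call is set up first and each data packet then carries just a circuit number.

```python
class ConnectionlessRouter:
    """CLNS-style: no per-conversation state; every packet carries its destination."""

    def __init__(self, routes):
        self.routes = routes                  # destination address -> outgoing link

    def forward(self, dest, data):
        return self.routes[dest], data


class ConnectionSwitch:
    """CONS-style: a call is set up first; data packets carry only a circuit number."""

    def __init__(self, routes):
        self.routes = routes
        self.circuits = {}                    # circuit number -> outgoing link
        self.next_circuit = 1

    def call_setup(self, dest):
        circuit = self.next_circuit
        self.next_circuit += 1
        self.circuits[circuit] = self.routes[dest]   # state kept for the whole call
        return circuit

    def forward(self, circuit, data):
        return self.circuits[circuit], data

    def clear(self, circuit):
        del self.circuits[circuit]


routes = {"host-b": "link-2"}                 # hypothetical address and link names

router = ConnectionlessRouter(routes)
print(router.forward("host-b", b"hi"))        # ('link-2', b'hi')

switch = ConnectionSwitch(routes)
vc = switch.call_setup("host-b")
print(switch.forward(vc, b"hi"))              # ('link-2', b'hi')
switch.clear(vc)
```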

The telcos hated this, because it pushed their precious X.25 networks down into a subsidiary role underneath a universal packet protocol, making all of their expensively engineered reliability features unnecessary. From our (DEC) view, this was far better than the complex engineering required to somehow stitch together an "internet" from a sequence of connections. Building a network router is hard enough. There's no need, or point, to make it even harder.

So by the time I was in charge of things, we had two parallel efforts. The CLNS side was led almost entirely by DEC, with excellent support from others in the US. As a result we were able to make rapid progress. We came up with a relatively simple protocol with none of the options, variants and other horrors that bedevilled OSI. It was standardized as ISO 8473, the Connectionless Network Protocol (CLNP).

As chair, I had a duty to be non partisan. On the other hand, I had no duty to actively help the CONS camp. Between the complexity of X.25, the additional complexity of trying to use it as an internet protocol, and internal divisions within the camp, they had little chance of success. After years of work they never did come up with anything that could be built.

That said, this schism did enormous damage to OSI, and was a major factor in its ultimate demise. To us at DEC it was obvious that CONS was a doomed sideshow, but to an observer it just showed a complete inability to make decisions or come up with something that could be built.

DECnet-OSI

That really highlights the basic flaw of the OSI process. Creating complex technology in a committee just doesn't work. It's hard enough to get a network architecture right, without having to embody delicate political compromises in every aspect of the design. Successful standards like TCP, IP and HTTP/HTML were designed by a single person or a small group under strong leadership. Where possible, we did the same thing at DEC. For example the routing protocol for OSI, universally called "IS-IS", was developed by a small team at DEC, and it still works. With modifications to support IP as well as OSI, it is still used by many of the world's large telcos. We managed to get that through the OSI process with hardly any changes.

At DEC we had whole-heartedly adopted OSI as the future of networking. DECnet, our very successful networking system, was rebranded DECnet-OSI and was to be completely restructured to use the OSI protocols. We even persuaded James Martin, a well-known author of IBM-oriented textbooks, to write a book about it. That probably deserves its own article too. As it turned out, DECnet-OSI never really happened. That was more to do with internal engineering execution problems than with OSI itself, since we carefully picked only the bits that could be made to work.

The OSI Transaction Processing Protocol (or not)

In 1987 I got involved in another part of OSI. IBM had never really tried to influence the OSI lower layers or to make them like SNA. But suddenly they came up with the idea of imposing it on the upper layers. SNA had a very complex upper layer structure, mostly oriented around traditional mainframe networking like remote job entry. But they had finally woken up to peer-to-peer networking and added something called LU6.2 to support it. Their idea was to make LU6.2 an integral part of OSI, so that all applications of OSI would in effect be SNA applications. It was a good idea from their point of view, and was very strongly supported by senior management there.

We knew this was coming because of the way ISO works. It started as a "club" of the national standards bodies, and to a large degree still is. This means that proposals can't be submitted directly to ISO, they have to pass through a national standards body - or at least, they did at the time, things have changed a bit since then.

The question was, what to do about it? IBM were heavily constrained by the existing standards and projects. If they had come along with this five years earlier, it would have been much harder to stop, but now they had to find an empty spot they could introduce it to. This they did, under the guise of "transaction processing". So at the 1987 meeting in Tokyo, there was a "New Work Item" for transaction processing, as another application layer standard. To this was attached all of the IBM contributions, which is to say LU6.2 warmed over.

I got a call about a month before the meeting from DEC's CTO, saying, "John, we need you to go and stop this." In the standards process it is almost impossible to stop anything. Once a piece of work is under way, it will continue. Actually terminating a project or committee is virtually impossible. Typically committees continue to meet for years after they no longer serve any useful purpose. So if you want to stop something, you have to either divert it into something harmless, or ensure that it makes no progress.

An experienced chair knows that there are some people who, while working with the very best of intentions, will just about guarantee that nothing ever emerges. It's just the way they're made. I have had the good fortune to know several. You may ask, why "good" fortune? The answer is that if you don't want something to work out, you arrange for them to be put in charge of it. I couldn't possibly say whether something like this may have influenced the failure of the CONS work to deliver.

For IBM's LU6.2 proposal, though, this would not work. They had put some technically strong people from their network engineering centre in La Gaude, France in charge of it. In truth I had little idea what I would do until I got to the meeting. It turned out that there were three camps:

  • IBM and others who liked the idea of LU6.2 being part of OSI
  • Those who thought that making it part of the standard would act against IBM's interests, by making it easier to compete with them. While these people were "enemies of IBM" and in some sense on the same side as me, as far as this meeting was concerned, they were my opponents. For example, France's Bull was in this camp.
  • Those who didn't want it. This turned out to be just me, and ICL.

So I was hardly in a position of strength. In addition, I hadn't been able to make any official contribution to the meeting ahead of time. On the other hand, the people IBM had sent knew little about OSI and the way the upper layers had evolved. They seemed to believe they could do as they had, for example, with Token Ring (and as DEC and Xerox had with Ethernet as well) - just show up with a spec and get it approved as a standard. But things had already gone way too far for that. There were already too many bits and pieces of protocols and services defined.

This was their Achilles' Heel. In the end it was remarkably easy to divert the activity to a study of the requirements for transaction processing (and it turned out there weren't any), and how they could best be met with existing OSI work. Only then would extensions be studied. This was instant death to the idea of just sticking an OSI rubber stamp on LU6.2.

That all makes it sound very easy, though. I was on my own against a large group of people who all wanted me to fail. It was one of the toughest things I've ever done. Luckily there were a lot of DEC people and other friends at other parts of the meeting, so the evenings and weekend were very enjoyable as usual.

There was one person at the meeting who genuinely frightened me. He was incredibly rude and aggressive during the formal meeting, to the point where it became very personal. It was a ten minute walk from the meeting place, just opposite the Tokyo Tower, to our usual hotel, the Shiba Park. I spent those ten minutes looking over my shoulder to be sure he wasn't following me.

That had an interesting consequence. The head of the US delegation was from IBM, and very much of the old school. He was close to retirement and, like most standards people of that era, very much a gentleman. A few weeks later, I was invited, along with DEC's head of standards, to a meeting at IBM's office in New York City. There the IBM guy apologised profusely, and very professionally, on behalf of both IBM and the United States - even though the person in question didn't work for IBM.

I don't exactly remember what happened after that meeting, but I think IBM just quietly dropped the idea and it faded away.

OSI Management

DECnet had powerful remote management capabilities, essential in a networked environment. We knew that if OSI was to be useful, it had to have the same. There was a management activity, but for years it had been very academic and had gone nowhere. There were some smart people in the UK who wanted management to work too, and between us we came up with everything required: a protocol, and a formal way to specify the metadata. In the end it never got implemented, because OSI was already struggling by the time it was ready. But it was a nice piece of work. It also got me to several interesting places I otherwise would have had no reason to go to.

Why Did OSI Fail?


My final OSI meeting was in 1991, in San Diego. By then I had moved to a new job in the company and was no longer involved with the DECnet architecture. In any case the writing was on the wall: the OSI concept would happen, but it would happen through the Internet protocol suite under development in the IETF. DEC officially made the change shortly afterwards.

Why was OSI such a total failure? It was the work of hundreds of network experts, many of whom really were the top people in their fields. Yet hardly a single trace of it remains. On the other hand the concept of universal computer interconnection has been a huge success, way beyond the dreams of the OSI founders. All they hoped for was the possibility of open communication, they didn't expect it to be a constant feature of the way we use computers. The only thing is, this is all done using the protocols developed by the IETF and loosely called TCP/IP.

OSI was way too complex, with too many options and choices. It was a nightmare to implement, made worse because this was before open source caught on. Some companies tried to make a living selling complete OSI protocol stacks, but that was never really a success. At DEC we had a full OSI implementation several years before DECnet-OSI, but hardly anyone bought it - only a few academic and research users.

I think the main reason was that there was no compelling use case. That seems hard to believe now, but in 1990 it was a chicken and egg situation - until the connectivity was available, there was no use for it. My old boss at DEC said the main reason TCP/IP took over was that Sun was shipping it as part of their BSD-based software, and it was just there, free and available. Because of that, people started to find uses for it. That also happened to coincide with the invention of the World Wide Web in 1990. It was only a minuscule shadow of what it has become, but was a reason to be connected.

By 1995 it was obvious that the future of networking lay with the IETF and TCP/IP. In Europe there were still efforts to keep OSI alive, but without manufacturer support they went nowhere. Around 1997 I was paid to write a study of why the IETF had been so much more successful than ISO. The simple answer is that while IETF is a committee, or actually a collection of numerous committees, each individual standard is produced by at most two or three people. It is then discussed and may get modified, but it is not "design by committee". That is less true now than it was in 1995 - all organisations tend to become sclerotic with age. But back then its motto was "rough consensus and running code". It got stuff done.

Conclusion


From a personal point of view, OSI was one of the most interesting things I've ever done. It taught me a great deal about how to lead in situations where you have absolutely no official authority. It took me on many, many journeys to fascinating places around the world. It also provided my introduction to the woman who would later be my life partner, though that isn't part of this story.

It can be endlessly debated whether OSI was a complete waste of time and effort, or whether it postponed open networking long enough for IBM's SNA to lose its predominant role, making room for TCP/IP. We will never know.

2 comments:

Unknown said...

Hello, John !
OSI standardisation left a lasting mark on all the delegates. I believe that no one can have an exhaustive view of this many-faceted movement. As far as I am concerned, the most difficult mission entrusted to me was the transaction protocol. At that point in the standardisation process the OSI model was already dead, but Europe had just won a very important legal battle: the end of IBM's monopoly in banking transaction processing. The LU6.2 standard was therefore a bargaining chip, and the OSI TP protocol was a pretext, in an open forum, for companies to develop their gateways to the banking world (that is, to sell hardware and software in a sector that represented 40% of the computing business and 80% of the mainframe business)... In this very complex game there was therefore a legal facet (the Europe vs IBM case), a technical facet (the certification of protocols between heterogeneous machines) and a very important commercial facet. The fact remains that, for me, standardisation was a wonderful opportunity to understand different cultures and to form friendships that have lasted to this day.

Best regards
Alain Bron
https://alainbron.ublog.com/

Ken said...

Thank you for writing this. I joined DEC in 1984 in the field organization (best company I've ever worked for, even when I've been self-employed). I watched the ISO OSI - SNA - TCP/IP Token Ring - Ethernet battle from the front lines. (Also the Unix wars.) I knew tcp/ip was the winner when I installed a MicroVAX II at the University of Washington running Unix. I had to hook it up to their tcp/ip network and I barely knew how to spell Unix at the time. Yet I vaguely recalled something about a hosttable and in a few minutes it was up and running. At that point I knew everything else was doomed.

Fun history, thank you again. Now to hit the rest of your blog...

Ken