The standards for Open Systems Interconnection (OSI) were a big part of my job from 1980 until 1991. This is a very personal view of what happened, and why it all went wrong.
It's hard to remember now that computers were not always networked together. When you buy a $10 Raspberry Pi, or a $50K server, it's connected to the Internet as soon as you turn it on. Not only can you find cute kitten pictures, but it will load new software and all sorts of behind-the-scenes things you probably aren't even aware of.
It wasn't always so. In the 1970s, "computer" meant a giant mainframe, typically with a whole building or floor of one to itself. They cost a fortune, and they were self-contained - they didn't need to communicate with anything else. The nearest thing to networking was "Remote Job Entry" (RJE) - typically a card reader and a lineprinter, with a controller, connected over a high-speed data line. High speed as in 9600 bits/sec, or about a thousandth of typical WiFi bandwidth. It would take a long time to load even a single kitten picture at that speed. These were used in places that needed access to the computer, but couldn't justify the cost of one - branch offices, remote buildings on a campus and so on.
Each of the mainframe companies - IBM and the "BUNCH" (Burroughs, Univac and others) - did RJE their own way. There were no standards or industry agreements, even though they were all doing exactly the same thing. Communication was over a "leased circuit" - a dedicated, and horribly expensive, telephone line directly between the two places. There was nothing that could be called a "network".
The company I worked for, DEC, was the pioneer for smaller computers - minicomputers. These were inexpensive enough that you could have several, which typically needed to share data - for example to run the machines in a factory. For this it had defined its own network architecture, called DECnet, which was the first peer-to-peer commercial network ever. It allowed DEC's VAXes and PDP-11s to communicate with each other, to share files, access applications and various other things.
They also needed to access data held on the mainframe. For this, we wrote software that pretended to be an RJE terminal. To get data, we would send a pretend card deck that ran a job to print the file, then intercept the "lineprinter" output. A similar ruse would send data in the other direction. At one point I was responsible for all these strange "emulation" products. There was one for the IBM 2780 terminal, and one for each of the other mainframe manufacturers. They were a nightmare to maintain, because none of these RJE protocols was documented. They had been worked out by reverse engineering the messages over the data link. So we were constantly running into special cases that the original code didn't know about.
X.25 - The First "Open" Networking
The first inkling of something better came along in the mid-70s. The world's phone companies - at that time still nationalised "PTT"s - had got together through CCITT, their standards body, and come up with something called X.25. This allowed computers to connect just like on the telephone or telex networks. No prior arrangement was needed, you just sent a message which was the equivalent of dialing a phone call, and then you could send and receive data.
My first networking job at DEC, in 1979, was to implement X.25 for the PDP-11 and the VAX. Just a few countries had networks - the UK, France, Germany, and the US, which had two incompatible ones. Although there was a "standard", it had so many options and variations that every network was different and needed its own variant of the software. It was also expensive to use, with a charge for every single byte of data. Getting a connection was a challenge, since the whole concept was such a novelty for the behemoth monopoly PTT organisations.
Apart from the technical difficulties of X.25, there was a much more fundamental problem. As one industry wit put it at the time, "Now I've taught my computers to talk to each other, I find they have nothing to say." There was no standard way to, say exchange files, or log in to a remote computer. Manufacturers could write their own, but that defeated the object of the "open" network in the first place.
There were a couple of efforts to improve this situation. In the US the Arpanet had been funded by the government in 1969, to connect research and government laboratories. It was this that ultimately led to the Internet, but that was a long way off in 1980. There was a similar effort in the UK, led by the universities, to develop standard protocols for common tasks. Each one was published with a different colour cover, so they were called the "Colour Book Protocols".
OSI is Invented
Having a different standard in every country wasn't a great idea either. International standards for all kinds of things have been produced by the International Standards Organization (ISO) since its creation in 1947 - everything from railway equipment to film standards (the ISO film speed for example). Their work included computers. ISO 646, also known as ASCII, was the first standard for character codes. It was the obvious place to put together standards that would be accepted world wide.
The effort needed a name, and "Open Systems Interconnection" (OSI) was selected.
By then, the concept of protocol "layers" was well established. X.25 had three layers: the physical layer that dealt with how bits were sent across the wire; layer 2 (data link) that got data reliably across a single connection; and layer 3 (network) that took it through the network via what are now called routers. The first task of the ISO effort was to come up with a formal model of protocol layering. This is probably the only piece of the effort that anyone has still heard of, the "seven layer model" published in 1979 as ISO 7498.
The first four layers of the model - as described above, plus the "transport" layer 4 - were already well accepted and not controversial, though the details of their implementation certainly were. The last three layers were however more or less invented out of nothing and weren't aligned at all with the way application protocols were built, then or now.
The "session layer" (layer 5) was conceptually imported from IBM's SNA architecture, though all the details were completely different. It was extremely complicated, reflecting things like the need to control half-duplex (one direction at a time) modems. There wasn't a single application protocol that used it to do anything except simple pass through.
The presentation layer's overall goals were never very clear. What it turned into was a universal data metadata and encoding, called ASN.1. It was useful, in that it allowed message formats and such to be expressed in terms of datatypes rather than byte layouts. But it was vastly overcomplicated for what it did.
The OSI Transport Protocol
My own involvement with OSI started in 1980. Definition of the OSI transport protocol was taking place in an obscure Geneva-based group called ECMA. DEC wanted to be involved, and sent me along. My first meeting was at the Hotel La Pérouse in Nice. The work was already well advanced. To call it a dogs' breakfast would be a big disservice to both dogs and breakfasts. There were groups who thought the transport protocol should rely entirely on the network for reliability, and others who thought it should be able to recover from a limited class of errors. Other arcane distinctions, including the need for alignment with CCITT - the telco's standards club - meant had it had no less than four separate "classes", which in reality were distinct protocols having no more in common than a few parts of the encoding.
My task was to add a fifth. All of the work so far was intended to work in conjunction with X.25, which provided a "reliable" network service. If you sent a packet it would be delivered or, exceptionally, the network could tell you that it had been unable to deliver something. It would never (in theory anyway) just drop a packet without telling you, nor misorder them. DECnet, as well as the emerging Arpanet, made a different assumption. They kept the network layer as simple as possible, and relied on the transport layer to detect anything that went wrong, and fix it. That meant a more complex transport protocol. This incidentally is how the Internet works, with TCP as the transport protocol.
I spent the next 18 months designing the "Class 4 Transport Protocol" (the others were numbered from 0 to 3, don't ask), TP4 for short. It worked exactly the same as DECnet's equivalent protocol, NSP, and TCP, but the encoding had to be compatible, as far as possible, with the other classes. However the operation was completely different. Practically speaking, a complete implementation of the OSI transport protocol required five completely separate protocol implementations.
I got a lot of guidance and help within DEC, but at ECMA and later ISO I was on my own. Nobody else cared about TP4, nor understood it. That suited me perfectly. It was published in 1981 as ECMA-72.
Maybe because I was really the only one doing any technical work in the group, when the current chair was moved on to another project by his company, I was asked to take that on. It was quite an honour - I was only 28, in the world of standards which (as in politics) tends to be dominated by people towards the end of their careers. That also meant that I got to attend ISO meetings, representing ECMA, the beginning of a long involvement.
ISO adopted the ECMA proposal for the transport protocol, all five incompatible classes of it, without any technical changes. It was later published as ISO 8073.
Around this time I took up DEC's offer to move to the US for a while, to lead a team building software to connect to IBM systems using their SNA architecture. At least, that was what I was told. In reality, they already had someone for the job, and I was just backup. That gave me plenty of time to work with the network architecture team there, the people responsible for the design of DECnet. The team was really smart and had a big influence on my career, at DEC and subsequently.
ISO meetings were held all around the world, hosted by the various national standards bodies (like BSI, ANSI and AFNOR) and their industry members like IBM and DEC. In those early days I went to meetings in Paris, London, California, Washington DC, Tokyo and others.
The day before the California meeting, in Newport Beach, we had a very hush-hush meeting at DEC. It was the only time I was in the same room as the CEO and founder, Ken Olsen, along with our genius CTO, Gordon Bell, and our head of standards. The occasion was a meeting with the CEO of ICL, the British computer company which was still important then, and a high powered team on his side. ICL was convinced that IBM was trying to take over computer networking and impose SNA on the world. That would be a disaster for us, since SNA was very firmly oriented to the mainframe world and not designed for peer-to-peer computing at all. Ken was readily convinced that salvation lie in the creation of international standards that IBM would be obliged to follow, which is to say OSI.
This completely transformed my role in things. Until then, my standards work had been an interesting diversion, the kind of thing that large companies do pro bono for the good of the industry. I thoroughly enjoyed it but nobody at DEC really cared much. Suddenly, it was a key element of the company's strategy, with me and a handful of others at its heart.
In 1983 something extraordinary happened. We were invited by China to have our meeting there, the first international technical meeting that China ever hosted. That meeting, in Tianjin, deserves its own article.
The OSI Network Layer
Shortly after the Tianjin meeting there was a shake-up in the way the various working committees were structured, which left the chair of the network layer group (SC6/WG2) open. This was by far the most complex area of OSI. The meetings were routinely attended by nearly 100 people. It was also extremely controversial, and from DEC's point of view the most important area. I was astounded when I was asked if I'd be willing to chair it. I later learned some of the negotiations behind this from Gary Robinson, for many years DEC's head of standards and an extremely wily political operator. (He was responsible for the tricky compromises that allowed Ethernet and other LAN standards to go ahead despite enormous fundamental disagreement - Token Ring and Token Bus were still very much alive). In essence, the other possible candidates, all much more qualified and experienced than me, had too many enemies. I hadn't yet made any, so I became chair of what was officially ISO/IEC JTC1/SC6/WG2, the OSI network layer group, and went on to acquire plenty of my own enemies.
The problem with the network layer was a complete schism between the circuit view of things and the packet view. The telcos had built X.25, at great expense, and saw that as the model for the network. The user of the network established a "connection", and packets were delivered tidily and in order across the connection. The packet view, which included DEC, was that the network could only be trusted to deliver packets, and then not reliably, and should make no effort to do any more. It could safely be left to the transport layer to fix up the resulting errors.
In OSI-speak, these were respectively the "connection-oriented network service", or CONS, and the "connectionless network service", or CLNS. By the time I arrived there had already been years of debate and architectural hypothesis about how to somehow combine these two views. This had generated one of the most incomprehensible "standard" documents of all time, the "Internal Organisation of the Network Layer" (IONL, ISO 8648). The dust was just about beginning to settle on the only way forward, which was to allow the two to progress in parallel. There was no compromise possible.
The telcos hated this, because it pushed their precious X.25 networks down into a subsidiary role underneath a universal packet protocol, making all of their expensively engineered reliability features unnecessary. From our (DEC) view, this was far better than the complex engineering required to somehow stitch together an "internet" from a sequence of connections. Building a network router is hard enough. There's no need, or point, to make it even harder.
So by the time I was in charge of things, we had two parallel efforts. The CLNS side was led almost entirely by DEC, with excellent support from others in the US. As a result we were able to make rapid progress. We came up with a relatively simple protocol with no options, variants and all the other horrors than bedevilled OSI. It was standardized as ISO 8473, the Connectionless Network Protocol (CLNP).
As chair, I had a duty to be non partisan. On the other hand, I had no duty to actively help the CONS camp. Between the complexity of X.25, the additional complexity of trying to use it as an internet protocol, and internal divisions within the camp, they had little chance of success. After years of work they never did come up with anything that could be built.
That said, this schism did enormous damage to OSI, and was a major factor in its ultimate demise. To us at DEC it was obvious that CONS was a doomed sideshow, but to an observer it just showed a complete inability to make decisions or come up with something that could be built.
That really highlights the basic flaw of the OSI process. Creating complex technology in a committee just doesn't work. It's hard enough to get a network architecture right, without having to embody delicate political compromises in every aspect of the design. Successful standards like TCP, IP and HTTP/HTML were designed by a single person or a small group under strong leadership. Where possible, we did the same thing at DEC. For example the routing protocol for OSI, universally called "IS-IS", was developed by a small team at DEC, and it still works. With modifications to support IP as well as OSI, it is still used by many of world's large telcos. We managed to get that through the OSI process with hardly any changes.
At DEC we had whole-heartedly adopted OSI as the future of networking. DECnet, our very successful networking system, was rebranded DECnet-OSI and was to be completely restructured to use the OSI protocols. We even persuaded James Martin, a well-known author of IBM-oriented textbooks, to write a book about it. That probably deserves its own article too. As it turned out, DECnet-OSI never really happened. That was more to do with internal engineering execution problems than with OSI itself, since we carefully picked only the bits that could be made to work.
The OSI Transaction Processing Protocol (or not)
In 1987 I got involved in another part of OSI. IBM had never really tried to influence the OSI lower layers or to try to make them like SNA. But suddenly they came up with the idea of imposing it on the upper layers. SNA had a very complex upper layer structure, mostly oriented around traditional mainframe networking like remote job entry. But they had finally woken up to peer-to-peer networking and added something called LU6.2 to support it. Their idea was to make LU6.2 an integral part of OSI, so that all applications of OSI would in effect be SNA applications. It was a good idea from their point of view, and was very strongly supported by senior management there.
We knew this was coming because of the way ISO works. It started as a "club" of the national standards bodies, and to a large degree still is. This means that proposals can't be submitted directly to ISO, they have to pass through a national standards body - or at least, they did at the time, things have changed a bit since then.
The question was, what to do about it? IBM were heavily constrained by the existing standards and projects. If they had come along with this five years earlier, it would have been much harder to stop, but now they had to find an empty spot they could introduce it to. This they did, under the guise of "transaction processing". So at the 1987 meeting in Tokyo, there was a "New Work Item" for transaction processing, as another application layer standard. To this was attached all of the IBM contributions, which is to say LU6.2 warmed over.
I got a call about a month before the meeting from DEC's CTO, saying, "John, we need you to go and stop this." In the standards process it is almost impossible to stop anything. Once a piece of work is under way, it will continue. Actually terminating a project or committee is virtually impossible. Typically committees continue to meet for years after they no longer serve any useful purpose. So if you want to stop something, you have to either divert it into something harmless, or ensure that it makes no progress.
An experienced chair knows that there are some people who, while working with the very best of intentions, will just about guarantee that nothing ever emerges. It's just the way they're made. I have had the good fortune to know several. You may ask, why "good" fortune? The answer is that if you don't want something to work out, you arrange for them to be put in charge of it. I couldn't possibly say whether something like this may have influenced the failure of the CONS work to deliver.
For IBM's LU6.2 proposal, though, this would not work. They had put some technically strong people from their network engineering centre in La Gaude, France in charge of it. In truth I had little idea what I would do until I got to the meeting. It turned out that there were three camps:
- IBM and others who liked the idea of LU6.2 being part of OSI
- Those who thought that making it part of the standard would act against IBM's interests, by making it easier to compete with them. While these people were "enemies of IBM" and in some sense on the same side as me, as far as this meeting was concerned, they were my opponents. For example, France's Bull was in this camp.
- Those who didn't want it. This turned out to be just me, and ICL.