Wednesday, 24 August 2011

Boost: a retrospective (part 2) - the Good

In part 1, I explained how I came to regard Boost as an essential part of C++ programming. There are some parts of Boost that I can be pretty sure I'll use in any decent-sized program, to the point where I have a generic header file that pulls them all in and makes them accessible without even needing to ask.

Regex

Before Boost came along, you couldn't really use regular expressions in C/C++. That is a great pity, because they are incredibly useful, especially if you have to deal with any kind of human input. There is a GNU regex package, but it is GPL, so unusable in anything you plan to sell or keep to yourself, and it has a determinedly C-flavored interface, which means you'd have to write a C++ wrapper round it anyhow. My Winlife32 program had to parse human-style text input without regex, and what an incredible pain in the neck that was!

Lexical Cast

Can you remember how to use atoi, itoa, atof, and all the other zillion variants along the same lines? No, neither can I, and actually quite a few that you'd expect to find don't even exist. lexical_cast to the rescue! To convert a string to any type, including your own types as long as they define a stream >> operator, just write lexical_cast<type>(str). If it can't be converted, you get an exception which you can use to trigger an appropriate error message.

In the other direction, lexical_cast<string>(value) will convert anything that has a stream << operator to a string. Simple, but indispensable.

Function / Bind

I already talked about these in Part 1. I can't imagine trying to write code without them now, although they are proscribed by Google's internal coding standard because they "would encourage functional style programming". I have no idea why that is supposed to be bad!

Format

Printf is incredibly useful, but fraught with issues viewed from a 2011 C++ perspective. It's not type-safe, it's very fragile and can cause your program to just roll over and die. And of course there's no question of dealing with user-defined types. Boost Format is used pretty much exactly like printf, except it fixes all of these problems and more. For example:

cout << format("unit %d temp = %.2f deg C") % index % get_temp(index);

will do exactly what you'd expect (note the neat reuse of the % operator, similar to Python by the way). But actually, you don't need the "d" of "%d", because it knows it's dealing with an int, and will format it accordingly. Replace the int with a type of your own, having a stream << operator, and it will work too.

There's a lot more to it, if you want to use it - numerous extra formatting options, positional and named arguments.

Foreach

The STL containers, and their Boost extensions, are incredibly useful. But iterating through their contents is so painful, syntactically. After the hundredth time you've typed something like:

for (vector<int>::const_iterator i=vec.begin(); i!=vec.end(); ++i)

you are really just about ready to scream. Boost Foreach to the rescue! You can replace all this with:

foreach (int i, vec)

(OK, I've cheated a little: my generic header file #defines "foreach" as "BOOST_FOREACH" just to make the code prettier.) Notice that i is just an int, not an iterator, so you don't need to use '*' or '->' with it. This is especially neat for containers of pointers, which get very awkward. (Boost has another solution for those, too, the Pointer Container library, though I've always found it doesn't quite do what I need.)

The new C++0x standard makes the problem go away, since it has a built-in container iteration syntax, as well as the auto type declarator. But Foreach will have saved a decade or so of tedious and ugly typing.

Intrusive

I've eulogised elsewhere about this. Suffice it here to say that you get all the convenience of the STL container types, without the behind the scenes manipulation of little extra memory blocks, and their associated run-time cost. For anyone whose system-programming teeth were cut in C or assembler (or Bliss!), this is just so much nicer.

Date_Time

Who hasn't struggled with all the complexities of date and time? Input, output, arithmetic: they're all a nightmare. Nearly all of these problems go away with Date_Time. Date and time arithmetic is simple, and comparisons are simple. Unfortunately, input and output are heavily tied into the C++ locale system, which is basically incomprehensible. It's much easier to write your own parsing and output code than it is to figure out how to make locales work. But that's a nit, because for everything else these classes are indispensable.

Others

Just because I haven't mentioned them doesn't mean I don't like the other bits of Boost (though see the forthcoming part on the Bad and the Ugly). Special mention should go to Thread and Python, which are both indispensable if you want to use threads or interact with Python, respectively. Each of them takes a fairly ugly C interface and wraps an elegant C++ interface around it. Boost.Python makes it trivial to combine Python and C++ code, or to support Python scripting within a C++ app.

Part 3: the Bad and the Ugly
