Using C++ on microcontrollers: code size tricks


Introduction

During the development of the Miosix kernel I had the opportunity to use the GCC compiler to write C++ code in an embedded environment (an LPC2138 microcontroller). While coding the kernel, I also learned several simple techniques to minimize the code size of a C++ program.
In this article I will present these techniques, which are applicable when programming a microcontroller with GCC. At the end of the article I'll also add a template project that can be used as a starting point for writing applications.

Tip #1: use -Wl,--gc-sections and -ffunction-sections

Consider this scenario: you are writing a library, so you put many functions (or member functions of some class) in a C/C++ source file; then, in a project that uses that library, you only call a couple of those functions. And here is the bad surprise: your final binary file will contain all the functions of the library, even the ones you never call. If you were writing a PC application you wouldn't mind; after all, PCs have hundreds of GBytes of hard disk space. But on a microcontroller things are different: you may only have 64 KBytes of FLASH memory, and the code size of unused functions might make the difference between a program that fits into the memory and one that does not.
A naive solution might be to write each function in a separate source file, but today's compiler technology has made this technique unnecessary. In GCC, the solution is to use these two flags: -ffunction-sections and -Wl,--gc-sections. The first option is passed when compiling source files. It tells the compiler to put each function in a section of its own (the .text.* family of sections), keeping all functions separate even if they appear in the same object file. The second option is used at link time. It tells the linker to perform a "garbage collection" of all sections. Put simply, the linker starts from the entry point of the program and marks as used every function reachable from it: main(), the functions called by main(), and so on recursively. At the end, all the unreachable functions are removed from the binary file.
Now, it is worth noting that this problem exists even when programming in C, and the solution is the same. However, it is especially important to know about these compiler flags when programming in C++. This is because the C++ standard library itself is compiled with -ffunction-sections, so this technique reduces code size significantly when using C++.
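To make the scenario concrete, here is a minimal sketch of such a library. The file names, the function names and the command lines in the comments are assumptions made up for illustration, not taken from a real project; on a microcontroller you would use your cross compiler in place of g++.

```cpp
// mathlib.cpp: a hypothetical two-function library (names are illustrative).
//
// Assumed build steps, with main.cpp being an application that only
// calls square():
//   g++ -Os -ffunction-sections -c mathlib.cpp main.cpp
//   g++ -Wl,--gc-sections -o app main.o mathlib.o
//
// With -ffunction-sections, GCC places square() in section .text._Z6squarei
// and cube() in section .text._Z4cubei instead of one shared .text section.

// Called by the application, so its section is reachable and is kept.
int square(int x)
{
    return x * x;
}

// Never called by the application: with --gc-sections the linker is free
// to drop its section from the final binary.
int cube(int x)
{
    return x * x * x;
}
```

If you add -Wl,--print-gc-sections to the link command, the linker should list the section of cube() among the ones it removed.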

Tip #2: if you don't use exceptions and/or RTTI, inform the compiler using -fno-exceptions and/or -fno-rtti

Exceptions and RTTI are two features that come with an overhead. I'm not saying that you should not use them when programming microcontrollers: it depends on whether your microcontroller has enough FLASH/RAM to support them, and on whether your application has hard real-time constraints. However, if you decide not to use them, the compiler might add the runtime support for them anyway. This is bad, and it goes against the zero overhead principle of C++: "You don't have to pay for what you don't use". The solution is to explicitly tell the compiler that you are not using those features. To do so, GCC has the -fno-exceptions option to disable exception support, and the -fno-rtti option to disable RTTI support. Note that you should pass these options both when compiling and when linking.
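As a sketch of what code written without these features can look like, the following class reports errors through return values instead of throwing, so it should compile unchanged with -fno-exceptions -fno-rtti. The ByteBuffer class and its capacity are hypothetical, made up for this example. The two macros tested at the end, __EXCEPTIONS and __GXX_RTTI, are predefined by GCC only when the corresponding feature is enabled, which gives a cheap way to check that the flags really reached the compiler.

```cpp
#include <cstddef>

// Hypothetical capacity, chosen only to keep the example small.
const std::size_t kCapacity = 4;

// Fixed-size buffer that signals "full" with a return value, not a throw.
class ByteBuffer
{
public:
    ByteBuffer() : count_(0) {}

    // Returns false, leaving the buffer unchanged, when there is no room.
    bool push(unsigned char byte)
    {
        if (count_ == kCapacity) return false;
        data_[count_++] = byte;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    unsigned char data_[kCapacity];
    std::size_t count_;
};

// True only if this file was compiled with exception support enabled.
bool built_with_exceptions()
{
#ifdef __EXCEPTIONS
    return true;
#else
    return false;
#endif
}

// True only if this file was compiled with RTTI enabled.
bool built_with_rtti()
{
#ifdef __GXX_RTTI
    return true;
#else
    return false;
#endif
}
```

The caller checks the result of push() where it would otherwise have written a try/catch block.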

Tip #3: replace the default new, delete, new[], delete[]

The GCC compiler comes with fully functional versions of operator new and delete. However, they are not well suited for use on microcontrollers for two reasons:
Luckily, the default implementation of operator new, delete, new[] and delete[] can be replaced. You do so by defining these operators in one of your source files. Here's an example:

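#include <cstddef> // std::size_t
#include <cstdlib> // std::malloc, std::free

// Minimal replacements that forward to the C heap. Note that malloc()
// returns a null pointer when the heap is exhausted.

void *operator new(std::size_t size)
{
    return std::malloc(size);
}

void *operator new[](std::size_t size)
{
    return std::malloc(size);
}

void operator delete(void *p)
{
    std::free(p);
}

void operator delete[](void *p)
{
    std::free(p);
}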

These minimal implementations get and release the required memory by simply calling malloc()/free(). Note that replacing new/delete does not affect constructors and destructors: those of the classes you instantiate will still be called, so there's no need to worry about that.
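The claim about constructors and destructors can be checked with a small sketch like the following, meant to be compiled on its own. The Sensor class and the counters are hypothetical. Unlike the minimal versions above, this operator new calls abort() when malloc() fails; that is just one possible policy, picked for the sketch.

```cpp
#include <cstddef>
#include <cstdlib>

// Replacement allocation function backed by malloc().
void *operator new(std::size_t size)
{
    if (size == 0) size = 1;          // malloc(0) is allowed to return null
    void *p = std::malloc(size);
    if (p == nullptr) std::abort();   // assumed policy: halt when out of memory
    return p;
}

// Replacement deallocation functions backed by free().
void operator delete(void *p) noexcept
{
    std::free(p);
}

// Sized form that C++14 compilers may call instead of the one above.
void operator delete(void *p, std::size_t) noexcept
{
    std::free(p);
}

static int constructed = 0; // number of Sensor constructor calls so far
static int destroyed = 0;   // number of Sensor destructor calls so far

// Hypothetical class whose only job is to count its own lifetime events.
struct Sensor
{
    Sensor() { ++constructed; }
    ~Sensor() { ++destroyed; }
};

// Returns true if one new/delete pair ran exactly one constructor and
// exactly one destructor, each at the expected moment.
bool constructors_still_run()
{
    const int c0 = constructed;
    const int d0 = destroyed;
    Sensor *s = new Sensor;
    const bool after_new = (constructed == c0 + 1) && (destroyed == d0);
    delete s;
    return after_new && (destroyed == d0 + 1);
}
```

Calling constructors_still_run() from main() and checking that it returns true is enough to convince yourself that only the memory allocation changed.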

Tip #4: redefine the verbose terminate handler if you use exceptions

If you have tried using exceptions with GCC in an embedded environment, you might have noticed that the code size increases suddenly, because the code to support exceptions gets linked with your code. It is a one-time cost: once you use exceptions anywhere, the exception support runtime is included, and using exceptions in more places of your code does not add it again. However, it would be interesting to reduce even this one-time code size penalty, and it turns out that it is possible.
The cause of most of the code size increase is the verbose terminate handler. This is the function that is called when an uncaught exception is encountered in your code. The default function in GCC prints the name of the uncaught exception, but to do so it must demangle its name. And the demangling code is large and complicated.
The solution is to redefine the __verbose_terminate_handler function in the __gnu_cxx namespace. To do so, add the following code to one of your source files:

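#include <cstdlib> // std::abort

namespace __gnu_cxx {

// Replaces the default handler, so the name demangling code is not linked in.
void __verbose_terminate_handler()
{
  std::abort();
}

} // namespace __gnu_cxx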

This function must never return. This implementation ensures that by simply aborting the program, but other options are possible, such as printing an error message on a serial port and then entering an infinite loop.
The important thing is that by redefining this function you prevent the original implementation from being linked in your binary file, reducing code size.
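As a sketch of the serial port variant, here is an alternative to the abort() version. serial_puts() is a hypothetical stand-in for a UART driver (here it only copies the text into a RAM buffer, so that the example compiles anywhere), and the message text is an arbitrary choice.

```cpp
#include <cstddef>

// Hypothetical stand-in for a serial port: text is appended to this buffer.
// On real hardware serial_puts() would write to the UART registers instead.
static char fatal_log[64];
static std::size_t fatal_len = 0;

void serial_puts(const char *s)
{
    // Copy until the end of the string or until the buffer is full,
    // always leaving room for the terminating '\0'.
    while (*s != '\0' && fatal_len + 1 < sizeof(fatal_log))
        fatal_log[fatal_len++] = *s++;
    fatal_log[fatal_len] = '\0';
}

// Reading a volatile variable in the loop keeps the compiler from
// treating the endless loop as removable.
static volatile bool halted = true;

namespace __gnu_cxx {

// Replacement handler: report the problem, then stop here forever.
void __verbose_terminate_handler()
{
    serial_puts("uncaught exception\r\n");
    while (halted) { }
}

} // namespace __gnu_cxx
```

Since the handler never returns, the only part that can be tried out in isolation is serial_puts().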

Conclusions

While at first these tips might seem difficult to apply, they are not. The first two tips require a simple change in the Makefile, while the other two require adding a source file with the required code to your project. That's all.

The template project

To see how these tips can be implemented, or to immediately start writing C++ code for an LPC2138 microcontroller, download the template project here.
The features of the template project are:
Tips #3 and #4 are implemented in the syscalls.cpp file, together with the syscalls to support malloc and printf.
The line "EX= -fno-exceptions -fno-rtti" in the makefile can be commented out to enable exception and RTTI support, while the "SRC" line in the makefile allows for adding other source files.