An approach for tracking memory leaks in C code

Memory leaks are a frequent problem in C code, and they are difficult to detect and debug. Here’s our approach to tracking them down.

Context

The C language family carries an heirloom that, depending on one’s point of view, may be either a blessing or a curse: manual memory management. The C computing model is simple in that regard: when you allocate memory outside of the data stack (on the so-called heap region of process memory), you’re responsible for deallocating (freeing) it. If you forget about this allocation, your process will be unable to reuse this chunk of memory. This type of error is called a memory leak. No garbage collector is coming to save you if you run out of memory by repeatedly making this mistake.

Introductory texts for the C and C++ languages often mask how common memory leaks are. These tutorials tend to allocate and free whatever memory is needed within the same function. In real code, we allocate memory on the heap in order to communicate data from one function to another, often quite distant in time and code location. As code is layered and grown collaboratively, the original execution paths that were well tested against memory leaks become branchy, and the conventions they reflected become obscure. Real-life memory leaks happen in corner cases, as a result of code refactoring, because of misunderstandings between collaborators, and so on. Memory leaks stem from implementation complexity, even when this complexity is actively minimized.

We propose here what has been for us an efficient and simple approach to tracking memory leaks in debug code, involving data structures that are easy to (mostly) strip out of release code. We stake no claim of originality, nor even of non-triviality. Still, this approach is too valuable not to talk about, so the following text presents its motivation and implementation.

The blame game

Our approach hinges on asking the question: “Who’s responsible for freeing this memory?” At a certain point in the code, memory is allocated; it is then used by various other bits of code, and eventually it should be freed. So, from the moment memory is allocated, we have to choose a function to be responsible for freeing it: that function is the owner of the allocated memory chunk. It can be the function that performed the call to malloc. Alternatively, this ownership can be transferred to a function that is called by the allocator, or to a caller of that function. Other functions may be passed the pointer to the memory chunk, but not its ownership: they are simply users of the chunk; they read from or write to it, but they are not supposed to free it.
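
To make these roles concrete, here is a minimal sketch (the function names are ours, purely for illustration): duplicate allocates and transfers ownership to its caller, show is a mere user, and the caller becomes the owner responsible for freeing.

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

/* Allocates a copy of 's' and transfers ownership of the copy to its caller. */
static char* duplicate( const char* s )
{
    char* copy = (char*)malloc( strlen( s ) + 1 );
    if( NULL != copy )
    {
        strcpy( copy, s );
    }
    return copy;
}

/* A mere user of the chunk: reads it, never frees it. */
static void show( const char* s )
{
    printf( "%s\n", s );
}

int main( void )
{
    char* copy = duplicate( "hello" );   /* main becomes the owner */
    if( NULL != copy )
    {
        show( copy );
        free( copy );                    /* the owner's duty */
    }
    return 0;
}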

In this framework, a memory leak is a function having forgotten its owner role. Debugging such a problem reduces to figuring out the ownership transfer path between functions, in order to identify who is not doing its job. These paths are difficult to label properly during development: such labels run parallel to the code, so as soon as the two fall out of sync, they hinder leak tracking more than they help it. Instead, let us assume that the poor soul who does the debugging can trace the ownership chain just fine, as long as they know where to start. Just knowing the number of bytes being leaked, or even the addresses and lengths of the leaked chunks, does not help us much. However, knowing where in the code a leaked chunk was allocated does 80% of the work. Fortunately, C compilers provide very useful tools to help with this kind of tracking.

tl;dr – We want to blame code that allocates on the heap for not ensuring that this memory gets freed.

The allocation dictionary

First, we shall wrap our memory allocator of choice within a function of our own. This example leverages the standard malloc routine to perform the actual memory allocations, but it could be applied to any other allocator.

#ifdef MEMORY_LEAK_TRACKING
/* Bookkeeping header stored immediately before each chunk handed out by
   my_mallocFrom: the code location tag and the requested size. */
typedef struct _allocPrefix_
{
    unsigned long from;
    size_t size;
} allocPrefix;
#endif

void* my_mallocFrom( size_t size, unsigned long from )
{
#ifdef MEMORY_LEAK_TRACKING
    /* Allocate room for the prefix in front of the chunk requested by the caller. */
    allocPrefix* prefix = (allocPrefix*)malloc( size + sizeof( allocPrefix ) );
    if( NULL == prefix )
    {
        return NULL;
    }
    prefix->from = from;
    prefix->size = size;
    g_allocDict[ from % ALLOC_DICT_LEN ] += size;
    /* Hand back the address just past the prefix. */
    return prefix + 1;
#else
    return malloc( size );
#endif
}

Of course, to enable memory leak tracking, one must compile with the macro MEMORY_LEAK_TRACKING defined (either in code or through a compiler option). We use here a global array named g_allocDict, declared somewhere prior to this function:

#ifdef MEMORY_LEAK_TRACKING
#define ALLOC_DICT_LEN   (1 << 20)
unsigned long g_allocDict[ ALLOC_DICT_LEN ] = { 0 };
#endif

This is the allocation dictionary: it will track how many bytes have been allocated at any given location in the code. This location is stored just before the memory chunk actually requested by the caller of my_mallocFrom, along with the size of the allocated chunk. We also have to wrap free, the counterpart of malloc, to release allocated memory:

void my_free( void* ptr )
{
#ifdef MEMORY_LEAK_TRACKING
    allocPrefix* prefix;
    if( NULL == ptr )
    {
        return;
    }
    /* Step back to the bookkeeping prefix placed by my_mallocFrom. */
    prefix = (allocPrefix*)ptr - 1;
    g_allocDict[ prefix->from % ALLOC_DICT_LEN ] -= prefix->size;
    free( prefix );
#else
    free( ptr );
#endif
}
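
As for the compiler option mentioned above, with gcc or clang a tracking build can, for instance, be produced by defining the macro on the command line (an illustrative invocation; adapt the file list to your build system):

cc -DMEMORY_LEAK_TRACKING -g -o myprog_debug *.c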

Using the custom allocator

Now, everywhere in our code where we need to allocate memory on the heap, we replace our usage of malloc with my_mallocFrom, and free with my_free. For each such allocation, we pass a unique number as the second parameter. Of course, doing this by hand is extremely tedious, as one has to keep track of all the tags already used.

Fortunately, the compiler can automate this for us, through the use of macros. The preprocessor maintains the special macros __FILE__ and __LINE__ during compilation, which macros of our own may leverage. In this case, we want a unique number based on code location. The __FILE__ macro (a string) is of little direct use here, but __LINE__ certainly can help. Assuming we can tag each file of a project with a unique numerical identifier (trivial, or at least not as tedious as maintaining code location tags by hand), we can combine it with __LINE__ into a unique code location tag:

#define MEM_ALLOC_TAG        ( ( ( UNIQUE_FILE_ID & 0xff ) << 12 ) | ( __LINE__ & 0xfff ) )
#define my_malloc( size )    my_mallocFrom( (size), MEM_ALLOC_TAG )

Because the preprocessor expands macros at each point of use, UNIQUE_FILE_ID resolves to its most recent definition (which should appear near the top of each C file in the project) and __LINE__ to the number of the current line, every time my_malloc is encountered by the compiler.

The definition of MEM_ALLOC_TAG above goes along with that of ALLOC_DICT_LEN earlier. Our code location tagging scheme holds tags on 20 bits: the 8 higher bits are reserved for the file ID, and the 12 lower ones for the line. We thus expect the project to contain, at most, 256 files with the .c extension, each of which should have fewer than 4096 lines. Exceeding these limits won’t break the scheme, but it might result in unexpected code locations (e.g. an allocation on line 5000 would be tracked at the same location as one on line 904). The resulting allocation dictionary weighs in at 4 MB with 4-byte counters (8 MB where unsigned long is 8 bytes wide). It’s quite big, but easy to stomach for debug builds on modern computers and smartphones.
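
As a usage sketch, assuming a source file to which we have assigned the (hypothetical) identifier 7, and a function name chosen purely for illustration:

/* Near the top of, say, parser.c -- the file name and ID are illustrative. */
#define UNIQUE_FILE_ID 7

void parseSomething( void )
{
    char* buffer = (char*)my_malloc( 128 );   /* tagged as file 7, at this line number */
    if( NULL == buffer )
    {
        return;
    }
    /* ... work with buffer ... */
    my_free( buffer );
}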

Tracking memory leaks

In a perfect program that does not leak memory, all memory that was allocated is in turn freed; when the program exits, all entries of the allocation dictionary should be 0. Thus, let us check the contents of the allocation dictionary just before exiting, by invoking this function:

#include <stdio.h>   /* for printf */

void listAllocDict( void )
{
#ifdef MEMORY_LEAK_TRACKING
    unsigned long i, file_id, line;

    for( i = 0; i < ALLOC_DICT_LEN; ++i )
    {
        if( 0 < g_allocDict[ i ] )
        {
            /* Decode the code location tag: 8 bits of file ID, 12 bits of line. */
            file_id = ( i >> 12 );
            line = ( i & 0xfff );
            printf(
                    "File %03lu, line %04lu: %lu bytes\n",
                    file_id,
                    line,
                    g_allocDict[ i ]
                    );
        }
    }
#endif
}
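
As a usage sketch (assuming a conventional main), the report can be dumped right before the program returns:

int main( void )
{
    /* ... program logic that allocates with my_malloc and releases with my_free ... */

    listAllocDict();   /* any non-zero entry printed here is a leak */
    return 0;
}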

Passage of ownership

Assume we are implementing a data structure that relies on heap allocation, such as an auto-extending array, a tree, or a linked list. The allocations are hidden in the implementation. Hence, by default, the code location tags maintained in the dictionary correspond to locations in the data structure's build and maintenance code. Assume that this code is well made and does not leak memory on its own. However, if the code that uses this data structure misuses it, the memory the structure has allocated may start leaking. The tracking report given by listAllocDict will then report leaks from locations in the data structure code. If this code is only used from one other routine in the project, we can easily climb up to the misbehaving owner of the memory. However, if this is an abstract data structure used by multiple other routines, the tracking report is useless.

The key is to recognize here that, if the data structure code is known not to leak, the ownership of its allocations is effectively transferred to its user. In this case, all allocations performed within the data structure code should be tagged with the location where the data structure was initially created. Let us illustrate with an example. Consider that we are making a self-extending byte buffer:

typedef struct _elasticBuffer_
{
    void*  bytes;
    size_t size;
} elasticBuffer;

int elasticBuffer_create( elasticBuffer* eb, size_t init_size )
{
    if( init_size < 1 )
    {
        return 0;
    }
    eb->bytes = my_malloc( init_size );
    if( NULL == eb->bytes )
    {
        return 0;
    }
    eb->size = init_size;
    return 1;
}

void elasticBuffer_free( elasticBuffer* eb )
{
    if( NULL != eb->bytes )
    {
        my_free( eb->bytes );
        eb->bytes = NULL;
    }
    eb->size = 0;
}

So, if one creates one of these buffers (with elasticBuffer_create) but then fails to free it (using elasticBuffer_free), the leak will be accounted against the line in elasticBuffer_create where my_malloc is called. We should instead transfer the responsibility for this allocation to the caller of elasticBuffer_create. Let us rewrite this function:

#define elasticBuffer_create( eb, init_size ) elasticBuffer_create_from( (eb), (init_size), MEM_ALLOC_TAG )
int elasticBuffer_create_from(
        elasticBuffer* eb,
        size_t init_size,
        unsigned long from )
{
    if( init_size < 1 )
    {
        return 0;
    }
    eb->bytes = my_mallocFrom( init_size, from );
    if( NULL == eb->bytes )
    {
        return 0;
    }
    eb->size = init_size;
    return 1;
}

By replacing my_malloc with my_mallocFrom, we effectively push back the responsibility for this allocation to the caller. If said caller fails to call elasticBuffer_free to get rid of its buffer, all bytes accounted for the buffer will be blamed on that caller.
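
For instance, with a hypothetical caller such as the one below, a forgotten elasticBuffer_free is reported against the line where elasticBuffer_create is invoked, not against the buffer implementation:

/* UNIQUE_FILE_ID is assumed to be defined near the top of this file. */
void load_config( void )
{
    elasticBuffer eb;

    if( !elasticBuffer_create( &eb, 256 ) )   /* the code location tag points to this line */
    {
        return;
    }

    /* ... fill and use eb ... */

    elasticBuffer_free( &eb );   /* omitting this call would blame load_config for the leak */
}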

Let us further assume that the buffer needs to be extended beyond its initial size. In support of that task, we implement a function to reallocate the buffer:

int elasticBuffer_realloc( elasticBuffer* eb, size_t new_size )
{
    if( new_size > eb->size )
    {
        void* new_bytes = my_malloc( new_size );
        if( NULL == new_bytes )
        {
            return 0;
        }
        memcpy( new_bytes, eb->bytes, eb->size );
        my_free( eb->bytes );
        eb->bytes = new_bytes;
        eb->size = new_size;
    }

    return 1;
}

Here again, if the buffer is not freed using elasticBuffer_free after such a reallocation has happened, the leak will be blamed on the my_malloc line inside elasticBuffer_realloc. To avoid this, the self-extending buffer can carry the original caller's code location with it: thereby, all bytes ever allocated for this buffer will be blamed on that bit of code. Here is the rewrite of the relevant bits:

typedef struct _elasticBuffer_
{
    void*         bytes;
    size_t        size;
    unsigned long from;
} elasticBuffer;

#define elasticBuffer_create( eb, init_size ) elasticBuffer_create_from( (eb), (init_size), MEM_ALLOC_TAG )
int elasticBuffer_create_from(
        elasticBuffer* eb,
        size_t init_size,
        unsigned long from )
{
    if( init_size < 1 )
    {
        return 0;
    }
    eb->bytes = my_mallocFrom( init_size, from );
    if( NULL == eb->bytes )
    {
        return 0;
    }
    eb->size = init_size;
    eb->from = from;
    return 1;
}

int elasticBuffer_realloc( elasticBuffer* eb, size_t new_size )
{
    if( new_size > eb->size )
    {
        void* new_bytes = my_mallocFrom( new_size, eb->from );
        if( NULL == new_bytes )
        {
            return 0;
        }
        memcpy( new_bytes, eb->bytes, eb->size );
        my_free( eb->bytes );
        eb->bytes = new_bytes;
        eb->size = new_size;
    }

    return 1;
}

Limitations and caveats

The main caveat of this approach is that it puts a small burden even on code that has been compiled without the allocation dictionary (MEMORY_LEAK_TRACKING undefined). In this case, the code location tag remains in the parameter list of functions that forward it; it also remains in structures that must carry it across their lifetime, like the elasticBuffer sketched above. Passing one more word-sized parameter is not an enormous overhead, except maybe in the innermost loop of processes whose performance is tightly CPU-bound. Rigorous profiling is obviously required before pointing the finger at this overhead as the cause of slow code.

The other limitation, with respect to dedicated tools such as valgrind, is that only one’s own code can be effectively audited. Binary libraries against which the program is linked are not compiled with our allocator. Open source libraries can be recompiled, but they must be patched to use the custom allocator, and thereby they fall under our maintenance responsibility (ick!).

Exercises for the reader

  1. Standard C offers the realloc function as an alternative/supplement to malloc. This function may either extend a previously allocated chunk of memory or allocate a new one entirely to accommodate the required size. Should this function be wrapped, as we did with malloc? Against which code location tag should its allocations be accounted?
  2. The allocation dictionary is both very large and very sparse: even if there are many locations in the code where heap allocation is used, less than 1% of the 4-MB dictionary defined above will be used. Derive another dictionary addressing scheme that allows a dictionary to fit in 64 KB (so 2^14 word elements). Likely, code analysis scripts will be needed to map tags to code locations from the final dictionary listing.
  3. Memory leaks are even more pernicious in multi-threaded programs, in which certain threads allocate memory that is used and freed in other threads. Modify the code to prevent the allocation dictionary from being corrupted by race conditions.
  4. Implement macros that would completely rid the program of any memory leak tracking code when macro MEMORY_LEAK_TRACKING is undefined. Minimize boilerplate.