Memory leaks are a frequent problem in C code that are difficult to detect and debug. Here’s our approach to figure them out.
Context
The C language family carries a heirloom that, depending on one’s point of view, may be either a blessing or a curse: manual memory management. The C computing model is simple in that regard: when you allocate memory outside of the data stack (on the so-called heap region of process memory), you’re responsible for deallocating (freeing) it. If you forget about this allocation, your process will be unable to reuse this chunk of memory. This type of error is called a memory leak. No garbage collector is coming to save you is you run out of memory by repeatedly making this mistake.
Introductory texts for the C and C++ languages often mask how common memory leaks are. These tutorials often allocate and free whatever memory is needed in the same function. In real code, we allocate memory on the heap in order to communicate data from some function to another one, quite distant in time and code location. As code is layered and grown in a collaboration context, the original execution paths that were well tested against memory leaks become branchy, the conventions they mirror become obscure. Real-life memory leaks happen in corner cases, as a result of code refactoring, because of misunderstandings between collaborators, or so on. Memory leaks stem of implementation complexity, even when this complexity is actively minimized.
We propose here what has been for us an efficient and simple approach to tracking memory leaks in debug code, involving data structures that are easy to (mostly) strip off from release code. We stake no claim of originality nor even non-triviality. Still, this approach is too valuable not to talk about it, so the following text presents its motivation and implementation.
The blame game
Our approach hangs on asking the question: “Who’s responsible for freeing this
memory?” From a certain point in the code, memory is allocated: it is then
used by various other bits, and then should be freed. So, from the moment
memory is allocated, we have to choose a function to be responsible for
freeing it: it is the owner of the allocated memory chunk. It can be the
function that’s performed the call to malloc
. Alternatively, this ownership
can be transferred to a function that is called by the allocator, or a caller
of that function. Other functions may be passed the pointer to the memory
chunk, but not its ownership: they are simply users of the chunk, they
either read from or write on it, but they’re not supposed to free it.
In this framework, a memory leak is a function having forgotten its owner role. Debugging such a problem reduces to figuring out the ownership transfer path between functions, in order to identify who is not doing its job. These paths are difficult to label properly during development. Such labels tend to be parallel to code, so as soon as the two become out of sync, they hinder the leak tracking more than they help it. Instead, let us assume that the poor sob who does the debugging should be able to trace the ownership chain just fine as long as they know where to start. Just knowing the number of bytes being leaked, or even the addresses and lengths of the leaked chunks, don’t help us a lot. However, knowing where in the code that chunk was allocated does 80% of the work. Fortunately, C compilers provide very useful tools to help with this kind of tracking.
tl;dr – We want to blame code that allocates on the heap for not ensuring that memory to get freed.
The allocation dictionary
First, we shall wrap our memory allocator of choice within a function of our
own. This example leverages the standard malloc
routine to perform the
actual memory allocations, but it could be applied to any other
allocator.
#ifdef MEMORY_LEAK_TRACKING
typedef struct _allocPrefix_
{
unsigned long from;
size_t size;
} allocPrefix;
#endif
void* my_mallocFrom( size_t size, unsigned long from )
{
#ifdef MEMORY_LEAK_TRACKING
allocPrefix* prefix = (allocPrefix*)malloc( size + sizeof( allocPrefix ) );
if( NULL == prefix )
{
return NULL;
}
prefix->from = from;
prefix->size = size;
g_allocDict[ from % ALLOC_DICT_LEN ] += size;
return prefix + 1;
#else
return malloc( size );
#endif
}
Of course, to enable memory leak tracking, one must compile with the macro
MEMORY_LEAK_TRACKING
defined (either in code or through a compiler option).
We use here a global array named g_allocDict
, declared somewhere prior to
this function:
#ifdef MEMORY_LEAK_TRACKING
#define ALLOC_DICT_LEN (1 << 20);
unsigned long g_allocDict[ ALLOC_DICT_LEN ] = { 0 };
#endif
This is the allocation dictionary: it will track how many bytes have been
allocated at any given location in the code. This location is stored prior to
the memory chunk actually requested by the caller of my_mallocFrom
, along
with the size of the allocated chunk. We also have to wrap the reciprocal to
malloc
to free allocated memory:
void my_free( void* ptr )
{
#ifdef MEMORY_LEAK_TRACKING
allocPrefix* prefix = (allocPrefix*)( ptr - sizeof( allocPrefix ) );
g_allocDict[ prefix->from % ALLOC_DICT_LEN ] -= prefix->size;
free( prefix );
#else
free( ptr );
#endif
}
Using the custom allocator
Now, everywhere in our code where we need to allocate memory on the heap, we
replace our usage of malloc
with my_mallocFrom
, as well as free
with
my_free
. For each such allocation, we put up a unique number as second
parameter. Of course, doing this by hand is super tedious, as one has to keep
track of all already used tags.
Fortunately, the compiler can automate this for us, through the use of macros.
The compiler maintains special macros __FILE__
and __LINE__
during
compilation, which other macros of our own may leverage. In this case, we want
a unique number based on code location. The __FILE__
macro is of no use, but
certainly __LINE__
can help. Assuming we could tag each file of a project
with a unique numerical identifier (trivial, or at least, not as tedious as
maintaining code location tags), we could combine it with __LINE__
in a
unique code location tag:
#define MEM_ALLOC_TAG ( ( ( UNIQUE_FILE_ID & 0xff ) << 12 ) | ( __LINE__ & 0xfff ) )
#define my_malloc( size ) my_mallocFrom( (size), MEM_ALLOC_TAG )
Thanks to the late evaluation of macros by the C compiler, UNIQUE_FILE_ID
resolves to its most recent definition (which should be somewhere at the
beginning of each file of C code in the project) and __LINE__
to the number
of the current line, each time my_malloc
is encountered by the compiler.
The definition of MEM_ALLOC_TAG
above goes along that of ALLOC_DICT_LEN
earlier. My code location tagging scheme holds tags on 20 bits: the 8 higher
bits are reserved for the file ID, and the 12 lower ones, the line. I thus
expect my project to contain, at most, 256 files with the .c
extension, each
of which should have less than 4096 lines. Exceeding these limits won’t break
the scheme, but might result in unexpected code locations (e.g. allocating on
line number 5000 would track it at the same location as line 904). The
resulting allocation dictionary holds on 4~MB. It’s quite big, but easy to
stomach for debug builds on modern computers and smartphones.
Tracking memory leaks
In a perfect program that does not leak memory, all memory that was allocated is in turn freed; when the program exits, all entries of the allocation dictionary should be 0. Thus, let us check the contents of the allocation dictionary just before exiting, by invoking this function:
void listAllocDict()
{
#ifdef MEMORY_LEAK_TRACKING
unsigned long i, file_id, line;
for( i = 0; i < ALLOC_DICT_LEN; ++i )
{
if( 0 < g_allocDict[ i ] )
{
file_id = ( i >> 12 );
line = ( i & 0xfff );
printf(
"File %03u, line %03u: %lu bytes\n",
file_id,
line,
g_allocDict[ i ]
);
}
}
#endif
}
Passage of ownership
Assume we are implementing a data structure that relies on heap allocation,
such as an auto-extending array or tree or linked list. The allocations are
hidden in the implementation. Hence, by default, the code location tags
maintained in the dictionary correspond to the locations in the data structure
build and maintenance code. So, assume that this code is well made and does
not leak memory. However, if the code that uses this data structure misuses
it, it may start leaking some memory it has allocated. Hence, the tracking
report given by listAllocDict
will report leaks from locations in the data
structure code. If this code is only used from one other routine in the
project, we can easily climb up to the misbehaving owner of the memory.
However, if this is an abstract data structure used by multiple other
routines, the tracking report is useless.
The key is to recognize here that, if the data structure code is known not to leak, the ownership of its allocations is effectively transfered to its user. In this case, all allocations performed within the data structure code should forward the location of the initial creation of the data structure. Let us illustrate with an example. Consider that we are making a self-extending byte buffer:
typedef struct _elasticBuffer_
{
void* bytes;
size_t size;
} elasticBuffer;
int elasticBuffer_create( elasticBuffer* eb, size_t init_size )
{
if( init_size < 1 )
{
return 0;
}
eb->bytes = my_malloc( init_size );
if( NULL == eb->bytes )
{
return 0;
}
eb->bytes = init_size;
return 1;
}
void elasticBuffer_free( elasticBuffer* eb )
{
if( NULL != eb->bytes )
{
free( eb->bytes );
eb->bytes = NULL;
}
eb->size = 0;
}
So, if one creates one of these buffers (with elasticBuffer_create
) but then
fails to free it (using elasticBuffer_free
), the leak will be accounted for
the line in elasticBuffer_create
where my_malloc
is called. We should
instead transfer the responsibility for this allocation to the caller of
elasticBuffer_create
. Let us rewrite this function:
#define elasticBuffer_create( eb, init_size ) elasticBuffer_create_from( (eb), (init_size), MEM_ALLOC_TAG )
int elasticBuffer_create_from(
elasticBuffer* eb,
size_t init_size,
unsigned long from )
{
if( init_size < 1 )
{
return 0;
}
eb->bytes = my_mallocFrom( init_size, from );
if( NULL == eb->bytes )
{
return 0;
}
eb->bytes = init_size;
return 1;
}
By replacing my_malloc
with my_mallocFrom
, we effectively push back the
responsibility for this allocation to the caller. If said caller fails to call
elasticBuffer_free
to get rid of its buffer, all bytes accounted for the
buffer will be blamed on that caller.
Let us further assume that the buffer needs to be extended beyond its initial size. In support of that task, we implement a function to reallocate the buffer:
int elasticBuffer_realloc( elasticBuffer* eb, size_t new_size )
{
if( new_size > eb->size )
{
void* new_bytes = my_malloc( new_size );
if( NULL == new_bytes )
{
return 0;
}
memcpy( new_bytes, eb->bytes, eb->size );
my_free( eb->bytes );
eb->bytes = new_bytes;
eb->size = new_size;
}
return 1;
}
Here again, if the buffer is not freed using elasticBuffer_free
after such a
reallocation has happened, the leak will be blamed on function
elasticBuffer_realloc
. To avoid this, the self-extending buffer could carry
with itself the original caller code location: thereby, all bytes ever
allocated for this buffer will be blamed for that bit of code. Here is the
rewrite of the relevant bits:
typedef struct _elasticBuffer_
{
void* bytes;
size_t size;
unsigned long from;
} elasticBuffer;
#define elasticBuffer_create( eb, init_size ) elasticBuffer_create_from( (eb), (init_size), MEM_ALLOC_TAG )
int elasticBuffer_create_from(
elasticBuffer* eb,
size_t init_size,
unsigned long from )
{
if( init_size < 1 )
{
return 0;
}
eb->bytes = my_mallocFrom( init_size, from );
if( NULL == eb->bytes )
{
return 0;
}
eb->bytes = init_size;
eb->from = from;
return 1;
}
int elasticBuffer_realloc( elasticBuffer* eb, size_t new_size )
{
if( new_size > eb->size )
{
void* new_bytes = my_mallocFrom( new_size, eb->from );
if( NULL == new_bytes )
{
return 0;
}
memcpy( new_bytes, eb->bytes, eb->size );
my_free( eb->bytes );
eb->bytes = new_bytes;
eb->size = new_size;
}
return 1;
}
Limitations and caveats
The main caveat on this approach is that it puts a small burden even on code
that has been compiled without the allocation dictionary
(MEMORY_LEAK_TRACKING
undefined). In this case, the code location tag
remains in the parameter list of functions that forward it; it also remains in
structures that must carry it across their lifetime, like the elasticBuffer
sketched above. Passing one more word-size parameter is not an enormous
overhead, except maybe in the innermost loop of processes for which
performance is tightly CPU-bound. Rigorous profiling is obviously required
before pointing the finger at this overhead as the cause of slow code.
The other limitation with respect to using special libraries such as
valgrind
is that only one’s own code can be
effectively audited. Binary libraries against which the program is linked are
not compiled with one’s own allocator. Open source libraries can be
recompiled, but they must be patched to use the custom allocator, and thereby
they become under our maintenance responsibility (ick!).
Exercises for the reader
- Standard C offers the
realloc
function as an alternative/supplement tomalloc
. This function may either extend a previously allocated chunk of memory or allocate a new one entirely to accomodate the required size. Should this function be wrapped, like we didmalloc
? What code location tag should it be accounted against? - The allocation dictionary is both very large and very sparse: even if there are many locations in the code where heap allocation is used, less than 1% of the 4-MB dictionary defined above will be used. Derive another dictionary addressing scheme that allows a dictionary to fit in 64 KB (so 2^14 word elements). Likely, code analysis scripts will be needed to map tags to code locations from the final dictionary listing.
- Memory leaks are even more pernicious when implementing multi-threaded programs, in which certain threads allocate memory to be used and freed in other threads. Modify the code to avoid the allocation dictionary from being corrupted by race conditions.
- Implement macros that would completely rid the program of any memory leak
tracking code when macro
MEMORY_LEAK_TRACKING
is undefined. Minimize boilerplate.