Saturday, February 12, 2005


default malloc debugging capabilities in GNU libc

Notes from the GNU libc manual.

Memory checking in a program can be enabled automatically or manually. The former is done by setting the environment variable MALLOC_CHECK_:
MALLOC_CHECK_=1 my_prog

This mechanism can catch a fair number of boundary overflows and, in some cases, keep the program from crashing. The action taken when a fault is detected depends on the value of MALLOC_CHECK_: 1 prints a warning message to stderr but does not abort the program; 2 aborts the program without any output; and 3 combines the effects of 1 and 2. With MALLOC_CHECK_=3 you should get error messages on stderr and, if core dumps are enabled, a core file.

From the glibc docs:

You can ask malloc to check the consistency of dynamic storage by using the mcheck function. This function is a GNU extension, declared in `mcheck.h'.

Function: int mcheck (void (*abortfn) (enum mcheck_status status))

Calling mcheck tells malloc to perform occasional consistency checks. These will catch things such as writing past the end of a block that was allocated with malloc. The abortfn argument is the function to call when an inconsistency is found. If you supply a null pointer, then mcheck uses a default function which prints a message and calls abort (see section Aborting a Program). The function you supply is called with one argument, which says what sort of inconsistency was detected; its type is described below.

It is too late to begin allocation checking once you have allocated anything with malloc. So mcheck does nothing in that case. The function returns -1 if you call it too late, and 0 otherwise (when it is successful).

The easiest way to arrange to call mcheck early enough is to use the option -lmcheck when you link your program; then you don't need to modify your program source at all. Alternately you might use a debugger to insert a call to mcheck whenever the program is started, for example these gdb commands will automatically call mcheck whenever the program starts:
(gdb) break main
Breakpoint 1, main (argc=2, argv=0xbffff964) at whatever.c:10
(gdb) command 1
Type commands for when breakpoint 1 is hit, one per line.
End with a line saying just "end".
>call mcheck(0)
(gdb) ...

Another possibility to check for and guard against bugs in the use of malloc, realloc and free is to set the environment variable MALLOC_CHECK_. When MALLOC_CHECK_ is set, a special (less efficient) implementation is used which is designed to be tolerant against simple errors, such as double calls of free with the same argument, or overruns of a single byte (off-by-one bugs). Not all such errors can be protected against, however, and memory leaks can result. If MALLOC_CHECK_ is set to 0, any detected heap corruption is silently ignored; if set to 1, a diagnostic is printed on stderr; if set to 2, abort is called immediately. This can be useful because otherwise a crash may happen much later, and the true cause for the problem is then very hard to track down.

So, what's the difference between using MALLOC_CHECK_ and linking with -lmcheck? MALLOC_CHECK_ is orthogonal to -lmcheck. -lmcheck was added for backward compatibility. Both MALLOC_CHECK_ and -lmcheck should uncover the same bugs, but with MALLOC_CHECK_ you don't need to recompile or relink your application.

If you want to check the whole heap and not only one block, you can call mcheck_check_all() to walk through all the active blocks. You can also tell the memory management routines to run the equivalent of mcheck_check_all() on every call, instead of checking only the current block, by initializing with mcheck_pedantic() instead of mcheck(). Be aware, though, that this approach is rather time-consuming.

