Introduction
This article explains what a memory leak is and how to detect one using Valgrind.
What is a memory leak?
A process leaks memory when it dynamically allocates memory at run time and then loses track of the allocation, so the memory is never used again and never freed. This memory becomes a wasted resource: the kernel cannot give it to any other process, because it still belongs to the process that allocated it. The OS reclaims the memory only when that process exits.
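As a minimal sketch of the definition above, the two functions below do the same work with a heap-allocated scratch buffer. The function names are hypothetical and chosen only for illustration. The first returns without releasing the buffer, so every call leaks n * sizeof(int) bytes; the second frees it before returning.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical example: sums 0^2 .. (n-1)^2 using a heap scratch buffer.
 * LEAKY: the only pointer to the buffer is lost when the function returns. */
long sum_squares_leaky(size_t n)
{
    int *scratch;
    long total = 0;

    assert(n > 0);
    scratch = malloc(n * sizeof *scratch);
    if (scratch == NULL)
        return -1;
    for (size_t i = 0; i < n; i++) {
        scratch[i] = (int)(i * i);
        total += scratch[i];
    }
    return total;   /* scratch goes out of scope here: the block is leaked */
}

/* Same computation, but the buffer is released before returning. */
long sum_squares_fixed(size_t n)
{
    int *scratch;
    long total = 0;

    assert(n > 0);
    scratch = malloc(n * sizeof *scratch);
    if (scratch == NULL)
        return -1;
    for (size_t i = 0; i < n; i++) {
        scratch[i] = (int)(i * i);
        total += scratch[i];
    }
    free(scratch);  /* the allocation is returned to the allocator */
    return total;
}
```

In a long-running process that calls the leaky variant repeatedly, the lost blocks accumulate until the process exits.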
What is Valgrind?
Valgrind is an award-winning instrumentation framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect many memory management and threading bugs, and profile your programs in detail.
The Valgrind tool suite provides a number of debugging and profiling tools. The most popular is Memcheck, a memory checking tool which can detect many common memory errors such as:
* Touching memory you shouldn't (e.g. overrunning heap block boundaries, reading/writing freed memory, or reading/writing inappropriate areas on the stack).
* Using values before they have been initialized.
* Incorrect freeing of memory, such as double-freeing heap blocks.
* Memory leaks - where pointers to malloc'd blocks are lost forever.
* Mismatched use of malloc/new/new[] vs free/delete/delete[].
* Overlapping src and dst pointers in memcpy() and related functions.
Memcheck is only one of the tools in the Valgrind suite. Other tools included in the tool suite are:
- Cachegrind: a profiling tool which produces detailed data on cache (miss) and branch (misprediction) events.
- Callgrind: a profiling tool that shows cost relationships across function calls, optionally with cache simulation similar to Cachegrind.
- Massif: a space profiling tool. It allows you to explore in detail which parts of your program allocate memory.
- Helgrind: a debugging tool for threaded programs. Helgrind looks for various kinds of synchronisation errors in code that uses the POSIX PThreads API.
We are only going to look at Memcheck in this article, as it is the most important and commonly used tool.
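To make one of the Memcheck error categories listed earlier concrete, here is a small sketch of the "overlapping src and dst" case. The helper name drop_prefix is hypothetical. Shifting a string towards its own start copies between regions that may overlap, which memcpy() does not allow; memmove() is specified to handle overlap.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: removes the first `count` bytes of a NUL-terminated
 * string in place by shifting the remainder to the front. */
void drop_prefix(char *s, size_t count)
{
    size_t len = strlen(s);

    assert(count <= len);
    /* The source (s + count) and destination (s) regions may overlap.
     * Using memcpy() here would be undefined behaviour for overlapping
     * regions; memmove() is safe. The +1 also moves the terminating NUL. */
    memmove(s, s + count, len - count + 1);
}
```

For example, calling drop_prefix() with a writable buffer containing "Hello world." and a count of 6 leaves "world." in the buffer.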
Common flags
Flag | Definition |
--track-fds=<yes|no> | Print a list of open file descriptors on exit. |
--time-stamp=<yes|no> | Precede each message with an indication of the elapsed wallclock time. |
--leak-check=<no|summary|yes|full> [default: summary] | When enabled, search for memory leaks when the client program finishes. A memory leak means a malloc’d block, which has not yet been free’d, but to which no pointer can be found. Such a block can never be free’d by the program, since no pointer to it exists. If set to summary, it says how many leaks occurred. If set to full or yes, it gives details of each individual leak. |
--log-file=<filename> | Specifies the file that Valgrind should send all of its messages to. |
When should (and shouldn't) memcheck be used?
Memcheck should be used whenever you suspect that a process is leaking memory. These are typically processes that run for extended periods, like most daemons, or some desktop applications, like web browsers. Processes leaking memory have the following symptoms:
* Show a gradual increase in memory usage, when you monitor them in a utility like top (sort by memory usage).
* Have out-of-memory errors. Processes differ in how they handle these errors: some exit immediately, others log a message and continue working.
* Trigger the OOM killer on the host.
Leaking memory will usually not cause the process to:
- Occupy 100% of the CPU cycles.
- Corrupt memory or existing data.
How does memcheck work?
- Install the debuginfo rpms for the program to be memchecked. For example, for the snmpd daemon, you need to install net-snmp-debuginfo.
- Install the debuginfo rpms for the dynamically linked libraries. To do this, find the libraries using ldd, then find the rpms which provide them:
ldd /usr/sbin/snmpd | cut -f 3 -d " " | xargs rpm -qf | sort | uniq
Then install the respective debuginfo rpms for the rpms listed.
- Run the program under valgrind:
valgrind -v --leak-check=full --show-reachable=yes --log-file=snmpd.memchk /usr/sbin/snmpd -f -Lsd -Lf /dev/null -p /var/run/snmpd.pid -a
Understanding memcheck's log
Memcheck only really detects two kinds of errors: use of illegal addresses, and use of undefined values. Nevertheless, this is enough to help you discover all sorts of memory-management problems in code. Most of the log consists of records of individual instances of these errors. Understanding them requires knowledge of the C language and of how stack traces work, which is beyond the scope of this article. However, it's easy to understand the leak summary that Memcheck prints at the end of the log. Here's a sample:
==3313== LEAK SUMMARY:
==3313== definitely lost: 60,698 bytes in 479 blocks.
==3313== indirectly lost: 2,482 bytes in 25 blocks.
==3313== possibly lost: 3,608 bytes in 37 blocks.
==3313== still reachable: 1,422,154 bytes in 23,944 blocks.
==3313== suppressed: 0 bytes in 0 blocks.
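The categories in this summary can be illustrated with a linked list. The sketch below uses hypothetical names (struct node, list_build, list_free, list_length) and is not taken from any real program. If the caller drops the head pointer without calling list_free(), Memcheck would be expected to report the head node as definitely lost (no pointer to it remains) and the later nodes as indirectly lost (they are only referenced from a lost block). A block whose pointer is still held at exit, for example in a global variable, is reported as still reachable.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    struct node *next;
    int value;
};

/* Frees every node in the list. Skipping this call and then dropping the
 * head pointer is what produces "definitely lost" (the head node) and
 * "indirectly lost" (all later nodes) entries in the leak summary. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}

/* Builds a list of n heap-allocated nodes; returns NULL if an allocation
 * fails (nodes allocated so far are released first). */
struct node *list_build(int n)
{
    struct node *head = NULL;

    assert(n > 0);
    for (int i = 0; i < n; i++) {
        struct node *nd = malloc(sizeof *nd);
        if (nd == NULL) {
            list_free(head);
            return NULL;
        }
        nd->value = i;
        nd->next = head;    /* the new node becomes the head of the list */
        head = nd;
    }
    return head;
}

/* Counts the nodes reachable from head. */
int list_length(const struct node *head)
{
    int len = 0;

    for (; head != NULL; head = head->next)
        len++;
    return len;
}
```

Fixing the code that lost the head pointer normally clears the indirectly lost blocks as well, since they hang off the definitely lost one.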
If the summary shows that some memory was 'definitely lost', the case should be escalated to SEG with the reproducer information and memcheck log.
If the summary shows no (0) bytes definitely lost, but there is some indirect or possible memory loss, you should collect additional data that proves memory loss. This can include snapshots over time of:
- /proc/[number]/statm - provides information about the memory status of a process, measured in pages ([number] is the process ID).
- /proc/meminfo - reports the amount of free and used memory (both physical and swap) on the system.
- Out of memory messages from the process
- log files - syslog, debug log for the process, etc.
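Snapshots of statm can also be taken from inside a program. The following is a rough sketch, assuming a Linux system where the first two fields of a statm line are the total program size and the resident set size, both in pages. The function names parse_statm and read_own_statm are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: extracts the first two fields of a statm line
 * (total program size and resident set size, both in pages).
 * Returns 0 on success, -1 if the line cannot be parsed. */
int parse_statm(const char *line, long *size_pages, long *resident_pages)
{
    assert(line != NULL && size_pages != NULL && resident_pages != NULL);
    return sscanf(line, "%ld %ld", size_pages, resident_pages) == 2 ? 0 : -1;
}

/* Reads /proc/self/statm for the calling process.
 * Returns 0 on success, -1 if the file cannot be read or parsed. */
int read_own_statm(long *size_pages, long *resident_pages)
{
    char line[128];
    FILE *fp = fopen("/proc/self/statm", "r");

    if (fp == NULL)
        return -1;
    if (fgets(line, sizeof line, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return parse_statm(line, size_pages, resident_pages);
}
```

Logging the resident value at regular intervals gives a series of snapshots; a value that keeps growing under a steady workload supports the suspicion of a leak.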
Examples
Here's a simple C program that leaks memory:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main () {
    //The variables
    char c;
    char * str = "Hello world.";
    char * str1 = "How are you doing?";

    // The pointers
    char * buffer;

    // Assign values
    c = 'a';
    buffer = &c;

    // print!
    printf ("Variables:\n");
    printf ("c: %c\n", c);
    printf ("\n");

    printf ("Pointers:\n");
    printf ("str: %s\n", str);
    printf ("buffer(%p): %c\n", buffer, *buffer);
    printf ("\n");

    //repoint the pointers
    printf ("Repointing the pointers\n");
    buffer = str;
    printf ("buffer(%p): %s\n", buffer, buffer);
    printf ("\n");

    //Allocate memory
    printf ("Allocating memory\n");
    buffer = malloc (1024);
    if (buffer == NULL) {
        printf ("Could not allocate memory, aborting!!");
        abort ();
    }

    printf ("Copying stuff to memory\n");
    memset(buffer, 0, 1024);
    strncpy (buffer, str1, strlen(str1));
    printf ("buffer(%p): %s\n", buffer, buffer);
    printf ("\n");

    // free(buffer);
    //Memory leak! Uncomment the free call above to fix.
    printf ("resetting buffer:\n");
    buffer = "good bye world!";
    printf ("buffer(%p): %s\n", buffer, buffer);
    printf ("\n");

    exit (0);
}
Compile the program:
$ gcc -g -o example example.c
Run it under valgrind:
valgrind -v --leak-check=full --show-reachable=yes --log-file=example.memchk ./example
The contents of example.memchk:
==9152== Memcheck, a memory error detector.
==9152== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==9152== Using LibVEX rev 1804, a library for dynamic binary translation.
==9152== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==9152== Using valgrind-3.3.0, a dynamic binary instrumentation framework.
==9152== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==9152== For more details, rerun with: -v
==9152==
==9152== My PID = 9152, parent PID = 3263. Prog and args are:
==9152== ./example
==9152==
==9152==
==9152== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 1)
==9152== malloc/free: in use at exit: 1,024 bytes in 1 blocks.
==9152== malloc/free: 1 allocs, 0 frees, 1,024 bytes allocated.
==9152== For counts of detected errors, rerun with: -v
==9152== searching for pointers to 1 not-freed blocks.
==9152== checked 65,104 bytes.
==9152==
==9152== 1,024 bytes in 1 blocks are definitely lost in loss record 1 of 1
==9152== at 0x4A0739E: malloc (vg_replace_malloc.c:207)
==9152== by 0x4007E4: main (example.c:36)
==9152==
==9152== LEAK SUMMARY:
==9152== definitely lost: 1,024 bytes in 1 blocks.
==9152== possibly lost: 0 bytes in 0 blocks.
==9152== still reachable: 0 bytes in 0 blocks.
==9152== suppressed: 0 bytes in 0 blocks.
The interesting bits are in the LEAK SUMMARY, which shows a definite memory leak. It's relatively easy to see from the stack trace just above it that the leaked memory was allocated by the malloc call on line 36 of example.c.
Additional Resources
Quick start: http://valgrind.org/docs/manual/QuickStart.html
User manual: http://valgrind.org/docs/manual/manual.html
Memcheck manual: http://valgrind.org/docs/manual/mc-manual.html
I ran an LnL session on this topic in the BNE office. The presentation slides used during the talk are attached (in case anyone is interested in giving the same talk in their office).
Real world examples
IT #274215: Memory leak occurs when snmpd is used.