How to use Ghostscript with valgrind

Dumb usage

Valgrind can be used directly on pretty much any binary, but it is prone to false positives, and the backtraces it gives may not be hugely helpful. It is nonetheless a good first thing to try if you have a command line that appears to be misbehaving. Simply do:

valgrind --track-origins=yes <command line that was failing>

so, for example:

valgrind --track-origins=yes bin/gs -sDEVICE=ppmraw -r72 -Z: -o out.ppm examples/tiger.eps

Problems with dumb usage

clist false positives

The clist works by having a buffer that is filled up with serialised commands, and written out to a file. These commands are padded to a certain size to fit within the clist, resulting in uninitialised padding bytes. At the point at which the clist writes these to a file, valgrind is forced to assume that the values in these bytes are significant, and so flags them as errors.

One solution to this would be to initialise the buffer by setting it all to known values at the start, but this would a) take extra time, and b) mask real problems.

The best solution is to use memory clists rather than file ones. The bytes are written into the memory list, and valgrind carries the defined/undefined status of each byte over with it. When the bytes are then played back, valgrind can correctly report any use of undefined values.
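
To make the padding problem concrete, here is a minimal C sketch. It is not Ghostscript's clist code: CMD_SLOT, put_cmd and store_in_memory are hypothetical names invented for illustration. The writer fills in only the opcode and payload of a fixed-size slot, so the trailing bytes stay uninitialised.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { CMD_SLOT = 8 }; /* hypothetical fixed slot size, in bytes */

/* Serialise one command into a slot. Only the opcode and payload are
 * written; the trailing padding is left untouched, so it stays
 * uninitialised. Returns the number of padding bytes. */
static size_t put_cmd(unsigned char *slot, unsigned char op,
                      const unsigned char *payload, size_t len)
{
    assert(len < CMD_SLOT); /* opcode + payload must fit in one slot */
    slot[0] = op;
    memcpy(slot + 1, payload, len);
    return CMD_SLOT - 1 - len;
}

/* Memory-backed list: memcpy moves the bytes without inspecting them, so
 * memcheck carries each byte's defined/undefined state over to the copy. */
static void store_in_memory(unsigned char *list, size_t index,
                            const unsigned char *slot)
{
    memcpy(list + index * CMD_SLOT, slot, CMD_SLOT);
}
```

Writing the same slot to a file (with fwrite, say) would instead hand the padding bytes to the write system call, which is the point at which memcheck has to complain, typically with a "Syscall param write(buf) points to uninitialised byte(s)" report.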

SSE false positives

The SSE code (for halftoning etc) writes bytes into 'wide aligned' buffers (say 32 bytes wide, for the sake of argument), and then operates on them to map bytes down to bits. If we are operating on a region that is not a multiple of 32 pixels wide, we have a suffix of bytes that are left uninitialised. When the SSE code loads all 32 bytes at once and operates on them, valgrind complains that it is working on uninitialised values.

To fix this, we have extra code in the system to ensure that the 32-byte buffers are labelled as being defined. This code is triggered by building with the define PACIFY_VALGRIND.
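
As a rough illustration of the idea (not the actual Ghostscript code), the sketch below rounds a row up to a 32-byte multiple and marks the unused tail as defined. It assumes the labelling is done with memcheck's VALGRIND_MAKE_MEM_DEFINED client request; the text above does not say which mechanism Ghostscript uses. WIDE_ALIGN, MARK_DEFINED, padded_width and pacify_tail are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#ifdef PACIFY_VALGRIND
#include <valgrind/memcheck.h>
/* Tell memcheck to treat n bytes at p as initialised. */
#define MARK_DEFINED(p, n) ((void)VALGRIND_MAKE_MEM_DEFINED((p), (n)))
#else
/* Without the define this compiles away to nothing. */
#define MARK_DEFINED(p, n) ((void)(p), (void)(n))
#endif

enum { WIDE_ALIGN = 32 }; /* bytes handled per wide load in this sketch */

/* Round a row width up to the next multiple of WIDE_ALIGN. */
static size_t padded_width(size_t width)
{
    return (width + WIDE_ALIGN - 1) / WIDE_ALIGN * WIDE_ALIGN;
}

/* Mark the unused tail of a padded row as defined, so that a full-width
 * load which includes it is not reported. Returns the tail length. */
static size_t pacify_tail(unsigned char *row, size_t width)
{
    size_t tail = padded_width(width) - width;
    assert(row != NULL);
    MARK_DEFINED(row + width, tail);
    return tail;
}
```

Marking the tail as defined changes only what memcheck believes about those bytes; it does not write to them, so it costs nothing in a build without the define.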

Memory manager blocks

GS uses a series of memory managers internally. Rather than every block going to the system memory allocator, blocks are typically allocated through a gs memory manager, either for the purposes of object type tracking or for allocating within a chunk. This means that "end user visible" memory objects are typically part of larger system blocks.

Valgrind can spot over/underruns of system blocks, but without specific help it cannot spot over/underruns of such wrapped blocks. For example, if we underrun a wrapped block, we run into the extra bytes put there by the wrapping. That is enough to crash the program at some stage, but not enough to trip valgrind's detection.

To improve this, we can add some extra code into the compiled program that calls Valgrind to tell it that the wrapped blocks should be distrusted/trusted as we run. This code is triggered by building with the define HAVE_VALGRIND (as to do it all the time would fail on systems that don't have valgrind installed).

Not all allocators have currently been updated to work with this. Memento has, but other allocators may need to be looked at in future. This problem will not cause false positives, but might cause valgrind to miss things that it shouldn't.
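
The trust/distrust calls described above might look something like the following sketch. This is a toy wrapping allocator, not one of the gs memory managers: wrap_header, wrap_alloc, wrap_size and wrap_free are hypothetical names. The VALGRIND_MAKE_MEM_NOACCESS and VALGRIND_MAKE_MEM_DEFINED client requests are real memcheck macros, and are stubbed out when HAVE_VALGRIND is not defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifdef HAVE_VALGRIND
#include <valgrind/memcheck.h>
#else
/* No-op fallbacks so the same code builds without the valgrind headers. */
#define VALGRIND_MAKE_MEM_NOACCESS(p, n) ((void)0)
#define VALGRIND_MAKE_MEM_DEFINED(p, n)  ((void)0)
#endif

/* Hypothetical bookkeeping header placed in front of every user block.
 * The union keeps the user pointer suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} wrap_header;

static void *wrap_alloc(size_t size)
{
    wrap_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    /* Distrust the header: an underrun of the user block now lands in
     * memory that memcheck considers off limits. */
    VALGRIND_MAKE_MEM_NOACCESS(h, sizeof *h);
    return h + 1;
}

static size_t wrap_size(void *p)
{
    wrap_header *h;
    size_t size;
    assert(p != NULL);
    h = (wrap_header *)p - 1;
    VALGRIND_MAKE_MEM_DEFINED(h, sizeof *h);  /* trust it while we read it */
    size = h->size;
    VALGRIND_MAKE_MEM_NOACCESS(h, sizeof *h); /* then distrust it again */
    return size;
}

static void wrap_free(void *p)
{
    if (p != NULL)
        free((wrap_header *)p - 1);
}
```

The allocator itself has to re-trust the header briefly whenever it needs to read its own bookkeeping, which is why the calls come in pairs.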

Better builds

The easiest way to build with these options enabled is to use one of the following families of make targets:

  • vg, gsvg, gpcl6vg, gxpsvg, gpdlvg (and vgclean). These build (and clean) release builds with the extra valgrind magic baked in.

  • debugvg, gsdebugvg, gpcl6debugvg, gxpsdebugvg, gpdldebugvg (and debugvgclean). These build (and clean) debug builds with the extra valgrind magic baked in.

  • mementovg, gsmementovg, gpcl6mementovg, gxpsmementovg, gpdlmementovg (and mementovgclean). These build (and clean) memento builds with the extra valgrind magic baked in.

Sadly, it is in the nature of such memory access issues that building a new version can move stuff around just enough to not trip the detection, but if you can reproduce the problem in a debug (or memento) build you'll find it much easier to debug than in a release build.

Smarter use of Valgrind

With the correct use of valgrind, you can run it with gdb, stop it when problems are found, and examine values etc.

By default this requires 2 terminal windows, which I shall call (in a fit of originality) 1 and 2. (To see how to do it in just 1 terminal window, see the next section.)

In terminal 1, we run the actual code, with a slightly modified command:

valgrind --track-origins=yes --vgdb=yes --vgdb-error=0 <the gs command to run>

This will start up, and then pause, saying something like:

==26849== Memcheck, a memory error detector
==26849== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==26849== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==26849== Command: debugbin/gpcl6 -sOutputFile=out.pdf -sDEVICE=pdfwrite -r75 -Z: -dNOPAUSE -dBATCH -dClusterJob ./tests_private/pcl/pcl5cats/Subset/QX509SC2.BIN
==26849==
==26849== (action at startup) vgdb me ...
==26849==
==26849== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==26849==   /path/to/gdb debugbin/gpcl6
==26849== and then give GDB the following command
==26849==   target remote | /usr/lib/valgrind/../../bin/vgdb --pid=26849
==26849== --pid is optional if only one valgrind process is running

In terminal 2, we then start gdb:

gdb debugbin/gpcl6

and then we tell gdb to connect to valgrind and talk to the emulated process there:

target remote | /usr/lib/valgrind/../../bin/vgdb --pid=26849

(where that command is lifted from the output in terminal 1.)

At this point, the program is ready to go, and you can use standard gdb commands to add breakpoints, for example:

break Memento_breakpoint

or you can let the program run:

continue

(or 'c' for short).

You should now see the program continue to run in terminal 1. As soon as valgrind hits a problem, it'll output the information in terminal 1, and stop in terminal 2 for you to be able to use normal commands to examine the stack and values.

A typical message (in terminal 1) might be:

==26849== Conditional jump or move depends on uninitialised value(s)
==26849==    at 0x7DEF61: pdf_text_set_cache (gdevpdtt.c:158)
==26849==    by 0x95471B: gs_text_setcachedevice (gstext.c:670)
==26849==    by 0x90ED43: gs_setcachedevice_double (gschar.c:54)
==26849==    by 0x90EDB9: gs_setcachedevice_float (gschar.c:65)
==26849==    by 0xAD1206: pl_bitmap_build_char (plchar.c:459)
==26849==    by 0xA2369D: show_proceed (gxchar.c:1238)
==26849==    by 0xA21F51: continue_show (gxchar.c:767)
==26849==    by 0xA21EE9: gx_show_text_process (gxchar.c:744)
==26849==    by 0x9543C9: gs_text_process (gstext.c:575)
==26849==    by 0x7E7C41: pdf_text_process (gdevpdtt.c:3328)
==26849==    by 0x9543C9: gs_text_process (gstext.c:575)
==26849==    by 0xAFD791: show_char_foreground (pctext.c:507)
==26849==  Uninitialised value was created by a heap allocation
==26849==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26849==    by 0x9422E8: gs_heap_alloc_bytes (gsmalloc.c:193)
==26849==    by 0x6DF574: chunk_obj_alloc (gsmchunk.c:909)
==26849==    by 0x6DF7D4: chunk_alloc_bytes_immovable (gsmchunk.c:971)
==26849==    by 0xB4041B: pl_main_alloc_instance (plmain.c:696)
==26849==    by 0xACD2F0: gsapi_new_instance (plapi.c:59)
==26849==    by 0xB3EF8C: main (realmain.c:30)
==26849==
==26849== (action on error) vgdb me ...

In terminal 2, I would see:

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00000000007def61 in pdf_text_set_cache (pte=0xaf74f50, pw=0xffefff180, control=TEXT_SET_CACHE_DEVICE) at ./devices/vector/gdevpdtt.c:158
158             if ((glyph != GS_NO_GLYPH && penum->output_char_code != GS_NO_CHAR) || !pdev->PS_accumulator) {

So, one of those values is unknown, but which? Asking gdb to print the values doesn't help:

(gdb) print glyph
$1 = 32
(gdb) print penum->output_char_code
$2 = 0
(gdb) print pdev->PS_accumulator
$3 = 0

One way is to add some printfs, recompile, rerun and see which one of the printfs throws the valgrind error, but this is a painful and slow process. Fortunately there is another way:

(gdb) print &glyph
$4 = (gs_glyph *) 0xffefff028
(gdb) monitor check_memory defined 0xffefff028 8
Address 0xFFEFFF028 len 8 defined
 Address 0xffefff028 is on thread 1's stack
 in frame #0, created by pdf_text_set_cache (gdevpdtt.c:92)

That looks OK. Let's try the second candidate:

(gdb) print &penum->output_char_code
$5 = (gs_char *) 0xaf75228
(gdb) print sizeof(penum->output_char_code)
$6 = 8
(gdb) monitor check_memory defined 0xaf75228 8
Address 0xAF75228 len 8 not defined:
Uninitialised value at 0xAF75228 was created by a heap allocation
==26849==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26849==    by 0x9422E8: gs_heap_alloc_bytes (gsmalloc.c:193)
==26849==    by 0x6DF574: chunk_obj_alloc (gsmchunk.c:909)
==26849==    by 0x6DF7D4: chunk_alloc_bytes_immovable (gsmchunk.c:971)
==26849==    by 0xB4041B: pl_main_alloc_instance (plmain.c:696)
==26849==    by 0xACD2F0: gsapi_new_instance (plapi.c:59)
==26849==    by 0xB3EF8C: main (realmain.c:30)
 Address 0xaf75228 is 56,536 bytes inside a block of size 65,584 alloc'd
==26849==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26849==    by 0x9422E8: gs_heap_alloc_bytes (gsmalloc.c:193)
==26849==    by 0x6DF574: chunk_obj_alloc (gsmchunk.c:909)
==26849==    by 0x6DF804: chunk_alloc_bytes (gsmchunk.c:977)
==26849==    by 0x6DF8CA: chunk_alloc_byte_array (gsmchunk.c:1005)
==26849==    by 0x7EC10E: copied_Encoding_alloc (gxfcopy.c:288)
==26849==    by 0x7EEF9A: copy_font_type42 (gxfcopy.c:1306)
==26849==    by 0x7F1951: gs_copy_font (gxfcopy.c:2160)
==26849==    by 0x7C9177: pdf_base_font_alloc (gdevpdtb.c:317)
==26849==    by 0x7CDEEE: pdf_font_descriptor_alloc (gdevpdtd.c:202)
==26849==    by 0x7E2833: pdf_make_font_resource (gdevpdtt.c:1516)
==26849==    by 0x7E3FF0: pdf_obtain_font_resource_encoded (gdevpdtt.c:2048)

Bingo. So penum->output_char_code is the offending value here.
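
If you do end up going down the recompile route, the same question can be asked from inside the code rather than with printfs. The sketch below wraps memcheck's VALGRIND_CHECK_MEM_IS_DEFINED client request, which returns zero when every byte in the range is defined (and always returns zero when not running under valgrind). The is_defined helper and the use of HAVE_VALGRIND as a guard here are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#ifdef HAVE_VALGRIND
#include <valgrind/memcheck.h>
#else
/* Without the valgrind headers, report everything as defined. */
#define VALGRIND_CHECK_MEM_IS_DEFINED(p, n) ((void)(p), (void)(n), 0)
#endif

/* Ask memcheck whether n bytes at p are all initialised. Returns 1 if
 * they are (or if we are not running under valgrind), 0 otherwise. */
static int is_defined(const char *label, const void *p, size_t n)
{
    assert(p != NULL);
    if (VALGRIND_CHECK_MEM_IS_DEFINED(p, n) != 0) {
        fprintf(stderr, "%s: contains undefined bytes\n", label);
        return 0;
    }
    return 1;
}
```

In the example above, a call such as is_defined("output_char_code", &penum->output_char_code, sizeof(penum->output_char_code)) placed just before the test at gdevpdtt.c:158 would single out the same field.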

In general the "monitor" command lets you talk to valgrind and ask it to do things.

(gdb) monitor help

will give you a list of things to try.

If you happen to be using a memento build, then you can make use of that too:

call Memento_find(penum)

will tell you whether penum is in an allocated (or freed) block (and when it was allocated/freed), and how big it is/was.

call Memento_info(penum)

will give you the events that a block was involved in (such as creation, reallocing, reference count changes, freeing etc), and a backtrace for each of those.

vdb.pl: A wrapper to simplify this

All this is a bit of a faff: Starting valgrind, then starting gdb, copying across the magic runes to make gdb talk to valgrind, and having to swap back and forth between terminals to look at output/control what happens next can be (depending on your setup at least) awkward and error prone.

To that end, I've written a small(ish) bit of perl to help. vdb.pl is attached to this page (helpfully renamed to vdb.pl.txt by the twiki in a fit of security pantomime).

Now, you can just do:

vdb.pl membin/gs -o out.ppm -sDEVICE=ppmraw examples/tiger.eps

and it will invoke both valgrind and gdb, copy across the magic beans automatically, and then multiplex both valgrind and gdb output into a single stream.

The downside to this script is that gdb realises it is being controlled through a pipe and so won't do terminal editing. If anyone knows how to work around this, please let me know. I think it's something to do with using a pseudo tty (pty in perl), but I don't know how to hook that up to the open3 call.

New version: vdb4.pl uses rlwrap (install it if you haven't got it) to give you command-line editing too.

-- Robin Watts - 2019-01-14

Topic attachments

Attachment         Size     Date               Who          Comment
vdb.pl.txt (r2)    3.5 K    2019-01-22 16:53   RobinWatts   vdb.pl - wrapper for valgrind/gdb
vdb4.pl.txt (r1)   10.8 K   2019-07-15 17:03   RobinWatts   vdb4.pl.txt
Topic revision: r7 - 2019-07-15 - RobinWatts
 
Copyright 2014 Artifex Software Inc