NAME
libunwind-dynamic -- libunwind-support for runtime-generated code
INTRODUCTION
For libunwind to do its job, it needs to be able to reconstruct the frame
state of each frame in a call-chain. The frame state describes the subset
of the machine-state that consists of the frame registers (typically
the instruction-pointer and the stack-pointer) and all callee-saved registers
(preserved registers). The frame state describes each register either by
providing its current value (for frame registers) or by providing the location
at which the current value is stored (callee-saved registers).
For statically generated code, the compiler normally takes care of emitting
unwind-info which provides the minimum amount of information needed to
reconstruct the frame-state for each instruction in a procedure. For
dynamically generated code, the runtime code generator must use the dynamic
unwind-info interface provided by libunwind to supply the equivalent
information. This manual page describes the format of this information in
detail.
For the purpose of this discussion, a procedure is defined to be an
arbitrary piece of contiguous code. Normally, each procedure directly
corresponds to a function in the source-language but this is not strictly
required. For example, a runtime code-generator could translate a given
function into two separate (discontiguous) procedures: one for
frequently-executed (hot) code and one for rarely-executed (cold) code.
Similarly, simple source-language functions (usually leaf functions) may get
translated into code for which the default unwind-conventions apply and for
such code, it is not strictly necessary to register dynamic unwind-info.
A procedure logically consists of a sequence of regions. Regions are
nested in the sense that the frame state at the end of one region is, by
default, assumed to be the frame state for the next region. Each region is
thought of as being divided into a prologue, a body, and an
epilogue. Each of them can be empty. If non-empty, the prologue sets up
the frame state for the body. For example, the prologue may need to allocate
some space on the stack and save certain callee-saved registers. The body
performs the actual work of the procedure but does not change the frame state
in any way. If non-empty, the epilogue restores the previous frame state and
as such it undoes or cancels the effect of the prologue. In fact, a single
epilogue may undo the effect of the prologues of several (nested) regions.
We should point out that even though the prologue, body, and epilogue are
logically separate entities, optimizing code-generators will generally
interleave instructions from all three entities. For this reason, the dynamic
unwind-info interface of libunwind makes no distinction whatsoever between
prologue and body. Similarly, the exact set of instructions that make up an
epilogue is also irrelevant. The only point in the epilogue that needs to be
described explicitly by the dynamic unwind-info is the point at which the
stack-pointer gets restored. The reason this point needs to be described is
that once the stack-pointer is restored, all values saved in the deallocated
portion of the stack frame become invalid and hence libunwind needs to know
about it. The portion of the frame state not saved on the stack is assumed to
remain valid through the end of the region. For this reason, there is usually
no need to describe instructions which restore the contents of callee-saved
registers.
Within a region, each instruction that affects the frame state in some fashion
needs to be described with an operation descriptor. For this purpose, each
instruction in the region is assigned a unique index. Exactly how this index
is derived depends on the architecture. For example, on RISC and EPIC-style
architectures, instructions have a fixed size so it's possible to simply number
the instructions. In contrast, most CISC architectures use variable-length
instruction encodings, so it is usually necessary to use a byte-offset as the
index. Given
the instruction index, the operation descriptor specifies the effect of the
instruction in an abstract manner. For example, it might express that the
instruction stores callee-saved register r1 at offset 16 in the stack frame.
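The two indexing schemes can be sketched as follows. This is only an illustration: the function names and the idea of passing the instruction size as a parameter are assumptions made for the example and are not part of the libunwind interface.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of the libunwind API. */

/* Fixed-size encodings (RISC, EPIC): number the instructions from the
   start of the region.  insn_size is the instruction size in bytes. */
static int32_t
insn_index_fixed (uint64_t region_start, uint64_t ip, uint32_t insn_size)
{
  return (int32_t) ((ip - region_start) / insn_size);
}

/* Variable-length encodings (most CISC): use the byte offset from the
   start of the region as the index. */
static int32_t
insn_index_bytes (uint64_t region_start, uint64_t ip)
{
  return (int32_t) (ip - region_start);
}
```

With 4-byte instructions, the instruction at region_start + 12 would get index 3 under the first scheme and index 12 under the second.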
PROCEDURES
A runtime code-generator registers the dynamic unwind-info of a procedure by
setting up a structure of type unw_dyn_info_t and calling _U_dyn_register(),
passing the address of the structure as the sole argument. The members of the
unw_dyn_info_t structure are described below:
- void *next
-
Private to libunwind. Must not be used by the application.
- void *prev
-
Private to libunwind. Must not be used by the application.
- unw_word_t start_ip
-
The start-address of the instructions of the procedure (remember: procedures
are defined to be contiguous pieces of code, so a single code-range is
sufficient).
- unw_word_t end_ip
-
The end-address of the instructions of the procedure (non-inclusive, that
is, end_ip-start_ip is the size of the procedure in bytes).
- unw_word_t gp
-
The global-pointer value in use for this procedure. The exact meaning of the
global-pointer is architecture-specific and on some architectures, it is
not used at all.
- int32_t format
-
The format of the unwind-info. This member can be one of
UNW_INFO_FORMAT_DYNAMIC, UNW_INFO_FORMAT_TABLE, or
UNW_INFO_FORMAT_REMOTE_TABLE.
- union u
-
This union contains one sub-member structure for every possible unwind-info
format:
- unw_dyn_proc_info_t pi
-
This member is used for format UNW_INFO_FORMAT_DYNAMIC.
- unw_dyn_table_info_t ti
-
This member is used for format UNW_INFO_FORMAT_TABLE.
- unw_dyn_remote_table_info_t rti
-
This member is used for format UNW_INFO_FORMAT_REMOTE_TABLE.
The format of these sub-members is described in detail below.
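A minimal sketch of filling in the address-range members is shown here. So that it compiles without libunwind installed, it uses local stand-in types (word_t, dyn_info_t) and illustrative constant values that merely mirror the members listed above; real code would include <libunwind.h> and use unw_dyn_info_t and the UNW_INFO_FORMAT_* constants instead. The helper name describe_procedure is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Local stand-ins mirroring the members described above.  These are NOT
   the libunwind definitions; the enum values are illustrative only. */
typedef uint64_t word_t;                      /* stand-in for unw_word_t */
enum { FORMAT_DYNAMIC, FORMAT_TABLE, FORMAT_REMOTE_TABLE };

typedef struct dyn_info
{
  void *next;               /* private to libunwind */
  void *prev;               /* private to libunwind */
  word_t start_ip;
  word_t end_ip;
  word_t gp;
  int32_t format;
  /* union u is omitted in this sketch */
} dyn_info_t;

/* Hypothetical helper: describe code_size bytes of generated code. */
static void
describe_procedure (dyn_info_t *di, const void *code, size_t code_size)
{
  memset (di, 0, sizeof *di);
  di->start_ip = (word_t) (uintptr_t) code;
  di->end_ip = di->start_ip + code_size;      /* non-inclusive end */
  di->gp = 0;                                 /* assumed unused here */
  di->format = FORMAT_DYNAMIC;
  /* With the real types, registration would follow here:
       _U_dyn_register (di);  */
}
```

Because end_ip is non-inclusive, end_ip - start_ip gives back the size of the procedure in bytes.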
The proc-info format (UNW_INFO_FORMAT_DYNAMIC) is the preferred dynamic
unwind-info format and it is generally the one used by full-blown runtime
code-generators. In this format, the details of a procedure are described by
a structure of type unw_dyn_proc_info_t. This structure contains the
following members:
- unw_word_t name_ptr
-
The address of a (human-readable) name of the procedure or 0 if no such
name is available. If non-zero, the string stored at this address must be
ASCII NUL terminated. For source languages that use name-mangling (such as
C++ or Java) the string stored at this address should be the
demangled version of the name.
- unw_word_t handler
-
The address of the personality-routine for this procedure.
Personality-routines are used in conjunction with exception handling. See
the C++ ABI draft (http://www.codesourcery.com/cxx-abi/) for an overview
and a description of the personality routine. If the procedure has no
personality routine, handler must be set to 0.
- uint32_t flags
-
A bitmask of flags. At the moment, no flags have been defined and this
member must be set to 0.
- unw_dyn_region_info_t *regions
-
A NULL-terminated linked list of region-descriptors. See section ``Region
descriptors'' below for more details.
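The following sketch shows one way the proc-info members might be initialized and how the NULL-terminated region list can be walked. The types region_t and proc_info_t are reduced local stand-ins for unw_dyn_region_info_t and unw_dyn_proc_info_t, and the function names are hypothetical; none of them come from the libunwind headers.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t word_t;              /* stand-in for unw_word_t */

/* Reduced stand-in for unw_dyn_region_info_t (directives omitted). */
typedef struct region
{
  struct region *next;
  int32_t insn_count;
} region_t;

/* Stand-in for unw_dyn_proc_info_t. */
typedef struct proc_info
{
  word_t name_ptr;
  word_t handler;
  uint32_t flags;
  region_t *regions;
} proc_info_t;

/* Hypothetical helper: a procedure without a personality routine. */
static void
init_proc_info (proc_info_t *pi, const char *name, region_t *regions)
{
  pi->name_ptr = (word_t) (uintptr_t) name;   /* NUL-terminated, or 0 */
  pi->handler = 0;                            /* no personality routine */
  pi->flags = 0;                              /* no flags are defined */
  pi->regions = regions;
}

/* Walk the NULL-terminated list of region descriptors. */
static size_t
count_regions (const proc_info_t *pi)
{
  size_t n = 0;
  for (const region_t *r = pi->regions; r != NULL; r = r->next)
    ++n;
  return n;
}
```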
The table-info format (UNW_INFO_FORMAT_TABLE) is generally used when the
dynamically generated code was derived from static code and the unwind-info
for the dynamic and the static versions is identical. For example, this format
can be useful when loading statically-generated code into an address-space in
a non-standard fashion (i.e., through some means other than dlopen()). In this
format, the details of a group of procedures are described by a structure of
type unw_dyn_table_info_t. This structure contains the following members:
- unw_word_t name_ptr
-
The address of a (human-readable) name of the procedure or 0 if no such
name is available. If non-zero, the string stored at this address must be
ASCII NUL terminated. For source languages that use name-mangling (such as
C++ or Java) the string stored at this address should be the
demangled version of the name.
- unw_word_t segbase
-
The segment-base value that needs to be added to the segment-relative
values stored in the unwind-info. The exact meaning of this value is
architecture-specific.
- unw_word_t table_len
-
The length of the unwind-info (table_data) counted in units of words
(unw_word_t).
- unw_word_t *table_data
-
A pointer to the actual data encoding the unwind-info. The exact format is
architecture-specific (see architecture-specific sections below).
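Since table_len counts words rather than bytes, a generator that knows the table size in bytes has to convert it. A minimal sketch, using a local stand-in type table_info_t and a hypothetical helper set_table (neither is part of libunwind):

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t word_t;              /* stand-in for unw_word_t */

/* Stand-in for unw_dyn_table_info_t. */
typedef struct table_info
{
  word_t name_ptr;
  word_t segbase;
  word_t table_len;                   /* in units of word_t, not bytes */
  word_t *table_data;
} table_info_t;

/* Hypothetical helper: attach a table whose size is known in bytes.
   Returns 0 on success, or -1 if the size is not a whole number of
   words and therefore cannot be expressed in table_len. */
static int
set_table (table_info_t *ti, word_t *data, size_t size_in_bytes)
{
  if (size_in_bytes % sizeof (word_t) != 0)
    return -1;
  ti->table_len = size_in_bytes / sizeof (word_t);
  ti->table_data = data;
  return 0;
}
```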
The remote table-info format has the same basic purpose as the regular
table-info format. The only difference is that when libunwind uses the
unwind-info, it will keep the table data in the target address-space (which
may be remote). Consequently, the type of the table_data member is unw_word_t
rather than a pointer. This implies that libunwind will have to access the
table-data via the address-space's access_mem() call-back, rather than through
a direct memory reference.
From the point of view of a runtime code-generator, the remote table-info format
offers no advantage and it is expected that such generators will describe
their procedures either with the proc-info format or the normal table-info
format. The main reason that the remote table-info format exists is to enable
the address-space-specific find_proc_info() callback (see
unw_create_addr_space(3)) to return unwind tables whose data remains in remote
memory. This can speed up unwinding (e.g., for a debugger) because it reduces
the amount of data that needs to be loaded from remote memory.
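The difference between the two table formats can be illustrated as follows. The call-back type read_word_fn is a simplified, hypothetical stand-in for an address space's access_mem() call-back (the real call-back takes additional arguments), and all function names are assumptions made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t word_t;              /* stand-in for unw_word_t */

/* Simplified, hypothetical stand-in for access_mem(): read one word at
   address addr in the target address-space into *valp; 0 on success. */
typedef int (*read_word_fn) (word_t addr, word_t *valp, void *arg);

/* Regular table-info: table_data is a pointer, so an entry is read with
   a direct memory reference. */
static word_t
table_word_local (const word_t *table_data, size_t i)
{
  return table_data[i];
}

/* Remote table-info: table_data is an address in the target, so every
   entry has to be fetched through the call-back. */
static int
table_word_remote (word_t table_data, size_t i,
                   read_word_fn read_word, void *arg, word_t *valp)
{
  return read_word (table_data + i * sizeof (word_t), valp, arg);
}

/* Example call-back that treats the current process as the target. */
static int
read_self (word_t addr, word_t *valp, void *arg)
{
  (void) arg;
  *valp = *(const word_t *) (uintptr_t) addr;
  return 0;
}
```

Only the entries that are actually needed are fetched through the call-back, which is where the saving for a remote unwinder comes from.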
REGION DESCRIPTORS
A region descriptor is a variable length structure that describes how each
instruction in the region affects the frame state. Of course, most
instructions in a region usually do not change the frame state and for those,
nothing needs to be recorded in the region descriptor. A region descriptor is
a structure of type unw_dyn_region_info_t and has the following members:
- unw_dyn_region_info_t *next
-
A pointer to the next region. If this is the last region, next is
NULL.
- int32_t insn_count
-
The length of the region in instructions. Each instruction is assumed to
have a fixed size (see architecture-specific sections for details). The
value of insn_count may be negative in the last region of a procedure
(i.e., it may be negative only if next is NULL). A negative value
indicates that the region covers the last N instructions of the
procedure, where N is the absolute value of insn_count.
- uint32_t op_count
-
The (allocated) length of the op array.
- unw_dyn_op_t op
-
An array of dynamic unwind directives. See Section ``Dynamic unwind
directives'' for a description of the directives.
A region descriptor with an insn_count of zero is an empty region and
such regions are perfectly legal. In fact, empty regions can be useful to
establish a particular frame state before the start of another region.
A single region list can be shared across multiple procedures provided those
procedures share a common prologue and epilogue (their bodies may differ, of
course). Normally, such procedures consist of a canned prologue, the body, and
a canned epilogue. This could be described by two regions: one covering the
prologue and one covering the epilogue. Since the body length is variable, the
latter region would need to specify a negative value in insn_count such that
libunwind knows that the region covers the end of the procedure (up to the
address specified by end_ip).
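The effect of a negative insn_count can be sketched with a lookup that maps an instruction index to the region covering it. This is an interpretation of the rules above, not libunwind's implementation; region_t is a reduced local stand-in for unw_dyn_region_info_t and find_region is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for unw_dyn_region_info_t (directives omitted). */
typedef struct region
{
  struct region *next;
  int32_t insn_count;
} region_t;

/* Return the region covering instruction index insn of a procedure
   that is proc_insns instructions long, or NULL if insn lies in a
   stretch (such as a variable-length body) that no region describes. */
static const region_t *
find_region (const region_t *list, int32_t proc_insns, int32_t insn)
{
  int32_t start = 0;
  for (const region_t *r = list; r != NULL; r = r->next)
    {
      int32_t len = r->insn_count;
      if (len < 0)
        {
          /* Only legal in the last region: it covers the last -len
             instructions, whatever the length of the procedure. */
          len = -len;
          start = proc_insns - len;
        }
      if (insn >= start && insn < start + len)
        return r;
      start += len;
    }
  return NULL;
}
```

The same two-element list (a prologue of 4 instructions followed by a region with insn_count of -2) then describes a 20-instruction procedure and a 50-instruction procedure equally well: the second region covers instructions 18 and 19 in the former and 48 and 49 in the latter.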
The region descriptor is a variable length structure to make it possible to
allocate all the necessary memory with a single memory-allocation request. To
facilitate the allocation of region descriptors, libunwind provides a helper
routine with the following synopsis:
size_t _U_dyn_region_size(int op_count);
This routine returns the number of bytes needed to hold a region descriptor with
space for op_count unwind directives. Note that the length of the op array
does not have to match exactly with the number of directives in a region.
Instead, it is sufficient if the op array contains at least as many entries as
there are directives, since the end of the directives can always be indicated
with the UNW_DYN_STOP directive.
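A single-allocation scheme along these lines could look like the sketch below. It is written against local stand-in types so that it compiles without libunwind: region_size() plays the role of _U_dyn_region_size(), the stand-in region type uses a C99 flexible array member (the real header may declare the op array differently), and region_alloc is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum { OP_STOP = 0 };     /* UNW_DYN_STOP is guaranteed to be 0 */

/* Stand-in for unw_dyn_op_t. */
typedef struct dyn_op
{
  int8_t tag;
  int8_t qp;
  int16_t reg;
  int32_t when;
  uint64_t val;
} dyn_op_t;

/* Stand-in for unw_dyn_region_info_t. */
typedef struct region
{
  struct region *next;
  int32_t insn_count;
  uint32_t op_count;
  dyn_op_t op[];          /* variable-length array of directives */
} region_t;

/* Stand-in for _U_dyn_region_size(): bytes needed for op_count ops. */
static size_t
region_size (int op_count)
{
  return sizeof (region_t) + (size_t) op_count * sizeof (dyn_op_t);
}

/* Hypothetical helper: allocate a descriptor with one request.  calloc()
   zeroes the op array, so every entry starts out as OP_STOP and unused
   trailing entries terminate the directive list automatically. */
static region_t *
region_alloc (int32_t insn_count, int op_count)
{
  if (op_count < 0)
    return NULL;
  region_t *r = calloc (1, region_size (op_count));
  if (r == NULL)
    return NULL;
  r->insn_count = insn_count;
  r->op_count = (uint32_t) op_count;
  return r;
}
```

The caller releases the descriptor with a single free() once the unwind-info is no longer registered.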
DYNAMIC UNWIND DIRECTIVES
A dynamic unwind directive describes how the frame state changes at a particular
point within a region. The description is in the form of a structure of type
unw_dyn_op_t. This structure has the following members:
- int8_t tag
-
The operation tag. Must be one of the unw_dyn_operation_t values described
below.
- int8_t qp
-
The qualifying predicate that controls whether or not this directive is
active. This is useful for predicated architectures such as IA-64 or ARM,
where the contents of another (callee-saved) register determines whether
or not an instruction is executed (takes effect). If the directive is
always active, this member should be set to the manifest constant
_U_QP_TRUE (this constant is defined for all architectures, predicated or
not).
- int16_t reg
-
The number of the register affected by the instruction.
- int32_t when
-
The region-relative number of the instruction to which this directive
applies. For example, a value of 0 means that the effect described by this
directive has taken place once the first instruction in the region has
executed.
- unw_word_t val
-
The value to be applied by the operation tag. The exact meaning of this
value varies by tag. See Section ``Operation tags'' below.
It is perfectly legitimate to specify multiple dynamic unwind directives with
the same when value, if a particular instruction has a complex effect on the
frame state.
Empty regions by definition contain no actual instructions and as such the
directives are not tied to a particular instruction. By convention, the when
member should be set to 0, however.
There is no need for the dynamic unwind directives to appear in order of
increasing when values. If the directives happen to be sorted in that order,
it may result in slightly faster execution, but a runtime code-generator
should not go to extra lengths just to ensure that the directives are sorted.
IMPLEMENTATION NOTE: should libunwind implementations for certain architectures
prefer the list of unwind directives to be sorted, it is recommended that such
implementations first check whether the list happens to be sorted already and,
if not, sort the directives explicitly before the first use. With this
approach, the overhead of explicit sorting is only paid when there is a real
benefit and if the runtime code-generator happens to generate sorted lists
naturally, the performance penalty is limited to a simple O(N) check.
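The check-then-sort approach recommended in the implementation note could be sketched as follows. The type dyn_op_t is a local stand-in for unw_dyn_op_t and the function names are hypothetical. An insertion sort is used here as one possible choice because it is stable, so directives that share a when value keep their relative order; nothing above requires this particular algorithm.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for unw_dyn_op_t. */
typedef struct dyn_op
{
  int8_t tag;
  int8_t qp;
  int16_t reg;
  int32_t when;
  uint64_t val;
} dyn_op_t;

/* O(N) check: are the n directives in order of increasing when? */
static int
ops_sorted (const dyn_op_t *op, size_t n)
{
  for (size_t i = 1; i < n; ++i)
    if (op[i - 1].when > op[i].when)
      return 0;
  return 1;
}

/* Stable insertion sort by when. */
static void
ops_sort (dyn_op_t *op, size_t n)
{
  for (size_t i = 1; i < n; ++i)
    {
      dyn_op_t key = op[i];
      size_t j = i;
      while (j > 0 && op[j - 1].when > key.when)
        {
          op[j] = op[j - 1];
          --j;
        }
      op[j] = key;
    }
}

/* Sort only if needed; returns 1 if a sort was performed, else 0. */
static int
ops_ensure_sorted (dyn_op_t *op, size_t n)
{
  if (ops_sorted (op, n))
    return 0;
  ops_sort (op, n);
  return 1;
}
```

Here n is the number of directives before the terminating UNW_DYN_STOP entry, which should not take part in the sort.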
The possible operation tags are defined by enumeration type unw_dyn_operation_t
which defines the following values:
- UNW_DYN_STOP
-
Marks the end of the dynamic unwind directive list. All remaining entries
in the op array of the region-descriptor are ignored. This tag is
guaranteed to have a value of 0.
- UNW_DYN_SAVE_REG
-
Marks an instruction which saves register reg to register val.
- UNW_DYN_SPILL_FP_REL
-
Marks an instruction which spills register reg to a frame-pointer-relative
location. The frame-pointer-relative offset is given by the value stored
in member val. See the architecture-specific sections for a description of
the stack frame layout.
- UNW_DYN_SPILL_SP_REL
-
Marks an instruction which spills register reg to a stack-pointer-relative
location. The stack-pointer-relative offset is given by the value stored
in member val. See the architecture-specific sections for a description of
the stack frame layout.
- UNW_DYN_ADD
-
Marks an instruction which adds the constant value val to register reg. To
subtract a constant value, store the two's-complement of the value in
val. The set of registers that can be specified for this tag is described
in the architecture-specific sections below.
- UNW_DYN_POP_FRAMES
-
- UNW_DYN_LABEL_STATE
-
- UNW_DYN_COPY_STATE
-
- UNW_DYN_ALIAS
-
unw_dyn_op_t
_U_dyn_op_save_reg(); _U_dyn_op_spill_fp_rel(); _U_dyn_op_spill_sp_rel();
_U_dyn_op_add(); _U_dyn_op_pop_frames(); _U_dyn_op_label_state();
_U_dyn_op_copy_state(); _U_dyn_op_alias(); _U_dyn_op_stop();
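As an illustration of how directives fit together, the sketch below describes a two-instruction prologue: the first instruction subtracts 32 from the stack-pointer and the second stores callee-saved register r1 at offset 16 in the stack frame. Everything in it is a local stand-in or an assumption: the tag enumeration only mirrors the order of the list above (only the value 0 for the stop tag is guaranteed), QP_TRUE stands in for _U_QP_TRUE with an illustrative value, the register numbers REG_SP and REG_R1 are invented, and make_op is a hypothetical helper rather than one of the _U_dyn_op_*() routines.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t word_t;              /* stand-in for unw_word_t */

/* Stand-ins for the unw_dyn_operation_t tags, in the order listed. */
enum
{
  OP_STOP = 0, OP_SAVE_REG, OP_SPILL_FP_REL, OP_SPILL_SP_REL, OP_ADD,
  OP_POP_FRAMES, OP_LABEL_STATE, OP_COPY_STATE, OP_ALIAS
};

enum { QP_TRUE = -1 };                /* illustrative value only */
enum { REG_R1 = 1, REG_SP = 12 };     /* invented register numbers */

/* Stand-in for unw_dyn_op_t. */
typedef struct dyn_op
{
  int8_t tag;
  int8_t qp;
  int16_t reg;
  int32_t when;
  word_t val;
} dyn_op_t;

/* Hypothetical helper: build an always-active directive. */
static dyn_op_t
make_op (int8_t tag, int32_t when, int16_t reg, word_t val)
{
  dyn_op_t op = { tag, QP_TRUE, reg, when, val };
  return op;
}

/* Fill op[] with the prologue description; returns the number of
   entries written (including the stop entry), or 0 if op[] is too small. */
static size_t
describe_prologue (dyn_op_t *op, size_t capacity)
{
  if (capacity < 3)
    return 0;
  /* insn 0: sp -= 32, expressed as adding the two's-complement of 32. */
  op[0] = make_op (OP_ADD, 0, REG_SP, (word_t) -32);
  /* insn 1: r1 is spilled at sp + 16. */
  op[1] = make_op (OP_SPILL_SP_REL, 1, REG_R1, 16);
  /* End of the directive list. */
  op[2] = make_op (OP_STOP, 0, 0, 0);
  return 3;
}
```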
IA-64 SPECIFICS
- meaning of segbase member in table-info/table-remote-info format
- format of table_data in table-info/table-remote-info format
- instruction size: each bundle is counted as 3 instructions, regardless of
template (MLX)
- describe stack-frame layout, especially with regards to sp-relative and
fp-relative addressing
- UNW_DYN_ADD can only add to ``sp'' (always a negative value); use
POP_FRAMES otherwise
SEE ALSO
libunwind(3),
_U_dyn_register(3),
_U_dyn_cancel(3)
AUTHOR
David Mosberger-Tang
Email: dmosberger@gmail.com
WWW: http://www.nongnu.org/libunwind/.