Using Stub Objects
- by user9154181
Having told the long and winding tale of where stub objects came
from and how we use them to build Solaris, I'd like to focus now
on the nuts and bolts of building and using them.
The following new features were added to the Solaris link-editor (ld)
to support the production and use of stub objects:
-z stub
This new command line option informs ld that it is to build a
stub object rather than a normal object. In this mode, it
accepts the same command line arguments as usual, but will
quietly ignore any objects and sharable object dependencies.
STUB_OBJECT Mapfile Directive
In order to build a stub version of an object, its mapfile must
specify the STUB_OBJECT directive. When producing a non-stub
object, the presence of STUB_OBJECT causes the link-editor
to perform extra validation to ensure that the stub and non-stub
objects will be compatible.
ASSERT Mapfile Directive
All data symbols exported from the object must have
an ASSERT symbol directive in the mapfile that declares
them as data and supplies the size, binding, bss attributes, and
symbol aliasing details. When building the stub objects, the
information in these ASSERT directives is used to create
the data symbols. When building the real object, these
ASSERT directives will ensure that the real object matches
the linking interface presented by the stub.
Although ASSERT was added to the link-editor in order to
support stub objects, it is a general purpose feature that
can be used independently of stub objects. For instance,
you might choose to use an ASSERT directive if you have a
symbol that must have a specific address in order for the object
to operate properly and you want to automatically ensure that
this will always be the case.
The material
presented here is derived from a document I originally wrote
during the development effort, which had the dual goals of providing
supplemental materials for the stub
object PSARC case, and serving as a set of edits that were eventually applied to
the Oracle Solaris Linker and Libraries Manual (LLM).
The Solaris 11 LLM contains this information in a more polished form.
Stub Objects
A stub object
is a shared object, built entirely from mapfiles, that supplies the same
linking interface as the real object, while containing no code or data.
Stub objects cannot be used at runtime. However, an application
can be built against a stub object, which supplies
the name of the real object to be used at runtime, and then use the real
object when it runs.
When building a stub object, the link-editor ignores any object or library
files specified on the command line, and these files need not exist in order
to build a stub. Since the compilation step can be omitted, and because
the link-editor has relatively little work to do, stub objects can be built
very quickly.
Stub objects can be used to solve a variety of build problems:
Speed
Modern machines, using a version of make with the ability to
parallelize operations, are capable of compiling and linking
many objects simultaneously, and doing so offers significant
speedups. However, it is typical that a given object will depend
on other objects, and that there will be a core set of objects
that nearly everything else depends on. It is necessary to impose
an ordering that builds each object before any other object that
requires it. This ordering creates bottlenecks that reduce the
amount of parallelization that is possible and limit the overall
speed at which the code can be built.
Complexity/Correctness
In a large body of code, there can be a large number of dependencies
between the various objects. The makefiles or other build descriptions
for these objects can become very complex and difficult to understand or
maintain. The dependencies can change as the system evolves.
This can cause a given set of makefiles to become slightly
incorrect over time, leading to race conditions and mysterious rare
build failures.
Dependency Cycles
It might be desirable to organize code as cooperating shared
objects, each of which draws on the resources provided by the other.
Such cycles cannot be supported in an environment where objects
must be built before the objects that use them, even though the
runtime linker is fully capable of loading and using such objects
if they could be built.
Stub shared objects offer an alternative method for building code that
sidesteps the above issues. Stub objects can be quickly built for all the
shared objects produced by the build. Then, all the real shared objects
and executables can be built in parallel, in any order, using the stub
objects to stand in for the real objects at link-time. Afterwards, the
executables and real shared objects are kept, and the stub shared
objects are discarded.
Stub objects are built from a mapfile, which must satisfy
the following requirements.
The mapfile must specify the STUB_OBJECT directive.
This directive informs the link-editor that the object can be
built as a stub object, and as such causes the link-editor to
perform validation and sanity checking intended to guarantee that
an object and its stub will always provide identical linking
interfaces.
All function and data symbols that make up the external interface
to the object must be explicitly listed in the mapfile.
The mapfile must use symbol scope
reduction ('*'), to remove any symbols not explicitly listed from
the external interface.
All global data exported from the object must have
an ASSERT symbol attribute in the mapfile to specify the
symbol type, size, and bss attributes. In the case where there
are multiple symbols that reference the same data, the ASSERT
for one of these symbols must specify the TYPE and SIZE
attributes, while the others must use the ALIAS attribute to
reference this primary symbol.
Given such a mapfile, the stub and real versions of the shared object
can be built using the same command line for each, adding the '-z stub'
option to the link for the stub object, and omitting the option
from the link for the real object.
To demonstrate these ideas, the following code implements a shared
object named idx5, which exports data from a 5 element array of integers,
with each element initialized to contain its zero-based array index.
This data is available as a global array, via an alternative alias data symbol
with weak binding, and via a functional interface.
% cat idx5.c
int _idx5[5] = { 0, 1, 2, 3, 4 };
#pragma weak idx5 = _idx5
int
idx5_func(int index)
{
        /* Reject any index outside the bounds of the 5 element array */
        if ((index < 0) || (index > 4))
                return (-1);
        return (_idx5[index]);
}
A mapfile is required to describe the interface provided by this shared object.
% cat mapfile
$mapfile_version 2
STUB_OBJECT;
SYMBOL_SCOPE {
        _idx5 {
                ASSERT { TYPE=data; SIZE=4[5] };
        };
        idx5 {
                ASSERT { BINDING=weak; ALIAS=_idx5 };
        };
        idx5_func;
    local:
        *;
};
The following main program is used to print all the index
values available from the idx5 shared object.
% cat main.c
#include <stdio.h>
extern int _idx5[5], idx5[5], idx5_func(int);
int
main(int argc, char **argv)
{
        int i;
        for (i = 0; i < 5; i++)
                (void) printf("[%d] %d %d %d\n",
                    i, _idx5[i], idx5[i], idx5_func(i));
        return (0);
}
The following commands create a stub version of this shared object
in a subdirectory named stublib. elfdump is used to verify that the
resulting object is a stub. The command used to build the stub
differs from that of the real object only in the addition of the
-z stub option, and the use of a different output file name. This
demonstrates the ease with which stub generation can be added to an
existing makefile.
% cc -Kpic -G -M mapfile -h libidx5.so.1 idx5.c -o stublib/libidx5.so.1 -zstub
% ln -s libidx5.so.1 stublib/libidx5.so
% elfdump -d stublib/libidx5.so | grep STUB
[11] FLAGS_1 0x4000000 [ STUB ]
The main program can now be built, using the stub object to stand
in for the real shared object, and setting a runpath that will
find the real object at runtime. However, as we have not yet built the
real object, this program cannot yet be run. Attempts to
cause the system to load the stub object are rejected, as the runtime
linker knows that stub objects lack the actual code and data
found in the real object, and cannot execute.
% cc main.c -L stublib -R '$ORIGIN/lib' -lidx5 -lc
% ./a.out
ld.so.1: a.out: fatal: libidx5.so.1: open failed: No such file or directory
Killed
% LD_PRELOAD=stublib/libidx5.so.1 ./a.out
ld.so.1: a.out: fatal: stublib/libidx5.so.1: stub shared object cannot be used at runtime
Killed
We build the real object using the same command as we used to build
the stub, omitting the -z stub option, and writing the results to a different
file.
% cc -Kpic -G -M mapfile -h libidx5.so.1 idx5.c -o lib/libidx5.so.1
Once the real object has been built in the lib subdirectory, the program
can be run.
% ./a.out
[0] 0 0 0
[1] 1 1 1
[2] 2 2 2
[3] 3 3 3
[4] 4 4 4
Mapfile Changes
The version 2 mapfile syntax was extended in a number of places
to accommodate stub objects.
Conditional Input
The version 2 mapfile syntax has the ability to conditionalize mapfile
input using the $if control directive. As you might imagine, these
directives are used frequently with ASSERT directives for data, because
a given data symbol will frequently have a different size in 32 or 64-bit
code, or on differing hardware such as x86 versus sparc.
The link-editor maintains an internal table of names that can be used
in the logical expressions evaluated by $if and $elif. At startup, this
table is initialized with items that describe the class of object
(_ELF32 or _ELF64) and the type of the target machine (_sparc or _x86).
We found that there were a small number of cases in the Solaris
code base in which we needed to know what kind of object we were producing,
so we added the following new predefined items in order to address that
need:
Name        Meaning
...         ...
_ET_DYN     shared object
_ET_EXEC    executable object
_ET_REL     relocatable object
...         ...
STUB_OBJECT Directive
The new STUB_OBJECT directive informs the link-editor that the object
described by the mapfile can be built as a stub object.
STUB_OBJECT;
A stub shared object is built entirely from the information
in the mapfiles supplied on the command line. When the -z stub
option is specified to build a stub
object, the presence of the STUB_OBJECT directive in a mapfile is
required, and the link-editor uses the information in symbol
ASSERT attributes to create global symbols that match those of
the real object.
When the real object is built, the presence of STUB_OBJECT
causes the link-editor to verify that the mapfiles accurately
describe the real object interface,
and that a stub object built from them will provide
the same linking interface as the real object it represents.
All function and data symbols that make up the external interface
to the object must be explicitly listed in the mapfile.
The mapfile must use symbol scope
reduction ('*'), to remove any symbols not explicitly listed from
the external interface.
All global data in the object is
required to have an ASSERT attribute that specifies the symbol type
and size.
If the ASSERT BINDING attribute is
not present, the link-editor provides a default assertion that the
symbol must be GLOBAL.
If the ASSERT SH_ATTR attribute is
not present, or does not specify that the section is
one of BITS or NOBITS, the link-editor provides a default assertion
that the associated section is BITS.
All data symbols that describe the
same address and size are required to have ASSERT ALIAS attributes
specified in the mapfile. If aliased symbols are discovered that do
not have an ASSERT ALIAS specified, the link fails and no object
is produced.
These rules ensure that the mapfiles contain a description of the
real shared object's linking interface that is sufficient to produce a
stub object with a completely compatible linking interface.
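As a rough illustration of the default assertions described above, the following C sketch models how a missing binding or section attribute might be filled in. The types and the function apply_stub_defaults are hypothetical names invented for this example. They are not taken from the link-editor sources, and the real implementation may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the assertions recorded for one data symbol. */
enum binding { BIND_UNSET, BIND_GLOBAL, BIND_WEAK };
enum sh_attr { SH_UNSET, SH_BITS, SH_NOBITS };

struct data_assert {
        enum binding binding;   /* from ASSERT BINDING, if given */
        enum sh_attr sh_attr;   /* from ASSERT SH_ATTR, if given */
};

/*
 * Fill in the defaults that STUB_OBJECT is described as supplying:
 * GLOBAL binding, and a BITS section, when the mapfile says nothing.
 * Explicit assertions are left untouched.
 */
void
apply_stub_defaults(struct data_assert *a)
{
        assert(a != NULL);
        if (a->binding == BIND_UNSET)
                a->binding = BIND_GLOBAL;
        if (a->sh_attr == SH_UNSET)
                a->sh_attr = SH_BITS;
}
```

A symbol whose mapfile entry gives only TYPE and SIZE would therefore be checked as a GLOBAL symbol in a BITS section.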
SYMBOL_SCOPE/SYMBOL_VERSION ASSERT Attribute
The SYMBOL_SCOPE and SYMBOL_VERSION mapfile directives were
extended with a symbol attribute named ASSERT. The syntax for
the ASSERT attribute is as follows:
ASSERT {
        ALIAS = symbol_name;
        BINDING = symbol_binding;
        TYPE = symbol_type;
        SH_ATTR = section_attributes;
        SIZE = size_value;
        SIZE = size_value[count];
};
The ASSERT attribute is used to specify the expected characteristics of
the symbol. The link-editor compares the symbol characteristics
that result from the link to those given by ASSERT attributes.
If the real and asserted attributes do not agree, a fatal
error is issued and the output object is not created.
In normal use, the link-editor evaluates ASSERT attributes when
present, but does not require them or provide default values for them.
The presence of the STUB_OBJECT directive in a mapfile alters the
interpretation of ASSERT to require them under some circumstances,
and to supply default assertions if explicit ones are not present.
See the definition of the STUB_OBJECT Directive for the details.
When the -z stub command line option is specified to build a stub object,
the information provided by ASSERT attributes is used to define the
attributes of the global symbols provided by the object.
ASSERT accepts the following:
ALIAS
Name of a previously defined symbol that this symbol is an alias for.
An alias symbol has the same type, value, and size as the main symbol.
The ALIAS attribute is mutually exclusive to the TYPE, SIZE, and
SH_ATTR attributes, and cannot be used with them. When ALIAS
is specified, the type, size, and section attributes are obtained
from the alias symbol.
BINDING
Specifies an ELF symbol binding, which can be any of the
STB_ constants defined in <sys/elf.h>, with the STB_ prefix removed
(e.g. GLOBAL, WEAK).
TYPE
Specifies an ELF symbol type, which can be any of the
STT_ constants defined in <sys/elf.h>, with the STT_ prefix removed
(e.g. OBJECT, COMMON, FUNC). In addition, for compatibility with
other mapfile usage, FUNCTION and DATA can be specified, for
STT_FUNC and STT_OBJECT, respectively. TYPE is mutually exclusive to
ALIAS, and cannot be used in conjunction with it.
SH_ATTR
Specifies attributes of the section associated with the symbol.
The section_attributes that can be specified are given
in the following table:
Section Attribute    Meaning
BITS                 Section is not of type SHT_NOBITS
NOBITS               Section is of type SHT_NOBITS
SH_ATTR is mutually exclusive to
ALIAS, and cannot be used in conjunction with it.
SIZE
Specifies the expected symbol size. SIZE is mutually exclusive to
ALIAS, and cannot be used in conjunction with it.
The syntax for the size_value argument is as described
in the discussion of the
SIZE attribute below.
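The exclusivity rules for ALIAS can be summarized in a few lines of C. This is only a sketch of the rule as described here. The structure assert_items and the function assert_items_valid are hypothetical names, not link-editor code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of which ASSERT items a mapfile entry supplied. */
struct assert_items {
        bool has_alias;
        bool has_binding;
        bool has_type;
        bool has_sh_attr;
        bool has_size;
};

/*
 * Return true if the combination is allowed. ALIAS can be combined
 * with BINDING, but not with TYPE, SH_ATTR, or SIZE, because those
 * values are taken from the symbol that ALIAS names.
 */
bool
assert_items_valid(const struct assert_items *it)
{
        assert(it != NULL);
        if (it->has_alias &&
            (it->has_type || it->has_sh_attr || it->has_size))
                return (false);
        return (true);
}
```

Under this model, the idx5 entry from the earlier mapfile (BINDING plus ALIAS) is accepted, while an entry that gives both ALIAS and SIZE is rejected.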
SIZE
The SIZE symbol attribute existed before support for stub objects
was introduced. It is used to set the size attribute of a given symbol.
This attribute results in the creation of a symbol definition.
Prior to the introduction of the ASSERT SIZE attribute, the value of
a SIZE attribute was always numeric. While attempting to apply ASSERT
SIZE to the objects in the Solaris ON consolidation, I found that many
data symbols have a size based on the natural machine wordsize
for the class of object being produced. Variables declared as
long, or as a pointer,
will be 4 bytes in size in a 32-bit object, and 8 bytes in a 64-bit object.
Initially, I employed the conditional $if directive to handle these cases
as follows:
$if _ELF32
foo { ASSERT { TYPE=data; SIZE=4 } };
bar { ASSERT { TYPE=data; SIZE=20 } };
$elif _ELF64
foo { ASSERT { TYPE=data; SIZE=8 } };
bar { ASSERT { TYPE=data; SIZE=40 } };
$else
$error UNKNOWN ELFCLASS
$endif
I found that the situation occurs frequently enough that this is
cumbersome. To simplify this case, I introduced the
idea of the addrsize symbolic name, and of a repeat count,
which together make it simple to specify machine word scalar or array
symbols.
Both the SIZE and ASSERT SIZE attributes support this syntax:
The size_value argument can be a numeric value, or it can be
the symbolic name addrsize.
addrsize represents the size of a machine word capable
of holding a memory address. The link-editor substitutes the value
4 for addrsize when building 32-bit objects, and the value
8 when building 64-bit objects. addrsize is useful for
representing the size of pointer variables and C variables of type
long, as it automatically adjusts for 32 and 64-bit objects
without requiring the use of conditional input.
The size_value argument can be optionally suffixed with a
count value, enclosed in square brackets. If
count is present, size_value and count
are multiplied together to obtain the final size value.
Using this feature, the example above can be written more naturally as:
foo { ASSERT { TYPE=data; SIZE=addrsize } };
bar { ASSERT { TYPE=data; SIZE=addrsize[5] } };
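The arithmetic behind addrsize and the repeat count is simple enough to sketch in C. The function resolve_size below is a hypothetical helper written for this illustration, not part of ld. It assumes a numeric size_value is given in decimal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: compute the byte size described by a mapfile
 * SIZE value of the form size_value[count].
 *
 * size_value is either the symbolic name "addrsize" or a decimal number.
 * count is the repeat count; pass 1 when no [count] suffix is present.
 * is_elf64 selects the class of object (0 = 32-bit, nonzero = 64-bit).
 */
unsigned long
resolve_size(const char *size_value, unsigned long count, int is_elf64)
{
        unsigned long unit;

        assert(size_value != NULL);
        if (strcmp(size_value, "addrsize") == 0)
                unit = is_elf64 ? 8 : 4;        /* size of an address */
        else
                unit = strtoul(size_value, NULL, 10);
        return (unit * count);
}
```

With this model, addrsize[5] yields 20 for a 32-bit object and 40 for a 64-bit object, matching the sizes given for bar in the $if version of the example.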
Exported Global Data Is Still A Bad Idea
As you can see, the additional plumbing added to the Solaris
link-editor to support stub objects is minimal. Furthermore,
about 90% of that plumbing is dedicated to handling global
data.
We have long advised against global data
exported from shared objects. There are many ways in which global
data does not fit well with dynamic linking. Stub objects simply provide
one more reason to avoid this practice. It is always better to export all
data via a functional interface. You should always hide your
data, and make it available to your users via a function
that they can call to acquire the address of the data item. However,
if you do have to support global data for a stub, perhaps because you
are working with an already existing object, it is still easily done,
as shown above.
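For comparison, here is a minimal sketch of the functional approach. The names idx5_private and idx5_data are hypothetical, chosen for this illustration, and are not part of the idx5 example shown earlier.

```c
#include <assert.h>
#include <stddef.h>

/* The data is static, so it is never part of the exported interface. */
static const int idx5_private[5] = { 0, 1, 2, 3, 4 };

/*
 * Hypothetical accessor: return the address of the data and store the
 * number of elements in *countp. Callers reach the data only through
 * this function.
 */
const int *
idx5_data(size_t *countp)
{
        assert(countp != NULL);
        *countp = sizeof (idx5_private) / sizeof (idx5_private[0]);
        return (idx5_private);
}
```

A library written this way exports only function symbols, so its mapfile needs no ASSERT attributes for data.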
Oracle does not like us to discuss hypothetical new features that
don't exist in shipping product, so I'll end this section with
a speculation.
It might be possible to do more in this area
to ease the difficulty of dealing with objects that have global data
that the users of the library don't need. Perhaps someday...
Conclusions
It is easy to create stub objects for most objects. If your library
only exports function symbols, all you have
to do to build a faithful stub object is to add
STUB_OBJECT;
and then to use the same link command you're currently using, with
the addition of the -z stub option.
Happy Stubbing!