Type tags (SBCL Internals)

6.1 Type tags

The in-memory representation of Lisp data includes type information about each object. This type information takes the form of a lowtag in the low bits of each pointer to heap space, a widetag for each boxed immediate value and a header (also with a widetag) at the start of the allocated space for each object. These tags are used to inform both the GC and Lisp code about the type and allocated size of Lisp objects.

6.1.1 Lowtags

Objects allocated on the Lisp heap are aligned to a double-word boundary, leaving the low-order bits of each pointer (which would normally identify a particular octet within the first two words) available to hold type information. This amounts to three bits on 32-bit systems and four bits on 64-bit systems.
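As a quick check of the arithmetic, the sketch below derives the number of free low bits from the alignment. The function name `lowtag_bits` is made up for this illustration and is not an SBCL identifier.

```python
def lowtag_bits(word_bytes):
    """Number of always-zero low bits in a double-word-aligned address."""
    alignment = 2 * word_bytes            # double-word boundary, in bytes
    return alignment.bit_length() - 1     # log2 of a power of two

for word_bytes in (4, 8):
    bits = lowtag_bits(word_bytes)
    print(f"{word_bytes * 8}-bit: {bits} lowtag bits, {1 << bits} lowtags")
```

This gives 8 possible lowtags on 32-bit systems and 16 on 64-bit systems.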

The allocation of these 8 or 16 tags is subject to some constraints:

We need 6 of the low 8 bits of the word for widetags, meaning that one out of every four lowtags must be an other-immediate lowtag.
We have four pointer types: instance (struct and CLOS) pointers, function pointers, list pointers, and other pointers.
Fixnums are required to have lowtags consisting entirely of zeros.
There are additional constraints around the ordering of the pointer types, particularly with respect to list pointers (the NIL-cons hack).

Complicating matters, while the lowtag space is three or four bits wide, some of the lowtags are effectively narrower. The other-immediate tags effectively have a two-bit lowtag. Fixnums have historically been one bit narrower than the other lowtags (hence even-fixnum-lowtag and odd-fixnum-lowtag), though with the work on wider fixnums on 64-bit systems this is no longer necessarily so.
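One way to picture an "effectively narrower" tag: if the other-immediate lowtags are exactly those sharing a fixed pattern in their low two bits, then one lowtag in four is other-immediate regardless of the full lowtag width. The pattern `0b01` below is an assumption chosen for illustration, not a value taken from the SBCL source.

```python
OTHER_IMMEDIATE_PATTERN = 0b01   # hypothetical low-two-bit pattern

def other_immediate_lowtags(n_lowtag_bits):
    """All lowtags whose low two bits match the other-immediate pattern."""
    return [tag for tag in range(1 << n_lowtag_bits)
            if (tag & 0b11) == OTHER_IMMEDIATE_PATTERN]

for bits in (3, 4):
    tags = other_immediate_lowtags(bits)
    print(bits, [format(t, f"0{bits}b") for t in tags])
```

With three lowtag bits this selects two of the eight lowtags; with four bits it selects four of the sixteen.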

The lowtags are specified in src/compiler/generic/early-objdef.lisp.

6.1.1.1 Fixnums

Fixnums are signed integers represented as immediate values. In SBCL, these integers are (- n-word-bits n-fixnum-tag-bits) bits wide, stored in the most-significant section of a machine word.

Fixnum tags are required to have the low n-fixnum-tag-bits be zeros because this allows addition and subtraction to be performed directly with native machine instructions, and multiplication and division to be performed likewise, with a simple shift instruction to compensate for the effect of the tag.
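The following sketch models tagged fixnum arithmetic on plain integers. It assumes a one-bit fixnum tag (the real n-fixnum-tag-bits depends on the platform), the helper names are invented for the example, and Python's unbounded integers mean that overflow out of the fixnum range is not modelled.

```python
N_FIXNUM_TAG_BITS = 1   # assumed for this sketch

def tag_fixnum(n):
    return n << N_FIXNUM_TAG_BITS      # low tag bits become zero

def untag_fixnum(word):
    return word >> N_FIXNUM_TAG_BITS   # arithmetic shift keeps the sign

def fixnum_add(x, y):
    return x + y                       # tagged + tagged is already tagged

def fixnum_sub(x, y):
    return x - y

def fixnum_mul(x, y):
    # The product of two tagged words carries the tag shift twice,
    # so one operand is shifted down first.
    return (x >> N_FIXNUM_TAG_BITS) * y

def fixnum_floor_div(x, y):
    # The tag shifts cancel in the quotient, so it is shifted back up.
    return (x // y) << N_FIXNUM_TAG_BITS

a, b = tag_fixnum(21), tag_fixnum(4)
print(untag_fixnum(fixnum_add(a, b)), untag_fixnum(fixnum_mul(a, b)))
```

Addition and subtraction need no adjustment at all, which is the point of the all-zeros tag; only multiplication and division pay for one extra shift.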

6.1.1.2 Other-immediates

Other-immediates are the lowtag part of widetag values. Due to the constraints of widetag allocation, one out of every four lowtags must be an other-immediate lowtag (alternatively, the effective width of the other-immediate lowtag is two bits).

6.1.1.3 Pointers

There are four different pointer lowtags, largely for optimization purposes.

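We have a distinct list pointer tag so that we can do a listp test by simply checking the pointer tag instead of needing to retrieve a header word for each cons cell. This effectively halves the memory cost of cons cells: a headerless cons occupies two words, whereas adding a header word would make three, which double-word alignment rounds up to four.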
We have a distinct instance pointer tag so that we do not need to check a header word for each instance when doing a type check. This saves a memory access for retrieving the class of an instance.
We have a distinct function pointer tag so that we do not need to check a header word to determine if a given pointer is directly funcallable (that is, if the pointer is to a closure, a simple-fun, or a funcallable-instance). This saves a memory access in the type test prior to funcall or apply of a function object.
We have one last pointer tag for everything else. Obtaining further type information from these pointers requires fetching the header word and dispatching on the widetag.
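To make the list-pointer case concrete, the sketch below tests a lowtag with a single mask and compare, and works out the size of a cons cell with and without a header word. The lowtag value and the helper names are assumptions for illustration; the real assignments are in the SBCL source.

```python
N_LOWTAG_BITS = 4                        # 64-bit layout
LOWTAG_MASK = (1 << N_LOWTAG_BITS) - 1
LIST_POINTER_LOWTAG = 0b0111             # hypothetical value

def is_list_pointer(word):
    """listp-style test: no memory access, just mask and compare."""
    return (word & LOWTAG_MASK) == LIST_POINTER_LOWTAG

def cons_size_in_words(with_header):
    words = 2 + (1 if with_header else 0)   # car and cdr, plus optional header
    return words + (words % 2)              # round up to a double-word boundary

print(is_list_pointer(0x1000 | LIST_POINTER_LOWTAG))            # True
print(cons_size_in_words(False), cons_size_in_words(True))      # 2 4
```

The same mask-and-compare shape applies to the instance and function pointer tags; only pointers carrying the catch-all tag need a header fetch.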

6.1.2 Widetags

Widetags are used for three purposes: first, to provide type information for immediate (non-pointer) data such as characters; second, to provide “marker” values for things such as unbound slots; third, to provide type information for objects stored on the heap.

Because widetags are used for immediate data, they must have a lowtag component, which is provided by the other-immediate lowtags. For various reasons it was deemed convenient for widetags to be no more than eight bits wide. With 27 or more distinct array types (depending on build-time configuration), seven numeric types, markers, and non-numeric heap object headers, more than 32 widetags are required (though fewer than 64), so six bits are needed to distinguish them. That leaves two of the eight bits for the lowtag component, which is why one out of every four lowtags must be an other-immediate lowtag.
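The counting argument can be checked in a few lines. The type counts come from the text above (27 is a minimum, and markers and other headers are left out, so the true total is larger); the function name `bits_for` is invented for this sketch.

```python
WIDETAG_BITS = 8

def bits_for(n_values):
    """Smallest number of bits that can distinguish n_values values."""
    return max(1, (n_values - 1).bit_length())

array_types, numeric_types = 27, 7
lower_bound = array_types + numeric_types     # already more than 32

type_bits = bits_for(lower_bound)             # bits that identify the type
lowtag_part_bits = WIDETAG_BITS - type_bits   # bits left for the lowtag part
one_in = 1 << lowtag_part_bits                # one lowtag in this many

print(lower_bound, type_bits, lowtag_part_bits, one_in)   # 34 6 2 4
```

Any total from 33 to 64 gives the same answer, so the result does not depend on the exact number of markers and headers.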

As widetags are involved in type tests for non-CLOS objects, their allocation is carefully arranged to allow for certain type tests to be cheaper than they might otherwise be.

The numeric types are arranged to make rational, float, real, complex and number type tests become range tests on the widetag.
The array types are arranged to make various type tests become range tests on the widetag.
The string types have disjoint ranges, but have been arranged so that their ranges differ only by one bit, allowing the stringp type test to become a masking operation followed by a range test or a masking operation followed by a simple comparison.
There may be other such tricks; these are just the ones described in the comments above the widetag definitions.
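The sketch below shows the shape of these tests using made-up widetag values: seven numeric widetags laid out in order so that each numeric type test is a single range check, and two string widetags that differ in exactly one bit so that a string test is a mask followed by a comparison. None of the constants are SBCL's real values, and fixnums (which are identified by lowtag alone) are left out.

```python
# Hypothetical widetags, spaced by 4 because the low two bits are the lowtag part.
BIGNUM, RATIO = 0x11, 0x15
SINGLE_FLOAT, DOUBLE_FLOAT = 0x19, 0x1D
COMPLEX, COMPLEX_SINGLE, COMPLEX_DOUBLE = 0x21, 0x25, 0x29

def in_range(widetag, low, high):
    return low <= widetag <= high

def is_float(w):   return in_range(w, SINGLE_FLOAT, DOUBLE_FLOAT)
def is_real(w):    return in_range(w, BIGNUM, DOUBLE_FLOAT)
def is_complex(w): return in_range(w, COMPLEX, COMPLEX_DOUBLE)
def is_number(w):  return in_range(w, BIGNUM, COMPLEX_DOUBLE)

# Two hypothetical string widetags that differ only in bit 0x10.
BASE_STRING, CHARACTER_STRING = 0xC5, 0xD5
STRING_BIT = BASE_STRING ^ CHARACTER_STRING

def is_string(w):
    # Clear the differing bit, then compare against the common value.
    return (w & ~STRING_BIT) == BASE_STRING

print(is_real(DOUBLE_FLOAT), is_real(COMPLEX), is_string(CHARACTER_STRING))
```

Ordering the values so that related types are contiguous is what turns a multi-way comparison into one or two compares.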

The widetags are specified in src/compiler/generic/early-objdef.lisp.