www.digitalmars.com

D Programming Language 2.0


Last update Tue Nov 27 21:24:12 2007

D Application Binary Interface

A D implementation that conforms to the D ABI (Application Binary Interface) will be able to generate libraries, DLLs, etc., that can interoperate with D binaries built by other implementations.

Most of this specification remains TBD (To Be Defined).

C ABI

The C ABI referred to in this specification means the C Application Binary Interface of the target system. C and D code should be freely linkable together; in particular, D code shall have access to the entire C ABI runtime library.
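
As a minimal sketch of this interoperability, D code can call a C runtime function directly. The sketch assumes a current D toolchain in which the C runtime bindings are exposed through the core.stdc package; that module path and the helper name cLength are assumptions, not part of this specification.

```d
import core.stdc.string : strlen;

// Hypothetical helper: forwards a zero-terminated string to the C runtime.
size_t cLength(const(char)* s)
{
    return strlen(s);
}

unittest
{
    // D string literals are zero-terminated and convert implicitly to
    // const(char)*, so they can be handed straight to a C function.
    assert(cLength("hello") == 5);
}
```

The unittest block is compiled in only when unit tests are enabled (for dmd, the -unittest switch).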

Basic Types

TBD

Structs

Conforms to the target's C ABI struct layout.
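
A small sketch of what this implies, assuming a target whose C ABI aligns int on a 4-byte boundary (as on x86). The struct Sample is hypothetical and exists only for illustration.

```d
// Hypothetical struct used only for illustration.
struct Sample
{
    ubyte tag;   // offset 0
    int   value; // padded up to the alignment of int
}

// These hold on targets whose C ABI aligns int on 4 bytes (e.g. x86).
static assert(Sample.tag.offsetof == 0);
static assert(Sample.value.offsetof == 4);
static assert(Sample.sizeof == 8);
```

A C compiler for the same target would be expected to lay out the equivalent C struct with the same offsets and size.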

Classes

An object consists of:

offset        contents
0             pointer to vtable
ptrsize       monitor
ptrsize*2...  non-static members

The vtable consists of:

offset      contents
0           pointer to instance of ClassInfo
ptrsize...  pointers to virtual member functions
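
The two layouts above can be observed from D code. The following is a sketch only: the class Point is hypothetical, and it assumes a current compiler that provides __traits(classInstanceSize, ...) and exposes the vtable through the vtbl member of the class's ClassInfo, neither of which is defined by this specification.

```d
// Hypothetical class used only for illustration.
class Point
{
    int x;
    int y;
}

enum ptrsize = (void*).sizeof;

// An instance holds the vtable pointer, the monitor, then the two ints.
static assert(__traits(classInstanceSize, Point) == 2 * ptrsize + 2 * int.sizeof);

unittest
{
    auto p = new Point;

    // The first non-static member follows the vtable pointer and the monitor.
    assert(p.x.offsetof == 2 * ptrsize);

    // The first word of the object is the pointer to the vtable.
    void* firstWord = *cast(void**) cast(void*) p;
    assert(firstWord is typeid(Point).vtbl.ptr);

    // Slot 0 of the vtable points at the ClassInfo instance.
    assert(typeid(Point).vtbl[0] is cast(void*) typeid(Point));
}
```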

The class definition:

class XXXX
{
    ....
};

Generates the following:

Interfaces

TBD

Arrays

A dynamic array consists of:

offset  contents
0       array dimension
size_t  pointer to array data

A dynamic array is declared as:

type[] array;

whereas a static array is declared as:

type[dimension] array;

Thus, a static array always has the dimension statically available as part of the type, and so it is implemented as in C. Static arrays and dynamic arrays can be easily converted back and forth to each other.
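
The layout and the conversions can be sketched as follows. This is an illustration under the assumption of a current D compiler, not a normative example; the variable names are arbitrary.

```d
// A dynamic array is two machine words: the dimension, then the data pointer.
static assert((int[]).sizeof == size_t.sizeof + (void*).sizeof);

unittest
{
    int[] dyn = [10, 20, 30];

    // Reinterpret the array reference as raw words to inspect its layout.
    auto words = cast(size_t*) &dyn;
    assert(words[0] == dyn.length);          // offset 0: array dimension
    assert(cast(int*) words[1] is dyn.ptr);  // next word: pointer to array data

    // Static to dynamic: slicing yields a view of the same storage.
    int[3] fixed = [1, 2, 3];
    int[] view = fixed[];
    assert(view.length == 3 && view.ptr is fixed.ptr);

    // Dynamic to static: the elements are copied into the fixed-size array.
    int[3] copy = dyn[0 .. 3];
    assert(copy == [10, 20, 30]);
}
```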

Associative Arrays

Associative arrays consist of a pointer to an opaque, implementation-defined type. The current implementation is contained in phobos/internal/aaA.d.

Reference Types

D has reference types, but they are implicit. For example, classes are always referred to by reference; this means that class instances can never reside on the stack or be passed as function parameters.

When passing a static array to a function, the result, although declared as a static array, will actually be a reference to a static array. For example:

int[3] abc;

Passing abc to functions results in these implicit conversions:

void func(int[3] array); // actually <reference to><array[3] of><int>
void func(int* p);       // abc is converted to a pointer
                         // to the first element
void func(int[] array);  // abc is converted to a dynamic array

Name Mangling

D accomplishes typesafe linking by mangling a D identifier to include scope and type information.

MangledName:
    _D QualifiedName Type
    _D QualifiedName M Type

QualifiedName:
    SymbolName
    SymbolName QualifiedName

SymbolName:
    LName
    TemplateInstanceName

The M means that the symbol is a function that requires a this pointer.

Template instance names have the types and values of their parameters encoded into them:

TemplateInstanceName:
     __T LName TemplateArgs Z

TemplateArgs:
    TemplateArg
    TemplateArg TemplateArgs

TemplateArg:
    T Type
    V Type Value
    S LName

Value:
    n
    Number
    N Number
    e HexFloat
    c HexFloat c HexFloat
    Width Number _ HexDigits
    A Number Value...

Width:
    a
    w
    d

HexFloat:
    NAN
    INF
    NINF
    N HexDigits P Exponent
    HexDigits P Exponent

Exponent:
    N Number
    Number

HexDigits:
    HexDigit
    HexDigit HexDigits

HexDigit:
    Digit
    A
    B
    C
    D
    E
    F

n
    is for null arguments.
Number
    is for positive numeric literals (including character literals).
N Number
    is for negative numeric literals.
e HexFloat
    is for real and imaginary floating point literals.
c HexFloat c HexFloat
    is for complex floating point literals.
Width Number _ HexDigits
    is for string literals. Width indicates whether the characters are 1 byte (a), 2 bytes (w) or 4 bytes (d) in size. Number is the number of characters in the string. The HexDigits are the hex data for the string.
A Number Value...
    is for array literals. Value is repeated Number times.

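
The integer, string, and array forms of Value can be sketched as small encoders. The function names are hypothetical, the string encoder handles only 1-byte characters (Width a), and the upper-case hex digits follow the HexDigit grammar as written here; a given compiler may emit something different.

```d
import std.conv : to;
import std.format : format;

// Number or N Number: an integer literal, with N marking a negative value.
string encodeInt(int value)
{
    long wide = value; // widen so that negating int.min cannot overflow
    return wide < 0 ? "N" ~ to!string(-wide) : to!string(wide);
}

// Width Number _ HexDigits: a string of 1-byte characters (Width a).
string encodeString(string s)
{
    string hex;
    foreach (char c; s)
        hex ~= format("%02X", cast(ubyte) c);
    return "a" ~ to!string(s.length) ~ "_" ~ hex;
}

// A Number Value...: an array literal of integers.
string encodeIntArray(int[] values)
{
    string result = "A" ~ to!string(values.length);
    foreach (v; values)
        result ~= encodeInt(v);
    return result;
}

unittest
{
    assert(encodeInt(5) == "5");
    assert(encodeInt(-5) == "N5");
    assert(encodeString("JK") == "a2_4A4B");
    assert(encodeIntArray([1, -2]) == "A21N2");
}
```
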
Name:
    Namestart
    Namestart Namechars

Namestart:
    _
    Alpha

Namechar:
    Namestart
    Digit

Namechars:
    Namechar
    Namechar Namechars

A Name is a standard D identifier.

LName:
    Number Name

Number:
    Digit
    Digit Number

Digit:
    0
    1
    2
    3
    4
    5
    6
    7
    8
    9

An LName is a name preceded by a Number giving the number of characters in the Name.
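
The name grammar can be sketched as a few string-building helpers. All function names here are hypothetical, the helpers follow the grammar exactly as written above, and the Type portion is supplied already mangled by the caller (see Type Mangling). The sketch assumes ASCII identifiers, so the byte length equals the character count.

```d
import std.array : split;
import std.conv : to;

// LName: the character count followed by the name itself.
string lname(string name)
{
    return to!string(name.length) ~ name;
}

// QualifiedName: one LName per component of a dotted name.
string qualifiedName(string dotted)
{
    string result;
    foreach (part; dotted.split("."))
        result ~= lname(part);
    return result;
}

// MangledName: _D QualifiedName [M] Type.
string mangledName(string dotted, string type, bool needsThis = false)
{
    return "_D" ~ qualifiedName(dotted) ~ (needsThis ? "M" : "") ~ type;
}

// TemplateInstanceName: __T LName TemplateArgs Z, with each argument
// supplied ready-encoded (for example "Ti" or "Vi5").
string templateInstanceName(string name, string[] encodedArgs)
{
    string result = "__T" ~ lname(name);
    foreach (arg; encodedArgs)
        result ~= arg;
    return result ~ "Z";
}

unittest
{
    assert(lname("foo") == "3foo");
    assert(qualifiedName("mod.foo") == "3mod3foo");
    assert(mangledName("mod.foo", "FiZi") == "_D3mod3fooFiZi");            // int foo(int)
    assert(mangledName("mod.C.bar", "FiZv", true) == "_D3mod1C3barMFiZv"); // void C.bar(int)
    assert(templateInstanceName("Foo", ["Ti", "Vi5"]) == "__T3FooTiVi5Z"); // Foo!(int, 5)
}
```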

Type Mangling

Types are mangled using a simple linear scheme:

Type:
    Const
    Invariant
    TypeArray
    TypeSarray
    TypeAarray
    TypePointer
    TypeFunction
    TypeIdent
    TypeClass
    TypeStruct
    TypeEnum
    TypeTypedef
    TypeDelegate
    TypeNone
    TypeVoid
    TypeByte
    TypeUbyte
    TypeShort
    TypeUshort
    TypeInt
    TypeUint
    TypeLong
    TypeUlong
    TypeFloat
    TypeDouble
    TypeReal
    TypeIfloat
    TypeIdouble
    TypeIreal
    TypeCfloat
    TypeCdouble
    TypeCreal
    TypeBool
    TypeChar
    TypeWchar
    TypeDchar
    TypeTuple

Const:
    x Type

Invariant:
    y Type

TypeArray:
    A Type

TypeSarray:
    G Number Type

TypeAarray:
    H Type Type

TypePointer:
    P Type

TypeFunction:
    CallConvention Arguments ArgClose Type

CallConvention:
    F       // D
    U       // C
    W       // Windows
    V       // Pascal
    R       // C++

Arguments:
    Argument
    Argument Arguments

Argument:
    Type
    J Type  // out
    K Type  // ref
    L Type  // lazy

ArgClose:
    X       // variadic T t...) style
    Y       // variadic T t,...) style
    Z       // not variadic

TypeIdent:
    I LName

TypeClass:
    C LName

TypeStruct:
    S LName

TypeEnum:
    E LName

TypeTypedef:
    T LName

TypeDelegate:
    D TypeFunction

TypeNone:
    n

TypeVoid:
    v

TypeByte:
    g

TypeUbyte:
    h

TypeShort:
    s

TypeUshort:
    t

TypeInt:
    i

TypeUint:
    k

TypeLong:
    l

TypeUlong:
    m

TypeFloat:
    f

TypeDouble:
    d

TypeReal:
    e

TypeIfloat:
    o

TypeIdouble:
    p

TypeIreal:
    j

TypeCfloat:
    q

TypeCdouble:
    r

TypeCreal:
    c

TypeBool:
    b

TypeChar:
    a

TypeWchar:
    u

TypeDchar:
    w

TypeTuple:
    B Number Arguments
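
The scheme can be checked against a compiler through the .mangleof property of a type. The following sketch assumes a current D compiler, in which the Invariant qualifier is spelled immutable; the alias names are arbitrary, and manglings of more complex types may differ between compiler versions.

```d
// Basic types map to single letters.
static assert(int.mangleof == "i");
static assert(ulong.mangleof == "m");
static assert(dchar.mangleof == "w");

// Derived types prefix the mangling of the type they are built from.
alias ConstInt = const(int);
alias ImmutableInt = immutable(int);
static assert((int[]).mangleof == "Ai");      // TypeArray
static assert((int[3]).mangleof == "G3i");    // TypeSarray
static assert((int[char]).mangleof == "Hai"); // TypeAarray: key, then value
static assert((int*).mangleof == "Pi");       // TypePointer
static assert(ConstInt.mangleof == "xi");     // Const
static assert(ImmutableInt.mangleof == "yi"); // Invariant

// Function pointers are a TypePointer to a TypeFunction; delegates use D.
alias Fn = int function(int);
alias Dg = void delegate(out int, ref int, lazy int);
extern (C) alias CFn = int function(int);
static assert(Fn.mangleof == "PFiZi");
static assert(Dg.mangleof == "DFJiKiLiZv");
static assert(CFn.mangleof == "PUiZi");

// ArgClose distinguishes the two variadic styles.
alias DStyleVariadic = void function(int, ...);
alias TypesafeVariadic = void function(int[]...);
static assert(DStyleVariadic.mangleof == "PFiYv");
static assert(TypesafeVariadic.mangleof == "PFAiXv");
```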

Function Calling Conventions

The extern (C) calling convention matches the C calling convention used by the supported C compiler on the host system. The extern (D) calling convention for x86 is described here.

Register Conventions

Return Value

Parameters

The parameters to the non-variadic function:

	foo(a1, a2, ..., an);

are passed as follows:

a1
a2
...
an
hidden
this

where hidden is present if needed to return a struct value, and this is present if needed as the this pointer for a member function or the context pointer for a nested function.

The last parameter is passed in EAX rather than being pushed on the stack if the following conditions are met:

Parameters are always pushed as multiples of 4 bytes, rounding upwards, so the stack is always aligned on 4-byte boundaries. They are pushed most significant first. out and ref parameters are passed as pointers. Static arrays are passed as pointers to their first element. On Windows, a real is pushed as a 10-byte quantity and a creal as a 20-byte quantity. On Linux, a real is pushed as a 12-byte quantity and a creal as two 12-byte quantities; the extra two bytes of pad occupy the 'most significant' position.
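
The rounding rule can be written as a one-line helper. The name pushedSize is hypothetical, and the sketch covers only the general round-up-to-4 rule, not the real and creal sizes listed above.

```d
// Bytes occupied on the stack by a parameter of the given size,
// rounded up to the next multiple of 4.
size_t pushedSize(size_t bytes)
{
    return (bytes + 3) & ~cast(size_t) 3;
}

static assert(pushedSize(1) == 4); // e.g. a byte or char
static assert(pushedSize(4) == 4); // e.g. an int
static assert(pushedSize(5) == 8); // e.g. a 5-byte struct
```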

The callee cleans the stack.

The parameters to the variadic function:

	void foo(int p1, int p2, int[] p3...)
	foo(a1, a2, ..., an);

are passed as follows:

p1
p2
a3
hidden
this

The variadic part is converted to a dynamic array and the rest is the same as for non-variadic functions.
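
At the source level, this kind of variadic function looks as follows. The function sum is a hypothetical example; the point is that the callee sees the trailing arguments as an ordinary dynamic array, whether they were written individually or passed as an array.

```d
// Hypothetical example: the trailing arguments arrive as one int[].
int sum(int first, int[] rest...)
{
    int total = first;
    foreach (v; rest)
        total += v;
    return total;
}

unittest
{
    assert(sum(1, 2, 3, 4) == 10);   // arguments gathered into an array
    assert(sum(1, [2, 3, 4]) == 10); // an existing array is passed as is
    assert(sum(5) == 5);             // the variadic part may be empty
}
```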

The parameters to the variadic function:

	void foo(int p1, int p2, ...)
	foo(a1, a2, a3, ..., an);

are passed as follows:

an
...
a3
a2
a1
_arguments
hidden
this

The caller is expected to clean the stack. _argptr is not passed; it is computed by the callee.
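
From the callee's side, _arguments and _argptr are used as in the sketch below. It assumes a current D runtime in which va_arg is provided by the core.vararg module (an assumption, not part of this specification); the function names are hypothetical, and sumInts handles int arguments only.

```d
import core.vararg;

// Hypothetical example: _arguments holds one TypeInfo per variadic argument.
size_t countArgs(...)
{
    return _arguments.length;
}

// Hypothetical example: walks the arguments through _argptr.
int sumInts(...)
{
    int total = 0;
    foreach (ti; _arguments)
    {
        assert(ti == typeid(int), "this sketch handles int arguments only");
        total += va_arg!int(_argptr);
    }
    return total;
}

unittest
{
    assert(countArgs(1, 2.5, "x") == 3);
    assert(sumInts(1, 2, 3) == 6);
}
```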

Exception Handling

Windows

Conforms to the Microsoft Windows Structured Exception Handling conventions.

Linux

Uses static address range/handler tables. TBD

Garbage Collection

The interface to this is found in phobos/internal/gc.

Runtime Helper Functions

These are found in phobos/internal.

Module Initialization and Termination

TBD

Unit Testing

TBD

Symbolic Debugging

D has types that are not represented in existing C or C++ debuggers: dynamic arrays, associative arrays, and delegates. Representing these types as structs causes problems because the function calling conventions for structs often differ from those for these types, which causes C/C++ debuggers to misrepresent them. For these debuggers, each such type is represented as a C type that does match its calling conventions. The dmd compiler generates only C symbolic type info when given the -gc compiler switch.

Types for C Debuggers
D type             C representation
dynamic array      unsigned long long
associative array  void*
delegate           long long
dchar              unsigned long
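
The C representations are chosen to have the same size as the D types they stand in for. A sketch of that correspondence, assuming a current D compiler; the alias Callback is hypothetical, and the version (X86) block applies only to 32-bit x86 targets, where D's long and ulong correspond to C's long long and unsigned long long.

```d
alias Callback = void delegate();

// Sizes in terms of machine words, independent of the target.
static assert((int[]).sizeof == 2 * size_t.sizeof);   // dimension + pointer
static assert((int[string]).sizeof == (void*).sizeof); // one opaque pointer
static assert(Callback.sizeof == 2 * (void*).sizeof);  // context + function
static assert(dchar.sizeof == 4);

version (X86)
{
    // On 32-bit x86, two words occupy the same space as a 64-bit integer.
    static assert((int[]).sizeof == ulong.sizeof);
    static assert(Callback.sizeof == long.sizeof);
}
```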

For debuggers that can be modified to accept new types, the following extensions help them fully support the types.

Codeview Debugger Extensions

The D dchar type is represented by the special primitive type 0x78.

D makes use of the Codeview OEM generic type record indicated by LF_OEM (0x0015). The format is:

Codeview OEM Extensions for D
field size         2           2               2       2            2           2
D Type             Leaf Index  OEM Identifier  recOEM  num indices  type index  type index
dynamic array      LF_OEM      OEM             1       2            @index      @element
associative array  LF_OEM      OEM             2       2            @key        @element
delegate           LF_OEM      OEM             3       2            @this       @function

OEM       0x42
index     type index of array index
key       type index of key
element   type index of array element
this      type index of context pointer
function  type index of function

These extensions can be pretty-printed by obj2asm.

The Ddbg debugger supports them.

Dwarf Debugger Extensions

The following leaf types are added:

Dwarf Extensions for D
D type             Identifier            Value  Format
dynamic array      DW_TAG_darray_type    0x41   DW_AT_type is element type
associative array  DW_TAG_aarray_type    0x42   DW_AT_type is element type, DW_AT_containing_type is key type
delegate           DW_TAG_delegate_type  0x43   DW_AT_type is function type, DW_AT_containing_type is 'this' type

These extensions can be pretty-printed by dumpobj.

The ZeroBUGS debugger supports them.