[Return to Library]  [TOC]  [PREV]  SECT--  [NEXT]  [INDEX] [Help]

4    Data Manipulation

This chapter discusses the passing and storage of data. The topics included are data passing (Section 4.1) and data allocation (Section 4.2).



4.1    Data Passing

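The following sections define the calling standard conventions for passing data between procedures in a call chain. An argument item represents one unit of data being passed between procedures. The following topics are covered: argument passing mechanisms, normal and homed argument list structures, argument lists and high-level languages, unused bits in passed data, sending data, and returning data.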



4.1.1    Argument Passing Mechanisms

This Digital UNIX calling standard defines three classes of argument items according to the mechanism used to pass the argument: immediate value, reference, and descriptor.

Argument items are not self-defining; interpretation of each argument item depends on agreement between the calling and called procedures.

This standard does not dictate which of the three mechanisms must be used by a given language compiler. Language semantics and interoperability considerations might require different mechanisms to be used in different situations.



4.1.2    Normal Argument List Structure

The argument list in a Digital UNIX call is an ordered set of zero or more argument items, which together comprise a logically contiguous structure known as the argument item sequence. An argument item is represented in 64 bits.

An argument item can be used to pass arguments by immediate value, by reference, and by descriptor. Any combination of these mechanisms in an argument list is permitted.

Although the argument items form a logically contiguous sequence, they are, in practice, mapped to integer and floating-point registers and to memory in a fashion that can produce a physically discontiguous argument list. Registers $16 - $21 and $f16 - $f21 are used to pass the first six items of the argument item sequence. Additional argument items must be passed in a memory argument list that must be located at 0(SP) at the time of the call.

Table 4-1 specifies the standard locations in which argument items can be passed.


Table 4-1: Argument Item Locations
Argument Item   Integer Registers   Floating-point Registers   Stack
1               $16                 $f16                       -
2               $17                 $f17                       -
3               $18                 $f18                       -
4               $19                 $f19                       -
5               $20                 $f20                       -
6               $21                 $f21                       -
7 ... n         -                   -                          0(SP) ... (n-7)*8(SP)

The following general rules determine the location of any specific argument:

The argument list, including the in-memory portion, as well as the portion passed in registers, can be read from and written to by the called procedure. Therefore, the calling procedure must not make any assumptions about the validity of any part of the argument list after the completion of a call.
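
The mapping in Table 4-1 can be sketched as a small C helper. This is only an illustration: the type arg_location and the function locate_arg_item are hypothetical names, not part of the standard, and the sketch does not decide whether an item uses the integer or the floating-point register of a given number.

```c
#include <assert.h>
#include <stddef.h>

/* Where one argument item lives at the time of the call (hypothetical type). */
typedef struct {
    int    in_register;  /* 1 for items 1..6, 0 for items passed in memory */
    int    reg_number;   /* 16..21, meaning $16-$21 or $f16-$f21; 0 if in memory */
    size_t sp_offset;    /* byte offset from SP; meaningful only if in memory */
} arg_location;

/* item is the 1-based position in the argument item sequence. */
static arg_location locate_arg_item(int item)
{
    arg_location loc = {0, 0, 0};

    assert(item >= 1);
    if (item <= 6) {
        loc.in_register = 1;
        loc.reg_number = 15 + item;              /* item 1 -> 16, item 6 -> 21 */
    } else {
        loc.sp_offset = (size_t)(item - 7) * 8;  /* item 7 -> 0(SP) */
    }
    return loc;
}
```

For example, locate_arg_item(9) reports a memory item at offset 16 from SP, which matches the (n-7)*8(SP) entry in the table.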



4.1.3    Homed Memory Argument List Structure

In certain cases, it is useful to form a contiguous in-memory structure that includes the contents of all the formal parameter values in the program (for example, for C procedures that use varying length argument lists). In nearly all these cases, a compiler can arrange to allocate and initialize this structure so that those parameter values passed in registers are placed adjacent to those parameters passed on the stack, without making a copy of the stack arguments. The storage for the parameters passed in registers is called the argument home area. (See Figure 3-1 and Figure 3-2.) Figure 4-1 shows the resulting in-memory homed argument list structure.


Figure 4-1: In-Memory Homed Argument List Structure


Generally, it is not possible to tell statically whether a particular argument is an integer or floating-point argument. Therefore, it is necessary to store both the integer and the floating-point register argument contents in this structure. However, it is sometimes possible to determine statically that there are no floating-point arguments anywhere, either in registers or on the stack. In this case, the first six entries (the floating-point register contents) can be omitted. To facilitate this special case, the address used to reference this structure is always the address of the first integer argument position.

The C-language type va_list is used to iterate through a variable argument list. The va_list type can be defined as follows:

    typedef struct {
        char    *base;
        int      offset;
    } va_list;

To load the next integer argument, the program reads the quadword at location (base+offset) and adds 8 to offset. To load the next floating-point argument, if offset is less than 6*8 (that is, the argument is one of the first six items and was passed in a register), the program reads the quadword at location (base+offset-6*8). Otherwise, the program reads the quadword at location (base+offset). In both cases, the program adds 8 to offset. For details, see the file /usr/include/stdarg.h.
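
The traversal just described can be sketched in portable C over a simulated homed argument list. This is a minimal sketch, not the contents of stdarg.h: the names homed_va_list, next_int_arg, and next_fp_arg are hypothetical, floating-point items are assumed to be 8-byte doubles, and the sketch assumes offset is tested before it is advanced, so register-passed items are those whose offset is below 6*8.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same shape as the va_list shown above; the name is illustrative. */
typedef struct {
    char *base;    /* address of the first integer argument position */
    int   offset;  /* byte offset of the next argument item */
} homed_va_list;

/* Read the next integer argument item and advance. */
static int64_t next_int_arg(homed_va_list *ap)
{
    int64_t v;

    assert(ap != 0);
    memcpy(&v, ap->base + ap->offset, sizeof v);
    ap->offset += 8;
    return v;
}

/* Read the next floating-point argument item and advance. */
static double next_fp_arg(homed_va_list *ap)
{
    double v;
    const char *p;

    assert(ap != 0);
    p = ap->base + ap->offset;
    if (ap->offset < 6 * 8)
        p -= 6 * 8;   /* homed floating-point registers sit just below base */
    memcpy(&v, p, sizeof v);
    ap->offset += 8;
    return v;
}
```

In a simulation, the home area can be an array of fourteen quadwords: six floating-point register slots, six integer register slots, then two stack items, with base pointing at the seventh quadword.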



4.1.4    Argument Lists and High-Level Languages

High-level language functional notations for procedure call arguments are mapped into argument item sequences according to the following requirements:



4.1.5    Unused Bits in Passed Data

Whenever data is passed by value between two procedures in registers (as is the case for the first six input arguments and return values) or in memory (as is the case for arguments after the first six), the bits not used by the data are usually sign-extended or zero-extended.

Table 4-2 defines the various data type requirements for size and their extension to set or clear unused bits.


Table 4-2: Unused Bits in Passed Data
Data Type                          Type         Data Size   Register         Memory
                                   Designator   (bytes)     Extension Type   Extension Type
Byte logical                       BU           1           Zero64           Zero64
Word logical                       WU           2           Zero64           Zero64
Longword logical                   LU           4           Sign64           Sign64
Quadword logical                   QU           8           Data64           Data64
Byte integer                                    1           Sign64           Sign64
Word integer                                    2           Sign64           Sign64
Longword integer                                4           Sign64           Sign64
Quadword integer                                8           Data64           Data64
F floating                                      4           Hard             Data32
D floating                                      8           Hard             Data64
G floating                                      8           Hard             Data64
F floating complex                 FC           2*4         2*Hard           2*Data32
D floating complex                 DC           2*8         2*Hard           2*Data64
G floating complex                 GC           2*8         2*Hard           2*Data64
IEEE floating single S             FS           4           Hard             Data32
IEEE floating double T             FT           8           Hard             Data64
IEEE floating extended X           FX           16          n/a              n/a
IEEE floating single S complex     FSC          2*4         2*Hard           2*Data32
IEEE floating double T complex     FTC          2*8         2*Hard           2*Data64
IEEE floating extended X complex   FXC          2*16        n/a              n/a
Structures                         N/A                      Nostd            Nostd
Small arrays of 8 bytes or less    N/A          <= 8        Nostd            Nostd
32-bit address                     N/A          4           Sign64           Sign64
64-bit address                     N/A          8           Data64           Data64

The following table contains the definitions for the extension type symbols used in Table 4-2:


Extension Type   Definition
Sign32           Sign-extended to 32 bits. The state of bits <63:32> is unpredictable.
Sign64           Sign-extended to 64 bits.
Zero32           Zero-extended to 32 bits. The state of bits <63:32> is unpredictable.
Zero64           Zero-extended to 64 bits.
Data32           Data is 32 bits. The state of bits <63:32> is unpredictable.
2*Data32         Two single-precision parts of the complex value are stored in memory as
                 independent floating-point values with each handled as Data32.
Data64           Data is 64 bits.
2*Data64         Two double-precision parts of the complex value are stored in memory as
                 independent floating-point values with each handled as Data64.
Hard             Passed in the layout defined by the Alpha Architecture Reference Manual.
2*Hard           Two double-precision parts of the complex value are stored in a pair of
                 registers as independent floating-point values with each handled as Hard.
Nostd            The state of all high-order bits not occupied by the data is unpredictable
                 across a call or return.


Note

Sign64, when applied to a longword logical, duplicates bit 31 through bits <63:32>. This duplication can cause the 64-bit integer value to appear negative. However, careful use of 32-bit arithmetic and 64-bit logical instructions (with no right shifts) will preserve the 32-bit unsigned nature of the argument.


Because of the varied rules for sign extension of data when passed as arguments, calling and called routines must agree on the data type of each argument. No implicit data type conversions can be assumed between the calling procedure and the called procedure.
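
The Sign64 and Zero64 rules, and the note about longword logical values, can be illustrated with a few portable C helpers. This is a sketch of the bit patterns only; the function names are hypothetical, and a real compiler for Alpha performs these extensions in generated code rather than through calls.

```c
#include <assert.h>
#include <stdint.h>

/* Sign64 applied to a 32-bit value: bit 31 is copied into bits <63:32>. */
static uint64_t sign64_from_longword(uint32_t v)
{
    if (v & 0x80000000u)
        return 0xFFFFFFFF00000000ull | v;
    return v;
}

/* Zero64 applied to an 8-bit value: bits <63:8> are cleared. */
static uint64_t zero64_from_byte(uint8_t v)
{
    return (uint64_t)v;
}

/* Recover the unsigned 32-bit value from a Sign64 register image by
 * keeping only bits <31:0>. */
static uint32_t longword_logical_value(uint64_t reg)
{
    return (uint32_t)(reg & 0xFFFFFFFFull);
}
```

A longword logical such as 0x80000000 therefore travels as 0xFFFFFFFF80000000, which looks negative when read as a 64-bit integer, yet the low 32 bits still hold the original unsigned value.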



4.1.6    Sending Data

The following sections define the calling standard requirements for mechanisms to send data and the order of argument evaluation.



4.1.6.1    Sending Mechanism
As described in Section 4.1.1, the allowable argument-passing mechanisms are immediate value, reference, and descriptor. The following list describes the requirements for using these mechanisms.

Note that extended floating-point values are not passed using the immediate value mechanism. Instead, they are passed using the by-reference mechanism. (When by-value semantics are required, however, it might be necessary to make a copy of the actual parameter and pass a reference to that copy to avoid improper alias effects.)

Note also that when a record is passed by immediate value, the component types have no bearing on how the argument is aligned. The record will always be quadword-aligned.



4.1.6.2    Order of Argument Evaluation
Because most high-level languages do not specify the order of evaluation of arguments with respect to side effects, those language processors can evaluate arguments in any convenient order. The choice of argument evaluation order and code generation strategy is constrained only by the definition of the particular language. Programs should not depend on the order of evaluation of arguments.
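
As an illustration in C, a call such as subtract(next_id(), next_id()) may evaluate its two arguments in either order, so its result is not predictable. A minimal sketch of an order-independent alternative follows; the function names are hypothetical and chosen only for this example.

```c
#include <assert.h>

static int counter = 0;

/* Has a side effect: each call returns a new, larger value. */
static int next_id(void)
{
    return ++counter;
}

static int subtract(int a, int b)
{
    return a - b;
}

/* Each side effect gets its own statement, so the result no longer depends
 * on the order in which the compiler evaluates call arguments. */
static int subtract_two_ids(void)
{
    int first = next_id();
    int second = next_id();

    assert(second == first + 1);
    return subtract(first, second);
}
```

Because first is always obtained before second, subtract_two_ids always returns -1, whatever evaluation order the language processor chooses for arguments.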



4.1.7    Returning Data

A standard function must return its function value by one of the following mechanisms:

These mechanisms are the only standard means available for returning function values. They support the important language-independent data types. Functions that return values by any mechanism other than those specified here are nonstandard, language-specific functions.

The following sections describe each of the three standard mechanisms for returning function values.



4.1.7.1    Function Value Return By Immediate Value
The following list describes the two types of function values that are returned by immediate value:



4.1.7.2    Function Value Return By Reference
A function value is returned by reference if and only if the function value satisfies the following criteria:

The actual-argument list and the formal-argument list are shifted to the right by one argument item. The new first argument item is reserved for the address of the function value.

The calling procedure must provide the required contiguous storage and pass the address of the storage as the first argument. This address must specify storage that is naturally aligned according to the data type of the function value.

The called function must write the function value to the storage described by the first argument.
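
The shifted argument list can be pictured at the C source level as follows. This is a sketch under the assumption that the function value is one that meets the criteria for return by reference; the type quad_vector and both functions are hypothetical, and an actual compiler performs this transformation implicitly.

```c
#include <assert.h>

/* A hypothetical aggregate assumed to qualify for return by reference. */
typedef struct {
    long values[4];
} quad_vector;

/* Source-level view:  quad_vector make_vector(long start);
 * Call-level view: the address of the function value becomes the new first
 * argument item, and "start" shifts right by one position. */
static void make_vector_by_ref(quad_vector *result, long start)
{
    int i;

    assert(result != 0);
    for (i = 0; i < 4; i++)
        result->values[i] = start + i;   /* write the value to caller storage */
}

/* The caller provides contiguous, naturally aligned storage and passes its
 * address as the first argument. */
static long sum_of_vector_from(long start)
{
    quad_vector storage;
    long sum = 0;
    int i;

    make_vector_by_ref(&storage, start);
    for (i = 0; i < 4; i++)
        sum += storage.values[i];
    return sum;
}
```

Declaring storage as an ordinary local variable of the result type is one simple way for the caller to obtain storage with the natural alignment of that type.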



4.1.7.3    Function Value Return By Descriptor
A function value is returned by descriptor if and only if the function value satisfies all of the following criteria:

Function results returned by descriptor are not permitted in a standard call.

Typically, the called routine creates the return object on its stack and leaves it there on return. This process is referred to as the stack return mechanism. The exit code of the called routine does not restore SP to its value before the call because, if it did, the return value would be left unprotected in memory below SP. The calling routine must be prepared for SP to have a different value after the call than the pointer had before the call.



4.2    Data Allocation

Data allocation refers to the method of storing data in memory. The following sections cover these topics: data alignment, granularity of memory, and record layout conventions.



4.2.1    Data Alignment

In the Alpha environment, memory references to data that is not naturally aligned can result in alignment faults. Such alignment faults can severely degrade the performance of all procedures that reference the unnaturally aligned data.

To avoid such performance degradation, all data values for programs running on Digital UNIX systems should be naturally aligned. Table 4-4 contains information on data alignment.


Table 4-4: Data Alignment Addresses
Data Type                          Alignment Starting Position
8-bit character string             Byte boundary
16-bit integer                     Address that is a multiple of 2 (word alignment)
32-bit integer                     Address that is a multiple of 4 (longword alignment)
64-bit integer                     Address that is a multiple of 8 (quadword alignment)
Single-precision real value        Address that is a multiple of 4 (longword alignment)
Double-precision real value        Address that is a multiple of 8 (quadword alignment)
Extended-precision real value      Address that is a multiple of 16 (octaword alignment)
Single-precision complex value     Address that is a multiple of 4 (longword alignment)
Double-precision complex value     Address that is a multiple of 8 (quadword alignment)
Extended-precision complex value   Address that is a multiple of 16 (octaword alignment)
Data types larger than 64 bits     Quadword or greater alignment. (Alignments larger than
                                   quadword are language-specific or application defined.)

For aggregates such as strings, arrays, and records, the data type to be considered for purposes of alignment is not the aggregate itself, but the elements that make up the aggregate. The alignment requirement of an aggregate is that all elements of the aggregate be naturally aligned. Varying 8-bit character strings, for example, must start at addresses that are a multiple of at least 2 (word alignment) because of the 16-bit count at the beginning of the string; 32-bit integer arrays start at a longword boundary, regardless of the extent of the array.

Note that the rules in Section 4.1.6.1 for passing by value an argument that is a record always provide quadword alignment of the record value independent of the normal alignment requirement of the record. If deemed appropriate by the implementation, normal alignment can be established within the called procedure by making a copy of the record argument at a suitably aligned location.
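
A natural-alignment check in the sense of Table 4-4 can be sketched in C as follows. The helper is_naturally_aligned and the varying_string layout are hypothetical illustrations; the 30-character capacity is an arbitrary choice made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if addr is a multiple of size. The size must be a power of two,
 * as are all the scalar sizes in Table 4-4 (1, 2, 4, 8, and 16). */
static int is_naturally_aligned(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}

/* Hypothetical varying 8-bit character string: the 16-bit count at the
 * start is what forces at least word alignment on the whole aggregate. */
typedef struct {
    uint16_t count;
    char     text[30];
} varying_string;
```

For instance, address 12 is suitably aligned for a 32-bit integer but not for a 64-bit integer, and the character data of a varying_string begins immediately after the 2-byte count.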



4.2.2    Granularity of Memory

Granularity of memory refers to the smallest unit in which memory can be accessed. In the Alpha architecture, although memory is byte-addressed, the granularity is a longword. Even for longword-sized data, it is often expedient for execution efficiency to access memory in quadword units. In the presence of multiple threads of execution (whether on multiple processors or a single processor), allocation of more than one data element within a single quadword can lead to more complicated access sequences (for example, using ldx_l/stx_c) or to latent, hard-to-diagnose errors caused by nonobvious and implicit data sharing. Therefore, it is generally recommended that independent variables (that is, variables not combined in a larger aggregate) be allocated on quadword boundaries.
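
One way to follow this recommendation in C11 is to wrap each independent variable in a quadword-aligned type, so that no two of them share a quadword. This is a sketch that assumes a C11 compiler with <stdalign.h>; the type quadword_int32 and the variable names are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* A 32-bit variable that occupies a whole quadword: the alignment request
 * also rounds the size of the struct up to 8 bytes. */
typedef struct {
    alignas(8) int32_t value;
} quadword_int32;

/* Two independent variables that might be updated by different threads. */
static quadword_int32 requests_served;
static quadword_int32 errors_seen;

/* Returns 1 if p lies on a quadword (8-byte) boundary. */
static int starts_on_quadword(const void *p)
{
    assert(p != 0);
    return ((uintptr_t)p & 7u) == 0;
}
```

The cost is four bytes of padding per variable, in exchange for each variable being accessible in quadword units without touching its neighbor.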



4.2.3    Record Layout Conventions

The Digital UNIX calling standard record layout conventions are designed to provide good run-time performance on all implementations of the Alpha architecture. Only the standard record layouts may be used across standard interfaces or between languages. Languages can support other language-specific record layout conventions, but such other record layouts are nonstandard.

The aligned record layout conventions ensure the following:

The aligned record layout is defined by the following conventions: