===============================
TableGen Programmer's Reference
===============================
.. sectnum::
.. contents::
   :local:
Introduction
============
The purpose of TableGen is to generate complex output files based on
information from source files that are significantly easier to code than the
output files would be, and also easier to maintain and modify over time. The
information is coded in a declarative style involving classes and records,
which are then processed by TableGen. The internalized records are passed on
to various backends, which extract information from a subset of the records
and generate one or more output files. These output files are typically
``.inc`` files for C++, but may be any type of file that the backend
developer needs.
This document describes the LLVM TableGen facility in detail. It is intended
for the programmer who is using TableGen to produce code for a project. If
you are looking for a simple overview, check out the :doc:`TableGen Overview
<./index>`. The various ``xxx-tblgen`` commands used to invoke TableGen are
described in :doc:`xxx-tblgen: Target Description to C++ Code
<../CommandGuide/tblgen>`.
An example of a backend is ``RegisterInfo``, which generates the register
file information for a particular target machine, for use by the LLVM
target-independent code generator. See :doc:`TableGen Backends <./BackEnds>`
for a description of the LLVM TableGen backends, and :doc:`TableGen
Backend Developer's Guide <./BackGuide>` for a guide to writing a new
backend.
Here are a few of the things backends can do.
* Generate the register file information for a particular target machine.
* Generate the instruction definitions for a target.
* Generate the patterns that the code generator uses to match instructions
to intermediate representation (IR) nodes.
* Generate semantic attribute identifiers for Clang.
* Generate abstract syntax tree (AST) declaration node definitions for Clang.
* Generate AST statement node definitions for Clang.
Concepts
--------
TableGen source files contain two primary items: *abstract records* and
*concrete records*. In this and other TableGen documents, abstract records
are called *classes.* (These classes are different from C++ classes and do
not map onto them.) In addition, concrete records are usually just called
records, although sometimes the term *record* refers to both classes and
concrete records. The distinction should be clear in context.
Each class and concrete record has a unique *name*, either chosen by
the programmer or generated by TableGen. Associated with that name
is a list of *fields* with values and an optional list of *superclasses*
(sometimes called base or parent classes). The fields are the primary data that
backends will process. Note that TableGen assigns no meanings to fields; the
meanings are entirely up to the backends and the programs that incorporate
the output of those backends.
A backend processes some subset of the concrete records built by the
TableGen parser and emits the output files. These files are usually C++
``.inc`` files that are included by the programs that require the data in
those records. However, a backend can produce any type of output files. For
example, it could produce a data file containing messages tagged with
identifiers and substitution parameters. In a complex use case such as the
LLVM code generator, there can be many concrete records and some of them can
have an unexpectedly large number of fields, resulting in large output files.
In order to reduce the complexity of TableGen files, classes are used to
abstract out groups of record fields. For example, a few classes may
abstract the concept of a machine register file, while other classes may
abstract the instruction formats, and still others may abstract the
individual instructions. TableGen allows an arbitrary hierarchy of classes,
so that the abstract classes for two concepts can share a third superclass that
abstracts common "sub-concepts" from the two original concepts.
In order to make classes more useful, a concrete record (or another class)
can request a class as a superclass and pass *template arguments* to it.
These template arguments can be used in the fields of the superclass to
initialize them in a custom manner. That is, record or class ``A`` can
request superclass ``S`` with one set of template arguments, while record or class
``B`` can request ``S`` with a different set of arguments. Without template
arguments, many more classes would be required, one for each combination of
the template arguments.
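As a minimal sketch of this idea, the following fragment lets two records
request the same superclass with different template arguments. The names
``S``, ``A``, ``B``, ``width``, and ``isSigned`` are hypothetical and chosen
only for illustration.

.. code-block:: text

  // One parameterized class serves both records.
  class S <int width, bit isSigned> {
    int Width = width;
    bit Signed = isSigned;
  }

  def A : S<32, 1>;   // Width = 32, Signed = 1
  def B : S<64, 0>;   // Width = 64, Signed = 0
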
Both classes and concrete records can include fields that are uninitialized.
The uninitialized "value" is represented by a question mark (``?``). Classes
often have uninitialized fields that are expected to be filled in when those
classes are inherited by concrete records. Even so, some fields of concrete
records may remain uninitialized.
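The following sketch shows both ways a field can end up uninitialized, and
how a concrete record may fill in only some of the fields. The class
``Insn`` and its fields are hypothetical.

.. code-block:: text

  class Insn {
    string Mnemonic = ?;   // explicitly uninitialized
    bits<8> Opcode;        // no initializer, so also uninitialized
  }

  def NOP : Insn {
    let Mnemonic = "nop";  // filled in by the concrete record
    // Opcode is left uninitialized.
  }
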
TableGen provides *multiclasses* to collect a group of record definitions in
one place. A multiclass is a sort of macro that can be "invoked" to define
multiple concrete records all at once. A multiclass can inherit from other
multiclasses, which means that the multiclass inherits all the definitions
from its parent multiclasses.
`Appendix C: Sample Record`_ illustrates a complex record in the Intel X86
target and the simple way in which it is defined.
Source Files
============
TableGen source files are plain ASCII text files. The files can contain
statements, comments, and blank lines (see `Lexical Analysis`_). The standard file
extension for TableGen files is ``.td``.
TableGen files can grow quite large, so there is an include mechanism that
allows one file to include the content of another file (see `Include
Files`_). This allows large files to be broken up into smaller ones, and
also provides a simple library mechanism where multiple source files can
include the same library file.
TableGen supports a simple preprocessor that can be used to conditionalize
portions of ``.td`` files. See `Preprocessing Facilities`_ for more
information.
Lexical Analysis
================
The lexical and syntax notation used here is intended to imitate
`Python's`_ notation. In particular, for lexical definitions, the productions
operate at the character level and there is no implied whitespace between
elements. The syntax definitions operate at the token level, so there is
implied whitespace between tokens.
.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation
TableGen supports BCPL-style comments (``// ...``) and nestable C-style
comments (``/* ... */``).
TableGen also provides simple `Preprocessing Facilities`_.
Formfeed characters may be used freely in files to produce page breaks when
the file is printed for review.
The following are the basic punctuation tokens::
- + [ ] { } ( ) < > : ; . ... = ? #
Literals
--------
Numeric literals take one of the following forms:
.. productionlist::
TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
DecimalInteger: ["+" | "-"] ("0"..."9")+
HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
BinInteger: "0b" ("0" | "1")+
Observe that the :token:`DecimalInteger` token includes the optional ``+``
or ``-`` sign, unlike most languages where the sign would be treated as a
unary operator.
TableGen has two kinds of string literals:
.. productionlist::
TokString: '"' (non-'"' characters and escapes) '"'
TokCode: "[{" (shortest text not containing "}]") "}]"
A :token:`TokCode` is nothing more than a multi-line string literal
delimited by ``[{`` and ``}]``. It can break across lines and the
line breaks are retained in the string.
The current implementation accepts the following escape sequences::
\\ \' \" \t \n
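The literal forms described in this section might be used as follows. This
is only a sketch; the record and field names are hypothetical.

.. code-block:: text

  def LiteralDemo {
    int Dec = 42;
    int Neg = -42;             // the sign is part of the token
    int Hex = 0x2A;
    bits<6> Bin = 0b101010;
    string Plain = "a \"quoted\" word\n";
    string Code = [{
      return x + 1;
    }];
  }
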
Identifiers
-----------
TableGen has name- and identifier-like tokens, which are case-sensitive.
.. productionlist::
ualpha: "a"..."z" | "A"..."Z" | "_"
TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")*
Note that, unlike most languages, TableGen allows :token:`TokIdentifier` to
begin with an integer. In case of ambiguity, a token is interpreted as a
numeric literal rather than an identifier.
TableGen has the following reserved keywords, which cannot be used as
identifiers::
assert bit bits class code
dag def else false foreach
defm defset defvar field if
in include int let list
multiclass string then true
.. warning::
The ``field`` reserved word is deprecated.
Bang operators
--------------
TableGen provides "bang operators" that have a wide variety of uses:
.. productionlist::
BangOperator: one of
: !add !and !cast !con !dag
: !empty !eq !foldl !foreach !filter
: !ge !getdagop !gt !head !if
: !interleave !isa !le !listconcat !listsplat
: !lt !mul !ne !not !or
: !setdagop !shl !size !sra !srl
: !strconcat !sub !subst !substr !tail
: !xor
The ``!cond`` operator has a slightly different
syntax compared to other bang operators, so it is defined separately:
.. productionlist::
CondOperator: !cond
See `Appendix A: Bang Operators`_ for a description of each bang operator.
Include files
-------------
TableGen has an include mechanism. The content of the included file
lexically replaces the ``include`` directive and is then parsed as if it was
originally in the main file.
.. productionlist::
IncludeDirective: "include" `TokString`
Portions of the main file and included files can be conditionalized using
preprocessor directives.
.. productionlist::
PreprocessorDirective: "#define" | "#ifdef" | "#ifndef"
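A minimal sketch of conditionalized source follows. The macro name
``ENABLE_EXTRAS`` and the record names are hypothetical. The ``#endif``
directive that closes the region is not listed in the production above; it
is described in `Preprocessing Facilities`_.

.. code-block:: text

  #define ENABLE_EXTRAS

  def Always;

  #ifdef ENABLE_EXTRAS
  def Extra;     // parsed only because ENABLE_EXTRAS is defined
  #endif
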
Types
=====
The TableGen language is statically typed, using a simple but complete type
system. Types are used to check for errors, to perform implicit conversions,
and to help interface designers constrain the allowed input. Every value is
required to have an associated type.
TableGen supports a mixture of low-level types (e.g., ``bit``) and
high-level types (e.g., ``dag``). This flexibility allows you to describe a
wide range of records conveniently and compactly.
.. productionlist::
Type: "bit" | "int" | "string" | "dag"
:| "bits" "<" `TokInteger` ">"
:| "list" "<" `Type` ">"
:| `ClassID`
ClassID: `TokIdentifier`
``bit``
A ``bit`` is a boolean value that can be 0 or 1.
``int``
The ``int`` type represents a simple 64-bit integer value, such as 5 or
-42.
``string``
The ``string`` type represents an ordered sequence of characters of arbitrary
length.
``bits<``\ *n*\ ``>``
The ``bits`` type is a fixed-size integer of arbitrary length *n* that
is treated as separate bits. These bits can be accessed individually.
A field of this type is useful for representing an instruction operation
code, register number, or address mode/register/displacement. The bits of
the field can be set individually or as subfields. For example, in an
instruction address, the addressing mode, base register number, and
displacement can be set separately.
``list<``\ *type*\ ``>``
This type represents a list whose elements are of the *type* specified in
angle brackets. The element type is arbitrary; it can even be another
list type. List elements are indexed from 0.
``dag``
This type represents a nestable directed acyclic graph (DAG) of nodes.
Each node has an *operator* and zero or more *arguments* (or *operands*).
An argument can be
another ``dag`` object, allowing an arbitrary tree of nodes and edges.
As an example, DAGs are used to represent code patterns for use by
the code generator instruction selection algorithms. See `Directed
acyclic graphs (DAGs)`_ for more details.
:token:`ClassID`
Specifying a class name in a type context indicates
that the type of the defined value must
be a subclass of the specified class. This is useful in conjunction with
the ``list`` type; for example, to constrain the elements of the list to a
common base class (e.g., a ``list<Register>`` can only contain definitions
derived from the ``Register`` class).
The :token:`ClassID` must name a class that has been previously
declared or defined.
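To illustrate, the sketch below declares one field of each type. All of the
names (``Register``, ``R0``, ``R1``, ``ops``, ``TypeDemo``, and the field
names) are hypothetical.

.. code-block:: text

  class Register <string n> {
    string Name = n;
  }
  def R0 : Register<"r0">;
  def R1 : Register<"r1">;
  def ops;                               // a record to use as a DAG operator

  def TypeDemo {
    bit Flag = 1;
    int Count = -42;
    string Label = "demo";
    bits<4> Nibble = 0b1010;
    list<int> Sizes = [8, 16, 32];
    list<list<int>> Nested = [[1, 2], [3]];
    list<Register> Regs = [R0, R1];      // elements must derive from Register
    dag Pattern = (ops R0, R1);
    Register Base = R0;                  // a class name used as a type
  }
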
Values and Expressions
======================
There are many contexts in TableGen statements where a value is required. A
common example is in the definition of a record, where each field is
specified by a name and an optional value. TableGen allows for a reasonable
number of different forms when building up value expressions. These forms
allow the TableGen file to be written in a syntax that is natural for the
application.
Note that all of the values have rules for converting them from one type to
another. For example, these rules allow you to assign a value like ``7``
to an entity of type ``bits<4>``.
.. productionlist::
Value: `SimpleValue` `ValueSuffix`*
:| `Value` "#" `Value`
ValueSuffix: "{" `RangeList` "}"
:| "[" `RangeList` "]"
:| "." `TokIdentifier`
RangeList: `RangePiece` ("," `RangePiece`)*
RangePiece: `TokInteger`
:| `TokInteger` "..." `TokInteger`
:| `TokInteger` "-" `TokInteger`
:| `TokInteger` `TokInteger`
.. warning::
The peculiar last form of :token:`RangePiece` is due to the fact that the
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
two consecutive tokens, with values ``1`` and ``-5``, instead of "1", "-",
and "5". The use of hyphen as the range punctuation is deprecated.
Simple values
-------------
The :token:`SimpleValue` has a number of forms.
.. productionlist::
SimpleValue: `TokInteger` | `TokString`+ | `TokCode`
A value can be an integer literal, a string literal, or a code literal.
Multiple adjacent string literals are concatenated as in C/C++; the simple
value is the concatenation of the strings. Code literals become strings and
are then indistinguishable from them.
.. productionlist::
SimpleValue2: "true" | "false"
The ``true`` and ``false`` literals are essentially syntactic sugar for the
integer values 1 and 0. They improve the readability of TableGen files when
boolean values are used in field initializations, bit sequences, ``if``
statements, etc. When parsed, these literals are converted to integers.
.. note::
Although ``true`` and ``false`` are literal names for 1 and 0, we
recommend as a stylistic rule that you use them for boolean
values only.
.. productionlist::
SimpleValue3: "?"
A question mark represents an uninitialized value.
.. productionlist::
SimpleValue4: "{" [`ValueList`] "}"
ValueList: `ValueListNE`
ValueListNE: `Value` ("," `Value`)*
This value represents a sequence of bits, which can be used to initialize a
``bits<``\ *n*\ ``>`` field (note the braces). When doing so, the values
must represent a total of *n* bits.
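The sketch below contrasts an integer that is converted to ``bits<4>`` with
a brace initializer that supplies the bits explicitly. The record and field
names are hypothetical.

.. code-block:: text

  def BitsDemo {
    bits<4> FromInt = 7;                   // stored as { 0, 1, 1, 1 }
    bits<4> FromBraces = { 0, 1, 1, 1 };   // four one-bit values
    bits<2> Hi = 0b01;
    bits<2> Lo = 0b11;
    bits<4> Joined = { Hi, Lo };           // 2 + 2 = 4 bits in total
  }
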
.. productionlist::
SimpleValue5: "[" `ValueList` "]" ["<" `Type` ">"]
This value is a list initializer (note the brackets). The values in brackets
are the elements of the list. The optional :token:`Type` can be used to
indicate a specific element type; otherwise the element type is inferred
from the given values. TableGen can usually infer the type, although
sometimes not when the value is the empty list (``[]``).
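For example, assuming hypothetical field names, list initializers with and
without an explicit element type might look like this:

.. code-block:: text

  def ListDemo {
    list<int> Primes = [2, 3, 5, 7];      // element type inferred as int
    list<string> Names = ["r0", "r1"];    // element type inferred as string
    list<int> Empty = []<int>;            // element type stated explicitly
  }
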
.. productionlist::
SimpleValue6: "(" `DagArg` [`DagArgList`] ")"
DagArgList: `DagArg` ("," `DagArg`)*
DagArg: `Value` [":" `TokVarName`] | `TokVarName`
This represents a DAG initializer (note the parentheses). The first
:token:`DagArg` is called the "operator" of the DAG and must be a record.
See `Directed acyclic graphs (DAGs)`_ for more details.
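A short sketch of DAG initializers follows. The records ``add`` and ``GR32``
are hypothetical stand-ins defined here only so that the fragment is
self-contained.

.. code-block:: text

  def add;     // used as the DAG operator
  def GR32;    // used as an argument

  def DagDemo {
    dag Simple = (add 1, 2);
    dag Named  = (add GR32:$src1, GR32:$src2);   // arguments tagged with names
    dag Nested = (add (add 1, 2), 3);            // an argument can be a DAG
  }
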
.. productionlist::
SimpleValue7: `TokIdentifier`
The resulting value is the value of the entity named by the identifier. The
possible identifiers are described here, but the descriptions will make more
sense after reading the remainder of this guide.
.. The code for this is exceptionally abstruse. These examples are a
best-effort attempt.
* A template argument of a ``class``, such as the use of ``Bar`` in::
class Foo <int Bar> {
int Baz = Bar;
}
* The implicit template argument ``NAME`` in a ``class`` or ``multiclass``
definition (see `NAME`_).
* A field local to a ``class``, such as the use of ``Bar`` in::
class Foo {
int Bar = 5;
int Baz = Bar;
}
* The name of a record definition, such as the use of ``Bar`` in the
definition of ``Foo``::
def Bar : SomeClass {
int X = 5;
}
def Foo {
SomeClass Baz = Bar;
}
* A field local to a record definition, such as the use of ``Bar`` in::
def Foo {
int Bar = 5;
int Baz = Bar;
}
Fields inherited from the record's parent classes can be accessed the same way.
* A template argument of a ``multiclass``, such as the use of ``Bar`` in::
multiclass Foo <int Bar> {
def : SomeClass<Bar>;
}
* A variable defined with the ``defvar`` or ``defset`` statements.
* The iteration variable of a ``foreach``, such as the use of ``i`` in::
foreach i = 0...5 in
def Foo#i;
.. productionlist::
SimpleValue8: `ClassID` "<" `ValueListNE` ">"
This form creates a new anonymous record definition (as would be created by an
unnamed ``def`` inheriting from the given class with the given template
arguments; see `def`_) and the value is that record. A field of the record can be
obtained using a suffix; see `Suffixed Values`_.
Invoking a class in this manner can provide a simple subroutine facility.
See `Using Classes as Subroutines`_ for more information.
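As a minimal sketch, the hypothetical class ``Square`` below is invoked
with a template argument, and the ``Result`` field of the resulting
anonymous record is extracted with a field suffix.

.. code-block:: text

  class Square <int n> {
    int Result = !mul(n, n);
  }

  def UseSquare {
    int Area = Square<9>.Result;   // 81
  }
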
.. productionlist::
SimpleValue9: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"
:| `CondOperator` "(" `CondClause` ("," `CondClause`)* ")"
CondClause: `Value` ":" `Value`
The bang operators provide functions that are not available with the other
simple values. Except in the case of ``!cond``, a bang
operator takes a list of arguments enclosed in parentheses and performs some
function on those arguments, producing a value for that
bang operator. The ``!cond`` operator takes a list of pairs of arguments
separated by colons. See `Appendix A: Bang Operators`_ for a description of
each bang operator.
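The following sketch uses a few of the bang operators, including the
colon-separated clauses of ``!cond``. The record and field names are
hypothetical.

.. code-block:: text

  def BangDemo {
    int Sum = !add(1, 2, 3);                                  // 6
    string Joined = !strconcat("foo", "bar");                 // "foobar"
    list<int> Both = !listconcat([1, 2], [3]);                // [1, 2, 3]
    int Max = !if(!gt(3, 7), 3, 7);                           // 7
    string Size = !cond(!lt(4, 8): "small", true: "large");   // "small"
  }
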
Suffixed values
---------------
The :token:`SimpleValue` values described above can be specified with
certain suffixes. The purpose of a suffix is to obtain a subvalue of the
primary value. Here are the possible suffixes for some primary *value*.
*value*\ ``{17}``
The final value is bit 17 of the integer *value* (note the braces).
*value*\ ``{8...15}``
The final value is bits 8--15 of the integer *value*. The order of the
bits can be reversed by specifying ``{15...8}``.
*value*\ ``[4...7,17,2...3,4]``
The final value is a new list that is a slice of the list *value* (note
the brackets). The
new list contains elements 4, 5, 6, 7, 17, 2, 3, and 4. Elements may be
included multiple times and in any order.
*value*\ ``.`` *field*
The final value is the value of the specified *field* in the specified
record *value*.
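The suffix forms can be sketched as follows. The class ``Point``, the
record ``Origin``, and the field names are hypothetical.

.. code-block:: text

  class Point <int x> {
    int X = x;
  }
  def Origin : Point<0>;

  def SuffixDemo {
    bits<8> Byte = 0b10110100;
    bit Top = Byte{7};                  // a single bit
    bits<4> High = Byte{7...4};         // a four-bit subfield
    list<int> Items = [10, 20, 30, 40];
    list<int> Middle = Items[1...2];    // elements 1 and 2: [20, 30]
    int OriginX = Origin.X;             // a field of another record
  }
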
The paste operator
------------------
The paste operator (``#``) is the only infix operator available in TableGen
expressions. It allows you to concatenate strings or lists, but has a few
unusual features.
The paste operator can be used when specifying the record name in a
:token:`Def` or :token:`Defm` statement, in which case it must construct a
string. If an operand is an undefined name (:token:`TokIdentifier`) or the
name of a global :token:`Defvar` or :token:`Defset`, it is treated as a
verbatim string of characters. The value of a global name is not used.
The paste operator can be used in all other value expressions, in which case
it can construct a string or a list. Rather oddly, but consistent with the
previous case, if the *right-hand-side* operand is an undefined name or a
global name, it is treated as a verbatim string of characters. The
left-hand-side operand is treated normally.
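A minimal sketch of the common cases follows, with hypothetical names
throughout. In the ``foreach``, ``Reg`` is an undefined name and is
therefore taken as a verbatim string when the record name is built.

.. code-block:: text

  def PasteDemo {
    string Joined = "abc" # "def";    // "abcdef"
    list<int> Both = [1, 2] # [3];    // [1, 2, 3]
  }

  foreach i = 0...2 in
    def Reg#i;                        // defines Reg0, Reg1, and Reg2
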
`Appendix B: Paste Operator Examples`_ presents examples of the behavior of
the paste operator.
Statements
==========
The following statements may appear at the top level of TableGen source
files.
.. productionlist::
TableGenFile: `Statement`*
Statement: `Assert` | `Class` | `Def` | `Defm` | `Defset` | `Defvar`
:| `Foreach` | `If` | `Let` | `MultiClass`
The following sections describe each of these top-level statements.
``class`` --- define an abstract record class
---------------------------------------------
A ``class`` statement defines an abstract record class from which other
classes and records can inherit.
.. productionlist::
Class: "class" `ClassID` [`TemplateArgList`] `RecordBody`
TemplateArgList: "<" `TemplateArgDecl` ("," `TemplateArgDecl`)* ">"
TemplateArgDecl: `Type` `TokIdentifier` ["=" `Value`]
A class can be parameterized by a list of "template arguments," whose values
can be used in the class's record body. These template arguments are
specified each time the class is inherited by another class or record.
If a template argument is not assigned a default value with ``=``, it is
uninitialized (has the "value" ``?``) and must be specified in the template
argument list when the class is inherited. If an argument is assigned a
default value, then it need not be specified in the argument list. The
template argument default values are evaluated from left to right.
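For instance, in the sketch below the hypothetical class ``Op`` gives its
second template argument a default, so inheriting records may omit it.

.. code-block:: text

  class Op <string name, int latency = 1> {
    string Name = name;
    int Latency = latency;
  }

  def ADD : Op<"add">;        // latency defaults to 1
  def DIV : Op<"div", 20>;    // latency given explicitly
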
The :token:`RecordBody` is defined below. It can include a list of
superclasses from which the current class inherits, along with field definitions
and other statements. When a class ``C`` inherits from another class ``D``,
the fields of ``D`` are effectively merged into the fields of ``C``.
A given class can only be defined once. A ``class`` statement is
considered to define the class if *any* of the following are true (the
:token:`RecordBody` elements are described below).
* The :token:`TemplateArgList` is present, or
* The :token:`ParentClassList` in the :token:`RecordBody` is present, or
* The :token:`Body` in the :token:`RecordBody` is present and not empty.
You can declare an empty class by specifying an empty :token:`TemplateArgList`
and an empty :token:`RecordBody`. This can serve as a restricted form of
forward declaration. Note that records derived from a forward-declared
class will inherit no fields from it, because those records are built when
their declarations are parsed, and thus before the class is finally defined.
.. _NAME:
Every class has an implicit template argument named ``NAME`` (uppercase),
which is bound to the name of the :token:`Def` or :token:`Defm` inheriting
the class. The value of ``NAME`` is undefined if the class is inherited by
an anonymous record.
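A small sketch of ``NAME`` in use, with the hypothetical class ``Labeled``:

.. code-block:: text

  class Labeled {
    string Label = NAME;   // bound to the name of the inheriting def
  }

  def Alpha : Labeled;     // Label is "Alpha"
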
See `Examples: classes and records`_ for examples.
Record Bodies
`````````````
Record bodies appear in both class and record definitions. A record body can
include a parent class list, which specifies the classes from which the
current class or record inherits fields. Such classes are called the
superclasses or parent classes of the class or record. The record body also
includes the main body of the definition, which contains the specification
of the fields of the class or record.
.. productionlist::
RecordBody: `ParentClassList` `Body`
ParentClassList: [":" `ParentClassListNE`]
ParentClassListNE: `ClassRef` ("," `ClassRef`)*
ClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
A :token:`ParentClassList` containing a :token:`MultiClassID` is valid only
in the class list of a ``defm`` statement. In that case, the ID must be the
name of a multiclass.
.. productionlist::
Body: ";" | "{" `BodyItem`* "}"
BodyItem: (`Type` | "code") `TokIdentifier` ["=" `Value`] ";"
:| "let" `TokIdentifier` ["{" `RangeList` "}"] "=" `Value` ";"
:| "defvar" `TokIdentifier` "=" `Value` ";"
:| `Assert`
A field definition in the body specifies a field to be included in the class
or record. If no initial value is specified, then the field's value is
uninitialized. The type must be specified; TableGen will not infer it from
the value. The keyword ``code`` may be used to emphasize that the field
has a string value that is code.
The ``let`` form is used to reset a field to a new value. This can be done
for fields defined directly in the body or fields inherited from
superclasses. A :token:`RangeList` can be specified to reset certain bits
in a ``bits<n>`` field.
The ``defvar`` form defines a variable whose value can be used in other
value expressions within the body. The variable is not a field: it does not
become a field of the class or record being defined. Variables are provided
to hold temporary values while processing the body. See `Defvar in a Record
Body`_ for more details.
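The ``let`` and ``defvar`` body items might be combined as in the following
sketch. The class ``Inst``, the record ``ADDri``, and the variable
``opcode`` are hypothetical.

.. code-block:: text

  class Inst {
    bits<8> Encoding = 0;
    string Asm = "";
  }

  def ADDri : Inst {
    defvar opcode = 10;              // a local variable, not a field
    let Encoding{7...4} = opcode;    // reset only the upper four bits
    let Asm = "add";                 // reset a whole inherited field
  }

The resulting record has the fields ``Encoding`` and ``Asm`` only;
``opcode`` does not appear in it.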
When class ``C2`` inherits from class ``C1``, it acquires all the field
definitions of ``C1``. As those definitions are merged into class ``C2``, any
template arguments passed to ``C1`` by ``C2`` are substituted into the
definitions. In other words, the abstract record fields defined by ``C1`` are
expanded with the template arguments before being merged into ``C2``.
.. _def:
``def`` --- define a concrete record
------------------------------------
A ``def`` statement defines a new concrete record.
.. productionlist::
Def: "def" [`NameValue`] `RecordBody`
NameValue: `Value` (parsed in a special manner)
The name value is optional. If specified, it is parsed in a special mode
where undefined (unrecognized) identifiers are interpreted as literal
strings. In particular, global identifiers are considered unrecognized.
These include global variables defined by ``defvar`` and ``defset``.
If no name value is given, the record is *anonymous*. The final name of an
anonymous record is unspecified but globally unique.
Special handling occurs if a ``def`` appears inside a ``multiclass``
statement. See the ``multiclass`` section below for details.
A record can inherit from one or more classes by specifying the
:token:`ParentClassList` clause at the beginning of its record body. All of
the fields in the parent classes are added to the record. If two or more
parent classes provide the same field, the record ends up with the field value
of the last parent class.
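For example, with the hypothetical classes ``P1`` and ``P2`` both providing
a field ``V``:

.. code-block:: text

  class P1 { int V = 1; }
  class P2 { int V = 2; }

  def R : P1, P2;   // V = 2, taken from the last parent class
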
As a special case, the name of a record can be passed in a template argument
to that record's superclasses. For example:
.. code-block:: text
class A <dag d> {
dag the_dag = d;
}
def rec1 : A<(ops rec1)>;
The DAG ``(ops rec1)`` is passed as a template argument to class ``A``. Notice
that the DAG includes ``rec1``, the record being defined.
The steps taken to create a new record are somewhat complex. See `How
records are built`_.
See `Examples: classes and records`_ for examples.
Examples: classes and records
-----------------------------
Here is a simple TableGen file with one class and two record definitions.
.. code-block:: text
class C {
bit V = 1;
}
def X : C;
def Y : C {
let V = 0;
string Greeting = "Hello!";
}
First, the abstract class ``C`` is defined. It has one field named ``V``
that is a bit initialized to 1.
Next, two records are defined, derived from class ``C``; that is, with ``C``
as their superclass. Thus they both inherit the ``V`` field. Record ``Y``
also defines another string field, ``Greeting``, which is initialized to
``"Hello!"``. In addition, ``Y`` overrides the inherited ``V`` field,
setting it to 0.
A class is useful for isolating the common features of multiple records in
one place. A class can initialize common fields to default values, but
records inheriting from that class can override the defaults.
TableGen supports the definition of parameterized classes as well as
nonparameterized ones. Parameterized classes specify a list of variable
declarations, which may optionally have defaults, that are bound when the
class is specified as a superclass of another class or record.
.. code-block:: text
class FPFormat <bits<3> val> {
bits<3> Value = val;
}
def NotFP : FPFormat<0>;
def ZeroArgFP : FPFormat<1>;
def OneArgFP : FPFormat<2>;
def OneArgFPRW : FPFormat<3>;
def TwoArgFP : FPFormat<4>;
def CompareFP : FPFormat<5>;
def CondMovFP : FPFormat<6>;
def SpecialFP : FPFormat<7>;
The purpose of the ``FPFormat`` class is to act as a sort of enumerated
type. It provides a single field, ``Value``, which holds a 3-bit number. Its
template argument, ``val``, is used to set the ``Value`` field.
Each of the eight records is defined with ``FPFormat`` as its superclass. The
enumeration value is passed in angle brackets as the template argument. Each
record will inherit the ``Value`` field with the appropriate enumeration
value.
Here is a more complex example of classes with template arguments. First, we
define a class similar to the ``FPFormat`` class above. It takes a template
argument and uses it to initialize a field named ``Value``. Then we define
four records that inherit the ``Value`` field with its four different
integer values.
.. code-block:: text
class ModRefVal <bits<2> val> {
bits<2> Value = val;
}
def None : ModRefVal<0>;
def Mod : ModRefVal<1>;
def Ref : ModRefVal<2>;
def ModRef : ModRefVal<3>;
This is somewhat contrived, but let's say we would like to examine the two
bits of the ``Value`` field independently. We can define a class that
accepts a ``ModRefVal`` record as a template argument and splits up its
value into two fields, one bit each. Then we can define records that inherit from
``ModRefBits`` and so acquire two fields from it, one for each bit in the
``ModRefVal`` record passed as the template argument.
.. code-block:: text
class ModRefBits <ModRefVal mrv> {
// Break the value up into its bits, which can provide a nice
// interface to the ModRefVal values.
bit isMod = mrv.Value{0};
bit isRef = mrv.Value{1};
}
// Example uses.
def foo : ModRefBits<Mod>;
def bar : ModRefBits<Ref>;
def snork : ModRefBits<ModRef>;
This illustrates how one class can be defined to reorganize the
fields in another class, thus hiding the internal representation of that
other class.
Running ``llvm-tblgen`` on the example prints the following definitions:
.. code-block:: text
def bar { // ModRefBits
bit isMod = 0;
bit isRef = 1;
}
def foo { // ModRefBits
bit isMod = 1;
bit isRef = 0;
}
def snork { // ModRefBits
bit isMod = 1;
bit isRef = 1;
}
``let`` --- override fields in classes or records
-------------------------------------------------
A ``let`` statement collects a set of field values (sometimes called
*bindings*) and applies them to all the classes and records defined by
statements within the scope of the ``let``.
.. productionlist::
Let: "let" `LetList` "in" "{" `Statement`* "}"
:| "let" `LetList` "in" `Statement`
LetList: `LetItem` ("," `LetItem`)*
LetItem: `TokIdentifier` ["<" `RangeList` ">"] "=" `Value`
The ``let`` statement establishes a scope, which is a sequence of statements
in braces or a single statement with no braces. The bindings in the
:token:`LetList` apply to the statements in that scope.
The field names in the :token:`LetList` must name fields in classes inherited by
the classes and records defined in the statements. The field values are
applied to the classes and records *after* the records inherit all the fields from
their superclasses. So the ``let`` acts to override inherited field
values. A ``let`` cannot override the value of a template argument.
Top-level ``let`` statements are often useful when a few fields need to be
overridden in several records. Here are two examples. Note that ``let``
statements can be nested.
.. code-block:: text
let isTerminator = 1, isReturn = 1, isBarrier = 1, hasCtrlDep = 1 in
def RET : I<0xC3, RawFrm, (outs), (ins), "ret", [(X86retflag 0)]>;
let isCall = 1 in
// All calls clobber the non-callee saved registers...
let Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0,
MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, XMM0, XMM1, XMM2,
XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] in {
def CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst, variable_ops),
"call\t${dst:call}", []>;
def CALL32r : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops),
"call\t{*}$dst", [(X86call GR32:$dst)]>;
def CALL32m : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops),
"call\t{*}$dst", []>;
}
Note that a top-level ``let`` will not override fields defined in the classes or records
themselves.
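A smaller, self-contained sketch of the same mechanism follows. The class
``Insn`` and the record names are hypothetical. It assumes, as described
above, that the ``let`` bindings are applied after inheritance and before
the record's own body is processed.

.. code-block:: text

  class Insn {
    bit isBranch = 0;
    bit isTerminator = 0;
  }

  let isBranch = 1, isTerminator = 1 in {
    def JMP : Insn;               // both fields become 1
    def JZ : Insn {
      let isTerminator = 0;       // the body's own let is applied afterwards
    }
  }

  def ADD : Insn;                 // outside the scope: both fields stay 0
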
``multiclass`` --- define multiple records
------------------------------------------
While classes with template arguments are a good way to factor out commonality
between multiple records, multiclasses allow a convenient method for
defining multiple records at once. For example, consider a 3-address
instruction architecture whose instructions come in two formats: ``reg = reg
op reg`` and ``reg = reg op imm`` (e.g., SPARC). We would like to specify in
one place that these two common formats exist, then in a separate place
specify what all the operations are. The ``multiclass`` and ``defm``
statements accomplish this goal. You can think of a multiclass as a macro or
template that expands into multiple records.
.. productionlist::
MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
: [":" `ParentMultiClassList`]
: "{" `Statement`+ "}"
ParentMultiClassList: `MultiClassID` ("," `MultiClassID`)*
MultiClassID: `TokIdentifier`
As with regular classes, the multiclass has a name and can accept template
arguments. A multiclass can inherit from other multiclasses, which causes
the other multiclasses to be expanded and contribute to the record
definitions in the inheriting multiclass. The body of the multiclass
contains a series of statements that define records, using :token:`Def` and
:token:`Defm`. In addition, :token:`Defvar`, :token:`Foreach`, and
:token:`Let` statements can be used to factor out even more common elements.
The :token:`If` statement can also be used.
Also as with regular classes, the multiclass has the implicit template
argument ``NAME`` (see NAME_). When a named (non-anonymous) record is
defined in a multiclass and the record's name does not contain a use of the
template argument ``NAME``, such a use is automatically prepended
to the name. That is, the following are equivalent inside a multiclass::
def Foo ...
def NAME#Foo ...
The records defined in a multiclass are instantiated when the multiclass is
"invoked" by a ``defm`` statement outside the multiclass definition. Each
``def`` statement produces a record. As with top-level ``def`` statements,
these definitions can inherit from multiple superclasses.
See `Examples: multiclasses and defms`_ for examples.
``defm`` --- invoke multiclasses to define multiple records
-----------------------------------------------------------
Once multiclasses have been defined, you use the ``defm`` statement to
"invoke" multiclasses and process the multiple record definitions in those
multiclasses. Those record definitions are specified by ``def``
statements in the multiclasses, and indirectly by ``defm`` statements.
.. productionlist::
Defm: "defm" [`NameValue`] `ParentClassList` ";"
The optional :token:`NameValue` is formed in the same way as the name of a
``def``. The :token:`ParentClassList` is a colon followed by a list of at least one
multiclass and any number of regular classes. The multiclasses must
precede the regular classes. Note that the ``defm`` does not have a body.
This statement instantiates all the records defined in all the specified
multiclasses, either directly by ``def`` statements or indirectly by
``defm`` statements. These records also receive the fields defined in any
regular classes included in the parent class list. This is useful for adding
a common set of fields to all the records created by the ``defm``.
The name is parsed in the same special mode used by ``def``. If the name is
not included, a globally unique name is provided. That is, the following
examples end up with different names::
defm : SomeMultiClass<...>; // A globally unique name.
defm "" : SomeMultiClass<...>; // An empty name.
The ``defm`` statement can be used in a multiclass body. When this occurs,
the second variant is equivalent to::
defm NAME : SomeMultiClass<...>;
More generally, when ``defm`` occurs in a multiclass and its name does not
include a use of the implicit template argument ``NAME``, then ``NAME`` will
be prepended automatically. That is, the following are equivalent inside a
multiclass::
defm Foo : SomeMultiClass<...>;
defm NAME#Foo : SomeMultiClass<...>;
See `Examples: multiclasses and defms`_ for examples.
Examples: multiclasses and defms
--------------------------------
Here is a simple example using ``multiclass`` and ``defm``. Consider a
3-address instruction architecture whose instructions come in two formats:
``reg = reg op reg`` and ``reg = reg op imm`` (immediate). SPARC is an
example of such an architecture.
.. code-block:: text
def ops;
def GPR;
def Imm;
class inst <int opc, string asmstr, dag operandlist>;
multiclass ri_inst <int opc, string asmstr> {
def _rr : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
(ops GPR:$dst, GPR:$src1, GPR:$src2)>;
def _ri : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
(ops GPR:$dst, GPR:$src1, Imm:$src2)>;
}
// Define records for each instruction in the RR and RI formats.
defm ADD : ri_inst<0b111, "add">;
defm SUB : ri_inst<0b101, "sub">;
defm MUL : ri_inst<0b100, "mul">;
Each use of the ``ri_inst`` multiclass defines two records, one with the
``_rr`` suffix and one with ``_ri``. Recall that the name of the ``defm``
that uses a multiclass is prepended to the names of the records defined in
that multiclass. So the resulting definitions are named::
ADD_rr, ADD_ri
SUB_rr, SUB_ri
MUL_rr, MUL_ri
Without the ``multiclass`` feature, the instructions would have to be
defined as follows.
.. code-block:: text
def ops;
def GPR;
def Imm;
class inst <int opc, string asmstr, dag operandlist>;
class rrinst <int opc, string asmstr>
: inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
(ops GPR:$dst, GPR:$src1, GPR:$src2)>;
class riinst <int opc, string asmstr>
: inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
(ops GPR:$dst, GPR:$src1, Imm:$src2)>;
// Define records for each instruction in the RR and RI formats.
def ADD_rr : rrinst<0b111, "add">;
def ADD_ri : riinst<0b111, "add">;
def SUB_rr : rrinst<0b101, "sub">;
def SUB_ri : riinst<0b101, "sub">;
def MUL_rr : rrinst<0b100, "mul">;
def MUL_ri : riinst<0b100, "mul">;
A ``defm`` can be used in a multiclass to "invoke" other multiclasses and
create the records defined in those multiclasses in addition to the records
defined in the current multiclass. In the following example, the ``basic_s``
and ``basic_p`` multiclasses contain ``defm`` statements that refer to the
``basic_r`` multiclass. The ``basic_r`` multiclass contains only ``def``
statements.
.. code-block:: text
class Instruction <bits<4> opc, string Name> {
bits<4> opcode = opc;
string name = Name;
}
multiclass basic_r <bits<4> opc> {
def rr : Instruction<opc, "rr">;
def rm : Instruction<opc, "rm">;
}
multiclass basic_s <bits<4> opc> {
defm SS : basic_r<opc>;
defm SD : basic_r<opc>;
def X : Instruction<opc, "x">;
}
multiclass basic_p <bits<4> opc> {
defm PS : basic_r<opc>;
defm PD : basic_r<opc>;
def Y : Instruction<opc, "y">;
}
defm ADD : basic_s<0xf>, basic_p<0xf>;
The final ``defm`` creates the following records, five from the ``basic_s``
multiclass and five from the ``basic_p`` multiclass::
ADDSSrr, ADDSSrm
ADDSDrr, ADDSDrm
ADDX
ADDPSrr, ADDPSrm
ADDPDrr, ADDPDrm
ADDY
A ``defm`` statement, both at top level and in a multiclass, can inherit
from regular classes in addition to multiclasses. The rule is that the
regular classes must be listed after the multiclasses, and there must be at least
one multiclass.
.. code-block:: text
class XD {
bits<4> Prefix = 11;
}
class XS {
bits<4> Prefix = 12;
}
class I <bits<4> op> {
bits<4> opcode = op;
}
multiclass R {
def rr : I<4>;
def rm : I<2>;
}
multiclass Y {
defm SS : R, XD; // First multiclass R, then regular class XD.
defm SD : R, XS;
}
defm Instr : Y;
This example will create four records, shown here in alphabetical order with
their fields.
.. code-block:: text
def InstrSDrm {
bits<4> opcode = { 0, 0, 1, 0 };
bits<4> Prefix = { 1, 1, 0, 0 };
}
def InstrSDrr {
bits<4> opcode = { 0, 1, 0, 0 };
bits<4> Prefix = { 1, 1, 0, 0 };
}
def InstrSSrm {
bits<4> opcode = { 0, 0, 1, 0 };
bits<4> Prefix = { 1, 0, 1, 1 };
}
def InstrSSrr {
bits<4> opcode = { 0, 1, 0, 0 };
bits<4> Prefix = { 1, 0, 1, 1 };
}
It's also possible to use ``let`` statements inside multiclasses, providing
another way to factor out commonality from the records, especially when
using several levels of multiclass instantiations.
.. code-block:: text
multiclass basic_r <bits<4> opc> {
let Predicates = [HasSSE2] in {
def rr : Instruction<opc, "rr">;
def rm : Instruction<opc, "rm">;
}
let Predicates = [HasSSE3] in
def rx : Instruction<opc, "rx">;
}
multiclass basic_ss <bits<4> opc> {
let IsDouble = 0 in
defm SS : basic_r<opc>;
let IsDouble = 1 in
defm SD : basic_r<opc>;
}
defm ADD : basic_ss<0xf>;
``defset`` --- create a definition set
--------------------------------------
The ``defset`` statement is used to collect a set of records into a global
list of records.
.. productionlist::
Defset: "defset" `Type` `TokIdentifier` "=" "{" `Statement`* "}"
All records defined inside the braces via ``def`` and ``defm`` are defined
as usual, and they are also collected in a global list of the given name
(:token:`TokIdentifier`).
The specified type must be ``list<``\ *class*\ ``>``, where *class* is some
record class. The ``defset`` statement establishes a scope for its
statements. It is an error to define a record in the scope of the
``defset`` that is not of type *class*.
The ``defset`` statement can be nested. The inner ``defset`` adds the
records to its own set, and all those records are also added to the outer
set.
Anonymous records created inside initialization expressions using the
``ClassID<...>`` syntax are not collected in the set.
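
The following is a small sketch of a ``defset``; the class ``Reg`` and the
names ``AllRegs``, ``R0``, ``R1``, and ``Summary`` are hypothetical. It
collects two records into a list and then uses that list in a later record:

.. code-block:: text

  class Reg <int n> {
    int Num = n;
  }

  // R0 and R1 are defined as usual and also collected in AllRegs.
  defset list<Reg> AllRegs = {
    def R0 : Reg<0>;
    def R1 : Reg<1>;
  }

  def Summary {
    int Count = !size(AllRegs);   // Two records were collected.
  }
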
``defvar`` --- define a variable
--------------------------------
A ``defvar`` statement defines a global variable. Its value can be used
throughout the statements that follow the definition.
.. productionlist::
Defvar: "defvar" `TokIdentifier` "=" `Value` ";"
The identifier on the left of the ``=`` is defined to be a global variable
whose value is given by the value expression on the right of the ``=``. The
type of the variable is automatically inferred.
Once a variable has been defined, it cannot be set to another value.
Variables defined in a top-level ``foreach`` go out of scope at the end of
each loop iteration, so their value in one iteration is not available in
the next iteration. The following ``defvar`` will not work::
defvar i = !add(i, 1);
Variables can also be defined with ``defvar`` in a record body. See
`Defvar in a Record Body`_ for more details.
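
As an illustrative sketch (the names ``BaseOpcode``, ``Mnemonics``, and
``Op`` are hypothetical), global variables can hold values that later
records refer to:

.. code-block:: text

  defvar BaseOpcode = 0x10;
  defvar Mnemonics = ["add", "sub"];

  def Op {
    int Opcode = !add(BaseOpcode, 1);     // 0x10 + 1 = 17
    string First = !head(Mnemonics);      // "add"
  }
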
``foreach`` --- iterate over a sequence of statements
-----------------------------------------------------
The ``foreach`` statement iterates over a series of statements, varying a
variable over a sequence of values.
.. productionlist::
Foreach: "foreach" `ForeachIterator` "in" "{" `Statement`* "}"
:| "foreach" `ForeachIterator` "in" `Statement`
ForeachIterator: `TokIdentifier` "=" ("{" `RangeList` "}" | `RangePiece` | `Value`)
The body of the ``foreach`` is a series of statements in braces or a
single statement with no braces. The statements are re-evaluated once for
each value in the range list, range piece, or single value. On each
iteration, the :token:`TokIdentifier` variable is set to the value and can
be used in the statements.
The statement list establishes an inner scope. Variables local to a
``foreach`` go out of scope at the end of each loop iteration, so their
values do not carry over from one iteration to the next. Foreach loops may
be nested.
The ``foreach`` statement can also be used in a record :token:`Body`.
.. Note that the productions involving RangeList and RangePiece have precedence
over the more generic value parsing based on the first token.
.. code-block:: text
foreach i = [0, 1, 2, 3] in {
def R#i : Register<...>;
def F#i : Register<...>;
}
This loop defines records named ``R0``, ``R1``, ``R2``, and ``R3``, along
with ``F0``, ``F1``, ``F2``, and ``F3``.
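
A range piece can be used in place of a list. The sketch below (the names
``Slot`` and ``Doubled`` are hypothetical) also shows a ``defvar`` that is
local to each iteration:

.. code-block:: text

  foreach i = 0...2 in {
    defvar Doubled = !mul(i, 2);   // Goes out of scope after each iteration.
    def Slot#i {
      int Offset = Doubled;
    }
  }

Under these assumptions, the loop should define ``Slot0``, ``Slot1``, and
``Slot2`` with ``Offset`` values 0, 2, and 4.
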
``if`` --- select statements based on a test
--------------------------------------------
The ``if`` statement allows one of two statement groups to be selected based
on the value of an expression.
.. productionlist::
If: "if" `Value` "then" `IfBody`
:| "if" `Value` "then" `IfBody` "else" `IfBody`
IfBody: "{" `Statement`* "}" | `Statement`
The value expression is evaluated. If it evaluates to true (in the same
sense used by the bang operators), then the statements following the
``then`` reserved word are processed. Otherwise, if there is an ``else``
reserved word, the statements following the ``else`` are processed. If the
value is false and there is no ``else`` arm, no statements are processed.
Because the braces around the ``then`` statements are optional, this grammar rule
has the usual ambiguity with "dangling else" clauses, and it is resolved in
the usual way: in a case like ``if v1 then if v2 then {...} else {...}``, the
``else`` associates with the inner ``if`` rather than the outer one.
The :token:`IfBody` of each arm (``then`` and ``else``) of the ``if``
establishes an inner scope. Any ``defvar`` variables defined in a body go out
of scope when that body is finished (see `Defvar in a Record Body`_ for more details).
The ``if`` statement can also be used in a record :token:`Body`.
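
Here is a minimal sketch of a top-level ``if``; the variable ``UseWide``,
the class ``Reg``, and the record ``Acc`` are hypothetical. Only one arm is
processed, so ``Acc`` is defined exactly once:

.. code-block:: text

  defvar UseWide = 1;

  class Reg <int w> {
    int Width = w;
  }

  if !eq(UseWide, 1) then {
    def Acc : Reg<64>;
  } else {
    def Acc : Reg<32>;
  }
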
``assert`` --- check that a condition is true
---------------------------------------------
The ``assert`` statement checks a boolean condition to be sure that it is true
and prints an error message if it is not.
.. productionlist::
Assert: "assert" `condition` "," `message` ";"
If the boolean condition is true, the statement does nothing. If the
condition is false, it prints a nonfatal error message. The **message**, which
can be an arbitrary string expression, is included in the error message as a
note. The exact behavior of the ``assert`` statement depends on its
placement.
* At top level, the assertion is checked immediately.
* In a record definition, the statement is saved and all assertions are
checked after the record is completely built.
* In a class definition, the assertions are saved and inherited by all
the record definitions that inherit from the class. The assertions are
then checked when the records are completely built. [this placement is not
yet available]
* In a multiclass definition, ... [this placement is not yet available]
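
The two placements that are available can be sketched as follows. The
names ``MaxRegs``, ``Config``, and ``Count`` are hypothetical, and the
messages are arbitrary:

.. code-block:: text

  defvar MaxRegs = 32;

  // At top level, the assertion is checked immediately.
  assert !le(MaxRegs, 64), "MaxRegs must not exceed 64";

  def Config {
    int Count = MaxRegs;
    // In a record, the assertion is saved and checked once the
    // record is completely built.
    assert !gt(Count, 0), "Count must be positive";
  }
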
Additional Details
==================
Directed acyclic graphs (DAGs)
------------------------------
A directed acyclic graph can be represented directly in TableGen using the
``dag`` datatype. A DAG node consists of an operator and zero or more
arguments (or operands). Each argument can be of any desired type. By using
another DAG node as an argument, an arbitrary graph of DAG nodes can be
built.
The syntax of a ``dag`` instance is:
``(`` *operator* *argument1*\ ``,`` *argument2*\ ``,`` ... ``)``
The operator must be present and must be a record. There can be zero or more
arguments, separated by commas. The operator and arguments can have three
formats.
====================== =============================================
Format                 Meaning
====================== =============================================
*value*                argument value
*value*\ ``:``\ *name* argument value and associated name
*name*                 argument name with unset (uninitialized) value
====================== =============================================
The *value* can be any TableGen value. The *name*, if present, must be a
:token:`TokVarName`, which starts with a dollar sign (``$``). The purpose of
a name is to tag an operator or argument in a DAG with a particular meaning,
or to associate an argument in one DAG with a like-named argument in another
DAG.
The following bang operators are useful for working with DAGs:
``!con``, ``!dag``, ``!empty``, ``!foreach``, ``!getdagop``, ``!setdagop``, ``!size``.
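
As a small sketch of the syntax (the records ``ops`` and ``GPR`` and the
field names are hypothetical), a DAG value can be written directly and
then inspected or extended with bang operators:

.. code-block:: text

  def ops;
  def GPR;

  def Example {
    // Operator ops with two named arguments.
    dag Operands = (ops GPR:$dst, GPR:$src);
    // The operator is not counted, so the size is 2.
    int NumArgs = !size(Operands);
    // Both DAGs use the operator ops, so they can be concatenated:
    // (ops GPR:$dst, GPR:$src, GPR:$extra)
    dag More = !con(Operands, (ops GPR:$extra));
  }
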
Defvar in a record body
-----------------------
In addition to defining global variables, the ``defvar`` statement can
be used inside the :token:`Body` of a class or record definition to define
local variables. The scope of the variable extends from the ``defvar``
statement to the end of the body. It cannot be set to a different value
within its scope. The ``defvar`` statement can also be used in the statement
list of a ``foreach``, which establishes a scope.
A variable named ``V`` in an inner scope shadows (hides) any variables ``V``
in outer scopes. In particular, ``V`` in a record body shadows a global
``V``, and ``V`` in a ``foreach`` statement list shadows any ``V`` in
surrounding record or global scopes.
Variables defined in a ``foreach`` go out of scope at the end of
each loop iteration, so their value in one iteration is not available in
the next iteration. The following ``defvar`` will not work::
defvar i = !add(i, 1);
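
By contrast, the following sketch shows a ``defvar`` that does work in a
class body. The class ``Scaled`` and its fields are hypothetical:

.. code-block:: text

  class Scaled <int base> {
    defvar Twice = !mul(base, 2);   // Local to the class body.
    int Value = Twice;
    int Next = !add(Twice, 1);
  }

  def S3 : Scaled<3>;   // Expected: Value = 6, Next = 7.
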
How records are built
---------------------
The following steps are taken by TableGen when a record is built. Classes are simply
abstract records and so go through the same steps.
1. Build the record name (:token:`NameValue`) and create an empty record.
2. Parse the superclasses in the :token:`ParentClassList` from left to
right, visiting each superclass's ancestor classes from top to bottom.
a. Add the fields from the superclass to the record.
b. Substitute the template arguments into those fields.
c. Add the superclass to the record's list of inherited classes.
3. Apply any top-level ``let`` bindings to the record. Recall that top-level
bindings only apply to inherited fields.
4. Parse the body of the record.
* Add any fields to the record.
* Modify the values of fields according to local ``let`` statements.
* Define any ``defvar`` variables.
5. Make a pass over all the fields to resolve any inter-field references.
6. Add the record to the master record list.
Because references between fields are resolved (step 5) after ``let`` bindings are
applied (step 3), the ``let`` statement has unusual power. For example:
.. code-block:: text
class C <int x> {
int Y = x;
int Yplus1 = !add(Y, 1);
int xplus1 = !add(x, 1);
}
let Y = 10 in {
def rec1 : C<5> {
}
}
def rec2 : C<5> {
let Y = 10;
}
In both cases, one where a top-level ``let`` is used to bind ``Y`` and one
where a local ``let`` does the same thing, the results are:
.. code-block:: text
def rec1 { // C
int Y = 10;
int Yplus1 = 11;
int xplus1 = 6;
}
def rec2 { // C
int Y = 10;
int Yplus1 = 11;
int xplus1 = 6;
}
``Yplus1`` is 11 because the ``let Y`` is performed before the ``!add(Y,
1)`` is resolved. Use this power wisely.
Using Classes as Subroutines
============================
As described in `Simple values`_, a class can be invoked in an expression
and passed template arguments. This causes TableGen to create a new anonymous
record inheriting from that class. As usual, the record receives all the
fields defined in the class.
This feature can be employed as a simple subroutine facility. The class can
use the template arguments to define various variables and fields, which end
up in the anonymous record. Those fields can then be retrieved in the
expression invoking the class as follows. Assume that the field ``ret``
contains the final value of the subroutine.
.. code-block:: text
int Result = ... CalcValue<arg>.ret ...;
The ``CalcValue`` class is invoked with the template argument ``arg``. It
calculates a value for the ``ret`` field, which is then retrieved at the
"point of call" in the initialization for the ``Result`` field. The anonymous
record created in this example serves no other purpose than to carry the
result value.
Here is a practical example. The class ``isValidSize`` determines whether a
specified number of bytes represents a valid data size. The bit ``ret`` is
set appropriately. The field ``ValidSize`` obtains its initial value by
invoking ``isValidSize`` with the data size and retrieving the ``ret`` field
from the resulting anonymous record.
.. code-block:: text
class isValidSize<int size> {
bit ret = !cond(!eq(size, 1): 1,
!eq(size, 2): 1,
!eq(size, 4): 1,
!eq(size, 8): 1,
!eq(size, 16): 1,
true: 0);
}
def Data1 {
int Size = ...;
bit ValidSize = isValidSize<Size>.ret;
}
Preprocessing Facilities
========================
The preprocessor embedded in TableGen is intended only for simple
conditional compilation. It supports the following directives, which are
specified somewhat informally.
.. productionlist::
LineBegin: beginning of line
LineEnd: newline | return | EOF
WhiteSpace: space | tab
CComment: "/*" ... "*/"
BCPLComment: "//" ... `LineEnd`
WhiteSpaceOrCComment: `WhiteSpace` | `CComment`
WhiteSpaceOrAnyComment: `WhiteSpace` | `CComment` | `BCPLComment`
MacroName: `ualpha` (`ualpha` | "0"..."9")*
PreDefine: `LineBegin` (`WhiteSpaceOrCComment`)*
: "#define" (`WhiteSpace`)+ `MacroName`
: (`WhiteSpaceOrAnyComment`)* `LineEnd`
PreIfdef: `LineBegin` (`WhiteSpaceOrCComment`)*
: ("#ifdef" | "#ifndef") (`WhiteSpace`)+ `MacroName`
: (`WhiteSpaceOrAnyComment`)* `LineEnd`
PreElse: `LineBegin` (`WhiteSpaceOrCComment`)*
: "#else" (`WhiteSpaceOrAnyComment`)* `LineEnd`
PreEndif: `LineBegin` (`WhiteSpaceOrCComment`)*
: "#endif" (`WhiteSpaceOrAnyComment`)* `LineEnd`
..
PreRegContentException: `PreIfdef` | `PreElse` | `PreEndif` | EOF
PreRegion: .* - `PreRegContentException`
:| `PreIfdef`
: (`PreRegion`)*
: [`PreElse`]
: (`PreRegion`)*
: `PreEndif`
A :token:`MacroName` can be defined anywhere in a TableGen file. The name has
no value; it can only be tested to see whether it is defined.
A macro test region begins with an ``#ifdef`` or ``#ifndef`` directive. If
the macro name is defined (``#ifdef``) or undefined (``#ifndef``), then the
source code between the directive and the corresponding ``#else`` or
``#endif`` is processed. If the test fails but there is an ``#else``
clause, the source code between the ``#else`` and the ``#endif`` is
processed. If the test fails and there is no ``#else`` clause, then no
source code in the test region is processed.
Test regions may be nested, but they must be properly nested. A region
started in a file must end in that file; that is, it must have its
``#endif`` in the same file.
A :token:`MacroName` may be defined externally using the ``-D`` option on the
``xxx-tblgen`` command line::
llvm-tblgen self-reference.td -Dmacro1 -Dmacro3
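
A test region might look like the following sketch, in which the macro
name ``ENABLE_EXTRAS`` and both record names are hypothetical. With
``-DENABLE_EXTRAS`` on the command line, only ``ExtraFeature`` is defined;
without it, only ``BasicFeature`` is defined:

.. code-block:: text

  #ifdef ENABLE_EXTRAS
  def ExtraFeature;
  #else
  def BasicFeature;
  #endif
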
Appendix A: Bang Operators
==========================
Bang operators act as functions in value expressions. A bang operator takes
one or more arguments, operates on them, and produces a result. If the
operator produces a boolean result, the result value will be 1 for true or 0
for false. When an operator tests a boolean argument, it interprets 0 as false
and non-0 as true.
.. warning::
The ``!getop`` and ``!setop`` bang operators are deprecated in favor of
``!getdagop`` and ``!setdagop``.
``!add(``\ *a*\ ``,`` *b*\ ``, ...)``
This operator adds *a*, *b*, etc., and produces the sum.
``!and(``\ *a*\ ``,`` *b*\ ``, ...)``
This operator does a bitwise AND on *a*, *b*, etc., and produces the
result. A logical AND can be performed if all the arguments are either
0 or 1.
``!cast<``\ *type*\ ``>(``\ *a*\ ``)``
This operator performs a cast on *a* and produces the result.
If *a* is not a string, then a straightforward cast is performed, say
between an ``int`` and a ``bit``, or between record types. This allows
casting a record to a class. If a record is cast to ``string``, the
record's name is produced.
If *a* is a string, then it is treated as a record name and looked up in
the list of all defined records. The resulting record is expected to be of
the specified *type*.
For example, if ``!cast<``\ *type*\ ``>(``\ *name*\ ``)``
appears in a multiclass definition, or in a
class instantiated inside a multiclass definition, and the *name* does not
reference any template arguments of the multiclass, then a record by
that name must have been instantiated earlier
in the source file. If *name* does reference
a template argument, then the lookup is delayed until ``defm`` statements
instantiating the multiclass (or later, if the defm occurs in another
multiclass and template arguments of the inner multiclass that are
referenced by *name* are substituted by values that themselves contain
references to template arguments of the outer multiclass).
If the type of *a* does not match *type*, TableGen raises an error.
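
A brief sketch of both directions of the cast, using the hypothetical
class ``Reg`` and record ``R0``:

.. code-block:: text

  class Reg;
  def R0 : Reg;

  def Lookup {
    Reg ByName = !cast<Reg>("R0");     // Looks up the record named R0.
    string Name = !cast<string>(R0);   // Produces the record's name, "R0".
  }
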
``!con(``\ *a*\ ``,`` *b*\ ``, ...)``
This operator concatenates the DAG nodes *a*, *b*, etc. Their operators
must be equal.
``!con((op a1:$name1, a2:$name2), (op b1:$name3))``
results in the DAG node ``(op a1:$name1, a2:$name2, b1:$name3)``.
``!cond(``\ *cond1* ``:`` *val1*\ ``,`` *cond2* ``:`` *val2*\ ``, ...,`` *condn* ``:`` *valn*\ ``)``
This operator tests *cond1* and returns *val1* if the result is true.
If false, the operator tests *cond2* and returns *val2* if the result is
true. And so forth. An error is reported if no conditions are true.
This example produces the sign word for an integer::
!cond(!lt(x, 0) : "negative", !eq(x, 0) : "zero", true : "positive")
``!dag(``\ *op*\ ``,`` *arguments*\ ``,`` *names*\ ``)``
This operator creates a DAG node with the given operator and
arguments. The *arguments* and *names* arguments must be lists
of equal length or uninitialized (``?``). The *names* argument
must be of type ``list<string>``.
Due to limitations of the type system, *arguments* must be a list of items
of a common type. In practice, this means that they should either have the
same type or be records with a common superclass. Mixing ``dag`` and
non-``dag`` items is not possible. However, ``?`` can be used.
Example: ``!dag(op, [a1, a2, ?], ["name1", "name2", "name3"])`` results in
``(op a1-value:$name1, a2-value:$name2, ?:$name3)``.
``!empty(``\ *a*\ ``)``
This operator produces 1 if the string, list, or DAG *a* is empty; 0 otherwise.
A DAG is empty if it has no arguments; the operator does not count.
``!eq(``\ *a*\ ``,`` *b*\ ``)``
This operator produces 1 if *a* is equal to *b*; 0 otherwise.
The arguments must be ``bit``, ``bits``, ``int``, ``string``, or
record values. Use ``!cast<string>`` to compare other types of objects.
``!filter(``\ *var*\ ``,`` *list*\ ``,`` *predicate*\ ``)``
This operator creates a new ``list`` by filtering the elements in
*list*. To perform the filtering, TableGen binds the variable *var* to each
element and then evaluates the *predicate* expression, which presumably
refers to *var*. The predicate must
produce a boolean value (``bit``, ``bits``, or ``int``). The value is
interpreted as with ``!if``:
if the value is 0, the element is not included in the new list. If the value
is anything else, the element is included.
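
For instance, this sketch keeps only the even integers of a list (the
record and field names are hypothetical). The predicate tests the low bit
with ``!and``:

.. code-block:: text

  def Evens {
    // !and(x, 1) is 0 for even x, so !eq(..., 0) selects even elements.
    list<int> Values = !filter(x, [1, 2, 3, 4, 5, 6], !eq(!and(x, 1), 0));
    // Expected: [2, 4, 6]
  }
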
``!foldl(``\ *init*\ ``,`` *list*\ ``,`` *acc*\ ``,`` *var*\ ``,`` *expr*\ ``)``
This operator performs a left-fold over the items in *list*. The
variable *acc* acts as the accumulator and is initialized to *init*.
The variable *var* is bound to each element in the *list*. The
expression is evaluated for each element and presumably uses *acc* and
*var* to calculate the accumulated value, which ``!foldl`` stores back in
*acc*. The type of *acc* is the same as *init*; the type of *var* is the
same as the elements of *list*; *expr* must have the same type as *init*.
The following example computes the total of the ``Number`` field in the
list of records in ``RecList``::
int x = !foldl(0, RecList, total, rec, !add(total, rec.Number));
If your goal is to filter the list and produce a new list that includes only
some of the elements, see ``!filter``.
``!foreach(``\ *var*\ ``,`` *sequence*\ ``,`` *expr*\ ``)``
This operator creates a new ``list``/``dag`` in which each element is a
function of the corresponding element in the *sequence* ``list``/``dag``.
To perform the function, TableGen binds the variable *var* to an element
and then evaluates the expression. The expression presumably refers
to the variable *var* and calculates the result value.
If you simply want to create a list of a certain length containing
the same value repeated multiple times, see ``!listsplat``.
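
A minimal sketch, with hypothetical record and field names, that squares
each element of a list:

.. code-block:: text

  def Squares {
    list<int> Values = !foreach(x, [1, 2, 3], !mul(x, x));   // [1, 4, 9]
  }
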
``!ge(``\ *a*\ ``,`` *b*\ ``)``
This operator produces 1 if *a* is greater than or equal to *b*; 0 otherwise.
The arguments must be ``bit``, ``bits``, ``int``, or ``string`` values.
``!getdagop(``\ *dag*\ ``)`` --or-- ``!getdagop<``\ *type*\ ``>(``\ *dag*\ ``)``
This operator produces the operator of the given *dag* node.
Example: ``!getdagop((foo 1, 2))`` results in ``foo``. Recall that
DAG operators are always records.
The result of ``!getdagop`` can be used directly in a context where
any record class at all is acceptable (typically placing it into
another dag value). But in other contexts, it must be explicitly
cast to a particular class. The ``<``\ *type*\ ``>`` syntax is
provided to make this easy.
For example, to assign the result to a value of type ``BaseClass``, you
could write either of these::
BaseClass b = !getdagop<BaseClass>(someDag);
BaseClass b = !cast<BaseClass>(!getdagop(someDag));
But to create a new DAG node that reuses the operator from another, no
cast is necessary::
dag d = !dag(!getdagop(someDag), args, names);
``!gt(``\ *a*\ ``,`` *b*\ ``)``
This operator produces 1 if *a* is greater than *b*; 0 otherwise.
The arguments must be ``bit``, ``bits``, ``int``, or ``string`` values.
``!head(``\ *a*\ ``)``
This operator produces the zeroth element of the list *a*.
(See also ``!tail``.)
``!if(``\ *test*\ ``,`` *then*\ ``,`` *else*\ ``)``
This operator evaluates the *test*, which must produce a ``bit`` or
``int``. If the result is not 0, the *then* expression is produced; otherwise
the *else* expression is produced.
``!interleave(``\ *list*\ ``,`` *delim*\ ``)``
This operator concatenates the items in the *list*, interleaving the
*delim* string between each pair, and produces the resulting string.
The list elements can be of type ``string``, ``int``, ``bits``, or ``bit``. An empty list
results in an empty string. The delimiter can be the empty string.
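
For example, in this sketch (the record and field names are hypothetical):

.. code-block:: text

  def Joined {
    string Csv = !interleave(["a", "b", "c"], ", ");   // "a, b, c"
    string Nums = !interleave([1, 2, 3], "-");         // "1-2-3"
  }
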
``!isa<``\ *type*\ ``>(``\ *a*\ ``)``
This operator produces 1 if the type of *a* is a subtype of the given *type*; 0
otherwise.
``!le(``\ *a*\ ``,`` *b*\ ``)``
This operator produces 1 if *a* is less than or equal to *b*; 0 otherwise.
The arguments must be ``bit``, ``bits``, ``int``, or ``string`` values.
``!listconcat(``\ *list1*\ ``,`` *list2*\ ``, ...)``
This operator concatenates the list arguments *list1*, *list2*, etc., and
produces the resulting list. The lists must have the same element type.
``!listsplat(``\ *value*\ ``,`` *count*\ ``)``
This operator produces a list of length *count* whose elements are all
equal to the *value*. For example, ``!listsplat(42, 3)`` results in
``[42, 42, 42]``.
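
The two list operators combine naturally, as in this sketch with
hypothetical record and field names:

.. code-block:: text

  def Lists {
    list<int> Both = !listconcat([1, 2], [3]);               // [1, 2, 3]
    list<int> Padded = !listconcat([7], !listsplat(0, 3));   // [7, 0, 0, 0]
  }
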
``!lt(``\ *a*\ ``,`` *b*\ ``)``
This operator produces 1 if *a* is less than *b*; 0 otherwise.
The arguments must be ``bit``, ``bits``, ``int``, or ``string`` values.
``!mul(``\ *a*\ ``,`` *b*\ ``, ...)``
This operator multiplies *a*, *b*, etc., and produces the product.
``!ne(``\ *a*\ ``,`` *b*\ ``)``
This operator produces 1 if *a* is not equal to *b*; 0 otherwise.
The arguments must be ``bit``, ``bits``, ``int``, ``string``,
or record values. Use ``!cast<string>`` to compare other types of objects.
``!not(``\ *a*\ ``)``
This operator performs a logical NOT on *a*, which must be
an integer. The argument 0 results in 1 (true); any other
argument results in 0 (false).
``!or(``\ *a*\ ``,`` *b*\ ``, ...)``
This operator does a bitwise OR on *a*, *b*, etc., and produces the
result. A logical OR can be performed if all the arguments are either
0 or 1.
``!setdagop(``\ *dag*\ ``,`` *op*\ ``)``
This operator produces a DAG node with the same arguments as *dag*, but with its
operator replaced with *op*.
Example: ``!setdagop((foo 1, 2), bar)`` results in ``(bar 1, 2)``.
``!shl(``\ *a*\ ``,`` *count*\ ``)``
This operator shifts *a* left logically by *count* bits and produces the resulting
value. The operation is performed on a 64-bit integer; the result
is undefined for shift counts outside 0...63.
``!size(``\ *a*\ ``)``
This operator produces the size of the string, list, or dag *a*.
The size of a DAG is the number of arguments; the operator does not count.
``!sra(``\ *a*\ ``,`` *count*\ ``)``
This operator shifts *a* right arithmetically by *count* bits and produces the resulting
value. The operation is performed on a 64-bit integer; the result
is undefined for shift counts outside 0...63.
``!srl(``\ *a*\ ``,`` *count*\ ``)``
This operator shifts *a* right logically by *count* bits and produces the resulting
value. The operation is performed on a 64-bit integer; the result
is undefined for shift counts outside 0...63.
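
The three shift operators (``!shl``, ``!sra``, and ``!srl``) can be
compared in a short sketch. The record and field names are hypothetical:

.. code-block:: text

  def Shifts {
    int Left = !shl(1, 4);        // 16
    int Arith = !sra(-16, 2);     // -4; the sign bit is replicated.
    int Logical = !srl(16, 2);    // 4
  }
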
``!strconcat(``\ *str1*\ ``,`` *str2*\ ``, ...)``
This operator concatenates the string arguments *str1*, *str2*, etc., and
produces the resulting string.
``!sub(``\ *a*\ ``,`` *b*\ ``)``
This operator subtracts *b* from *a* and produces the arithmetic difference.
``!subst(``\ *target*\ ``,`` *repl*\ ``,`` *value*\ ``)``
This operator replaces all occurrences of the *target* in the *value* with
the *repl* and produces the resulting value. The *value* can
be a string, in which case substring substitution is performed.
The *value* can be a record name, in which case the operator produces the *repl*
record if the *target* record name equals the *value* record name; otherwise it
produces the *value*.
``!substr(``\ *string*\ ``,`` *start*\ [``,`` *length*]\ ``)``
This operator extracts a substring of the given *string*. The starting
position of the substring is specified by *start*, which can range
between 0 and the length of the string. The length of the substring
is specified by *length*; if not specified, the rest of the string is
extracted. The *start* and *length* arguments must be integers.
``!tail(``\ *a*\ ``)``
This operator produces a new list with all the elements
of the list *a* except for the zeroth one. (See also ``!head``.)
``!xor(``\ *a*\ ``,`` *b*\ ``, ...)``
This operator does a bitwise EXCLUSIVE OR on *a*, *b*, etc., and produces
the result. A logical XOR can be performed if all the arguments are either
0 or 1.
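As a rough illustration of several of the operators described in this
appendix, the following sketch collects a few of them in one record. The
record name ``BangOpSketch`` and its field names are invented for this example
and do not appear in LLVM; the comments give the values expected from the
operator descriptions above.

.. code-block:: text

  // Hypothetical record; all names are illustrative only.
  def BangOpSketch {
    bit IsLess     = !lt(3, 5);                        // 1
    bit Differs    = !ne("add", "sub");                // 1
    int Negated    = !not(0);                          // 1
    int Product    = !mul(2, 3, 4);                    // 24
    int Difference = !sub(10, 3);                      // 7
    int OrBits     = !or(4, 1);                        // 5
    int XorBits    = !xor(6, 3);                       // 5
    int ShiftLeft  = !shl(1, 4);                       // 16
    int ShiftArith = !sra(-16, 2);                     // -4
    int ShiftLogic = !srl(16, 2);                      // 4
    int Count      = !size([10, 20, 30]);              // 3
    list<int> Rest = !tail([10, 20, 30]);              // [20, 30]
    string Joined  = !strconcat("ADD", "32", "rr");    // "ADD32rr"
    string Swapped = !subst("ADD", "SUB", "ADD32rr");  // "SUB32rr"
    string Width   = !substr("ADD32rr", 3, 2);         // "32"
  }

Passing a file containing this record to ``llvm-tblgen --print-records``
should print the record with each field resolved to its final value, which is
a convenient way to check how an operator behaves.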
Appendix B: Paste Operator Examples
===================================
Here is an example illustrating the use of the paste operator in record names.
.. code-block:: text
defvar suffix = "_suffstring";
defvar some_ints = [0, 1, 2, 3];
def name # suffix {
}
foreach i = [1, 2] in {
def rec # i {
}
}
The first ``def`` does not use the value of the ``suffix`` variable; the name
``suffix`` is pasted verbatim, producing ``namesuffix``. The second ``def``
does use the value of the ``i`` iterator variable, because ``i`` is not a
global name. The following records are produced.
.. code-block:: text
def namesuffix {
}
def rec1 {
}
def rec2 {
}
Here is a second example illustrating the paste operator in field value expressions.
.. code-block:: text
def test {
string strings = suffix # suffix;
list<int> integers = some_ints # [4, 5, 6];
}
The ``strings`` field expression uses ``suffix`` on both sides of the paste
operator. It is evaluated normally on the left-hand side, but taken verbatim
on the right-hand side. The ``integers`` field expression uses the value of
the ``some_ints`` variable and a literal list. The following record is
produced.
.. code-block:: text
def test {
string strings = "_suffstringsuffix";
list<int> integers = [0, 1, 2, 3, 4, 5, 6];
}
Appendix C: Sample Record
=========================
One target machine supported by LLVM is the Intel x86. The following output
from TableGen shows the record that is created to represent the 32-bit
register-to-register ADD instruction.
.. code-block:: text
def ADD32rr { // InstructionEncoding Instruction X86Inst I ITy Sched BinOpRR BinOpRR_RF
int Size = 0;
string DecoderNamespace = "";
list<Predicate> Predicates = [];
string DecoderMethod = "";
bit hasCompleteDecoder = 1;
string Namespace = "X86";
dag OutOperandList = (outs GR32:$dst);
dag InOperandList = (ins GR32:$src1, GR32:$src2);
string AsmString = "add{l} {$src2, $src1|$src1, $src2}";
EncodingByHwMode EncodingInfos = ?;
list<dag> Pattern = [(set GR32:$dst, EFLAGS, (X86add_flag GR32:$src1, GR32:$src2))];
list<Register> Uses = [];
list<Register> Defs = [EFLAGS];
int CodeSize = 3;
int AddedComplexity = 0;
bit isPreISelOpcode = 0;
bit isReturn = 0;
bit isBranch = 0;
bit isEHScopeReturn = 0;
bit isIndirectBranch = 0;
bit isCompare = 0;
bit isMoveImm = 0;
bit isMoveReg = 0;
bit isBitcast = 0;
bit isSelect = 0;
bit isBarrier = 0;
bit isCall = 0;
bit isAdd = 0;
bit isTrap = 0;
bit canFoldAsLoad = 0;
bit mayLoad = ?;
bit mayStore = ?;
bit mayRaiseFPException = 0;
bit isConvertibleToThreeAddress = 1;
bit isCommutable = 1;
bit isTerminator = 0;
bit isReMaterializable = 0;
bit isPredicable = 0;
bit isUnpredicable = 0;
bit hasDelaySlot = 0;
bit usesCustomInserter = 0;
bit hasPostISelHook = 0;
bit hasCtrlDep = 0;
bit isNotDuplicable = 0;
bit isConvergent = 0;
bit isAuthenticated = 0;
bit isAsCheapAsAMove = 0;
bit hasExtraSrcRegAllocReq = 0;
bit hasExtraDefRegAllocReq = 0;
bit isRegSequence = 0;
bit isPseudo = 0;
bit isExtractSubreg = 0;
bit isInsertSubreg = 0;
bit variadicOpsAreDefs = 0;
bit hasSideEffects = ?;
bit isCodeGenOnly = 0;
bit isAsmParserOnly = 0;
bit hasNoSchedulingInfo = 0;
InstrItinClass Itinerary = NoItinerary;
list<SchedReadWrite> SchedRW = [WriteALU];
string Constraints = "$src1 = $dst";
string DisableEncoding = "";
string PostEncoderMethod = "";
bits<64> TSFlags = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0 };
string AsmMatchConverter = "";
string TwoOperandAliasConstraint = "";
string AsmVariantName = "";
bit UseNamedOperandTable = 0;
bit FastISelShouldIgnore = 0;
bits<8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 1 };
Format Form = MRMDestReg;
bits<7> FormBits = { 0, 1, 0, 1, 0, 0, 0 };
ImmType ImmT = NoImm;
bit ForceDisassemble = 0;
OperandSize OpSize = OpSize32;
bits<2> OpSizeBits = { 1, 0 };
AddressSize AdSize = AdSizeX;
bits<2> AdSizeBits = { 0, 0 };
Prefix OpPrefix = NoPrfx;
bits<3> OpPrefixBits = { 0, 0, 0 };
Map OpMap = OB;
bits<3> OpMapBits = { 0, 0, 0 };
bit hasREX_WPrefix = 0;
FPFormat FPForm = NotFP;
bit hasLockPrefix = 0;
Domain ExeDomain = GenericDomain;
bit hasREPPrefix = 0;
Encoding OpEnc = EncNormal;
bits<2> OpEncBits = { 0, 0 };
bit HasVEX_W = 0;
bit IgnoresVEX_W = 0;
bit EVEX_W1_VEX_W0 = 0;
bit hasVEX_4V = 0;
bit hasVEX_L = 0;
bit ignoresVEX_L = 0;
bit hasEVEX_K = 0;
bit hasEVEX_Z = 0;
bit hasEVEX_L2 = 0;
bit hasEVEX_B = 0;
bits<3> CD8_Form = { 0, 0, 0 };
int CD8_EltSize = 0;
bit hasEVEX_RC = 0;
bit hasNoTrackPrefix = 0;
bits<7> VectSize = { 0, 0, 1, 0, 0, 0, 0 };
bits<7> CD8_Scale = { 0, 0, 0, 0, 0, 0, 0 };
string FoldGenRegForm = ?;
string EVEX2VEXOverride = ?;
bit isMemoryFoldable = 1;
bit notEVEX2VEXConvertible = 0;
}
On the first line of the record, you can see that the ``ADD32rr`` record
inherited from eight classes. Although the inheritance hierarchy is complex,
using superclasses is much simpler than specifying the 109 individual fields for each
instruction.
Here is the code fragment used to define ``ADD32rr`` and multiple other
``ADD`` instructions:
.. code-block:: text
defm ADD : ArithBinOp_RF<0x00, 0x02, 0x04, "add", MRM0r, MRM0m,
X86add_flag, add, 1, 1, 1>;
The ``defm`` statement tells TableGen that ``ArithBinOp_RF`` is a
multiclass, which contains multiple concrete record definitions that inherit
from ``BinOpRR_RF``. That class, in turn, inherits from ``BinOpRR``, which
inherits from ``ITy`` and ``Sched``, and so forth. The fields are inherited
from all the parent classes; for example, ``isIndirectBranch`` is inherited
from the ``Instruction`` class.
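The inheritance chain described above can be sketched with a much smaller
example. The classes ``SketchInst`` and ``SketchBinOp``, the multiclass
``SketchArith``, and the ``SKADD`` prefix below are hypothetical stand-ins
invented for illustration; they are not the X86 definitions, which take many
more parameters and define many more fields.

.. code-block:: text

  // Hypothetical base class, standing in for Instruction.
  class SketchInst<string mnemonic> {
    string AsmString = mnemonic;
    bit isIndirectBranch = 0;
  }

  // Hypothetical intermediate class, standing in for BinOpRR.
  class SketchBinOp<bits<8> opcode, string mnemonic> : SketchInst<mnemonic> {
    bits<8> Opcode = opcode;
  }

  // Each def in the multiclass yields one concrete record per defm.
  multiclass SketchArith<bits<8> opcode, string mnemonic> {
    def rr : SketchBinOp<opcode, mnemonic>;
    def rm : SketchBinOp<opcode, mnemonic>;
  }

  // Defines the records SKADDrr and SKADDrm.
  defm SKADD : SketchArith<0x01, "add">;

In this sketch, both ``SKADDrr`` and ``SKADDrm`` carry the ``Opcode`` field
from ``SketchBinOp`` as well as the ``AsmString`` and ``isIndirectBranch``
fields from ``SketchInst``, in the same way that ``ADD32rr`` collects fields
from every class in its hierarchy.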