@@ -6,26 +6,193 @@ our $VERSION = '3.58';

=head1 NAME

- ExtUtils::ParseXS::Node - Classes for nodes of an ExtUtils::ParseXS AST
+ ExtUtils::ParseXS::Node - Classes for nodes of an Abstract Syntax Tree

=head1 SYNOPSIS

- XXX TBC
+     # Create a node to represent the Foo part of an XSUB; then
+     # top-down parse it into a subtree; then top-down emit the
+     # contents of the subtree as C code.
+
+     my $foo = ExtUtils::ParseXS::Node::Foo->new();
+     $foo->parse(...)
+         or die;
+     $foo->as_code(...);

=head1 DESCRIPTION

- XXX Sept 2024: this is Work In Progress. This API is currently private and
- subject to change. Most of ParseXS doesn't use an AST, and instead
- maintains just enough state to emit code as it parses. This module
- represents the start of an effort to make it use an AST instead.
+ This API is currently private and subject to change.
+
+ Note that as of May 2025, this is a Work In Progress. An AST is created
+ for each parsed XSUB, but those nodes aren't yet linked into a
+ higher-level tree representing the whole XS file.

- An C<ExtUtils::ParseXS::Node> class, and its various subclasses, hold the
+ The C<ExtUtils::ParseXS::Node> class, and its various subclasses, hold the
state for the nodes of an Abstract Syntax Tree (AST), which represents the
parsed state of an XS file.

- Each node is basically a hash of fields. Which field names are legal
- varies by the node type. The hash keys and values can be accessed
- directly: there are no getter/setter methods.
+ Each node is a hash of fields. Which field names are legal varies by the
+ node type. The hash keys and values can be accessed directly: there are no
+ getter/setter methods.
+
+ Each node may have a C<kids> field which points to an array of all the
+ children of that node: this is what provides the tree structure. In
+ addition, some of those kids may also have direct links from fields for
+ quick access. For example, the C<xsub_decl> child object of an C<xsub>
+ object can be accessed in either of these ways:
+
+     $xsub_object->{kids}[0]
+     $xsub_object->{decl}
+
+ Most object-valued node fields within a tree point only to their direct
+ children; however, both C<INPUT_line> and C<OUTPUT_line> have an
+ C<ioparam> field which points to the C<IO_Param> object associated with
+ this line, which is located elsewhere in the tree.
+
+ The various C<foo_part> nodes divide the parsing of the main body of the
+ XSUB into sections where different sets of keywords are allowable, and
+ where various bits of code can be conveniently emitted.
+
+ =head2 Methods
+
+ There are two main methods, in addition to new(), which are present in all
+ subclasses. First, parse() consumes lines from the source to satisfy the
+ construct being parsed. It may itself create objects of lower-level
+ constructs and call parse() on them. For example, C<Node::xbody::parse()>
+ may create a C<Node::input_part> node and call parse() on it, which will
+ create C<Node::INPUT> or C<Node::PREINIT> nodes as appropriate, and so on.
+
+ Secondly, as_code() descends its sub-tree, outputting the tree as C code.
+
+ Note that parsing and code-generation are done as two separate phases;
+ parse() should only build a tree and never emit code.
+
+ In addition to C<$self>, both these methods are always provided with
+ these three parameters:
+
+ =over
+
+ =item C<$pxs>
+
+ An C<ExtUtils::ParseXS> object which contains the overall processing
+ state. In particular, it has warning and croaking methods, and holds the
+ lines read in from the source file for the current paragraph.
+
+ =item C<$xsub>
+
+ The current C<ExtUtils::ParseXS::Node::xsub> node being processed.
+
+ =item C<$xbody>
+
+ The current C<ExtUtils::ParseXS::Node::xbody> node being processed. Note
+ that in the presence of a C<CASE> keyword, an XSUB can have multiple
+ bodies.
+
+ =back
+
+ The parse() and as_code() methods for some subclasses may have additional
+ parameters.
+
+ Some subclasses may have additional helper methods.
+
+ =head2 Class Hierarchy
+
+ C<Node> and its sub-classes form the following inheritance hierarchy.
+ Various abstract classes are used by concrete subclasses where the
+ processing and/or fields are similar: for example, C<CODE>, C<PPCODE> etc
+ all consume a block of uninterpreted lines from the source file until the
+ next keyword, and emit that code, possibly wrapped in C<#line> directives.
+ This common behaviour is provided by the C<codeblock> class.
+
+     Node
+         xsub
+         xsub_decl
+         ReturnType
+         Param
+             IO_Param
+         Params
+         xbody
+         input_part
+         init_part
+         code_part
+         output_part
+         cleanup_part
+         autocall
+         oneline
+             NOT_IMPLEMENTED_YET
+             CASE
+             enable
+                 EXPORT_XSUB_SYMBOLS
+                 PROTOTYPES
+                 SCOPE
+                 VERSIONCHECK
+         multiline
+             multiline_merged
+                 C_ARGS
+                 INTERFACE
+                 INTERFACE_MACRO
+                 OVERLOAD
+             ATTRS
+             PROTOTYPE
+             codeblock
+                 CODE
+                 CLEANUP
+                 INIT
+                 POSTCALL
+                 PPCODE
+                 PREINIT
+         keylines
+             ALIAS
+             INPUT
+             OUTPUT
+         keyline
+             ALIAS_line
+             INPUT_line
+             OUTPUT_line
+
+
+ =head2 Abstract Syntax Tree structure
+
+ A typical XSUB might compile to a tree with a structure similar to the
+ following. Note that this is unrelated to the inheritance hierarchy
+ shown above.
+
+     xsub
+         xsub_decl
+             ReturnType
+             Params
+                 Param
+                 Param
+                 ...
+         CASE   # for when a CASE keyword being present implies multiple
+                # bodies; otherwise, just a bare xbody node.
+             xbody
+                 # per-body copy of declaration Params, augmented by
+                 # data from INPUT and OUTPUT sections
+                 Params
+                     IO_Param
+                     IO_Param
+                     ...
+                 input_part
+                     INPUT
+                         INPUT_line
+                         INPUT_line
+                         ...
+                     PREINIT
+                 init_part
+                     INIT
+                 code_part
+                     CODE
+                 output_part
+                     OUTPUT
+                         OUTPUT_line
+                         OUTPUT_line
+                         ...
+                     POSTCALL
+                 cleanup_part
+                     CLEANUP
+         CASE
+             xbody
+                 ...

=cut
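The pattern this POD describes (nodes as plain hashes, a `kids` array plus direct-link fields, and separate parse() and as_code() phases) can be illustrated with a small standalone sketch. Everything under the `Toy::` namespace below is hypothetical and is not part of ExtUtils::ParseXS. As a further simplification, the toy parse() takes a reference to an array of lines, whereas the documented methods receive `$pxs`, `$xsub` and `$xbody`.

```perl
use strict;
use warnings;

# Hypothetical toy classes for illustration only; this is NOT the real
# ExtUtils::ParseXS::Node API.
package Toy::Node;

# A node is just a blessed hash of fields; no getters or setters.
sub new {
    my ($class, %fields) = @_;
    return bless { %fields }, $class;
}

# Default parse(): succeed without consuming any lines.
sub parse { 1 }

# Default as_code(): concatenate the code returned by each kid, in order.
sub as_code {
    my ($self, @ctx) = @_;
    return join '', map { $_->as_code(@ctx) } @{ $self->{kids} || [] };
}

package Toy::Node::decl;
our @ISA = ('Toy::Node');

# Consume one line of the form "TYPE NAME" from the shared line buffer.
sub parse {
    my ($self, $lines) = @_;
    my $line = shift @$lines;
    return 0 unless defined $line && $line =~ /^(\w+)\s+(\w+)$/;
    @$self{qw(type name)} = ($1, $2);
    return 1;
}

sub as_code {
    my ($self) = @_;
    return "$self->{type} $self->{name}(void)\n";
}

package Toy::Node::body;
our @ISA = ('Toy::Node');

# Consume all remaining lines as an uninterpreted block of code.
sub parse {
    my ($self, $lines) = @_;
    $self->{lines} = [ splice @$lines ];
    return 1;
}

sub as_code {
    my ($self) = @_;
    return join '', "{\n", map("    $_\n", @{ $self->{lines} }), "}\n";
}

package Toy::Node::func;
our @ISA = ('Toy::Node');

# Phase 1: build the subtree only; nothing is emitted here.
sub parse {
    my ($self, $lines) = @_;
    my $decl = Toy::Node::decl->new();
    $decl->parse($lines) or return 0;
    my $body = Toy::Node::body->new();
    $body->parse($lines) or return 0;
    $self->{kids} = [ $decl, $body ];   # tree structure
    $self->{decl} = $decl;              # direct link for quick access
    return 1;
}

package main;

my @lines = ('int answer', 'return 42;');
my $func  = Toy::Node::func->new();
$func->parse(\@lines) or die "parse failed";

# Phase 2: walk the finished tree and produce the output text.
my $code = $func->as_code();
print $code;
```

In this sketch, `$func->{kids}[0]` and `$func->{decl}` refer to the same object, mirroring the two access paths the POD gives for `xsub_decl`. The toy as_code() returns a string rather than printing, which is a choice made here to keep the example easy to test; the POD does not say how the real methods deliver their output.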