source: proto/pabloj/trunk/docs/notes @ 2981

Last change on this file since 2981 was 2981, checked in by ksherdy, 6 years ago

Grammar updates. Eliminated optional fw on stream types. Removed typedef.

File size: 10.1 KB
Line 
1================================================================================
2PabloJ Compiler Design
3================================================================================
4This document describes the state of the PabloJ compiler at Version 0.8.0.
5Original implementation is modelled on the Pablo Python compiler version 2615.
6
7Version 0.8.0
8
9Carry macro style support.
10Validates against XMLWF test suite.
11Validates against large XML file set (roads.gml, ...) .
12
13================================================================================
14PabloS Builtins
15================================================================================
16
17TODO
18
191. Add field width to pablo.operation builtin calls.
20        e.g. pablo.operation() -> pablos.operation<fw>(...)
21
22================================================================================
23PabloS Types
24================================================================================
25
261. Strings and integers are constants that must be initialized on declaration.
27
282. 2^k stream types are initialized to simd<k>::constant(0).
29
303. Mask(2^k, constantValue) can be used to initialized stream<2^k> primitives.
31
32================================================================================
33PabloS Visitors
34================================================================================
35
361. All PabloS conditional expression E are wrapped in BitBlock::any(E) calls.
37
38================================================================================
39PabloB Memory and Block-at-a-time Processing Model
40================================================================================
41
421. Pass by reference.
43
44   TODO - Think this out.
45
46   Call-out streams are re-initialized (reset to zero) on each block iteration.
47
48   
49
50================================================================================
51Pablo vs. PabloJ
52================================================================================
53
541. PabloJ requires explicit variable declarations
55
56e.g. bitblock s; s = simd::constant<1>(0);
57
58(2) Current PabloB grammar does not support boolean OR.
59
60Work around. Added PabloJCarryTest runtime check that returns a type BitBlock.
61Wrapped 'if' / 'while' conditions within bitblock::any() IDISA call.
62
63(3) TODO - Avoid carry object creation in Kernel w/o carries.
64
65(4) BitBlock EOF_mask --> const BitBlock mask
66
67(5) 
68
69PabloS - Design and implementation issues.
70
71(1) Grammar/Parser/Semantic Analyzer
72
73    The current Scatter grammar does not restrict many programs accepted by the parser.
74       
75        At this time the intention is that a semantic analysis phase will generate semantic errors.
76       
77    For example,
78   
79    (a) PabloS grammar permits top-level type declarations of all types.
80        However, the intention is to restrict to stream struct definitions.
81   
82    (b) PabloS grammar does not enforce def/use for stream structs prior to use in stream function defintions,
83        i.e. PabloS grammar does not prevent the definition of stream functions prior to stream structs.
84       
85    (c) PabloS grammar does not prevent the definition of new types within the body of
86        stream functions.   
87
88 (2) No semantic analysis, no type checking.
89       
90
91PabloB - Design and implementation issues.
92
93(1) 'static' initialization section vs. 'kernel' constructor
94
95(2) integer types uint8, uint16, uint32, uint64, uint128, uint256
96  - if integer types are restricted 'sections' of stream then no apparent need exists for signed integers
97
98C++ Code Generation Strategy
99
100(1) sythetic variable, scope, identifiers, tree compression, etc...
101
102PabloJ - Pablo language implementation.
103
104(1) Translates arbitrary length 2^k bit field width streams to an equivalent block-at-a-time target language implementation.
105(2) ...
106
107PabloJ Design Differences
108
1091. Programmer defined stream function variable declaration. PabloJ does not gather local variables.
110
111PabloJ Implementation Differences
112
1132. Split CarryIntro into separate passes:
114(i) Add carry macros to AST.
115
116Pablo 'if' - Note: The PabloJ translation of 'if' differs slightly from Pablo (Python).
117               
118E - Pablo Expression
119S - Pablo Statement     
120               
121Pablo --> C++/IDISA block-at-a-time target language implementation.
122   
123if ( E ) { S* } (else { S'* } )?
124           
125-->
126           
127if (bitblock::any(E | carryQ.any_carry(base,count)) { S* } else { carrrQ.enqueue_dequeue(base, count); }
128           
129Pablo 'while'
130           
131while ( E ) { S* }
132
133-->
134
135if (bitblock::any(E | carryQ.any_carry(base, count))
136{
137  S*
138  while(E)
139  {
140    S*
141  }
142}
143else
144{
145  carryQ.enqueue_dequeue(base, count);
146
147
148Pablo 'body'
149
150-->
151
152  body += carryQ.carryAdjust(base, count);
153       
154Missing Features
155
1561. Support for negative constant integers (-1).
157//2. Code indentation.   
158//3. Pablo '1' --> simd::constant<1>(0)
159//4. Add assert compiler.
160//5. Add SIMD register dump.
161//6. carryQ.CarryQ_Adjust
1627. Map substitutions.
163
164PabloJ Design Issues
165 
1661. Front-end / Back-end decoupling. Support for multiple target languages (C/C++).
167
168   TODO
169   
170   Replace all occurrences of X.getCPPCode() with a generic getCode call
171   that is configurable based on the target back-end. e.g. C macro syntax
172   versus C++ template syntax etc.
173 
174   Done.
175   
1762. Logical division of the compiler into phases to promote the decoupling of visitor passes.
177
178   (a) Analysis - Gather information and populate symbol table information.
179   
180   (b) AST modification
181   
182   (c) Target language instruction generation (unparsing).
183
184   Work in progress.
185   
1863. AST Visitor naming conventions.
187
188   Term non-AST modifying visitors XVisitor and AST modifying visitors XXFormers.
189   Done.
190     
191   Done.
192
1934. Verify function call expressions versus function call statements in C++ unparsing.
194   Review the the current scheme.
195         
196   Pending.     
197   
1985. Decision to analyze a list of AST nodes in the semantic analyzer.
199   Perhaps this should be skipped we should simply push processing up to the program node level.
200   This may reduce future code changes.
201   Plan to add a graph type to the compiler to organize the stream program generation.
202
2036. Decision to call AST transformer a 'semantic analyzer'.
204
2057. Add a symbol table and type checking to rid the compiler of the upper case naming
206   convention for stream function and stream structures.
207   
2088. Pass around a configuration object or have setters on the semantic analyzer object.     
209
2109. ApplicationConfigurationFactory generates an ApplicationConfiguration object. Not sure if this oject
211   should be a singleton.
212   
21310. Semantic analyzer. Add Scatter logging.   
214
21511. Clone AST or not. Clone.
216
21711. X2Lang design decision. enum/assert etc...
218
21912. do_block()/do_final_block() code generation implementation.
220
221        i.e. How do we represent/output both do_block and do_final_block function/code?
222
223        We attach nodes to the Pablo AST to produce do_block() versus do_final_block() code.
224
225        Case do_block()
226
227->
228
229        Case do_final_block()
230
231->
232       
233        Implementation options:
234       
235        (a) Parse source file twice to produce two separate AST.
236        (b) Clone complete AST. Apply AST transformation passes to both ASTs.
237        (c) Add both body and final block bodies to a single AST. We accomplish this through
238            the addition/introduce of both PabloS and PabloB grammars and translate
239            from PabloS to PabloB.
240       
241        An additional decision remains as to which point in Pablo AST transformation we clone ASTs.
242
243        Design choice taken (c): Translate from PabloS to PabloB.
244
24513. Carry mode (ci/co) and do_block/do_final_block are independent concepts.
246
247        In the current approach we restrict the carry mode dependent transformation of the AST to
248        the transformation to:
249       
250        Pablo while
251       
252->     
253       
254        Prior carry operations considered carry mode in the selection of carry dependent operations.
255        For example, final block carry producing operations are not 'strictly' required to write
256        carry out values.
257
25814. 3-Addr Form (SSA) AST permits the applications of simplification rules (local optimization)
259        of the AST.     
260
261Additional Notes
262
263Ci|Co
264-----
265F | F
266F | T
267T | F
268T | T
269       
270+ Carry Modes   
271
272Pablo language control structures.     
273
274Neither
275
276Ci (Carry In Only Mode)
277
278- While statements iteration i>1
279
280Co (Carry Out Only Mode)
281
282Ci/Co (Carry In / Carry Out Mode)
283
284- While statements (1st iteration)
285- If statements
286
287+ Potential Block Types
288
289Initial block(ScanToFirst)
290Intermediate block (Non-initial and Non-final block)
291Final block
292Initial/Final block
293
29415. Apply &~ EOF_mask to all pablo builtin operations.
295
296+ Final Block EOF_mask
297
298Builtins
299
300AdvanceThenScanThru(M,S)        --> AdvanceThenScanThru(M, (S &~ EOF_mask))
301AdvanceThenScanTo(M,S)          --> AdvanceThenScanThru(M,~(S &~ EOF_mask))
302
303inFile(E) --> S & EOF_mask  --> assert_0(inFile(E))
304
305i.e. look at only bits within the file extent
306
307i.e. tests only bits within the file the extent and then test none of this set of bits is marked
308
309atEOF(E)  --> S &~ EOF_mask --> assert_0(atEOF(E))
310
311i.e. look at only bits beyond the EOF
312 
313i.e. tests no bit beyond the file extents is marked
314
315Control Structures
316
317while (E) { S }
318
319-->
320
321if(bitblock::any( (E | carry_test(lower,count) &~ EOF_mask ) )
322{
323        S
324       
325        while (bitblock::any(simd_and(E, EOF_mask))
326        {
327                S
328        }
329}
330
331if (E) { S }
332
333-->
334
335if(bitblock::any( (E | carry_test(lower,count) &~ EOF_mask ) )
336{
337        S
338}
339
34016. PabloB grammar design issue (together with Carry InfoSet design)
341
342How do we add stream function level declaration / initialization code to the output?
343
344Options:
345
346(a) Add explicit carry info set information to the PabloB function definition representation.
347
348    No, this would burden the programmer at the PabloB level.
349    The idea is the we remove carry counting ... from the developer
350    but still we need carry count / advn carry count information
351    for code output.
352   
353(b) Hold carry info set information in a symbol table external
354        to both PabloB. Query this data structure during code generation.
355   
356(a) and (b) are similar in that both options remove explicit carry counting from the programmer.
357
358We currently translate PabloS to PabloB and PabloS code is used to populate the information
359contained in the symbol table used by PabloB.
360
361We could populate similar symbol table information based on PabloB code alone.
362
363More investigation is required here.
364
36517. Limitations of a macro driven carry handling approach.
366
367Fill in this section ...
368
36918. PabloB needs both bitwise OR as well as boolean OR.
370
37119. Translate to Dotty for AST diagrams.
372   
373
Note: See TracBrowser for help on using the repository browser.