source: proto/pabloj/trunk/docs/notes @ 3004

Last change on this file since 3004 was 3004, checked in by ksherdy, 6 years ago

Implemented framework skeleton to support PabloB IDISA function call translation to C,C++,LLVM.

File size: 10.7 KB
Line 
1================================================================================
2PabloJ Compiler Design
3================================================================================
4This document describes the state of the PabloJ compiler at Version 0.8.0.
5Original implementation is modelled on the Pablo Python compiler version 2615.
6
7Version 0.8.0
8
9Carry macro style support.
10Validates against XMLWF test suite.
11Validates against large XML file set (roads.gml, ...) .
12
13================================================================================
14PabloS Builtins
15================================================================================
16
17TODO
18
191. Add field width to pablo.operation builtin calls.
20        e.g. pablo.operation() -> pablos.operation<fw>(...)
21
22--------------------------------------------------------------------------------
23IDISA Builtins Design
24--------------------------------------------------------------------------------
25
26PabloB syntax   --> PabloB Parser --> output (C,C++,LLVM)
27       
28                                                ^
29                                                |
30                                               
31                                IDISA Library Generator
32                       
33                                                ^
34                                                |
35                                                architecture (sse2, ...)
36       
37       
38        AST:
39       
40                IDISA Function Call
41                /               |               \
42        cid                     fw              args_list       
43       
44       
45IDISA enum and Signatures ...
46
47Signatures has-a build method
48
49Build method takes a builder
50
51Generator Visitor takes a builder argument     
52       
53================================================================================
54PabloS Types
55================================================================================
56
571. Strings and integers are constants that must be initialized on declaration.
58
592. 2^k stream types are initialized to simd<k>::constant(0).
60
613. Mask(2^k, constantValue) can be used to initialized stream<2^k> primitives.
62
63================================================================================
64PabloS Visitors
65================================================================================
66
671. All PabloS conditional expression E are wrapped in BitBlock::any(E) calls.
68
69================================================================================
70PabloB Memory and Block-at-a-time Processing Model
71================================================================================
72
731. Pass by reference.
74
75   TODO - Think this out.
76
77   Call-out streams are re-initialized (reset to zero) on each block iteration.
78
79   
80
81================================================================================
82Pablo vs. PabloJ
83================================================================================
84
851. PabloJ requires explicit variable declarations
86
87e.g. bitblock s; s = simd::constant<1>(0);
88
89(2) Current PabloB grammar does not support boolean OR.
90
91Work around. Added PabloJCarryTest runtime check that returns a type BitBlock.
92Wrapped 'if' / 'while' conditions within bitblock::any() IDISA call.
93
94(3) TODO - Avoid carry object creation in Kernel w/o carries.
95
96(4) BitBlock EOF_mask --> const BitBlock mask
97
98(5) 
99
100PabloS - Design and implementation issues.
101
102(1) Grammar/Parser/Semantic Analyzer
103
104    The current Scatter grammar does not restrict many programs accepted by the parser.
105       
106        At this time the intention is that a semantic analysis phase will generate semantic errors.
107       
108    For example,
109   
110    (a) PabloS grammar permits top-level type declarations of all types.
111        However, the intention is to restrict to stream struct definitions.
112   
113    (b) PabloS grammar does not enforce def/use for stream structs prior to use in stream function defintions,
114        i.e. PabloS grammar does not prevent the definition of stream functions prior to stream structs.
115       
116    (c) PabloS grammar does not prevent the definition of new types within the body of
117        stream functions.   
118
119 (2) No semantic analysis, no type checking.
120       
121
122PabloB - Design and implementation issues.
123
124(1) 'static' initialization section vs. 'kernel' constructor
125
126(2) integer types uint8, uint16, uint32, uint64, uint128, uint256
127  - if integer types are restricted 'sections' of stream then no apparent need exists for signed integers
128
129C++ Code Generation Strategy
130
131(1) sythetic variable, scope, identifiers, tree compression, etc...
132
133PabloJ - Pablo language implementation.
134
135(1) Translates arbitrary length 2^k bit field width streams to an equivalent block-at-a-time target language implementation.
136(2) ...
137
138PabloJ Design Differences
139
1401. Programmer defined stream function variable declaration. PabloJ does not gather local variables.
141
142PabloJ Implementation Differences
143
1442. Split CarryIntro into separate passes:
145(i) Add carry macros to AST.
146
147Pablo 'if' - Note: The PabloJ translation of 'if' differs slightly from Pablo (Python).
148               
149E - Pablo Expression
150S - Pablo Statement     
151               
152Pablo --> C++/IDISA block-at-a-time target language implementation.
153   
154if ( E ) { S* } (else { S'* } )?
155           
156-->
157           
158if (bitblock::any(E | carryQ.any_carry(base,count)) { S* } else { carrrQ.enqueue_dequeue(base, count); }
159           
160Pablo 'while'
161           
162while ( E ) { S* }
163
164-->
165
166if (bitblock::any(E | carryQ.any_carry(base, count))
167{
168  S*
169  while(E)
170  {
171    S*
172  }
173}
174else
175{
176  carryQ.enqueue_dequeue(base, count);
177
178
179Pablo 'body'
180
181-->
182
183  body += carryQ.carryAdjust(base, count);
184       
185Missing Features
186
1871. Support for negative constant integers (-1).
188//2. Code indentation.   
189//3. Pablo '1' --> simd::constant<1>(0)
190//4. Add assert compiler.
191//5. Add SIMD register dump.
192//6. carryQ.CarryQ_Adjust
1937. Map substitutions.
194
195PabloJ Design Issues
196 
1971. Front-end / Back-end decoupling. Support for multiple target languages (C/C++).
198
199   TODO
200   
201   Replace all occurrences of X.getCPPCode() with a generic getCode call
202   that is configurable based on the target back-end. e.g. C macro syntax
203   versus C++ template syntax etc.
204 
205   Done.
206   
2072. Logical division of the compiler into phases to promote the decoupling of visitor passes.
208
209   (a) Analysis - Gather information and populate symbol table information.
210   
211   (b) AST modification
212   
213   (c) Target language instruction generation (unparsing).
214
215   Work in progress.
216   
2173. AST Visitor naming conventions.
218
219   Term non-AST modifying visitors XVisitor and AST modifying visitors XXFormers.
220   Done.
221     
222   Done.
223
2244. Verify function call expressions versus function call statements in C++ unparsing.
225   Review the the current scheme.
226         
227   Pending.     
228   
2295. Decision to analyze a list of AST nodes in the semantic analyzer.
230   Perhaps this should be skipped we should simply push processing up to the program node level.
231   This may reduce future code changes.
232   Plan to add a graph type to the compiler to organize the stream program generation.
233
2346. Decision to call AST transformer a 'semantic analyzer'.
235
2367. Add a symbol table and type checking to rid the compiler of the upper case naming
237   convention for stream function and stream structures.
238   
2398. Pass around a configuration object or have setters on the semantic analyzer object.     
240
2419. ApplicationConfigurationFactory generates an ApplicationConfiguration object. Not sure if this oject
242   should be a singleton.
243   
24410. Semantic analyzer. Add Scatter logging.   
245
24611. Clone AST or not. Clone.
247
24811. X2Lang design decision. enum/assert etc...
249
25012. do_block()/do_final_block() code generation implementation.
251
252        i.e. How do we represent/output both do_block and do_final_block function/code?
253
254        We attach nodes to the Pablo AST to produce do_block() versus do_final_block() code.
255
256        Case do_block()
257
258->
259
260        Case do_final_block()
261
262->
263       
264        Implementation options:
265       
266        (a) Parse source file twice to produce two separate AST.
267        (b) Clone complete AST. Apply AST transformation passes to both ASTs.
268        (c) Add both body and final block bodies to a single AST. We accomplish this through
269            the addition/introduce of both PabloS and PabloB grammars and translate
270            from PabloS to PabloB.
271       
272        An additional decision remains as to which point in Pablo AST transformation we clone ASTs.
273
274        Design choice taken (c): Translate from PabloS to PabloB.
275
27613. Carry mode (ci/co) and do_block/do_final_block are independent concepts.
277
278        In the current approach we restrict the carry mode dependent transformation of the AST to
279        the transformation to:
280       
281        Pablo while
282       
283->     
284       
285        Prior carry operations considered carry mode in the selection of carry dependent operations.
286        For example, final block carry producing operations are not 'strictly' required to write
287        carry out values.
288
28914. 3-Addr Form (SSA) AST permits the applications of simplification rules (local optimization)
290        of the AST.     
291
292Additional Notes
293
294Ci|Co
295-----
296F | F
297F | T
298T | F
299T | T
300       
301+ Carry Modes   
302
303Pablo language control structures.     
304
305Neither
306
307Ci (Carry In Only Mode)
308
309- While statements iteration i>1
310
311Co (Carry Out Only Mode)
312
313Ci/Co (Carry In / Carry Out Mode)
314
315- While statements (1st iteration)
316- If statements
317
318+ Potential Block Types
319
320Initial block(ScanToFirst)
321Intermediate block (Non-initial and Non-final block)
322Final block
323Initial/Final block
324
32515. Apply &~ EOF_mask to all pablo builtin operations.
326
327+ Final Block EOF_mask
328
329Builtins
330
331AdvanceThenScanThru(M,S)        --> AdvanceThenScanThru(M, (S &~ EOF_mask))
332AdvanceThenScanTo(M,S)          --> AdvanceThenScanThru(M,~(S &~ EOF_mask))
333
334inFile(E) --> S & EOF_mask  --> assert_0(inFile(E))
335
336i.e. look at only bits within the file extent
337
338i.e. tests only bits within the file the extent and then test none of this set of bits is marked
339
340atEOF(E)  --> S &~ EOF_mask --> assert_0(atEOF(E))
341
342i.e. look at only bits beyond the EOF
343 
344i.e. tests no bit beyond the file extents is marked
345
346Control Structures
347
348while (E) { S }
349
350-->
351
352if(bitblock::any( (E | carry_test(lower,count) &~ EOF_mask ) )
353{
354        S
355       
356        while (bitblock::any(simd_and(E, EOF_mask))
357        {
358                S
359        }
360}
361
362if (E) { S }
363
364-->
365
366if(bitblock::any( (E | carry_test(lower,count) &~ EOF_mask ) )
367{
368        S
369}
370
37116. PabloB grammar design issue (together with Carry InfoSet design)
372
373How do we add stream function level declaration / initialization code to the output?
374
375Options:
376
377(a) Add explicit carry info set information to the PabloB function definition representation.
378
379    No, this would burden the programmer at the PabloB level.
380    The idea is the we remove carry counting ... from the developer
381    but still we need carry count / advn carry count information
382    for code output.
383   
384(b) Hold carry info set information in a symbol table external
385        to both PabloB. Query this data structure during code generation.
386   
387(a) and (b) are similar in that both options remove explicit carry counting from the programmer.
388
389We currently translate PabloS to PabloB and PabloS code is used to populate the information
390contained in the symbol table used by PabloB.
391
392We could populate similar symbol table information based on PabloB code alone.
393
394More investigation is required here.
395
39617. Limitations of a macro driven carry handling approach.
397
398Fill in this section ...
399
40018. PabloB needs both bitwise OR as well as boolean OR.
401
40219. Translate to Dotty for AST diagrams.
403   
404
Note: See TracBrowser for help on using the repository browser.