source: proto/pabloj/trunk/docs/Notes @ 3282

Last change on this file since 3282 was 3192, checked in by ksherdy, 6 years ago

Sweeping changes to sync branch with trunk.

File size: 10.7 KB
Line 
1================================================================================
2PabloJ Compiler Design
3================================================================================
4This document describes the state of the PabloJ compiler at Version 0.8.0.
5
6Original implementation is modelled against Pablo Python compiler version 2615.
7parabix2_pablo.py version 2797.
8
9Version 0.8.0
10
11Carry macro style support.
12Validates against XMLWF test suite.
13Validates against large XML file set (roads.gml, ...) .
14
15================================================================================
16PabloS Builtins
17================================================================================
18
19TODO
20
211. Add field width to pablo.operation builtin calls.
22        e.g. pablo.operation() -> pablos.operation<fw>(...)
23
24--------------------------------------------------------------------------------
25IDISA Builtins Design
26--------------------------------------------------------------------------------
27
28PabloB syntax   --> PabloB Parser --> output (C,C++,LLVM)
29       
30                                                ^
31                                                |
32                                               
33                                IDISA Library Generator
34                       
35                                                ^
36                                                |
37                                                architecture (sse2, ...)
38       
39       
40        AST:
41       
42                IDISA Function Call
43                /               |               \
44        cid                     fw              args_list       
45       
46       
47IDISA enum and Signatures ...
48
49Signatures has-a build method
50
51Build method takes a builder
52
53Generator Visitor takes a builder argument     
54       
55================================================================================
56PabloS Types
57================================================================================
58
591. Strings and integers are constants that must be initialized on declaration.
60
612. 2^k stream types are initialized to simd<k>::constant(0).
62
633. Mask(2^k, constantValue) can be used to initialized stream<2^k> primitives.
64
65================================================================================
66PabloS Visitors
67================================================================================
68
691. All PabloS conditional expression E are wrapped in BitBlock::any(E) calls.
70
71================================================================================
72PabloB Memory and Block-at-a-time Processing Model
73================================================================================
74
751. Pass by reference.
76
77   TODO - Think this out.
78
79   Call-out streams are re-initialized (reset to zero) on each block iteration.
80
81   
82
83================================================================================
84Pablo vs. PabloJ
85================================================================================
86
871. PabloJ requires explicit variable declarations
88
89e.g. bitblock s; s = simd::constant<1>(0);
90
91(2) Current PabloB grammar does not support boolean OR.
92
93Work around. Added PabloJCarryTest runtime check that returns a type BitBlock.
94Wrapped 'if' / 'while' conditions within bitblock::any() IDISA call.
95
96(3) TODO - Avoid carry object creation in Kernel w/o carries.
97
98(4) BitBlock EOF_mask --> const BitBlock mask
99
100(5) 
101
102PabloS - Design and implementation issues.
103
104(1) Grammar/Parser/Semantic Analyzer
105
106    The current Scatter grammar does not restrict many programs accepted by the parser.
107       
108        At this time the intention is that a semantic analysis phase will generate semantic errors.
109       
110    For example,
111   
112    (a) PabloS grammar permits top-level type declarations of all types.
113        However, the intention is to restrict to stream struct definitions.
114   
115    (b) PabloS grammar does not enforce def/use for stream structs prior to use in stream function defintions,
116        i.e. PabloS grammar does not prevent the definition of stream functions prior to stream structs.
117       
118    (c) PabloS grammar does not prevent the definition of new types within the body of
119        stream functions.   
120
121 (2) No semantic analysis, no type checking.
122       
123
124PabloB - Design and implementation issues.
125
126(1) 'static' initialization section vs. 'kernel' constructor
127
128(2) integer types uint8, uint16, uint32, uint64, uint128, uint256
129  - if integer types are restricted 'sections' of stream then no apparent need exists for signed integers
130
131C++ Code Generation Strategy
132
133(1) sythetic variable, scope, identifiers, tree compression, etc...
134
135PabloJ - Pablo language implementation.
136
137(1) Translates arbitrary length 2^k bit field width streams to an equivalent block-at-a-time target language implementation.
138(2) ...
139
140PabloJ Design Differences
141
1421. Programmer defined stream function variable declaration. PabloJ does not gather local variables.
143
144PabloJ Implementation Differences
145
1462. Split CarryIntro into separate passes:
147(i) Add carry macros to AST.
148
149Pablo 'if' - Note: The PabloJ translation of 'if' differs slightly from Pablo (Python).
150               
151E - Pablo Expression
152S - Pablo Statement     
153               
154Pablo --> C++/IDISA block-at-a-time target language implementation.
155   
156if ( E ) { S* } (else { S'* } )?
157           
158-->
159           
160if (bitblock::any(E | carryQ.any_carry(base,count)) { S* } else { carrrQ.enqueue_dequeue(base, count); }
161           
162Pablo 'while'
163           
164while ( E ) { S* }
165
166-->
167
168if (bitblock::any(E | carryQ.any_carry(base, count))
169{
170  S*
171  while(E)
172  {
173    S*
174  }
175}
176else
177{
178  carryQ.enqueue_dequeue(base, count);
179
180
181Pablo 'body'
182
183-->
184
185  body += carryQ.carryAdjust(base, count);
186       
187Missing Features
188
1891. Support for negative constant integers (-1).
190//2. Code indentation.   
191//3. Pablo '1' --> simd::constant<1>(0)
192//4. Add assert compiler.
193//5. Add SIMD register dump.
194//6. carryQ.CarryQ_Adjust
1957. Map substitutions.
196
197PabloJ Design Issues
198 
1991. Front-end / Back-end decoupling. Support for multiple target languages (C/C++).
200
201   TODO
202   
203   Replace all occurrences of X.getCPPCode() with a generic getCode call
204   that is configurable based on the target back-end. e.g. C macro syntax
205   versus C++ template syntax etc.
206 
207   Done.
208   
2092. Logical division of the compiler into phases to promote the decoupling of visitor passes.
210
211   (a) Analysis - Gather information and populate symbol table information.
212   
213   (b) AST modification
214   
215   (c) Target language instruction generation (unparsing).
216
217   Work in progress.
218   
2193. AST Visitor naming conventions.
220
221   Term non-AST modifying visitors XVisitor and AST modifying visitors XXFormers.
222   Done.
223     
224   Done.
225
2264. Verify function call expressions versus function call statements in C++ unparsing.
227   Review the the current scheme.
228         
229   Pending.     
230   
2315. Decision to analyze a list of AST nodes in the semantic analyzer.
232   Perhaps this should be skipped we should simply push processing up to the program node level.
233   This may reduce future code changes.
234   Plan to add a graph type to the compiler to organize the stream program generation.
235
2366. Decision to call AST transformer a 'semantic analyzer'.
237
2387. Add a symbol table and type checking to rid the compiler of the upper case naming
239   convention for stream function and stream structures.
240   
2418. Pass around a configuration object or have setters on the semantic analyzer object.     
242
2439. ApplicationConfigurationFactory generates an ApplicationConfiguration object. Not sure if this oject
244   should be a singleton.
245   
24610. Semantic analyzer. Add Scatter logging.   
247
24811. Clone AST or not. Clone.
249
25011. X2Lang design decision. enum/assert etc...
251
25212. do_block()/do_final_block() code generation implementation.
253
254        i.e. How do we represent/output both do_block and do_final_block function/code?
255
256        We attach nodes to the Pablo AST to produce do_block() versus do_final_block() code.
257
258        Case do_block()
259
260->
261
262        Case do_final_block()
263
264->
265       
266        Implementation options:
267       
268        (a) Parse source file twice to produce two separate AST.
269        (b) Clone complete AST. Apply AST transformation passes to both ASTs.
270        (c) Add both body and final block bodies to a single AST. We accomplish this through
271            the addition/introduce of both PabloS and PabloB grammars and translate
272            from PabloS to PabloB.
273       
274        An additional decision remains as to which point in Pablo AST transformation we clone ASTs.
275
276        Design choice taken (c): Translate from PabloS to PabloB.
277
27813. Carry mode (ci/co) and do_block/do_final_block are independent concepts.
279
280        In the current approach we restrict the carry mode dependent transformation of the AST to
281        the transformation to:
282       
283        Pablo while
284       
285->     
286       
287        Prior carry operations considered carry mode in the selection of carry dependent operations.
288        For example, final block carry producing operations are not 'strictly' required to write
289        carry out values.
290
29114. 3-Addr Form (SSA) AST permits the applications of simplification rules (local optimization)
292        of the AST.     
293
294Additional Notes
295
296Ci|Co
297-----
298F | F
299F | T
300T | F
301T | T
302       
303+ Carry Modes   
304
305Pablo language control structures.     
306
307Neither
308
309Ci (Carry In Only Mode)
310
311- While statements iteration i>1
312
313Co (Carry Out Only Mode)
314
315Ci/Co (Carry In / Carry Out Mode)
316
317- While statements (1st iteration)
318- If statements
319
320+ Potential Block Types
321
322Initial block(ScanToFirst)
323Intermediate block (Non-initial and Non-final block)
324Final block
325Initial/Final block
326
32715. Apply &~ EOF_mask to all pablo builtin operations.
328
329+ Final Block EOF_mask
330
331Builtins
332
333AdvanceThenScanThru(M,S)        --> AdvanceThenScanThru(M, (S &~ EOF_mask))
334AdvanceThenScanTo(M,S)          --> AdvanceThenScanThru(M,~(S &~ EOF_mask))
335
336inFile(E) --> S & EOF_mask  --> assert_0(inFile(E))
337
338i.e. look at only bits within the file extent
339
340i.e. tests only bits within the file the extent and then test none of this set of bits is marked
341
342atEOF(E)  --> S &~ EOF_mask --> assert_0(atEOF(E))
343
344i.e. look at only bits beyond the EOF
345 
346i.e. tests no bit beyond the file extents is marked
347
348Control Structures
349
350while (E) { S }
351
352-->
353
354if(bitblock::any( (E | carry_test(lower,count) &~ EOF_mask ) )
355{
356        S
357       
358        while (bitblock::any(simd_and(E, EOF_mask))
359        {
360                S
361        }
362}
363
364if (E) { S }
365
366-->
367
368if(bitblock::any( (E | carry_test(lower,count) &~ EOF_mask ) )
369{
370        S
371}
372
37316. PabloB grammar design issue (together with Carry InfoSet design)
374
375How do we add stream function level declaration / initialization code to the output?
376
377Options:
378
379(a) Add explicit carry info set information to the PabloB function definition representation.
380
381    No, this would burden the programmer at the PabloB level.
382    The idea is the we remove carry counting ... from the developer
383    but still we need carry count / advn carry count information
384    for code output.
385   
386(b) Hold carry info set information in a symbol table external
387        to both PabloB. Query this data structure during code generation.
388   
389(a) and (b) are similar in that both options remove explicit carry counting from the programmer.
390
391We currently translate PabloS to PabloB and PabloS code is used to populate the information
392contained in the symbol table used by PabloB.
393
394We could populate similar symbol table information based on PabloB code alone.
395
396More investigation is required here.
397
39817. Limitations of a macro driven carry handling approach.
399
400Fill in this section ...
401
40218. PabloB needs both bitwise OR as well as boolean OR.
403
40419. Translate to Dotty for AST diagrams.
405   
406
Note: See TracBrowser for help on using the repository browser.