aha
flowcharts and first source
Click Flowcharts to see Machine Forth Source code
Use the BACK funtion in your browser to return to a Flowchart
\ 01/14/01 aha \ \ the aha compiler is not a conventional Forth interpreter/compiler \ it compiles tokenized Forth source \ most system use Source as strings packed together as one large string \ it is a a small change to counted packed strings for source \ also a small change to counted packed threaded strings and pointers for source \ one more change to 4 types of records \ record bits are most significant bits in each record (19,18,17,16 max) \ they are tested by shifting left into carry leaving \ left justfied counters for binary records \ shift back right to get right justified counters, CFA, or function tokens \ function tokens are shifted right and masked out \ most significant bits in record word \ 1ro0 binary (opcode, data) ( 1 word + 4Nx 5 bit opcodes, or Nx 20 bit words) \ sequential, N records in source produce N records in object \ repeated, 1 record in source produces N records in object \ 011 defined word or name being defined as *CFA, pointer to CFA field \ in source is token, link, cfa+flag, count, packed string (5 words) \ pointers to cfa ( 1 word, flag bits ) \ 010 function tokens 60 ( 6 bits, 3/ word + 2 flag bits, inline binary tokens) \ 00x comments, token+count, packed strings follow \ \ binary record bit 19 true, bit 18 repeat flag, bit 17 opcode flag, bit 16 zero \ bit 0-15 16 bit count for copy or repeat \ function token records 00t22222t11111t00000 3 6 bit tokens and \ literal function tokens pick up inlined 20 bit tokens \ \ funtion token opcode tokens can be packed 3 per word with 36 other tokens \ when opcode tokens can be packed 4 per word they become binary opcode records \ 60 function tokens, 0 can't be packed into slot 2, \ 0 function token needs non zero token in higher slot \ 0-31, opcode tokens match opcodes in 0-31 \ function token table (this is not the actual sequence just an example) \ \ 0 jump \ opcode compilers \ 1 t0 \ 2 c0 \ 3 call \ 4 ; \ 5 # \ 6 @a+ \ 7 @r+ \ 8 @a \ 9 !a+ \ 10 !r+ \ 11 !a \ 12 com \ 13 and \ 14 xor \ 15 + \ 16 2* \ 17 2/ \ 18 +* \ 19 a \ 20 a! \ 21 dup \ 22 drop \ 23 over \ 24 >r \ 25 r> \ 26 nop \ 27 dup#! (last opcode) \ 28 x \ 29 x \ 30 x \ 31 x \ 32 cells \ immediate nop \ 33 cell+ \ 34 spaces \ 35 cr \ 36 @ \ traditional names \ 37 ! \ 38 -1 \ 39 begin \ 40 -; \ explict tail recursion \ 41 # \ displayed as number only, followed by inline binary record \ 42 : \ defining word \ 43 if \ 44 -if \ 45 skip \ 46 then \ 47 else \ 48 while \ 49 -while \ 50 repeat \ 51 -until \ 52 until \ 53 ljump \ 54 swap \ 55 rot \ 56 or \ 57 0 \ 58 [ \ switch to temp compile area \ 59 ] \ switch back, save a, execute temp area, restore a \ 60 postpone \ 61 immediate \ 62 hide \ 63 null \ source representation \ \ conventional string representation Forth example 424 bits as bytes \ 1060 bits with ANSI Forth mandated F21 characters \ : myword ( -- ) over com and xor -if oldword then ; \ \ as tokens \ def:,link,cfa,3,my,wo,rd,t-(,2,--, ),t-over-com-and,t-xor--if,oldword,t-then-; \ 300 bits \ def:,link,cfa,3,my,wo,rd,t-(,2,--, ),b1,over-com-and-xor,t-if,oldword,t-then-; \ 320 bits but faster with 4 less function tokens being executed \ \ ~3 times smaller source, 100x faster compile, ~10x compression on aha source \ simple aha compile just compiles object code from source \ for the source code browser/debugger/compiler \ name dictionary gets moved from source to object, discarded from source \ name dictionary pointer tokens are discarded from source \ binary record, opcode and data tags are kept, binary records are discarded \ aha compile only, interprets by temp compiling and executing [ 3 calc ] \ proposed first cut memory map \ \ 0 boot code, 0 interrupt routine \ 80 temp compile \ 100 machine Forth compile table \ 200 aha object name/code dictionary \ 300 source tokens, source dictionary \ 400 2nd 1k page DRAM, 0-3FFF 1st 16k page \ later \ compiling builds second dictionary in object code \ obj name dictionary separate from code diction for dram speed \ cfa of aha words are unknown in application source ? \ must use duplicate uninitialized entry searched \ searched at compile time? \ first cut of aha compiler, lots of comments \ the simple compiler does not further compress source or link source and object \ compiler is called to process records pointed to return address \ compilation address pointer *comp in A in all compiler words \ except in ones accessed via the funtion token jump table \ ~ 50 words of compiled code for aha compiler \ 64 words for the function table \ ~ 100 words of compiled code for the code pointed to by the function table \ ~ 214 total words of object code for aha \ <256 total words of compiled code \ The use of two pointers, *record in R and *compile in A \ requires >r at the start of each word and r> at the end before the return\ compile any number of binary records : binary ( rec *rec -- *rec ) \ check for repeated binary bit 18 true >r -if 2* 2* \ skip opocode flag in bit 17, reset carry bit 16=false dup r@+ >r \ get record r: *rec record s: count count begin r> dup !a+ >r \ copy one record over nop nop + \ left justified counter, 82ns inner loop /4 -until r> drop \ end of repeated record else binary block else 2* 2* dup \ skip opocode flag in bit 17, reset carry bit 16=false begin @r+ !a+ over nop + -until \ move one record, left justified counter then r> ;
\ compile named word : CFA, ( rec *rec -- *rec ) \ string flag in carry, rec=*CFA*8 16 bit *CFA >r 2* 2/ 2/ 2/ 2/ a >r a! @a \ fetch CFA+ r: *rec *comp s: CFA+ a: *CFA if 2* \ immediate flag in carry s: 2CFA+ -if [ $ 3 + ] # >r 2* 2/ 2/ >r ; \ call to cfa for immediate else 2/ call \ machine Forth compile homepage subroutine call then \ if cfa 0 set it else drop r> dup !a a! \ r: *rec s: a: *comp then r> a! r> ; \ restore A s: *rec a:*comp \ skip comment records : skipr ( rec *rec -- *rec) \ T has been left shifted 3 times but carry is zero >r 2/ 2/ 2/ \ convert record to count r> nop nop nop \ s: count *record + ; \ bump record pointer s: *rec a: *comp
\ compile function token from token table : token ( rec *rec -- *rec) \ process 1 to 3 function tokens >r 2/ 2/ \ token table 00t22222t11111t00000 begin while \ while there are still tokens dup 3F # and \ s: token t00000 [ ttable ] # \ s: token t0 ttable ( 100h whatever) + [ $ 3 + ] # \ return address >r >r ; nop nop nop \ index call into table s: token 2/ 2/ 2/ 2/ 2/ 2/ repeat r> ; \ s: *rec
\ compile records : crecord ( rec *rec -- *rec ) \ binary, string, comment, function tokens >r -if 2* \ msb true for binary types data or opcode tokens r> binary >r \ else non-binary else 2* \ bit 18 true for string types -if 2* \ bit 17 true for cfa -if r> CFA, >r \ r: *rec s: else r> skipr >r then \ else non-string token table else r> token >r then then r> ;
\ simple aha compiler : compiler ( -- ) \ r: rec* a: comp* begin @r+ while 2* \ continue compiling while not end record zero r> crecord >r \ remove *rec from return stack before call repeat r> drop ; \ a: comp* \ aha compiler and its own source should fit in 1K words
aha is a system for F21 that uses tokenized source to compress source and speed compilation. Components include the initial simple compiler. A second compiler with further compress source code and link it to object code for source level debugging. Combining small size with microsecond compile times will make source level browsing with instant compiles possible.
The function tokens represent the Machine Forth comiler that
compiles to about 100 (decimal) words of object code. The function
token table requires another 64 words of memory. The simple
aha compiler compiles to about 50 words of object
and this version can compile a maximum of 50 million source
words per second for repeated binary opcode token records.
31 Machine Forth words used to define aha compiler
: >r -if 2* dup r@+ begin r> !a+ over nop + -until drop else then ; 2/ a a! @a if [ $ ] # call !a while and repeat6 names defined in the aha compiler source, all but compiler can be hidden
binary CFA, token skipr crecord compiler
Construction in progress 02/06/01.