This appendix gives some insight in how the compiler works. A reasonable knowledge of the PIC architecture is assumed.
The JAL compiler is organized in a number of phases:
The parse phase reads the input files, checks the syntax and semantics, and produces and internal abstract syntax tree which represents the source. Nearly all user error messages are generated in this phase. Two optimizations takes place: constant expressions are evaluated and if and while statements with a constant condition are replaced appropriately.
The syntax tree generated by the parse phase and worked upon by the subsequent phases is kept in memory. The nodes of the tree have a fixed size (~ 40 bytes). For each node the source location which caused the node to be generated is also stored (outside the node). This internal tree is the main reason the compiler uses large amounts of memory (a few megabytes) to compile even a medium size source.
The first optimize phase examines the tree and tries to transform it into a semantically equivalent tree which will generate better (smaller, faster etc.) code. Variables, statements and routines which are not used are removed. Tail calls are replaced by jumps to save stack entries. Some expressions are replaced by simpler ones (e.g. multiply becomes shift). Unused code and variables are removed. The tree is also simplified to reduce its size and to make the work of subsequent phases easier.
The squash phase replaces constructs in the tree which can not easily be translated to PIC instructions by semantically equivalent constructs which can, e.g. multiply and most shifts are replaced by calls to the run time library. This run time library is build into the compiler: it has nothing to do with the libraries which are provided with the compiler.
The second optimize phase performs the same optimizations as the first one, except for the few optimizations which would invalidate the work of the squash phase.
The register allocation phase scans the tree and assigns an address to all variables which do not yet have and address and need one. Variables which are never used will not get an address because they have been removed by the optimize phases.
The code generation phase replaces all constructs in the syntax tree with assembler instructions.
The assembly phase traverses the syntax tree and generates the assembler and hex files which are written to disk.
After the assembly phase the checks for the number of registers, the code size and the number of stack levels are done. When one of these checks fails the assembler file will still be a available to the user to see why his program uses so much resources, but the hex file will not be generated.
The optional simulate phase simulates the machine instructions in the internal copy of the hex file and checks the asserts.
The compiler code contains a large amount of consistency checks. When such a check fails the compiler will issue an error message and abort. An example of such a check is the assembly phase which verifies that each assembled machine instruction is available on the target architecture. The -386 option makes the compiler run somewhat faster by disabling most checks.
During register allocation first all bit addresses are assigned, next all byte addresses. Bit and byte registers are hence not reused optimally when one part of the call tree uses lots of bits while the other part does not.
Within both categories addresses are assigned using the fixed stack algorithm: the compiler gives each variable the highest address it ever has in the call tree. This gives the illusion of a real stack without the expense. The down side is that the jal compiler can only compile a full program and that recursion is forbidden.
Byte expressions are evaluated in the w register. Operators which have no corresponding machine instruction (e.g. multiply) are replaced in the squash phase with a call to the run-time library. When necessary the squash phase will rearrange complex expressions (including function calls) and insert temporary variables.
A byte assignment first evaluates the expression in the w register and then moves the value to the appropriate file register.
The byte expressions and assignments in the next table are recognized as special cases.
| jal source fragment | assembler instructions | 
| x = 0 | clrf x | 
| x = x + 1 | incf x, f | 
| x = x - 1 | decf x, f | 
| x = x << 1 | clrc; rlf x, f | 
| x = x >> 1 | clrc; rrf x, f | 
| x << 4 | swapf y, w; andlw 0xF0 | 
| x >> 4 | swapf y, w; andlw 0x0F | 
Bit assignments are translated into conditional bit set and clears. The first construct is used when either the target is volatile or the expression contains the target, otherwise the second (more compact) construct is used.
      target = true
   else
      target = false
   end if
   
   x = false
   if expression then
      target = true
   end if
Bit expressions always appear as conditions. A bit expression is translated into a set of conditional skips and jumps. When possible a skip is used, otherwise the inverted skip over a jump is used. The code generated for an exclusive or operator will evaluate the second operand twice.
The following trivial bit assignments are recognized as special cases.
| jal source fragment | assembler instruction | 
| b = true | bsf 31, 2 | 
| b = false | bcf 31, 2 | 
A routine which has the pragma jump_table is coded in the last memory page. The jump table for volatile parameters is also put there. When the jump table and the routines do not fit in the last page an error message is generated. When a jump table or a routine with pragma jump table is present the page select bits.
Without interrupts the code which is generated starts at address 0. When one or more interrupt routines are present the code starts with a jump around the interrupt routines and the concatenated interrupt routines start at address 2.
pseudo variables and volatile parameters
A pseudo variable has a 'get function or a 'put procedure or both. The use of a pseudo variable generates the appropriate function or procedure call.
For each variable which is ever passed as a volatile parameter two jump table entries are generated, one for get and one for put. The index into the jump table is passed as the actual parameter. The use of a volatile parameter generates a call to the jump table with the passed index. The value is passed to or from the get and put procedures using a fixed (global) variable.
The compiler does not analyze the passing and use of volatile parameters. Hence it must assume that each use of a volatile parameter requires the maximum amount of stack used by any volatile put or get routine. This implies that use of a volatile parameter somewhere below a volatile put or get routine is not allowed because - as far as the compiler can know - it could require an infinite amount of stack entries.
The strange architecture of the SX (not totally their fault, it is actually an improvement on the midrange PIC architecture) requires some special care to keep the bank and page bits set correctly. The current compiler handles this in a correct but inefficient way: it simply puts the appropriate page or bank instruction before each normal instruction. The result is that SX code is about two times the size of 16x84 code.
Before a call, a jump or the use of a file register the page/bank selection bits are set. When the instruction which requires the selection bits to be set is preceded by a (conditional) skip the setting will be put before the skip. The SX specific instructions page and bank are used for the setting of the selection bits.
The SX does not have addlw and sublw instructions. The compiler first fills a temporary with the constant and then uses an addwf or subwf instruction. The parser accepts these instructions in inline assembly, but unless such instructions are eliminated before the actual assembly phase an error is generated. This makes it possible to use if statements to select the appropriate code for a specific target. Look at jdelay.jal for an example.
Each routine is entered via a jump in the lower half of the first page, even when the entry point to the routine is located in the lower half of a page so the routine could be called directly.
The compiler and the libraries produce code for an SX in turbo mode with the carry option disabled (add and subtract instructions do not take a previous carry into account) and RTCC mapped to address 1.
The compiler generates an assembler file which is compatible with MPASM. The SX-specific instructions are implemented as macro's. The page and bank instructions are called cpage and rbank to avoid conflicts with the MPASM page and bank directives.
The SX uses banked register addressing were the lower 16 registers are replicated in each bank. Hence the addressing of the scratch registers is not linear. The compiler uses a linear address to refer to each register. This value is returned by a movlw x instruction and expected by the memory_read and memory_write routines.