Institutt for Informatikk
Universitetet i Bergen
INF 225 - Introduction to program translation - H05
Project Part 3 - Code Generation
This part is highly recommended to be solved in groups of two, but each student is to hand in his/hers own program!
Goal
In this last exercise, you are going to write a compiler to
produce executable code for the Batu Virtual Machine (BVM).
Of course, you will have to employ your own scanner and parser from
previous exercises.
Deadline
Deadline for hand-in is no later than Monday 28th November 2005
at 12:00 am.
Contents of hand-in
The answers handed in must include:
- Complete, fully documented, source code. Including the code from part 1 and part 2.
- Archive file with all classes we need to run the program.
- A description of your program, in a seperate text file. Emphasize the parts requiring special consideration and
report shortly your design decisions. Avoid repeating descriptions, even when your code is repeated.
Do not include general text about program analysis and restrict
yourselves to your own code.
- Instructions on how to run your program (command line syntax and options).
- Examples of programs your program can compile.
- Important! If you have discussed the assignment with another student, the name of that student must be stated in the documentation.
The answers are to be mailed to Peter Guzikowski (peterg@ii.uib.no), no later than the specified deadline. Answers handed in after the deadline will NOT be considered.
The program
The program is to read source code for a C- program, and
write a program file for BVM to stdout.
If errors occur during compilation, error messages must be written
to stderr. Diagnostics may also be written to stderr.
If you cannot manage to complete the compiler, you may reduce the
problem to avoid writing many similar blocks of code. Start to write
small C- programs that only uses simple program constructions, and make
sure that the compiler works for them. Still we do
require you to include a variety of functionality, at least one
aritmetical operation (e.g. +), one relation (e.g. <), if and
while statements, assignment, and function calls with one parameter.
All missing features and bugs must be documented. We do not
accept undocumented flaws.
Remember an example file using the implemented features.
Suggested command line syntax
Compile:
Java -cp mycompiler.jar Compile inputfile.c-
Parse: Java -cp mycompiler.jar Parse inputfile.c-
Scan: Java -cp mycompiler.jar Scan inputfile.c-
The Batu Virtual Machine (BVM)
Error messages
-
The program must be checked for structural errors and type mismatch,
which means using everything from previous handins.
-
If error occurs during compilation, the program should report, as well
as possible, the type and cause of the error.
-
All error messages shall contain reference to the source code, with
line number (and preferably a column indicator).
-
There is no use trying to generate code if an error has occured during
analysis.
Allocation
-
Parameters and local variables must be indexed by integers.
-
You have to compute the required number of variables for the frame
of each function.
-
You have to compute maximum stack depth for each function.
Linking
-
Every employed function must exist in the program.
The parameters must match those indicated by the mangled name.
-
There must be a function with the mangled name main()I.
Optimization
You don't need to optimize your code. But you are likely to find plenty of examples of your compiler doing
really stupid things, so don't hesitate to optimize.
Documentation
- Describe the design of your program.
-
Concentrate on parts which required special treatment or trade-offs.
-
Do not repeat yourselves. Simple and repetitous code may be presented
in a short overview.
-
Describe your own code, we do not want a general essay on code
generation.
-
Keep the documentation short. It is supposed to be easier
reading than the source code is.
-
If you use several classes, each of them must be described, and you
have to tell how they are supposed to interact.
-
Make a running example where you describe how code generation
is performed in real, on a simple sample of source code.
Include the source code which is used in this example.
(If you plan the example ahead of time, it may help you in the
implementation of the program.)
-
The running sample may very well be used to illustrate the
construction of the program.
Hints
-
Variables must be initiated before use, probably it's best to do that
just after declaration.
-
Remember to count parameters in the number of local variables of a
function, though they must not be initiated.
-
You may insert output(I)V and input()I in the
symbol table before analysis. In the production of function calls, you
may check for these to function and generate print or
read as appropriate. Otherwise an invokestatic
should be generated.
-
There is no instruction for relational operators, so you will have to
translate this into some kind of if-batch where either a 1 (true) or
0 (false) is left afterwards.
Remember that such results also may be assigned to variables.
-
If each node puts its code in a vector, then its length will indicate
how long a goto has to jump to get past it.
-
The methods on code generation from the book may be used. The
differences between register-based and stack-based machines are not
really significant until you start with optimization.
-
Continue with the syntax tree from part II.
-
Code from each production may be put in a Vector, which makes it easy
both to count instruction for goto statements and to optimise.
That will be much harder if you flush code directly to stdout.
Each production may then paste the Vectors from its children into its
own code.
-
There is no instruction for logical not, but you may use for instance
iconst_1 followed by ixor.
-
Computing maximal stack size is a classic DFS algorithm, and has the
same order of size as the syntax tree.
-
Expressions leaves their value on the stack. Since assignments are
expression too, you nead dup (or dup_x2 for arrays)
before the value is moved to memory.
-
Optimization for a stack machine is to a large extent a question of
avoiding register spill.
-
Optimization is expected to be easier if it is separated from code
generation.
More hints
- Start with the simple things, and expand the program as you go.
- Write documentation and source code in parallel.
Afterwards everything appears simple or forgotten.
- Start early. This takes time.
- Do not wait till the topics are covered in lecture, lest you lose
time and fail to meet the deadline.
- There is no one way to solve the problem.
- Discuss the exercise with your partner and with other students.