Institutt for Informatikk
Universitetet i Bergen
INF 225 - Innføring i Programoversettelse - H04
Project Part 3: Code Generation
This part can be solved and handed in groups of two.
In this last exercise, you are going to write a compiler to
produce executable code for the Batu Virtual Machine (BVM)
Of course, you will have to employ the scanner and parser from
You have to hand in your answers no later than Monday 29th November 2004
at 12:00 am.
Contents of hand in
The answers handed in must include:
The answers are to be mailed to
Magnus Hoff (firstname.lastname@example.org), no later
than the specified deadline. Answers handed in after deadline will NOT be considered.
- Complete, fully documented, source code. Including the code from part 1 and part 2.
- Archive file with all classes we need to run the program.
- A description of your program, in a seperate text file. Emphasize the parts requiring special consideration and
report shortly your design decisions. Avoid repeating descriptions, even when your code is repeated.
Do not include general text about program analysis and restrict
yourselves to your own code.
- Instructions how to run your program (command line syntax and options).
- Examples of programs your program can compile.
The program should read C- source code from stdin and
write a program file for BVM to stdout.
If error occurs during compilation, error messages must be written
to stderr. Diagnostics may also be written to stderr.
If you cannot manage to complete the compiler, you may reduce the
problem to avoid writing many similar blocks of code. Start to write
small C- programs that only uses simple program constructions, and make
sure that the compiler works for them. Still we do
require you to include a variety of functionality, at least one
aritmetical operation (e.g. +), one relation (e.g. <), if and
while statements, assignment, and function calls with one parameter.
All missing features and bugs must be documented. We do not
accept undocumented flaws.
Remember an example using but the implemented features.
In the exercise description of the last year and the description of the Batu
instructions you find the necessary informations.
The program must be checked for structural errors and type mismatch,
which means using (almost) everything from previous handins.
If error occurs during compilation, the program should report, as well
as possible, the type and cause of the error.
All error messages shall contain reference to the source code, with
line number (and preferably a column indicator).
There is no use trying to generate code if an error has occured during
Parameters and local variables must be indexed by integers.
You have to compute the required number of variables for the frame
of each function.
You have to compute maximum stack depth for each function.
Every employed function must exist in the program.
The parameters must match those indicated by the mangled name.
There must be a function with the mangled name main()I.
You don't need to optimize your code. But,
you are likely to find plenty of examples of your compilers doing
really stupid things, so don't hesitate to optimise.
- Describe the design of your program.
Concentrate on parts which required special treatment or trade-offs.
Do not repeat yourselves. Simple and repetitous code may be presented
in a short overview.
Describe your own code, we do not want a general essay on code
Keep the documentation short. It is supposed to be easier
reading than the source code is.
If you use several classes, each of them must be described, and you
have to tell how they are supposed to interact.
Make a running example where you describe how code generation
is performed in real, on a simple sample of source code.
Include the source code which is used in this example.
(If you plan the example ahead of time, it may help you in the
implementation of the program.)
The running sample may very well be used to illustrate the
construction of the program.
Variables must be initiated before use, probably it's best to do that
just after declaration.
Remember to count parameters in the number of local variables of a
function, though they must not be initiated.
You may insert output(I)V and input()I in the
symbol table before analysis. In the production of function calls, you
may check for these to function and generate print or
read as appropriate. Otherwise an invokestatic
should be generated.
There is no instruction for relational operators, so you will have to
translate this into some kind of if-batch where either a 1 (true) or
0 (false) is left afterwards.
Remember that such results also may be assigned to variables.
If each node puts its code in a vector, then its length will indicate
how long a goto has to jump to get past it.
The methods on code generation from the book may be used. The
differences between register-based and stack-based machines are not
really significant until you start with optimisation.
Continue with the syntax tree from part II.
Code from each production may be put in a Vector, which makes it easy
both to count instruction for goto statements and to optimise.
That will be much harder if you flush code directly to stdout.
Each production may then paste the Vectors from its children into its
There is no instruction for logical not, but you may use for instance
iconst_1 followed by ixor.
Computing maximal stack size is a classic DFS algorithm, and has the
same order of size as the syntax tree.
Expressions leaves their value on the stack. Since assignments are
expression too, you nead dup (or dup_x2 for arrays)
before the value is moved to memory.
Optimisation for a stack machine is to a large extent a question of
avoiding register spill.
Optimisation is expected to be easier if it is separated from code
- Start with the simple things, and expand the program as you go.
- Write documentation and source code in parallel.
Afterwards everything appears simple or forgotten.
- Start early. This takes time.
- Do not wait till the topics are covered in lecture, lest you lose
time and fail to meet the deadline.
- There is no one way to solve the problem.
- Discuss the exercise with your partner and with other students,
perhaps on the newsgroup.