Project Part 1 - Scanning

This part is to be solved and handed in individually.


Write a scanner that recognises the tokens in a C- source file. We recommend writing the scanner with JLex.


The deadline for handing in is Friday 4th October, 10:00 am.

Contents of hand in

The answers handed in must include:

Printouts may be handed in in the box marked "I125" outside the department reception on the 4th floor, HIB (Thormølens gate 55).

The problem

You are going to use JLex to write a scanner for C- as specified on the project main page. To test the scanner you will need an executable driver program which reads C- source code and writes the tokens to stdout in the following format:

  linenumber, start..end: token; attr=value\n

See also the example below. We require this strict format so that we can easily check the output of your program with diff(1). The driver program may well be a main method in the Scanner class. It will be replaced by the parser in the next part of the project.
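
As an illustration of the format, here is a minimal Java sketch of how one output line could be assembled. The class name TokenLine and its format method are hypothetical and not part of the assignment. The rule that a single-character token prints only its start column is inferred from the example output, and the attribute text is passed in already formatted (for instance name="main" or value=41).

```java
// Hypothetical helper, not prescribed by the assignment: builds one line of
// driver output. All positions are assumed to be 1-based.
final class TokenLine {
    private TokenLine() {}

    /**
     * @param line  line number of the token
     * @param start column of the token's first character
     * @param end   column of the token's last character
     * @param token token name, e.g. "Id"
     * @param attr  preformatted attribute such as value=41, or null if none
     */
    static String format(int line, int start, int end, String token, String attr) {
        StringBuilder sb = new StringBuilder();
        sb.append(line).append(", ").append(start);
        if (end > start) {
            // Inferred from the example: a range is printed only for
            // tokens longer than one character.
            sb.append("..").append(end);
        }
        sb.append(": ").append(token);
        if (attr != null) {
            sb.append("; ").append(attr);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Terminate with "\n" explicitly; println would emit the platform
        // line separator, which could make diff report spurious changes.
        System.out.print(format(1, 1, 4, "Id", "name=\"main\"") + "\n");
        System.out.print(format(1, 5, 5, "Vparen", null) + "\n");
        System.out.print(format(3, 10, 11, "Num", "value=41") + "\n");
    }
}
```

Keeping the formatting in one place makes it easier to remove when the driver is replaced by the parser.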

You must document the scanner and the driver program properly. Make sure you include:

  1. The problems you encountered when implementing the scanner and how you solved them.
  2. An overview of the classes of the scanner and how they may be used when building a parser.
  3. How to run the driver program.

Do not include general text about scanners; restrict yourself to your own code. If you use several classes, explain what role each of them plays and how they interact. You may use UML if you want to.

You have to hand in the program as a jar file, so that we can run it. (See the project main page on how to make a jar file.)


Source code for C-:

main() {
  /* test-program */
  return 41;
}

Output from the driver program:

1, 1..4: Id; name="main"
1, 5: Vparen
1, 6: Hparen
1, 8: Vbrace
3, 3..8: Return
3, 10..11: Num; value=41
3, 12: Semicolon
4, 1: Hbrace
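
The positions in the output above are 1-based lines and columns. If your scanner only knows the 0-based character offset of a token from the start of the input (an assumption here; check what your JLex setup actually reports), the line and column can be derived from a table of line-start offsets. The class LineIndex below is a hypothetical sketch of that idea, not a required part of the solution.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: maps a 0-based character offset to a 1-based
// line number and column, using the offsets at which each line starts.
final class LineIndex {
    private final List<Integer> lineStarts = new ArrayList<>();

    LineIndex(String source) {
        lineStarts.add(0); // the first line starts at offset 0
        for (int i = 0; i < source.length(); i++) {
            if (source.charAt(i) == '\n') {
                lineStarts.add(i + 1); // next line starts after the newline
            }
        }
    }

    /** Returns the 1-based line containing the given offset. */
    int line(int offset) {
        // Binary search for the last line start that is <= offset.
        int lo = 0;
        int hi = lineStarts.size() - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) / 2;
            if (lineStarts.get(mid) <= offset) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo + 1;
    }

    /** Returns the 1-based column of the given offset within its line. */
    int column(int offset) {
        return offset - lineStarts.get(line(offset) - 1) + 1;
    }

    public static void main(String[] args) {
        String src = "main() {\n  /* test-program */\n  return 41;\n}\n";
        LineIndex index = new LineIndex(src);
        int at = src.indexOf("return");
        System.out.print(index.line(at) + ", " + index.column(at) + "\n");
    }
}
```

The end column of a token is then its start column plus its length minus one. This sketch counts every character, including a tab, as one column.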


The list of tokens which must be recognised by the scanner is found on pages 491-492 of the textbook.