qccompiler

Design and implementation of a compiler from scratch, including:

  1. Lexical analysis
  2. Syntactic analysis
  3. Abstract syntax tree (AST) construction
  4. Semantic analysis
  5. Code generation

It’s written in C and it compiles input files written in qC, a small subset of the language ANSI C (C89/C90). The generated code is in C, but very close to Assembly.

Features of qC

Tokens

Token Meaning
ID alphameric case sensitive sequences beginning with a letter where ‘_’ is also allowed
INTLIT sequence of digits without unnecessary left pad zeros
CHRLIT single character (except newline or single quote) or escape sequence (\n, \t, \, ', " and \0) between single quotes
STRLIT sequence of characters (except newline or single quote) and/or escape sequences between double quotes
AMP &
AND &&
ASSIGN =
AST *
ATOI atoi
CHAR char
COMMA ,
DIV /
ELSE else
EQ ==
GE >=
GT >
IF if
INT int
ITOA itoa
LBRACE {
LE <=
LPAR (
LSQ [
LT <
MINUS -
MOD %
NE !=
NOT !
OR ||
PLUS +
PRINTF printf
RBRACE }
RETURN return
RPAR )
RSQ ]
SEMI ;
WHILE while
RESERVED C keywords not used in qC

Grammar (EBNF notation)

Start → (FunctionDefinition | FunctionDeclaration | Declaration) {FunctionDefinition | FunctionDeclaration | Declaration}
FunctionDefinition → TypeSpecifier FunctionDeclarator LBRACE {Declaration} {Statement} RBRACE
FunctionDeclaration → TypeSpecifier FunctionDeclarator SEMI
FunctionDeclarator → {AST} ID LPAR [ParameterList] RPAR
ParameterList → ParameterDeclaration {COMMA ParameterDeclaration}
ParameterDeclaration → TypeSpecifier {AST} ID
Declaration → TypeSpecifier Declarator {COMMA Declarator} SEMI
TypeSpecifier → CHAR | INT
Declarator → {AST} ID [LSQ INTLIT RSQ]
Statement → [Expression] SEMI
Statement → LBRACE {Statement} RBRACE
Statement → IF LPAR Expression RPAR Statement [ELSE Statement]
Statement → WHILE LPAR Expression RPAR Statement
Statement → RETURN Expression SEMI
Expression → Expression ASSIGN Expression
Expression → Expression (AND | OR) Expression
Expression → Expression (EQ | NE | LT | GT | LE | GE) Expression
Expression → Expression (PLUS | MINUS | AST | DIV | MOD) Expression
Expression → (AMP | AST | PLUS | MINUS | NOT) Expression
Expression → Expression LSQ Expression RSQ
Expression → ID LPAR [Expression {COMMA Expression}] RPAR
Expression → (PRINTF | ATOI) LPAR Expression RPAR
Expression → ITOA LPAR Expression COMMA Expression RPAR
Expression → ID | INTLIT | CHRLIT | STRLIT | LPAR Expression RPAR

Usage

$ make
$ ./qccompiler [OPTIONS] < input.qc

Options:

-t     print the abstract syntax tree and stop after syntatic analysis.
-s     print the symbol table and stop after semantic analisys.
-c     allways compile the program (unless errors occur).
-o     allways compile the program (unless errors occur) and print compiled program to file.

If both flags -s and -t are set, the proccess stops after the semantic analysis. The result of the compilation is written to the file `result.c

Dependencies