Commit 0862c71

Update
1 parent a3b552b commit 0862c71

21 files changed

+1167
-21
lines changed

_sources/abstract-syntax-tree.rst.txt

+10
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
1+
====================
2+
Abstract Syntax Tree
3+
====================
4+
5+
TODO
6+
7+
Example Implementation
8+
======================
9+
10+
* See `AST in EZ Language <https://github.com/CompilerProgramming/ez-lang/blob/main/parser/src/main/java/com/compilerprogramming/ezlang/parser/AST.java>`_.

_sources/compiler-books.rst.txt

+91
@@ -1,4 +1,95 @@
1+
==============
12
Compiler Books
23
==============
34

45
I own a bunch of compiler books that I have purchased over the years.
6+
7+
Dragon Books
8+
============
9+
I have 3 editions of these.
10+
11+
* Principles of Compiler Design. Aho & Ullman, 1977.
12+
* Compilers: Principles, Techniques and Tools. Aho, Sethi, Ullman, 1986.
13+
* Compilers: Principles, Techniques and Tools, 2nd Ed. Aho, Lam, Sethi, Ullman, 2006.
14+
15+
These books are criticised today because of the excessive focus on lexical analysis and parsing techniques.
16+
While this is true, they do cover various aspects of a compiler backend such as intermediate representations and
17+
optimization techniques including peephole optimization, data flow analysis, register allocation etc.
18+
I found the description of the lattice in a data flow analysis quite accessible.
19+
20+
The 2nd edition adopts a more mathematical presentation style, whereas the earlier editions present
21+
algorithms using pseudo code. I think the 1986 edition is the best.
22+
23+
For a different take on the 2nd edition see `Review of the second addition of the "Dragon Book" <https://gcc.gnu.org/wiki/Review_of_the_second_addition_of_the_Dragon_Book.>`_.
24+
25+
Engineering a Compiler, 2nd Ed. Cooper & Torczon. 2012.
26+
=======================================================
27+
This is a more modern version of the Dragon book. It is less focused on the lexical analysis / parsing
28+
phases, and covers the later phases of a compiler in more detail. Exposition is similar to the Dragon book, i.e. mostly describes
29+
techniques conceptually, with some high level algorithm descriptions, but like the Dragon book, does not
30+
go into detailed descriptions of algorithms.
31+
32+
Both this and the Dragon books describe ahead of time compilers and cover topics that are suited for procedural languages
33+
such as C or traditional Pascal or Fortran. They cover both front-end and back-end techniques; however, on the front-end
34+
side, interesting topics such as Object Orientation, Closures, Generics,
35+
or Semantic analysis of more complex languages such as Java are not covered.
36+
37+
Modern Compiler Implementation in C. Appel. 1998. (Tiger book)
38+
==============================================================
39+
This book takes a hands-on, tutorial-like approach to describing how to implement both the front-end and back-end
40+
of a compiler, using a toy language called Tiger as an example. Algorithms are described in pseudo code.
41+
If I had to choose between the Dragon book, Engineering a Compiler, and this book, I would pick this one.
42+
43+
This book covers functional languages, closures, as well as Object Oriented languages such as Java. Type inference is
44+
covered too.
45+
46+
Crafting a Compiler. Fischer, LeBlanc, Cytron. 2010.
47+
====================================================
48+
The last couple of chapters are the most interesting - these focus on code generation and program optimization.
49+
50+
The 2nd edition of the book (with Cytron as co-author) has a description of Static Single Assignment that is
51+
perhaps the most complete in all the books I cover here. The 1st edition describes data flow analysis in more
52+
detail.
53+
54+
Apart from the final two chapters, the rest of the book is about parsing and semantic analysis.
55+
56+
Building an Optimizing Compiler. Bob Morgan. 1998.
57+
==================================================
58+
I have the kindle edition which is very poor and hard to read. I wish I had a paper copy.
59+
60+
This book is almost completely about the backend of the compiler.
61+
62+
Advanced Compiler Design & Implementation. Muchnick. 1997.
63+
==========================================================
64+
I have the kindle edition, which is very poor and hard to read.
65+
66+
This book is also mostly about the backend of a compiler, focusing on optimization.
67+
68+
My impression is that this book describes many algorithms in detail. But when I tried to implement one of the
69+
simpler algorithms (18.1 Unreachable Code Elimination) I found that the description left out a
70+
part (No_Path) of the algorithm.
71+
72+
This book describes the idea of multiple levels of intermediate representation, HIR, MIR and LIR.
73+
I guess this has influenced many compiler implementations.
74+
75+
Its coverage of SSA is rudimentary - I guess it was written when SSA was still very new.
76+
77+
This book has a reputation of containing many errors, although I assume the latest printings have the errors
78+
fixed.
79+
80+
Despite its faults, it is a must have book if you want to learn about compiler construction.
81+
82+
Retargetable C Compiler, A: Design and Implementation. Hanson & Fraser. 1995.
83+
=============================================================================
84+
Describes a production C compiler. Detailed description of the actual compiler code.
85+
86+
Weak on theoretical aspects, and limited by features of the compiler being described.
87+
88+
Program Flow Analysis: Theory and Applications. Editors Muchnick, Jones. 1981.
89+
==============================================================================
90+
Collection of essays on program analysis, by various authors. This is pre-SSA, hence a bit
91+
dated.
92+
93+
Other Book Reviews
94+
==================
95+
* `List of compiler books <https://gcc.gnu.org/wiki/ListOfCompilerBooks>`_

_sources/index.rst.txt

+9-5
@@ -40,11 +40,15 @@ Preliminaries
4040
Basic Front-End techniques
4141
==========================
4242

43-
* Lexical analysis
44-
* Parsing
45-
* Abstract Syntax Trees
46-
* Type Systems
47-
* Semantic Analysis
43+
.. toctree::
44+
:maxdepth: 2
45+
:caption: Parsing Techniques
46+
47+
lexical-analysis
48+
syntax-analysis
49+
abstract-syntax-tree
50+
type-systems
51+
semantic-analysis
4852

4953
Basic Back-end techniques
5054
=========================

_sources/lexical-analysis.rst.txt

+55
@@ -0,0 +1,55 @@
1+
================
2+
Lexical Analysis
3+
================
4+
5+
When compiling a program we need to recognize the words and punctuation that make up the vocabulary of the language.
6+
This part of the compiler is therefore known as "lexical" analysis.
7+
8+
Usually a compiler is given one or more input programs, and the first thing it must do is read the program and
9+
figure out what lexical elements appear in the program.
10+
11+
Typically, these lexical elements are known as tokens. So for example, in the following snippet of code::
12+
13+
print('hi')
14+
15+
We have a number of lexical elements / tokens:
16+
17+
* ``print``
18+
* ``(``
19+
* ``'hi'``
20+
* ``)``
21+
22+
There are many different ways to implement a "lexer" - the name we give to this component of the compiler.
23+
24+
* We can write this code by hand. This involves scanning the input program character by character and
25+
deciding what tokens appear in the program.
26+
* Or we can specify the lexical elements in a grammar and have a tool generate the code to process the input
27+
program and give us the tokens that appear in the program.
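
The hand-written approach in the first bullet can be sketched as follows. This is a minimal illustration in Python, not the EZ language lexer; the function name ``tokenize`` and the tiny token set (identifiers, single-quoted strings, parentheses) are assumptions chosen only to cover the ``print('hi')`` snippet.

```python
def tokenize(source):
    """Scan source character by character and return a list of token strings."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1  # skip whitespace between tokens
        elif ch.isalpha() or ch == "_":
            start = i
            while i < len(source) and (source[i].isalnum() or source[i] == "_"):
                i += 1
            tokens.append(source[start:i])  # identifier or keyword
        elif ch == "'":
            start = i
            i += 1  # consume the opening quote
            while i < len(source) and source[i] != "'":
                i += 1
            if i >= len(source):
                raise SyntaxError("unterminated string literal")
            i += 1  # consume the closing quote
            tokens.append(source[start:i])
        elif ch in "()":
            tokens.append(ch)  # single-character punctuation
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r} at offset {i}")
    return tokens


print(tokenize("print('hi')"))
```

Running it on the snippet yields the four tokens listed earlier. A real lexer would also classify each token (identifier, string literal, punctuation) rather than return bare strings.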
28+
29+
A lexical analyser can be designed to process input on demand, or it may be designed to translate the entire
30+
input source to a set of tokens at the very beginning.
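
The difference between the two designs can be sketched with a Python generator. The names ``tokens_on_demand`` and ``tokens_up_front`` and the regular expression are assumptions made for this illustration; it recognizes only identifiers, single-quoted strings and parentheses.

```python
import re

# One token per match: optional leading whitespace, then an identifier,
# a single-quoted string, or a parenthesis (illustrative token set only).
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|'[^']*'|[()])")


def tokens_on_demand(source):
    """Yield tokens one at a time; nothing is scanned until the caller asks."""
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        pos = m.end()
        yield m.group(1)


def tokens_up_front(source):
    """Translate the whole input to a list of tokens before parsing starts."""
    return list(tokens_on_demand(source))
```

The on-demand form keeps memory use low and reports a lexical error only when the parser reaches it; the up-front form makes arbitrary look-ahead trivial at the cost of holding every token in memory.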
31+
32+
Considerations
33+
==============
34+
35+
* Should comments in the input program be retained as tokens? Usually a lexer will discard comments, but in languages that
36+
allow comments to be retained as documentation, the lexer must not discard them.
37+
* Should end of line markers be retained? Typically lexers drop all intervening whitespace, including line markers,
38+
but if the language syntax depends on line markers then these may need to be retained.
39+
* Should tokens copy the input text, convert it to another form, or retain pointers to the input itself?
40+
Retaining the original form of the lexical token may be important in some cases, for example if the lexer
41+
is used in a code formatter.
42+
* How much can we peek ahead? During later stages of the compiler, depending on the complexity of the language grammar,
43+
it may be necessary to allow the compiler to look ahead one or more tokens without consuming them.
44+
* Ancillary information regarding tokens, such as the line number and column number in the input source, is invaluable for
45+
error reporting.
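
Two of the considerations above, look-ahead and source positions, can be sketched together. The ``Token`` and ``PeekableLexer`` names are hypothetical, and for brevity this sketch scans the whole input up front and treats every non-alphanumeric character as a one-character token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    line: int    # 1-based line number in the input source
    column: int  # 1-based column of the token's first character


class PeekableLexer:
    def __init__(self, source):
        self._tokens = list(self._scan(source))
        self._next = 0

    @staticmethod
    def _scan(source):
        for line_no, line in enumerate(source.splitlines(), start=1):
            col = 0
            while col < len(line):
                if line[col].isspace():
                    col += 1
                    continue
                start = col
                if line[col].isalnum() or line[col] == "_":
                    while col < len(line) and (line[col].isalnum() or line[col] == "_"):
                        col += 1
                else:
                    col += 1  # any other character is a one-character token
                yield Token(line[start:col], line_no, start + 1)

    def peek(self, k=0):
        """Return the token k positions ahead without consuming it, or None."""
        idx = self._next + k
        return self._tokens[idx] if idx < len(self._tokens) else None

    def next(self):
        """Consume and return the next token, or None at end of input."""
        tok = self.peek()
        if tok is not None:
            self._next += 1
        return tok
```

A parser can call ``peek(1)`` to decide between two grammar rules before committing, and an error message can cite ``tok.line`` and ``tok.column`` directly.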
46+
47+
Example Hand-Coded Implementation
48+
=================================
49+
50+
The `Lexer <https://github.com/CompilerProgramming/ez-lang/tree/main/lexer>`_ module in the EZ language
51+
implementation contains an example of a hand-coded lexical analyser written in Java. This implementation returns tokens
52+
on demand.
53+
54+
Another example is the `Lua lexer <https://lua.org/source/5.4/llex.c.html>`_.
55+

_sources/parsing.rst.txt

+10
@@ -0,0 +1,10 @@
1+
=======
2+
Parsing
3+
=======
4+
5+
TODO
6+
7+
Example Implementation
8+
======================
9+
10+
See `EZ Language Parser <https://github.com/CompilerProgramming/ez-lang/tree/main/parser>`_.

_sources/semantic-analysis.rst.txt

+7
@@ -0,0 +1,7 @@
1+
=================
2+
Semantic Analysis
3+
=================
4+
5+
TODO
6+
7+

_sources/syntax-analysis.rst.txt

+10
@@ -0,0 +1,10 @@
1+
===============
2+
Syntax Analysis
3+
===============
4+
5+
TODO
6+
7+
Example Implementation
8+
======================
9+
10+
See `EZ Language Parser <https://github.com/CompilerProgramming/ez-lang/tree/main/parser>`_.

_sources/type-systems.rst.txt

+10
@@ -0,0 +1,10 @@
1+
============
2+
Type Systems
3+
============
4+
5+
TODO
6+
7+
Example Implementation
8+
======================
9+
10+
See `Type System in EZ Language <https://github.com/CompilerProgramming/ez-lang/tree/main/types>`_.

abstract-syntax-tree.html

+127
@@ -0,0 +1,127 @@
1+
2+
<!DOCTYPE html>
3+
4+
<html>
5+
<head>
6+
<meta charset="utf-8" />
7+
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="generator" content="Docutils 0.17.1: http://docutils.sourceforge.net/" />
8+
9+
<title>Abstract Syntax Tree &#8212; Compiler Programming</title>
10+
<link rel="stylesheet" type="text/css" href="_static/pygments.css" />
11+
<link rel="stylesheet" type="text/css" href="_static/agogo.css" />
12+
<script data-url_root="./" id="documentation_options" src="_static/documentation_options.js"></script>
13+
<script src="_static/jquery.js"></script>
14+
<script src="_static/underscore.js"></script>
15+
<script src="_static/doctools.js"></script>
16+
<link rel="index" title="Index" href="genindex.html" />
17+
<link rel="search" title="Search" href="search.html" />
18+
<link rel="next" title="Type Systems" href="type-systems.html" />
19+
<link rel="prev" title="Parsing" href="parsing.html" />
20+
</head><body>
21+
<div class="header-wrapper" role="banner">
22+
<div class="header">
23+
<div class="headertitle"><a
24+
href="index.html">Compiler Programming</a></div>
25+
<div class="rel" role="navigation" aria-label="related navigation">
26+
<a href="parsing.html" title="Parsing"
27+
accesskey="P">previous</a> |
28+
<a href="type-systems.html" title="Type Systems"
29+
accesskey="N">next</a> |
30+
<a href="genindex.html" title="General Index"
31+
accesskey="I">index</a>
32+
</div>
33+
</div>
34+
</div>
35+
36+
<div class="content-wrapper">
37+
<div class="content">
38+
<div class="document">
39+
40+
<div class="documentwrapper">
41+
<div class="bodywrapper">
42+
<div class="body" role="main">
43+
44+
<section id="abstract-syntax-tree">
45+
<h1>Abstract Syntax Tree<a class="headerlink" href="#abstract-syntax-tree" title="Permalink to this headline"></a></h1>
46+
<p>TODO</p>
47+
<section id="example-implementation">
48+
<h2>Example Implementation<a class="headerlink" href="#example-implementation" title="Permalink to this headline"></a></h2>
49+
<ul class="simple">
50+
<li><p>See <a class="reference external" href="https://github.com/CompilerProgramming/ez-lang/blob/main/parser/src/main/java/com/compilerprogramming/ezlang/parser/AST.java">AST in EZ Language</a>.</p></li>
51+
</ul>
52+
</section>
53+
</section>
54+
55+
56+
<div class="clearer"></div>
57+
</div>
58+
</div>
59+
</div>
60+
</div>
61+
<div class="sidebar">
62+
63+
<h3>Table of Contents</h3>
64+
<p class="caption" role="heading"><span class="caption-text">Preliminaries</span></p>
65+
<ul>
66+
<li class="toctree-l1"><a class="reference internal" href="prelim-impl-lang.html">Implementation Language</a></li>
67+
</ul>
68+
<p class="caption" role="heading"><span class="caption-text">Basic Front-end Techniques</span></p>
69+
<ul class="current">
70+
<li class="toctree-l1"><a class="reference internal" href="lexical-analysis.html">Lexical Analysis</a></li>
71+
<li class="toctree-l1"><a class="reference internal" href="lexical-analysis.html#example-implementation-in-ez-language">Example Implementation in EZ Language</a></li>
72+
<li class="toctree-l1"><a class="reference internal" href="parsing.html">Parsing</a></li>
73+
<li class="toctree-l1 current"><a class="current reference internal" href="#">Abstract Syntax Tree</a><ul>
74+
<li class="toctree-l2"><a class="reference internal" href="#example-implementation">Example Implementation</a></li>
75+
</ul>
76+
</li>
77+
<li class="toctree-l1"><a class="reference internal" href="type-systems.html">Type Systems</a></li>
78+
</ul>
79+
<p class="caption" role="heading"><span class="caption-text">Reviews</span></p>
80+
<ul>
81+
<li class="toctree-l1"><a class="reference internal" href="compiler-books.html">Compiler Books</a></li>
82+
</ul>
83+
84+
<div role="search">
85+
<h3 style="margin-top: 1.5em;">Search</h3>
86+
<form class="search" action="search.html" method="get">
87+
<input type="text" name="q" />
88+
<input type="submit" value="Go" />
89+
</form>
90+
</div>
91+
92+
</div>
93+
<div class="clearer"></div>
94+
</div>
95+
</div>
96+
97+
<div class="footer-wrapper">
98+
<div class="footer">
99+
<div class="left">
100+
<div role="navigation" aria-label="related navigaton">
101+
<a href="parsing.html" title="Parsing"
102+
>previous</a> |
103+
<a href="type-systems.html" title="Type Systems"
104+
>next</a> |
105+
<a href="genindex.html" title="General Index"
106+
>index</a>
107+
</div>
108+
<div role="note" aria-label="source link">
109+
<br/>
110+
<a href="_sources/abstract-syntax-tree.rst.txt"
111+
rel="nofollow">Show Source</a>
112+
</div>
113+
</div>
114+
115+
<div class="right">
116+
117+
<div class="footer" role="contentinfo">
118+
&#169; Copyright 2024, Dibyendu Majumdar.
119+
Created using <a href="https://www.sphinx-doc.org/">Sphinx</a> 4.3.2.
120+
</div>
121+
</div>
122+
<div class="clearer"></div>
123+
</div>
124+
</div>
125+
126+
</body>
127+
</html>
