Commit e67d829

Added a writeup on EZ lang
1 parent d86c0bd commit e67d829

17 files changed

+873
-27
lines changed

_sources/ez-lang.rst.txt

+365
@@ -0,0 +1,365 @@
1+
The EeZee Programming Language
2+
==============================
3+
4+
The EeZee programming language is a toy language with just enough features to allow
5+
experimenting with various compiler techniques.
6+
7+
The base language is intentionally very small. Eventually there will be extended versions
8+
that support functional and object-oriented paradigms.
9+
10+
Language features
11+
-----------------
12+
* User defined functions
13+
* Integer type
14+
* User defined ``struct`` types
15+
* One dimensional arrays
16+
* Basic control flow such as ``if`` and ``while`` statements
17+
18+
Keywords
19+
--------
20+
The following are the keywords in the language::
21+
22+
    func var int struct if else while break continue return new
23+
24+
Source Unit
25+
-----------
26+
27+
The EeZee language does not have the concept of modules or imports. Each source file must be
28+
self-contained.
29+
30+
There is no predefined ``main`` function in a source unit. The runtime should allow
31+
any defined function to be invoked by supplying appropriate arguments.
32+
33+
Types
34+
-----
35+
36+
The only primitive type in the language is the integer type ``Int``.
37+
The size of this type is unspecified; the default implementation uses 64-bit integers.
38+
39+
There is no distinct boolean type; non-zero integer values evaluate as true, and ``0`` evaluates as false.
40+
41+
Users can define one-dimensional arrays and structs.
42+
43+
Arrays and structs are implicitly reference types, i.e. instances of these types are
44+
allocated on the heap.
45+
46+
The language does not specify whether the heap is garbage collected or manually managed; it is
47+
up to the implementation.
48+
49+
A ``struct`` type is a named aggregate with one or more fields. Fields may be of any supported
50+
type.
51+
52+
An array type is declared by enclosing the element type in brackets, i.e. ``[`` and ``]``.
53+
54+
There is a ``Null`` type, with a predefined literal named ``null`` of this type.
55+
56+
When declaring fields or variables of reference types, the user may suffix the type name with ``?`` to
57+
indicate a ``Nullable`` type. ``Null`` is an implicit subtype of all ``Nullable`` types.
58+
59+
Examples::
60+
61+
    struct Tree {
62+
        var left: Tree?
63+
        var right: Tree?
64+
    }
65+
    struct Test {
66+
        var intArray: [Int]
67+
    }
68+
    struct TreeArray {
69+
        var array: [Tree?]?
70+
    }
71+
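The nullable rule above can be sketched as a small assignability check. This is only an illustration in Python, not part of the EeZee implementation: the ``Type`` class and ``is_assignable`` function are hypothetical names, and the rule that a non-nullable value may be stored in a nullable slot of the same type is an assumption that the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Type:
    """Hypothetical type descriptor: a name plus a nullable flag."""
    name: str              # e.g. "Int", "Tree", "[Int]" or "Null"
    nullable: bool = False


NULL = Type("Null")        # type of the predefined ``null`` literal


def is_assignable(target: Type, source: Type) -> bool:
    """Return True if a value of type `source` may be stored in `target`."""
    if source == NULL:
        # Null is an implicit subtype of every Nullable type.
        return target.nullable or target == NULL
    if source.name != target.name:
        # Struct types are nominal: different names never match.
        return False
    # Assumption: T fits into T?, but T? does not fit into T.
    return target.nullable or not source.nullable


print(is_assignable(Type("Tree", nullable=True), NULL))   # True
print(is_assignable(Type("Tree"), NULL))                  # False
```

With the ``Tree`` struct above, ``null`` is accepted for ``left`` and ``right`` because both fields are declared as ``Tree?``.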
72+
Struct types are nominal, i.e. each struct type is identified uniquely by its name.
73+
Multiple definitions of the same struct type are not allowed.
74+
75+
The language does not require forward declarations.
76+
77+
Functions
78+
---------
79+
80+
Users can declare functions; each function must have a unique name.
81+
82+
Polymorphic functions are not supported.
83+
84+
Functions can accept zero or more arguments and may optionally return a result.
85+
86+
The ``func`` keyword introduces a function declaration.
87+
88+
Examples::
89+
90+
    func fib(n: Int)->Int {
91+
        var f1=1
92+
        var f2=1
93+
        var i=n
94+
        while( i>1 ){
95+
            var temp = f1+f2
96+
            f1=f2
97+
            f2=temp
98+
            i=i-1
99+
        }
100+
        return f2
101+
    }
102+
103+
    func foo()->Int {
104+
        return fib(10)
105+
    }
106+
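To make the behaviour of the example concrete, here is a line-by-line transliteration into Python. It is a sketch for checking the expected result only. Python integers are arbitrary precision, so it does not model the 64-bit ``Int`` of the default implementation.

```python
def fib(n: int) -> int:
    # Same statements as the EeZee version, one per line.
    f1 = 1
    f2 = 1
    i = n
    while i > 1:
        temp = f1 + f2
        f1 = f2
        f2 = temp
        i = i - 1
    return f2


def foo() -> int:
    return fib(10)


print(foo())  # prints 89
```

Note that with this definition ``fib(0)`` and ``fib(1)`` both return ``1``, because the loop body never runs for those inputs.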
107+
Variables and Fields
108+
--------------------
109+
110+
The ``var`` keyword is used to introduce a new variable in the current lexical scope,
111+
or to add a field to a struct.
112+
113+
There are two forms of this:
114+
115+
When introducing variables, you can supply an initializer; this removes the need to
116+
specify a type. Examples::
117+
118+
    var i = 1
119+
    var j = foo()
120+
121+
In this form the type of the variable is inferred from the initializer's type.
122+
123+
The second form is better suited to declaring fields in a struct. In this form
123+
a type is required and an initializer cannot be supplied.
125+
126+
Example::
127+
128+
    struct T
129+
    {
130+
        var f: Int
131+
        var arry: [Int]
132+
    }
133+
134+
Creating new instances of Arrays
135+
--------------------------------
136+
137+
The ``new`` keyword is used to create array instances.
138+
139+
It must be followed by an array type name, and optionally followed by an initializer.
140+
141+
The array initializer must be a comma separated list of values, enclosed in ``{`` and ``}``.
142+
143+
The array is sized based on the number of values in the initializer.
144+
145+
Alternatively, the array initializer may have a field named ``len`` that specifies the size of the
146+
array, and a field named ``value`` to specify the value to use.
147+
148+
Examples::
149+
150+
    var arry = new [Int] {1,2,3}
151+
    var arry2 = new [Int] {len=10, value=0}
152+
153+
The second example creates an array with 10 elements and sets the initial value of each element to 0.
154+
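The two initializer forms can be modelled as follows. This is a rough Python sketch of the semantics described above, not the runtime's actual code: ``new_int_array`` is a hypothetical helper, and both the default fill value of ``0`` and the empty result for an empty initializer are assumptions.

```python
from typing import List, Optional


def new_int_array(values: Optional[List[int]] = None,
                  length: Optional[int] = None,
                  value: int = 0) -> List[int]:
    """Model of ``new [Int] {...}`` for the two initializer forms."""
    if values is not None:
        # Form 1: {1,2,3} - the array is sized by the number of values.
        return list(values)
    if length is None:
        # Assumption: an empty initializer yields an empty array.
        return []
    # Form 2: {len=N, value=V} - N elements, each set to V.
    return [value] * length


arry = new_int_array([1, 2, 3])
arry2 = new_int_array(length=10, value=0)
print(len(arry), len(arry2))  # prints 3 10
```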
155+
Creating new instances of structs
156+
---------------------------------
157+
158+
The ``new`` keyword is used to create struct instances.
159+
160+
It must be followed by the struct type name, and optionally followed by an initializer.
161+
162+
The struct initializer must be a comma separated list of field initializers, enclosed in ``{`` and ``}``.
163+
164+
A field initializer has the form of a field name followed by ``=`` followed by an expression.
165+
166+
Examples::
167+
168+
    var stats = new Stats { age=10, height=100 }
169+
170+
171+
Control Flow
172+
------------
173+
174+
The language is lexically scoped, and block structured.
175+
176+
A block is enclosed in ``{`` and ``}`` and introduces a lexical scope.
177+
178+
The ``if`` statement allows branching based on a condition. The condition must be an
179+
integer expression; a value of ``0`` is false, and any other value is true.
180+
181+
The ``if`` statement can have an optional ``else`` branch.
182+
183+
The only looping construct is the ``while`` statement; this executes the substatement
184+
as long as the supplied condition evaluates to a non-zero value.
185+
186+
The ``break`` statement exits a loop.
187+
188+
The ``continue`` statement branches to the beginning of the loop.
189+
190+
The ``return`` statement takes an expression if the function is meant to return a value.
191+
It causes the currently executing function to terminate.
192+
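The integer-based conditions and the loop statements described above can be imitated in Python as shown below. This is an illustrative sketch only: ``is_true`` and ``sum_until`` are hypothetical names, and equality is tested by checking whether a difference is zero, to stay close to the "zero is false" rule.

```python
def is_true(value: int) -> bool:
    """EeZee truthiness: 0 is false, every other integer is true."""
    return value != 0


def sum_until(limit: int, skip: int, stop: int) -> int:
    """Sum limit, limit-1, ..., 1, skipping `skip` and stopping at `stop`."""
    total = 0
    i = limit
    while is_true(i):                    # runs while i is non-zero
        current = i
        i = i - 1
        if not is_true(current - stop):  # current == stop
            break                        # exits the loop
        if not is_true(current - skip):  # current == skip
            continue                     # back to the loop condition
        total = total + current
    return total


print(sum_until(5, 4, 2))  # prints 8 (5 + 3; 4 is skipped, loop stops at 2)
```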
193+
Expressions
194+
-----------
195+
196+
The following table lists the available operators in order of precedence (low to high):
197+
198+
+------------+-----------------+----------+
199+
| Operator   | Meaning         | Type     |
200+
|            |                 |          |
201+
+============+=================+==========+
202+
| ``||``     | logical or      | Binary   |
203+
+------------+-----------------+----------+
204+
| ``&&``     | logical and     | Binary   |
205+
+------------+-----------------+----------+
206+
| ``==``     | relational      | Binary   |
207+
| ``!=``     |                 |          |
208+
| ``<``      |                 |          |
209+
| ``<=``     |                 |          |
210+
| ``>``      |                 |          |
211+
| ``>=``     |                 |          |
212+
+------------+-----------------+----------+
213+
| ``+``      | addition        | Binary   |
214+
| ``-``      |                 |          |
215+
+------------+-----------------+----------+
216+
| ``*``      | multiplication  | Binary   |
217+
| ``/``      |                 |          |
218+
+------------+-----------------+----------+
219+
| ``-``      | negate          | Unary    |
220+
| ``!``      |                 |          |
221+
+------------+-----------------+----------+
222+
| ``(...)``, | function call,  | Postfix  |
223+
| ``[]``,    | array index,    |          |
224+
| ``.`` ID   | field access    |          |
225+
+------------+-----------------+----------+
226+
227+
228+
229+
Grammar
230+
-------
231+
232+
The following grammar describes the language syntax::
233+
234+
    program
235+
        : declaration+ EOF
236+
        ;
237+
238+
    declaration
239+
        : structDeclaration
240+
        | functionDeclaration
241+
        ;
242+
243+
    structDeclaration
244+
        : 'struct' IDENTIFIER '{' fields '}'
245+
        ;
246+
247+
    fields
248+
        : varDeclaration+
249+
        ;
250+
251+
    varDeclaration
252+
        : 'var' IDENTIFIER ':' typeName ';'?
253+
        ;
254+
255+
    typeName
256+
        : simpleType
257+
        | arrayType
258+
        ;
259+
260+
    simpleType
261+
        : IDENTIFIER ('?')?
262+
        ;
263+
264+
    arrayType
265+
        : '[' simpleType ']' ('?')?
266+
        ;
267+
268+
    functionDeclaration
269+
        : 'func' IDENTIFIER '(' parameters? ')' ('->' typeName)? block
270+
        ;
271+
272+
    parameters
273+
        : parameter (',' parameter)*
274+
        ;
275+
276+
    parameter
277+
        : IDENTIFIER ':' typeName
278+
        ;
279+
280+
    block
281+
        : '{' statement* '}'
282+
        ;
283+
284+
    statement
285+
        : 'if' '(' expression ')' statement
286+
        | 'if' '(' expression ')' statement 'else' statement
287+
        | 'while' '(' expression ')' statement
288+
        | postfixExpression '=' expression ';'?
289+
        | block
290+
        | 'break' ';'?
291+
        | 'continue' ';'?
292+
        | varDeclaration
293+
        | 'var' IDENTIFIER '=' expression ';'?
294+
        | 'return' orExpression? ';'?
295+
        | expression ';'?
296+
        ;
297+
298+
    expression
299+
        : orExpression
300+
        ;
301+
302+
    orExpression
303+
        : andExpression ('||' andExpression)*
304+
        ;
305+
306+
    andExpression
307+
        : relationalExpression ('&&' relationalExpression)*
308+
        ;
309+
310+
    relationalExpression
311+
        : additionExpression (('==' | '!='| '>'| '<'| '>='| '<=') additionExpression)*
312+
        ;
313+
314+
    additionExpression
315+
        : multiplicationExpression (('+' | '-') multiplicationExpression)*
316+
        ;
317+
318+
    multiplicationExpression
319+
        : unaryExpression (('*' | '/' ) unaryExpression)*
320+
        ;
321+
322+
    unaryExpression
323+
        : ('-' | '!') unaryExpression
324+
        | postfixExpression
325+
        ;
326+
327+
    postfixExpression
328+
        : primaryExpression (indexExpression | callExpression | fieldExpression)*
329+
        ;
330+
331+
    indexExpression
332+
        : '[' orExpression ']'
333+
        ;
334+
335+
    callExpression
336+
        : '(' arguments? ')'
337+
        ;
338+
339+
    arguments
340+
        : orExpression (',' orExpression)*
341+
        ;
342+
343+
    fieldExpression
344+
        : '.' IDENTIFIER
345+
        ;
346+
347+
    primaryExpression
348+
        : INTEGER_LITERAL
349+
        | IDENTIFIER
350+
        | '(' orExpression ')'
351+
        | 'new' typeName initExpression
352+
        ;
353+
354+
    initExpression
355+
        : '{' initializers? '}'
356+
        ;
357+
358+
    initializers
359+
        : initializer (',' initializer)*
360+
        ;
361+
362+
    initializer
363+
        : (IDENTIFIER '=')? orExpression
364+
        ;
365+
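The expression rules of the grammar map directly onto a recursive-descent parser, with one method per rule. The Python sketch below is an illustration under stated assumptions, not the project's parser: the ``Parser`` class, ``tokenize`` and ``parenthesize`` are hypothetical names, the ``new`` form of ``primaryExpression`` is omitted, and instead of building an AST it returns a fully parenthesized string so the precedence and left-to-right grouping are visible.

```python
import re

# One token: integer, identifier, or operator/punctuation (longest first).
TOKEN = re.compile(
    r"\s*(\d+|[A-Za-z_]\w*|\|\||&&|==|!=|<=|>=|[-+*/!<>()\[\].,])")


def tokenize(src):
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if not match:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expect(self, tok):
        if self.next() != tok:
            raise SyntaxError(f"expected {tok!r}")

    def binary(self, operand, ops):
        # Shared shape of the rules `x : y (op y)*`; groups left to right.
        left = operand()
        while self.peek() in ops:
            op = self.next()
            left = f"({left} {op} {operand()})"
        return left

    def or_expr(self):
        return self.binary(self.and_expr, ("||",))

    def and_expr(self):
        return self.binary(self.relational, ("&&",))

    def relational(self):
        return self.binary(self.addition, ("==", "!=", "<", "<=", ">", ">="))

    def addition(self):
        return self.binary(self.multiplication, ("+", "-"))

    def multiplication(self):
        return self.binary(self.unary, ("*", "/"))

    def unary(self):
        if self.peek() in ("-", "!"):
            op = self.next()
            return f"({op}{self.unary()})"
        return self.postfix()

    def postfix(self):
        expr = self.primary()
        while True:
            tok = self.peek()
            if tok == "[":                      # indexExpression
                self.next()
                index = self.or_expr()
                self.expect("]")
                expr = f"{expr}[{index}]"
            elif tok == "(":                    # callExpression
                self.next()
                args = []
                if self.peek() != ")":
                    args.append(self.or_expr())
                    while self.peek() == ",":
                        self.next()
                        args.append(self.or_expr())
                self.expect(")")
                expr = f"{expr}({', '.join(args)})"
            elif tok == ".":                    # fieldExpression
                self.next()
                expr = f"{expr}.{self.next()}"  # field name is not validated
            else:
                return expr

    def primary(self):
        tok = self.next()
        if tok == "(":
            inner = self.or_expr()
            self.expect(")")
            return inner
        if tok is None or not re.fullmatch(r"\d+|[A-Za-z_]\w*", tok):
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok


def parenthesize(src):
    parser = Parser(tokenize(src))
    result = parser.or_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return result


print(parenthesize("1 + 2 * 3"))         # (1 + (2 * 3))
print(parenthesize("a || b && c == d"))  # (a || (b && (c == d)))
```

Because each rule calls the rule for the next-higher precedence level, the nesting of the output follows the operator table: ``*`` binds tighter than ``+``, and ``&&`` binds tighter than ``||``.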

_sources/index.rst.txt

+1
@@ -44,6 +44,7 @@ Preliminaries
44 44
:caption: Preliminaries
45 45

46 46
prelim-impl-lang
47+
ez-lang
47 48

48 49
Basic Front-End techniques
49 50
==========================

_sources/prelim-impl-lang.rst.txt

+3-6
@@ -1,5 +1,5 @@
1-
Implementation Language
2-
=======================
1+
Compiler Implementation Language
2+
================================
3 3

4 4
A compiler can be implemented in any language we choose. For a pedagogical project it is more convenient
5 5
to choose a language that is widely used, has garbage collection, and comes with excellent tools such
@@ -17,10 +17,7 @@ from a technical standpoint, that is. It is a garbage collection language that h
17 17
work with. The main negatives are that it is not a popular language, and the tooling is not up to
18 18
the standards of other languages.
19 19

20-
Go would be a good candidate except that its an opinionated language that forces a certain programming model,
21-
whereas we would like a language that offers least resistance.
22-
23-
Java, Kotlin, Swift and C# seem like good candidates. Java has some limitations that make it harder to write memory optimized
20+
Go, Java, Kotlin, Swift and C# seem like good candidates. Java has some limitations that make it harder to write memory optimized
24 21
code that is often necessary in a production compiler, but we don't care so much about that.
25 22

26 23
I decided to use Java because it is the language I am most familiar with, has great tooling, and despite some
