Assignments 10 and 11: Object-Oriented Programming (a source-to-source translator)
In Assignments 10 and 11 of this class, we will “prototype” an object-oriented programming language using a technique known as source-to-source translation.
Introduction

So far we’ve seen two common ways to prototype a programming language: as a stand-alone interpreter, and as a library (or embedded language) in an existing host language. In these assignments, we’ll see a third common implementation style that involves translating a program that was written in the source language — i.e., the language you are implementing — to a program in some existing target language that has the desired behavior. This implementation style, known as source-to-source translation, can be viewed as a lightweight form of compilation: rather than translating all the way down to machine code, the language implementer gets to leverage the features in a high-level target language, which significantly simplifies the task.

This approach can be seen as a middle ground between the two other approaches we’ve seen in class. Like with an interpreter, the source language has its own syntax, and requires a parser to convert this syntax into some form of abstract syntax trees. Like with an embedded language, it’s often possible to represent features in the source language directly, using counterparts in the target language. For example, it might be possible to translate a function in the source language to a function in the target language that has the same behavior, which then allows function calls in the source language to be implemented simply as function calls in the target language.

Of course, some language features in the source language will not map directly to a semantically-equivalent construct in the target language. In that case, the translation has to construct some code in the target language that will behave as desired. Depending on the complexity of that code, it may be nicer to factor it out to a library so that the target code is as simple and readable as possible. Such a library is similar in spirit to the idea of an embedded language that we saw in the previous assignment, except that this library is meant only for use by the source-to-source translator rather than directly by the programmer.

In Assignments 10 and 11, you’ll “prototype” an object-oriented programming language by translating it to JavaScript. We’ll start out with a basic version of the language in Assignment 10, and throw in some fancier features in Assignment 11.

Our OO Language

We’ll start out with a “vanilla” object-oriented language with single inheritance. Our language is dynamically-typed, and has Java-like syntax.

Declaring Classes

To declare a new class in our language, you use the class keyword:

    class C;

Well, this class is not very interesting: it doesn’t even have instance variables! Here’s how you would declare a Point class with two instance variables, x and y:

    class Point with x, y;

By default, every new class that you declare is a direct subclass of Obj, which is the root of the class hierarchy in our language. You can optionally specify a superclass in a class declaration using the extends keyword, e.g., class ThreeDeePoint extends Point with z;

Declaring Methods

Our language supports open classes. This means that you can add new methods to a class without editing its declaration. In fact, the syntax of our language does not even allow programmers to write methods as part of a class declaration. Here’s how you add a method called init with arguments x and y to our Point class:

    def Point.init(x, y) {
      this.x = x;
      this.y = y;
    }

And here’s how you override Point’s init method shown above for instances of ThreeDeePoint:

    def ThreeDeePoint.init(x, y, z) {
      super.init(x, y);
      this.z = z;
    }

Note that our language does not support static overloading. It doesn’t matter that ThreeDeePoint’s version of init takes three arguments whereas Point’s init method takes only two: the former still overrides the latter.
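To make open classes and overriding more concrete, here is a minimal sketch of one possible run-time representation in JavaScript. It is not the required design: the names classTable, declareClass, declareMethod, lookup, send, and superSend are all assumptions made for illustration, and storing instance variables as plain JavaScript properties is a simplification.

```javascript
// Hypothetical run-time "class table": class name -> { superName, vars, methods }.
// Every name in this sketch is an assumption, not part of the assignment's API.
const classTable = {
  Obj: { superName: null, vars: [], methods: Object.create(null) },
};

function declareClass(name, superName, vars) {
  classTable[name] = { superName: superName, vars: vars, methods: Object.create(null) };
}

// Open classes: a method can be added to a class at any time after its declaration.
function declareMethod(className, selector, fn) {
  classTable[className].methods[selector] = fn;
}

// Walk up the superclass chain, starting at className, looking for selector.
// Lookup is by name only, so the number of arguments never matters.
function lookup(className, selector) {
  for (let c = className; c !== null; c = classTable[c].superName) {
    const method = classTable[c].methods[selector];
    if (method !== undefined) return method;
  }
  throw new Error("message not understood: " + selector);
}

// Ordinary send: start the lookup at the receiver's own class.
function send(recv, selector, ...args) {
  return lookup(recv.className, selector).apply(recv, args);
}

// Super send: start the lookup at the superclass of the class in which the
// calling method was defined (not at the receiver's class).
function superSend(definingClass, recv, selector, ...args) {
  return lookup(classTable[definingClass].superName, selector).apply(recv, args);
}

declareClass("Point", "Obj", ["x", "y"]);
declareClass("ThreeDeePoint", "Point", ["z"]);
declareMethod("Point", "init", function (x, y) { this.x = x; this.y = y; });
declareMethod("ThreeDeePoint", "init", function (x, y, z) {
  superSend("ThreeDeePoint", this, "init", x, y);
  this.z = z;
});

// Simplification: instance variables live directly on a plain JS object.
const p3 = { className: "ThreeDeePoint" };
send(p3, "init", 1, 2, 3);
console.log(p3.x, p3.y, p3.z); // prints 1 2 3
```

Because lookup goes by method name alone, ThreeDeePoint’s three-argument init shadows Point’s two-argument init, which matches the “no static overloading” rule described above.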

Creating Objects

To create a new instance of a class, you use the new keyword just like in JavaScript:

    var p = new Point(1, 2);

As part of evaluating a new expression, our language invokes the init method on the new instance with the arguments supplied. Here’s what happens when the expression new C(e1, …, en) is evaluated:

Statements and Expressions

Our language supports the kinds of statements and expressions that you’ll find in a typical OO language: the abilities to send a message to an object and access / update the value of an instance variable, etc. The init methods above illustrate assignment to instance variables, for example.

The next section includes a complete list of the statements and expressions in our language. We’ll describe the concrete and abstract syntax of each construct, as well as its expected behavior.

Assignment 10: The Base Language
Due Tuesday, March 31st or April 7th, at 11:59pm
Turn in your trans.js (and, if needed, an updated classes.js) through Canvas. Please turn in either this assignment or Assignment 9 by the earlier due date, March 31.

You know the drill. Here’s what the concrete syntax of the base language looks like, and how we’ll represent it as abstract syntax in JavaScript:

Concrete Syntax                            JS AST

p ::= s1 … sn                              new Program([s1, …, sn])
      Evaluates to the last statement, if it’s an “expression statement” (see below), or null otherwise.

s ::= class C extends S with x1, …, xn;    new ClassDecl(C, S, [x1, …, xn])
      def C.m(x1, …, xn) { s1 … sm }       new MethodDecl(C, m, [x1, …, xn], [s1, …, sm])
      var x = e;                           new VarDecl(x, e)
      x = e;                               new VarAssign(x, e)
      this.x = e;                          new InstVarAssign(x, e)
      return e;                            new Return(e)
      e;                                   new ExpStmt(e)

e ::= primValue                            new Lit(primValue)
      x                                    new Var(x)
      e1 op e2                             new BinOp(op, e1, e2)
      this                                 new This()
      this.x                               new InstVar(x)
      new C(e1, …, en)                     new New(C, [e1, …, en])
      erecv.m(e1, …, en)                   new Send(erecv, m, [e1, …, en])
      super.m(e1, …, en)                   new SuperSend(m, [e1, …, en])

op ∈ {+, -, *, /, %, <, >, ==, !=}
      These operators have the same semantics as they do in JavaScript.

x, m, C, S ::= an identifier, e.g., sum    a string, e.g., "sum"
primValue ::= a JavaScript number, boolean, or string literal, or null

If you’re not under the influence of a controlled substance, chances are you noticed that our language doesn’t have any control structures like if statements and while loops. There is a very good reason for this, and we’ll tell you all about it in Assignment 11. Stay tuned!

Your job is to write a translator from our language to JavaScript. The translator will be a function called trans that takes the AST of a program and returns a string containing the JavaScript code generated from that AST:

    function trans(ast) {
      // do your thing!
    }

Please do all your work in the file called trans.js. Note that your implementation should not use the eval feature of JavaScript (though unsurprisingly, our test code does).

Some Tips for Writing the Translator

It’s up to you to design an appropriate translation strategy, i.e., a mapping from our language to JavaScript that will give the translated programs the desired behavior. Here are a few tips to help you get started.

Re-read the section about the “class sugar” in our JavaScript Primer. The desugaring of the class syntax in JavaScript, which we explained in detail, is almost exactly what we’re asking you to implement in this assignment. So take another look, see how classes are represented, how super-sends work, etc. (You’ll have to think a little bit more about how to represent instance variables, which work differently in our language.)

Divide and conquer. Similar to the evaluators you wrote in earlier assignments, it is natural for your translation to be compositional. That is, the translation of an expression or statement should be defined in terms of the translations of its subparts (other expressions and statements). This leads to a nice recursive solution, and it also ensures that you allow the subparts themselves to be arbitrarily complex. For example, the arguments to a message send can be arbitrary expressions, including other message sends.
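As an illustration of what “compositional” means here, the following sketch translates a small fragment of the expression language (literals, variables, and binary operators) to a string of JavaScript. The helper name transExp and the AST field names (value, name, op, left, right) are assumptions made for this example; the assignment only fixes the constructor names and argument order.

```javascript
// Hypothetical AST node classes. The field names are assumptions for this sketch.
class Lit { constructor(value) { this.value = value; } }
class Var { constructor(name) { this.name = name; } }
class BinOp {
  constructor(op, left, right) { this.op = op; this.left = left; this.right = right; }
}

// Translate an expression AST into a string of JavaScript source code.
function transExp(e) {
  if (e instanceof Lit) {
    // JSON.stringify yields a valid JS literal for numbers, booleans, strings, and null.
    return JSON.stringify(e.value);
  }
  if (e instanceof Var) {
    return e.name;
  }
  if (e instanceof BinOp) {
    // The recursive calls handle arbitrarily nested subexpressions;
    // the parentheses preserve the grouping of the tree in the output.
    return "(" + transExp(e.left) + " " + e.op + " " + transExp(e.right) + ")";
  }
  throw new Error("unsupported AST node");
}

const sumTimesTwo = new BinOp("*", new BinOp("+", new Lit(1), new Var("x")), new Lit(2));
console.log(transExp(sumTimesTwo)); // prints ((1 + x) * 2)
```

Each case only knows how to translate its own node and delegates the rest, so adding a new kind of expression means adding one more case rather than revisiting the existing ones.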

Unit Tests

We have included some unit tests for your translator below. As in the previous assignments, you can add your own test cases by editing asst10-tests.js.

Assignment 11
Due Tuesday, April 14th, at 11:59pm
Turn in your trans.js (and, if needed, an updated classes.js) through Canvas.

Part I: Look ma, no primitives!

In mainstream “object-oriented” languages like Java and C++, primitive values like 5 and true are not real objects. This is unfortunate because (among other things) it often forces programmers to write code in an unnatural way. Here are a couple of examples:

We hope that, as an aspiring language designer, you get the heebie-jeebies from this lack of uniformity, and we know you can do better! It shouldn’t matter how an integer is represented at the language implementation level. Our job is to help programmers, and we shouldn’t expose them to implementation details that make programming more complicated than it has to be.

In this homework assignment, you will modify your translator to make our language “purely” object-oriented, i.e., a language in which everything is an object. As we’ll see, this has some really nice benefits for expressiveness.

As a first step toward supporting pure OO programming, modify your implementation so that JavaScript’s primitive numbers, booleans, strings, and null can be used as first-class objects. Here’s what you’ll have to do in order to make that possible:

Here are a few unit tests for this part of the assignment. To add your own test cases, just edit asst11-tests-part1.js.
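For Part I, one way to let primitives receive messages is to compute a class name for every run-time value and dispatch on it. The sketch below is only an illustration under stated assumptions: “Number”, “True”, and “False” appear in this handout, whereas “Str” and “Null” are placeholder class names, and classNameOf, methodTable, send, and the methods double, shout, and isNull are all hypothetical.

```javascript
// Map any run-time value to the name of its class in our language.
// "Str" and "Null" are placeholder names assumed for this sketch.
function classNameOf(value) {
  if (value === null) return "Null";
  switch (typeof value) {
    case "number": return "Number";
    case "boolean": return value ? "True" : "False";
    case "string": return "Str";
    default: return value.className; // ordinary instances carry their class name
  }
}

// Hypothetical method tables keyed by class name. Each method takes the
// receiver explicitly as its first argument, which avoids relying on how
// JavaScript binds `this` for primitive values.
const methodTable = {
  Number: { double: function (self) { return self * 2; } },
  Str: { shout: function (self) { return self + "!"; } },
  Null: { isNull: function (self) { return true; } },
};

function send(recv, selector, ...args) {
  const methods = methodTable[classNameOf(recv)] || {};
  const method = methods[selector];
  if (method === undefined) throw new Error("message not understood: " + selector);
  return method(recv, ...args);
}

console.log(send(5, "double"));    // prints 10
console.log(send("hi", "shout"));  // prints hi!
console.log(send(null, "isNull")); // prints true
```

With this kind of dispatch, the translated code never needs to know whether a receiver is a primitive or an ordinary instance; only the run-time helper does.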

Part II: Blocks

Borrowing from Smalltalk, our language also includes blocks, which are essentially an object-oriented version of lambdas, a.k.a. first-class functions. Here are some examples:
{1 + 2} is a block with no arguments.
{ x, y | x + y } is a block with two arguments, x and y.
{ x | x.m(); x.n(); } is a block with one argument whose body consists of multiple statements.

In general, a block can have any number of declared arguments and its body can consist of any number of statements. When the last statement is an expression statement, the semicolon at the end is optional.

Concrete Syntax                            JS AST

e ::= { x1, …, xn | s1 … sm }              new BlockLit([x1, …, xn], [s1, …, sm])

You evaluate a block by sending it a call message, to which you can pass the appropriate arguments. Unlike in a method body, which requires an explicit return statement, a block implicitly returns the value of its last statement, if it’s an expression statement, or null otherwise. Here are some examples:
{1 + 2}.call() should evaluate to 3.
{ x, y | x * y }.call(6, 7) should evaluate to 42.
{ x | x.m(); x.n(); }.call(someObj) should result in calling someObj’s m method, then someObj’s n method, and evaluate to the result of the latter.
{ 1 + 2; var x = true; }.call() should evaluate to null.

Just like lambdas, blocks can reference variables from their surrounding scope. A block also acts as a kind of lexical scope: any variable declarations that are made inside a block are not visible outside it. Conveniently, JavaScript’s functions have both of these properties…

So you can (and should!) avoid the need to implement the semantics of closures and lexical scopes from scratch by translating blocks to plain old JavaScript functions. As with the treatment of numbers, strings and booleans, you will need to add a class for blocks (Block) that supports a call method.
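The following sketch shows what translating blocks to plain JavaScript functions might look like. The shape of the Block class (a wrapper with an fn field) and the exact code emitted for each block are assumptions made for illustration, not a prescribed output format.

```javascript
// Hypothetical run-time class for blocks: a thin wrapper around a JS function.
class Block {
  constructor(fn) { this.fn = fn; }
  call(...args) { return this.fn(...args); }
}

// What a translator might emit for:  { x, y | x * y }
// The last statement is an expression statement, so its value is returned.
const multiply = new Block(function (x, y) { return x * y; });

// What a translator might emit for:  { 1 + 2; var x = true; }
// The last statement is not an expression statement, so the block yields null.
const yieldsNull = new Block(function () { 1 + 2; var x = true; return null; });

// Closures come for free: this function captures `counter` from the
// surrounding scope, as the block  { counter = counter + 1; counter }  would.
let counter = 0;
const increment = new Block(function () { counter = counter + 1; return counter; });

console.log(multiply.call(6, 7)); // prints 42
console.log(yieldsNull.call());   // prints null
increment.call();
console.log(increment.call());    // prints 2
```

Variables declared inside the generated function, such as x in the second block, are invisible to the surrounding code, which gives the block its own lexical scope without any extra work.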

Roll Your Own Control Structures

You’ve probably noticed that our language lacks control structures, e.g., it doesn’t have if or while statements. It turns out we don’t need any built-in control structures because it’s straightforward for programmers to define their own, as ordinary methods. This power comes from a combination of purity (the fact that everything in our language is an object) and support for open classes (the fact that a programmer can add new methods to any class in the system).

For example, an if-then-else “statement” can be defined as a method thenElse on Bools that takes two blocks as arguments, one for each branch of the conditional. With appropriate implementations for the classes True and False, it is now possible to write conditionals like the following:

    (x > 0).thenElse(
      { x = 2*x },
      { x = x * -1 })

With the syntactic sugar that we saw in class you could write the following expression, which is equivalent to the one shown above:

    x > 0 then: { x = 2 * x } else: { x = x * -1 }
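Here is a rough sketch of how thenElse could work once booleans are objects: True’s method calls the first block and False’s method calls the second, so the choice is made by method dispatch rather than by a built-in conditional. The names boolClassName, boolMethods, and sendToBool are hypothetical, and the Block class is the same minimal stand-in assumed elsewhere.

```javascript
// Minimal stand-in for the run-time Block class (an assumption of this sketch).
class Block {
  constructor(fn) { this.fn = fn; }
  call(...args) { return this.fn(...args); }
}

// Map a JavaScript boolean to its class name by table lookup.
const boolClassName = { true: "True", false: "False" };

// Hypothetical method tables for True and False. Each class simply calls
// the block for its own branch; the receiver is passed explicitly as `self`.
const boolMethods = {
  True: { thenElse: function (self, thenBlock, elseBlock) { return thenBlock.call(); } },
  False: { thenElse: function (self, thenBlock, elseBlock) { return elseBlock.call(); } },
};

// Simplified send that only handles boolean receivers.
function sendToBool(recv, selector, ...args) {
  return boolMethods[boolClassName[recv]][selector](recv, ...args);
}

// Possible translation of:  (x > 0).thenElse({ x = 2*x }, { x = x * -1 })
let x = -21;
sendToBool(x > 0, "thenElse",
  new Block(function () { x = 2 * x; return null; }),
  new Block(function () { x = x * -1; return null; }));
console.log(x); // prints 21
```

Only the block belonging to the chosen branch is ever called, which is why the branches have to be passed as blocks rather than as already-evaluated expressions.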

Semantics of return Inside a Block

As mentioned earlier, a block implicitly returns the value of its last expression statement. Sometimes it is more natural for a block to directly return from its enclosing method; this is especially the case when blocks are used to implement control structures. In our language, the return statement inside a block acts as such a non-local return. For example, here is an implementation of the absolute value method for Numbers:

    def Number.abs() {
      this >= 0
        then: { return this; }
        else: { return this * -1; }
    }

When a return statement is executed in the above code, it returns the associated value from the abs method itself, and returns control to the caller of abs, rather than just returning from the block. While it may seem like there are two different kinds of return in our language, this isn’t really the case. A return inside a block means exactly the same thing as a return inside a method: return this value from (this particular activation of) the enclosing method.

One interesting issue is how to treat non-local returns in cases where the block is passed around before it is called. In our language, it’s a run-time error to try to execute a return from a block whose enclosing method has already returned. Otherwise, it is OK for a block to execute a return, regardless of where on the call stack the enclosing method’s activation record is. For instance, in the absolute value example above, a return causes the activation records for Block’s call method and Bool’s thenElse method to be popped off the stack, and the return value is then associated with the original call to Number’s abs method. Here’s another example:

    def Object.m() {
      var b = { return 5; };
      return this.n(b) * 2;
    }
    def Object.n(aBlock) {
      aBlock.call();
      return 42;
    }
    new Object().m();  // evaluates to 5

To implement non-local return properly, the stack must be “walked,” popping off stack frames until the right activation record is found. Hint: Exceptions already walk the stack, so it is natural to use them to implement non-local returns. The main difficulty is to ensure that a return is always associated with the correct method invocation.
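The hint about exceptions can be sketched as follows. This is one possible scheme, not the required one: every method invocation gets a fresh activation token, a return inside a block throws an exception tagged with the token of its enclosing method, and each method catches only exceptions carrying its own token. The names NonLocalReturn, runMethod, and nonLocalReturn are hypothetical, and the functions m and n are hand-written stand-ins for translated methods.

```javascript
// Minimal stand-in for the run-time Block class (an assumption of this sketch).
class Block {
  constructor(fn) { this.fn = fn; }
  call(...args) { return this.fn(...args); }
}

// Hypothetical exception type that carries a non-local return.
class NonLocalReturn {
  constructor(activation, value) { this.activation = activation; this.value = value; }
}

// Run a method body with a fresh activation token (unique per invocation).
function runMethod(body) {
  const activation = { live: true };
  try {
    return body(activation);
  } catch (e) {
    if (e instanceof NonLocalReturn && e.activation === activation) {
      return e.value; // this return belongs to this activation
    }
    throw e; // belongs to another activation, or is a real error: keep unwinding
  } finally {
    activation.live = false; // the method has now returned
  }
}

// Emitted for a `return` inside a block whose enclosing method owns `activation`.
function nonLocalReturn(activation, value) {
  if (!activation.live) {
    throw new Error("return from a block whose enclosing method has already returned");
  }
  throw new NonLocalReturn(activation, value);
}

// Stand-in for:  def Object.n(aBlock) { aBlock.call(); return 42; }
function n(aBlock) {
  return runMethod(function (activation) {
    aBlock.call();
    return 42;
  });
}

// Stand-in for:  def Object.m() { var b = { return 5; }; return this.n(b) * 2; }
function m() {
  return runMethod(function (activation) {
    const b = new Block(function () { return nonLocalReturn(activation, 5); });
    return n(b) * 2;
  });
}

console.log(m()); // prints 5
```

The exception passes through n because its token does not match, so neither “return 42” nor the multiplication by 2 ever runs. A block that outlives its method finds its token marked as no longer live and raises an ordinary error instead.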

Here are a few unit tests for this part of the assignment. To add your own test cases, just edit asst11-tests-part2.js.

Epilogue

If you're interested in pushing this project further, here are a few ideas you might try:

Recommended Reading