Programming assignments 3 through 5 will direct you to design and build an interpreter for snail. Each assignment covers one component of the interpreter: lexical analysis, parsing, or operational semantics. Each assignment will ultimately result in a working interpreter phase that can interface with the other phases.
For this assignment you will write a parser using a parser generator. You will describe the snail grammar in an appropriate input format, and the parser generator will generate actual code (in Python). You will also write additional code to deserialize the tokens produced by the lexer stage and to serialize the abstract syntax tree produced by your parser.
You will be implementing this stage of the interpreter in the Python programming language. You may work in a team of two people for this assignment.
You must create a minimum of four artifacts:

1. A parser that reads the SL-Lex format (as described in the language specification) from standard input and produces an SL-AST-formatted abstract syntax tree. The SL-Lex data will always be well-formed (i.e., there will be no syntax errors in the SL-Lex file itself). However, the SL-Lex file may describe a sequence of snail tokens that do not form a valid snail program. Your parser must either reject the SL-Lex data (e.g., report a parse error in the snail file) or emit an SL-AST-formatted serialized snail abstract syntax tree. Your parser is invoked as shown on the first line below; the second line is the equivalent reference interpreter invocation:
python3 parser.py < file.sl-lex > file.sl-ast
snail --parse file.sl
Note that < performs input redirection to load standard input from a file, and > performs output redirection to save standard output to a file.
2. A file readme.txt describing your design decisions and choice of test cases. See the grading rubric. A few paragraphs should suffice.

3. A file references.txt providing a citation for each resource you used (excluding class notes and assigned readings) to complete the assignment. For example, if you found a Stack Overflow answer helpful, provide a link to it. Additionally, provide a brief description of how the resource helped you.

4. Test cases test-1-XXX.sl through test-N-XXX.sl, which exercise parsing corner cases. Your test cases will be evaluated against buggy implementations of the snail parser.
Your parser should conform to the syntax and SL-AST specification provided by the snail documentation.
To report an error, write an error message to standard output and terminate the program with exit code 0. You may write whatever you want in the message, but it should be fairly indicative of the problem.
For example, an SL-Lex file might describe a token sequence such as

class Cons : List + IO {

which your parser must accept or reject according to the snail grammar.
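A minimal sketch of this reporting convention (the message wording here is illustrative, not a required format; the required behavior is printing to standard output and exiting with code 0):

    import sys

    def report_parse_error(lineno, detail):
        # The message text is up to you, but make it indicative.
        print(f"parse error near line {lineno}: {detail}")
        sys.exit(0)  # exit code 0 is required even when reporting an error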
If there are no errors in file.sl-lex, your program should print the abstract syntax tree in SL-AST format to standard output (the default output of print()). This is the same format used by the reference compiler to create a file.sl-ast file when invoked with the --parse flag. You can find a detailed specification of this format here.
Note that the reference implementation (along with your upcoming P5 interpreter!) will read SL-AST input instead of .sl files.
The SL-AST format is quite verbose, but it is particularly easy for the interpreter to read back in using library code for JSON. It will also make it particularly easy for you to notice where things are going awry if your parser is not producing the correct output.
Outputting SL-AST should be fairly trivial using Python's json module, assuming you construct reasonable objects or dictionaries as part of parsing.
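For instance, a sketch along these lines (the node and field names are hypothetical; the real shapes are given in the SL-AST specification):

    import json

    # Hypothetical node shapes built during parsing; consult the SL-AST
    # specification for the actual field names and structure.
    ast = {
        'classes': [
            {'name': 'Main', 'features': []},
        ],
    }
    print(json.dumps(ast, indent=2))  # SL-AST is JSON, so json handles the output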
Note that SLUGS assumes that objects in arrays are emitted in the same order they appear in the source file. This is more strict than the SL-AST spec, but it makes for easier comparison with your output.
You must use a parser generator or similar library for this assignment. In class, we discussed Ply, a parser generator for Python. You will find the documentation to be particularly helpful.
Ply is pre-installed in the course development environment. Outside of this, the easiest way to get a recent version of Ply is to download and install it using Python's package index:
python3 -m pip install --user ply
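To illustrate how pre-lexed tokens can be handed to Ply, here is a rough sketch using a toy grammar (the token names, grammar rules, and node shapes are placeholders, not snail's). Ply's parse() accepts any object with a token() method, which is one way to replay the tokens you deserialize from a .sl-lex file:

    import ply.lex as lex
    import ply.yacc as yacc

    tokens = ('INTEGER', 'PLUS')        # placeholder token names
    precedence = (('left', 'PLUS'),)    # resolve the expr PLUS expr ambiguity

    def p_expr_plus(p):
        'expr : expr PLUS expr'
        p[0] = {'kind': 'plus', 'left': p[1], 'right': p[3]}

    def p_expr_int(p):
        'expr : INTEGER'
        p[0] = {'kind': 'integer', 'value': p[1]}

    def p_error(p):
        # Message text is illustrative; exiting with code 0 matches the spec.
        print(f"parse error near line {p.lineno if p else '?'}")
        raise SystemExit(0)

    def make_token(ttype, value, lineno):
        # Build a LexToken by hand, e.g. from lines read out of a .sl-lex file.
        t = lex.LexToken()
        t.type, t.value, t.lineno, t.lexpos = ttype, value, lineno, 0
        return t

    class TokenFeeder:
        # Any object with a token() method can stand in for a Ply lexer.
        def __init__(self, toks):
            self._it = iter(toks)
        def token(self):
            return next(self._it, None)  # None tells yacc the input is done

    parser = yacc.yacc()
    toks = [make_token('INTEGER', 1, 1),
            make_token('PLUS', '+', 1),
            make_token('INTEGER', 2, 1)]
    print(parser.parse(lexer=TokenFeeder(toks)))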
In this project, we continue to incorporate a form of test-driven development (or mutation testing) into our software development process and require you to construct a high-quality test suite.
The goal is to leave you with a high-quality test suite of programs that you can use to evaluate your own P4 parser. While you can check for correct "positive" behavior by comparing your parser's output to the reference interpreter's output on the usual "good" snail programs, it is comparatively harder to check for "corner case" behavior.
SLUGS contains 9 variants of the Python-based parser, each with a secret, intentionally-introduced defect related to parsing. A high-quality test suite is one that reveals each introduced defect by showing a difference between the behavior of the true reference interpreter and the corresponding buggy version. You want a high-quality test suite to help you gain confidence in your own P4 submission.
For each bug, if one of your tests causes the reference and the buggy version to produce different output (that is, different stdout), you win: that test has revealed that bug. For full credit, your tests must reveal all 9 unknown defects.
SLUGS will tell you the correct output for test cases you submit that reveal bugs in your own implementation of P4. This is the same information you can determine by comparing your output with that of the reference compiler.
Tests should include a snail program named test-n-XXX.sl (XXX can be anything) where 1 ≤ n ≤ 20.
Your test files may contain no more than 2048 characters in any one file (including comments and whitespace). You may submit up to 20 tests (though it is possible to get full credit with fewer). Note that the tests the autograder runs on your solution are NOT limited to 2048 characters in a file, so your solution should not impose any size limits (as long as sufficient system memory is available).
You can do basic testing as follows:
snail --lex file.sl
snail --parse file.sl
python3 parser.py < file.sl-lex > generated.sl-ast
Since P4 prints SL-AST to standard output (rather than writing it to a file), we use a redirect to save this output to a file (> generated.sl-ast).
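If you want to automate this check across all of your tests, a rough sketch like the following may help. It assumes each test parses successfully (so the reference actually writes a .sl-ast file) and that snail and parser.py are reachable from the current directory:

    import glob, json, subprocess

    def normalize(text):
        # Compare as JSON when possible so whitespace and key order do not
        # matter; fall back to raw text (error output is not JSON).
        try:
            return json.dumps(json.loads(text), sort_keys=True)
        except ValueError:
            return text.strip()

    for sl in sorted(glob.glob('test-*.sl')):
        stem = sl[:-len('.sl')]
        subprocess.run(['snail', '--lex', sl], check=True)    # writes <stem>.sl-lex
        subprocess.run(['snail', '--parse', sl], check=True)  # writes <stem>.sl-ast
        with open(stem + '.sl-lex') as f:
            mine = subprocess.run(['python3', 'parser.py'], stdin=f,
                                  capture_output=True, text=True).stdout
        with open(stem + '.sl-ast') as f:
            ref = f.read()
        print(('PASS ' if normalize(mine) == normalize(ref) else 'FAIL ') + sl)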
You may find the reference interpreter's --unparse option useful for debugging your .sl-ast files.
Checking That Your SL-AST Conforms to the Spec

You can use a Python module to verify that your output conforms to the schema provided in the snail specification. Be sure to download and place a copy of the schema in your project.
python3 -m pip install --user jsonschema
python3 -m jsonschema --instance generated.sl-ast sl-ast.schema.json
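Equivalently, a short sketch that runs the same check from Python (file names taken from the commands above):

    import json
    from jsonschema import validate, ValidationError

    with open('sl-ast.schema.json') as f:
        schema = json.load(f)
    with open('generated.sl-ast') as f:
        instance = json.load(f)

    try:
        validate(instance=instance, schema=schema)  # raises on violation
        print('generated.sl-ast conforms to the schema')
    except ValidationError as err:
        print('schema violation:', err.message)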
There are various options for comparing your output to the reference implementation. The easiest approach is likely to use the JSON Diff tool.
There is also a command-line tool, jq, which can help with comparing JSON files (it can be installed in the development container with sudo apt install jq). Then you can run:
diff -b -B -E -w <(jq -S . file.sl-ast) <(jq -S . generated.sl-ast)
Video guides are provided to help you get started with various aspects of this project. Note that there may be errors and mistakes in these videos. You should not blindly follow the steps taken in these videos!
You must turn in the following files:
- readme.txt: your README file
- references.txt: your file of citations
- parser.py (contains the main entry point for your parser)
- parsetab.py
- test-N-XXX.sl, where N is a number in the range 1 through 20 and XXX is any descriptive text without spaces. Each test file must contain at most 2048 characters (i.e., wc -c yourtest.sl says 2048 or less). You want each of your test cases to be meaningful so that it helps you narrow down a bug later.
You will submit your files to Gradescope. The easiest way to do this for the current assignment is to drag the individual files to the submission page on Gradescope.
You may complete this assignment in a team of two. Teamwork imposes burdens of communication and coordination, but has the benefits of more thoughtful designs and cleaner programs. Team programming is also the norm in the professional world.
Students on a team are expected to participate equally in the effort and to be thoroughly familiar with all aspects of the joint work. Both members bear full responsibility for the completion of assignments. Partners turn in one solution for each programming assignment; each member receives the same grade for the assignment. If a partnership is not going well, the instructor will help to negotiate new partnerships. Teams may not be dissolved in the middle of an assignment.
If you are working in a team, one team member should upload a single submission to Gradescope and add both members to the submission. There is an option to add members to a group once you upload a submission. Both students in the team will receive the same grade for their submission.
P4 Grading (out of 60 points):