Project 1 — Shell Scripting

CS-340: Software Engineering

Spring 2023

For project 1, you will demonstrate your mastery of shell scripting by developing a framework for automatically testing a piece of software. You will then use this framework to analyze faults in a real-world piece of code and identify patterns in the output.

Testing is a critical aspect of the software development process. Often, this involves running a program against a set of inputs and comparing the output with known results. Because there are often many tests, automating this running and comparing can significantly reduce the time needed to identify problems in the code.
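For a single test, this run-and-compare step can be done with standard utilities. A minimal sketch (the file names here are illustrative, not from the sample data):

```shell
# Run a program, capture its output, and compare against a known-good file.
# `cmp -s` is silent and exits 0 only when the two files are identical.
printf '1981\n' > expected.txt
printf '1981\n' > actual.txt    # in practice: ./program input.txt > actual.txt

if cmp -s actual.txt expected.txt; then
    echo "PASS"
else
    echo "FAIL"
fi
```

A test framework is essentially this comparison repeated over every input/output pair.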

For this project, you are being asked to work with a program you have not written. This means that some of your time will be spent figuring out how to use the program. Further, this assignment is intentionally underspecified; you will need to infer some of the requirements for achieving full marks by reasoning about expectations and by using feedback from the autograder.

Requirements

Write a bash script named run_tests.sh that:

  • Takes a single command-line argument of the program to run
  • Iterates over all of the test input files in tests/inputs
  • Runs the given program with the input file as the first command line argument
  • Compares the output with the associated output file in tests/outputs, and
    • Outputs TEST {#}: PASS when the comparison matches
    • Outputs TEST {#}: FAIL when the comparison does not match
    • In both cases, {#} is the number of the test (without curly braces)
  • Produces no other output on stdout
  • Takes no longer than 5 minutes to run all tests
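One possible shape for such a script is sketched below. Note that the input/output file-naming scheme used here is an assumption; check the provided sample data and match whatever scheme it actually uses.

```shell
#!/bin/bash
# Sketch of run_tests.sh: run the program given as $1 against every
# input in tests/inputs and compare with the matching expected output.
# ASSUMPTION: each tests/outputs file has the same basename as its input.

program="$1"
num=1

for input in tests/inputs/*; do
    expected="tests/outputs/$(basename "$input")"
    if "$program" "$input" | cmp -s - "$expected"; then
        echo "TEST $num: PASS"
    else
        echo "TEST $num: FAIL"
    fi
    num=$((num + 1))
done
```

This is a starting point, not a complete solution; you will still need to verify it against every requirement above.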

Assumptions

You may reasonably assume that:

  • The tests directory is located in the same directory as your script
  • The structure of the test directory will always match the naming scheme of the provided sample
  • There will be 100 tests
  • Not all tests will be passing
  • It is possible to determine if all tests pass/fail within the given time limit
  • The server has basic command line utilities

You may not assume that:

  • The program being run has a specific name or location
  • Your script will always be run on the same set of tests
  • You can install additional utilities

Sample Data

 Download p1.tar.gz

You are provided with a sample program and data to use for testing your script. There are two directories in this archive, bin and tests. The bin directory contains a single program compiled for Ubuntu 20.04 (x86-64) and newer, so it will execute on the operating system running on your cslinuxlab server.

The program, days_to_years, takes a single file as its argument. The file contains a single integer representing a number of days counted from 1980, and the program outputs the year in which that day falls. For example, running the following commands produces the output shown:

              
    $ bin/days_to_years
    Usage: bin/days_to_years dayfile
        dayfile is a text file containing a single number

    $ echo 0 > days_0.txt
    $ bin/days_to_years days_0.txt
    1980

    $ echo 375 > days_375.txt
    $ bin/days_to_years days_375.txt
    1981

This code is taken from a commercial music player produced from 2006–2011. At midnight on December 31, 2008, many of these music players froze as a result of this code.

The tests directory contains a number of input files along with the associated correct output for each input. These tests follow the directory structure specified above.

Written Report (README)

You must also write a short two-paragraph report reflecting on your experience creating a shell script to test an unknown piece of software.

The first paragraph should address your design decisions and implementation strategies for this assignment. You might consider addressing how your script handles each of the requirements (or, if you fail to meet some of the requirements, why that might be). If particular aspects of this script took you a while to perfect, those might be good things to discuss.

The second paragraph should analyze the provided sample program. Discuss which inputs fail. Do you notice any commonalities between the inputs that fail? Are there certain failure modes that you observe? If you have a hypothesis about what is going wrong, share your thoughts here.

Submit this report either as a UTF-8 text file or a PDF. For a UTF-8 file, wrap the text at no more than 80 columns. Any other format may cause you to lose points on this assignment.

Commentary

Nearly all of this assignment can be completed by composing tools we have discussed and used together in class. However, you might find that certain aspects of this project require you to research additional tools or settings.
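As one example of a tool you may need to research (whether you actually need it depends on how the sample program behaves), coreutils provides timeout, which kills a command that exceeds a time limit:

```shell
# `timeout` (GNU coreutils) kills a command that runs past a time limit.
# Exit status 124 indicates the command was cut off before finishing.
timeout 1 sleep 3
echo "exit status: $?"   # prints: exit status: 124
```

Guarding each test run this way is one strategy for staying under an overall time budget even when some input hangs the program.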

The instructor solution for this project is about 20 lines of code. You may be able to complete the assignment with fewer (or you may need significantly more). Instead of targeting a specific line count, consider the quality of your script. How well can you reuse what you write for other projects? How much of it makes sense?

Further, while the solution is not particularly long, each line may require deep thought. Do not wait to begin this assignment.

Using the Autograder to Your Advantage

It is NOT in your best interest to wait until the last minute to submit your project to Gradescope. The autograded portion of the assignment will give you (minimal) feedback each time you submit.

By submitting early and often, you receive feedback that can help guide the next steps of developing your script. Try to write a script that always runs successfully, then incrementally refine it to produce the valid output. For example, you might start by iterating over all of the tests and printing bogus output. Does Gradescope say your output has an incorrect number of lines? Next, you might make the output formatted correctly, or you might try to actually use the test input without crashing.
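The "bogus output" starting point described above can be this small; the fixed count of 100 comes from the stated assumptions:

```shell
#!/bin/bash
# Deliberately incomplete first submission: correct line count and
# format, but every result is bogus. Refine from here.
for num in $(seq 1 100); do
    echo "TEST $num: FAIL"
done
```

Each subsequent submission replaces a piece of the bogus behavior with real logic, using the autograder's feedback to confirm the change.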

In general, incremental programming is a good habit to develop. It will help you both in your classes and in future jobs. Students who submit incrementally tend to receive higher grades because they give themselves time to fix errors exposed by the autograder and receive more feedback on each assignment.

What to Submit for P1

You must submit the following files to Gradescope:

  • readme.txt or readme.pdf: your README file
  • references.txt: your file of citations
  • run_tests.sh: your shell script

You can submit these by dragging the files into the submission window on the Gradescope interface.

Grading Rubric

P1 Grading (out of 40 points):

  • 30 points: autograder tests, broken down as:
    • 10 — 1 point for each failing test your script finds
    • 10 — correctly identifying all passing tests (-1 point for each test you fail to correctly identify)
    • 5 — formatting is correct
    • 5 — script runs within 5 minutes
  • 5 points: README, one of the following:
    • 5 — a two-paragraph report reflecting on your activities creating a shell script for automated testing. One paragraph describes your design of the script. The second paragraph reflects on the bug(s) revealed by the test suite in the sample data.
    • 4 — a reasonable report, but lacking a solid description of one or more aspects or significantly exceeding the length limit.
    • 3 — a brief report, detailing only half of the required information, but mentioning both required topics.
    • 2 — a brief report, detailing only half of the required information.
    • 1 — a terse or uninformative report.
    • 0 — No README file provided.
  • 2 points: References, one of the following:
    • 2 — Standalone references file exists and contains a list of resources used for this assignment. This includes course materials.
    • 1 — References file exists, but might not be separate or does not adequately identify resources used for this assignment.
    • 0 — No references file provided.
  • 3 points: Coding Style and Comments, one of the following:
    • 3 — script is well-commented with spacing between separate tasks.
    • 2 — script is generally well-written, but key tasks are uncommented or spacing is not uniform (e.g., large gaps, old commented code, etc.).
    • 1 — script is very difficult to follow and contains the bare minimum of comments.
    • 0 — script contains no comments.