Embeddable Spreadsheet -- Software Requirements Specification

Steven J Zeil:

Last modified: Dec 13, 2016
Contents:

1 Introduction

1.1 Purpose

This document provides the software requirements specification for the embeddable spreadsheet project. It is intended to serve as a sample SRS for CS350 students.

1.2 Scope

The embeddable spreadsheet is an implementation of the familiar computational model of a spreadsheet.

The principle component of this project is a library (API) that can be incorporated into other applications to add a spreadsheet-like capability.

Command-line and GUI interfaces will also be provided, but are secondary products, as their purpose is to add convenience in exploiting the library.

1.3 Definitions, acronyms, and abbreviations

Spreadsheet: a computational model generally envisioned as a rectangular arrangement of cells. Expressions associated with these cells create an implicit order dependency in which the cells must be evaluated: If a cell A contains an expression with references to cells B1, B2, … BN, then A must be evaluated after the evaluation of B1, B2, … BN.

Cell: A container that can be empty or hold an expression (possibly referencing other cells). If the cell contains an expression, then it also has a current value denoting the evaluation of that expression.

Value: The result of evaluating an expression. Values can be of various kinds, including numeric, string, and various artificial values used to indicate problems in the calculation.

direct dependence: A cell C that has been (most recently) assigned an expression E is considered to be directly dependent on any cells referenced within E.

dependence: A cell C is said to be dependent on any cells within the transitive closure of the direct dependence relationship defined above.

indirect dependence: A cell C is said to be indirectly dependent on any cell B if C depends on B but is not directly dependent on B.

1.4 References

1.4.1 Spreadsheets

zeil15a
Embeddable Spreadsheet – Requirements Definition, Steven Zeil, CS 350 lecture notes, Dec 21, 2019
wikiSS
Spreadsheet, http://en.wikipedia.org/wiki/Spreadsheet, accessed Dec 16, 2016
zeil14a
CS350: Dynamic Binding: Class-Appropriate behavior, https://secweb.cs.odu.edu/~zeil/cs330/s14/Public/dynamicBinding/dynamicBinding__html.html#example:evaluatingacell, Section 4.2, Feb 24, 2014
zeil14b
CS350: The Observer Pattern, https://secweb.cs.odu.edu/~zeil/cs330/s14/Public/idioms/idioms__html.html#theobserverpattern, Mar 11, 2014
zeil15b
CS361: The Topological Sort, https://www.cs.odu.edu/~zeil/cs361/sum15/Public/graphalgorithms/#topological-sort, Jul 02, 2015

1.4.2 Parsing

zeil03a
CS355: Basic Components - Lexical, Steven Zeil, http://www.cs.odu.edu/~zeil/cs355/Lectures/1components/lexical_summary.pdf, Aug 28, 2003
zeil03b
CS355: Basic Components - Syntax, Steven Zeil, http://www.cs.odu.edu/~zeil/cs355/Lectures/1components/syntax_summary.pdf, Sep 7, 2001

1.4.3 Parser Generators

zeller05
Zeller & Kinke, Essential Open Source Toolset, 2005, Wiley, ch 5–7

1.5 Overview

Section 2 of this document describes the Embeddable Spreadsheet_ project in general terms.

Section 3 presents the detailed requirements, organized by feature.

The appendices are devoted to the grammar for the spreadsheet expression language and a list of functions related to it.

2 Overall description

2.1 Product perspective

The spreadsheet is a now familiar model of calculation. Spreadsheets are typically envisioned as a rectangular arrangement of cells, each cell containing a mathematical expression. These expressions frequently contain references to other cells, and it is a defining characteristic of the spreadsheet model that the resulting computational order dependencies are resolved automatically rather than by explicit programming (e.g., as in a general programming language like C++ or Java). [zeil15a]

Commonly, the GUI for a spreadsheet is often seen as integral to this model. But, in fact, there are many applications that could benefit from the ability to incorporate user-supplied calculations. The spreadsheet model suggests a means of doing so without requiring the embedding of an entire scripting language.

The embeddable spreadsheet is envisioned as a library that can be incorporated into applications to add a spreadsheet-like capability.

2.2 Product functions

The system should be accessible through a Java API that would allow construction, loading and saving, and modification of spreadsheets and the querying of values in the evaluated spreadsheet.

It should also be possible to launch a command line application to load, evaluate, and report on a spreadsheet.

Optional, but highly desirable would be a GUI to aid in the construction of spreadsheets, even if the primary subsequent use of those spreadsheets would subsequently be evaluation via the API or via command line operation.

2.3 User characteristics

The embeddable nature of this spreadsheet means that the primary user group for this product will be application developers seeking to incorporate spreadsheet-style calculations into their applications software.

A secondary user group would be the users of those applications. We distinguish between

For example, an application in the education domain might involve collecting scores from a variety of assignments and assessments. The course instructor would play author role in using the spreadsheet to tailor the calculation of overall grades to the course policy (e.g., dropping a lowest assignment grade, optional extra-credit work). Students in that class, by way of contrast, would play the consumer role in receiving the computed overall grade.

2.4 Constraints

The system should run under Windows, Linux, and OS/X systems equipped with a Java JRE of version 7 or later.

The system should be buildable under Windows, Linux, and OS/X systems equipped with a Java SDK of version 7 or later and the ant build manager.

2.5 Assumptions and dependencies

3 Specific requirements

3.1 External interface requirements

3.1.1 User interfaces

The system shall support

3.1.2 Hardware interfaces

N/A

3.1.3 Software interfaces

The system shall support a Java API providing operations to

3.1.4 Communications interfaces

3.2 System features

3.2.1 Dimensionality and CellNames

Introduction/Purpose of feature

A spreadsheet is generally envisioned as a rectangular arrangement of cells.[zeil13a] In this rectangular view, cells are named by column (typically an alphabetic string) and row (numeric), for example, A1 or ABC23.

The most common spreadsheets go beyond this simple 2D rectangular model. Most add a third dimension, the “sheet”, also identified by alphabetic strings, written before the column and row, with a separating character (‘!’) used to indicate the end of the first string and the start of the column string, e.g., Data!A1 or Summary!BB170. In the absence of a sheet designation, the appropriate sheet is determined by context.

There is, however, no fundamental reason in the computation model to limit the arrangements of cells to 3 dimensions, nor to preclude the possibility of one-dimensional naming (e.g., ‘X’, ‘Y’, etc.)

Stimulus/Response sequence

N/A

Associated functional requirements

DC-1
Alphabetic characters within a cell name are case-insensitive and treated as upper-case for sorting purposes.
DC-2
If the purely numeric component is empty, it is treated as if it had been given as ‘0’ for sorting purposes.
DC-3
The dimension 1 (purely numeric) component may not contain leading zeros unless the number is entirely zero, in which case it may consist of only a single digit.
DC-4
For sorting purposes, the values of dimension 1 are ordered as integers.
DC-5
For sorting purposes, the values of dimension 2 are ordered so that shorter strings precede longer strings, and strings of equal length are ordered by conventional alphabetic (lexicographic) ordering, after replacing any lower-case letters by their upper-case equivalents.
DC-6
For sorting purposes, the values of dimensions above 2 are ordered by conventional alphabetic (lexicographic) ordering, after replacing any lower-case letters by their upper-case equivalents.

3.2.2 Expressions

Introduction/Purpose of feature

Associated with each non-empty cell is an expression. The syntax for expressions is conventional algebraic expressions combined with functions written as a function name followed by a parenthesized, comma-separated parameter list.

An expression defines a computation for a cell’s value as functions of constants and of the values of other cells.

Stimulus/Response sequence

N/A

Associated functional requirements

EX-1
The syntax for expressions is given as the grammar in Appendix sections Expression Lexemes and Expression Grammar.
EX-2
An AreaReference is considered a reference to all non-empty cells within that region.
EX-3
Any rectangular region of 2 or more rows and 2 or more columns can be specified in 4 ways, all of which are entirely equivalent in meaning (e.g., A1:B3, B3:A1, A3:B1, and B1:A3 all specify the same region).
EX-4
The minimal set of functions to be supported by the spreadsheet are listed in Appendix section Functions.
EX-5
A <functionCall> in which the functionName matches one of the functions listed in Appendix section Functions is valid only if the number of parameters and their type match the specification given in the Appendix.
EX-6
When an expression is copied from a source cell to a destination cell offset from the source by $(c,r)$ columns and rows, then each CellReference and AreaReference in the expression copy is modified from the original by adding $c$ columns to any unfrozen column coordinate and adding $r$ rows to an unfrozen row coordinate.

3.2.3 Values

Introduction/Purpose of feature

When evaluated, expressions yield values.

Stimulus/Response sequence

N/A

Associated functional requirements

V-1
The system shall support expressions over numeric (floating-point) values.
V-2
The system shall support expressions over string values.
V-3
A special value Error, rendered as “error”, shall be supported for the purpose of reporting errors encountered during expression evaluation.
V-4
Any function or operator evaluations that receive Error as an input shall return Error as their result.

3.2.4 Assignments

Introduction/Purpose of feature

An assignment is an action of storing an expression in a cell.

An assignment to a cell C has the potential to affect the value of C, of all cells that reference C in their own expressions, of all cells referencing any of those cells, and so on.

Stimulus/Response sequence

API

stimulus An expression is inserted into a cell.

response Subsequent retrievals of expression from that cell show the new expression.

response Subsequent retrievals of values from that cell show the value obtained by evaluation of the new expression. Subsequent retrievals of values from cells that depend, directly or indirectly, on this cell may also change.

GUI

stimulus A cell in the spreadsheet is selected and a new expression entered. assignments

Subsequent retrievals of expression from that cell show the new expression.

response Subsequent retrievals of values from that cell show the value obtained by evaluation of the new expression. The GUI display of the cell is updated to reflect this new value.

response Subsequent retrievals of values form cells that depend, directly or indirectly, on this cell may also change. For any such dependent cells currently visible in the GUI, the displays of those cells are updated to reflect their new values.

GUI/Command Line

stimulus A spreadsheet is loaded from a file containing a list of assignments

response Identical to multiple API assignments

Associated functional requirements

AS-1
Assignment of an expression to a cell removes any prior direct dependencies of that cell, but may add new direct dependencies.
AS-2
Following any assignment of an expression E to a cell, a value retrieval from that cell will produce the value of E.

(This requirement should not be read as indicating that the re-evaluation of C must occur at the time of the assignment, nor as an indication that it must at the time of the value retrieval. This requirement is satisfied if the expression evaluation is performed at either of those times or at any time in between. )

AS-3
Operations that copy an expression from one cell to another are considered an assignment to the second cell. (The copied expression is modified as described in EX-6.)
AS-4
The primary input format for loading spreadsheets from files consists of a series of assignments, one per line, with syntax given in Expression Grammar.
AS-5
The action of loading a file is equivalent to creating an empty spreadsheet, then performing each of the assignments indicated in the file, in the order presented.
AS-6
An input assignment to an AreaReference is treated as an assignment of the indicated expression to the cell named to the left of the ‘:’, followed by a copy from that cell to each other cell in the rectangular region denoted by AreaReference.
AS-7
An input assignment that is syntactically incorrect results in an error message “Syntax error in line ##” and no change to the spreadsheet. Subsequent lines in the input will be processed normally.
AS-8
An API assignment of an expression that is syntactically incorrect results in an exception, with no change made to the cell.
AS-9
A GUI assignment of an expression that is syntactically incorrect results in no change to the cell, but a non-fatal notification within the GUI.

3.2.5 Outputs

Introduction/Purpose of feature

The possible outputs from the spreadsheet involve:

Stimulus/Response sequence

API/GUI

stimulus Request is made to save the spreadsheet in a given file

response A list of assignments sufficient to recreate the spreadsheet is written to that file.

API/GUI

stimulus Request is made to save the value report in a given file

response The value report is written to that file.

Command Line

stimulus Command line program is run.

response At end, value report is generated.

Associated functional requirements

OU-1
A saved spreadsheet report consists of a list of assignments conforming to the syntax of Expression Grammar.
OU-2
The contents of a saved spreadsheet report shall be sufficient to re-create the value of the spreadsheet at the time of the save operation.
OU-3
The saved spreadsheet may employ AreaReferneces to reduce the number of assignments to be written out. This is optional (though encouraged).
OU-4
The value report consists of multiple lines containing a cell name, a single blank space, and then the value of that cell.
OU-5
Only non-empty cells appear in the value report.
OU-6
Cells are listed in the value report in ascending order by name.
OU-7
Cellnames with row 0 are written to the value report with an empty row designator. (e.g., Total0 would be written to the report as Total.)

3.3 Performance requirements

The reference platform for performance evaluation shall be CS Dept’s Linux servers, sirius and atria.

  1. The system should be capable of evaluating a spreadsheet with 10,000 non-empty cells, with a total of 15,000 inter-cell references, within one second on the reference platform.

  2. The system should be capable of loading and parsing a file describing a spreadsheet with 10,000 non-empty cells, with a total of 15,000 inter-cell references, within 3 seconds on the reference platform.

3.4 Design constraints

3.5 Software system attributes

SSA-1
The set of value types supported by the system shall be extendable, requiring a maximum of 40 hours effort from a programmer with a grasp of inheritance (B-level CS330) and parsing (B-level CS488).
SSA-2
The set of functions supported by the system shall be extendable, requiring a maximum of 5 hours effort from a programmer with a grasp of inheritance (B-level CS330).

3.6 Other requirements

OR-1
The system should run on Windows, Linux and OS/X systems.
OR-2
The system should build on Windows, Linux and OS/X systems.

4 Appendices

4.1 Expression Lexemes

This section presents the definitions of the tokens that make up the expression language. Each token type is denoted by a regular expression.

4.2 Expression Grammar

The primary starting points in this grammar are <Expression> and <InputFile>

As is conventional when describing syntax, whitespace is not included in the grammar. Only blanks and tab characters are considered valid whitespace in this application. Line terminators are represented as explicit tokens.

<InputFile> ::= <assignmentList>
<assignmentList> ::=  | <Assignment> <assignmentList> lineTerminator

<Assignment> ::= CellReference '=' <Expression>
               | AreaReference '=' <Expression>

<Expression> ::= <Compared> <relop> <Expression> | <Compared>
<Compared> ::=  <Factor> <multOp> <Compared> | <Factor>
<Factor> ::=  <Term> <addOp> <Factor> | <Term>
<Term> ::= <Literal> | <FunctionCall> | ( <Expression> )

<FunctionCall> ::= functionName ( <paramList> )
<paramList> ::= <param> | <param> , <paramList>
<param> ::= <Expression> | AreaReference

<Literal> :== numericLiteral | stringLiteral | CellRefernece
<relOp> :== '==' | '!=' | '<' | '<=' | '>' | '>='
<multOp> :== '*' | '/' | '%'
<addOp> :== '+' | '-'

An example of a sentence satisfying <InputFile> would be

B4 = sum(B1:B3)
A1 = 23
A2 = 27
A3 = A2+4
B1:B3 = sqrt(2*a1)

The final line in the above example is understood as a shorthand for

B1 = sqrt(2*a1)
B2 = sqrt(2*a2)
B3 = sqrt(2*a3)

4.3 Functions

The minimum set of function to be supported are

In all cases, inputs not conforming to the above descriptions cause the function to return the Error value.


1: This is a deliberate simplification of the similar idea described in the original requirements definition, which allowed prismatic regions of arbitrary dimension.