This document provides the software requirements specification for the embeddable spreadsheet project. It is intended to serve as a sample SRS for CS350 students.
The embeddable spreadsheet is an implementation of the familiar computational model of a spreadsheet.
The principle component of this project is a library (API) that can be incorporated into other applications to add a spreadsheet-like capability.
Command-line and GUI interfaces will also be provided, but are secondary products, as their purpose is to add convenience in exploiting the library.
Spreadsheet: a computational model generally envisioned as a rectangular arrangement of cells. Expressions associated with these cells create an implicit order dependency in which the cells must be evaluated: If a cell A contains an expression with references to cells B1, B2, … BN, then A must be evaluated after the evaluation of B1, B2, … BN.
Cell: A container that can be empty or hold an expression (possibly referencing other cells). If the cell contains an expression, then it also has a current value denoting the evaluation of that expression.
Value: The result of evaluating an expression. Values can be of various kinds, including numeric, string, and various artificial values used to indicate problems in the calculation.
direct dependence: A cell C that has been (most recently) assigned an expression E is considered to be directly dependent on any cells referenced within E.
dependence: A cell C is said to be dependent on any cells within the transitive closure of the direct dependence relationship defined above.
indirect dependence: A cell C is said to be indirectly dependent on any cell B if C depends on B but is not directly dependent on B.
Section 2 of this document describes the Embeddable Spreadsheet_ project in general terms.
Section 3 presents the detailed requirements, organized by feature.
The appendices are devoted to the grammar for the spreadsheet expression language and a list of functions related to it.
The spreadsheet is a now familiar model of calculation. Spreadsheets are typically envisioned as a rectangular arrangement of cells, each cell containing a mathematical expression. These expressions frequently contain references to other cells, and it is a defining characteristic of the spreadsheet model that the resulting computational order dependencies are resolved automatically rather than by explicit programming (e.g., as in a general programming language like C++ or Java). [zeil15a]
Commonly, the GUI for a spreadsheet is often seen as integral to this model. But, in fact, there are many applications that could benefit from the ability to incorporate user-supplied calculations. The spreadsheet model suggests a means of doing so without requiring the embedding of an entire scripting language.
The embeddable spreadsheet is envisioned as a library that can be incorporated into applications to add a spreadsheet-like capability.
The system should be accessible through a Java API that would allow construction, loading and saving, and modification of spreadsheets and the querying of values in the evaluated spreadsheet.
It should also be possible to launch a command line application to load, evaluate, and report on a spreadsheet.
Optional, but highly desirable would be a GUI to aid in the construction of spreadsheets, even if the primary subsequent use of those spreadsheets would subsequently be evaluation via the API or via command line operation.
The embeddable nature of this spreadsheet means that the primary user group for this product will be application developers seeking to incorporate spreadsheet-style calculations into their applications software.
A secondary user group would be the users of those applications. We distinguish between
consumers of calculation results, who may be unaware of the method by which those calculations were carries out. Such users will have minimal impact on this project.
authors of calculations, who supply formulas involving variables related to the application within which the spreadsheet is embedded. Ease of entry and language will be importnat to this class of users.
For example, an application in the education domain might involve collecting scores from a variety of assignments and assessments. The course instructor would play author role in using the spreadsheet to tailor the calculation of overall grades to the course policy (e.g., dropping a lowest assignment grade, optional extra-credit work). Students in that class, by way of contrast, would play the consumer role in receiving the computed overall grade.
The system should run under Windows, Linux, and OS/X systems equipped with a Java JRE of version 7 or later.
The system should be buildable under Windows, Linux, and OS/X systems equipped with a Java SDK of version 7 or later and the ant
build manager.
The system shall support
N/A
The system shall support a Java API providing operations to
A spreadsheet is generally envisioned as a rectangular arrangement of cells.[zeil13a] In this rectangular view, cells are named by column (typically an alphabetic string) and row (numeric), for example, A1
or ABC23
.
The most common spreadsheets go beyond this simple 2D rectangular model. Most add a third dimension, the “sheet”, also identified by alphabetic strings, written before the column and row, with a separating character (‘!’) used to indicate the end of the first string and the start of the column string, e.g., Data!A1
or Summary!BB170
. In the absence of a sheet designation, the appropriate sheet is determined by context.
There is, however, no fundamental reason in the computation model to limit the arrangements of cells to 3 dimensions, nor to preclude the possibility of one-dimensional naming (e.g., ‘X’, ‘Y’, etc.)
N/A
Associated with each non-empty cell is an expression. The syntax for expressions is conventional algebraic expressions combined with functions written as a function name followed by a parenthesized, comma-separated parameter list.
An expression defines a computation for a cell’s value as functions of constants and of the values of other cells.
N/A
AreaReference
is considered a reference to all non-empty cells within that region.A1:B3
, B3:A1
, A3:B1
, and B1:A3
all specify the same region).<functionCall>
in which the functionName
matches one of the functions listed in Appendix section [Functions] is valid only if the number of parameters and their type match the specification given in the Appendix.CellReference
and AreaReference
in the expression copy is modified form the original by adding $c$ columns to any unfrozen column coordinate and adding $r$ rows to an unfrozen row coordinate.When evaluated, expressions yield values.
N/A
Error
, rendered as “error”, shall be supported for the purpose of reporting errors encountered during expression evaluation.Error
as an input shall return Error
as their result.An assignment is an action of storing an expression in a cell.
An assignment to a cell C has the potential to affect the value of C, of all cells that reference C in their own expressions, of all cells referencing any of those cells, and so on.
API
stimulus An expression is inserted into a cell.
response Subsequent retrievals of expression from that cell show the new expression.
response Subsequent retrievals of values from that cell show the value obtained by evaluation of the new expression. Subsequent retrievals of values from cells that depend, directly or indirectly, on this cell may also change.
GUI
stimulus A cell in the spreadsheet is selected and a new expression entered. assignments
Subsequent retrievals of expression from that cell show the new expression.
response Subsequent retrievals of values from that cell show the value obtained by evaluation of the new expression. The GUI display of the cell is updated to reflect this new value.
response Subsequent retrievals of values form cells that depend, directly or indirectly, on this cell may also change. For any such dependent cells currently visible in the GUI, the displays of those cells are updated to reflect their new values.
GUI/Command Line
stimulus A spreadsheet is loaded from a file containing a list of assignments
response Identical to multiple API assignments
(This requirement should not be read as indicating that the re-evaluation of C must occur at the time of the assignment, nor as an indication that it must at the time of the value retrieval. This requirement is satisfied if the expression evaluation is performed at either of those times or at any time in between. )
AreaReference
is treated as an assignment of the indicated expression to the cell named to the left of the ‘:’, followed by a copy from that cell to each other cell in the rectangular region denoted by AreaReference
.The possible outputs from the spreadsheet involve:
direct queries in the API for the expression or value of a cell
a “saved” spreadsheet format listing expression assignments to each non-empty cell.
a “value” report listing the values of each non-empty cell
API/GUI
stimulus Request is made to save the spreadsheet in a given file
response A list of assignments sufficient to recreate the spreadsheet is written to that file.
API/GUI
stimulus Request is made to save the value report in a given file
response The value report is written to that file.
Command Line
stimulus Command line program is run.
response At end, value report is generated.
AreaRefernece
s to reduce the number of assignments to be written out. This is optional (though encouraged).Total0
would be written to the report as Total
.)The reference platform for performance evaluation shall be CS Dept’s Linux servers, sirius
and atria
.
The system should be capable of evaluating a spreadsheet with 10,000 non-empty cells, with a total of 15,000 inter-cell references, within one second on the reference platform.
The system should be capable of loading and parsing a file describing a spreadsheet with 10,000 non-empty cells, with a total of 15,000 inter-cell references, within 3 seconds on the reference platform.
This section presents the definitions of the tokens that make up the expression language. Each token type is denoted by a regular expression.
CellName
: used to name a cell for the purpose of retrieval (API) or as a target for an assignment.
([A-Za-z][A-Za-z0-9]*!)*([A-Za-z][A-za-z]*)([0-9]*)
The dimension of a cellname is the number of substrings matching the parenthesized components in the expression of DC-1. Each such matching substring specifies a coordinate along that dimensional axis.
Dimensions in the names are counted from right to left, starting with 1, so the purely numeric component at the end is dimension 1 (and sometimes referred to as the “row”), the purely alphabetic component before that is dimension 2 (sometimes referred to as the “column”), and so on.
CellReference
: used within expressions to denote a cell from which a value should be fetched
([A-Za-z][A-Za-z0-9]*!)*($?[A-Za-z][A-za-z]*)($?[0-9]*)
The optional $
symbols, if present, indicate that the following dimension coordinate is frozen during movement and copy operations.
AreaName
: denotes a rectangular area of a spreadsheet as a target for an assignment.
([A-Za-z][A-Za-z0-9]*!)*($?[A-Za-z][A-za-z]*)($?[0-9]*):(?[A-Za-z][A-za-z]*)(?[0-9]*)
AreaReference
: describes a rectangular region of the spreadsheet from which values should be fetched.1
([A-Za-z][A-Za-z0-9]*!)*($?[A-Za-z][A-za-z]*)($?[0-9]*):($?[A-Za-z][A-za-z]*)($?[0-9]*)
functionName
: name of a function being called using the parenthesized comma-separated list of parameters form.
[A-Za-z]A-Za-z0-9]*
The primary starting points in this grammar are <Expression>
and <InputFile>
As is conventional when describing syntax, whitespace is not included in the grammar. Only blanks and tab characters are considered valid whitespace in this application. Line terminators are represented as explicit tokens.
<InputFile> ::= <assignmentList>
<assignmentList> ::= | <Assignment> <assignmentList> lineTerminator
<Assignment> ::= CellReference '=' <Expression>
| AreaReference '=' <Expression>
<Expression> ::= <Compared> <relop> <Expression> | <Compared>
<Compared> ::= <Factor> <multOp> <Compared> | <Factor>
<Factor> ::= <Term> <addOp> <Factor> | <Term>
<Term> ::= <Literal> | <FunctionCall> | ( <Expression> )
<FunctionCall> ::= functionName ( <paramList> )
<paramList> ::= <param> | <param> , <paramList>
<param> ::= <Expression> | AreaReference
<Literal> :== numericLiteral | stringLiteral | CellRefernece
<relOp> :== '==' | '!=' | '<' | '<=' | '>' | '>='
<multOp> :== '*' | '/' | '%'
<addOp> :== '+' | '-'
An example of a sentence satisfying <InputFile>
would be
B4 = sum(B1:B3)
A1 = 23
A2 = 27
A3 = A2+4
B1:B3 = sqrt(2*a1)
The final line in the above example is understood as a shorthand for
B1 = sqrt(2*a1)
B2 = sqrt(2*a2)
B3 = sqrt(2*a3)
The minimum set of function to be supported are
if(cond, then, else)
<Expression>
<Expression>
<Expression>
cond
. If numeric and non-zero or string and non-empty, evaluates and returns the value of then
, otherwise evaluates and returns the value of else
.
sqrt(x)
<Expression>
x
if numeric and non-negative.
min(...)
<Expression>
or AreaReference
max(...)
<Expression>
or AreaReference
count(A)
A
: an AreaReference
A
sum(A)
A
: an AreaReference
A
sumproduct(A,B)
A
: an AreaReference
B
: an AreaReference
of the same dimensions as A
A
and B
.
average(A)
A
: an AreaReference
A
contains at least one non-empty, numerically-valued cell, returns the mean average of all such cells in A
.
stddev(A)
A
: an AreaReference
A
contains at least two non-empty, numerically-valued cells, returns the sample standard deviation of all such cells in A
.
In all cases, inputs not conforming to the above descriptions cause the function to return the Error
value.
1: This is a deliberate simplification of the similar idea described in the original requirements definition, which allowed prismatic regions of arbitrary dimension.