Multi-Threading Case Study: Spreadsheet
Steven Zeil
1 Review: Evaluating Cells
-
Each Cell has an expression and a (current) value.
-
Certain spreadsheet operations may affect the value of one or more cells
-
anything that changes a cell formula
-
entering formula via GUI
-
loading a spreadsheet from a file
-
pasting cells
-
-
1.1 evaluationRequired
How does the spreadsheet know what cells to re-evaluate after one or more changes have been made?
It keeps a queue of cells requiring (re)-evaluation.
** Missing file: evalRequired.java **
-
After any spreadsheet action that changes cell values, those actions add the affected cells to the queue.
-
Then eventually the spreadsheet evaluates everything in the queue.
For example, this is the code that manages the “formula” text entry box where new expressions can be typed for insertion into a cell
public class MainWindow extends java.awt.Frame {
⋮
formula = new Formula();
formula.addActionListener(new ActionListener() {
public void actionPerformed (ActionEvent e) {
if (!ssview.getSelectingRectangleMode()) {
CellName cn = ssview.getSelectedCells().upperLeftCorner();
Cell c = sheet.getCell(cn);
if (!formula.getText().equals("")) {
Expression expr = Expression.get(formula.getText());
if (expr != null) {
c.putFormula(expr); ➀
}
else {
statusLine.setText("Syntax error in formula");
}
}
else {
c.putFormula(null);
}
sheet.evaluateAll(); ➁
}
}});
-
➀ Here a newly parsed Expression is added to a cell. We’ll look at the implications of that shortly.
-
➁ Here we instruct the spreadsheet to evaluated all affected cells. Of course, the only one that it knows is affected at the moment is the cell c, into which we have just inserted the new expression.
1.2 How Do Cells Get into the Queue?
The spreadsheet provides functions for indicating that a Cell needs to be re-evaluated:
public class SpreadSheet
{
⋮
// Cells use the following two functions to inform
// the spreadsheet of changes in their status
void cellRequiresEvaluation (Cell c)
{
if (c.getValueIsCurrent())
{
c.putValueIsCurrent (false);
evaluationRequired.addToBack (c);
}
}
void cellHasNewFormula (Cell c)
{
evaluationCounter = 0;
cellRequiresEvaluation(c);
}
The various actions that change a cell must then call those functions. For example, a little earlier we saw that when a new expression is entered through a GUI, the cell’s putFormula function is called:
public
class Cell extends Observable implements Observer
{
⋮
public void putFormula(Expression e)
{
if (theFormula != null)
{
Sequence oldReferences = theFormula.collectReferences();
for (SearchEnumeration p = oldReferences.front();
p.hasMoreElements(); ) {
CellName cn = (CellName)p.nextElement();
Cell c = theSheet.getCell(cn);
c.deleteObserver (this);
}
}
theFormula = e;
if (e != null)
{
Sequence newReferences = e.collectReferences();
for (SearchEnumeration p = newReferences.front();
p.hasMoreElements(); ) {
CellName cn = (CellName)p.nextElement();
Cell c = theSheet.getCell(cn);
c.addObserver (this);
}
}
theSheet.cellHasNewFormula (this); ➀
}
⋮
- ➀ Near the end of this function, you can see the cell telling the spreadsheet that it (the cell) will need to be re-evaluated.
1.3 How Do Indirectly Affected Cells Get into the Queue?
Now, that’s all fine for cells that we have changed directly, but we know that if one cell is changed, it may affect others that depend upon it. For example, if we enter a new formula into cell B2 and cell C3 contains the expression a1+b2, then cell C3 may eventually have to be re-evaluated as well. And them, if there are any cells that mention C3 in their formulas, those cells will need be re-evaluated, and so on.
This propagation of changes is managed via a pattern we have discussed earlier, the observer-observable pattern.
-
Cells observe other cells
- e.g., if
c3
contains “a1+ 2 *b2
”, it observes cellsa1
andb2
.
- e.g., if
-
A cell that changes values notifies its observers
-
We’ve seen this notification when we discussed how cells evaluate themselves.
-
public Value evaluateFormula()
{
Value newValue = (theFormula == null)
? new StringValue()
: theFormula.evaluate(theSheet);
if (theValue == null || !newValue.equals(theValue)) {
theValue = newValue;
setChanged();
notifyObservers();
}
return theValue;
}
-
and they add themselves to the queue
-
A cell is notified of a change in something it was observing by a call to update:
public void update (Observable changedCell, Object ignore)
{
theSheet.cellRequiresEvaluation (this);
}
Any cell so notified simply tells the spreadsheet ot add it to the queue.
2 Use-Case: cell updates
Assume a spreadsheet already contains the following:
-
Cell a1 contains 10
-
Cell b2 contains a1+1
-
Cell c3 contains a1+b2
The cursor is on cell b2
, and we enter 2*a1
in the formula box and hit enter.
2.1 Putting 2*a1 in b2
2.2 Recomputing Cell Values
We could actually represent the sequence of operations like this
which is correct as a portrayal of which messages get used, but is a little hard to understand in terms of which cells are involved when.
So, let me “unroll” the loop and show two specific iterations of it
3 Circular Dependencies
Try This: Here is the compiled version of the spreadsheet that we have been discussing.
Run it and do the following:
- Put 1 in A1. (Click on cell A1 to select it, then click in the formula box at the top of the spreadsheet, type “1” and hit Enter.)
- Put
if(a1 > 1000,-999,1)
in B2 - Put
a1+b2
in C3 - Return to A1, and repalce the original formula by c3.
- Then click on B1 as if you wanted to enter something there.
3.1 Poor Response
The program gets caught in a loop for a noticeable time.
-
In fact, it only halts because of an explicitly coded guard against infinite evaluation loops
-
But until it does, the program ignores all mouse clicks, key presses, etc.
3.2 Possible Fixes
In the final formula entry above we created a circular dependency in which cell C3 depended (indirectly) on the value of cell A1 but A1 depended on the value of cell C3. This created an infinite loop as each cell kept notifying the other changes, causing the other to request re-evaluation.
One way to avoid this would be to check for circular dependencies before entering a new formula. An algorithm for doing this is called topological sorting and you might learn it in our data structures course.
One problem with this approach is that it sometimes disallows operations that would have been OK. In particular, copy and pasting of large regions of the spreadsheet could cause temporary circular dependencies. Many spreadsheets will break off the paste operation in the middle upon detecting this, even if the circularity would have been resolved once the entire region had finished pasting. operation was carried out.
So we’ll look at another possibility, making the cell evaluation and GUI event handling separate threads.
4 Multi-Threaded Design
- Add to SpreadSheet an internal Evaluator class that runs as a thread
-
Whenever cells need to be evaluated, it pulls them from the evaluationRequired queue and tells them to re-evaluate themselves.
-
class SpreadSheet
{
class Evaluator extends Thread
{
public Evaluator() {}
public void run() {
while (true) {
partialEvaluate();
}
}
}
public SpreadSheet()
{
theLowerRightCorner = new CellName("z", 26);
evaluationCounter = 0;
evaluationRequired = new Sequence();
cells = new Hashtable();
evaluator = new Evaluator();
evaluator.start();
}
partialEvaluate takes one cell off the front of the queue and re-evaluates it.
4.1 Identifying Shared Resources
-
To make this design safe, we need to consider what synchronization might be required.
-
Where could we get into trouble if the two threads accessed shared resources?
Consider the evaluationRequired queue.
-
GUI thread adds cells to queue
-
evaluation thread removes cells from queue
That’s an obvious shared resource. We’ve already seen how queues can be corrupted by simultaneous updates, so we will want to protect that.
Let’s also think about the cell formulas.
-
Changes in formula
-
eval thread traverses formulas (trees) to evaluate
-
GUI puts new formulas into cell
-
-
Changes in the observer lists
-
eval thread traverses observer list to schedule additional nodes for evaluation
-
GUI
->putFormula()
rewrites the observer list
-
Again, that looks like a shared rewource that need protecting.
Finally, let’s think about the cell values.
-
GUI looks at values to display on screen
-
eval thread continuously rewrites those values
This might be a problem as well. It’s not a simultaneous update, because the GUI only looks at these values, but we could wind up displaying garbage temporarily.
4.1.1 Synchronization Reqts
It appears, then, that we need
-
synchronization on each Cell
-
getFormula(), putFormula(), getValue(), putValue(), evaluate()
-
-
synchronization on evaluationRequired queue
-
add, remove
-
4.2 Checking for Liveness Problems
If we go with that level of synchronization, however, we may encounter liveness problems:
-
Synchronizing on individual cells could lead to a situation where
-
eval thread holds lock on cell
c1
and requests lock on cellc2
-
GUI thread holds lock on cell
c2
and requests lock on cellc1
-
Deadlock!
-
-
At least, I think that’s possible. Frankly, it’s too complicated for me to really prove it one way or the other.
5 When in Doubt, Simplify
We resolve these problems by making life simpler.
-
Synchronize on the whole spreadsheet instead of individual cells
-
Because there’s only one object involved, chances of deadlock are nil
-
-
Need to restructure spreadsheet slightly so that other code does not get cells directly
* e.g., instead of
public
class Cell extends Observable implements Observer
{
⋮
public void putFormula(Expression e)
{
if (theFormula != null)
{
Sequence oldReferences = theFormula.collectReferences();
for (SearchEnumeration p = oldReferences.front();
p.hasMoreElements(); ) {
CellName cn = (CellName)p.nextElement();
Cell c = theSheet.getCell(cn);
c.deleteObserver (this);
}
}
theFormula = e;
if (e != null)
{
Sequence newReferences = e.collectReferences();
for (SearchEnumeration p = newReferences.front();
p.hasMoreElements(); ) {
CellName cn = (CellName)p.nextElement();
Cell c = theSheet.getCell(cn);
c.addObserver (this);
}
}
theSheet.cellHasNewFormula (this); ➀
}
⋮
we do
public
class Cell extends Observable implements Observer
{
⋮
public void putFormula(Expression e)
{
if (theFormula != null)
{
Sequence oldReferences = theFormula.collectReferences();
for (SearchEnumeration p = oldReferences.front();
p.hasMoreElements(); ) {
CellName cn = (CellName)p.nextElement();
theSheet.deleteCellObserver (cn,this);
}
}
theFormula = e;
if (e != null)
{
Sequence newReferences = e.collectReferences();
for (SearchEnumeration p = newReferences.front();
p.hasMoreElements(); ) {
CellName cn = (CellName)p.nextElement();
theSheet.addCellObserver(cn, this);
}
}
theSheet.cellHasNewFormula (this);
}
⋮
5.1 Hiding the Cells
In essence, we can hide the cells so that they cannot be accessed in an unsynchronized fashion by requiring all access to cells to be managed via the spreadsheet.
-
Make the spreadhsheet’s getCell() function private.
-
Provide cell functionality through the spreadsheet, so that synchronized functions lock the spreadsheet, not the cells
// Remove c from the observer list of the cell at position cname
public synchronized void deleteCellObserver (CellName cname, Observer c)
{
Cell observer = getCell(cname);
observer.deleteObserver(c);
}
// Add c as an observer of the cell at position cname
public synchronized void addCellObserver (CellName cname, Observer c)
{
Cell observer = getCell(cname);
observer.addObserver(c);
}
// Get the formula associated with the cell at position cname
public synchronized Expression getFormula(CellName cname)
{
return getCell(cname).getFormula();
}
// Put a formula into the cell at position cname
- Synchronize cell evaluation
// Re-evaluate all cells in the spreadsheet
public int evaluateAll()
{
return 1;
}
// Evaluate one or more cells in the spreadsheet
synchronized void partialEvaluate()
{
if (moreToEvaluate()) {
Cell c = (Cell)evaluationRequired.front().at();
evaluationRequired.removeFront();
c.putValueIsCurrent (true);
++evaluationCounter;
c.evaluateFormula();
} else {
try {
wait();
} catch (InterruptedException ex) {}
}
}
// Return true if at least one cell is believed to require evaluation
public boolean moreToEvaluate()
{
return (evaluationRequired.size() > 0
&& evaluationCounter < EvaluationLimit);
}
5.2 Again, a Circle
Here is the threaded version of the spreadsheet. Try running it with the same sequence of insertions that caused the earlier version to lock up.
5.3 All Locked Up and Nowhere To Go
What happens to the Evaluator thread if no cells need to be evaluated?
We don’t want to waste CPU cycles on a busy loop.
5.3.1 wait()
The Java wait()
function (from class Object
)
-
releases the lock being held
-
puts the thread to sleep
-
places the thread in a queue of threads blocked waiting for access to that object
// Re-evaluate all cells in the spreadsheet
public int evaluateAll()
{
return 1;
}
// Evaluate one or more cells in the spreadsheet
synchronized void partialEvaluate()
{
if (moreToEvaluate()) {
Cell c = (Cell)evaluationRequired.front().at();
evaluationRequired.removeFront();
c.putValueIsCurrent (true);
++evaluationCounter;
c.evaluateFormula();
} else {
try {
wait();
} catch (InterruptedException ex) {}
}
}
// Return true if at least one cell is believed to require evaluation
public boolean moreToEvaluate()
{
return (evaluationRequired.size() > 0
&& evaluationCounter < EvaluationLimit);
}
5.3.2 notifyAll()
The notifyAll()
function
-
wakes up all threads that are blocked waiting for the object on which the lock is being held
-
so when the lock is next released, those threads are eligible to run
// Cells use the following two functions to inform
// the spreadsheet of changes in their status
synchronized void cellRequiresEvaluation (Cell c)
{
if (c.getValueIsCurrent())
{
c.putValueIsCurrent (false);
evaluationRequired.addToBack (c);
notifyAll();
}
}
void cellHasNewFormula (Cell c)
{
evaluationCounter = 0;
cellRequiresEvaluation(c);
}