From CS 725/825 Spring 2018

CS725-S18: Homework 7


Due: March 14, 2018 before 9:30am
You have 2 weeks to complete this assignment.

The goal of this week's assignment is to use Tableau (or another tool) to explore and create a visualization that answers one of the questions you posed in HW5. You will also be documenting the exploratory data analysis process.

This assignment is based on Assignment 2 from Maneesh Agrawala's Fall 2017 CS 448B course.


The task in this assignment is to use an existing software tool (such as Tableau) to formulate and answer a specific question about a data set of your choice. After answering the question, you should create a final visualization that is designed to present the answer to your question to others.

You must maintain a notebook (using Markdown in a GitLab project) that documents all the questions you asked and the steps you performed from start to finish. The goal of this assignment is not to develop a new visualization tool, but to better understand the process of exploring data using an off-the-shelf visualization tool. Documenting the data analysis process you went through is the main pedagogical goal of the assignment and is more important than the design of the final visualization.

Data Preparation

Since you've already picked a data set and proposed some questions in HW5, the next step is to choose one question and assess the fitness of the data for answering it. Start by inspecting the data: it is invariably helpful to look at the raw values first. Does the data seem appropriate for answering your question? If not, you may need to start the process over. If so, does the data need to be reformatted or cleaned prior to analysis? Perform any steps necessary to get the data into shape prior to visual analysis.
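As a rough illustration of what looking at the raw values can involve, the sketch below counts missing and distinct values per column of a CSV file using only the Python standard library. The sample data, the column names, the set of strings treated as missing, and the function name profile_columns are all hypothetical; adapt them to your own dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a real export; names and values are made up.
SAMPLE_CSV = """donor,recipient,year,amount_usd
Japan,India,2005,1200000
Germany,Kenya,2005,
Japan,Kenya,2006,n/a
,India,2006,450000
"""

# Assumption: these strings mark a missing value in this particular file.
MISSING = {"", "n/a", "na", "null"}


def profile_columns(handle):
    """Count rows, missing values, and distinct values for each CSV column."""
    reader = csv.DictReader(handle)
    missing = Counter()
    distinct = {name: set() for name in reader.fieldnames}
    rows = 0
    for row in reader:
        rows += 1
        for name in reader.fieldnames:
            # Short rows yield None for absent cells, so fall back to "".
            value = (row[name] or "").strip()
            if value.lower() in MISSING:
                missing[name] += 1
            else:
                distinct[name].add(value)
    return {
        name: {"rows": rows, "missing": missing[name], "distinct": len(distinct[name])}
        for name in reader.fieldnames
    }


profile = profile_columns(io.StringIO(SAMPLE_CSV))
for name, stats in profile.items():
    print(name, stats)
```

A column with many missing values, or far more distinct values than expected, is a hint that cleaning is needed before the data goes into a visualization tool. To run this on a real file, pass an open file object instead of the io.StringIO sample.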

You will need to iterate through these steps a few times. It may be challenging to find interesting questions and a dataset that has the information that you need to answer those questions. You may need to try several datasets.

Your question needs to be more involved than something like 'what are the top donor countries?' (as an example from the AidData dataset). You must perform some analysis and investigation into the data.

If you find that you need to either change your questions from HW5 or change your dataset, you may do so -- but do this step early!

Exploratory Analysis Process

After you have an initial question and a dataset, construct a visualization that provides an answer to your question. As you construct the visualization, you will find that your question evolves; often it will become more specific. Keep track of this evolution and the other questions that occur to you along the way. Once you have answered all the questions to your satisfaction, think of a way to present the data and the answers as clearly as possible. In this assignment, you should use an existing visualization software tool (such as Tableau). You may find it beneficial to use more than one tool.


Before starting, write down the initial question clearly.

As you go, add to your notebook what you had to do to construct the visualizations and how the questions evolved.

Include in the notebook where you got the data, and documentation about the format of the dataset.

Describe any transformations or rearrangements of the dataset that you needed to perform; in particular, describe how you got the data into the format needed by the visualization system.

Keep copies of any intermediate visualizations that helped you refine your question. Put these in your GitLab project and refer to them in your notebook.

After you have constructed the final visualization for presenting your answer, write a caption and a paragraph describing the visualization and how it answers the question you posed. Think of the figure, the caption, and the text as material you might include in a research paper. The caption and paragraph should be displayed along with the visualization in the same file. This file should be separate from your notebook file.
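As one example of the kind of dataset transformation the notebook should describe, the sketch below reshapes a table with one column per year into one row per (country, year) pair, a layout that many visualization tools handle more easily. It is only an illustration: the sample data, the column names, and the function name wide_to_long are hypothetical, and your tool may offer its own built-in way to do the same thing.

```python
import csv
import io

# Hypothetical wide-format table: one column per year, made-up values.
WIDE_CSV = """country,2004,2005,2006
A,10,12,
B,7,,9
"""


def wide_to_long(handle, id_column, key_name="year", value_name="value"):
    """Turn one-column-per-year rows into one row per (id, year) pair.

    Empty cells are skipped rather than emitted as blank rows.
    """
    reader = csv.DictReader(handle)
    long_rows = []
    for row in reader:
        for column in reader.fieldnames:
            if column == id_column:
                continue
            cell = (row[column] or "").strip()
            if cell:
                long_rows.append(
                    {id_column: row[id_column], key_name: column, value_name: cell}
                )
    return long_rows


long_rows = wide_to_long(io.StringIO(WIDE_CSV), id_column="country")

# Write the reshaped rows back out as CSV text, ready to load into a tool.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["country", "year", "value"], lineterminator="\n")
writer.writeheader()
writer.writerows(long_rows)
print(out.getvalue())
```

Whatever approach you use, record it in the notebook: what the original layout was, what you changed, and why the visualization tool needed the new layout.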


Page last modified on February 27, 2018, at 07:49 AM