From CS 725/825 Spring 2018

CS725-S18: Homework 6


Due: February 28, 2018 before 9:30am
You have 2 weeks to complete this assignment.

The goals of this week's assignment are to

Many thanks to Enrico Bertini for sharing this "Data Analysis" exercise.

How do you know if you are on the right track? You can easily identify which data attributes and charts will help you investigate your question, and you have no major problems translating your ideas and questions into charts.

Background Story

Every year in New York State, over 10,000 parole-eligible prisoners appear in front of the Parole Board and are denied release. Many legal experts argue that there is little reason to the way the New York State Parole Board makes these decisions. In response to this problem, Nikki Zeichner (a lawyer and former criminal defense attorney) built a dataset of New York State Parole Board hearings from the publicly available hearing records scraped from the Parole Board's website.

By analyzing this dataset, she hopes to uncover new information about patterns in Parole Board determinations. She also expects to learn more about how incarceration affects different people, how people convicted of crimes change over time, and what types of prison programming prove most successful. You can read more about this project at (read the documentation carefully).


Download a local copy of parole-dataset.csv. This is a set of over 30,000 records from the Parole Board's website in CSV format. The Inmate Information Data Definitions from the Parole Board’s website will help to clarify the meaning of each data field.
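A quick first pass over the file can show which fields are reliably filled in before you start charting. The following is a minimal sketch using only the Python standard library. The column names and rows in `SAMPLE` are made up for illustration and are not the real fields of parole-dataset.csv; check the Data Definitions for the actual names.

```python
import csv
import io


def profile_columns(csv_file):
    """Return (row_count, {column: non_empty_count}) for an open CSV file."""
    reader = csv.DictReader(csv_file)
    filled = {name: 0 for name in (reader.fieldnames or [])}
    rows = 0
    for row in reader:
        rows += 1
        for name, value in row.items():
            # Skip overflow cells (key None) and cells missing from short rows.
            if name is None or value is None:
                continue
            if value.strip():
                filled[name] += 1
    return rows, filled


# Hypothetical toy data standing in for the real file.
SAMPLE = io.StringIO(
    "id,decision,interview_date\n"
    "1,DENIED,2014-03-01\n"
    "2,,2014-03-02\n"
    "3,GRANTED,\n"
)

rows, filled = profile_columns(SAMPLE)
for name, count in filled.items():
    print(f"{name}: {count}/{rows} filled")
```

To run it on the downloaded file, pass a file object opened with `open("parole-dataset.csv", newline="")` instead of `SAMPLE`. You may need to specify an encoding, which this sketch does not assume.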


Explore this dataset with Excel, OpenRefine, Tableau, R, or whatever tool you choose. Identify useful information about how the parole system works in New York. Your task is to surprise Nikki with your discoveries!
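One common exploratory step is to compare how outcomes are distributed across the values of another attribute. The sketch below shows one way to do that in plain Python. The field names `hearing_type` and `decision` and the toy rows are hypothetical placeholders, not the dataset's real fields or values.

```python
from collections import Counter, defaultdict


def outcome_share_by_group(rows, group_field, outcome_field):
    """For each group value, return the share of rows with each outcome value."""
    counts = defaultdict(Counter)
    for row in rows:
        # Treat empty or absent cells as an explicit "(missing)" category.
        group = (row.get(group_field) or "").strip() or "(missing)"
        outcome = (row.get(outcome_field) or "").strip() or "(missing)"
        counts[group][outcome] += 1

    shares = {}
    for group, outcomes in counts.items():
        total = sum(outcomes.values())
        shares[group] = {o: n / total for o, n in outcomes.items()}
    return shares


# Hypothetical rows, shaped like the dicts produced by csv.DictReader.
toy_rows = [
    {"hearing_type": "INITIAL", "decision": "DENIED"},
    {"hearing_type": "INITIAL", "decision": "DENIED"},
    {"hearing_type": "INITIAL", "decision": "GRANTED"},
    {"hearing_type": "REAPPEARANCE", "decision": "GRANTED"},
    {"hearing_type": "REAPPEARANCE", "decision": ""},
]

shares = outcome_share_by_group(toy_rows, "hearing_type", "decision")
for group, outcomes in shares.items():
    print(group, {o: round(s, 2) for o, s in outcomes.items()})
```

Keeping missing values as their own category, rather than dropping them, makes it visible when a difference between groups might be due to incomplete records rather than the hearings themselves.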

Create a document that tells the story of what you found and illustrates your findings with charts.

Data Analysis Vs. Presentation

As you develop the results you need for this homework, think about the difference and relationship between data analysis and presentation. What are the most important goals in analysis? And what are the most important ones in presentation?

Note that not all charts you explore during your analysis have to be in your final document. Analysis often leads you to some dead ends. Select only those charts that together tell a coherent story about your findings.

Details on how to write the story

  1. Write a short introduction explaining what the goal of your analysis is.
  2. Structure the document as a sequence of images and text, images and text, images and text, etc.
  3. For each chart specify:
    • Question: What question the chart addresses.
    • Findings: What you can see in the chart that is useful or interesting for your analysis.
    • Follow-up: How this analysis leads to the next question and the chart that follows.
  4. Write a section called "Conclusion", in which you summarize your main findings in a few sentences.

IMPORTANT! Make sure that your document reads like a coherent story, not like a "patchwork" of unrelated images/charts. You will need to provide some explanation of what the dataset fields are. Don't assume that your reader is familiar with the dataset (especially with abbreviations).

How the assignment is evaluated

The parameters used for evaluation are:


Page last modified on February 12, 2018, at 08:49 AM