CS 725/825 - Information Visualization
Spring 2017: Wednesdays, 9:30am-12:15pm, E&CS 2120

Tableau's data visualization software is provided through the Tableau for Teaching program.

In-Class Work 2 (ICW2)

See http://www.cs.odu.edu/~mweigle/CS725-S17/Guidelines for general information on ICW assignments.

Background

Academics typically publish their research findings in journals. Most journals are subscription-based: readers (or their university libraries) must pay for a subscription in order to read the articles. Some journals now allow authors to pay an article processing charge (APC), which lets anyone read the article free of charge and makes the article "open access" (OA). The data you will examine for this assignment come from a list of articles for which this APC was paid.

Task

The file icw2-journals.pdf contains information about fees paid for publishing 55 articles in academic journals. (Note: You will be considering the entire data file for your VI2.)

For now, focus only on the Publisher and Journal title columns.

  1. Identify all of the inconsistencies in the naming of either the publisher or the journal. For each, indicate what the appropriate name should be. (Note that if two entries have the same journal name, they should also have the same publisher name.)
  2. What problems might these inconsistencies cause when trying to analyze or visualize the data?
  3. If the complete dataset were only these 55 entries, how would you go about cleaning it, assuming that you had either a CSV or Excel file? What tools would you use?
  4. Would your strategy change if you had the full dataset of 2128 entries? What tools would you use?
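
As a starting point for discussing questions 3 and 4, here is a minimal Python sketch of one possible way to surface naming inconsistencies automatically. It is an illustration under stated assumptions, not the expected solution: the rows are invented placeholders (not taken from icw2-journals.pdf), the function names are hypothetical, and the normalization rule (lowercase, strip punctuation, sort unique words) is just one choice, similar in spirit to the key-collision clustering offered by tools such as OpenRefine.

```python
import re
from collections import defaultdict


def fingerprint(name):
    """Normalize a name: lowercase, drop punctuation, sort unique words."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    return " ".join(sorted(set(tokens)))


def find_variants(names):
    """Group distinct spellings that collapse to the same fingerprint."""
    groups = defaultdict(set)
    for name in names:
        groups[fingerprint(name)].add(name.strip())
    # Keep only fingerprints that have more than one spelling.
    return {key: sorted(v) for key, v in groups.items() if len(v) > 1}


def journals_with_multiple_publishers(rows):
    """Flag journals whose entries list more than one publisher.

    rows is an iterable of (publisher, journal) pairs. Both names are
    compared by fingerprint, so trivial spelling variants do not count.
    """
    publishers = defaultdict(set)
    for publisher, journal in rows:
        publishers[fingerprint(journal)].add(fingerprint(publisher))
    return {j: sorted(p) for j, p in publishers.items() if len(p) > 1}


# Invented placeholder rows, NOT values from the assignment data.
rows = [
    ("Example Press", "Journal of Examples"),
    ("example press ", "Journal of Examples"),
    ("Example Press Ltd.", "journal of examples"),
    ("Sample Publishing", "Sample Letters"),
]

publisher_variants = find_variants(p for p, _ in rows)
conflicts = journals_with_multiple_publishers(rows)

print(publisher_variants)
print(conflicts)
```

A sketch like this only catches variants that differ in case, punctuation, spacing, or word order. Abbreviations, misspellings, and suffixes such as "Ltd." survive normalization (the second function flags them only when they appear under the same journal), so the flagged groups still need a human decision about which name is the appropriate one.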

Submission

Add the following files to the ICW2 GitLab project:

  • username1-username2-username3-ICW2.md - solution to ICW2
  • username1-notes.md - notes during discussion

where username1 is the note-taker. Make sure to use the @ notation (e.g., @mweigle) to give credit to group members for ideas or to flag items for review.