DATA 608: Assignment #1

Michael Munguia

Principles of Data Visualization and Introduction to ggplot2

I have provided you with data about the 5,000 fastest growing companies in the US, as compiled by Inc. magazine. Let's read this in:

And let's preview this data:
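The original code chunks are not shown here, so the following is only a minimal sketch of reading and previewing the data. The rows are invented stand-ins, and the column names `Rank`, `Growth_Rate`, `Revenue` and `Employees` are assumptions about the file's layout (only `Name`, `City`, `State` and `Industry` are named in the text).

```r
# Toy stand-in for the Inc. 5000 CSV; these rows are invented for illustration.
csv_text <- "Rank,Name,Growth_Rate,Revenue,Industry,Employees,City,State
1,Alpha Co,421.5,1.2e7,Software,45,New York,NY
2,Beta LLC,310.2,8.5e6,Health,120,Austin,TX
3,Gamma Inc,250.0,4.0e7,Energy,NA,San Diego,CA
4,Delta Corp,199.9,2.3e6,Software,12,New York,NY"

# With the real file, replace `text = csv_text` with the file path or URL.
inc <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

head(inc, 3)   # first rows, which are ordered by Rank
summary(inc)   # per-column summaries; numeric columns also report NA counts
```

`tail(inc)` gives the matching view of the bottom rows.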

Having looked at the first few rows of the data, we see it’s ordered by company ranking. Let’s have a look at the bottom few rows too. I’ll also load tidyverse to work with ggplot2 and the rest of that suite of libraries.

Before thinking about visualization, I think it’s useful to have a sense of what’s going on with the text data - particularly if I think these will serve as labels, legends, etc. downstream.
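One way to get that sense, sketched in base R on an invented toy data frame (the rows are not from the real file), is to count distinct values per text column and tabulate the most frequent ones:

```r
# Toy data; rows are invented for illustration only.
inc <- data.frame(
  Name     = c("Alpha Co", "Beta LLC", "Gamma Inc", "Delta Corp", "Epsilon Ltd"),
  Industry = c("Software", "Health", "Energy", "Software", "Health"),
  City     = c("New York", "Austin", "San Diego", "New York", "Los Angeles"),
  State    = c("NY", "TX", "CA", "NY", "CA"),
  stringsAsFactors = FALSE
)

# Keep only the character columns and count distinct values in each.
text_cols <- inc[sapply(inc, is.character)]
distinct_counts <- sapply(text_cols, function(x) length(unique(x)))
distinct_counts

# Frequency table of cities, most common first.
top_cities <- sort(table(inc$City), decreasing = TRUE)
head(top_cities, 3)
```

The same `table()` call on `inc$Industry` lists each industry with its company count.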

Not only do we get a preview of some important insights (NYC is far and away the leading city whereas California is the leading state, an interesting contrast), but we now know that we likely don't have a single categorical variable that consists of just a handful of labels. Certainly, intuition would tell us not to expect a handful of values in Name, City or State.

We can confirm this and have a look at the count of values present in Industry and note that we have a total of 25.

There was also only one variable with NA values, at a count of just 12, so we can evaluate those below.
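A minimal sketch of that check, on invented toy rows. The text does not say which variable holds the NA values; the toy data assumes it is `Employees`.

```r
# Toy data; rows are invented and the NA column (Employees) is an assumption.
inc <- data.frame(
  Name      = c("Alpha Co", "Beta LLC", "Gamma Inc", "Delta Corp"),
  Employees = c(45, NA, 120, NA),
  State     = c("NY", "TX", "CA", "NY"),
  stringsAsFactors = FALSE
)

# NA count per column, keeping only columns that have any.
na_counts <- colSums(is.na(inc))
na_counts[na_counts > 0]

# The affected rows themselves.
incomplete_rows <- inc[!complete.cases(inc), ]
incomplete_rows
```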

We can also take a look at what states are impacted by these NA values:
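As a hedged sketch on invented toy rows (again assuming the missing values sit in `Employees`), the affected states can be tabulated like this:

```r
# Toy data; rows are invented for illustration only.
inc <- data.frame(
  State     = c("CA", "CA", "TX", "NY", "NY", "NY"),
  Employees = c(10, NA, NA, 5, NA, NA),
  stringsAsFactors = FALSE
)

# Count NA rows per state, most affected state first.
na_by_state <- sort(table(inc$State[is.na(inc$Employees)]), decreasing = TRUE)
na_by_state
```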

Create a graph that shows the distribution of companies in the dataset by State (i.e., how many are in each state). There are a lot of States, so consider which axis you should use. This visualization is ultimately going to be consumed on a 'portrait' oriented screen (i.e., taller than wide), which should further guide your layout choices.
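One possible layout for a portrait screen is a horizontal bar chart with states on the vertical axis, sorted by count. The sketch below uses ggplot2 on invented toy rows; it is not necessarily the chart the original report produced.

```r
library(ggplot2)

# Toy data; rows are invented for illustration only.
inc <- data.frame(
  State = c("CA", "CA", "CA", "TX", "TX", "NY"),
  stringsAsFactors = FALSE
)

# One row per state with its company count in column `n`.
state_counts <- as.data.frame(
  table(State = inc$State),
  responseName = "n",
  stringsAsFactors = FALSE
)

# Bars sorted by count; coord_flip() puts the states on the vertical axis.
p <- ggplot(state_counts, aes(x = reorder(State, n), y = n)) +
  geom_col() +
  coord_flip() +
  labs(x = "State", y = "Number of companies")
p
```

Putting the long category list on the vertical axis keeps every state label horizontal and readable, which suits a tall, narrow display.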

Let's dig in on the state with the 3rd most companies in the data set. Imagine you work for the state and are interested in how many people are employed by companies in different industries. Create a plot that shows the average and/or median employment by industry for companies in this state (only use cases with full data, use R's complete.cases() function.) In addition to this, your graph should show how variable the ranges are, and you should deal with outliers.

We know that there are no missing data elements for this state from inspecting NA values earlier, but there are certainly outliers. I feel like imputing average values or simply dropping those data points would be misleading, so I'm going to handle outliers by comparing average and median. Knowing that the median is robust against outliers, that should be an anchor in surveying this data visually. A mean that sits far from the median should stand out and immediately tell me how variable the data is, rather than that information being hidden by dropping the original value or imputing something new.
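The mean-versus-median comparison described above could be sketched as follows. The data are invented toy rows, the `Employees` column name is an assumption, and the plot is just one way to draw the comparison, not the original figure.

```r
library(ggplot2)

# Toy data; rows are invented. One Software company is a deliberate outlier.
inc <- data.frame(
  State     = rep(c("CA", "TX", "NY"), times = c(6, 5, 4)),
  Industry  = c(rep("Software", 6), rep("Energy", 5),
                "Software", "Software", "Software", "Health"),
  Employees = c(10, 20, 30, 40, 50, 60, 15, 25, 35, 45, 55, 10, 20, 3000, 50),
  stringsAsFactors = FALSE
)

# State with the 3rd most companies, restricted to complete cases.
third_state <- names(sort(table(inc$State), decreasing = TRUE))[3]
third <- inc[inc$State == third_state & complete.cases(inc), ]

# Mean and median employment per industry, side by side.
avg <- aggregate(Employees ~ Industry, data = third, FUN = mean)
med <- aggregate(Employees ~ Industry, data = third, FUN = median)
stats <- merge(avg, med, by = "Industry", suffixes = c("_mean", "_median"))

# A segment joins median and mean; a long segment signals outliers.
p <- ggplot(stats, aes(y = Industry)) +
  geom_segment(aes(x = Employees_median, xend = Employees_mean,
                   yend = Industry)) +
  geom_point(aes(x = Employees_median, colour = "Median")) +
  geom_point(aes(x = Employees_mean, colour = "Mean")) +
  scale_x_log10() +
  labs(x = "Employees (log scale)", y = NULL, colour = NULL)
p
```

In the toy data the Software mean is pulled far above its median by a single large company, which is exactly the gap this comparison is meant to expose.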

Now imagine you work for an investor and want to see which industries generate the most revenue per employee. Create a chart that makes this information clear. Once again, the distribution per industry should be shown.
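A hedged sketch of one approach: compute revenue per employee for each company, then show the per-industry distribution as box plots ordered by median. The rows are invented, and the `Revenue` and `Employees` column names are assumptions.

```r
library(ggplot2)

# Toy data; rows are invented for illustration only.
inc <- data.frame(
  Industry  = c("Software", "Software", "Energy", "Energy", "Health"),
  Revenue   = c(1e6, 4e6, 9e6, 3e6, 2e6),
  Employees = c(10, 20, 30, NA, 40),
  stringsAsFactors = FALSE
)

# Keep complete rows with a positive head count, then derive the ratio.
inc <- inc[complete.cases(inc) & inc$Employees > 0, ]
inc$rev_per_emp <- inc$Revenue / inc$Employees

# Box plots per industry, ordered by median, on a log scale.
p <- ggplot(inc, aes(x = reorder(Industry, rev_per_emp, FUN = median),
                     y = rev_per_emp)) +
  geom_boxplot() +
  coord_flip() +
  scale_y_log10() +
  labs(x = "Industry", y = "Revenue per employee (log scale)")
p
```

The log scale is a design choice for skewed ratios; a linear axis would let a few extreme companies compress the rest of the chart.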
