Preface
About the Authors
Introduction: Data Science, Many Skills
What Is Data Science?
The Steps in Doing Data Science
The Skills Needed to Do Data Science
Chapter 1 • About Data
Storing Data—Using Bits and Bytes
Combining Bytes Into Larger Structures
Creating a Data Set in R
Chapter 2 • Identifying Data Problems
Talking to Subject Matter Experts
Looking for the Exception
Exploring Risk and Uncertainty
Chapter 3 • Getting Started With R
Installing R
Using R
Creating and Using Vectors
Chapter 4 • Follow the Data
Understand Existing Data Sources
Exploring Data Models
Chapter 5 • Rows and Columns
Creating Dataframes
Exploring Dataframes
Accessing Columns in a Dataframe
Chapter 6 • Data Munging
Reading a CSV Text File
Removing Rows and Columns
Renaming Rows and Columns
Cleaning Up the Elements
Sorting Dataframes
Chapter 7 • Onward With RStudio®
Using an Integrated Development Environment
Installing RStudio
Creating R Scripts
Chapter 8 • What’s My Function?
Why Create and Use Functions?
Creating Functions in R
Testing Functions
Installing a Package to Access a Function
Chapter 9 • Beer, Farms, and Peas and the Use of Statistics
Historical Perspective
Sampling a Population
Understanding Descriptive Statistics
Using Descriptive Statistics
Using Histograms to Understand a Distribution
Normal Distributions
Chapter 10 • Sample in a Jar
Sampling in R
Repeating Our Sampling
Law of Large Numbers and the Central Limit Theorem
Comparing Two Samples
Chapter 11 • Storage Wars
Importing Data Using RStudio
Accessing Excel Data
Accessing a Database
Comparing SQL and R for Accessing a Data Set
Accessing JSON Data
Chapter 12 • Pictures Versus Numbers
A Visualization Overview
Basic Plots in R
Using ggplot2
More Advanced ggplot2 Visualizations
Chapter 13 • Map Mashup
Creating Map Visualizations With ggplot2
Showing Points on a Map
A Map Visualization Example
Chapter 14 • Word Perfect
Reading in Text Files
Using the Text Mining Package
Creating Word Clouds
Chapter 15 • Happy Words?
Sentiment Analysis
Other Uses of Text Mining
Chapter 16 • Lining Up Our Models
What Is a Model?
Linear Modeling
An Example—Car Maintenance
Chapter 17 • Hi Ho, Hi Ho—Data Mining We Go
Data Mining Overview
Association Rules Data
Association Rules Mining
Exploring How the Association Rules Algorithm Works
Chapter 18 • What’s Your Vector, Victor?
Supervised and Unsupervised Learning
Supervised Learning via Support Vector Machines
Support Vector Machines in R
Chapter 19 • Shiny® Web Apps
Creating Web Applications in R
Deploying the Application
Chapter 20 • Big Data? Big Deal!
What Is Big Data?
The Tools for Big Data
Index
Jeffrey S. Saltz is an Associate Professor at Syracuse University
in the School of Information Studies and Director of the school’s
Master’s of Science program in Applied Data Science. His research
and teaching focus on helping organizations leverage information
technology and data for competitive advantage. Specifically, his
current research focuses on the socio-technical aspects of data
science projects, such as how to coordinate and manage data science
teams. To stay connected to the “real world,” Dr. Saltz
consults with clients ranging from professional football teams to
Fortune 500 organizations. Prior to becoming a professor, Dr.
Saltz’s two decades of industry experience focused on leveraging
emerging technologies and data analytics to deliver innovative
business solutions. In his last corporate role, at JPMorgan Chase,
he reported to the firm’s Chief Information Officer and drove
technology innovation across the organization. Jeff also held
several other key technology management positions at the company,
including CTO and Chief Information Architect. He also served as
Chief Technology Officer and Principal Investor at Goldman Sachs,
where he helped incubate technology start-ups. He started his
career as a programmer, project leader, and consulting engineer with
Digital Equipment Corp. Dr. Saltz holds a B.S. degree in computer
science from Cornell University, an M.B.A. from The Wharton School
at the University of Pennsylvania, and a Ph.D. in Information Systems
from the New Jersey Institute of Technology.
Jeffrey M. Stanton, Ph.D., is a Professor at Syracuse University in
the School of Information Studies. Dr. Stanton’s research focuses
on the impacts of machine learning on organizations and
individuals. He is the author of Reasoning with Data (2017), an
introductory statistics textbook. Stanton has also published many
scholarly articles in peer-reviewed behavioral science journals,
such as the Journal of Applied Psychology, Personnel Psychology,
and Human Performance. His articles also appear in the Journal of
Computational Science Education, Computers and Security,
Communications of the ACM, Computers in Human Behavior, the
International Journal of Human-Computer Interaction, Information
Technology and People, the Journal of Information Systems
Education, the Journal of Digital Information, Surveillance and
Society, and Behaviour & Information Technology. He has also
published numerous book chapters on data science, privacy, research
methods, and program evaluation. Dr. Stanton’s research has
been supported through 19 grants and supplements, including the
National Science Foundation’s CAREER award. Before getting his Ph.D.,
Stanton was a software developer who worked at startup companies in
the publishing and professional audio industries. He holds a
bachelor’s degree in Computer Science from Dartmouth College, and a
master’s and Ph.D. in Psychology from the University of
Connecticut.