Preface xii
HOUR 1: The R Community 1
A Concise History of R 1
The R Community 3
R Development 7
Summary 8
Q&A 8
Workshop 9
Activities 9
HOUR 2: The R Environment 11
Integrated Development Environments 11
R Syntax 14
R Objects 16
Using R Packages 23
Internal Help 28
Summary 29
Q&A 30
Workshop 30
Activities 32
HOUR 3: Single-Mode Data Structures 33
The R Data Types 33
Vectors, Matrices, and Arrays 34
Vectors 35
Matrices 49
Arrays 58
Relationship Between Single-Mode Data Objects 60
Summary 62
Q&A 62
Workshop 63
Activities 64
HOUR 4: Multi-Mode Data Structures 67
Multi-Mode Structures 67
Lists 68
Data Frames 86
Exploring Your Data 93
Summary 98
Q&A 98
Workshop 100
Activities 101
HOUR 5: Dates, Times, and Factors 103
Working with Dates and Times 103
The lubridate Package 107
Working with Categorical Data 108
Summary 112
Q&A 112
Workshop 113
Activities 114
HOUR 6: Common R Utility Functions 115
Using R Functions 115
Functions for Numeric Data 117
Logical Data 121
Missing Data 122
Character Data 123
Summary 125
Q&A 126
Workshop 126
Activities 127
HOUR 7: Writing Functions: Part I 129
The Motivation for Functions 129
Creating a Simple Function 130
The If/Else Structure 136
Summary 146
Q&A 147
Workshop 148
Activities 149
HOUR 8: Writing Functions: Part II 151
Errors and Warnings 151
Checking Inputs 155
The Ellipsis 157
Checking Multivalue Inputs 162
Using Input Definition 164
Summary 168
Q&A 168
Workshop 170
Activities 171
HOUR 9: Loops and Summaries 173
Repetitive Tasks 173
The “apply” Family of Functions 181
The apply Function 183
The lapply Function 195
The sapply Function 204
The tapply Function 208
Summary 213
Q&A 213
Workshop 214
Activities 216
HOUR 10: Importing and Exporting 217
Working with Text Files 217
Relational Databases 223
Working with Microsoft Excel 226
Summary 231
Q&A 232
Workshop 232
Activities 233
HOUR 11: Data Manipulation and Transformation 235
Sorting 236
Appending 237
Merging 238
Duplicate Values 241
Restructuring 242
Data Aggregation 249
Summary 258
Q&A 258
Workshop 259
Activities 259
HOUR 12: Efficient Data Handling in R 261
dplyr: A New Way of Handling Data 261
Efficient Data Handling with data.table 273
Summary 282
Q&A 283
Workshop 283
Activities 284
HOUR 13: Graphics 287
Graphics Devices and Colors 287
High-Level Graphics Functions 289
Low-Level Graphics Functions 298
Graphical Parameters 304
Controlling the Layout 305
Summary 308
Q&A 309
Workshop 309
Activities 311
HOUR 14: The ggplot2 Package for Graphics 313
The Philosophy of ggplot2 313
Quick Plots and Basic Control 314
Changing Plot Types 317
Aesthetics 320
Paneling (a.k.a. Faceting) 328
Custom Plots 333
Themes and Layout 338
The ggvis Evolution 342
Summary 342
Q&A 343
Workshop 343
Activities 344
HOUR 15: Lattice Graphics 345
The History of Trellis Graphics 345
The Lattice Package 346
Creating a Simple Lattice Graph 346
Graph Options 356
Multiple Variables 358
Groups of Data 360
Using Panels 362
Controlling Styles 372
Summary 376
Q&A 377
Workshop 378
Activities 378
HOUR 16: Introduction to R Models and Object Orientation 379
Statistical Models in R 379
Simple Linear Models 380
Assessing a Model in R 382
Multiple Linear Regression 391
Interaction Terms 396
Factor Independent Variables 398
Variable Transformations 402
R and Object Orientation 405
Summary 407
Q&A 408
Workshop 408
Activities 409
HOUR 17: Common R Models 411
Generalized Linear Models 411
Nonlinear Models 423
Survival Analysis 430
Time Series Analysis 441
Summary 452
Q&A 452
Workshop 452
Activities 453
HOUR 18: Code Efficiency 455
Determining Efficiency 455
Initialization 458
Vectorization 459
Using Alternative Functions 462
Managing Memory Usage 463
Integrating with C++ 464
Summary 468
Q&A 469
Workshop 469
Activities 470
HOUR 19: Package Building 471
Why Build an R Package? 471
The Structure of an R Package 472
Code Quality 476
Automated Documentation with roxygen2 477
Building a Package with devtools 482
Summary 485
Q&A 485
Workshop 486
Activities 487
HOUR 20: Advanced Package Building 489
Extending R Packages 489
Developing a Test Framework 490
Including Data in Packages 494
Including a User Guide 496
Code Using Rcpp 501
Summary 502
Q&A 502
Workshop 503
Activities 504
HOUR 21: Writing R Classes 505
What Is a Class? 505
Creating a New S3 Class 509
Generic Functions and Methods 511
Inheritance in S3 516
Documenting S3 518
Limitations of S3 518
Summary 519
Q&A 519
Workshop 520
Activities 520
HOUR 22: Formal Class Systems 523
S4 523
Reference Classes 535
R6 Classes 542
Other Class Systems 544
Summary 544
Q&A 545
Workshop 545
Activities 546
HOUR 23: Dynamic Reporting 547
What Is Dynamic Reporting? 547
An Introduction to knitr 548
Simple Reports with RMarkdown 548
Reporting with LaTeX 553
Summary 557
Q&A 558
Workshop 558
Activities 559
HOUR 24: Building Web Applications with Shiny 561
A Simple Shiny Application 561
Reactive Functions 566
Interactive Documents 569
Sharing Shiny Applications 570
Summary 571
Q&A 571
Workshop 571
Activities 572
APPENDIX: Installation 573
Installing R 573
Installing Rtools for Windows 575
Installing the RStudio IDE 577
Index 579
Andy Nicholls has a Master of Mathematics degree from the University of Bath and a Master of Science in Statistics with Applications in Medicine from the University of Southampton. Andy worked as a Senior Statistician in the pharmaceutical industry for a number of years before joining Mango Solutions as an R consultant in 2011. Since joining Mango, Andy has taught more than 50 on-site R training courses and has been involved in the development of more than 30 R packages. Today, he manages Mango Solutions’ R consultancy team and continues to be a regular contributor to the quarterly LondonR events, by far the largest R user group in the UK, with over 1,000 meet-up members. Andy lives near the historic city of Bath, UK, with his wonderful, tolerant wife and son.
Richard Pugh has a first-class Mathematics degree from the University of Bath. Richard worked as a statistician in the pharmaceutical industry before joining the pre-sales consulting team at Insightful, the developers of S-PLUS. Richard’s role at Insightful included a variety of activities, providing a range of training and consulting services to blue-chip customers across many sectors. In 2002, Richard co-founded Mango Solutions, developing the company and leading technical efforts around R and other analytic software. Richard is now Mango’s Chief Data Scientist and speaks regularly at data science and R events. Richard lives in Bradford on Avon, UK, with his wife and two kids, and spends most of his “spare” (ha!) time renovating his house.
Aimee Gott has a PhD in Statistics from Lancaster University, where she also completed her undergraduate and master’s degrees. As Training Lead, Aimee has delivered over 200 days of training for Mango. She has delivered on-site training courses in Europe and the U.S. in all aspects of R, as well as shorter workshops and online webinars. Aimee oversees Mango’s training course development across the data science pipeline and regularly attends R user groups and meet-ups. In her spare time, Aimee enjoys learning European languages and documenting her travels through photography.