Resources
Data
Open data sets used throughout the course:
- Dinesafe Data: Toronto food inspection information (data)
- Bike Share Data: information on Toronto’s bicycle sharing program (data/codebook)
- Building Permit Data: Toronto building permit information (data/codebook)
- Fuel Consumption Ratings: fuel consumption & CO2 emissions for new vehicles sold in Canada (data)
Computing
We will be using the R programming language for this course.
We will interact with R and other tools through the RStudio IDE.
I recommend you download and install R and RStudio on your personal computer; both programs are freely available for Linux/Windows/Mac. Alternatively, you can use a web-based version by opening a free account at RStudio Cloud, or go to a UTSC computer lab (all have R/RStudio installed).
We will be using several R packages that add important functionality to R (similar to libraries in Python). We will also be using R Markdown, a document format that extends Markdown with embedded R code, to write reports that combine text with R code, output, and plots.
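As a rough illustration of the package workflow, here is a minimal sketch of installing and loading a package. The name "somepackage" is a hypothetical placeholder, since the course's actual package list is not given here. The example loads "tools", a package that ships with R, only so that the code runs without installing anything.

```r
# One-time step per machine for a package that is not yet installed.
# "somepackage" is a hypothetical placeholder, so the line stays commented out.
# install.packages("somepackage")

# Load a package at the start of each session, much like an import in Python.
# "tools" ships with R, so this runs without installing anything.
library(tools)

# Functions from the loaded package can now be called by name.
ext <- file_ext("report.Rmd")
print(ext)
```

Installing is needed only once per computer, while `library()` has to be called again in every new R session or report that uses the package.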
Learning R
There are many excellent, free resources for learning R; below is a partial list:
Books
- Hands-On Programming with R, by Garrett Grolemund; great introduction to R for beginners.
- R for Data Science, by Hadley Wickham and Garrett Grolemund; methods & packages for doing data science.
- Advanced R, by Hadley Wickham; for experienced programmers who want to delve deeper.
Other
- Reference cards for the tools we will be using (and more).
- Style guide for writing readable R code.
- Directory with various R resources.