Lecture Goals

  • Learn how to:
    • Work w/ relational data
    • Join/merge data-frames
    • Append/bind data-frames
  • Readings

Relational Data

  • Databases are organized collections of data
    • Implemented through some DB Management Systems (e.g. Oracle)
  • Majority of DBs use relational model of data ogranization
    • Data live in multiple inter-related tables
    • SQL is standard language for managing relational DB's

Relational Data

  • Sample university database

Keys

  • Tables are connected through variable(s) called keys
    • Keys are used to identify/relate observations
  • Two types of keys
    • Primary keys uniquely identify observations within a table
    • Foreign keys associate observations with those in other tables

Relationships

  • Primary/foreign key pairs establish relationships between tables

  • Three types of relationships
    • One-to-one: primary-primary pair
    • One-to-many: primary-foreign pair
    • Many-to-many: (primary-foreign) [junction table] (foreign-primary) pairs

Mutating Joins

  • Expand left/right table with variables from other table, matching observations where possible

Mutating Joins

  • Merge tables' variables, keeping only matching (inner join) or all (full join) observations

Filtering Joins

  • Filter observations in one table, based on whether they match (semi join) or not (anti join) with other table

Set Operations

  • Operations on sets or observations (rows) with the same variables

Binding

  • Append table rows/columns