Exercises for Chapter 9
In these exercises, we use again the database
dbadvanced and the tables we have created in the chapter. Make sure that this database exists exactly as described in the chapter. Alternatively, you can regenerate this database with
library(RPostgres) db <- dbConnect(Postgres(), dbname = "dbadvanced", user = "postgres", password = "pgpasswd" ) dbExecute(db, "DROP TABLE IF EXISTS elections") dbExecute(db, "DROP TABLE IF EXISTS parties") dbExecute(db, "DROP TABLE IF EXISTS first_elections") elections <- read.csv(file.path("ch09", "elections.csv"), header = T) elections$election_date <- as.Date(elections$election_date) dbWriteTable(db, "elections", elections) dbExecute(db, "ALTER TABLE elections ADD COLUMN seat_share real, ADD COLUMN year integer") dbExecute(db, "UPDATE elections SET seat_share=seats::real / seats_total, year=extract(year from election_date)") parties <- read.csv(file.path("ch09", "parties.csv"), header = T) dbWriteTable(db, "parties", parties) populist <- read.csv(file.path("ch09", "populist.csv"), header = T) dbWriteTable(db, "populist", populist) dbExecute(db, "ALTER TABLE parties ADD COLUMN populist integer") dbExecute( db, "UPDATE parties SET populist = populist.populist FROM populist WHERE parties.party_id = populist.parlgov_id" ) dbExecute(db, "ALTER TABLE parties ADD PRIMARY KEY (party_id)") dbExecute(db, "ALTER TABLE elections ADD FOREIGN KEY (party_id) REFERENCES parties (party_id)")
Exercise 1: Grouping and Aggregation with Joined Tables
The PopuList dataset that we have used in the Chapter also includes information about whether a party is considered to be “far left” or “far right”. In this exercise, we want to find out which of these extremist parties have the highest electoral experience, measured by the number of elections they have participated in. As a first step, merge the respective variables
farright from the PopuList into the
parties table. Next, use grouping and aggregation to retrieve the number of elections that parties of either type have participated in. Can you order this list such that the parties with the most electoral experience appear at the top?
Exercise 2: Foreign Keys and Joins If Tables Do Not Match
Rather than merging values from
parties as new attributes, somebody suggests that we could simply join the two tables when we need them, and add a foreign key relationship such that referential integrity is maintained. What foreign key do you need for this? What happens if you try to set up a foreign key, and why?
It seems that there is a number of entries in the PopuList dataset that cause problems. Can you find these entries with a JOIN? Since the standard join retrieves only those records with matching combinations, we will have to use a different type of join. Use only a single SELECT statement for this!
Exercise 3: Saving Intermediate Results
Let us assume that for a new project, we want to know which election results parties achieve when they first run. For this, we need to flag those election results in
elections where a party first shows up in an election. All the data we need for this is already contained in the
Write a short SQL statement that shows you the
party_id and the
election_date when the respective party first ran.
Create a separate table
first_elections, which contains the data created in the previous step. Hint: You can create a new table from a SELECT statement using
CREATE TABLE xyz AS <select statement>. This creates a copy of the result and stores it as a new table.
Finally, we want to update the
elections table such that it gets a new boolean variable
first_election, which indicates whether the party first ran in the given election. Hint: as described in the chapter, create the variable first, and then UPDATE it from the
What is the average election result that parties achieve in their first elections?
Exercise 4: Cascading
elections table depends on
parties, since the latter contains information on the parties participating in the elections. This is why in the chapter, we have defined
party_id to be a foreign key in
election, since it points to the
parties table. As we have seen in the chapter, the database system ensures that these relationships remain intact at the level of the individual records - for example, a record that is still referenced from the other table cannot be deleted. However, the DBMS also maintains these relationships at the level of entire tables. Try deleting the
parties table with
DROP TABLE. What happens, and why?
PostgreSQL gives you a specific recommendation for dealing with this problem: using the
CASCADE keyword. Change your statement such that is uses this keyword. What exactly does it do? Does it delete the
elections table? If not, what do you think has happened?