Exercises for Chapter 9

In these exercises, we use again the database dbadvanced and the tables we have created in the chapter. Make sure that this database exists exactly as described in the chapter. Alternatively, you can regenerate this database with

library(RPostgres)
db <- dbConnect(Postgres(),
  dbname = "dbadvanced",
  user = "postgres",
  password = "pgpasswd"
)

dbExecute(db, "DROP TABLE IF EXISTS elections")
dbExecute(db, "DROP TABLE IF EXISTS parties")
dbExecute(db, "DROP TABLE IF EXISTS first_elections")

elections <- read.csv(file.path("ch09", "elections.csv"), header = T)
elections$election_date <- as.Date(elections$election_date)
dbWriteTable(db, "elections", elections)
dbExecute(db, "ALTER TABLE elections
          ADD COLUMN seat_share real,
          ADD COLUMN year integer")
dbExecute(db, "UPDATE elections
          SET seat_share=seats::real / seats_total,
          year=extract(year from election_date)")

parties <- read.csv(file.path("ch09", "parties.csv"), header = T)
dbWriteTable(db, "parties", parties)

populist <- read.csv(file.path("ch09", "populist.csv"), header = T)
dbWriteTable(db, "populist", populist)
dbExecute(db, "ALTER TABLE parties ADD COLUMN populist integer")
dbExecute(
  db,
  "UPDATE parties
  SET populist = populist.populist
  FROM populist
  WHERE parties.party_id = populist.parlgov_id"
)

dbExecute(db, "ALTER TABLE parties ADD PRIMARY KEY (party_id)")
dbExecute(db, "ALTER TABLE elections ADD FOREIGN KEY (party_id) REFERENCES parties (party_id)")

Exercise 1: Grouping and Aggregation with Joined Tables

The PopuList dataset that we have used in the Chapter also includes information about whether a party is considered to be “far left” or “far right”. In this exercise, we want to find out which of these extremist parties have the highest electoral experience, measured by the number of elections they have participated in. As a first step, merge the respective variables farleft and farright from the PopuList into the parties table. Next, use grouping and aggregation to retrieve the number of elections that parties of either type have participated in. Can you order this list such that the parties with the most electoral experience appear at the top?

Exercise 2: Foreign Keys and Joins If Tables Do Not Match

Rather than merging values from populist into parties as new attributes, somebody suggests that we could simply join the two tables when we need them, and add a foreign key relationship such that referential integrity is maintained. What foreign key do you need for this? What happens if you try to set up a foreign key, and why?

It seems that there is a number of entries in the PopuList dataset that cause problems. Can you find these entries with a JOIN? Since the standard join retrieves only those records with matching combinations, we will have to use a different type of join. Use only a single SELECT statement for this!

Exercise 3: Saving Intermediate Results

Let us assume that for a new project, we want to know which election results parties achieve when they first run. For this, we need to flag those election results in elections where a party first shows up in an election. All the data we need for this is already contained in the elections table.

Write a short SQL statement that shows you the party_id and the election_date when the respective party first ran.

Create a separate table first_elections, which contains the data created in the previous step. Hint: You can create a new table from a SELECT statement using CREATE TABLE xyz AS <select statement>. This creates a copy of the result and stores it as a new table.

Finally, we want to update the elections table such that it gets a new boolean variable first_election, which indicates whether the party first ran in the given election. Hint: as described in the chapter, create the variable first, and then UPDATE it from the first_elections table.

What is the average election result that parties achieve in their first elections?

Exercise 4: Cascading

Our elections table depends on parties, since the latter contains information on the parties participating in the elections. This is why in the chapter, we have defined party_id to be a foreign key in election, since it points to the parties table. As we have seen in the chapter, the database system ensures that these relationships remain intact at the level of the individual records - for example, a record that is still referenced from the other table cannot be deleted. However, the DBMS also maintains these relationships at the level of entire tables. Try deleting the parties table with DROP TABLE. What happens, and why?

PostgreSQL gives you a specific recommendation for dealing with this problem: using the CASCADE keyword. Change your statement such that is uses this keyword. What exactly does it do? Does it delete the elections table? If not, what do you think has happened?