TDM 20100: Project 13 — 2023
Motivation: We’ve covered a lot about SQL in a relatively short amount of time, but we still haven’t touched on some other important SQL topics. In this project, we will touch on some other important SQL topics.
Context: In the previous project, you had the opportunity to take the time to insert data into a sqlite3
database. There are still many common tasks that you may need to perform using a database: triggers, views, transaction, and even a few sqlite3
-specific functionalities that may prove useful.
Scope: SQL
Dataset(s)
For this project, we will be using the lahman
sqlite database. This database contains the data in the directory
-
/anvil/projects/tdm/data/lahman
You may get some more lahman
database information from this youtube video: 2023 SABR Analytics:Sean Lahman, "introduction to Baseball Databases"
For every following cell where you want to run a SQL query, prepend %%sql
to the top of the cell — just like we do for R or bash cells.
Questions
Question 1 (2 pts)
-
Following the instructions to create a new column and a trigger for table "teams"
-
Update the table "teams" and display the updated information
Begin by copying the database from the previous project to your
Go ahead and launch
From within
With any luck, things should be working just fine. Let’s go ahead and create a trigger. A trigger is what it sounds like, given a specific action, do a specific action. This is a powerful tool. One of the most common uses of a trigger that you will see in the wild is the "updated_at" field. This is a field that stores a datetime value, and uses a trigger to automatically update to the current date and time anytime a record in the database is updated. First, we need to create a new column called "updated_at", and set the default value to something. In our case, lets set it to January 1, 1970 at 00:00:00.
If you query the table now, you will see all of the values have been properly added, great!
Now add a trigger called "update_teams_updated_at" that will update the "updated_at" column to the current date and time whenever a record is updated. Check out the official documentation here for examples of triggers. Once your trigger has been written, go ahead and test it out by updating the following record.
If it worked right, your |
Question 2 (2 pts)
Next, we will touch on views. A view is essentially a virtual table that is created from some query and given a name. Why would you want to create such a thing? Well, there could be many reasons. Maybe you have a complex query that you need to run frequently, and it would just be easier to see the final result with a click? Maybe the database has horrible naming conventions and you want to rename things in a view to make it more readable and/or queryable? After some thought, it may occur to you that we’ve had such an instance where a view could be nice using our You may get more information about "view" here: www.sqlitetutorial.net/sqlite-create-view/ |
-
Create a view called "players_with_awards_2020" that will provide information for a player. It should include the player’s name, height, weight, and if the play has an award in 2020; use the year 2020 data, joining the "people" and "awardsplayers" tables.
-
Display 5 records from the view "players_with_awards_2020"
-
use "playerID" to join two tables
Question 3 (2 pts)
Read the official sqlite3
documentation for transactions here. As you will read, you’ve already been using transactions each time you run a query! What we will focus on is how to use transactions to rollback changes, as this is probably the most useful use case you’ll run into.
Connect to our "lahman.db" database from question (1), start a deferred transaction, and update a row, similar to what we did before, using the following query.
UPDATE teams SET teamRank = 30 WHERE yearID = 2022 AND teamID = 'ARI';
Now, query the record to see what it looks like.
SELECT * FROM teams WHERE yearID = 2022 AND teamID ='ARI' and teamRank = 30;
You’ll notice our trigger from before is still working, cool! |
This is pretty great, until you realized that the teamRank was not right! Oh no! Well, at this stage you haven’t committed your transaction yet, so you can just rollback the changes and everything will be back to normal. Give it a try (again, following the official documentation).
After rolling back, run the following query.
SELECT * FROM teams WHERE yearID = 2022 AND teamID = 'ARI' ;
As you can see, the data changed back to the original one! As you can imagine, this is pretty powerful stuff, especially if you are writing to a database and want to make sure things look right before committing the changes.
Question 4 (2 pts)
SQL and sqlite3
are powerful tools, and we’ve barely scratched the surface. Check out the offical documentation, and demonstrate another feature of sqlite3
that we haven’t yet covered.
Some suggestions, if you aren’t interested in browsing the documentation: window functions, math functions, date and time functions, and core functions (there are many we didn’t use!)
Please make sure the queries you run are run from an sql cell in your Jupyter notebook.
Project 13 Assignment Checklist
-
Jupyter Lab notebook with your code, comments and output for the assignment
-
firstname-lastname-project13.ipynb
-
-
Submit the copy of the
lahman.db
file that you made in your home directory. -
Submit files through Gradescope
Please make sure to double check that your submission is complete, and contains all of your code and output before submitting. If you are on a spotty internet connection, it is recommended to download your submission after submitting it to make sure what you think you submitted, was what you actually submitted. In addition, please review our submission guidelines before submitting your project. |