CSci 284 HW Assignment

CSci 284 - Homework 2

Due: Thursday, Feb. 13, 2025

This assignment is worth 50 points. For full credit:

submit responses to BrightSpace by 11:59pm on the due date
see the Deliverables section below

Reading. Math and Aggregate Functions in SQL

Part 1: Working with CSV Files (25 points).

The purpose of this section is to get better acquainted with different ways of reading and writing CSV files using Python. I'll continue assigning small Python programs for review so when we look at larger Python programs later this semester, your Python skills aren't too rusty!

(8 points) Work through this Introduction to CSV Files tutorial. You'll produce four separate programs (there's no program for the JSON formatted data section, but read through it).
See Problem #2 below for the CSV file to use.

Note: the author is not a computer professional or a CS professor. The tutorial is fine, but the comments in the example code are not always appropriate.
Comments should always describe the program or the code block that follows the description.
(7 points) Two of the programs use the CSV file presidents.csv. Download this file into the same directory as your programs, then use a text editor to make the following changes to the file:
- this file was saved in 2019, so it's missing entries for Presidency #46 (Joe Biden) and Presidency #47 (Trump's second term). Add these two rows (you can of course copy most of Trump's information from the current last row).
- former President Jimmy Carter recently died. Add his death-date and death-location information. Also update the left-office field for Presidency #45.
(2+8 points) Modify the read_csv.py program so that it:
- prompts the user for the CSV file name, so it will work with any properly-formatted CSV file, then
- modify the way it writes the data so that it looks like the MySQL table format, for example:
```
+-------+----------------------+-----------+
| name  |      address         |    job    |
+-------+----------------------+-----------+
| Alice | Gainesville, Florida | chef      |
| Bob   | Chicago, Illinois    | writer    |
| Ted   | Miami, Florida       | driver    |
| Carol | Portland, Oregon     | executive |
+-------+----------------------+-----------+
```
Using Python's f-strings (formatted strings) will simplify producing the table format. Click the link for a description and some examples if you didn't study these in CSci 157. And no, it is not required that column headings be centered.

Part 2: SQL Queries (25 points - 5 points each).

The purpose of this section is to get more experience using GROUP BY queries.

For each problem, write an SQL query that computes the requested information. All such queries will come from tables in the cs284_spring25 database. Remember to use describe to re-familiarize yourself with the fields in these tables if necessary. For each query, be sure to match the output shown, including column names, as much as possible. Note: the database content may have changed since these examples where run, so it is okay if your numbers are slightly different.

For your submission, queries and representative results can be cut and pasted into a text file, or you can take screenshots. By representative results, I mean just the first few lines, or the entire table if it isn't too long.

Construct a query that generates this table:

+-------------+-----------+-------------+
| human_throw | bot_throw | num_matches |
+-------------+-----------+-------------+
| r           | r         |         245 |
| r           | p         |         252 |
| r           | s         |         288 |
| p           | r         |         254 |
| p           | p         |         255 |
| p           | s         |         248 |
| s           | r         |         220 |
| s           | p         |         226 |
| s           | s         |         260 |
+-------------+-----------+-------------+

Modify the above so that it looks like this. Hint: use an if statement to turn 'r' into 'rock' and so on.

+-------------+-----------+-------------+
| human_throw | bot_throw | num_matches |
+-------------+-----------+-------------+
| paper       | paper     |         255 |
| paper       | rock      |         254 |
| paper       | scissors  |         248 |
| rock        | paper     |         252 |
| rock        | rock      |         245 |
| rock        | scissors  |         288 |
| scissors    | paper     |         226 |
| scissors    | rock      |         220 |
| scissors    | scissors  |         260 |
+-------------+-----------+-------------+

Create a table that shows how often each human player threw each type of throw.
Hint: start with the query SELECT player_1_id, if(player_1_throw='r',1,0) from rps_matches; and expand from there.

+----------+------+-------+----------+
| human    | rock | paper | scissors |
+----------+------+-------+----------+
| adejuoj0 |    7 |    24 |       20 |
| bahao0   |   42 |    49 |       23 |
| bellijh0 |   17 |    11 |        9 |
| bertct0  |   19 |    15 |       16 |
| bhandu0  |   29 |    29 |       33 |
| carpeca0 |    9 |     8 |        9 |
| darbnjr0 |   24 |    28 |       26 |
| ddrinen  |    1 |     2 |        3 |
| doylear0 |   43 |    36 |       27 |
| hardibc0 |   18 |    20 |       17 |
| jacksbc0 |    5 |     5 |       12 |
| jakoba0  |   17 |    18 |       14 |
| leek0    |  489 |   452 |      447 |
| manacjs0 |   16 |    17 |       14 |
| ricekg0  |   23 |     0 |        0 |
| sulliwj0 |   26 |    43 |       36 |
+----------+------+-------+----------+

Use the DATE_FORMAT() function to re-do problem 2 from the in-class exercise so that it looks like this:

+----------+-------------------------+-------------------------+
| human    | first_match             | most_recent             |
+----------+-------------------------+-------------------------+
| adejuoj0 | Thu, Jan 18 at 8:09 PM  | Thu, Jan 18 at 8:11 PM  |
| bahao0   | Thu, Jan 18 at 9:59 AM  | Fri, Jan 19 at 8:34 AM  |
| bellijh0 | Fri, Jan 19 at 7:58 AM  | Fri, Jan 19 at 8:03 AM  |
| bertct0  | Thu, Jan 18 at 1:44 PM  | Thu, Jan 18 at 1:46 PM  |
| bhandu0  | Thu, Jan 18 at 11:51 AM | Thu, Jan 18 at 11:58 AM |
| carpeca0 | Thu, Jan 18 at 12:24 PM | Thu, Jan 18 at 12:47 PM |
| darbnjr0 | Thu, Jan 18 at 2:10 PM  | Thu, Jan 18 at 2:16 PM  |
| ddrinen  | Thu, Jan 18 at 8:56 AM  | Thu, Jan 18 at 9:35 AM  |
| doylear0 | Thu, Jan 18 at 6:31 PM  | Fri, Jan 19 at 8:34 AM  |
| hardibc0 | Thu, Jan 18 at 3:42 PM  | Thu, Jan 18 at 3:45 PM  |
| jacksbc0 | Fri, Jan 19 at 8:01 AM  | Fri, Jan 19 at 8:02 AM  |
| jakoba0  | Thu, Jan 18 at 6:53 PM  | Thu, Jan 18 at 6:56 PM  |
| leek0    | Thu, Jan 18 at 3:10 PM  | Thu, Jan 18 at 3:32 PM  |
| manacjs0 | Thu, Jan 18 at 9:55 AM  | Thu, Jan 18 at 10:00 AM |
| ricekg0  | Thu, Jan 18 at 1:33 PM  | Thu, Jan 18 at 1:34 PM  |
| sulliwj0 | Fri, Jan 19 at 12:04 AM | Fri, Jan 19 at 8:34 AM  |
+----------+-------------------------+-------------------------+

In the cities table, there are several city names that appear in multiple states. For example, there is a Cleveland, OH and a Cleveland, TN. Springfield is also a popular city name.
Write a query that determines how many city names appear in multiple states (the answer should be 62).

Now, format your table so that it looks like the one below. HINT: you'll want to use at least one GROUP_CONCAT, and also a CONCAT inside the GROUP_CONCAT

+--------------+--------------------------------------------------------------------------------------------+
| city_name    | states_and_pops                                                                            |
+--------------+--------------------------------------------------------------------------------------------+
| Albany       | New York (98424), Georgia (76185), Oregon (51583)                                          |
| Alexandria   | Virginia (148892), Louisiana (48426)                                                       |
| Apple Valley | California (70924), Minnesota (50201)                                                      |
| Auburn       | Washington (74860), Alabama (58582)                                                        |
| Aurora       | Colorado (345803), Illinois (199963)                                                       |

...

| Warren       | Michigan (134873), Ohio (40768)                                                            |
| Westminster  | Colorado (110945), California (91739)                                                      |
| Wilmington   | North Carolina (112067), Delaware (71525)                                                  |
+--------------+--------------------------------------------------------------------------------------------+

Deliverables

For Part 1, create a zip file containing each Python program and the updated presidents.csv file, plus any other CSV files produced.
For Part 2, save a text file containing your results (or create another zip file if you did screenshots).
Post these files to the Assignments page on BrightSpace.