Fall 2025 — Assignment 3

Due 15-October-2025, 1pm EST (before class)
Purpose: If you use Google Cloud Platform (GCP): If you do the bonus points

For this assignment you may work individually or with at most 4 partners (max group size = 5). We strongly encourage you to work with partners as different query designers typically have different perspectives. Besides, you can learn from each other.

Answer the following questions / complete the following tasks. All homeworks are due before class on the due date. Please remember that the UVA Honor Code is in effect.


Part 1: [Total: 100 points] Write SQL queries to solve the following problems

Use an SQL script cardata.sql to create and populate three tables of the US vehicle toy database. (data from https://www.kaggle.com/datasets/doaaalsenani/usa-cers-dataset?resource=download)

You may use phpMyAdmin (locally or on the CS server) or use GCP SQL instance.

Write SQL queries to solve the following problems.

  1. (10 points) Find the average mileage across all cars for each state.
  2. (10 points) List the brands, models, years, prices, and conditions of all blue cars in California and sort them by their price from the most expensive to the least expensive.
  3. (10 points) Find the number of non-expired, non-salvage insurance vehicle listings in each state, which would be helpful for customers looking for states with many listings available. Display the state and the number of non-expired, non-salvage insurance vehicle listings and sort them by the number in descending order. Hint: consider only the vehicles with non-expired listing "and" non-salvage insurance — this means, if a vehicle satisfies *both* requirements, count as 1. If a vehicle satisfies only one of these requirements, do not count it.
  4. (10 points) For each state, find the number of vehicles with a price below the average price of their state and the average mileage of these (below average price) vehicles. Display the state, the number of these vehicles, and the average mileage.
  5. (10 points) For each year, find the ratio of average price vs. average mileage of the cars. Display the years and the ratios, and sort the information by year. Display only the year with a ratio of at least 0.1
  6. (10 points) Find the number of vehicles with prices above the average price of their state. Then, display the state and the number of vehicles.
  7. (10 points) Find the brands of the vehicles with "clean vehicle" title status listed in Virginia and Michigan. Hint: pay attention to "and" — this means, we are considering only those that are listed in *both* Virginia and Michigan — must be listed in both states, not just one state.

    Do not include vehicles with expired license conditions. Do not repeat the names of the brands.

  8. (10 points) Find the brands and the average prices of the vehicles with the "salvage insurance" title status. Display only the brands with the average prices higher than 2000.
  9. (10 points) Find the differences in prices between the least and most expensive vehicles in each state. For each state, display the price difference. Sort the information by the price differences from highest to lowest.
  10. (10 points) Find all vehicle brands that are listed in New York but not listed in Alabama.

Part 2: [10 bonus points] Data exploration and analysis

Let's practice data exploration and analysis using SQL! For this section, the questions will be a bit more open-ended. (Note: This is a bonus section of the assignment. If you do not wish to work on this part and have completed the SQL queries part, then you have finished this assignment!)

  1. (Up to 5 points) Using the following listed criteria, create at least 3 SQL queries on the US vehicle toy database (see Part 1 of the SQL queries part, above) that can be used for analysis. You can create as many queries as you want (they can be simple or complicated queries), as long as all three of the following requirements are satisfied:
    • One of your queries must use an aggregate function (SUM, AVG, MIN, MAX, COUNT)
    • One of your queries must make use of a join (NATURAL JOIN, INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN)
    • One of your queries must use a subquery (or subqueries)

    The bonus points will be awarded based on the quality (meaningful, reasonable, correctness, ...) of your SQL queries.

    Your SQL queries must be *different* from the SQL queries in Part 1 and different from the SQL queries for the more practice [sample solution] activity.

  2. (Up to 3 points) Now that you have all of the queried results, what are some of the patterns or trends in the data? Are there outliers that exist, and if so, what external factors do you think could be affecting these results? Identify other areas of analysis, and write a brief paragraph (5-7 sentences) with your findings.

    The bonus points will be awarded based on the quality of your write-up (meaningful, reasonable, precise, concise, ...)

  3. (Up to 2 points) Provide a visual representation of one of your findings. This can be any type of graph/chart that best represents the data, and provides insight into your analysis (above). Make sure to label the visual with a title, axis labels and titles, key (if needed), and measurement (if applicable).

    The bonus points will be awarded based on the quality of your visual representation and the information being conveyed (meaningful, reasonable, consistent with the findings you discussed previously, ...)


Late submission

To grade and record your assignment efficiently, the following details apply to all homework assignments, unless otherwise specified.


What to submit

Part 1: Your SQL queries Part 2: If you do the bonus points, submit a report containing

Submission

Include all team member names and computingIDs in the report.

Part 1 — Save your answers as .sql file.

Part 2 (if you do the bonus points) — Save your report as a PDF —  No Word document. No handwriting. Only typed PDF is acceptable.

Each team submits only one copy.

Upload your file(s) to Assignment 3 on Gradescope. Make sure you connect your partner to your group on Gradescope so that everyone receives credit.

Verify that you submit the correct version of your file.

Making your submission available to the course staff is your responsibility; if we cannot access or open your file, we have to assign a zero grade. Be sure to test access to your file before the due date.



Copyright © 2025 Upsorn Praphamontripong
Released under the Creative Commons License CC-BY-NC-SA 4.0 license.
Last updated 2025-09-28 9:31