Spring 2024 — Assignment 4

Due 12-April, 12pm EST (before class)
Purpose:

For this assignment, you may work individually or with at most 4 partners (max group size = 5). We strongly encourage you to work with partner(s) as different query designers typically have different perspectives. Besides, you can learn from each other.

Answer the following questions / complete the following tasks. All homework assignments are due before class on the due date. Please remember that the UVA Honor Code is in effect.


Note on handwriting and hand drawing

For this assignment, you may write your answers by hand ** neatly **

You may write RA expressions and/or draw RA trees by hand, take pictures, and embed them in your report. Do not submit the pictures separately. Submit a single report containing all of your answers, and save your report as a .pdf file.

** Draw / write legibly **. We have to deduct points if we are unsure what you drew/wrote.


[Total: 100 pts.]


Part 1 [70 points]: Relational algebra (RA)

  1. [40 points] Use the following schema, write RA queries to solve the problems.
    Student (sID, sname, email)
    Course (cID, insID, cname)
    Instructor (insID, iname, email)
    Grades (sID, cID, grade)
    1. (10 pts.) Find the names and emails of the students who have taken the course with course ID "CS4750"
    2. (10 pts.) Find the IDs of the students who never received a grade "D"
    3. (10 pts.) Find the IDs and names of the courses taught by at least two different instructors
    4. (10 pts.) Find the IDs and names of all instructors and students
  2. [30 points] Use the following schema, write RA queries to solve the problems.
    Supplier (sid, sname, address) 
    Part (pid, pname, color) 
    Catalog (sid, pid, cost)    -- which supplier supplies which part and at which price (cost)
    1. (10 pts.) Find the names of suppliers who supply some red parts.
    2. (10 pts.) Find the IDs of suppliers who supply only red parts (not all red parts, only red parts).
      For example, there are 10 red parts in the Part relation.        
         - Supplier 'A' supplies 2 red parts and 3 green parts
         - Supplier 'B' supplies 10 red parts, 1 yellow part, 6 blue parts
         - Supplier 'C' supplies 3 green parts
         - Supplier 'D' supplies 4 red parts
         - Supplier 'E' does not supply any parts (at the moment)
                    
      Your RA query should return only supplier 'D' 
      
      A supplier who supplies no parts is not considered a supplier who supplies only red parts. 
    3. (10 pts.) Find the IDs of suppliers who supply all parts with cost < 20.00
      Imagine, our company sells parts that cost < 20 and parts that cost >= 20. 
      The cost is set by our company. 
      Multiple suppliers can supply the same part, which costs the same.      
      
      List all the suppliers who supply all of the parts our company sells that cost < 20,
      not suppliers who supply just some items that are < 20. 
      Your RA query should return all the supplier IDs that satisfy the condition.
                    
      For example, suppose our company sells
         - part A, $12
         - part B, $15
         - part C, $15
         - part D, $18
         - part E, $20
         - part F, $22
         - part G, $50
         
      Supplier#1 supplies parts A, B, E, F
      Supplier#2 supplies parts A, B, C, D
      Supplier#3 supplies parts A, B 
      Supplier#4 supplies parts A, B, C, D, E, F, G
      
      Your RA query would return only supplier#2 and supplier#4 
      since these suppliers supply all parts our company sells that are < 20 (parts A, B, C, and D)         

Part 2 [30 points]: Cost estimation

Consider the following relations:
Actor (pid, fname, lname, gender)
Movie (mid, name, year, revenue)
Directors (did, fname, lname)
Casts (pid, mid, role)
Movie_Directors (did, mid) 
with the following properties:
T(Directors)            = 480 = number of tuples in relation Directors
V(Directors, fname)	= 360 = number of distinct values of attribute fname in Directors
V(Directors, lname)	= 440 = number of distinct values of attribute lname in Directors

T(Movie_Directors)      = 600 = number of tuples in relation Movie_Directors
V(Movie_Directors, did)	= 450 = number of distinct values of attribute did in Movie_Directors
V(Movie_Directors, mid)	= 550 = number of distinct values of attribute mid in Movie_Directors 
Consider the following query
SELECT D.fname, D.lname, COUNT(*) AS number_movies
FROM   Directors D, Movie_Directors MD
WHERE  D.did = MD.did
GROUP BY D.did, D.fname, D.lname; 
  1. (10 pts.) Write an RA expression that is equivalent to the above SQL query
  2. (10 pts.) Draw an RA tree representing a logical plan of the given SQL query and then estimate the number of tuples produced by the query. For full credit, show all of your work

    [You may draw the RA tree and compute the cost by hand, take a picture, and embed it in your report. Draw/write legibly; we have to deduct points if we are unsure what you drew/wrote.]

  3. (10 pts.) Based on the above query what is the range of number_movies. For credits, explain your analysis or show your thought process.

Late submission

To grade and record your assignment efficiently, the following details apply to all homework assignments, unless otherwise specified.


Submission

Include all team member names and computingIDs in the report.

Save your report as a PDF.  No Word document.

Each team submits only one copy.

Upload your report as a PDF to the Assignment 4 on Gradescope. Make sure you connect your partner to your group on Gradescope so that everyone receives credit.

Making your submission available to the course staff is your responsibility; if we cannot access or open your file, we have to assign a zero grade. Be sure to test access to your file before the due date.



Copyright © 2024 Upsorn Praphamontripong

Released under the Creative Commons License CC-BY-NC-SA 4.0 license.

Last updated 2024-04-04 19:27