Activity: Indexing

(no submission)

Purpose:

You may make a copy of a worksheet and complete this activity, or simply type your answers in any text editor.

You may work alone or with at most two other students in this course (feel free to make use of any communication channels of your choice, or simply type in Zoom's chat .. or talk if that works for your team).


Part 1: Work with B+ tree

  1. Construct a B+ tree for the following set of key values:
    (10, 15, 20, 30, 60, 65, 80, 95, 100, 112, 124, 140, 148)
    Assume that the tree is initially empty and values are added in ascending order. Construct B+ tree for the cases where the number of pointers (n) that will fit in one node is five.

  2. Use the B+ tree constructed in question 1, what is the minimum number of pointers to be followed to satisfy the query: Get all records of data associated with key(s) between 40 and 85?
  3. Use the B+ tree from question 1, apply each of the following operations in order, and show the tree after each operation.
    1. Insert 18
    2. Insert 40
    3. Insert 85
    4. Delete 124
    5. Insert 138
    6. Insert 135
    7. Insert 120

Part 2: Think about your indexing choices

Assume each disk block holds t tuples of a relation customer. A relation customer has 100 tuples; and 10 disk blocks are used to hold tuples of the relation. An attribute city indicates a city where a customer lives. The customer relation is sorted by city (ascending). There are 80 distinct values of the city in the customer relation.

For each of the following data access methods, estimate the I/O cost to answer the query: Find information of customers who live in Charlottesville

  1. Use no index (and thus need to go directly to the data file and do sequential scan)
  2. Use clustered index (assume search key of a clustered index is based on city of the customer relation)
  3. Use unclustered index (assume search key of an unclustered index is based on an attribute that is not city)