Class 13 – Wednesday, September 23
A dataset by any other name is still a dataset (but it is not a set)
In a loop, a loop — Nesting, but not like a bird — Repeating again
Asian lives matter — We are a single people — That celebrate you
Look both ways
Agenda
- Dataset processing
Downloads and redownloads
- Program re_going.py
- Revisits the introduction to datasets
- Program rc_printing.py
- Introduces nested looping
- Program column_grabbing.py
- Copies a column out of a dataset
- Program lotta_books.py
- Examines a literal dataset based on the web dataset best_sellers.csv
To do list
- Complete current homework
- Examine recent artifacts
Program re_going.py
- Demonstrates looping through a web-specified dataset
Some program runs
Enter name of dataset: elevations.csv
table: [['Location', 'Author', 'Max Height', 'Min Height'], ['Narnia', 'Lewis', 4810, -10], ['Neverland', 'Milne', 426, -2], ['Oz', 'Baum', 1231, 679], ['Sleepy Hollow', 'Irving', 1629, 304], ['Stars Hollow', 'Sherman-Palladino', 725, 152], ['Toyland', 'MacDonough', 6187, 0], ['Wonderland', 'Carroll', 5895, -5]]
the table has 8 rows
row ['Location', 'Author', 'Max Height', 'Min Height'] has 4 columns
row ['Narnia', 'Lewis', 4810, -10] has 4 columns
row ['Neverland', 'Milne', 426, -2] has 4 columns
row ['Oz', 'Baum', 1231, 679] has 4 columns
row ['Sleepy Hollow', 'Irving', 1629, 304] has 4 columns
row ['Stars Hollow', 'Sherman-Palladino', 725, 152] has 4 columns
row ['Toyland', 'MacDonough', 6187, 0] has 4 columns
row ['Wonderland', 'Carroll', 5895, -5] has 4 columns
row 0 : ['Location', 'Author', 'Max Height', 'Min Height']
row 1 : ['Narnia', 'Lewis', 4810, -10]
row 2 : ['Neverland', 'Milne', 426, -2]
row 3 : ['Oz', 'Baum', 1231, 679]
row 4 : ['Sleepy Hollow', 'Irving', 1629, 304]
row 5 : ['Stars Hollow', 'Sherman-Palladino', 725, 152]
row 6 : ['Toyland', 'MacDonough', 6187, 0]
row 7 : ['Wonderland', 'Carroll', 5895, -5]
Enter name of dataset: oceania.csv
table: [['Country', 'Females', 'Males'], ['Australia', 11175724, 11092660], ['Fiji', 421365, 439258], ['French Polynesia', 132082, 138682], ['New Caledonia', 125322, 125548], ['New Zealand', 2223281, 2144855], ['Papua New Guinea', 3359979, 3498287], ['Solomon Islands', 259909, 278239], ['Vanuatu', 117573, 122078]]
the table has 9 rows
row ['Country', 'Females', 'Males'] has 3 columns
row ['Australia', 11175724, 11092660] has 3 columns
row ['Fiji', 421365, 439258] has 3 columns
row ['French Polynesia', 132082, 138682] has 3 columns
row ['New Caledonia', 125322, 125548] has 3 columns
row ['New Zealand', 2223281, 2144855] has 3 columns
row ['Papua New Guinea', 3359979, 3498287] has 3 columns
row ['Solomon Islands', 259909, 278239] has 3 columns
row ['Vanuatu', 117573, 122078] has 3 columns
row 0 : ['Country', 'Females', 'Males']
row 1 : ['Australia', 11175724, 11092660]
row 2 : ['Fiji', 421365, 439258]
row 3 : ['French Polynesia', 132082, 138682]
row 4 : ['New Caledonia', 125322, 125548]
row 5 : ['New Zealand', 2223281, 2144855]
row 6 : ['Papua New Guinea', 3359979, 3498287]
row 7 : ['Solomon Islands', 259909, 278239]
row 8 : ['Vanuatu', 117573, 122078]
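The runs above suggest three passes over the table: one to count rows, one to visit each row directly, and one to visit rows by index. The following is a minimal sketch of that pattern, not the course's re_going.py itself. It assumes a small literal table copied from the first rows of the oceania.csv run instead of fetching the dataset from the web, and the function name describe is hypothetical.

```python
# Assumption: a literal table stands in for the web-specified dataset.
# These rows are copied from the start of the oceania.csv run above.
table = [
    ['Country', 'Females', 'Males'],
    ['Australia', 11175724, 11092660],
    ['Fiji', 421365, 439258],
]

def describe(table):
    """Return the report lines for a table (a list of rows, each a list of cells)."""
    lines = []
    nrows = len(table)
    lines.append('the table has ' + str(nrows) + ' rows')

    # Pass 1: loop over the rows themselves.
    for row in table:
        lines.append('row ' + str(row) + ' has ' + str(len(row)) + ' columns')

    # Pass 2: loop over the row indices, then look each row up.
    for r in range(nrows):
        lines.append('row ' + str(r) + ' : ' + str(table[r]))

    return lines

lines = describe(table)
for line in lines:
    print(line)
```

Looping over the rows directly is enough when only the row is needed; looping over range(nrows) is useful when the row number itself must be reported.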
Program rc_printing.py
- Demonstrates printing a table of values (cells) for a user-specified number of rows and columns.
- The value of each cell is the sum of its row and column indices.
Some program runs
Number of rows and columns: 3 4
0 1 2 3
1 2 3 4
2 3 4 5
Number of rows and columns: 4 5
0 1 2 3 4
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
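One way to produce output like the runs above is a nested loop: the outer loop walks the row indices, and the inner loop walks the column indices for that row. This is a sketch under assumptions, not the original rc_printing.py. The names build_grid and format_grid are hypothetical, and the row and column counts are fixed here, whereas the original program reads them from the user.

```python
def build_grid(nrows, ncols):
    """Build a list of rows where each cell holds row index + column index."""
    grid = []
    for r in range(nrows):          # outer loop: one pass per row
        row = []
        for c in range(ncols):      # inner loop: one pass per column in this row
            row.append(r + c)
        grid.append(row)
    return grid

def format_grid(grid):
    """Turn each row into a line of space-separated cell values."""
    lines = []
    for row in grid:
        text = ''
        for cell in row:
            text = text + str(cell) + ' '
        lines.append(text.strip())
    return lines

# Assumption: fixed sizes stand in for the user's reply of "3 4".
nrows, ncols = 3, 4
for line in format_grid(build_grid(nrows, ncols)):
    print(line)
```

The inner loop runs to completion once for every iteration of the outer loop, so a 3 by 4 request computes 12 cells.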
Program column_grabbing.py
- For a user-specified column index, produces a list of the values in that column
Some program runs
Enter name of dataset: oceania.csv
Enter column index: 1
row ['Country', 'Females', 'Males'] : column 1 cell: Females
row ['Australia', 11175724, 11092660] : column 1 cell: 11175724
row ['Fiji', 421365, 439258] : column 1 cell: 421365
row ['French Polynesia', 132082, 138682] : column 1 cell: 132082
row ['New Caledonia', 125322, 125548] : column 1 cell: 125322
row ['New Zealand', 2223281, 2144855] : column 1 cell: 2223281
row ['Papua New Guinea', 3359979, 3498287] : column 1 cell: 3359979
row ['Solomon Islands', 259909, 278239] : column 1 cell: 259909
row ['Vanuatu', 117573, 122078] : column 1 cell: 117573
Column 1 : ['Females', 11175724, 421365, 132082, 125322, 2223281, 3359979, 259909, 117573]
Enter name of dataset: elevations.csv
Enter column index: 0
row ['Location', 'Author', 'Max Height', 'Min Height'] : column 0 cell: Location
row ['Narnia', 'Lewis', 4810, -10] : column 0 cell: Narnia
row ['Neverland', 'Milne', 426, -2] : column 0 cell: Neverland
row ['Oz', 'Baum', 1231, 679] : column 0 cell: Oz
row ['Sleepy Hollow', 'Irving', 1629, 304] : column 0 cell: Sleepy Hollow
row ['Stars Hollow', 'Sherman-Palladino', 725, 152] : column 0 cell: Stars Hollow
row ['Toyland', 'MacDonough', 6187, 0] : column 0 cell: Toyland
row ['Wonderland', 'Carroll', 5895, -5] : column 0 cell: Wonderland
Column 0 : ['Location', 'Narnia', 'Neverland', 'Oz', 'Sleepy Hollow', 'Stars Hollow', 'Toyland', 'Wonderland']
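The runs above can be read as an accumulation: start with an empty list, visit each row, and append the cell found at the requested column index. Here is a minimal sketch of that idea, assuming a literal table copied from the first rows of the oceania.csv run and a fixed column index in place of user input. The function name grab_column is hypothetical.

```python
# Assumption: a literal table stands in for the web-specified dataset.
table = [
    ['Country', 'Females', 'Males'],
    ['Australia', 11175724, 11092660],
    ['Fiji', 421365, 439258],
]

def grab_column(table, c):
    """Return a new list holding the cell at column index c from every row."""
    column = []
    for row in table:
        cell = row[c]        # the cell in this row at the requested column
        column.append(cell)
    return column

c = 1                        # assumption: stands in for the user's reply
column = grab_column(table, c)
print('Column', c, ':', column)
```

Because the header row is part of the table, the column's label ('Females' here) is the first element of the result, as it is in the runs above.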
Program lotta_books.py
- Examines a literal dataset based on the web dataset best_sellers.csv
Program run
header: ['Name', 'Author', 'Language', 'Date', 'Sales']
sales column: 4
name column: 0
date column: 3
total sold: 1897000000
dates: [1865, 1939, 1754, 1605, 1997, 1937, 1943, 1954, 1859]
earliest: 1605
latest : 1997
average date: 1872
row with earliest book: 3
row with latest book : 4
info on earliest: ['Don Quixote', 'de Cervantes', 'Spanish', 1605, 500000000]
info on latest: ['Harry Potter', 'Rowling', 'English', 1997, 447000000]
name of earliest: Don Quixote
name of latest: Harry Potter
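Several values in the run above can be reproduced from the header and the dates list alone. The sketch below is not the original lotta_books.py. It uses only the header and dates printed in the run, and it assumes the average is computed with integer division, which is consistent with the printed 1872 but is not confirmed by the notes. It also assumes row numbers count data rows only, with the header held separately, since that matches the printed rows 3 and 4.

```python
# Values copied from the program run above.
header = ['Name', 'Author', 'Language', 'Date', 'Sales']
dates = [1865, 1939, 1754, 1605, 1997, 1937, 1943, 1954, 1859]

# Find where each column of interest sits by looking up its label.
sales_column = header.index('Sales')
name_column = header.index('Name')
date_column = header.index('Date')

earliest = min(dates)
latest = max(dates)

# Assumption: integer division, which drops the fractional part.
average_date = sum(dates) // len(dates)

# Position of each extreme within the dates list (first match).
earliest_row = dates.index(earliest)
latest_row = dates.index(latest)

print('sales column:', sales_column)
print('name column:', name_column)
print('date column:', date_column)
print('earliest:', earliest)
print('latest :', latest)
print('average date:', average_date)
print('row with earliest book:', earliest_row)
print('row with latest book :', latest_row)
```

Once the row index is known, the full record and the book's name follow from indexing the dataset with that row and then with name_column.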