Class 13 – Wednesday, September 23

A dataset by any other name is still a dataset (but it is not a set)

In a loop, a loop — Nesting, but not like a bird — Repeating again

Asian lives matter — We are a single people — That celebrate you

Look both ways

Back or ahead

Agenda

Dataset processing

Downloads and redownloads

Program re_going.py

Revisits introducing datasets

Program rc_printing.py

Introduces nested looping

Program column_grabbing.py

Copies a column out of a dataset

Program lotta_books.py

Examines a literal dataset based on the web dataset best_sellers.csv

To do list

Complete current homework

Examine recent artifacts

Program re_going.py

Demonstrates looping through a web-specified dataset

Some program runs

Enter name of dataset: elevations.csv

table: [['Location', 'Author', 'Max Height', 'Min Height'], ['Narnia', 'Lewis', 4810, -10], ['Neverland', 'Milne', 426, -2], ['Oz', 'Baum', 1231, 679], ['Sleepy Hollow', 'Irving', 1629, 304], ['Stars Hollow', 'Sherman-Palladino', 725, 152], ['Toyland', 'MacDonough', 6187, 0], ['Wonderland', 'Carroll', 5895, -5]]

the table has 8 rows

row ['Location', 'Author', 'Max Height', 'Min Height'] has 4 columns
row ['Narnia', 'Lewis', 4810, -10] has 4 columns
row ['Neverland', 'Milne', 426, -2] has 4 columns
row ['Oz', 'Baum', 1231, 679] has 4 columns
row ['Sleepy Hollow', 'Irving', 1629, 304] has 4 columns
row ['Stars Hollow', 'Sherman-Palladino', 725, 152] has 4 columns
row ['Toyland', 'MacDonough', 6187, 0] has 4 columns
row ['Wonderland', 'Carroll', 5895, -5] has 4 columns

row 0 : ['Location', 'Author', 'Max Height', 'Min Height']
row 1 : ['Narnia', 'Lewis', 4810, -10]
row 2 : ['Neverland', 'Milne', 426, -2]
row 3 : ['Oz', 'Baum', 1231, 679]
row 4 : ['Sleepy Hollow', 'Irving', 1629, 304]
row 5 : ['Stars Hollow', 'Sherman-Palladino', 725, 152]
row 6 : ['Toyland', 'MacDonough', 6187, 0]
row 7 : ['Wonderland', 'Carroll', 5895, -5]

Enter name of dataset: oceania.csv

table: [['Country', 'Females', 'Males'], ['Australia', 11175724, 11092660], ['Fiji', 421365, 439258], ['French Polynesia', 132082, 138682], ['New Caledonia', 125322, 125548], ['New Zealand', 2223281, 2144855], ['Papua New Guinea', 3359979, 3498287], ['Solomon Islands', 259909, 278239], ['Vanuatu', 117573, 122078]]

the table has 9 rows

row ['Country', 'Females', 'Males'] has 3 columns
row ['Australia', 11175724, 11092660] has 3 columns
row ['Fiji', 421365, 439258] has 3 columns
row ['French Polynesia', 132082, 138682] has 3 columns
row ['New Caledonia', 125322, 125548] has 3 columns
row ['New Zealand', 2223281, 2144855] has 3 columns
row ['Papua New Guinea', 3359979, 3498287] has 3 columns
row ['Solomon Islands', 259909, 278239] has 3 columns
row ['Vanuatu', 117573, 122078] has 3 columns

row 0 : ['Country', 'Females', 'Males']
row 1 : ['Australia', 11175724, 11092660]
row 2 : ['Fiji', 421365, 439258]
row 3 : ['French Polynesia', 132082, 138682]
row 4 : ['New Caledonia', 125322, 125548]
row 5 : ['New Zealand', 2223281, 2144855]
row 6 : ['Papua New Guinea', 3359979, 3498287]
row 7 : ['Solomon Islands', 259909, 278239]
row 8 : ['Vanuatu', 117573, 122078]
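
The program's source is not reproduced in these notes, so the following is only a minimal sketch of the kind of looping the runs above suggest. It assumes the dataset has already been turned into a list of rows; the literal `table` below copies the first few rows of the elevations.csv run, and the helper name `describe` is hypothetical.

```python
# A literal stand-in for the downloaded dataset (first rows of the
# elevations.csv run above). How the class program fetches the data
# from the web is not shown in the notes, so it is not imitated here.
table = [
    ['Location', 'Author', 'Max Height', 'Min Height'],
    ['Narnia', 'Lewis', 4810, -10],
    ['Neverland', 'Milne', 426, -2],
    ['Oz', 'Baum', 1231, 679],
]


def describe(table):
    """Return report lines in the style of the runs above (hypothetical helper)."""
    lines = []

    # A dataset is a list of rows, so len() gives the row count.
    lines.append(f"the table has {len(table)} rows")

    # Loop over the rows themselves: each row is a list of cells.
    for row in table:
        lines.append(f"row {row} has {len(row)} columns")

    # Loop over the row indices when the position matters too.
    for i in range(len(table)):
        lines.append(f"row {i} : {table[i]}")

    return lines


for line in describe(table):
    print(line)
```

The first loop visits rows directly; the second visits indices, which is the form to reach for when the row number has to be printed or remembered.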

Program `rc_printing.py`

Demonstrates printing a table of values (cells) for a user-specified number of rows and columns.

The value of each cell should be the sum of its row and column indices.

Some program runs

Number of rows and columns: 3 4

0 1 2 3
1 2 3 4
2 3 4 5

Number of rows and columns: 4 5

0 1 2 3 4
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
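
One way to get this output is a loop nested inside a loop: the outer loop walks the row indices and the inner loop walks the column indices. The sketch below is an assumption about the approach, not the class program itself; the function name `cell_rows` is hypothetical, and the user prompt is left out so the example runs without input.

```python
def cell_rows(nbr_rows, nbr_cols):
    """Build one text line per row; each cell is row index + column index."""
    lines = []
    for r in range(nbr_rows):          # outer loop: one pass per row
        cells = []
        for c in range(nbr_cols):      # inner loop: one pass per column
            cells.append(str(r + c))   # cell value is the sum of its indices
        lines.append(" ".join(cells))
    return lines


# Same sizes as the first run above: 3 rows and 4 columns.
for line in cell_rows(3, 4):
    print(line)
```

The inner loop runs to completion for every single pass of the outer loop, which is why a 3 by 4 request produces 12 cells.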

Program `column_grabbing.py`

For a user-specified column index, produces a list of the values in that column.

Some program runs

Enter name of dataset: oceania.csv
Enter column index: 1

row ['Country', 'Females', 'Males'] : column 1 cell: Females
row ['Australia', 11175724, 11092660] : column 1 cell: 11175724
row ['Fiji', 421365, 439258] : column 1 cell: 421365
row ['French Polynesia', 132082, 138682] : column 1 cell: 132082
row ['New Caledonia', 125322, 125548] : column 1 cell: 125322
row ['New Zealand', 2223281, 2144855] : column 1 cell: 2223281
row ['Papua New Guinea', 3359979, 3498287] : column 1 cell: 3359979
row ['Solomon Islands', 259909, 278239] : column 1 cell: 259909
row ['Vanuatu', 117573, 122078] : column 1 cell: 117573

Column 1 : ['Females', 11175724, 421365, 132082, 125322, 2223281, 3359979, 259909, 117573]

Enter name of dataset: elevations.csv
Enter column index: 0

row ['Location', 'Author', 'Max Height', 'Min Height'] : column 0 cell: Location
row ['Narnia', 'Lewis', 4810, -10] : column 0 cell: Narnia
row ['Neverland', 'Milne', 426, -2] : column 0 cell: Neverland
row ['Oz', 'Baum', 1231, 679] : column 0 cell: Oz
row ['Sleepy Hollow', 'Irving', 1629, 304] : column 0 cell: Sleepy Hollow
row ['Stars Hollow', 'Sherman-Palladino', 725, 152] : column 0 cell: Stars Hollow
row ['Toyland', 'MacDonough', 6187, 0] : column 0 cell: Toyland
row ['Wonderland', 'Carroll', 5895, -5] : column 0 cell: Wonderland

Column 0 : ['Location', 'Narnia', 'Neverland', 'Oz', 'Sleepy Hollow', 'Stars Hollow', 'Toyland', 'Wonderland']
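
The column copy can be sketched as an accumulation: start with an empty list, visit every row, and append the cell found at the requested index. This is a guess at the shape of the program rather than its actual source; the function name `grab_column` is hypothetical and the small literal `table` repeats the first rows of the oceania.csv run.

```python
# First rows of the oceania.csv run above, written as a literal dataset.
table = [
    ['Country', 'Females', 'Males'],
    ['Australia', 11175724, 11092660],
    ['Fiji', 421365, 439258],
]


def grab_column(table, c):
    """Return a new list holding the cell at index c from every row."""
    column = []
    for row in table:
        cell = row[c]        # the cell in column c of this row
        column.append(cell)  # accumulate it into the result
    return column


females = grab_column(table, 1)
print("Column 1 :", females)
```

As in the runs above, the header cell comes along with the data, so a later calculation on the column would need to skip the first element.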

Program lotta_books.py

Examines a literal dataset based on the web dataset best_sellers.csv

Program run

header: ['Name', 'Author', 'Language', 'Date', 'Sales']

sales column: 4
name column: 0
date column: 3

total sold: 1897000000

dates: [1865, 1939, 1754, 1605, 1997, 1937, 1943, 1954, 1859]

earliest: 1605
latest : 1997

average date: 1872

row with earliest book: 3
row with latest book : 4

info on earliest: ['Don Quixote', 'de Cervantes', 'Spanish', 1605, 500000000]
info on latest: ['Harry Potter', 'Rowling', 'English', 1997, 447000000]

name of earliest: Don Quixote
name of latest: Harry Potter
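
The date figures in this run can be reproduced from the `dates` list alone. The sketch below assumes the program relies on Python's built-in `min`, `max`, `sum`, and `list.index`, which the notes do not confirm. Integer division is used for the average because it matches the 1872 shown above.

```python
# The dates column exactly as printed in the program run above.
dates = [1865, 1939, 1754, 1605, 1997, 1937, 1943, 1954, 1859]

earliest = min(dates)
latest = max(dates)

# Integer division drops the fractional part of the average.
average = sum(dates) // len(dates)

# index() reports the position of the first match in the list.
i_earliest = dates.index(earliest)
i_latest = dates.index(latest)

print("earliest:", earliest)
print("latest :", latest)
print("average date:", average)
print("row with earliest book:", i_earliest)
print("row with latest book :", i_latest)
```

Once the position of the earliest or latest date is known, the same index can be used to look up the whole row, and from that row the name in column 0.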
