Class 14 — Friday February 14
Web chrestomathics
How was I to know — Person sitting next to me — Would become my spouse
Look both ways
Agenda
- Web access and analysis
- Foreshadow programmer-defined functions
Notices
- Happy Valentine's Day to you all
- No office hours today
For the fun of it
- Reveal one of your superpowers.
- Share a selfie.
- Sign up for peer mentoring
To do list
- Review class artifacts.
- Do homework.
- Prepare for Test 1
Downloads
- Program master_plan.py
- Program secret_revealed.py
- Program paging_page.py
- Program get_a_dataset.py
- Module url.py
CS 1112 CSV datasets
- Primary examples
Web programming
- The acronym URL stands for Uniform Resource Locator. Think of a URL as an address. Everything on the WWW has its own unique URL.
- Our introduction to interacting with the web is intentionally simple. Industrial-strength web applications also require familiarity with other and more powerful URL modules.
- To start off, the only thing we need is access to the module urllib.request. The module supports working with URLs.
import urllib.request
- The module has a function urllib.request.urlopen() that returns a connector to a URL resource (think web page). Sample usage:
stream = urllib.request.urlopen( link )
- Although we do not care about the type itself, the value returned by urlopen() is an http.client.HTTPResponse object. What we do care about is that such an object has a function read() that can be used to get the contents of the web resource to which it is connected.
page = stream.read()
- The contents provided by read() are a sequence of bytes (a Python bytes object) encoded in a web format rather than regular text.
- Besides having functions lower() and upper(), the value returned by read() also has a function decode() that can be used to get a plain-text version of itself.
text = page.decode()
The above assignment sets text to be the decoded contents of the URL resource named by link; that is, text is a string equaling the contents of the URL resource indicated by link.
- The four statements form a template for getting the contents of a URL resource in string format.
import urllib.request # get module access
stream = urllib.request.urlopen( link ) # open connector to the link web resource
page = stream.read() # read contents of the resource
text = page.decode() # decode contents as normal text string
- What happens next is problem-dependent.
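The template can also be packaged as a small programmer-defined function, which foreshadows the next topic. The sketch below is an illustration only: the name get_text() is hypothetical and is not taken from the course module url.py, whose contents are not shown here. The sample link is a data: URL, which urlopen() also accepts, so the example runs without a network connection.

```python
import urllib.request                        # get module access

def get_text( link ):
    """ Hypothetical helper: returns the contents of link as a string """
    stream = urllib.request.urlopen( link )  # open connector to the link resource
    page = stream.read()                     # read contents; page is a bytes object
    text = page.decode()                     # decode contents as normal text string
    return text

# assumption: a data: URL stands in for a real web page
sample_link = "data:text/plain,hello%20web"
text = get_text( sample_link )
print( type( text ) )                        # the decoded result is a str
print( text )
```

With a real web address in place of sample_link, the same call would return the page contents, provided the page is reachable and uses the default UTF-8 encoding.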
Program master_plan.py
- Displays the word of the day from the CS 1112 web file word-of-the-day
???
Program secret_revealed.py
- Determines the superpower associated with a user-supplied computing id.
Some program runs
Enter computing id: mst3k
Boo-boo finding
Enter computing id: jpc
Thanato-etos cognition
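The notes do not show how the superpower data is stored, so the following is only a sketch of the lookup step under a stated assumption: each line of the decoded web text has the form id,superpower. The function name find_superpower() and the stand-in text are hypothetical; the two entries reuse the runs shown above.

```python
def find_superpower( text, computing_id ):
    """ Hypothetical helper: looks up computing_id in text, where each
        line is assumed to have the form  id,superpower """
    for line in text.split( "\n" ):
        line = line.strip()
        if ( "," not in line ):              # skip blank or malformed lines
            continue
        key, power = line.split( ",", 1 )    # split at the first comma only
        if ( key.strip() == computing_id ):
            return power.strip()
    return None                              # id was not found

# assumption: stand-in for the decoded contents of the web resource
text = "mst3k,Boo-boo finding\njpc,Thanato-etos cognition\n"

print( find_superpower( text, "mst3k" ) )
print( find_superpower( text, "jpc" ) )
```

Returning None for an unknown id leaves it to the caller to decide what message, if any, to show the user.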
Program paging_page.py
- Displays the contents of a user-indicated web page
Some program runs
Enter url: http://www.cs.virginia.edu/~cs1112/words/most-misspelled
appreciate
beautiful
cancelled
definitely
desert
diarrhea
gray
leprechaun
maintenance
neighbor
pneumonia
vacuum
Enter url: http://www.cs.virginia.edu/~cs1112/words/hangman
abated
abhors
ablush
abrade
...
zenith
zephyr
zipper
zombie
Enter url: http://www.cs.virginia.edu/~cs1112/syllabus
<!DOCTYPE html> <html lang="en"> <head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>CS 1112: Spring 2020</title>
<link rel="shortcut icon" href="/~cs1112/images/favicon.ico" /><link rel="stylesheet" type="text/css" href="/~cs1112/defs/css/201-default.css"></head>
<body class="single-column-page">
<div class="task-bar">
<br/>
<table class="header-bar">
<tr style="line-height: 26px;">
<td style="text-align: center; vertical-align: middle">
<a class="menu" href="/~cs1112/term/201/">
<img style="height: 35px" src="/~cs1112/images/home.svg" alt="home icon">
</a> </td>
<td style="text-align: center; vertical-align: middle;"> <a class="menu" href="/~cs1112/software/" > <img style="height: 35px" src="/~cs1112/images/computer.svg" alt="software icon"> </a> </td>
...
<li>This syllabus is to be considered a reference document that can and will be adjusted through the course of the semester to address changing needs. It is up to the student to monitor this page for any changes. Final authority on any decision in this course rests with the professor, not with this document.</li>
</ul>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
</div>
</body>
</html>
Program get_a_dataset.py
- Gets a user-specified web CSV resource as a Python dataset
Some program runs
Enter name of dataset: best-sellers.csv
['Name', 'Author', 'Language', 'Date', 'Sales']
['Don Quixote', 'de Cervantes', 'Spanish', '1605', '500000000']
['A Tale of Two Cities', 'Dickens', 'English', '1859', '200000000']
['The Lord of the Rings', 'Tolkien', 'English', '1954', '150000000']
['The Little Prince', 'de Saint-Exupery', 'French', '1943', '140000000']
["Harry Potter and the Philosopher's Stone", 'Rowling', 'English', '1997', '120000000']
['The Hobbit', 'Tolkien', 'English', '2017', '100000000']
['And Then There Were None', 'Christie', 'English', '2019', '100000000']
['Dream of the Red Chamber', 'Xueqin', 'Chinese', '1754', '100000000']
["Alice's Adventures in Wonderland", 'Carroll', 'English', '1865', '100000000']
Enter name of dataset: rows_of_stuff.csv
['Asta', 'Hachiko', 'Laika', 'Lassie']
['59.0', 'TruE']
['faLse', '3.14', '271']
['01', '10', '10.0', 'ABC']
['Asta', 'Hachiko', 'Laika', 'Lassie']
[59.0, True]
[False, 3.14, 271]
[1, 10, 10.0, 'ABC']
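The second run appears to show each row first as strings and then with cells converted to numbers and booleans where possible. A minimal sketch of that kind of processing follows. The function names and the conversion rules are assumptions inferred from the run, not the course's actual code, and the sample text is a stand-in for the decoded contents of rows_of_stuff.csv.

```python
def convert( cell ):
    """ Hypothetical helper: returns cell as an int, float, or bool if it
        looks like one; otherwise returns the cell unchanged """
    try:
        return int( cell )                   # '01' becomes 1
    except ValueError:
        pass
    try:
        return float( cell )                 # '10.0' becomes 10.0
    except ValueError:
        pass
    if ( cell.lower() == "true" ):           # case does not matter: 'TruE'
        return True
    if ( cell.lower() == "false" ):
        return False
    return cell                              # leave everything else as a string

def get_dataset( text ):
    """ Hypothetical helper: turns CSV text into a list of rows, where
        each row is a list of converted cells """
    dataset = []
    for line in text.strip().split( "\n" ):
        row = []
        for cell in line.split( "," ):       # simplification: no quoted commas
            row.append( convert( cell.strip() ) )
        dataset.append( row )
    return dataset

# assumption: stand-in for the decoded contents of the web resource
text = "Asta,Hachiko,Laika,Lassie\n59.0,TruE\nfaLse,3.14,271\n01,10,10.0,ABC\n"

for row in get_dataset( text ):
    print( row )
```

Splitting on commas is a simplification; Python's csv module handles cells that themselves contain quoted commas. Note also that float() accepts words such as inf and nan, so a cell holding one of those would be converted to a float by this sketch.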
© 2020 Jim Cohoon