Web pages


Our introduction to interacting with the web is intentionally simple. Industrial-strength web applications require familiarity with other, more powerful URL modules. If you want to go further, the external library requests is worth a look.

The explanation that follows uses the web resource

'http://www.cs.virginia.edu/~cs1112/datasets/words/most-misspelled'

It is a partial list of the most misspelled words in Google searches.


Module urllib.request

Overview

How to get access

Essential function (for us)

stream = urllib.request.urlopen( link )

The assignment establishes stream to be a connection from your program to the URL resource specified by link.
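A connection attempt can fail, for example when the link is malformed or the network is unavailable. The sketch below is one possible way to guard the call. The function name open_stream and the 10-second timeout are our own illustrative choices, not part of urllib.request.

```python
import urllib.request


def open_stream(link, timeout=10):
    """Return a connection to link, or None if it cannot be opened."""
    try:
        return urllib.request.urlopen(link, timeout=timeout)
    except (OSError, ValueError) as problem:
        # URLError (a subclass of OSError) reports network and HTTP failures;
        # ValueError reports a link with no recognizable URL type
        print('unable to open', link, ':', problem)
        return None
```

With this helper, a program can test whether the result is None before it tries to read from the stream.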


Class http.client.HTTPResponse

Overview

How to get access

Essential function (for us)

stream = urllib.request.urlopen( link )

page = stream.read()

Sets page to be the encoded contents of the URL resource named by link. The value is a bytes object, not yet a string.

text = page.decode( 'UTF-8' )

Sets string text to be the decoded contents of the URL resource named by link.
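The difference between the encoded and decoded forms is easiest to see without the web. The sketch below uses a made-up string as a stand-in for what stream.read() would hand back; the sample text is our own assumption and does not come from the page above.

```python
# stand-in for the bytes that stream.read() would return
page = 'naïve café\n'.encode('UTF-8')

# decoding turns the bytes into an ordinary string
text = page.decode('UTF-8')

# each accented letter occupies two bytes in UTF-8 but is one character
print(type(page).__name__, len(page))   # bytes 13
print(type(text).__name__, len(text))   # str 11
```

String operations such as splitting into words are normally applied to text, the decoded form.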

# get access to needed web support
import urllib.request

# where is our page of interest
link = 'http://www.cs.virginia.edu/~cs1112/datasets/words/most-misspelled'

# establish a connection from our program to the web resource
stream = urllib.request.urlopen( link )

# get web source contents
page = stream.read()

# decode to standard text
text = page.decode( 'UTF-8' )

# print the result
print( text )

produces

cancelled

desert

 

gray

pneumonia

vacuum

 

appreciate

beautiful

definitely

diarrhea

leprechaun

maintenance

neighbor
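The steps above can be gathered into reusable functions. What follows is a minimal sketch, not the only way to do it. The names get_text and get_words are hypothetical helpers of our own, and the with statement is a choice we add so that the connection is closed once the contents are read.

```python
import urllib.request


def get_text(link, encoding='UTF-8'):
    """Return the decoded contents of the web resource named by link."""
    # the with statement closes the connection after the read
    with urllib.request.urlopen(link) as stream:
        page = stream.read()
    return page.decode(encoding)


def get_words(link):
    """Return the whitespace-separated words of the resource as a list."""
    # split() with no argument also skips over blank lines
    return get_text(link).split()
```

Assuming the resource is reachable, get_words( link ) with the link above would give a list of the misspelled words, one string per word.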