Class 22 — Friday October 15
It's a fact, look it up
Really need to know — What lies farther down the road — Need a future map
Look both ways
Agenda
- hacer 뭔가 frais oggi
- Get ready to move on
Examples
- Problem what_did_you_say.py
- Problem spell_rite.py
- Problem wc.py
To do
- Look over artifacts
- Do homework
- Submit program spell_rite.py today
Huh?
- dónde está mi lápiz
- donde esta mi lapiz
- hvor 是 该 tumili maleri av Munch
- est la baguette fraîche aujourd'hui
- est la baguette fraiche aujourd'hui
Babelfish — a translation dictionary in CSV format
a, a
aab, water
aamadam, came
...
ziba, beautiful
zur, to
zu, to
zwei, two
...
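A minimal sketch of how text in this format could be loaded into a Python dictionary, assuming each line holds a word and its English translation separated by a comma. The function name build_dictionary and the short sample string are illustrative stand-ins, not part of the course code.

```python
def build_dictionary(csv_text):
    """Return a dict mapping each word to its English translation."""
    translations = {}
    for line in csv_text.strip().split('\n'):
        # each line looks like: word, translation
        word, english = line.split(',', 1)
        translations[word.strip()] = english.strip()
    return translations

# a few entries copied from the listing above
sample = 'aab, water\nziba, beautiful\nzur, to\nzwei, two'
babelfish = build_dictionary(sample)
print(babelfish['zwei'])
```

Stripping each piece removes the space that follows the comma, so lookups can use the bare word.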
Problem text translation using babelfish dictionary
Goal
- Produce a program that uses the babelfish dictionary to translate text to English.
Some possible interactions
Enter phrase to be translated: dónde está mi lápiz
where is my pencil
Enter phrase to be translated: est la baguette fraîche aujourd'hui
is the baguette fresh today
Enter phrase to be translated: je %#@#&# ból svayam !
I _%#@#&#_ hurt myself !
Enter phrase to be translated: hvor er den skriget maleri af Munch
where is the scream painting by Munch
Enter phrase to be translated: hacer 뭔가 frais oggi
do something cool today
Brainstorming
- How should we represent the translations?
- How should we represent the input?
- How should we represent the result?
Discussion and requirements
- Important data structures that the program needs to accomplish its task.
- List of words making up the user text.
- Dictionary loaded with the entries from the babelfish web file.
- Translation accumulation.
- If a word to be translated is not in the translation dictionary, then its translation is that word surrounded by underscores (_).
- For example, the French word heureuse (happy) is not in the babelfish dictionary, so it would be translated as _heureuse_.
- The translation should be displayed with a single print() statement.
- The translation needs to be stored as a string.
- The translation of the user text needs to be accumulated word by word.
- We need a string accumulator.
- When accumulating the translation, we need a space following each word to separate it from the next word.
- To produce the accumulation, each word in the user-supplied text needs to be processed one by one.
- Every user word contributes to the translation.
- If a word is known to the babelfish dictionary, its translation is added to the accumulation.
- If instead a word is unknown, the word surrounded by underscores is added to the accumulation.
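The requirements above can be sketched as follows. This is one possible arrangement, not necessarily the class solution: the function name translate and the tiny stand-in dictionary are assumptions for illustration, and the real program would load its dictionary from the babelfish web file.

```python
def translate(text, dictionary):
    """Translate text word by word, flagging unknown words with underscores."""
    translation = ''                  # string accumulator
    for word in text.split():         # list of words making up the user text
        if word in dictionary:
            # known word: add its translation, followed by a space
            translation = translation + dictionary[word] + ' '
        else:
            # unknown word: add the word surrounded by underscores
            translation = translation + '_' + word + '_' + ' '
    return translation

# tiny stand-in dictionary; these entries are assumptions for illustration
babelfish = {'est': 'is', 'la': 'the', 'baguette': 'baguette', 'fraîche': 'fresh'}
result = translate('est la baguette fraîche heureuse', babelfish)
print(result)
```

Because a space follows every word, the accumulated string ends with a trailing space. The whole translation is still displayed with a single print() statement.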
Spell correction — program spell_rite.py
Web resources
- most-common — list of most commonly used words
- corrections — list of most common misspellings and their likely correction
Goal
- Produce a spell-checked version of user text that echoes the user text word by word, unless a word is not known to be spelled correctly.
- If a correction is known, the correction is offered within asterisks.
- If instead no correction is known, the word is flagged with underscores.
Some possible program interactions
Enter text: I am as happy as I can be
I am as happy as I can be
Enter text: I am a Newyorker who is livin in virginia
I am a *New Yorker* who is _livin_ in *Virginia*
Enter text:
Brainstorming
- This problem is very similar to text translation.
- If a word is spelled correctly, then it is added to the accumulation.
- If instead a word has an unknown spelling but is known to the corrections dictionary, then its correction surrounded by asterisks is added to the accumulation.
- Otherwise, if a word has an unknown spelling and is unknown to the corrections dictionary, then the word surrounded by underscores is added to the accumulation.
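The three cases above map onto an if / elif / else inside the accumulation loop. The sketch below assumes a small hand-made word list and corrections dictionary in place of the most-common and corrections web resources. The name spell_check is hypothetical, and matching is done on the exact spelling with no case handling.

```python
def spell_check(text, known_words, corrections):
    """Echo text word by word, marking corrections and unknown words."""
    result = ''                                   # string accumulator
    for word in text.split():
        if word in known_words:
            # spelled correctly: echo the word
            result = result + word + ' '
        elif word in corrections:
            # known misspelling: offer the correction within asterisks
            result = result + '*' + corrections[word] + '*' + ' '
        else:
            # unknown word with no known correction: flag with underscores
            result = result + '_' + word + '_' + ' '
    return result

# stand-in data; the real lists come from the web resources
known_words = ['I', 'am', 'a', 'who', 'is', 'in']
corrections = {'Newyorker': 'New Yorker', 'virginia': 'Virginia'}
checked = spell_check('I am a Newyorker who is livin in virginia',
                      known_words, corrections)
print(checked)
```

The only structural difference from the translation loop is the extra elif branch for words that have a known correction.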
Program wc.py — making a point about the importance of functions
What it does
- Determines the number of lines, words, and characters in a user-specified web file
How it does it
- Uses built-in, string, and list manipulating functions
- Imports a module to gain access to its functions
Some program runs
Enter web file link: http://www.cs1112.org
nl = 241
nw = 1261
nc = 14104
Enter web file link: http://www.nytimes.com
nl = 583
nw = 12339
nc = 1166144
Code
# import module for internet access
import url
# get the contents of a user-specified web file
reply = input( 'Enter web file link: ' )
link = reply.strip()
contents = url.get_contents( link )
# count the number of lines, words, and characters in the contents
nbr_lines = contents.count( '\n' )
words = contents.split()
nbr_words = len( words )
nbr_chars = len( contents )
# display result
print( 'nl =', nbr_lines )
print( 'nw =', nbr_words )
print( 'nc =', nbr_chars )
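The counting step relies only on built-in functions and string methods, so it can be tried without any web access. The sketch below packages it as a function and runs it on a literal string standing in for what url.get_contents() would return. The name count_text and the sample string are assumptions for illustration.

```python
def count_text(contents):
    """Return the number of lines, words, and characters in a string."""
    nbr_lines = contents.count('\n')     # one newline ends each line
    nbr_words = len(contents.split())    # split() breaks on whitespace
    nbr_chars = len(contents)            # every character, newlines included
    return nbr_lines, nbr_words, nbr_chars

# literal text standing in for the contents of a web file
sample = 'It is a fact\nlook it up\n'
nl, nw, nc = count_text(sample)
print('nl =', nl)
print('nw =', nw)
print('nc =', nc)
```

Note that the line count is really a count of newline characters, so a final line with no trailing newline would not be counted.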
Here we go
- Question: Why bother with functions?
- Makes problem solving possible. Without them you would need to figure out
- How to communicate with your keyboard to read what is being typed
- How to communicate to your display so that information can be displayed
- How to specify the range of integers a for statement uses to loop.
- How to ...
- Question: What built-in functions are missing — how should Python be updated?
- The url.py module is an attempt by me to let you skip the tedious web processing setup so you can do something interesting.
- One of your classmates asked whether there was some function ints() that would handle a string containing a bunch of integers.
- We will likely get to that in another three classes or so.
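As a rough preview of what such a function might look like: ints() is not a Python built-in, and the version below is a hypothetical sketch that may differ from whatever the course later provides. It assumes the integers in the string are separated by whitespace.

```python
def ints(text):
    """Return a list of the integers in a whitespace-separated string."""
    numbers = []                     # list accumulator
    for piece in text.split():
        numbers.append(int(piece))   # int() converts '14' to 14
    return numbers

values = ints('3 14 -15 92')
print(values)
```

If any piece is not a valid integer, int() raises a ValueError, so this sketch only suits well-formed input.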
Babelfish — Hitchhiker's Guide to the Galaxy
Ziggurat of Ur