Collecting Rich Data Structures for students

kirby.urner

Greetings Pythoneers --

Some of us over on edu-sig, one of the community actives,
have been brainstorming around this Rich Data Structures
idea, by which we mean Python data structures already
populated with non-trivial data about various topics such
as: periodic table (proton, neutron counts); Monty Python
skit titles; some set of cities (lat, long coordinates); types
of sushi.

Obviously some of these require levels of nesting, say
lists within dictionaries, with more depth if required.

Our motivation in collecting these repositories is to give
students of Python more immediate access to meaningful
data, not just meaningful programs. Sometimes all it takes
to win converts, to computers in general, is to demonstrate
their capacity to handle gobs of data adroitly. Too often,
a textbook will only provide trivial examples, which in the
print medium is all that makes sense.

Some have offered XML repositories, which I can well
understand, but in this case we're looking specifically for
legal Python modules (py files), although they don't have
to be Latin-1 (e.g. the sushi types file might not have a
lot of romaji).

If you have any examples you'd like to email me about,
(e-mail address removed) is a good address.

Here's my little contribution to the mix:
http://www.4dsolutions.net/ocn/python/gis.py

Kirby Urner
4D Solutions
Silicon Forest
Oregon
 
Paddy

I would think there is more data out there formatted as Lisp
S-expressions than as Python data structures.
Wouldn't it be better to concentrate on 'wrapping' XML and CSV
data sources?

- Paddy.
 
bearophileHUGS

It may be better to keep the data in a simpler form:

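data = """\
42 40 73 45 Albany, N.Y.
35 5 106 39 Albuquerque, N.M.
35 11 101 50 Amarillo, Tex.
34 14 77 57 Wilmington, N.C.
49 54 97 7 Winnipeg, Man., Can."""

# Map each city name to its latitude and longitude,
# each stored as (degrees, minutes, hemisphere).
cities = {}
for line in data.splitlines():
    # The first four fields are numbers; the rest of the line is the name.
    a1, a2, a3, a4, n = line.split(" ", 4)
    cities[n] = [(int(a1), int(a2), "N"), (int(a3), int(a4), "W")]
print(cities)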
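
For lessons that plot or compare positions, the degree/minute pairs can be turned into signed decimal degrees. A minimal sketch, assuming the same line layout as above; the helper name to_decimal and the sign convention (north and east positive) are assumptions made for illustration, not part of the original data:

```python
# Sample rows in the same layout as above:
# latitude degrees, latitude minutes, longitude degrees, longitude minutes, name.
data = """\
42 40 73 45 Albany, N.Y.
35 5 106 39 Albuquerque, N.M.
49 54 97 7 Winnipeg, Man., Can."""


def to_decimal(degrees, minutes, hemisphere):
    """Convert degrees and minutes to signed decimal degrees (N/E positive)."""
    value = degrees + minutes / 60.0
    return -value if hemisphere in ("S", "W") else value


decimal_cities = {}
for line in data.splitlines():
    a1, a2, a3, a4, name = line.split(" ", 4)
    # All sample cities are north of the equator and west of Greenwich.
    decimal_cities[name] = (
        to_decimal(int(a1), int(a2), "N"),
        to_decimal(int(a3), int(a4), "W"),
    )

for name, (lat, lon) in sorted(decimal_cities.items()):
    print("%-22s %8.3f %9.3f" % (name, lat, lon))
```

The plain-text block stays untouched, so it can still be copied out for use elsewhere.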

Bye,
bearophile
 
Fredrik Lundh

Some have offered XML repositories, which I can well
understand, but in this case we're looking specifically for
legal Python modules (py files), although they don't have
to be Latin-1 (e.g. the sushi types file might not have a
lot of romanji).

you can of course convert any XML file to legal Python code simply by
prepending

from xml.etree.ElementTree import XML
data = XML("""

and appending

""")

and then using the ET API to navigate the data, but I guess that's not
what you had in mind.
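
To make the wrapping concrete, here is a rough sketch of what such a module could look like once the XML sits between the two pieces. The cities document, its element names, and its attribute names are hypothetical sample data invented for illustration; only XML(), findall(), and get() come from ElementTree itself:

```python
from xml.etree.ElementTree import XML

# Hypothetical sample data; a real module would wrap an existing XML file.
data = XML("""
<cities>
  <city name="Albany, N.Y." lat="42.67" lon="-73.75"/>
  <city name="Winnipeg, Man., Can." lat="49.90" lon="-97.12"/>
</cities>
""")

# Navigate with the ElementTree API: findall() returns the matching child
# elements and get() reads an attribute as a string.
coords = {}
for city in data.findall("city"):
    coords[city.get("name")] = (float(city.get("lat")), float(city.get("lon")))

print(coords["Albany, N.Y."])
```

The XML text itself is unchanged, so it can be cut back out of the module for use in another language.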

</F>
 
Paddy

The more I think on it the more I am against this: data should be
stored in programming-language-agnostic forms that are easily
made available to a large range of programming languages.
If the format is easily parsed by AWK then it is usually easy to parse
in a range of programming languages.
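
As an illustration of that point, a tab-separated file that AWK could split on its field separator is also only a few lines of work in Python. A minimal sketch under stated assumptions: the column layout (symbol, name, protons, neutrons of the most common isotope) and the three sample rows are hypothetical, and the code targets Python 3, where io.StringIO accepts ordinary strings:

```python
import csv
import io

# Hypothetical tab-separated sample: symbol, name, protons, neutrons.
# In practice this text would be read from a shared data file.
raw = "H\tHydrogen\t1\t0\nHe\tHelium\t2\t2\nLi\tLithium\t3\t4\n"

elements = {}
for symbol, name, protons, neutrons in csv.reader(io.StringIO(raw), delimiter="\t"):
    elements[symbol] = {
        "name": name,
        "protons": int(protons),
        "neutrons": int(neutrons),
    }

print(elements["He"])
```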

- Paddy.
 
kirby.urner

It's OK to be against it, but as many have pointed out, it's often
just one value-adding step to go from plaintext or XML to something
specifically Python.

Sometimes we spare the students (whoever they may be) this added
step and just hand them a dictionary of lists or whatever. We
may not be teaching parsing in this class, but chemistry, and
having the info in the Periodic Table in a Python data structure
may simply be the most relevant place to start.
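
As a rough idea of what such a handout could look like, here is a sketch of a tiny dictionary of lists and one chemistry-flavored use of it. The variable name elements, the helper mass_number, and the choice of four elements are hypothetical, chosen only for illustration; neutron counts are for the most common isotope of each element:

```python
# Hypothetical contents of a small data module handed to students.
# Each symbol maps to [name, protons, neutrons in the most common isotope].
elements = {
    "H": ["Hydrogen", 1, 0],
    "He": ["Helium", 2, 2],
    "C": ["Carbon", 6, 6],
    "O": ["Oxygen", 8, 8],
}


def mass_number(symbol):
    """Return protons plus neutrons for the listed isotope."""
    name, protons, neutrons = elements[symbol]
    return protons + neutrons


# Total mass number of a water molecule built from these isotopes.
water = 2 * mass_number("H") + mass_number("O")
print(water)
```

No parsing is involved: the student imports the module and starts asking questions of the data.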

Many lesson plans I've seen or am working on will use these .py
data modules.

Kirby
 
Scott David Daniels

Some of us over on edu-sig, one of the community actives,
have been brainstorming around this Rich Data Structures
idea, by which we mean Python data structures already
populated with non-trivial data about various topics such
as: periodic table (proton, neutron counts); Monty Python
skit titles; some set of cities (lat, long coordinates); types
of sushi.

Look into the "Stanford GraphBase" at:
http://www-cs-faculty.stanford.edu/~knuth/sgb.html
A great source of some data with some interesting related
exercises.

Also, a few screen-scraping programs that suck _current_
information from some sources should also delight; the students
have a shot at getting ahead of the teacher.

--Scott David Daniels
(e-mail address removed)
 
Paddy

It's OK to be against it, but as many have pointed out, it's often
just one value-adding step to go from plaintext or XML to something
specifically Python.

Sometimes we spare the students (whoever they may be) this added
step and just hand them a dictionary of lists or whatever. We
may not be teaching parsing in this class, but chemistry, and
having the info in the Periodic Table in a Python data structure
may simply be the most relevant place to start.

Many lesson plans I've seen or am working on will use these .py
data modules.

Kirby

Then I'd favour the simple wrappings in bearophile's and Fredrik
Lundh's replies, where it is easy to extract the original data,
maybe for updating or for use in another language.

- Paddy.
 
Dennis Lee Bieber

Sometimes we spare the students (whoever they may be) this added
step and just hand them a dictionary of lists or whatever. We
may not be teaching parsing in this class, but chemistry, and
having the info in the Periodic Table in a Python data structure
may simply be the most relevant place to start.

In this particular example, I'd probably suggest stuffing the data
into an SQLite3 database file... Searching on name, symbol, weight, etc.
would be much easier than trying to dig through a nested dictionary.
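
A minimal sketch of that approach using the standard sqlite3 module. The table name element, its columns, and the four sample rows are assumptions made for illustration, and an in-memory database stands in for the database file that would actually be passed around:

```python
import sqlite3

# In-memory database for illustration; a file name could be passed instead
# to produce a database file that can be shared.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE element ("
    "symbol TEXT PRIMARY KEY, name TEXT, protons INTEGER, weight REAL)"
)
rows = [
    ("H", "Hydrogen", 1, 1.008),
    ("He", "Helium", 2, 4.0026),
    ("C", "Carbon", 6, 12.011),
    ("O", "Oxygen", 8, 15.999),
]
conn.executemany("INSERT INTO element VALUES (?, ?, ?, ?)", rows)

# Search by weight, which would need a hand-written loop over a nested dictionary.
heavy = conn.execute(
    "SELECT symbol FROM element WHERE weight > ? ORDER BY protons", (4.5,)
).fetchall()
print(heavy)

# Search by name instead of by symbol.
by_name = conn.execute(
    "SELECT symbol, weight FROM element WHERE name = ?", ("Helium",)
).fetchone()
print(by_name)

conn.close()
```

The same table answers lookups by name, symbol, or weight without any change to how the data is stored.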

--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
kirby.urner

That's not a bad idea. We might see people passing ZODBs around
more too, as 'import zodb' in IDLE or whatever is increasingly
the style, vs. some megabundle you have to install. Think of
Zope as another site-package.

The advantage of just passing .py files around, among XO users
for example, is the periodicTable.py's contents are directly
eyeballable as ascii/unicode text, vs. stuffed into a wrapper.

I think what I'm getting from this fruitful discussion is the
different role of amalgamator-distributors, and Sayid or Kate
as classroom teachers, just trying to get on with the lesson
and having no time for computer science topics.

XML or YAML also make plenty of sense, for the more generic
distributor type operations.

Speaking only for myself, I appreciated some of the pointers
to APIs. Over on edu-sig, we've been talking a lot about
the 3rd party module for accessing imdb information -- not
a screen scraper.

Given xml-rpc, there's really no limit on the number of
lightweight APIs we might see. How about CIA World Factbook?
Too boring maybe, but it's already going out on the XOs, or
some of them, just because it's relatively up to date.
Could be imported as Python module too -- maybe that work
has already been done?

Kirby
 