Alexnb
Okay, I am not sure if there is a better way of doing this than findAll(), but
that is how I am doing it right now. I am making an app that screen-scrapes
dictionary.com for definitions. However, I would like to have the type of
the word for each definition. For example, if def1 and def2 are noun
definitions but def3 isn't:
noun
def1
def2
verb
def3
Something like that. Now, I can get the definitions just fine. The
problem comes when I want to get the type. I can get the types, but I don't
know which definitions they go with. So I can get noun and verb, but for
all I know noun is def1, and verb is def2 and def3. I am wondering if there is a
way to use findAll() but have it stop once it hits a certain thing, or some
other way to do just that. For example, if I have
noun
<table blah>
<table blah>
verb
<table blah>
I want to be able to do something like findAll('span', {'class': 'pg'}), but
have it tell me how many <table> elements come after each span and before the
next one, so I know how many definitions each type has.
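To make the idea concrete, here is a rough sketch of the grouping I have in mind, run on a made-up HTML fragment rather than the real page. It is only an illustration under assumptions: the markup in SAMPLE is invented, the names TypeGrouper and group_definitions are hypothetical, and it uses the standard library's html.parser (Python 3) instead of BeautifulSoup so that it runs on its own. It walks the tags in document order and counts every <table> that appears after a <span class="pg"> and before the next one.

```python
from html.parser import HTMLParser


class TypeGrouper(HTMLParser):
    """Count the <table> tags that follow each <span class="pg"> label."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.groups = []        # list of [type_label, table_count]
        self._in_label = False  # True while inside a <span class="pg">

    def handle_starttag(self, tag, attrs):
        if tag == "span" and dict(attrs).get("class") == "pg":
            # A new word type starts here; open a fresh group for it.
            self._in_label = True
            self.groups.append(["", 0])
        elif tag == "table" and self.groups:
            # Every table seen before the next label belongs to the last type.
            self.groups[-1][1] += 1

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_label = False

    def handle_data(self, data):
        # Only text inside the label span becomes part of the type name.
        if self._in_label:
            self.groups[-1][0] += data.strip()


def group_definitions(html):
    """Return a list of (type_label, number_of_tables) in document order."""
    parser = TypeGrouper()
    parser.feed(html)
    parser.close()
    return [tuple(group) for group in parser.groups]


# Invented fragment that mimics the noun/verb layout described above.
SAMPLE = (
    '<span class="pg">noun</span>'
    '<table><tr><td>def1</td></tr></table>'
    '<table><tr><td>def2</td></tr></table>'
    '<span class="pg">verb</span>'
    '<table><tr><td>def3</td></tr></table>'
)

print(group_definitions(SAMPLE))  # prints [('noun', 2), ('verb', 1)]
```

With the counts in hand, the first two definitions would go with noun and the third with verb. Note that this sketch counts nested tables separately and ignores any table that comes before the first type label, which may or may not match the real page.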
Here is the code I am using (I used "cheese" because that is kind of my test
word for everything in the app):
import urllib
from BeautifulSoup import BeautifulSoup

class defWord:
    def __init__(self, word):
        self.word = word

        # Generator: yields the text of each <span class="pg"> (the word type).
        def get_types(term):
            soup = BeautifulSoup(urllib.urlopen(
                'http://dictionary.reference.com/search?q=%s' % term))
            for tabs in soup.findAll('span', {'class': 'pg'}):
                yield tabs.contents[0].string

        self.mainList = list(get_types(self.word))
        print self.mainList

type = defWord("cheese")
I don't know if this is really something anyone can help me fix or if I have
to work it out on my own, but I would love some help.