Scrapy/XPath help

Always Learning

Hello all. I'm new to Python but have been playing around with it for a few weeks now, following tutorials and so on. I've spun off on my own and am trying to do some basic web scraping. I used Firebug/View XPath in Firefox for help with the XPaths, but I still receive errors when I try to run this script. Any help would be greatly appreciated!

from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from cbb_info.items import CbbInfoItem, Field

class GameInfoSpider(BaseSpider):
    name = "game_info"
    allowed_domains = ["www.sbrforum.com"]
    start_urls = [
        'http://www.sbrforum.com/betting-odds/ncaa-basketball/',
    ]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        toplevels = hxs.select("//div[@class='eventLine-value']")
        items = []
        for toplevels in toplevels:
            item = CbbInfoItem()
            item ["teams"] = toplevels.select("/span[@class='team-name'/text()").extract()
            item ["lines"] = toplevels.select("/div[@rel='19']").extract()
            item.append(item)
        return items
 
Grant Rettke

You might have better luck if you share the python make, version, os,
error message, and some unit tests demonstrating what you expect.

Always Learning

Sorry about that. I'm using Python 2.7.3, 32-bit, on Windows 7.

The errors I get are:
  File "C:\python27\lib\site-packages\scrapy-0.16.3-py2.7.egg\scrapy\selector\lxmlsel.py", line 47, in select
    raise ValueError("Invalid XPath: %s" % xpath)
exceptions.ValueError: Invalid XPath: /span[@class='team-name'/text()

Ultimately, I expect it to gather the team name as text, and then the odds from one of the columns as text as well, so I can then put them into a .csv file.
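
For the .csv step described above, a minimal sketch with the standard library's csv module might look like the following. It is written for Python 3 rather than the Python 2.7.3 mentioned in this post, and the function name rows_to_csv, the column headers, and the sample rows are all made up for illustration; they do not come from the real page.

```python
import csv
import io

def rows_to_csv(rows, out):
    """Write (team, line) pairs to the file-like object `out` as CSV."""
    writer = csv.writer(out)
    writer.writerow(["team", "line"])  # header row; column names are assumptions
    for team, line in rows:
        writer.writerow([team, line])

# Placeholder rows, invented for this sketch.
sample_rows = [("Team A", "-4.5"), ("Team B", "+2")]

# An in-memory buffer stands in for a real file here.
buf = io.StringIO()
rows_to_csv(sample_rows, buf)
csv_text = buf.getvalue()
```

To write to disk under Python 3, the same function can be given a file opened with open("odds.csv", "w", newline=""); the file name is again only an example.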
 
Dave Angel

Why are you displaying only the last 3 lines of the error message?
Unless your source code is lxmlsel.py, there are other stack levels
above this one.

(I can't help, but I'm trying to save some time for someone who can)
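
To illustrate the point about stack levels, here is a small sketch of capturing a complete traceback as text so it can be pasted in full. The functions risky and run are hypothetical stand-ins for the Scrapy internals and the spider's own parse method; only the error message is taken from this thread.

```python
import traceback

def risky():
    # Stand-in for the library call that rejects the malformed XPath.
    raise ValueError("Invalid XPath: /span[@class='team-name'/text()")

def run():
    # Stand-in for the caller's own code, one stack level above the failure.
    risky()

try:
    run()
except ValueError:
    # format_exc() returns every stack level, not just the innermost frames.
    full_trace = traceback.format_exc()
```

Printing full_trace shows the frame for run as well as the one for risky, which is the kind of context the last three lines alone leave out.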
 
donarb

You're missing a right bracket in the xpath expression:

/span[@class='team-name']/text()
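
Beyond the bracket, a few other things in the posted spider look likely to cause trouble. The loop rebinds the name toplevels to each element. item.append(item) never adds anything to the items list that gets returned. A leading "/" makes an XPath absolute, so it is evaluated from the document root rather than from the current div. The sketch below shows the same loop shape with those points adjusted. It uses the standard library's xml.etree.ElementTree instead of Scrapy so that it runs on its own, and the markup in SAMPLE is invented; the real page structure is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, invented for illustration; the real page may differ.
SAMPLE = """
<html><body>
  <div class="eventLine-value">
    <span class="team-name">Team A</span>
    <div rel="19">-4.5</div>
  </div>
  <div class="eventLine-value">
    <span class="team-name">Team B</span>
    <div rel="19">+2</div>
  </div>
</body></html>
"""

def extract_rows(markup):
    root = ET.fromstring(markup.strip())
    items = []
    # A distinct loop variable, so the list name is not rebound.
    for toplevel in root.findall(".//div[@class='eventLine-value']"):
        item = {
            # Closing bracket restored; the leading "." keeps the path
            # relative to the current div instead of the document root.
            "teams": [s.text for s in toplevel.findall("./span[@class='team-name']")],
            "lines": [d.text for d in toplevel.findall("./div[@rel='19']")],
        }
        items.append(item)  # append to the list, not to the item itself
    return items

rows = extract_rows(SAMPLE)
```

In the spider itself, the equivalent changes would presumably be toplevel.select("./span[@class='team-name']/text()").extract() and items.append(item). If the span is not a direct child of the div on the real page, ".//span[...]" would be the relative form to try.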
 