Chuck Dawit
I submitted a post a few days ago about scraping the web for Cisco
products. I didn't receive that much input so I thought I would ask
again. Here are the requirements. I have a list of 2000 URLs that all
have Cisco in their domain names
(e.g. http://www.soldbycisco.net
http://www.ciscoindia.net
http://www.ciscobootcamp.net
http://www.cisco-guy.net)
and I want to scrape through them and determine which websites are
selling new Cisco products. I'm actually looking for around 20 or so
products (e.g. WIC-1T, NM-4E, WS-G2950-24). One idea I was given was to
split the pages into ones with forms and those without forms. Those
without forms probably won't have anything for sale, so I can eliminate
those. But I really don't know how to proceed after that. Does
anyone have a different/better approach? Any help would be appreciated.
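
To make the forms idea concrete, here is a rough sketch of what a first pass over one page might look like, using only the Python standard library. It is a guess at a starting point rather than a tested solution: PRODUCTS, PageScanner and scan_page are names made up for this example, the product list is only the three codes mentioned above, and fetching the pages is left out so the function can be tried on saved HTML.

```python
import re
from html.parser import HTMLParser

# Hypothetical subset of the ~20 product codes of interest.
PRODUCTS = ["WIC-1T", "NM-4E", "WS-G2950-24"]


class PageScanner(HTMLParser):
    """Collects the page's text and notes whether it contains a <form>."""

    def __init__(self):
        super().__init__()
        self.has_form = False
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased from HTMLParser.
        if tag == "form":
            self.has_form = True

    def handle_data(self, data):
        self.text_parts.append(data)


def scan_page(html, products=PRODUCTS):
    """Return whether the page has a form and which product codes it mentions."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    text = " ".join(scanner.text_parts)
    found = []
    for code in products:
        # Require that the code is not part of a longer token,
        # so "WIC-1T" does not match inside "WIC-1T-X".
        pattern = r"(?<![\w-])" + re.escape(code) + r"(?![\w-])"
        if re.search(pattern, text, re.IGNORECASE):
            found.append(code)
    return {"has_form": scanner.has_form, "products": found}
```

Pages where has_form is False and products is empty could be dropped first; a page with a form and several matching codes would be a candidate for a closer look. Whether the products are actually new, or for sale at all, is not something this check can tell.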