General Q about screen scraping

KMA

Just a general question about people's opinions of web services.

Basically, I'm a bit disappointed that there don't seem to be many usable,
useful web services out there. Of course, there are _oodles_ out there, but
it seems kind of difficult to locate the one you want.

Take for example weather information. If you want to find out the average
temperature in Moscow this month* you could almost certainly turn up a web
page with this information, but a web service? I get the same feeling that I
got when I started using COM. The idea was that COM components would
proliferate, with dozens of third parties writing spell checkers and credit
card validators and so on, and that developers would just snap them together
like pieces of Lego. But it didn't really happen. And now the same thing
seems to be (not) happening with web services. Sure, some companies are
using them internally, and there are of course some good and useful services
(Amazon). Does anyone agree that the proliferation is not as widespread as
one might have expected? After all, web services are not new. They are now
in the phase where they should be springing up rapidly if they're going to
spring up at all.

So, with this in mind I thought I might write one myself. The thing is, the
source data will almost certainly come from accessing a web page and
scraping it into a convenient data store. Do many people do this as a matter
of course? I understand that I'll be at the mercy of the page provider, but
I'd rather this than key in data each day myself. I can always send myself an
alert if the source is unavailable.
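
For what it's worth, the scrape, store and alert loop described above might look roughly like this. It is only a minimal sketch using the Python standard library. The `id="avg-temp"` marker, the table layout and every function name here are made up for illustration; a real page will need its own parsing rule, and `alert` would be replaced by whatever sends you an email.

```python
import re
import sqlite3
import urllib.request
from datetime import date
from html.parser import HTMLParser


class TempParser(HTMLParser):
    """Grabs the first non-blank text that follows a tag with id="avg-temp".

    The id is a hypothetical marker; adapt it to the page being scraped.
    """

    def __init__(self):
        super().__init__()
        self._capturing = False
        self.value = None

    def handle_starttag(self, tag, attrs):
        if self.value is None and dict(attrs).get("id") == "avg-temp":
            self._capturing = True

    def handle_data(self, data):
        if self._capturing and data.strip():
            self.value = data.strip()
            self._capturing = False


def parse_temperature(html):
    """Return the temperature as a float, or None if it cannot be found."""
    parser = TempParser()
    parser.feed(html)
    if parser.value is None:
        return None
    match = re.search(r"-?\d+(?:\.\d+)?", parser.value)
    return float(match.group()) if match else None


def fetch(url, timeout=10):
    """Download the page as text; return None on a network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except OSError:  # URLError and socket timeouts are OSError subclasses
        return None


def scrape_once(conn, url, fetcher=fetch, alert=print):
    """Store today's reading in SQLite, or call alert() if the scrape fails."""
    html = fetcher(url)
    temp = parse_temperature(html) if html is not None else None
    if temp is None:
        alert(f"scrape failed for {url}: source unavailable or layout changed")
        return False
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (day TEXT PRIMARY KEY, temp_c REAL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO readings VALUES (?, ?)",
        (date.today().isoformat(), temp),
    )
    conn.commit()
    return True
```

Run `scrape_once(sqlite3.connect("weather.db"), url)` from a daily scheduled task. Treating "page unreachable" and "page layout changed" the same way means one alert covers both ways the provider can break you.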

cheers.

* by the way, if you're wondering: cold. very cold. so cold, in fact, that I
saw a couple of brass monkeys wandering about Red Square looking for a spot
welder.
 
CESAR DE LA TORRE [MVP]

I'm sure web services are going to proliferate widely, for the
following reasons:
- They are the only standard trusted by all platforms (Java, Microsoft,
etc.). You never saw that with COM/DCOM, Java RMI, CORBA, etc. I see them
as being like the HTTP, HTML and XML standards.
- They are already widely trusted for distributed applications inside
enterprises (Java platform, MS platform, etc.). They are not so widely
adopted across the Internet (communication between enterprises, B2B, etc.)
because one or two years ago we didn't have enough features for complex
inter-enterprise communication. We didn't have implementations of the
standards (like the WS-* specifications) for security (SOAP message
oriented), async protocols, peer-to-peer communication, transport protocol
independence, transactions, messaging, secure conversation, attachments,
etc. We have them now, but in a changing environment (WSE 2.0, WSE 3.0,
etc.). In the next one or two years, all those standards are going to be
very stable (on the MS platform, with WCF-INDIGO). I'm really sure about it.
I'd bet on it... ;-)
--
CESAR DE LA TORRE
Software Architect
[Microsoft MVP - XML Web Services]
[MCSE] [MCT]

Renacimiento
[Microsoft GOLD Certified Partner]
 
Michael Nemtsev

Hello KMA,

The answer is simple: websites are for millions of end users, and web
services are for hundreds of developers.


K> Just a general question about people's opinions of web services.
K>
K> Basically, I'm a bit disappointed that there don't seem to be many
K> usable, useful web services out there. Of course, there are _oodles_
K> out there, but it seems kind of difficult to locate the one you want.
K>

---
WBR,
Michael Nemtsev :: blog: http://spaces.msn.com/members/laflour

"At times one remains faithful to a cause only because its opponents do not
cease to be insipid." (c) Friedrich Nietzsche
 
alex_f_il

Look at SWExplorerAutomation
(http://home.comcast.net/~furmana/SWIEAutomation.htm).

SW Explorer Automation (SWEA) creates an object model (automation
interface) for any Web application running in Internet Explorer. The
automation interface consists of pages (scenes), and each page consists
of controls. The following controls are supported:
HtmlContent, HtmlAnchor, HtmlImage, HtmlInputButton, HtmlInputCheckBox,
HtmlInputRadioButton, HtmlInputText, HtmlSelect, HtmlTextArea. The
object model is defined visually in the SWEA designer, which lets you
record scripts (C# and VB) based on the defined application object
model.

It is very easy to create a scraping solution for any Web site using
SWEA.
 
