How to get the summarized text from a given URL?

Rama Vadakattu

Is there a Python library that solves the problem below?

For this URL:
--------------------------
http://tinyurl.com/dzcwbg

the summarized text is:
---------------------------
By Roy Mark With sales plummeting and its smart phones failing to woo
new customers, Sony Ericsson follows its warning that first quarter
sales will be disappointing with the announcement that Najmi Jarwala,
president of Sony Ericsson USA and head of ...

~~~~~~~~~~~~~~
Usually the summarized text is a two- or three-line description of the
page, obtained by fetching the HTML, examining its content, and working
out a short description from the markup.
~~~~~~~~~~~~~

Are there any Python libraries that return summarized text for a given
URL?

It is OK if the library just returns the first two lines of text
from the given URL instead of a real summary.
 
Peter Otten

Rama said:
Are there any Python libraries that return summarized text for a given
URL?

BeautifulSoup makes it easy to access parts of a web page.

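# Python 2 with BeautifulSoup 3
import urllib2
from BeautifulSoup import BeautifulSoup

# Fetch the page and parse it
data = urllib2.urlopen("http://tinyurl.com/dzcwbg").read()
bs = BeautifulSoup(data)
# Print the content of <meta name="description" ...>
print bs.find("meta", dict(name="description"))["content"]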

Rama said:
It is OK if the library just returns the first two lines of text
from the given URL instead of a real summary.

The problem is how you identify the summary. Different web sites will put it
in different places using different markup.

Peter
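
Since the summary can live in different places depending on the site, one possible approach is to try several locations in order and fall back to the first lines of body text. The following is only a rough sketch, not part of the original reply: it uses Python 3's standard library (html.parser) rather than BeautifulSoup, and the names SummaryParser, summarize_html, summarize_url, and the max_chars parameter are hypothetical. It assumes a page may carry a "description" or "og:description" meta tag and otherwise takes the first non-empty <p>.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class SummaryParser(HTMLParser):
    """Collect candidate summaries: meta descriptions and the first <p> text."""

    def __init__(self):
        super().__init__()
        self.meta = {}             # e.g. {"description": "..."}
        self.first_paragraph = ""  # raw text of the first non-empty <p>
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            # Sites use either name="..." or property="..." (Open Graph)
            key = (attrs.get("name") or attrs.get("property") or "").lower()
            if key and attrs.get("content"):
                self.meta.setdefault(key, attrs["content"].strip())
        elif tag == "p":
            # Only keep collecting while no paragraph text has been found yet
            self._in_p = not self.first_paragraph.strip()

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self.first_paragraph += data


def summarize_html(html, max_chars=200):
    """Return a meta description if present, else the start of the first <p>."""
    parser = SummaryParser()
    parser.feed(html)
    for key in ("description", "og:description"):
        if parser.meta.get(key):
            return parser.meta[key]
    # Fallback (assumption): the first paragraph is a usable teaser
    text = " ".join(parser.first_paragraph.split())
    return text[:max_chars]


def summarize_url(url):
    """Fetch a page and summarize it; needs network access."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return summarize_html(response.read().decode(charset, errors="replace"))
```

Calling summarize_url("http://tinyurl.com/dzcwbg") would apply the same lookup to the page from the question, provided the link still resolves. The order of the fallbacks is a guess; a given site may need its own rule.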