Anish Chapagain
Hi,
I am trying to extract the contents of a Wikipedia Infobox, which is in the
format shown below, from a page opened by URL in Python.
{{ Infobox Software
| name = Bash
| logo = [[Image:bash-org.png|165px]]
| screenshot = [[Image:Bash demo.png|250px]]
| caption = Screenshot of bash and [[Bourne shell|sh]]
sessions demonstrating some features
| developer = [[Chet Ramey]]
| latest release version = 4.0
| latest release date = {{release date|mf=yes|2009|02|20}}
| programming language = [[C (programming language)|C]]
| operating system = [[Cross-platform]]
| platform = [[GNU]]
| language = English, multilingual ([[gettext]])
| status = Active
| genre = [[Unix shell]]
| source model = [[Free software]]
| license = [[GNU General Public License]]
| website = [http://tiswww.case.edu/php/chet/bash/bashtop.html Home page]
}} // up to this line
I need to extract all the data between {{ Infobox and the closing }}.
Thanks if anyone can help.
I am trying with:
s1 = '{{ Infobox'
s2 = len(s1)
pos1 = data.find(s1)
pos2 = data.find("\n", pos1)  # search from pos1; pos2 was used before it was defined
pat1 = data.find("}}")
but I end up getting only the first line.
Thank you,
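
One possible approach, as a minimal sketch only: assuming `data` holds the page's wikitext as a string, searching for the first "}}" stops too early, because the infobox contains nested templates such as {{release date|...}}. The sketch below counts brace depth instead and returns everything from the marker through its matching "}}". The function name `extract_infobox` and its `marker` parameter are hypothetical, and triple-brace parameters such as {{{1}}} are not handled.

```python
def extract_infobox(data, marker="{{ Infobox"):
    """Return the text from `marker` through its matching '}}', or None."""
    start = data.find(marker)
    if start == -1:
        return None  # marker not present in the text

    depth = 0
    i = start
    while i < len(data) - 1:
        pair = data[i:i + 2]
        if pair == "{{":
            depth += 1          # entering a (possibly nested) template
            i += 2
        elif pair == "}}":
            depth -= 1          # leaving a template
            i += 2
            if depth == 0:
                return data[start:i]  # matching close of the infobox
        else:
            i += 1
    return None  # braces never balanced


sample = (
    "intro\n"
    "{{ Infobox Software\n"
    "| name = Bash\n"
    "| latest release date = {{release date|mf=yes|2009|02|20}}\n"
    "| status = Active\n"
    "}}\n"
    "rest of page }}"
)
result = extract_infobox(sample)
print(result)
```

With this, `extract_infobox(data)` would give the whole block, which could then be split on "\n|" to get the individual fields.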