Sebastian Bassi
I have this code:
    import xml.parsers.expat

    def start_element(name, attrs):
        print 'Start element:', name, attrs

    def end_element(name):
        print 'End element:', name

    def char_data(data):
        print 'Character data:', repr(data)

    p = xml.parsers.expat.ParserCreate()
    p.StartElementHandler = start_element
    p.EndElementHandler = end_element
    p.CharacterDataHandler = char_data
    fh = open("/home/sbassi/bioinfo/smallUniprot.xml", "r")
    p.ParseFile(fh)
And I get this on the output:
....
Start element: sequence {u'checksum': u'E0C0CC2E1F189B8A', u'length': u'393'}
Character data: u'\n'
Character data: u'MPKKKPTPIQLNPAPDGSAVNGTSSAETNLEALQKKLEELELDEQQRKRL'
Character data: u'\n'
Character data: u'EAFLTQKQKVGELKDDDFEKISELGAGNGGVVFKVSHKPSGLVMARKLIH'
....
End element: sequence
....
Is there a way to get the character data together in one string? I
guess it should not be difficult, but I can't work out how. Each time
the parser reads a line, it returns that line on its own, and I want to
have the whole sequence in one variable.
(the file is here: http://sbassi.googlepages.com/smallUniprot.xml)
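One possible approach, as a minimal sketch rather than a definitive answer: collect the pieces passed to the character data handler in a list while the parser is inside a sequence element, and join them when the end tag arrives. The names SequenceCollector and parse_sequences are hypothetical helpers invented for this illustration, and the sample XML is a made-up fragment, not the real smallUniprot.xml. The sketch is written so it should also run on Python 3.

```python
import xml.parsers.expat


class SequenceCollector(object):
    """Hypothetical helper: gathers the text of each <sequence> element."""

    def __init__(self, tag="sequence"):
        self.tag = tag
        self.inside = False   # True while the parser is inside <sequence>
        self.chunks = []      # pieces of character data for the current element
        self.sequences = []   # one joined string per finished element

    def start_element(self, name, attrs):
        if name == self.tag:
            self.inside = True
            self.chunks = []

    def char_data(self, data):
        # expat may call this several times per element, so only accumulate.
        if self.inside:
            self.chunks.append(data)

    def end_element(self, name):
        if name == self.tag:
            # Join all pieces and drop the newlines/whitespace between lines.
            self.sequences.append("".join("".join(self.chunks).split()))
            self.inside = False


def parse_sequences(xml_data):
    """Parse XML data and return a list with one string per <sequence>."""
    collector = SequenceCollector()
    p = xml.parsers.expat.ParserCreate()
    p.StartElementHandler = collector.start_element
    p.EndElementHandler = collector.end_element
    p.CharacterDataHandler = collector.char_data
    p.Parse(xml_data, True)  # True marks this as the final chunk of input
    return collector.sequences


# Made-up sample input, shaped like the output shown in the question.
sample = b"<entry><sequence length='8'>\nMPKK\nKPTP\n</sequence></entry>"
print(parse_sequences(sample))
```

To read from a file instead, p.ParseFile(fh) can replace the p.Parse call. Setting p.buffer_text = True on the parser is another option worth trying: it asks expat to buffer adjacent character data so the handler is called less often, though the newlines would still be part of the delivered text.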