Splitting Tree

subhabangalore

Dear Group,

I am using NLTK and I used the following command,

chunk=nltk.ne_chunk(tag)
print "The Chunk of the Line Is:",chunk


The Chunk of the Line Is: (S
''/''
It/PRP
is/VBZ
virtually/RB
a/DT
homecoming/NN
,/,
''/''
said/VBD
(PERSON Gen/NNP Singh/NNP)
on/IN
arrival/NN)

Now I am trying to split the output preferably by ",/,".

But how would I split a Tree object in Python?

If I use a command like,
chunk_word=chunk.split()

it gives me this error,

File "C:/Python27/docstructure1.py", line 38, in document_structure1
chunk1=chunk.split()
AttributeError: 'Tree' object has no attribute 'split'

I would be grateful if any of the learned members of the room could kindly help.

Regards,
Subhabrata.
 
subhabangalore

Sorry to ask this. I converted it to a string and then split it.
 
Cameron Simpson

| On Sunday, December 2, 2012 5:39:32 PM UTC+5:30, (e-mail address removed) wrote:
| > I am using NLTK and I used the following command,
| > chunk=nltk.ne_chunk(tag)
| >
| > print "The Chunk of the Line Is:",chunk
| >
| > The Chunk of the Line Is: (S
| > ''/''
| > It/PRP
[...]
| > Now I am trying to split the output preferably by ",/,".
[...]
|
| Sorry to ask this. I converted it to a string and then split it.

I'm glad you solved your problem, but I would like to point out that
this is generally a risky way of manipulating data.

The problem arises if the string you're splitting on occurs as a literal
piece of text, but _not_ in the sense you intend. It may be that this
will not happen in your particular situation, but in general the
procedure:
- convert the structure to a string somehow
- perform simple text manipulation
- convert back
is at risk of parsing the string too simplistically.

A common example is with CSV data. Suppose you wanted the third
column from a list of tuples:

rows = [ (1,2,"A",4),
         (5,6,"B",8),
         (9,10,"C,D",12),
       ]

and you wanted [ "A", "B", "C,D" ]. If one went with the "convert to
text" approach, and decided that converting each tuple to a CSV style
data row was a good idea, you might write:

column_3 = []
for row in rows:
    csv_string = ",".join( str(item) for item in row )
    item3 = csv_string.split(",")[2]
    column_3.append(item3)

The (simplistic) code above will give you "C" from the third row, not
"C,D", because it naively assumes there are no commas in the data and
then does a simplistic textual split to find the third column.
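
For contrast, here is a minimal sketch that runs both approaches side by
side on the same rows. The names naive and direct are only illustrative;
the point is that indexing the tuple directly never has to guess where
one field ends and the next begins.

```python
rows = [
    (1, 2, "A", 4),
    (5, 6, "B", 8),
    (9, 10, "C,D", 12),
]

# Round trip through text: join each row with commas, then split again.
# The comma inside "C,D" is indistinguishable from a field separator.
naive = [",".join(str(item) for item in row).split(",")[2] for row in rows]

# Work on the structure itself: take the third element of each tuple.
direct = [row[2] for row in rows]

print(naive)   # ['A', 'B', 'C']
print(direct)  # ['A', 'B', 'C,D']
```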

Obviously you wouldn't really do that for something this simple; it is
only to show the issue. But your situation, where manipulating a tree was
tricky and you converted it to a string, is conceptually very similar.
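
For the tree itself, something along these lines might avoid the string
round trip. It is only a sketch, and it rests on two assumptions: that an
NLTK Tree can be iterated like a list of its children, and that the
leaves produced by ne_chunk are (word, tag) tuples, which is what the
"It/PRP" style output suggests. The helper split_on_leaf is a made-up
name, not part of NLTK, and the plain list below merely stands in for
the real chunk so the example runs without NLTK installed.

```python
def split_on_leaf(children, separator):
    """Split a sequence of children into groups at each separator leaf.

    Hypothetical helper, not an NLTK function. The separator itself is
    dropped, and subtrees are kept whole inside their group.
    """
    groups = [[]]
    for child in children:
        if child == separator:
            groups.append([])      # start a new group after the separator
        else:
            groups[-1].append(child)
    return groups


# Stand-in for the ne_chunk result: (word, tag) leaves, with a nested
# list playing the role of the (PERSON ...) subtree.
chunk = [
    ("''", "''"), ("It", "PRP"), ("is", "VBZ"), ("virtually", "RB"),
    ("a", "DT"), ("homecoming", "NN"), (",", ","), ("''", "''"),
    ("said", "VBD"), [("Gen", "NNP"), ("Singh", "NNP")],
    ("on", "IN"), ("arrival", "NN"),
]

parts = split_on_leaf(chunk, (",", ","))
for part in parts:
    print(part)
```

Because the comparison is against the whole (",", ",") leaf rather than
a substring, a token that merely contains a comma would not trigger a
split.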

Hoping this shows you the issue,
 
