Sez
Hi,
I'm not a programmer. I have just started working as a text miner, and as
my first task I have been given 1000 dirty files that need to be cleaned
before classification. I have been told Python is the best tool for this
job.
Each file is structured as below:
Comments: This is article 1965 obtained from the website
Title: Banana Report #65, September 2003
Author: dylab
Date: 1st September 2003
Section: pulse
In the past month:
A mass hit North America, cutting electricity to 50 million people
across the North east
I'm expected to run a Python script so that each file ends up looking like
this:
pulse, In, the, past, month, A, mass, hit, North, America, cutting,
electricity, to, 50, million, people, across, the, North east, dylab
Could you please point me in the right direction, or provide some
example code? In the meantime I'll keep searching myself. I know you
guys hate novices like me, but I would appreciate it if you could
provide a little help here.
Thanks & regards,
Sez
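
A minimal sketch of one way the cleaning described above might be done in Python, offered as a starting point rather than a definitive solution. It assumes the five header lines (Comments, Title, Author, Date, Section) always come first, that only Section and Author are kept, and that body words are split on whitespace and punctuation. Under that last assumption "North east" comes out as two tokens ("North, east"), not one as in the sample output. The names clean_article and clean_directory, the "*.txt" pattern, and the UTF-8 encoding are all hypothetical choices, not something given in the task.

```python
import re
from pathlib import Path

# Header fields seen in the sample file; assumed to be the same in every file.
HEADER_RE = re.compile(r"^(Comments|Title|Author|Date|Section):\s*(.*)$")


def clean_article(text):
    """Return 'section, word, word, ..., author' for one article."""
    headers = {}
    body_words = []
    in_header = True  # assumption: header lines come before the body
    for line in text.splitlines():
        match = HEADER_RE.match(line) if in_header else None
        if match:
            headers[match.group(1)] = match.group(2).strip()
            continue
        if line.strip():
            in_header = False  # first non-header line starts the body
        # \w+ keeps letters and digits and drops punctuation such as ':' and ','
        body_words.extend(re.findall(r"\w+", line))
    tokens = [headers.get("Section", "")] + body_words + [headers.get("Author", "")]
    return ", ".join(token for token in tokens if token)


def clean_directory(src_dir, dst_dir):
    """Clean every *.txt file in src_dir and write the result to dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).glob("*.txt")):
        # Assumption: the files are UTF-8 text; change the encoding if they are not.
        cleaned = clean_article(path.read_text(encoding="utf-8"))
        (dst / path.name).write_text(cleaned + "\n", encoding="utf-8")
```

Calling clean_directory("dirty", "clean") with two folder names of your choosing would process all the files in one go and leave the originals untouched, which makes it easy to compare a few outputs by hand before running the classification.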