help me debug my "word capitalizer" script

Santosh Kumar

Here is the script I am using:

from os import linesep
from string import punctuation
from sys import argv

script, givenfile = argv

with open(givenfile) as file:
    # List to store the capitalised lines.
    lines = []
    for line in file:
        # Split words by spaces.
        words = line.split(' ')
        for i, word in enumerate(words):
            if len(word.strip(punctuation)) > 3:
                # Capitalise and replace words longer than 3 (without
                # punctuation)
                words[i] = word.capitalize()
        # Join the capitalised words with spaces.
        lines.append(' '.join(words))
    # Join the capitalised lines by the line separator
    capitalised = linesep.join(lines)
# Optionally, write the capitalised words back to the file.

print(capitalised)


Purpose of the script:
To capitalize the first letter of every word in a given file, leaving
words of 3 or fewer letters unchanged.

Bugs:
I know it has many bugs and/or could be improved by cutting down the
code, but my current focus is to fix this bug:
1. When I pass it any file, it does its stuff but inserts a blank
line every time it processes a new line. (Please note that I don't
want the output in another file, I want it on screen.)
 
Hans Mulder


The lines you read from your input file end in a line separator.
When you join them with linesep.join(), another line separator is added
after each one (and 'print' adds one more at the very end).
This results in two line separators in a row, in other words, a blank
line.
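
A small sketch of the effect, assuming Python 3. io.StringIO stands in for the input file, the sample text is made up, and '\n' is used instead of linesep so the result is the same on every platform:

```python
import io

# Stand-in for the input file; the sample text is invented for illustration.
sample = io.StringIO("first line\nsecond line\n")

# Iterating over a file yields each line with its trailing '\n' still attached.
lines = [line for line in sample]
print(repr(lines))

# Joining with another separator therefore doubles the newline between lines.
joined = "\n".join(lines)
print(repr(joined))
```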

The best way to solve this is usually to remove the line separator
right after you've read in the line. You could do that by inserting
after line 10:

line = line.rstrip()

That will remove all whitespace characters (spaces, tabs, carriage
returns, newlines) from the end of the line.
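
Applied to the script above, this might look roughly like the sketch below (assuming Python 3). capitalise_text is a hypothetical helper name, io.StringIO stands in for the file, and the lines are joined with '\n' rather than linesep:

```python
import io
from string import punctuation


def capitalise_text(file):
    """Capitalise words longer than 3 characters, line by line (hypothetical helper)."""
    lines = []
    for line in file:
        # Remove trailing whitespace, including the newline, right after reading.
        line = line.rstrip()
        words = line.split(' ')
        for i, word in enumerate(words):
            if len(word.strip(punctuation)) > 3:
                words[i] = word.capitalize()
        lines.append(' '.join(words))
    return '\n'.join(lines)


# Invented sample input; note the trailing space on the second line.
sample = io.StringIO("the quick brown fox\njumps over it \n")
result = capitalise_text(sample)
print(result)
```

Note that the trailing space on the second sample line is removed as well, since rstrip() with no argument strips all trailing whitespace.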

Alternatively, if you want to remove only the line separator,
you could do:

if line.endswith(linesep):
    line = line[:-len(linesep)]

The 'if' command is only necessary for the last line, which may or
may not end in a linesep. All earlier lines are guaranteed to end
with a linesep.
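
The same idea wrapped in a function, as a sketch only. strip_one_newline is a hypothetical name, and the default separator of '\n' assumes the file was opened in text mode under Python 3:

```python
def strip_one_newline(line, sep="\n"):
    """Remove a single trailing separator, if present (hypothetical helper)."""
    if line.endswith(sep):
        return line[:-len(sep)]
    return line


with_newline = strip_one_newline("abc  \n")   # separator removed, spaces kept
without_newline = strip_one_newline("abc")    # last line of a file: unchanged
double_newline = strip_one_newline("abc\n\n") # only one separator is removed
```

Unlike rstrip(), this leaves other trailing whitespace alone and removes at most one separator.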


Hope this helps,

-- HansM
 
MRAB

On 22/08/2012 09:20, Hans Mulder wrote:
[snip]
Alternatively, if you want to remove only the line separator,
you could do:

if line.endswith(linesep):
    line = line[:-len(linesep)]

The 'if' command is only necessary for the last line, which may or
may not end in a linesep. All earlier lines are guaranteed to end
with a linesep.

Even better is:

line = line.rstrip(linesep)

The line separator is '\n'.

Strictly speaking, the line separator varies according to platform
(Windows, *nix, etc), but it's translated to '\n' on reading from a
file which has been opened in text mode (the default).
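
A quick way to see that translation, assuming Python 3. The temporary file and its contents are made up for the demonstration:

```python
import os
import tempfile

# Write Windows-style line endings in binary mode so nothing is translated.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"one\r\ntwo\r\n")

# Default text mode (newline=None) translates '\r\n' to '\n' on reading.
with open(path) as f:
    text_lines = f.readlines()

# Binary mode shows the bytes exactly as stored.
with open(path, "rb") as f:
    raw_lines = f.readlines()

os.remove(path)
print(text_lines)
print(raw_lines)
```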
 
