Elementary string-formatting

Odysseus

Hello, group: I've just begun some introductory tutorials in Python.
Taking off from the "word play" exercise at

<http://www.greenteapress.com/thinkpython/html/book010.html#toc96>

I've written a mini-program to tabulate the number of characters in each
word in a file. Once the data have been collected in a list, the output
is produced by a while loop that steps through it by incrementing an
index "i", saying

print '%2u %6u %4.2f' % \
    (i, wordcounts[i], 100.0 * wordcounts[i] / wordcounts[0])

My problem is with the last entry in each line, which isn't getting
padded:

 1      0 0.00
 2     85 0.07
 3    908 0.80
 4   3686 3.24
 5   8258 7.26
 6  14374 12.63
 7  21727 19.09
 8  26447 23.24
 9  16658 14.64
10   9199 8.08
11   5296 4.65
12   3166 2.78
13   1960 1.72
14   1023 0.90
15    557 0.49
16    261 0.23
17    132 0.12
18     48 0.04
19     16 0.01
20      5 0.00
21      3 0.00

I've tried varying the number before the decimal in the formatting
string; "F", "g", and "G" conversions instead of "f"; and a couple of
other permutations (including replacing the arithmetical expression in
the tuple with a variable, defined on the previous line), but I can't
seem to get the decimal points to line up. I'm sure I'm missing
something obvious, but I'd appreciate a tip -- thanks in advance!

FWIW I'm running

Python 2.3.5 (#1, Oct 5 2005, 11:07:27)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1809)] on darwin

from the Terminal on Mac OS X v10.4.11.

P.S. Is there a preferable technique for forcing floating-point division
of two integers to that used above, multiplying by "100.0" first? What
about if I just wanted a ratio: is "float(n / m)" better than "1.0 * n /
m"?
 
Gary Herron

Odysseus said:
print '%2u %6u %4.2f' % \
    (i, wordcounts[i], 100.0 * wordcounts[i] / wordcounts[0])

Using 4.2 is the problem. The first number (your 4) gives the minimum
total number of characters to use for the whole value, not the number of
digits before the decimal point. Your values need at least 5 characters:
two digits, one decimal point, and two more digits. So try 5.2, or go to
6.2, 7.2, or higher if your numbers can grow into the hundreds,
thousands, or beyond. If you want to try one of the exponent formats
(such as 'e'), then your first number must be large enough to account
for the digits before and after the decimal point, the 'E', and any
digits in the exponent, as well as signs for both the number and the
exponent.

Gary Herron
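
To illustrate the point about field width, here is a minimal sketch. It is written in Python 3 syntax so it runs on current interpreters; the thread itself uses Python 2.3, where print is a statement, but the % conversions shown behave the same way. The helper name format_row and the choice of three sample rows are assumptions made for illustration; the counts and percentages are copied from the table in the question.

```python
# Three rows copied from the table in the question: (length, count, percent).
rows = [(4, 3686, 3.24), (7, 21727, 19.09), (20, 5, 0.00)]


def format_row(i, count, percent, width):
    """Format one row; `width` is the minimum total width of the last field.

    The '*' in '%*.2f' takes the field width from the argument tuple, so the
    same format string can be tried with several widths.
    """
    return '%2u %6u %*.2f' % (i, count, width, percent)


for width in (4, 6):
    print('width %d:' % width)
    for i, count, percent in rows:
        print(format_row(i, count, percent, width))
```

With a width of 4, a value such as 19.09 already needs five characters, so it overflows the field and the decimal points drift. With a width of 6, every value in the sample is padded to the same length and the column lines up.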
 
John Machin

Odysseus said:
P.S. Is there a preferable technique for forcing floating-point division
of two integers to that used above, multiplying by "100.0" first? What
about if I just wanted a ratio: is "float(n / m)" better than "1.0 * n /
m"?

You obviously haven't tried float(n / m), or you wouldn't be asking.
Go ahead and try it.

"Preferable" depends on whether you want legibility or speed.

Most legible and slowest first:
1. float(n) / float(m)
2. n / float(m)
3. 1.0 * n / m
# Rationale so far: function calls are slow
4. If you have a lot of this to do, and you really care about the
speed and m (the denominator) is constant throughout, do fm = float(m)
once, and then in your loop do n / fm for each n -- and make sure you
run properly constructed benchmarks ...

Recommendation: go with (2) until you find you've got a program with
a real speed problem (and then it probably won't be caused by this
choice).

HTH,
John
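
The options listed above can be compared side by side in a short sketch. It is written for Python 3, where / on two integers is already true division, so the Python 2 behaviour of float(n / m) is emulated here with the floor-division operator //. The values 22 and 7 and all variable names are assumptions chosen for illustration.

```python
n, m = 22, 7

# In Python 2, n / m on two ints truncates before float() ever sees the
# result. Python 3's // reproduces that truncation, so this line stands in
# for Python 2's float(n / m).
truncated_first = float(n // m)

# The three numbered options from the list, most legible first.
option_1 = float(n) / float(m)
option_2 = n / float(m)
option_3 = 1.0 * n / m

# Option 4: when the denominator is constant, convert it once outside the loop.
fm = float(m)
ratios = [k / fm for k in (7, 14, 22)]

print(truncated_first)
print(option_1, option_2, option_3)
print(ratios)
```

The first value has lost its fractional part because the conversion came too late, which is why float(n / m) does not help under Python 2. The three numbered options all produce the same float.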
 
Roberto Bonvallet

Odysseus said:
P.S. Is there a preferable technique for forcing floating-point division
of two integers to that used above, multiplying by "100.0" first?

Put this at the beginning of your program:

from __future__ import division

This makes the / operator perform true division, so dividing two
integers yields a floating-point value.

HTH,
 
Odysseus

John Machin said:
You obviously haven't tried float(n / m), or you wouldn't be asking.

True, it was a very silly idea.

John Machin said:
Most legible and slowest first:
1. float(n) / float(m)
2. n / float(m)
3. 1.0 * n / m
Recommendation: go with (2) until you find you've got a program with
a real speed problem (and then it probably won't be caused by this
choice).

I had actually used n / float(m) at one point; somehow the above struck
me as an improvement while I was writing the message. Thanks for the
reality check.
 
Odysseus

Roberto Bonvallet said:
Put this at the beginning of your program:

from __future__ import division

This makes the / operator perform true division, so dividing two
integers yields a floating-point value.

Thanks for the tip. May I assume the div operator will still behave as
usual?
 
John Machin

Odysseus said:
Thanks for the tip. May I assume the div operator will still behave as
usual?

div operator? The integer division operator is //
>>> 22 div 7
  File "<stdin>", line 1
    22 div 7
         ^
SyntaxError: invalid syntax
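
As a rough illustration of how the two operators differ, the sketch below uses Python 3, where / already performs true division. Under Python 2 the same results for / would require the from __future__ import division line at the top of the file, while // behaves this way with or without the import. The variable names are assumptions made for the example.

```python
true_quotient = 22 / 7      # true division: always a float
floor_quotient = 22 // 7    # floor division: an int when both operands are ints
negative_floor = -22 // 7   # floors toward negative infinity, not toward zero
float_floor = 22.0 // 7     # still floored, but the result is a float

print(true_quotient, floor_quotient, negative_floor, float_floor)
```

So // is the operator to reach for when integer division is wanted, regardless of how / is configured.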
 
Odysseus

Gary Herron said:
Odysseus wrote:
print '%2u %6u %4.2f' % \
    (i, wordcounts[i], 100.0 * wordcounts[i] / wordcounts[0])

Using 4.2 is the problem. The first number (your 4) gives the total
number of characters to use for the number.


Thanks; I was thinking the numbers referred to digits before and after
the decimal. The largest figures have five characters in all, so they
were 'overflowing'.
 
Matt Nordhoff

Odysseus said:
print '%2u %6u %4.2f' % \
    (i, wordcounts[i], 100.0 * wordcounts[i] / wordcounts[0])


This isn't very important, but instead of keeping track of the index
yourself, you can use enumerate():
>>> mylist = ['a', 'b', 'c']
>>> for i, item in enumerate(mylist):
...     print i, item
...
0 a
1 b
2 c
Err, it doesn't look like you can make it start at 1 though.
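
For readers on newer interpreters, a small sketch of two ways to get 1-based numbering follows. enumerate() accepts a start value from Python 2.6 onward; on an older release such as the 2.3 used in this thread, adding 1 to the index by hand gives the same result. The code uses Python 3 print syntax, and the names numbered and manual are assumptions for illustration.

```python
mylist = ['a', 'b', 'c']

# Python 2.6 and later: pass the starting index as the second argument.
numbered = list(enumerate(mylist, 1))

# Older interpreters: offset the zero-based index by hand.
manual = [(i + 1, item) for i, item in enumerate(mylist)]

for i, item in numbered:
    print(i, item)
```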

<snip>