Jean said:
On Tuesday, 25 March 2014 12:01:37 UTC+1, Steven D'Aprano wrote:
On Tue, 25 Mar 2014 03:26:26 -0700, Jean Dubois wrote:
I'm confused by the behaviour of the following python-script I wrote:
#!/usr/bin/env python
#I first made a data file 'test.dat' with the following content
#1.0 2 3
#4 5 6.0
#7 8 9
import numpy as np
lines=[line.strip() for line in open('test.dat')]
#convert lines-list to numpy-array
array_lines=np.array(lines)
#fetch element at 2nd row, 2nd column:
print array_lines[1, 1]
When running the script I always get the following error: IndexError:
invalid index
Can anyone here explain to me what I am doing wrong and how to fix it?
Yes. Inspect the array by printing it, and you'll see that it is a
one-dimensional array, not a two-dimensional one, and its entries are strings:
py> import numpy as np
py> # simulate a text file
... data = """1.0 2 3
... 4 5 6.0
... 7 8 9"""
py> lines=[line.strip() for line in data.split('\n')]
py> # convert lines-list to numpy-array
... array_lines = np.array(lines)
py> print array_lines
['1.0 2 3' '4 5 6.0' '7 8 9']
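Besides printing the array, its attributes confirm the diagnosis. The following is a minimal sketch, assuming the same three lines of data held in a string; it uses print() as a function so that it also runs on Python 3, whereas the thread itself uses Python 2 syntax.

```python
import numpy as np

# Same content as test.dat, kept in a string so the example is self-contained.
data = """1.0 2 3
4 5 6.0
7 8 9"""

lines = [line.strip() for line in data.split("\n")]
array_lines = np.array(lines)

# One axis with three string entries: there is no column axis to index,
# which is why array_lines[1, 1] raises IndexError.
print(array_lines.ndim)        # 1
print(array_lines.shape)       # (3,)
print(array_lines.dtype.kind)  # 'U' on Python 3, 'S' on Python 2
```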
The interactive interpreter is your friend! You never need to guess what
the problem is. Python has powerful introspection abilities, and one of the
most powerful is also one of the simplest: print. Another powerful tool
in the interactive interpreter is help().
So, what to do about it? First, convert the strings read from the file
into numbers, then build your array. Here's one way:
py> values = [float(s) for s in data.split()]
py> print values
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
py> array_lines = np.array(values)
py> array_lines = array_lines.reshape(3, 3)
py> print array_lines
[[ 1. 2. 3.]
[ 4. 5. 6.]
[ 7. 8. 9.]]
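To connect this back to the original question, here is a small sketch that builds the same array and then picks out one element by row and column. Using reshape(-1, 3) instead of reshape(3, 3) is my own variation: it assumes three numbers per line and lets numpy work out the number of rows.

```python
import numpy as np

data = """1.0 2 3
4 5 6.0
7 8 9"""

values = [float(s) for s in data.split()]
# -1 asks numpy to infer the row count from the fixed column count (3).
array_lines = np.array(values).reshape(-1, 3)

# Indices are zero-based, so [1, 1] is the 2nd row, 2nd column.
print(array_lines[1, 1])  # 5.0
```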
Dear Steve,
Thanks for answering my question but unfortunately now I'm totally
confused.
Above I see parts from different programs which I can't
assemble together to one working program (I really tried hard).
Should I conclude from your comment that I shouldn't use numpy?
I also don't see how to get the value of an element specified by (row,
column) from a numpy_array like "array_lines" in my original code.
All I need is a little python-example reading a file with e.g. three lines
with three numbers per line and putting those numbers as floats in a
3x3-numpy_array, then selecting an element from that numpy_array using
its row and column number.
I'll try, too, but be warned that I'm using the same methodology as Steven.
Try to replicate every step in the following exploration.
First let's make sure we start with the same data:
$ cat test.dat
1.0 2 3
4 5 6.0
7 8 9
Then fire up the interactive interpreter:
$ python
Python 2.7.5+ (default, Feb 27 2014, 19:37:08)
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> lines = [line.strip() for line in open("test.dat")]
>>> lines
['1.0 2 3', '4 5 6.0', '7 8 9']
As you can see lines is a list of three strings.
Let's break these strings into parts:
>>> cells = [line.split() for line in lines]
>>> cells
[['1.0', '2', '3'], ['4', '5', '6.0'], ['7', '8', '9']]
We now have a list of lists of strings, and you can address individual items
with
>>> cells[1][2]
'6.0'
What happens when we pass this list of lists of strings to the numpy.array()
constructor?
>>> a = numpy.array(cells)
>>> a
array([['1.0', '2', '3'],
       ['4', '5', '6.0'],
       ['7', '8', '9']],
      dtype='|S3')
>>> a[1, 2]
'6.0'
It sort of works, but the array entries are strings rather than floating
point numbers. Let's fix that:
>>> a = numpy.array(cells, dtype=float)
>>> a
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.]])
>>> a[1, 2]
6.0
OK, now we can put the previous steps into a script:
$ cat tmp.py
import numpy
cells = [line.split() for line in open("test.dat")]
a = numpy.array(cells, dtype=float)
print a[1, 2]
Run it:
$ python tmp.py
6.0
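For anyone running a newer interpreter, the same steps can be sketched in a form that also works on Python 3. The helper name read_matrix is hypothetical (it is not part of numpy), and the temporary file only stands in for test.dat so that the example runs on its own.

```python
import os
import tempfile

import numpy


def read_matrix(path):
    """Read whitespace-separated numbers into a 2D float array (hypothetical helper)."""
    with open(path) as f:
        # Skip blank lines so a trailing newline does not produce an empty row.
        cells = [line.split() for line in f if line.strip()]
    return numpy.array(cells, dtype=float)


# Write the sample data to a temporary file standing in for test.dat.
with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as tmp:
    tmp.write("1.0 2 3\n4 5 6.0\n7 8 9\n")

a = read_matrix(tmp.name)
os.remove(tmp.name)

# Row index 1, column index 2 (zero-based): 2nd row, 3rd column.
print(a[1, 2])  # 6.0
```

The with block closes the file as soon as the lines have been read, which the one-line open() in the script above leaves to the interpreter.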
Seems to work. But reading a 2D array from a file really looks like a common
task -- there should be a library function for that:
$ python
Python 2.7.5+ (default, Feb 27 2014, 19:37:08)
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.loadtxt("test.dat")
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.]])
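One library function that fits this task is numpy.loadtxt, which parses whitespace-separated numbers into a float array by default. The sketch below is self-contained: an io.StringIO object stands in for test.dat (loadtxt accepts file-like objects as well as filenames), and the variable names are only illustrative.

```python
import io

import numpy

# In-memory stand-in for test.dat; with a real file you would pass its name.
text = io.StringIO(u"1.0 2 3\n4 5 6.0\n7 8 9\n")

a = numpy.loadtxt(text)

print(a.shape)   # (3, 3)
print(a[1, 2])   # 6.0
```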