Question about StringIO

Frank Millman · Oct 9, 2005

Hi all

I understand that StringIO creates a file-like object in memory.

Is it possible to invoke another program, using os.system() or
os.popen(), and use the < redirect operator, so that the other program
reads my StringIO object as its input?

I will provide more details if required, but hopefully this is enough
for a simple yes or no answer, and if so, how.

BTW, I have tried using popen2() and passing my data via stdin, but the
other program (psql) does not react well to this - again, I will give
more info if necessary.

Thanks

Frank Millman

Steve Holden · Oct 9, 2005

Frank said:
Hi all

I understand that StringIO creates a file-like object in memory.

Is it possible to invoke another program, using os.system() or
os.popen(), and use the < redirect operator, so that the other program
reads my StringIO object as its input?

I will provide more details if required, but hopefully this is enough
for a simple yes or no answer, and if so, how.

BTW, I have tried using popen2() and passing my data via stdin, but the
other program (psql) does not react well to this - again, I will give
more info if necessary.

Unfortunately the StringIO module only creates instances inside the
process they are called: these objects have no existence to the
operating system or to other processes, and so can't be used for
inter-process communication.

regards
Steve

Diez B. Roggisch · Oct 9, 2005

Frank said:
Hi all

I understand that StringIO creates a file-like object in memory.

Is it possible to invoke another program, using os.system() or
os.popen(), and use the < redirect operator, so that the other program
reads my StringIO object as its input?

No. Processes don't share memory - thus you have to either use a temp
file, or pipes.

BTW, I have tried using popen2() and passing my data via stdin, but the
other program (psql) does not react well to this - again, I will give
more info if necessary.

Better do so

Diez
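
A minimal sketch of the temp-file route mentioned above, written for current Python 3 with the subprocess module rather than the popen functions discussed in this thread. The helper name run_with_tempfile and the stand-in child process (a Python one-liner instead of psql) are assumptions made for illustration only.

```python
import io
import subprocess
import sys
import tempfile


def run_with_tempfile(command, text):
    """Run `command`, feeding `text` to its stdin through a real temporary file."""
    with tempfile.TemporaryFile(mode="w+", encoding="utf-8") as tmp:
        tmp.write(text)
        tmp.flush()
        tmp.seek(0)  # rewind so the child reads from the start of the file
        result = subprocess.run(command, stdin=tmp, capture_output=True, text=True)
    return result.stdout


# The in-memory buffer cannot be handed to the child directly,
# so its contents are copied into the temporary file first.
buffer = io.StringIO("select 1;\nselect 2;\n")

# Stand-in child (hypothetical): counts the lines it receives on stdin.
child = [sys.executable, "-c", "import sys; print(len(sys.stdin.readlines()))"]
print(run_with_tempfile(child, buffer.getvalue()))
```

The temporary file has a real file descriptor, which is what the operating system needs in order to connect it to the child's stdin.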

Frank Millman · Oct 10, 2005

Diez said:
No. Processes don't share memory - thus you have to either use a temp
file, or pipes.

Better do so

Diez

Thanks, Steve and Diez, for the replies. I didn't think it was
possible, but it was worth asking

I will try to explain my experience with popen() briefly.

I have some sql scripts to create tables, indexes, procedures, etc. At
present there are about 50 scripts, but this number will grow. I have
been running them manually so far. Now I want to automate the process.

I am supporting PostgreSQL and MS SQL Server, and the syntax is
slightly different in some cases. Rather than maintain two sets of
scripts, I prefix some lines with -pg- or -ms- to indicate the
platform, and then use Python to parse the scripts and generate a
correct output for each platform, passing it to 'psql' and 'osql'
respectively, using popen().

I have had a few problems, but it would take too long to describe them
all, and I just want a working solution, so I will focus on my latest
attempt.

I run through all the scripts and create a StringIO object with the
string I want to pass. It is about 250 000 bytes long. If I run psql
using popen(), and pass it the string via stdin, it works fine, but I
get all the messages on the screen. If I do the same, but end the
command with ' > fjm 2>&1' it works correctly and the messages end up
in the file fjm, which is about 40 000 bytes long. If I run it with
popen4(), it starts ok, but then hangs about 1/4 of the way through.
Exactly the same happens on MSW. It seems to be hitting a limit on the
size of the stdout file - is that possible?

For my purposes, I will be happy to use popen() and a choice of no
redirection, redirect to a file, or redirect to /dev/null. The question
about popen4() is therefore academic, though I would be interested to
know the answer.

BTW, is there an equivalent of /dev/null on MSW?

Thanks in advance for any suggestions.

Frank
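
The prefix scheme described above could be sketched roughly as follows. The exact marker layout in Frank's scripts is not shown in the thread, so the function name lines_for_platform, the assumption that a marker sits at the start of a line, and the sample table are all hypothetical.

```python
# Markers as described in the post; their placement at the start
# of a line is an assumption.
PREFIXES = {"pg": "-pg-", "ms": "-ms-"}


def lines_for_platform(script, platform):
    """Keep unprefixed lines plus those marked for `platform`, minus the marker."""
    wanted = PREFIXES[platform]
    others = [p for name, p in PREFIXES.items() if name != platform]
    kept = []
    for line in script.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(wanted):
            kept.append(stripped[len(wanted):].lstrip())
        elif any(stripped.startswith(p) for p in others):
            continue  # line belongs to the other platform
        else:
            kept.append(line)
    return "\n".join(kept) + "\n"


script = """\
CREATE TABLE customers (
-pg- id SERIAL PRIMARY KEY,
-ms- id INT IDENTITY PRIMARY KEY,
name VARCHAR(50)
);
"""

print(lines_for_platform(script, "pg"))
```

Calling lines_for_platform(script, "ms") on the same input would keep the IDENTITY line and drop the SERIAL one.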

Benjamin Niemann · Oct 10, 2005

Frank said:
I will try to explain my experience with popen() briefly.

I have some sql scripts to create tables, indexes, procedures, etc. At
present there are about 50 scripts, but this number will grow. I have
been running them manually so far. Now I want to automate the process.

I am supporting PostgreSQL and MS SQL Server, and the syntax is
slightly different in some cases. Rather than maintain two sets of
scripts, I prefix some lines with -pg- or -ms- to indicate the
platform, and then use Python to parse the scripts and generate a
correct output for each platform, passing it to 'psql' and 'osql'
respectively, using popen().

I have had a few problems, but it would take too long to describe them
all, and I just want a working solution, so I will focus on my latest
attempt.

I run through all the scripts and create a StringIO object with the
string I want to pass. It is about 250 000 bytes long. If I run psql
using popen(), and pass it the string via stdin, it works fine, but I
get all the messages on the screen. If I do the same, but end the
command with ' > fjm 2>&1' it works correctly and the messages end up
in the file fjm, which is about 40 000 bytes long. If I run it with
popen4(), it starts ok, but then hangs about 1/4 of the way through.
Exactly the same happens on MSW. It seems to be hitting a limit on the
size of the stdout file - is that possible?

For my purposes, I will be happy to use popen() and a choice of no
redirection, redirect to a file, or redirect to /dev/null. The question
about popen4() is therefore academic, though I would be interested to
know the answer.

That's probably a deadlock as described in
<http://docs.python.org/lib/popen2-flow-control.html>

BTW, is there an equivalent of /dev/null on MSW?

Dunno - but as a last resort, you could create a tempfile with a unique name
(to be sure not to overwrite any existing data), dump your output there and
later os.unlink() it...

Diez B. Roggisch · Oct 10, 2005

Frank said:
Thanks, Steve and Diez, for the replies. I didn't think it was
possible, but it was worth asking

I will try to explain my experience with popen() briefly.

I have some sql scripts to create tables, indexes, procedures, etc. At
present there are about 50 scripts, but this number will grow. I have
been running them manually so far. Now I want to automate the process.

I am supporting PostgreSQL and MS SQL Server, and the syntax is
slightly different in some cases. Rather than maintain two sets of
scripts, I prefix some lines with -pg- or -ms- to indicate the
platform, and then use Python to parse the scripts and generate a
correct output for each platform, passing it to 'psql' and 'osql'
respectively, using popen().

Why don't you use the Python DB-API instead?

Regards,

Diez

Frank Millman · Oct 11, 2005

Benjamin said:
That's probably a deadlock as described in
<http://docs.python.org/lib/popen2-flow-control.html>

Thanks for this pointer. I have read it, but I don't think it applies
to my situation, as it talks about 'reading' from the child's stdout
while the child is 'writing' to stderr. I am not doing that, or at
least not consciously. Here is a code snippet. 's' is a StringIO object
that contains my input. It is about 6000 lines/250000 bytes long.

-------------

sql_stdin,sql_stdout = os.popen4('psql -U %s -d %s' % (user,database))

sql_stdin.writelines(s.readlines())
s.close()
sql_stdin.close()

-------------

It starts, and then hangs, after processing about 6% of my input.

If I add ' > fjm 2>&1' to the command, it works, so it is definitely
connected with the child writing to stdout/stderr.

I tried storing my input in a list, and passing ''.join(s), but it had
the same result. I also looped over the list and wrote one line at a
time to sql_stdin - same result.

I can work around this, so a solution is not critical. However, it
would be nice to know if this is a limitation of popen4(), or if I am
doing something wrong.

Dunno - but as a last resort, you could create a tempfile with a unique name
(to be sure not to overwrite any existing data), dump your output there and
later os.unlink() it...

A quick google revealed the answer - there is a device called NUL which
achieves the same purpose.

Thanks

Frank
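
In current Python the null device does not need to be named by hand: os.devnull holds the platform's name for it, and subprocess.DEVNULL can be passed as a redirection target. A small sketch, assuming Python 3 and using a stand-in child process instead of psql or osql:

```python
import os
import subprocess
import sys

# Stand-in child (hypothetical): prints a line that is to be discarded.
child = [sys.executable, "-c", "print('noisy output')"]

# Send stdout to the null device and fold stderr into it,
# the equivalent of ' > /dev/null 2>&1' (or ' > NUL 2>&1' on Windows).
completed = subprocess.run(child, stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT)

# os.devnull is '/dev/null' on POSIX systems and 'nul' on Windows.
print(os.devnull, completed.returncode)
```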

Frank Millman · Oct 11, 2005

Diez said:
Why don't you use the Python DB-API instead?

Regards,

Diez

My scripts are used to create the tables in the database. I didn't
think that DB-API covered that. However, even if it did, I don't think
it would handle differences such as the following.

For Unicode support, PostgreSQL uses the character-set specified when
the database is created. SQL Server allows you to specify it for each
column, using the datatype NCHAR and NVARCHAR instead of CHAR and
VARCHAR.

PostgreSQL uses data types called DATE and TIMESTAMP. SQL Server uses
DATETIME (it also uses TIMESTAMP, but that is used for something else).

Both DBMS's have the concept of a column which is automatically
assigned a 'next number' each time a row is created, but the syntax for
defining the column is completely different.

PostgreSQL allows the use of a WHERE clause when creating an INDEX,
which is useful if you only want to index a subset of a table.

SQL Server has the concept of a CLUSTERED INDEX, whereby it stores the
rows physically in index sequence. It defaults to using a clustered
index for the primary key. Often this is not what you want, so it is
desirable to specify the primary key as NONCLUSTERED, and then specify
a CLUSTERED index for a more frequently used column.

These are just a few of the differences, but you get the idea. If there
is a better way to do this in a cross-platform manner, I would love to
know how.

Thanks

Frank

Diez B. Roggisch · Oct 11, 2005

Frank said:
Thanks for this pointer. I have read it, but I don't think it applies
to my situation, as it talks about 'reading' from the child's stdout
while the child is 'writing' to stderr.

But that is exactly the point: psql blocks because nobody reads the
output it writes, so its output buffer fills up. Start a thread, read
that stdout/stderr and see if things go smoothly.

Diez
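
The suggestion above could look roughly like this in current Python 3. It is a sketch under assumptions, not the code Frank ended up with: it uses subprocess.Popen in place of os.popen4(), the helper name feed_and_collect is made up, and a Python echo script stands in for psql.

```python
import subprocess
import sys
import threading


def feed_and_collect(command, text):
    """Write `text` to the child's stdin while a thread drains its output."""
    proc = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # one combined stream, as popen4() returned
        text=True,
    )
    chunks = []

    def drain():
        # Keep emptying the pipe so the child never blocks on a full buffer.
        for line in proc.stdout:
            chunks.append(line)

    reader = threading.Thread(target=drain)
    reader.start()
    proc.stdin.write(text)
    proc.stdin.close()  # signals end of input to the child
    reader.join()
    proc.wait()
    return "".join(chunks)


# Stand-in child (hypothetical): echoes every input line back to stdout.
echo = [sys.executable, "-c", "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line)"]

payload = "select 1;\n" * 25000  # about 250 000 bytes, as in the post
output = feed_and_collect(echo, payload)
print(len(output))
```

Without the reader thread, the parent would sit in write() while the child sits in its own write(), each waiting for the other to read. Popen.communicate() handles the same problem internally and is usually the simpler choice when the whole input is available up front.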

Diez B. Roggisch · Oct 11, 2005

Frank said:
My scripts are used to create the tables in the database. I didn't
think that DB-API covered that.

The DB-API covers executing arbitrary SQL - either DDL or DML. It is
surely centered around DML, but that doesn't mean that it's not usable to
issue "create ..." statements.

Frank said:
However, even if it did, I don't think
it would handle differences such as the following.

<snip>

All that has nothing to do with the API - you'd still need your
differentiated DDL - but the communication with the programs would go away.

Diez
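
To illustrate the point about DDL, here is a small sketch using sqlite3, chosen only because it is a DB-API 2.0 module that ships with Python. The thread itself concerns PostgreSQL and SQL Server, whose drivers would be used the same way through connect(), cursor() and execute(). The table and index names are hypothetical.

```python
import sqlite3

# Platform-specific DDL would be generated beforehand; these two
# statements are placeholders for that output.
ddl_statements = [
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name VARCHAR(50))",
    "CREATE INDEX customers_name ON customers (name)",
]

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
for statement in ddl_statements:
    cur.execute(statement)  # DDL goes through execute() just like DML
conn.commit()

# Confirm that both objects now exist in the database catalogue.
cur.execute("SELECT name FROM sqlite_master ORDER BY name")
names = [row[0] for row in cur.fetchall()]
print(names)
conn.close()
```

With this approach there is no child process, so no pipes to fill up and no output to redirect; errors arrive as Python exceptions instead.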

Frank Millman · Oct 11, 2005

Diez said:
But that is exactly the point: psql blocks because nobody reads the
output it writes, so its output buffer fills up. Start a thread, read
that stdout/stderr and see if things go smoothly.

Diez

Of course (kicks himself), it is obvious now that you have explained
it. I tried your suggestion and it works perfectly.

Many thanks

Frank

Frank Millman · Oct 11, 2005

Diez said:
The DB-API covers executing arbitrary SQL - either DDL or DML. It is
surely centered around DML, but that doesn't mean that it's not usable to
issue "create ..." statements.

<snip>

All that has nothing to do with the API - you'd still need your
differentiated DDL - but the communication with the programs would go away.

Diez

I understand. It certainly gives me an alternative approach - I will
experiment to see which suits my purpose best.

Many thanks for your assistance.

Frank
