Ok, this email is something of a recital of how I approached this.
The apache error log:
I restarted Apache:
/etc/init.d/httpd restart
Then a:
ps axf
gave me the PID of a running httpd. Examining its open files:
lsof -p 9287
shows me:
httpd 9287 nobody 2w REG 0,192 12719609 56510510 /usr/local/apache/logs/error_log
httpd 9287 nobody 7w REG 0,192 7702310 56510512 /usr/local/apache/logs/access_log
among many others.
So, to monitor these logs:
tail -F /usr/local/apache/logs/error_log /usr/local/apache/logs/access_log &
placing the tail in the background so I can still use that shell.
Watching the log while fetching the page:
http://superhost.gr/
says:
==> /usr/local/apache/logs/error_log <==
[Tue Apr 23 12:11:40 2013] [error] [client 54.252.27.86] suexec policy violation: see suexec log for more details
[Tue Apr 23 12:11:40 2013] [error] [client 54.252.27.86] Premature end of script headers: metrites.py
[Tue Apr 23 12:11:40 2013] [error] [client 54.252.27.86] File does not exist: /home/nikos/public_html/500.shtml
[Tue Apr 23 12:11:43 2013] [error] [client 107.22.40.41] suexec policy violation: see suexec log for more details
[Tue Apr 23 12:11:43 2013] [error] [client 107.22.40.41] Premature end of script headers: metrites.py
[Tue Apr 23 12:11:43 2013] [error] [client 107.22.40.41] File does not exist: /home/nikos/public_html/500.shtml
[Tue Apr 23 12:11:45 2013] [error] [client 79.125.63.121] suexec policy violation: see suexec log for more details
[Tue Apr 23 12:11:45 2013] [error] [client 79.125.63.121] Premature end of script headers: metrites.py
[Tue Apr 23 12:11:45 2013] [error] [client 79.125.63.121] File does not exist: /home/nikos/public_html/500.shtml
So:
You're using suexec in your Apache. This greatly complicates your debugging.
Suexec seems to be a facility for arranging that CGI scripts run as the user
who owns them. Because that has a lot of potential for ghastly
security holes, suexec performs a large number of strict checks on
CGI script locations and permissions before running a
CGI script. At a guess the first hurdle would be that metrites.py
is owned by root. Suexec is very picky about what users it is
prepared to become. "root" is not one of them, as you might imagine.
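If you want to check ownership from Python rather than with ls -l, here is a minimal sketch. The helper names describe_owner and owned_by_root are mine, not part of any library, and this only looks at the owning uid; suexec itself checks a good deal more than that.

```python
import os
import pwd


def describe_owner(path):
    """Return (username, uid) for the owner of path."""
    st = os.stat(path)
    try:
        name = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # The uid has no passwd entry; fall back to the number itself.
        name = str(st.st_uid)
    return name, st.st_uid


def owned_by_root(path):
    """True if path is owned by uid 0, a user suexec will not become."""
    return os.stat(path).st_uid == 0
```

Calling describe_owner() on metrites.py before the chown would presumably have reported root, which is the situation guessed at above.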
I've chowned metrites.py to nikos:nikos. Suexec now lets it run, producing this:
Traceback (most recent call last):
File "metrites.py", line 9, in <module>
sys.stderr = open('/home/nikos/public_html/cgi.err.out', 'a')
PermissionError: [Errno 13] Permission denied: '/home/nikos/public_html/cgi.err.out'
That file is owned by root. metrites.py is being run as nikos.
So:
chown nikos:nikos /home/nikos/public_html/cgi.err.out
A page reload now shows this:
Error in sys.excepthook:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 2334-2342: ordinal not in range(128)
Original exception was:
Traceback (most recent call last):
File "metrites.py", line 226, in <module>
print( template )
UnicodeEncodeError: 'ascii' codec can't encode characters in position 30-38: ordinal not in range(128)
This shows you writing the string in template to stdout. The default
encoding for stdout is 'ascii', accepting only characters of values
0..127. I expect template contains more than this, since the ASCII
range is very US Anglocentric; Greek characters for example won't
encode into ascii.
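To see the failure in isolation, here is a small sketch that has nothing to do with your script. The helper try_encode and the sample string are made up for illustration.

```python
def try_encode(text, encoding):
    """Return text encoded as bytes, or None if the codec cannot represent it."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


greek = "αβγ"  # three Greek letters, all outside the 0..127 ASCII range

print("ascii:", try_encode(greek, "ascii"))
print("utf-8:", try_encode(greek, "utf-8"))
```

The ascii attempt fails in the same way print( template ) does in the traceback, while utf-8 can represent every Unicode character.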
As mentioned in the thread on python-list, python will adopt your
terminal's encoding if used interactively but will be pretty
conservative if the output is not a terminal; ascii as you see
above.
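If you want to see what Python actually chose in a given environment, a sketch like this reports it. stream_report is a hypothetical helper name; it writes to stderr so that it does not disturb the CGI output itself.

```python
import sys


def stream_report(stream):
    """Describe whether a text stream is a terminal and which encoding it uses."""
    return "isatty=%s encoding=%s" % (
        stream.isatty(),
        getattr(stream, "encoding", None),
    )


# Goes to stderr, so in a CGI script it lands in the error log or cgi.err.out.
sys.stderr.write("stdout: " + stream_report(sys.stdout) + "\n")
```

Run from a shell it will usually show your terminal's encoding; run under Apache it should show the conservative choice described above.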
What you want is probably UTF-8 in the output byte stream. But
let's check what the HTTP headers are saying, because _they_ tell
the browser the byte stream encoding. The headers and your program's
encoding must match. So:
% wget -S -O - http://superhost.gr/
--2013-04-23 19:34:38--  http://superhost.gr/
Resolving superhost.gr (superhost.gr)... 82.211.30.133
Connecting to superhost.gr (superhost.gr)|82.211.30.133|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Date: Tue, 23 Apr 2013 09:34:46 GMT
Server: Apache/2.2.24 (Unix) mod_ssl/2.2.24 OpenSSL/1.0.0-fips mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
Length: unspecified [text/html]
Saving to: ‘STDOUT’
<!--: spam
Content-Type: text/html
<body bgcolor="#f0f0f8"><font color="#f0f0f8" size="-5"> -->
<body bgcolor="#f0f0f8"><font color="#f0f0f8" size="-5"> --> -->
</font> </font> </font> </script> </object> </blockquote> </pre>
So, the Content-Type: header says: "text/html; charset=utf-8". So that's good.
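If you would rather pull the charset out of a Content-Type value in code than read it by eye, the standard library's email package can parse it. This is only a sketch; charset_of is a name I have made up for it.

```python
from email.message import Message


def charset_of(content_type):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


print(charset_of("text/html; charset=utf-8"))
print(charset_of("text/html"))
```

Whatever this reports for the real header is the encoding your script's output bytes need to be in.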
So I've imported os and added this line:
sys.stdout = os.fdopen(1, 'w', encoding='utf-8')
under the setting of sys.stderr. If the cgi libraries run under
Python 3 there is probably a cleaner way to do this but I don't know how.
This just opens UNIX file descriptor 1 (standard output) from scratch
for write ('w') using the 'utf-8' encoding.
And now your CGI script runs, accepting strings sent to print().
sys.stdout now takes care of transcoding those strings (Unicode
character code points inside Python) into the utf-8 encoding required
in the output bytes.
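For what it's worth, one alternative to reopening file descriptor 1 is to wrap the existing binary stream in an io.TextIOWrapper. This is not what I put in metrites.py, just a sketch of the same idea; utf8_writer is a made-up helper name, and the demonstration writes to an in-memory buffer rather than the real stdout.

```python
import io
import sys


def utf8_writer(binary_stream):
    """Wrap a binary stream so that str written to it comes out as UTF-8 bytes."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", line_buffering=True)


# In a CGI script the equivalent of the os.fdopen() line would be:
#   sys.stdout = utf8_writer(sys.stdout.buffer)
# (Python 3.7 and later also offer sys.stdout.reconfigure(encoding="utf-8").)

# Demonstration against an in-memory buffer instead of the real stdout:
buf = io.BytesIO()
out = utf8_writer(buf)
print("αβγ", file=out)
out.flush()
print(buf.getvalue())
```

Either way the effect is the same: print() hands over str objects and the wrapper turns them into utf-8 bytes on the way out.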