pietdejong
Hello again,
Sorry for posting this again, but since my thread from last Saturday
more or less hit a dead end, I decided to start a new one. See also:
http://groups-beta.google.com/group...8&mode=thread&noheader=1#doc_b7b2b45a08c061be
The problem I'm having is only on the server side.
I'm working on a server that should receive HTTP requests. However, a
request arriving at the server may not be HTTP. This is checked on the
first byte of data.
(in other words:
if the first byte is equal to 0x01,
then not HTTP
else ... )
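A minimal sketch of that first-byte check, assuming the connection's InputStream is wrapped in a java.io.PushbackInputStream so the byte can be inspected and then put back. The class and method names (ProtocolSniffer, looksLikeHttp) are made up for illustration, and treating an empty stream as non-HTTP is an arbitrary choice.

```java
import java.io.IOException;
import java.io.PushbackInputStream;

// Hypothetical helper: inspects the first byte without consuming it.
class ProtocolSniffer {
    // Marker byte that identifies a non-HTTP request (0x01, as described in the post).
    static final int NON_HTTP_MARKER = 0x01;

    /** Returns true if the stream does not start with the 0x01 marker byte. */
    static boolean looksLikeHttp(PushbackInputStream in) throws IOException {
        int first = in.read();
        if (first == -1) {
            return false; // empty stream: treated as non-HTTP here (an assumption)
        }
        in.unread(first); // push the byte back so later readers see the whole request
        return first != NON_HTTP_MARKER;
    }
}
```

On the server this would be called as looksLikeHttp(new PushbackInputStream(socket.getInputStream())), and the same PushbackInputStream would then be handed to whichever handler applies, so no byte is lost.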
Assuming the data is posted over HTTP, I'm trying to solve the
following: I don't know in advance which encoding is used for the data
stream. The following rule applies:
If a string matching the regex <?xml [^>]+encoding="([^"]+)" is
encountered, $1 is used for decoding; otherwise a default charset is
used. My goal is to use both the characters (i.e. the server's
'interpretation' of the bytes received) and the original byte stream. I
want to write the original byte stream to a file, while using the
derived character stream for processing (beans, XSL transformation,
etc.).
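The rule above could be sketched roughly as follows, assuming the whole request body has already been read into a byte array. The class name XmlEncodingDetector, the 256-byte prefix limit and the fallback parameter are assumptions for illustration. The prefix is decoded as ISO-8859-1 only to run the regex, which works for ASCII-compatible encodings but not for UTF-16 and the like.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: picks a charset from the XML declaration, if there is one.
class XmlEncodingDetector {
    // The regex from the post, escaped for a Java string literal.
    private static final Pattern ENCODING =
            Pattern.compile("<\\?xml [^>]+encoding=\"([^\"]+)\"");

    static Charset detect(byte[] raw, Charset fallback) {
        // ISO-8859-1 maps every byte to exactly one char, so this decode never fails.
        int prefixLength = Math.min(raw.length, 256);
        String prefix = new String(raw, 0, prefixLength, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING.matcher(prefix);
        if (m.find()) {
            try {
                return Charset.forName(m.group(1));
            } catch (IllegalArgumentException e) {
                return fallback; // illegal or unsupported charset name
            }
        }
        return fallback;
    }

    /** Decodes the untouched bytes; the array itself stays available for writing to a file. */
    static String decode(byte[] raw, Charset fallback) {
        return new String(raw, detect(raw, fallback));
    }
}
```

Because decoding only reads the array, the same bytes can still be written unchanged afterwards, for example with java.nio.file.Files.write(path, raw).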
I tried simulating the client with a basic HTML page containing a FORM
whose action is my server's URL. In HTML I can specify the meta element
Content-type and set it to "text/xml; charset=utf-8" or whatever I
like. I recall that by default HTML forms encode using the platform
default charset and the content type application/x-www-form-urlencoded.
I also tried simulating the client with a Java application that uses
java.net.HttpURLConnection. There I set the request property
"Content-type" to "text/xml; charset=utf-8".
Now I'm not sure whether in either or both cases the stream is MIME
encoded...?
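For reference, the Java test client described above might look roughly like this. It is only a sketch: the class name XmlPostClient, the method names and any URL passed in are placeholders. With this API the bytes written to the connection's output stream are sent as the request body as given, and the Content-Type header only labels them.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical test client; serverUrl is whatever address the server listens on.
class XmlPostClient {
    /** Configures a POST connection; no network traffic happens until send(). */
    static HttpURLConnection prepare(String serverUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(serverUrl).toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true); // required before writing a request body
        conn.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
        return conn;
    }

    /** Encodes the XML as UTF-8 to match the header, sends it, returns the HTTP status. */
    static int send(HttpURLConnection conn, String xml) throws IOException {
        byte[] body = xml.getBytes(StandardCharsets.UTF_8);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        return conn.getResponseCode();
    }
}
```

Note that the charset in the header and the charset used in getBytes are two independent choices here; nothing forces them to agree, which is part of why the server cannot simply trust the header.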
Someone in the previous thread suggested that I use HttpURLConnection
on the server side as well, but since I'm also expecting non-HTTP
requests, I'm not sure I can. Most likely I cannot use a BufferedReader,
because it works on a character stream, so I would lose the original
byte stream...
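One way to keep the original bytes while still reading characters would be a small "tee" stream that copies every byte it delivers into a second OutputStream. This is a sketch under that assumption; RecordingInputStream is a made-up name, not an existing JDK class. Note that a Reader on top of it reads ahead in blocks, so the copy is only complete once the stream has been read to the end.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical wrapper: records every byte read from the underlying stream.
class RecordingInputStream extends FilterInputStream {
    private final OutputStream copy;

    RecordingInputStream(InputStream in, OutputStream copy) {
        super(in);
        this.copy = copy;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            copy.write(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            copy.write(buf, off, n); // record exactly the bytes that were delivered
        }
        return n;
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would cause bytes to be recorded twice
    }
}
```

A BufferedReader built as new BufferedReader(new InputStreamReader(new RecordingInputStream(in, fileOut), charset)) would then yield the characters, while fileOut receives the untouched bytes.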
Thx