Japanese (speaking) developer needed for a bit of regex magic

Sebastian

Hi all,

I'm working on Python bindings for the Amazon Product Advertising API
(http://pypi.python.org/pypi/python-amazon-product-api/) which
supports the different localised versions - among them a Japanese one
(for http://www.amazon.co.jp).

All locales return error messages in English except the Japanese one,
which uses Japanese. My regular expressions cannot handle that at the moment.

Is there anyone fluent enough in Japanese to give me a hand? The bit
of code that needs tweaking can be found here:
http://bitbucket.org/basti/python-amazon-product-api/src/tip/amazonproduct.py#cl-152

A simple diff would help me greatly.

Thanks for your effort!
Seb.

P.S. If you have questions, I've set up a mailing list at python-
(e-mail address removed).
 
Chris Rebert

> What exactly are you expecting to happen, and what exactly happens
> instead?
>
> General advice on character sets in Python applies: always explicitly
> declare the encoding of input, then decode to Unicode internally as early
> as possible, and process all text that way. Only encode to a specific
> encoding when it's time to output.
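
The decode-early, encode-late advice above can be sketched roughly as follows. This is a minimal illustration, assuming the input arrives as UTF-8 bytes; the name `raw` and the sample string are made up for the example. It also shows how decoding with the wrong codec garbles text silently instead of failing:

```python
# Hypothetical input: bytes as they might arrive from a socket or a file.
raw = "値を変更".encode("utf-8")

# Decode once, at the boundary, with an explicitly declared encoding.
text = raw.decode("utf-8")
print(len(text))      # number of characters, not bytes

# Decoding the same bytes with a single-byte codec raises no error;
# every byte simply becomes one (wrong) character.
garbled = raw.decode("latin-1")
print(len(garbled))   # one character per byte

# Encode again only when it is time to output.
out = text.encode("utf-8")
```

All processing in between (regular expressions included) then works on `text`, never on `raw`.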

I think he has more of a *literal* language problem: He doesn't know
Japanese and thus can't read the Japanese error message in order to
develop a regex for it. I assume there's some reason he can't just do
a blind equality test on the error message string(s).

Cheers,
Chris
 
Sebastian

> General advice on character sets in Python applies: always explicitly
> declare the encoding of input, then decode to Unicode internally as early
> as possible, and process all text that way. Only encode to a specific
> encoding when it's time to output.

Maybe I was too vague when describing my problem. As Chris correctly
guessed, I have a literal language problem.

> What exactly are you expecting to happen, and what exactly happens
> instead?

My regular expressions turn the Amazon error messages into Python
exceptions.

This works fine as long as they are in English: "??? is not a valid
value for BrowseNodeId. Please change this value and retry your
request.", for instance, will raise an InvalidParameterValue
exception. However, the Japanese version returns the error message "???
は、BrowseNodeIdの値として無効です。値を変更してから、再度リクエストを実行してください。" which will not be
successfully handled.
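
For illustration, here is a rough sketch of how one pattern per language could handle both messages. It assumes the Japanese text is a fixed template like the English one; the names `PATTERNS` and `parse_invalid_value` are hypothetical, and the real expressions in amazonproduct.py may well look different:

```python
import re


class InvalidParameterValue(Exception):
    """Raised when Amazon rejects a parameter value."""


# One pattern per language. Both capture the offending value and the
# parameter name, which stays in ASCII in the Japanese message too.
PATTERNS = [
    re.compile(r"(?P<value>.+?) is not a valid value for (?P<parameter>[A-Za-z]+)\."),
    re.compile(r"(?P<value>.+?)\s*は、(?P<parameter>[A-Za-z]+)の値として無効です。"),
]


def parse_invalid_value(message):
    """Return (value, parameter), or None if no pattern matches."""
    for pattern in PATTERNS:
        match = pattern.match(message)
        if match:
            return match.group("value"), match.group("parameter")
    return None


english = ("??? is not a valid value for BrowseNodeId. "
           "Please change this value and retry your request.")
japanese = "??? は、BrowseNodeIdの値として無効です。値を変更してから、再度リクエストを実行してください。"

for msg in (english, japanese):
    print(parse_invalid_value(msg))
```

The message must already be decoded to a Unicode string for the second pattern to match; on raw UTF-8 bytes it will not.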

This renders my module pretty much useless for Japanese users.

I was therefore wondering if someone with more knowledge of Japanese
than me could have a look at my expressions. Maybe the Japanese messages
are completely different...

I have a collection of sample messages here (all files *-jp-*.xml):
http://bitbucket.org/basti/python-amazon-product-api/src/tip/tests/2009-11-01/

Any help is appreciated!

Cheers,
Sebastian
 
Sebastian

> My regular expressions turn the Amazon error messages into Python
> exceptions.
>
> Your problem, then, appears to be that you're attacking the issue at the
> wrong layer. Parsing messages in natural language and hoping to
> reconstruct a structure is going to be an exercise in frustration.
>
> Doesn't the API have defined response codes and parameters that you can
> use, instead of parsing error strings in various natural languages?

No, unfortunately not. If it did, I would have used it.

The Amazon API returns an XML response which contains error messages
if a request fails. These messages consist of an error code and an
error description in natural language. Luckily, the description seems
to stick to the same format and is (in all but one case) in plain
English. Much to my dismay I discovered that the Japanese locale
translates the error message!

For example, this is the bit of XML returned for the German locale:

     <Errors>
       <Error>
         <Code>AWS.InvalidParameterValue</Code>
         <Message>??? is not a valid value for BrowseNodeId. Please
           change this value and retry your request.</Message>
       </Error>
     </Errors>

The corresponding part from the Japanese locale looks like this:

     <Errors>
       <Error>
         <Code>AWS.InvalidParameterValue</Code>
         <Message>??? は、BrowseNodeIdの値として無効です。値を変更してから、再度リクエストを実行してください。</Message>
       </Error>
     </Errors>

Of course, one could argue that the type of error (in this case
"AWS.InvalidParameterValue") would be enough. However, in order to
return a meaningful error message, I would like to parse the
description itself - and for this some knowledge of Japanese would be
helpful.
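
To make the structure concrete, here is a minimal sketch that pulls the code and the description out of such a fragment with the standard library. The helper name `extract_errors` is hypothetical, and the sketch assumes the elements carry no XML namespace; a real response may use one, in which case the tag names would need to be qualified:

```python
import xml.etree.ElementTree as ET

# The German-locale fragment shown above.
SAMPLE = """<Errors>
  <Error>
    <Code>AWS.InvalidParameterValue</Code>
    <Message>??? is not a valid value for BrowseNodeId. Please
      change this value and retry your request.</Message>
  </Error>
</Errors>"""


def extract_errors(xml_text):
    """Return a list of (code, message) pairs from an <Errors> fragment."""
    root = ET.fromstring(xml_text)
    errors = []
    for error in root.iter("Error"):
        code = error.findtext("Code", default="")
        # Collapse line breaks and runs of spaces inside the description.
        message = " ".join(error.findtext("Message", default="").split())
        errors.append((code, message))
    return errors


for code, message in extract_errors(SAMPLE):
    print(code, "->", message)
```

The code is language-independent, so it can select the exception type even when the description cannot be parsed.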
 
Chris Rebert

Just throwing this out there, but perhaps you could grep for the
relevant terms in the error message and intuit it from there?
For example:

# terms = whatever the actual param names are
terms = "BrowseNodeId FooNodeId FooQueryType".split()
for term in terms:
    # parameter names stay in ASCII, whatever language the message is in
    if term in err_msg:
        raise AmazonError(err_code + " for " + repr(term))

Cheers,
Chris
 
Terry Reedy

> The Amazon API returns an XML response which contains error messages
> if a request fails. These messages consist of an error code and an
> error description in natural language. Luckily, the description seems
> to stick to the same format and is (in all but one case) in plain
> English. Much to my dismay I discovered that the Japanese locale
> translates the error message!

Could you, when you get an error message, resubmit the request in the
standard locale so you get the messages in English? Or is the 'locale'
set by the url -- amazon.com versus amazon.co.jp?

After you parse, are you trying to formulate a substitute message in
Japanese?

Terry Jan Reedy
 
Terry Reedy

> This works fine as long as they are in English:
> "??? is not a valid value for BrowseNodeId.
> Please change this value and retry your request.",
> for instance, will raise an InvalidParameterValue
> exception. However, the Japanese version returns the error message "???
> は、BrowseNodeIdの値として無効です。値を変更してから、再度リクエストを実行してください。"

My daughter, in 2nd year college Japanese, says that the above is
basically a translation of the English boilerplate. The only variable
info is 'BrowseNodeId', which you can read just fine already.
So we do not understand what your problem is and what you want to
accomplish.

> I have a collection of sample messages here (all files *-jp-*.xml):
> http://bitbucket.org/basti/python-amazon-product-api/src/tip/tests/2009-11-01/

Is this a commercial product? Are you willing to pay for serious help,
if needed?

Terry Jan Reedy
 
Sebastian

I just wanted to know if the Japanese version said the same. I'll
probably simply return the error message in full. Any Japanese
(speaking) developer will then know what caused the exception.
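
A rough sketch of what that could look like: the exception type is chosen from the error code and the description is passed through untouched. The names `AmazonError`, `EXCEPTIONS` and `to_exception` are placeholders for the example and not necessarily what the module uses:

```python
class AmazonError(Exception):
    """Hypothetical base class keeping Amazon's code and full message."""

    def __init__(self, code, message):
        super().__init__("%s: %s" % (code, message))
        self.code = code
        self.message = message


class InvalidParameterValue(AmazonError):
    """Raised for the AWS.InvalidParameterValue error code."""


# Map error codes to exception classes; unknown codes fall back to the base.
EXCEPTIONS = {"AWS.InvalidParameterValue": InvalidParameterValue}


def to_exception(code, message):
    """Build (not raise) the exception matching an error code."""
    return EXCEPTIONS.get(code, AmazonError)(code, message)


exc = to_exception(
    "AWS.InvalidParameterValue",
    "??? は、BrowseNodeIdの値として無効です。値を変更してから、再度リクエストを実行してください。",
)
print(type(exc).__name__)
print(exc.message)
```

No knowledge of Japanese is needed this way, since only the language-independent code is inspected.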

Thanks for your help.
 
