Retrieve unicode keys from the registry

Thomas Heller

First I was astonished to see that _winreg.QueryValue doesn't accept
unicode key names, then I came up with this pattern:

def RegQueryValue(root, subkey):
    if isinstance(subkey, unicode):
        return _winreg.QueryValue(root, subkey.encode("mbcs"))
    return _winreg.QueryValue(root, subkey)

Does this look ok?

Thomas
 
Neil Hodgson

Thomas Heller:
> def RegQueryValue(root, subkey):
>     if isinstance(subkey, unicode):
>         return _winreg.QueryValue(root, subkey.encode("mbcs"))
>     return _winreg.QueryValue(root, subkey)
>
> Does this look ok?

It will fail for keys that cannot be encoded in your current code page.
That will not often be a problem, but if you want to be safe then access to
the wide version is needed.

Neil
 
Thomas Heller

Neil Hodgson said:
> It will fail for keys that can not be encoded in your current code page.
> That will not often be a problem but if you want to be safe then access to
> the wide version is needed.

In the actual use case I have, I'm quite sure that the subkeys are coded
in latin-1:

# -*- coding: latin-1 -*-
LOG_KEYS = [u"DSC-S Cs Gun Emitter [Ah]",
            u"Ga Gun Beam Defining Aperture [µAh]"]

and in my tests it worked.
Even if I wrote this:

# -*- coding: latin-1 -*-
LOG_KEYS = ["DSC-S Cs Gun Emitter [Ah]",
            "Ga Gun Beam Defining Aperture [µAh]"]

But, assume that I wanted to provide a patch to Python (or implement in
ctypes) so that QueryValue accepts unicode subkey names (I was
astonished to find out that it does not). How could this be done,
hopefully portable between NT/2000/XP and 98, and assuming unicows.dll
is not installed - so the wide version is not available on 98?

My understanding is that it's possible to convert any (for a certain
definition of 'any') unicode string into a byte string (because the
encoding for the byte string can be specified), but that it's impossible
to convert a byte string into unicode unless the byte string's encoding
is known.
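
A minimal sketch of that asymmetry, written for Python 3 (where `str` is the unicode type and `bytes` the byte string) rather than the Python 2 used in this thread. The helper `try_decode` and the choice of codecs are illustrative assumptions only:

```python
text = "[\u00b5Ah]"  # "[µAh]", as in the key names above

# Encoding: the codec is chosen explicitly; UTF-8 can represent any text.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

# Decoding: the same bytes give different text depending on the codec assumed.
as_latin1 = latin1_bytes.decode("latin-1")  # round-trips to the original
as_cp437 = latin1_bytes.decode("cp437")     # byte 0xB5 means something else here


def try_decode(data, codec):
    """Return the decoded text, or None if the bytes are invalid in that codec."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError:
        return None


# A lone 0xB5 byte is not valid UTF-8, so this guess fails outright.
as_utf8 = try_decode(latin1_bytes, "utf-8")

print(repr(as_latin1), repr(as_cp437), repr(as_utf8))
```

Note that this only covers conversions where the caller picks the codec; it says nothing about which code page the ANSI registry calls use.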

Maybe this only shows my un-understanding of unicode...

Thanks,

Thomas
 
Neil Hodgson

Thomas Heller:
> In the actual use case I have, I'm quite sure that the subkeys are coded
> in latin-1:

And you are also sure that your locale will always use a latin-1 code
page or another code page similar enough to work on your keys.

> But, assume that I wanted to provide a patch to Python (or implement in
> ctypes) so that QueryValue accepts unicode subkey names (I was
> astonished to find out that it does not). How could this be done,
> hopefully portable between NT/2000/XP and 98, and assuming unicows.dll
> is not installed - so the wide version is not available on 98?

This is somewhat unpleasant, as it requires runtime conditional code. In the
Python standard library, posixmodule.c handles this wherever it needs to
branch: it first checks that the OS is capable of wide calls with
unicode_file_names(), then checks whether the argument is Unicode, and if
so calls the wide system API. While the wide APIs do not work on 9x,
they are present, so the executable will still load.
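
The branching described above might look roughly like this in Python. This is only a sketch of the pattern, not the code in posixmodule.c: `query_value`, `wide_call`, `ansi_call` and `os_has_wide_api` are hypothetical names, and it is written for Python 3, where `str` is the Unicode type:

```python
def query_value(root, subkey, wide_call, ansi_call, os_has_wide_api,
                codepage="mbcs"):
    """Pick the wide or the ANSI call at run time.

    wide_call, ansi_call and os_has_wide_api are hypothetical callables
    supplied by the caller; they are not real _winreg or Win32 functions.
    "mbcs" is a Windows-only codec, so pass another codepage elsewhere.
    """
    # First check the OS, then the argument type.
    if os_has_wide_api() and isinstance(subkey, str):
        return wide_call(root, subkey)
    if isinstance(subkey, str):
        # No wide API: strict encoding raises UnicodeEncodeError rather
        # than passing a mangled name to the ANSI call.
        subkey = subkey.encode(codepage)
    return ansi_call(root, subkey)


# Fake calls that only report which branch was taken.
def fake_wide(root, subkey):
    return ("W", subkey)


def fake_ansi(root, subkey):
    return ("A", subkey)


on_nt = query_value("HKCU", "[\u00b5Ah]", fake_wide, fake_ansi, lambda: True)
on_9x = query_value("HKCU", "[\u00b5Ah]", fake_wide, fake_ansi, lambda: False,
                    codepage="cp1252")
print(on_nt, on_9x)
```

Passing the calls in as parameters keeps the sketch runnable on any platform; a real implementation would bind them to the actual wide and ANSI entry points.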

One limitation on the Unicode support in posixmodule is that it doesn't
try to detect and use MSLU on Windows 9x/Me.

> My understanding is that it's possible to convert any (for a certain
> definition of 'any') unicode string into a byte string (because the
> encoding for the byte string can be specified).

Then you have to choose the encoding and switch the current locale to
that encoding as there is no encoding or locale parameter to RegQueryValue.
Values with characters from different languages may not be encodable into
any non-Unicode encoding.
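
As a rough illustration of that last point, the following Python 3 sketch tries a handful of code pages against a name that mixes two scripts. The list of code pages and the helper `encodable_code_pages` are assumptions chosen for the example, not anything the registry API exposes:

```python
# A few Windows code pages; the selection is arbitrary.
CODE_PAGES = ["cp1252", "cp1251", "cp1253", "cp1255", "cp932"]


def encodable_code_pages(text, code_pages=CODE_PAGES):
    """Return the code pages that can encode every character of text."""
    usable = []
    for codepage in code_pages:
        try:
            text.encode(codepage)  # strict: raises on any unencodable character
        except UnicodeEncodeError:
            continue
        usable.append(codepage)
    return usable


japanese = "\u65e5\u672c\u8a9e"             # Japanese only
hebrew = "\u05e2\u05d1\u05e8\u05d9\u05ea"   # Hebrew only
mixed = japanese + " " + hebrew             # both scripts in one name

print(encodable_code_pages(japanese))
print(encodable_code_pages(hebrew))
print(encodable_code_pages(mixed))  # empty: no listed code page covers both
```

Each single-script name fits at least one code page, but the mixed name fits none of them, so no locale switch would help for such a key.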

Neil
 
Martin v. Löwis

Neil said:
> This is somewhat unpleasant, requiring runtime conditional code. For the
> Python standard library, in posixmodule.c, places where there is a need to
> branch first check that the OS is capable of wide calls with
> unicode_file_names(), then check if the argument is Unicode and if it is
> then it calls the wide system API. While the wide APIs do not work on 9x,
> they are present so the executable will still load.

I believe the story would be completely different for the registry: the
wide registry functions are available on all Windows versions, AFAIK.

Contributions are welcome.

Regards,
Martin
 
Martin v. Löwis

Thomas said:
> My understanding is that it's possible to convert any (for a certain
> definition of 'any') unicode string into a byte string (because the
> encoding for the byte string can be specified), but that it's impossible
> to convert a byte string into unicode unless the byte string's encoding
> is known.
>
> Maybe this only shows my un-understanding of unicode...

It certainly is. It is possible to create a registry key which the
"ANSI" API (RegQueryValueExA) cannot find - i.e. where you truly
need the "wide" API (RegQueryValueExW).

An example would be a registry key with Cyrillic, Greek, or Japanese
characters on your system. Encoding them as "mbcs" will convert those
characters into question marks (which is a bug in itself - Python
should raise an exception instead).
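
A small Python 3 sketch of why the question marks matter. Since the "mbcs" codec exists only on Windows, "cp1252" with the `"replace"` error handler is used here as a stand-in for the lossy behaviour described above; that substitution, and the helper names, are assumptions of the example:

```python
CODEPAGE = "cp1252"  # assumption: stand-in for the Windows-only "mbcs" codec


def encode_lossy(name, codepage=CODEPAGE):
    """Encode like the behaviour described above: unencodable chars become '?'."""
    return name.encode(codepage, "replace")


def encode_strict(name, codepage=CODEPAGE):
    """Encode strictly: raises UnicodeEncodeError on unencodable characters."""
    return name.encode(codepage)


greek = "\u03a9\u03bc\u03ad\u03b3\u03b1"     # a Greek key name
cyrillic = "\u041e\u043c\u0435\u0433\u0430"  # a Cyrillic key name

# Two different key names collapse into the same bytes, so an ANSI lookup
# could not tell them apart (or find either one).
print(encode_lossy(greek), encode_lossy(cyrillic))

try:
    encode_strict(greek)
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
print(strict_failed)
```

With strict encoding the caller at least learns that the name cannot be passed to the ANSI call.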

Regards,
Martin
 
Neil Hodgson

Martin v. Löwis:
> I believe the story would be completely different for the registry: the
> wide registry functions are available on all Windows versions, AFAIK.

The documentation for RegQueryValue says:
""" Unicode: Implemented as Unicode and ANSI versions. Note that Unicode
support on Windows Me/98/95 requires Microsoft Layer for Unicode. """

Neil
 
Martin v. Löwis

Neil said:
> The documentation for RegQueryValue says:
> """ Unicode: Implemented as Unicode and ANSI versions. Note that Unicode
> support on Windows Me/98/95 requires Microsoft Layer for Unicode. """

Too bad. Perhaps we should link Python with MSLU?

Regards,
Martin
 
Neil Hodgson

Martin v. Löwis:
> Too bad. Perhaps we should link Python with MSLU?

I don't have any experience with it and fear it would introduce a new
class of bug where the wide API is incompletely supported.

Neil
 
Thomas Heller

Martin v. Löwis said:
> Too bad. Perhaps we should link Python with MSLU?

I don't know how to interpret the license that's contained in
unicows.exe.

REDIST.TXT is this:

<quote>
===============================================
Microsoft Layer for Unicode on Windows 95/98/ME
===============================================

In addition to the rights granted in Section 1 of the Agreement
("Agreement"), with respect to UNICOWS.DLL, you have the following
non-exclusive, royalty free rights subject to the Distribution
Requirements detailed in Section 1 of the Agreement:

(1) You may distribute UNICOWS.DLL with the following: Windows 95,
Windows 98, Windows 98 Second Edition, Windows Millennium, Windows NT4,
Windows 2000, Windows XP, and Windows Server 2003.
<end quote>

Since we're not distributing windows ;-) I don't know what this means.

And LICENSE.TXT contains this:

<quote>
* Distribution Terms. You may reproduce and distribute an unlimited
number of copies of the Sample Code and/or Redistributable Code
(collectively "Redistributable Components") as described above in object
code form, provided that

(a) you distribute the Redistributable Components only in conjunction
with and as a part of your Application solely for use with a Microsoft
Operating System Product;

[...]

(c) you distribute your Application containing the Redistributable
Components pursuant to an End-User License Agreement (which may be
"break-the-seal", "click-wrap" or signed), with terms no less protective
than those contained herein;

(d) you do not permit further redistribution of the Redistributable
Components by your end-user customers; (e) you do not use """
<end quote>

Thomas
 
