Extracting Rich Text data formats from win32clipboard

Trader

Hi,

I'm trying to use Mark Hammond's win32clipboard module to extract more
complex data than just plain ASCII text from the Windows clipboard.
For instance, when you select all the content on a web page, you can
paste it into an app like FrontPage, or anything else that is rich-text
aware, and it will preserve all the formatting, HTML, etc. I'd like to
include that behavior in the application I'm writing.

In the interactive session below, before I run the clipboard_grab()
function, I've selected all of the www.google.com homepage in IE and
hit Control-C. The function cycles through all the formats stored on
the clipboard and loads up a data list with each type it finds.

Here's where it gets interesting: while data[2] is the textual data
that I would expect to see if I pasted the clipboard in a Notepad
file, data[0] and data[1] are in a weird, non-ASCII (binary?) format.
Are these pointers to (or metadata for) the actual HTML or rich text?
How do I use this data? Is there a reference I can use that will help
me decipher this information? Any help would be greatly appreciated.

Thanks!

----

Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys, traceback, win32clipboard
>>> def clipboard_grab():
...     global format, formats, data
...     win32clipboard.OpenClipboard()
...     format = 1
...     formats = []
...     data = []
...     while 1:
...         format = win32clipboard.EnumClipboardFormats(format)
...         print "FORMAT:", format
...         if not format:
...             break
...         try:
...             datum = win32clipboard.GetClipboardData(format)
...             formats.append(format)
...             data.append(datum)
...         except:
...             print format, traceback.format_exception(sys.exc_type,
...                 sys.exc_value, sys.exc_traceback)
...     win32clipboard.EmptyClipboard()
...     win32clipboard.CloseClipboard()
...
>>> clipboard_grab()
FORMAT: 49171
FORMAT: 16
FORMAT: 7
FORMAT: 0
>>> data[0]
'\x00\x00\x00\x00\x18\x01\x00\x00\x01\x00\x00\x00\x06\x00\x00\x00\x00\x00\x00\x0
0\x00\x00\x00\x00\xe3\xc0\xc2w\x00\x00\x00\x00\x01\x00\x00\x00\xff\xff\xff\xff\x
01\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xa2\xc0\xe9\x02\x
00\x00\x00\x00\x01\x00\x00\x00\xff\xff\xff\xff\x01\x00\x00\x00\x01\x00\x00\x00\x
00\x00\x00\x00\x00\x00\x00\x00K\xc1\xc2w\x00\x00\x00\x00\x01\x00\x00\x00\xff\xff
\xff\xff\x01\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00L\xc1\xc
2w\x00\x00\x00\x00\x01\x00\x00\x00\xff\xff\xff\xff\x01\x00\x00\x00\x01\x00\x00\x
00\x00\x00\x00\x00\x00\x00\x00\x00\r\x00\xc2w\x00\x00\x00\x00\x01\x00\x00\x00\xf
f\xff\xff\xff\x01\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0
1\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\xff\xff\xff\xff\x01\x00\x00\x00\x0
1\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0
0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0
0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0
0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> data[1]
'\t\x04\x00\x00'
>>> data[2]
'\r\n \tWeb\t \tImages\t \tGroups\t \tDirectory\t \tNews\t \r\n\r\n
\t\r\n\t \x0
7 Advanced Search\r\n \x07 Preferences\r\n \x07 Language
Tools\r\n\r\n\r\nAdvert
ise with Us - Business Solutions - Services & Tools - Jobs, Press, &
Help\r\n\r\
nc2003 Google - Searching 3,307,998,701 web pages'
 
Michael Geary

clipboard_grab()
FORMAT: 49171
FORMAT: 16
FORMAT: 7

7 = CF_OEMTEXT
16 = CF_LOCALE
49171 = 0xC013 = apparently OLE private data
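
As a small illustration of what the CF_LOCALE entry holds: its payload is a locale identifier (LCID), a 32-bit little-endian integer. The sketch below decodes the four bytes shown as data[1] in the session above. It is written for a current Python rather than the 2.3 interpreter in the transcript, and the variable names are only illustrative.

```python
import struct

# The four bytes shown for data[1] in the session above.
raw_locale = b"\t\x04\x00\x00"

# CF_LOCALE holds a locale identifier (LCID): a 32-bit little-endian integer.
(lcid,) = struct.unpack("<I", raw_locale)

# Within the language ID, the low 10 bits are the primary language
# and the next 6 bits are the sublanguage.
primary_language = lcid & 0x3FF
sublanguage = (lcid >> 10) & 0x3F

print(hex(lcid), primary_language, sublanguage)  # prints: 0x409 9 1
```

0x0409 is the identifier for US English, so this entry only describes the language of the text on the clipboard. It is not a pointer to the HTML or rich text.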

That should help you with some searches. Basically the CF_OEMTEXT is the
only one that's going to be useful for you, unless you can figure out what
to do with the OLE private data.
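
One thing that may help with the searches: IDs in the range 0xC000 to 0xFFFF are registered formats, and each one has a name that win32clipboard.GetClipboardFormatName can look up. Also, EnumClipboardFormats returns the format that follows the one passed in, so starting the loop at 1 instead of 0 can skip CF_TEXT and anything listed before it. Below is a rough sketch of an enumeration that starts at 0 and reports names. The function name list_clipboard_formats and the small lookup table are made up for this example, and the clipboard module is passed in as a parameter so the code can be imported on a machine without pywin32. It targets a current Python, not 2.3.

```python
# Standard format IDs mentioned in this thread (values from winuser.h).
STANDARD_FORMATS = {
    1: "CF_TEXT",
    7: "CF_OEMTEXT",
    13: "CF_UNICODETEXT",
    16: "CF_LOCALE",
}


def list_clipboard_formats(clip):
    """Return (id, name) pairs for every format currently on the clipboard.

    `clip` is the win32clipboard module, or any object that offers the
    same four functions (an assumption made to keep the sketch testable).
    """
    found = []
    clip.OpenClipboard()
    try:
        fmt = 0  # 0 asks for the first available format
        while True:
            fmt = clip.EnumClipboardFormats(fmt)
            if fmt == 0:
                break  # end of the list
            if fmt in STANDARD_FORMATS:
                name = STANDARD_FORMATS[fmt]
            elif fmt >= 0xC000:
                # Registered formats (0xC000-0xFFFF) carry a name string.
                name = clip.GetClipboardFormatName(fmt)
            else:
                name = "standard format not in the table above"
            found.append((fmt, name))
    finally:
        # Always release the clipboard, even if a call above fails.
        clip.CloseClipboard()
    return found
```

On Windows with pywin32 installed, this would be called as list_clipboard_formats(win32clipboard) after importing win32clipboard. Unlike the original function, it does not call EmptyClipboard, so the copied data stays available.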

-Mike
 
Trader

Thanks for your help, Neil! Your example code gave me an idea what I
should be seeing when the HTML/RTF stuff is working properly. I'd
been using a non-IE browser (Firebird) for testing, and it wasn't
giving me those results. Thanks for getting me on track! Trader
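
For anyone landing here later, here is a minimal sketch of reading the HTML once a browser does put it on the clipboard. It assumes the data arrives in the registered "HTML Format" layout: a short text header with Version, StartHTML, EndHTML, StartFragment and EndFragment fields, where the numbers are byte offsets from the start of the payload. The helper names extract_html_fragment and build_sample_payload are invented for this example, the sample payload is made up rather than captured from a real clipboard, and the code targets a current Python.

```python
import re


def extract_html_fragment(payload):
    """Return the selected fragment from 'HTML Format' clipboard bytes."""
    offsets = {}
    for field in (b"StartFragment", b"EndFragment"):
        match = re.search(field + rb":(\d+)", payload)
        if match is None:
            raise ValueError("header field not found: %r" % field)
        offsets[field] = int(match.group(1))
    # The offsets count bytes from the very start of the payload.
    fragment = payload[offsets[b"StartFragment"]:offsets[b"EndFragment"]]
    return fragment.decode("utf-8")


def build_sample_payload(fragment):
    """Build a made-up payload for the demo, computing the offsets."""
    template = (b"Version:0.9\r\nStartHTML:%08d\r\nEndHTML:%08d\r\n"
                b"StartFragment:%08d\r\nEndFragment:%08d\r\n")
    before = b"<html><body><!--StartFragment-->"
    after = b"<!--EndFragment--></body></html>"
    # Fixed-width numbers keep the header length independent of the values.
    header_len = len(template % (0, 0, 0, 0))
    start_fragment = header_len + len(before)
    end_fragment = start_fragment + len(fragment)
    end_html = end_fragment + len(after)
    header = template % (header_len, end_html, start_fragment, end_fragment)
    return header + before + fragment + after


sample = build_sample_payload(b"<b>Advanced Search</b>")
print(extract_html_fragment(sample))  # prints: <b>Advanced Search</b>
```

With pywin32, the real payload would presumably come from win32clipboard.GetClipboardData(win32clipboard.RegisterClipboardFormat("HTML Format")) while the clipboard is open. On older Python versions the result may be a str rather than bytes, so treat that part as untested.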
 
