newbie - merging xls files using xldt and xlwt

A

Albert-jan Roskam

Hi,

I wrote the program below to merge all xls files in a given directory into one multisheet xls file. It uses xlwt and xlrd. The xls files I use for input are generated by Spss. When I open and re-save the files in Excel, the program works, but when I use the xls files as they were created by Spss, the program returns an error message (see below). Some details: Python 2.4, xlwt version 0.7.0, xlrd version 0.5.2, Win NT.

Any ideas?

Thanks in advance!

Albert-Jan


"""
Merge all xls files in a given directory into one multisheet xls file.
The sheets get the orginal file name, without the extension.
File names should not exceed 31 characters, as this is the maximum
sheet name length
"""

import xlrd, xlwt
import glob, os.path

def merge_xls (in_dir, out_file="d:/merged_output.xls"):

xls_files = glob.glob(in_dir + "*.xls")
sheet_names = []
merged_book = xlwt.Workbook()

[sheet_names.append(os.path.basename(v)[:-4]) for k, v in enumerate(xls_files)]
for k, xls_file in enumerate(xls_files):
if len (sheet_names[k]) <= 31:
book = xlrd.open_workbook(xls_file)
ws = merged_book.add_sheet(sheet_names[k])
for sheetx in range(book.nsheets):
sheet = book.sheet_by_index(sheetx)
for rx in range(sheet.nrows):
for cx in range(sheet.ncols):
ws.write(rx, cx, sheet.cell_value(rx, cx))
else:
print "File name too long: <%s.xls> (maximum is 31 chars) " % (sheet_names[k])
print "File <%s.xls> is *not* included in the merged xls file." % (sheet_names[k])
merged_book.save(out_file)

print "---> Merged xls file written to %s using the following source files: " % (out_file)
for k, v in enumerate(sheet_names):
if len(v) <= 31: print "\t", str(k+1).zfill(3), "%s.xls" % (v)

merge_xls(in_dir="d:/temp/")

*** WARNING: OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero ***
put_cell 0 1

Traceback (most recent call last):
File "G:\generic_syntaxes\merge_xls.py", line 37, in -toplevel-
merge_xls(in_dir="d:/temp/")
File "G:\generic_syntaxes\merge_xls.py", line 21, in merge_xls
book = xlrd.open_workbook(xls_file)
File "C:\Python24\lib\site-packages\xlrd\__init__.py", line 139, in open_workbook
bk.get_sheets()
File "C:\Python24\lib\site-packages\xlrd\__init__.py", line 389, in get_sheets
sht = self.get_sheet(sheetno)
File "C:\Python24\lib\site-packages\xlrd\__init__.py", line 379, in get_sheet
sh.read(self)
File "C:\Python24\lib\site-packages\xlrd\sheet.py", line 285, in read
self.put_cell(rowx, colx, XL_CELL_TEXT, bk._sharedstrings[index])
File "C:\Python24\lib\site-packages\xlrd\sheet.py", line 214, in put_cell
self._cell_types[rowx][colx] = ctype
IndexError: list assignment index out of range
 
J

John Machin

Hi,

I wrote the program below to merge all xls files in a given directory into one multisheet xls file. It uses xlwt and xlrd. The xls files I use for input are generated by Spss. When I open and re-save the files in Excel, the program works, but when I use the xls files as they were created by Spss, the program returns an error message (see below). Some details: Python 2.4, xlwt version 0.7.0, xlrd version 0.5.2, Win NT.

Any ideas?

The version of xlrd that you are using is an antique. Go to
http://pypi.python.org/pypi/xlrd and get the latest version. If that
doesn't work, send me a copy of one of the files that is displaying
this problem.
Thanks in advance!

Albert-Jan

"""
Merge all xls files in a given directory into one multisheet xls file.
The sheets get the orginal file name, without the extension.
File names should not exceed 31 characters, as this is the maximum
sheet name length
"""

import xlrd, xlwt
import glob, os.path

def merge_xls (in_dir, out_file="d:/merged_output.xls"):

    xls_files   = glob.glob(in_dir + "*.xls")
    sheet_names = []
    merged_book = xlwt.Workbook()

    [sheet_names.append(os.path.basename(v)[:-4]) for k, v in enumerate(xls_files)]

Wah! Try this:
sheet_names = [os.path.basename(v)[:-4]) for v in xls_files]

    for k, xls_file in enumerate(xls_files):
        if len (sheet_names[k]) <= 31:
            book = xlrd.open_workbook(xls_file)
            ws = merged_book.add_sheet(sheet_names[k])
            for sheetx in range(book.nsheets):
                sheet = book.sheet_by_index(sheetx)
                for rx in range(sheet.nrows):
                    for cx in range(sheet.ncols):
                         ws.write(rx, cx, sheet..cell_value(rx, cx))

I presume that you're not too worried about any date data.

If an input file has more than 1 sheet, you are creating only one
sheet in the output file, and overwriting cells.
        else:
[snip]

merge_xls(in_dir="d:/temp/")

*** WARNING: OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero ***
put_cell 0 1

Traceback (most recent call last): [snip]
    self._cell_types[rowx][colx] = ctype
IndexError: list assignment index out of range

Yeah, symptom of an xls file with a DIMENSIONS records that lies ...
xlrd has like Excel become more resilient over time :)

BTW, consider joining the python-excel group at http://groups.google.com/group/python-excel

Cheers,
John
 
S

Support Desk

Hello all,

I was wondering if it would be possible to make a script to grab my
balance account balance a few times a day without having to login every
time. I know I can use the urlib2 library, but not sure how to go about
filling in the forms and submitting them. BOA has a mobile site that is
pretty simple. Anyone else use Bank of America and would be interested in
this. This is not for anything illegal, just for me to prevent overdrafting
my account

https://sitekey.bankofamerica.com/sas/signonScreen.do?isMobileDevice=true



y =
urllib.urlopen('https://sitekey.bankofamerica.com/sas/signonScreen.do?isMobl
eDevice=true',urllib.urlencode({'onlineID':'MYONLLINEID'})).readlines()
 
S

Support Desk

I was also looking at the ClientForm Library
http://wwwsearch.sourceforge.net/ClientForm/

which can get me past the first username form, but I noticed it then goes
to a challenge question form and im not sure how to take the resulting for
and resubmit it with new information and then resubmit the resulting form
with the password

-----Original Message-----
From: (e-mail address removed) [mailto:[email protected]] On Behalf Of Chris
Rebert
Sent: Thursday, October 16, 2008 10:15 AM
To: Support Desk
Cc: (e-mail address removed)
Subject: Re: account balance checker

You'd probably have to use something like mechanize
(http://wwwsearch.sourceforge.net/mechanize/) to fill out the forms,
but if BofA's website uses Javascript at all, you're probably out of
luck.

Cheers,
Chris
 
J

Jules Stevenson

Hello all,

I've been tearing my hair out trying to get pylon installed most of the day,
and it seems that both setup tools and paster.exe have some serious issues
with 64bit on windows.

Unfortunately I'm stuck with 2.6 64bit.

I think I've got it nearly all up and running, the biggest problem is that
paster fails to find the python executable when run, as it is 32bit and the
python executable is 64 [afaik]. Does anyone either have a 64bit compile, or
know where the source code for paster.exe lives so I can attempt to compile
it myself. It doesn't appear to be in the source for PasterScript package.
Somehow it 'appears' when setup .py install is run, but I cannot find any c
code anywhere.

Many thanks in advance

Jules
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,962
Messages
2,570,134
Members
46,690
Latest member
MacGyver

Latest Threads

Top