Anyone know of a MICR parser algorithm written in Python?

mkppk · Mar 24, 2007

MICR = The line of digits printed using magnetic ink at the bottom of
a check.

Does anyone know of a Python function that has been written to parse a
line of MICR data?
Or some financial package that may contain such a thing?
Or, in general, where I should look for Python code that someone may
have already written?

I'm working on a project that involves a check scanner that produces
the raw MICR line as text.

Now, that raw MICR needs to be parsed for the various pieces of info.
The problem with MICR is that there is no standard layout. There are
some general rules for item placement, but beyond that it is up to the
individual banks to define how they choose to position the
information.

I did find an old C program written by someone at IBM... but I've read
it, and it is not code that would convert nicely to Python (maybe it's
all the Python I'm used to, but it seems very poorly written).

Here is the link to that C code: ftp://ftp.software.ibm.com/software/retail/poseng/4610/4610micr.zip

I've even tried using Boost to generate a Python module, but that
didn't go well, and in the end it is not going to be a solution for me
anyway. I really need access to the Python source.

Any help at all would be appreciated,

-mkp

Paul McGuire · Mar 24, 2007

Is there a spec somewhere for this data? Googling for "MICR data
format specification" and similar gives links to specifications for
the MICR character *fonts*, but not for the data content.

And you are right, reverse-engineering this code is more than a
10-minute exercise. (However, the zip file *does* include a nice set of
test cases, which might be better than the C code as a starting point
for new code.)

-- Paul

mkppk · Mar 24, 2007

Well, the problem is that the "specification" is that "there is no
specification". That's just the way the MICR data line has evolved in
the banking industry, unfortunately for us developers. That being
said, there are obviously enough banking companies out there with enough
example data to have intelligent parsers that handle all the
variations. And the C program appears to have all that built into it.

It's just that I would rather not reinvent the wheel (or read old C
code).

So, the search continues..

Paul McGuire · Mar 25, 2007

> It's just that I would rather not reinvent the wheel (or read old C
> code).

Wouldn't we all!

Here is the basic structure of a pyparsing solution. The parsing part
isn't so bad - the real problem is the awful ParseONUS routine in C.
Plus things are awkward since the C program parses right-to-left and
then reverses all of the found fields, and the parser I wrote works
left-to-right. Still, this grammar does most of the job. I've left
out my port of ParseONUS since it is *so* ugly, and not really part of
the pyparsing example.

-- Paul

from pyparsing import *

# define values for optional fields
NoAmountGiven = ""
NoEPCGiven = ""
NoAuxOnusGiven = ""

# define delimiters
DOLLAR = Suppress("$")
T_ = Suppress("T")
A_ = Suppress("A")

# field definitions
amt = DOLLAR + Word(nums, exact=10) + DOLLAR
onus = Word("0123456789A- ")
transit = T_ + Word("0123456789-") + T_
epc = oneOf(list(nums))
aux_onus = A_ + Word("0123456789- ") + A_

# validation parse action; pyparsing passes in the input string,
# the match location, and the matched tokens
def validateTransitNumber(s, loc, t):
    transit = t[0]
    flds = transit.split("-")
    if len(flds) > 2:
        raise ParseException(s, loc, "too many dashes in transit number")
    if len(flds) == 2:
        if len(flds[0]) not in (3, 4):
            raise ParseException(s, loc,
                                 "invalid dash position in transit number")
    else:
        # compute checksum
        ti = [int(c) for c in transit]
        if len(ti) < 9:
            raise ParseException(s, loc, "transit number too short for checksum")
        ti.reverse()  # original algorithm worked with reversed data
        cksum = (3 * (ti[8] + ti[5] + ti[2]) + 7 * (ti[7] + ti[4] + ti[1]) +
                 ti[6] + ti[3] + ti[0])
        if cksum % 10 != 0:
            raise ParseException(s, loc, "transit number failed checksum")
    return transit

# define overall MICR format, with results names
micrdata = (
    Optional(aux_onus, default=NoAuxOnusGiven).setResultsName("aux_onus")
    + Optional(epc, default=NoEPCGiven).setResultsName("epc")
    + transit.setParseAction(validateTransitNumber).setResultsName("transit")
    + onus.setResultsName("onus")
    + Optional(amt, default=NoAmountGiven).setResultsName("amt")
    + stringEnd
)

import re

def parseONUS(tokens):
    tokens["csn"] = ""
    tokens["tpc"] = ""
    tokens["account"] = ""
    tokens["amt"] = tokens["amt"][0]
    onus = tokens.onus
    # remainder omitted out of respect for newsreaders...
    # suffice to say that unspeakable acts are performed on
    # onus and aux_onus fields to extract account and
    # check numbers

micrdata.setParseAction(parseONUS)

# skip the header row of the test-case file
with open("checks.csv") as f:
    testdata = f.readlines()[1:]
tests = [(flds[1], flds) for flds in (line.split(",") for line in testdata)]

def verifyResults(res, csv):
    def match(x, y):
        print(x == y and "_" or "X", x, "=", y)

    (Ex, MICR, Bank, Stat, Amt, AS, TPC, TS, CSN, CS,
     ACCT, AS, EPC, ES, ONUS, OS, AUX, AS, Tran, TS) = csv
    match(res.amt, Amt)
    match(res.account, ACCT)
    match(res.csn, CSN)
    match(res.onus, ONUS)
    match(res.tpc, TPC)
    match(res.epc, EPC)
    match(res.transit, Tran)

for t, data in tests:
    print(t)
    try:
        res = micrdata.parseString(t)
        print(res.dump())
        if not (data[0] == "No"):
            print("Passed expression that should have failed")
        verifyResults(res, data)
    except ParseException as pe:
        print("<parse failed> %s" % pe.msg)
        if not (data[0] == "Yes"):
            print("Failed expression that should have passed")
    print()
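
The checksum inside validateTransitNumber can also be tried on its own, without pyparsing. The following is only a minimal sketch of that one step: the function name routing_checksum_ok is made up for illustration, and it assumes a plain nine-digit transit number with no dash. The weights 3, 7, 1 applied left to right are the same as the reversed-index formula in the post above.

```python
def routing_checksum_ok(transit: str) -> bool:
    """Return True if a 9-digit transit number passes the 3-7-1 weighted check.

    Hypothetical helper, not part of the posted grammar; dashed transit
    numbers are simply rejected here.
    """
    if len(transit) != 9 or any(ch not in "0123456789" for ch in transit):
        return False
    digits = [int(ch) for ch in transit]
    weights = [3, 7, 1] * 3  # left-to-right equivalent of the reversed formula
    return sum(w * d for w, d in zip(weights, digits)) % 10 == 0


if __name__ == "__main__":
    # 021000021 satisfies the check: 7*(2+2) + (1+1) = 30
    print(routing_checksum_ok("021000021"))
    print(routing_checksum_ok("021000022"))
```

Working left to right avoids the reverse() call; the sum is identical because each digit keeps the same weight.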

mkppk · Mar 25, 2007

Great, thanks for taking a look, Paul. I had never tried to use
pyparsing before. Yeah, the ONUS field is crazy; I don't know why there
is no standard for it.
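
For readers who have not used pyparsing either, the same top-level layout (optional A...A auxiliary on-us, optional one-digit EPC, T...T transit, on-us, optional $...$ amount) can be roughly approximated with the standard re module. This is a sketch under assumptions, not a port of Paul's grammar or of the IBM code: the names MICR_LAYOUT and split_micr are invented, the sample line is made up, and the on-us field is returned unparsed, just as in the posted example.

```python
import re

# Assumed layout, mirroring the field order of the pyparsing grammar above.
MICR_LAYOUT = re.compile(
    r"""
    \s*
    (?:A(?P<aux_onus>[0-9\- ]+)A\s*)?   # optional auxiliary on-us field
    (?P<epc>[0-9])?\s*                  # optional one-digit EPC
    T(?P<transit>[0-9\-]+)T\s*          # transit number between T delimiters
    (?P<onus>[0-9A\- ]+?)\s*            # on-us field, left unparsed
    (?:\$(?P<amt>[0-9]{10})\$\s*)?      # optional 10-digit amount
    \Z
    """,
    re.VERBOSE,
)


def split_micr(line: str) -> dict:
    """Split a raw MICR line into its top-level fields ("" when absent)."""
    m = MICR_LAYOUT.match(line)
    if m is None:
        raise ValueError("line does not fit the assumed MICR layout")
    return {name: (value or "") for name, value in m.groupdict().items()}


if __name__ == "__main__":
    # Invented sample line, not taken from the IBM test cases.
    print(split_micr("A123456A 5 T021000021T 1234567A 0101 $0000012345$"))
```

Extracting the account and check numbers from the on-us field is the bank-specific part and is deliberately not attempted here.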
