T
Travis Parks
I am writing a simple algorithms library that I want to work for both
Python 2.7 and 3.x. I am writing some functions like distinct, which
work with dictionaries under the hood. The problem I ran into is that
I am calling itervalues or values depending on which version of the
language I am working in. Here is the code I wrote to overcome it:
import sys
def getDictValuesFoo():
if sys.version_info < (3,):
return dict.itervalues
else:
return dict.values
getValues = getDictValuesFoo()
def distinct(iterable, keySelector = (lambda x: x)):
lookup = {}
for item in iterable:
key = keySelector(item)
if key not in lookup:
lookup[key] = item
return getValues(lookup)
I was surprised to learn that getValues CANNOT be called as if it were
a member of dict. I figured it was more efficient to determine what
getValues was once rather than every time it was needed.
First, how can I make the method getValues "private" _and_ so it only
gets evaluated once? Secondly, will the body of the distinct method be
evaluated immediately? How can I delay building the dict until the
first value is requested?
I noticed that hashing is a lot different in Python than it is in .NET
languages. .NET supports custom "equality comparers" that can override
a type's Equals and GetHashCode functions. This is nice when you can't
change the class you are hashing. That is why I am using a key
selector in my code, here. Is there a better way of overriding the
default hashing of a type without actually modifying its definition? I
figured a requesting a key was the easiest way.
Python 2.7 and 3.x. I am writing some functions like distinct, which
work with dictionaries under the hood. The problem I ran into is that
I am calling itervalues or values depending on which version of the
language I am working in. Here is the code I wrote to overcome it:
import sys
def getDictValuesFoo():
if sys.version_info < (3,):
return dict.itervalues
else:
return dict.values
getValues = getDictValuesFoo()
def distinct(iterable, keySelector = (lambda x: x)):
lookup = {}
for item in iterable:
key = keySelector(item)
if key not in lookup:
lookup[key] = item
return getValues(lookup)
I was surprised to learn that getValues CANNOT be called as if it were
a member of dict. I figured it was more efficient to determine what
getValues was once rather than every time it was needed.
First, how can I make the method getValues "private" _and_ so it only
gets evaluated once? Secondly, will the body of the distinct method be
evaluated immediately? How can I delay building the dict until the
first value is requested?
I noticed that hashing is a lot different in Python than it is in .NET
languages. .NET supports custom "equality comparers" that can override
a type's Equals and GetHashCode functions. This is nice when you can't
change the class you are hashing. That is why I am using a key
selector in my code, here. Is there a better way of overriding the
default hashing of a type without actually modifying its definition? I
figured a requesting a key was the easiest way.