How to avoid checking the same condition repeatedly?

Wolfgang Maier

Dear all,
this is a recurring programming problem that I'm just not sure how to solve
optimally, so I thought I'd ask for your advice:
imagine you have a flag set somewhere earlier in your code, e.g.,

needs_processing = True

then in a for loop you're processing the elements of an iterable, but the
kind of processing depends on the flag, e.g.,:

    for elem in iterable:
        if needs_processing:
            pre_process(elem)  # reformat elem in place
        print(elem)

this checks the condition every time through the for loop, even though there
is no chance for needs_processing to change inside the loop, which does not
look very efficient. Of course, you could rewrite the above as:

    if needs_processing:
        for elem in iterable:
            pre_process(elem)  # reformat elem in place
            print(elem)
    else:
        for elem in iterable:
            print(elem)

but this means unnecessary code-duplication.

You could also define functions (or class methods):
    def pre_process_and_print(item):
        pre_process(item)
        print(item)

    def raw_print(item):
        print(item)

then:
    process = pre_process_and_print if needs_processing else raw_print
    for elem in iterable:
        process(elem)

but while this works for the simple example here, it becomes complicated if
pre_process requires more information to do its job because then you will
have to start passing around (potentially lots of) arguments.
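
One way to keep that argument passing contained is to bind the extra arguments once, before the loop, with `functools.partial`. The following is only a minimal sketch: the `width` and `fill` parameters, the body of `pre_process`, and the `choose_handler` helper are hypothetical and not part of the original question.

```python
from functools import partial

def pre_process(elem, width, fill):
    """Hypothetical in-place reformat: pad each string in the list `elem`."""
    for i, text in enumerate(elem):
        elem[i] = text.rjust(width, fill)

def pre_process_and_print(elem, pre):
    pre(elem)      # `pre` already carries its extra arguments
    print(elem)

def choose_handler(needs_processing, width=4, fill="."):
    # Decide once, outside the loop. The extra arguments are bound here
    # and never have to be threaded through the loop body.
    if needs_processing:
        bound = partial(pre_process, width=width, fill=fill)
        return partial(pre_process_and_print, pre=bound)
    return print

process = choose_handler(needs_processing=True)
for elem in (["a", "bb"], ["ccc"]):
    process(elem)
```

The loop itself stays a single `process(elem)` call no matter how many arguments `pre_process` needs.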

So my question is: is there an agreed-upon generally best way of dealing
with this?

Thanks for your help,
Wolfgang
 
Steven D'Aprano

Dear all,
this is a recurring programming problem that I'm just not sure how to
solve optimally, so I thought I'd ask for your advice: imagine you have
a flag set somewhere earlier in your code, e.g.,

needs_processing = True

then in a for loop you're processing the elements of an iterable, but
the kind of processing depends on the flag, e.g.,:

    for elem in iterable:
        if needs_processing:
            pre_process(elem)  # reformat elem in place
        print(elem)

this checks the condition every time through the for loop, even though
there is no chance for needs_processing to change inside the loop, which
does not look very efficient.

If there is absolutely no chance the flag could change, then the best way
is to test the flag once:

    if needs_processing:
        for elem in iterable:
            pre_process(elem)  # reformat elem in place
            print(elem)
    else:
        for elem in iterable:
            print(elem)


The amount of duplication here is trivial. Sometimes you can avoid even
that:

    if needs_preprocessing:
        preprocess(sequence)
    process(sequence)


This is especially useful if you can set up a pipeline of generators:


    if flag1:
        data = (preprocess(item) for item in data)
    if flag2:
        data = (process(item) for item in data)
    if flag3:
        data = (postprocess(item) for item in data)
    for item in data:
        print(item)
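
A runnable sketch of the same pipeline idea follows. It assumes each stage returns the transformed item rather than modifying it in place; the bodies of `preprocess`, `process` and `postprocess` and the `build_pipeline` helper are hypothetical stand-ins.

```python
def preprocess(item):
    return item.strip()       # hypothetical stage: trim whitespace

def process(item):
    return item.upper()       # hypothetical stage: upper-case

def postprocess(item):
    return item + "!"         # hypothetical stage: add a suffix

def build_pipeline(data, flag1, flag2, flag3):
    # Each flag is tested exactly once. The generator expressions are lazy,
    # so no stage runs until the final loop pulls items through.
    if flag1:
        data = (preprocess(item) for item in data)
    if flag2:
        data = (process(item) for item in data)
    if flag3:
        data = (postprocess(item) for item in data)
    return data

for item in build_pipeline([" spam ", " eggs "], True, False, True):
    print(item)
```

Because each stage wraps the previous one, stages that are switched off cost nothing per item.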



Another option is to use a factory-function:

    def factory(flag):
        def process(item):
            do_this(item)
            do_that(item)
            do_something_else(item)
            print(item)
        if flag:
            def func(item):
                preprocess(item)
                process(item)
            return func
        else:
            return process


    func = factory(needs_preprocessing)
    for elem in iterable:
        func(elem)



So my question is: is there an agreed-upon generally best way of dealing
with this?

No. It's mostly a matter of taste. Those coming from a Java background
will probably prefer the factory or builder idiom. Those coming from a
functional language background, like Haskell or Lisp, or with experience
with Unix pipes, will probably prefer using chained generators.

The only solution you absolutely should avoid is the naive "copy and
paste" solution, where the amount of code duplication is extensive:


    if needs_preprocessing:
        for elem in iterable:
            # If you change this block, you must remember to change the
            # block below too!
            preprocess(item)
            do_this(item)
            do_that(item)
            do_something_else(item, arg1, arg2, expression)
            do_another_thing(item + arg3, key=whatever)
            print(item)
    else:
        for elem in iterable:
            # If you change this block, you must remember to change the
            # block above too!
            preprocess(item)
            do_this(item)
            do_that(item)
            do_something_else(item, arg1, arg2, expression)
            do_another_thing(item, key=whatever)
            print(item)


The careful reader will notice that there's already a bug in this.


Even checking the variable inside the loop is okay, provided you don't
care too much about performance, although I wouldn't call it "best
practice". Merely "acceptable for code that isn't performance-critical".
 
Chris Angelico

    if needs_preprocessing:
        for elem in iterable:
            # If you change this block, you must remember to change the
            # block below too!
            preprocess(item)
            do_this(item)
            do_that(item)
            do_something_else(item, arg1, arg2, expression)
            do_another_thing(item + arg3, key=whatever)
            print(item)
    else:
        for elem in iterable:
            # If you change this block, you must remember to change the
            # block above too!
            preprocess(item)
            do_this(item)
            do_that(item)
            do_something_else(item, arg1, arg2, expression)
            do_another_thing(item, key=whatever)
            print(item)


The careful reader will notice that there's already a bug in this.

For a start, you're preprocessing in the second block... I don't think
that was your intentional bug :)

ChrisA
 
Piet van Oostrum

Wolfgang Maier said:
Dear all,
this is a recurring programming problem that I'm just not sure how to solve
optimally, so I thought I'd ask for your advice:
imagine you have a flag set somewhere earlier in your code, e.g.,

needs_processing = True

then in a for loop you're processing the elements of an iterable, but the
kind of processing depends on the flag, e.g.,:

    for elem in iterable:
        if needs_processing:
            pre_process(elem)  # reformat elem in place
        print(elem)

this checks the condition every time through the for loop, even though there
is no chance for needs_processing to change inside the loop, which does not
look very efficient.

I bet in most cases you won't notice the time used to check the condition.
Beware of premature optimization!
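
Measuring is easy enough with `timeit`. The sketch below is one hypothetical way to compare the two loop shapes; `flag_inside`, `flag_hoisted`, `time_variant` and the no-op stand-ins for `pre_process` and `print` are illustrative names, and the numbers will vary by machine and Python version.

```python
import timeit

def flag_inside(iterable, needs_processing, pre_process, sink):
    for elem in iterable:
        if needs_processing:      # tested on every iteration
            pre_process(elem)
        sink(elem)

def flag_hoisted(iterable, needs_processing, pre_process, sink):
    if needs_processing:          # tested once
        for elem in iterable:
            pre_process(elem)
            sink(elem)
    else:
        for elem in iterable:
            sink(elem)

def noop(elem):
    """Stand-in for both pre_process and print, so only loop overhead is timed."""
    return None

def time_variant(variant, size=100_000, number=5):
    data = list(range(size))
    # The flag is False here, so the only difference is the per-item test.
    return timeit.timeit(lambda: variant(data, False, noop, noop), number=number)

print("inside :", time_variant(flag_inside))
print("hoisted:", time_variant(flag_hoisted))
```

With a real `pre_process` and real output in the loop, any gap between the two timings is likely to shrink further relative to the total.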
 
Nobody

So my question is: is there an agreed-upon generally best way of dealing
with this?

Yes. Just leave the test inside the loop.

If you're sufficiently concerned about performance that you're willing to
trade clarity for it, you shouldn't be using Python in the first place.
 
Steven D'Aprano

For a start, you're preprocessing in the second block... I don't think
that was your intentional bug :)

Heh, no, that wasn't the intentional bug. But it does go to show the
risks of copy-and-paste programming.
 
Peter Cacioppi

Nobody (yes, his name is Nobody) said:

"If you're sufficiently concerned about performance that you're willing to
trade clarity for it, you shouldn't be using Python in the first place."

I think the correct thing to say here is, IF you know this subroutine is a bottleneck, THEN probably this subroutine (or even the module it lives within) should be recoded in a language closer to the metal (like C).

I don't think it's correct to imply that people very concerned about performance should not use Python. (And I agree, Nobody implied that ;) But sometimes performance concerns require the bottleneck(s) be recoded in a manner that sacrifices readability for performance, to include a different language. Python generally plays well with other languages, no? So code it in Py, profile it, refactor the bottlenecks as needed.
 
Chris Angelico

Nobody (yes, his name is Nobody) said:

"If you're sufficiently concerned about performance that you're willing to
trade clarity for it, you shouldn't be using Python in the first place."

I don't think it's correct to imply that people very concerned about performance should not use Python. (And I agree, Nobody implied that ;)

No, I don't think he implied that. You can care about performance
while still putting code clarity as a higher priority :) If you
actually profile and find that something-or-other is a bottleneck,
chances are you can break it out into a function with minimal loss of
clarity, and then reimplement that function in C (maybe wielding
Cython for the main work). That doesn't compromise clarity.
Duplicating a loop to hoist a condition _does_. Of course, in C, you
can let the compiler do it for you, but Python can't be sure that it's
as constant as you think, so it can't be changed.

ChrisA
 
Peter Cacioppi

Chris said

" If you actually profile and find that something-or-other is a bottleneck,
chances are you can break it out into a function with minimal loss of
clarity, and then reimplement that function in C (maybe wielding
Cython for the main work). That doesn't compromise clarity. "

Well, I'm not going to go back and forth saying "does too, does not" with you. I have a 7 year old for those sorts of arguments.

And I think we are saying more or less the same thing.

I agree that isolating your bottleneck to the tightest possible subroutine usually doesn't compromise clarity.

But once you are re-implementing the bottleneck in a different language.... esp a language notorious for high performance nuggets of opaqueness... that does compromise clarity, to some extent.

(Does not, does too, does not, does too, ok we're done with that part)

But this sort of bottleneck refactoring can be done in a careful way that minimizes the damage to readability. And a strength of py is it tends to encourage this "as pretty as possible" approach to bottleneck refactoring.

This is what you're saying, right?
 
Chris Angelico

But this sort of bottleneck refactoring can be done in a careful way that minimizes the damage to readability. And a strength of py is it tends to encourage this "as pretty as possible" approach to bottleneck refactoring.

This is what you're saying, right?

Yep, that's about the size of it. Want some examples of what costs no
clarity to reimplement in another language? Check out the Python
standard library. Some of that is implemented in C (in CPython) and
some in Python, and you can't tell and needn't care which. Code
clarity isn't hurt, because those functions would be named functions
even without.

ChrisA
 
Neil Cerutti

Yes. Just leave the test inside the loop.

If you're sufficiently concerned about performance that you're
willing to trade clarity for it, you shouldn't be using Python
in the first place.

When you detect a code smell, as Wolfgang did, e.g., "I'm
repeating the same exact test condition in several places," you
should not simply ignore it, even in Python.
 
rusi

When you detect a code smell, as Wolfgang did, e.g., "I'm
repeating the same exact test condition in several places," you
should not simply ignore it, even in Python.

Yes
It is an agenda of functional programming that such programs should be modularly writeable without loss of performance:
http://homepages.inf.ed.ac.uk/wadler/topics/deforestation.html
[the trees there include lists]

Unfortunately for the imperative programmer, some forms of modularization are
simply not conceivable that are natural for functional programmers (or as
Steven noted shell-script writers). eg The loop:

    while pred:
        statement

is simply one unit and cannot be split further.

Whereas in a FPL one can write:
iterate statement >>> takeWhile pred
-- the >>> being analogous to | in bash
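
In Python, a loose analogue of that composition can be sketched with a generator and `itertools.takewhile`. This is a rough translation, not an exact equivalent of the expression above; the `iterate` helper and the doubling example are hypothetical.

```python
from itertools import takewhile

def iterate(step, state):
    """Yield state, step(state), step(step(state)), ... without end."""
    while True:
        yield state
        state = step(state)

# Hypothetical example: keep doubling while the value stays below 100.
doubling = iterate(lambda n: n * 2, 1)
below_100 = takewhile(lambda n: n < 100, doubling)
print(list(below_100))  # [1, 2, 4, 8, 16, 32, 64]
```

The "statement" and the predicate live in separate pieces that are only joined at the point of use.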
 
Peter Cacioppi

Chris said :
"Want some examples of what costs no clarity to reimplement in another language? Check out the Python standard library. Some of that is implemented in C (in CPython) and some in Python, and you can't tell and needn't care which."

To ME (a consumer of the CPython library) there is zero cost to clarity.

To the angels that maintain/develop this library and need to go inside the black box regularly .... there is a non-zero cost to clarity.

Right?

(I'd rather run the risk of stating the obvious than missing something clever, that's why I keep hitting this sessile equine).
 
alex23

imagine you have a flag set somewhere earlier in your code, e.g.,

needs_processing = True

then in a for loop you're processing the elements of an iterable, but the
kind of processing depends on the flag, e.g.,:

    for elem in iterable:
        if needs_processing:
            pre_process(elem)  # reformat elem in place
        print(elem)

this checks the condition every time through the for loop, even though there
is no chance for needs_processing to change inside the loop, which does not
look very efficient.

There are two approaches I would consider using here:

1. Make pre_process a decorator, and outside of the loop do:

    def pre_process_decorator(fn):
        def pre_process(x):
            # functionality goes here
            return fn(x)
        return pre_process

    if needs_processing:
        print = pre_process_decorator(print)

    for elem in iterable:
        print(elem)

2. Replace the iterable with a generator if the condition is met:

    if needs_processing:
        iterable = (pre_process(x) for x in iterable)

    for elem in iterable:
        print(elem)

Personally, I find the 2nd approach clearer.
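
Note that the generator expression in approach 2 relies on `pre_process` returning the element. If `pre_process` reformats in place and returns `None`, as the original question suggests, a small generator function can yield the element itself. A minimal sketch, where the body of `pre_process` and the `processed` helper are hypothetical:

```python
def pre_process(elem):
    # Hypothetical in-place reformat: upper-case every string in the list.
    # Like most in-place functions, it implicitly returns None.
    elem[:] = [s.upper() for s in elem]

def processed(iterable):
    """Yield each element after reformatting it in place."""
    for elem in iterable:
        pre_process(elem)
        yield elem

needs_processing = True
iterable = [["a", "b"], ["c"]]

if needs_processing:
    iterable = processed(iterable)   # flag tested once, outside the loop

for elem in iterable:
    print(elem)
```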
 
Mariano Anaya

Hi All,
Trying to help you out, I wonder if something like the following code helps with what you need.

    for elem in (needs_processing and iterable or []):
        pre_process(elem)  # reformat elem in place
        print(elem)

It avoids code duplication and only processes and iterates if the condition is True.
I hope it helps; otherwise, we could keep thinking of alternatives.

Regards.
Mariano.
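
One thing to keep in mind about the `and`/`or` form: when the flag is false the loop body never runs, so nothing is printed, whereas the loop in the original question prints every element either way. A small sketch of the difference, collecting elements in a list instead of printing; `pre_process`, `and_or_loop` and `original_loop` are hypothetical stand-ins:

```python
def pre_process(elem):
    elem.append("processed")  # hypothetical in-place change

def and_or_loop(iterable, needs_processing):
    seen = []
    for elem in (needs_processing and iterable or []):
        pre_process(elem)
        seen.append(elem)
    return seen

def original_loop(iterable, needs_processing):
    seen = []
    for elem in iterable:
        if needs_processing:
            pre_process(elem)
        seen.append(elem)
    return seen

print(and_or_loop([[1], [2]], False))    # []
print(original_loop([[1], [2]], False))  # [[1], [2]]
```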
 
