Should I use BEGIN, CHECK, or INIT?

J. Romano

Greetings,

I have a couple of questions and I would like to know the opinions
of the Perl community.

A while ago, I wrote a Perl module which is now practically
finished. Its exact contents aren't important for my questions,
except for the fact that it is a package and its functions are called
like:

use MyPackage;
MyPackage::init();
MyPackage::f1();
MyPackage::f2($var1, $var2);

The first function that must be called is an initialization
function (MyPackage::init()). It must be called before any other
function in the package and should only be called once. This fact is
well-documented in the perldoc documentation, so any Perl programmer
who bothers to check this module's documentation should be aware of
it.

However, recently I've been thinking: If the init() function has
to be called before any other function and should only be called once,
why not put the initialization call in a BEGIN block inside the module
itself? That way, the programmer who uses my module doesn't even have
to call the init() function; it will be done automatically at the "use
MyPackage;" statement.

My first question is: Is putting the initialization function
inside a BEGIN (or CHECK or INIT) block a good idea, or is there some
potential problem that I'm not aware of? If the initialization
function fails, I want it to stop the program from running at all. Of
course, if I do put the init() function in a BEGIN block, I'll remove
all mention of it from the perldoc so the programmer won't be tempted
to use it.

And my second question: If it is a good idea to automatically call
the init() function for the programmer, should I:

a) put the init() statement in a BEGIN block
b) put the init() statement in a CHECK block
c) put the init() statement in an INIT block
d) put the init() statement as the last statement executed
in the module
or
e) it's not a good idea, so just let the programmer who uses
my module call it in his/her own code

The only difference I see between BEGIN, CHECK, and INIT (besides
the facts that BEGIN statements happen before CHECK statements which
happen before INIT statements and that CHECK statements happen in
"First In, Last Out" order) is that if I put the init() function in a
BEGIN or CHECK block and the init() function fails, then it is
discovered at compile time. In other words, if the init() function
fails, then the statement:

perl -c myscript.pl

will fail as well, as long as the init() function call is placed
inside a BEGIN block or a CHECK block.

This sounds appealing in that "use MyPackage;" will fail if the
init() function fails, bringing it right away to the attention of the
programmer. However, it's possible that I might not see an obvious
problem with this approach, and so I would like any advice or comments
from anyone who might have had experience with something similar.

To sum up my questions: Should I put the init() call in a BEGIN,
CHECK, or INIT block, or should I just let the programmer who uses my
module call it manually (and document it thoroughly in the
perldoc)?

Thanks in advance for any advice.

-- Jean-Luc
 
Tassilo v. Parseval

Also sprach J. Romano:
A while ago, I wrote a Perl module which is now practically
finished. Its exact contents aren't important for my questions,
except for the fact that it is a package and its functions are called
like:
[...]

The first function that must be called is an initialization
function (MyPackage::init()). It must be called before any other
function in the package and should only be called once. This fact is
well-documented in the perldoc documentation, so any Perl programmer
who bothers to check this module's documentation should be aware of
it.

However, recently I've been thinking: If the init() function has
to be called before any other function and should only be called once,
why not put the initialization call in a BEGIN block inside the module
itself? That way, the programmer who uses my module doesn't even have
to call the init() function; it will be done automatically at the "use
MyPackage;" statement.

My first question is: Is putting the initialization function
inside a BEGIN (or CHECK or INIT) block a good idea, or is there some
potential problem that I'm not aware of? If the initialization
function fails, I want it to stop the program from running at all. Of
course, if I do put the init() function in a BEGIN block, I'll remove
all mention of it from the perldoc so the programmer won't be tempted
to use it.

That appears to be a good idea.

And my second question: If it is a good idea to automatically call
the init() function for the programmer, should I:

a) put the init() statement in a BEGIN block
b) put the init() statement in a CHECK block
c) put the init() statement in an INIT block
d) put the init() statement as the last statement executed
in the module

There's really not much difference between any of those alternatives.
Putting it in a BEGIN block means that it is executed very early, even
before the rest of the module is compiled. Since the module cannot be
used without a successful call to YourModule::init() anyway, there is no
need to finish compiling it, so it does no harm to die very early.

or
e) it's not a good idea, so just let the programmer who uses
my module call it in his/her own code

You should at least give him the opportunity to call it manually even
when it usually need not be done. I had a problem lately with one of my
modules where I called an init function in a BOOT-section of the XS
portion [which is roughly equivalent to d) in the above list]. I
received reports that the module didn't work in processes that had been
forked off. And indeed, it turned out that the underlying C-library
required this initialization to happen in each process.

So call init() in a BEGIN block but also document that it must be called
manually when using the module's functionality in child processes.
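
A minimal sketch of the fork problem described above, assuming a hypothetical MyPackage whose init() has to run once in every process. Rather than relying on documentation alone, this variant remembers the process ID ($$) that last ran init() and lets each public function re-check it. The names $init_pid, $init_count and the body of f1() are illustrative assumptions, not taken from the original module.

```perl
use strict;
use warnings;

{
    package MyPackage;    # hypothetical module, defined inline for the demo

    my $init_pid;             # PID of the process that last ran init()
    our $init_count = 0;      # how often the real set-up ran in this process

    sub init {
        # Skip the work if this very process has already initialised.
        return if defined $init_pid && $init_pid == $$;
        # ... per-process set-up (e.g. initialising a C library) goes here ...
        $init_count++;
        $init_pid = $$;
        return;
    }

    sub f1 {
        init();               # cheap guard: re-initialises after a fork
        return "f1 ran in process $$";
    }
}

MyPackage::init();
MyPackage::init();            # second call in the same process is a no-op

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: the copied $init_pid no longer matches $$, so f1() re-runs init().
    MyPackage::f1();
    exit($MyPackage::init_count == 2 ? 0 : 1);
}

waitpid($pid, 0);
my $child_ok = ($? == 0);
print "parent ran init $MyPackage::init_count time(s)\n";
print "child re-initialised: ", ($child_ok ? "yes" : "no"), "\n";
```

Whether an automatic re-check like this is appropriate depends on how expensive or visible the initialization is; documenting a manual call, as suggested above, remains the simpler option.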

Tassilo
 
Peter Hickman

J. Romano said:
And my second question: If it is a good idea to automatically call
the init() function for the programmer, should I:

a) put the init() statement in a BEGIN block
b) put the init() statement in a CHECK block
c) put the init() statement in an INIT block
d) put the init() statement as the last statement executed
in the module
or
e) it's not a good idea, so just let the programmer who uses
my module call it in his/her own code

I would put it in the BEGIN block and if I was really worried about the init()
function being called only once I would make that a closure.


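{
    my $called;    # left unassigned on purpose: a run-time "= 0" here would
                   # reset the flag after the BEGIN block below has already run

    sub init {
        unless ($called) {
            print "Called init for the first time\n";
            $called = 1;
        }
        else {
            print "Init has already been called\n";
        }
    }
}

BEGIN { init(); }    # placed after sub init so the sub exists when BEGIN runs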

I would not put it in the CHECK block in case calling it has side effects
such as opening database connections, files or the like, because, I
believe, CHECK is run when you do...

perl -c Module.pm

whereas BEGIN is not.

I think. Can anyone confirm this? I am only on my first cup of tea.
 
Anno Siegel

J. Romano said:
Greetings,

I have a couple of questions and I would like to know the opinions
of the Perl community.

A while ago, I wrote a Perl module which is now practically
finished. Its exact contents aren't important for my questions,
except for the fact that it is a package and its functions are called
like:

use MyPackage;
MyPackage::init();
MyPackage::f1();
MyPackage::f2($var1, $var2);
[...]

And my second question: If it is a good idea to automatically call
the init() function for the programmer, should I:

a) put the init() statement in a BEGIN block
b) put the init() statement in a CHECK block
c) put the init() statement in an INIT block
d) put the init() statement as the last statement executed
in the module
or
e) it's not a good idea, so just let the programmer who uses
my module call it in his/her own code

I have little to add to Tassilo's thorough discussion of the issue,
except still another alternative:

f) rename the init() function as import() (or call it from
the import() function)

That (import()) will be called after the last executable statement
of your module, but before "use" returns to the caller, so it is
basically equivalent to d). The difference is that the user has
the choice to suppress the call to import() and call it on their
own.

The only difference I see between BEGIN, CHECK, and INIT (besides
the facts that BEGIN statements happen before CHECK statements which
happen before INIT statements and that CHECK statements happen in
[...]

Another difference is that BEGIN and CHECK are executed when the
module in question is loaded. INIT (and END) are collected and
not executed before all compile-time activity is done, so if more
modules are loaded, INIT only runs after that has happened.

Anno
 
Ben Morrow

Quoth "Tassilo v. Parseval":
Also sprach J. Romano:

You should not put things in CHECK blocks, in general. They are hooks
for the (unimplemented) perl compiler.

I would say an INIT block is best, but it really makes no difference.
Putting init stuff in INIT blocks is necessary for any form of perl
compiler to work (including things I have tried to do in the past to make
PAR's module-detection mechanism more reliable in the face of modules
which use other modules for you, such as 'if' and 'all'); however, this
convention is so universally ignored that there is little point trying
to follow it.

You should at least give him the opportunity to call it manually even
when it usually need not be done.

As you're going to do this, I would recommend calling it from ->import.
Then the user can choose (if necessary) not to call it with

use Module ();

and then call it later when necessary.
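
A small sketch of that suggestion, assuming a hypothetical MyPackage defined inline (the %INC entry only stops "use" from searching for a file on disk; a real module would live in its own .pm file). The @log array exists purely to show which calls happened.

```perl
use strict;
use warnings;

our @log;    # records which calls happened, for the demo only

BEGIN {
    package MyPackage;                  # hypothetical module, defined inline
    $INC{'MyPackage.pm'} = __FILE__;    # mark as loaded so "use" does not search @INC

    sub init   { push @main::log, 'init' }
    sub import { init() }               # a plain "use MyPackage;" would end up here
}

use MyPackage ();     # empty list: the module is loaded but import() is skipped
BEGIN { push @log, 'use () done' }

MyPackage::init();    # the caller runs the initialisation when it suits them

print join(', ', @log), "\n";    # use () done, init
```

With a plain "use MyPackage;" instead, import() and therefore init() would run at compile time, before the 'use () done' entry is recorded.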

Ben
 
Brian McCauley

Douglas said:
dug@slurp:~/scratch$ cat A.pm
package A;

BEGIN { warn "begin" }
CHECK { warn "check" }
INIT { warn "init" }
END { warn "end" }

1;

For a more complete view of things...

bam@wcl-l:~/tmp> cat A1.pm
package A1;
sub import { warn "import" };
BEGIN { warn "begin" }
CHECK { warn "check" }
INIT { warn "init" }
END { warn "end" }
warn "body";
1;

bam@wcl-l:~/tmp> cat A2.pm
package A2;
sub import { warn "import" };
BEGIN { warn "begin" }
CHECK { warn "check" }
INIT { warn "init" }
END { warn "end" }
warn "body";
1;

bam@wcl-l:~/tmp> perl -ce 'use A1; use A1; use A2;'
begin at A1.pm line 3.
body at A1.pm line 7.
import at A1.pm line 2.
import at A1.pm line 2.
begin at A2.pm line 3.
body at A2.pm line 7.
import at A2.pm line 2.
check at A2.pm line 4.
check at A1.pm line 4.
-e syntax OK

bam@wcl-l:~/tmp> perl -e 'use A1; use A1; use A2;'
begin at A1.pm line 3.
body at A1.pm line 7.
import at A1.pm line 2.
import at A1.pm line 2.
begin at A2.pm line 3.
body at A2.pm line 7.
import at A2.pm line 2.
check at A2.pm line 4.
check at A1.pm line 4.
init at A1.pm line 5.
init at A2.pm line 5.
end at A2.pm line 6.
end at A1.pm line 6.

bam@wcl-l:~/tmp> perl -e 'require A1; require A1; require A2;'
begin at A1.pm line 3.
body at A1.pm line 7.
begin at A2.pm line 3.
body at A2.pm line 7.
end at A2.pm line 6.
end at A1.pm line 6.

My view is that initialisation code that must be run once and once only
when the module is loaded and which doesn't connect to any external
resource is best called in the module body unless there's a specific
reason to put it elsewhere.

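
As a rough illustration of calling init() from the module body, the sketch below writes a throwaway module to a temporary directory and requires it twice. The module name BodyInit and its runs() accessor are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write a throwaway module (hypothetical name BodyInit) to a temporary directory.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/BodyInit.pm" or die "cannot write module: $!";
print {$fh} <<'END_MODULE';
package BodyInit;
my $runs = 0;
sub init { $runs++ }
sub runs { $runs }
init();    # plain statement in the module body
1;
END_MODULE
close $fh or die "cannot close module: $!";

unshift @INC, $dir;

require BodyInit;    # compiles the file, then runs its body (and so init())
require BodyInit;    # already recorded in %INC, so the body is not run again

print "init ran ", BodyInit::runs(), " time(s)\n";    # init ran 1 time(s)
```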
Whereas I have the luxury of being well into my second cup of coffee
{grin}.

And I'm dangerously close to an overdose.
 
Brian McCauley

Brian said:
Anno Siegel wrote:

[snip most of Anno's message ]

import() is called each time the module is use()d. It is therefore
inappropriate for initialisation that should happen once regardless of
how many times the module is used.
 
Ben Morrow

Quoth (e-mail address removed):
It's not entirely clear to me when (or if) an INIT in a module is run.
Is that just before runtime of the main program, or just before runtime
of the module?

All INIT blocks parsed during the main script's compile time (the call
to perl_parse) are queued, and are run in FIFO order at the beginning of
perl_run. INIT blocks parsed after perl_run has started are never run.
Thus, if a used module defines an init block, it will be run at the
beginning of the execution of the main script (well after the use
statement has finished); if a required one defines one it will never be
run.
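
The difference can be observed with a short experiment. The sketch below is meant to be run as a main script (not from inside a string eval); it creates two throwaway modules whose names, UsedMod and RequiredMod, are invented for the demonstration, each containing an INIT block that sets a flag.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

our %init_ran;    # filled in by the INIT blocks of the demo modules

BEGIN {
    # Create two throwaway modules (hypothetical names) before they are needed.
    my $dir = tempdir(CLEANUP => 1);
    for my $name (qw(UsedMod RequiredMod)) {
        open my $fh, '>', "$dir/$name.pm" or die "cannot write $name.pm: $!";
        print {$fh} "package $name;\nINIT { \$main::init_ran{$name} = 1 }\n1;\n";
        close $fh or die "cannot close $name.pm: $!";
    }
    unshift @INC, $dir;
}

use UsedMod;            # compiled during the main script's compile phase
require RequiredMod;    # compiled only once the run phase has started

for my $name (qw(UsedMod RequiredMod)) {
    printf "%-11s INIT ran: %s\n", $name, $init_ran{$name} ? 'yes' : 'no';
}
```

Run as a script, this should report "yes" for UsedMod and "no" for RequiredMod, matching the description above.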

Ben
 
Anno Siegel

Abigail said:
Brian McCauley (e-mail address removed) wrote on MMMMLXXVI September MCMXCIII in
<URL:--
--
-- Brian McCauley wrote:
--
-- >
-- >
-- > Anno Siegel wrote:
--
-- [snip most of Anno's message ]
--
-- >> I have little to add to Tassilo's thorough discussion of the issue,
-- >> except still another alternative:
-- >>
-- >> f) rename the init() function as import() (or call it from
-- >> the import() function)
-- >>
-- >> That (import()) will be called after the last executable statement
-- >> of your module, but before "use" returns to the caller, so it is
-- >> basically equivalent to d). The difference is that the user has
-- >> the choice to suppress the call to import() and call it on their
-- >> own.
--
-- import() is called each time the module is use()d. It is therefore
-- inappropriate for initialisation that should happen once regardless of
-- how many times the module is used.


my $ping_a_pong;
sub import {
$ping_a_pong ++ or do {
... initialization code ...
};
}

Either that, or the initialization is prepared to be run more than once,
perhaps with different parameters, by different callers.

I'd hesitate to use INIT in a module. From "man perlmod":

"INIT" blocks are run just before the Perl runtime begins
execution, in "first in, first out" (FIFO) order. For
example, the code generators documented in perlcc make use
of "INIT" blocks to initialize and resolve pointers to
XSUBs.

It's not entirely clear to me when (or if) an INIT in a module is run.
Is that just before runtime of the main program, or just before runtime
of the module?

I had always "known" that INIT runs before the first executable statement
of the main program. Could be that I blithely assumed, but the assumption
is supported by the actual behavior of programs.

It would be a time to see which other modules are loaded and what
they are up to. Modules that modify the behavior of other modules,
or of Perl as a whole, may have business that is best done at INIT
time.

However, if I need to initialize something in a module, I wouldn't use
BEGIN, INIT, or CHECK, nor would I use import() or require an init()
function to be called.

I'd either put the initialization code in the main body of the module,
or just the call to init(). After all, a module, whether required or used,
*is* executed. Once. Right after it was compiled. Which sounds exactly
what the OP wants. (Sure, it would run multiple times if people use
'do Module;', or muck with '%INC'. But that's their problem.)

That is, of course, the natural thing to do when there is no reason
to delay initialization. Sometimes you want to adapt to a given
situation and may want to do that as late as possible. For example,
sometimes class initialization is done the first time ->new is called,
even later than INIT.
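
One way such deferred class initialization might look, as a sketch only: the class name Widget, the helper _init_class() and the counter $init_runs are all invented for illustration.

```perl
use strict;
use warnings;

{
    package Widget;    # hypothetical class used only for illustration

    my $class_ready;        # false until the first constructor call
    our $init_runs = 0;     # counts how often the class-level set-up ran

    sub _init_class {
        # ... inspect the environment, load optional helpers, etc. ...
        $init_runs++;
        $class_ready = 1;
        return;
    }

    sub new {
        my ($class, %args) = @_;
        _init_class() unless $class_ready;    # deferred until first use
        return bless { %args }, $class;
    }
}

my $first  = Widget->new(name => 'a');
my $second = Widget->new(name => 'b');
print "class set-up ran $Widget::init_runs time(s)\n";    # class set-up ran 1 time(s)
```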

Anno
 
Michele Dondi

It's not entirely clear to me when (or if) an INIT in a module is run.
Is that just before runtime of the main program, or just before runtime
of the module?

Wow! God is dead, Marx is dead and... there's something not entirely
clear to Abigail too!! (about Perl, that is!)


Michele
 
John W. Krahn

Michele said:
Wow! God is dead, Marx is dead and... there's something not entirely
clear to Abigail too!! (about Perl, that is!)

Maybe he was stunned when he heard the news that the Red Sox won the World Series?


John
 
