S
Shannon Jacobs
I want to restrict a grep to certain columns. After extensive contemplation
and a lot of study of the camel book (1991 printing), various manuals and
Web pages, and assorted newsgroup posts, I finally indirectly arrived at
this:
@foo2 = grep(/.{50}$form_values{'a_SEARCH_VALUE'}.{6}/,@foo1);
That's how it appears in the actual file, though for the sake of this
question, a reduced form would be:
@foo = grep(/.{50}$theString.{6},@foo);
The "records" are fixed length, and I want to ignore the first 50 characters
and the last 6. This approach mostly works (unless the string has a | in it,
which is probably a different problem), but it seems to have pegged my
nauseous code detector. I'm interested in constructive suggestions or
bemusing feedback. (Or maybe I should just appeal for a reference to a
source to replace the entire grotesque thing, appealing to the camel page
xviii citation of "laziness" as the first great virtue. Right now this thing
feeds off of another database system that goes back over 20 years, and it
would probably be a good thing to port the entire mess to some reasonable
environment...)
For the sake of context, the entire program appears below. It is actually
live on the Web at shanenj.tripod.com/search0.html. (Yes, I know Tripod has
a brain-damaged interface, but returning the encoded error messages in the
URL is kind of unique. I also know I should move to a decent server.
Consider Perl/CGI on Tripod as one of those semi-religious rituals, rather
like flogging oneself with a chain.)
Warning: If you proceed past this point you are at risk of laughing yourself
sick. By any standard, I'm sure this is some really peculiar old code,
containing about 10 years of occasional hacking and tweaking. I think it was
the Lisp influence that did it... Or maybe the parts that came from the
Chinese guy with the Berkeley Ph.D.? (I'm sure he doesn't want a more
explicit citation for what I've done to his code.) Or the Y2K windowing? Or
something. <Imagine a movie of the "Lost in Space" robot wildly waving its
arms and shouting "Danger, Will Robinson! Danger!">
#!/usr/local/bin/perl
print '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">';
print "Content-type: text/html\n\n";
print "<HTML><HEAD><TITLE>Search Result</TITLE>\n";
print '<SCRIPT language="JavaScript" src="../formfunc.js"></SCRIPT>';
print "\n";
print '<SCRIPT language="JavaScript">var inHTMLdate = "';
print scalar localtime;
print '";</SCRIPT>';
print "\n";
print "</HEAD>\n";
print "<BODY bgcolor=\"#00bfbf\">\n";
print "<hr size=8>\n";
print "<center><h3>BookList Search Form Result</H3></center>\n";
print "<hr size=8>";
%form_values = &html_parse;
if ( $form_values{'a_SEARCH_VALUE'} ) {
open(DATAFILE,'titles.txt');
$i = 0;
while (<DATAFILE>) { $foo0[$i++] = $_; }
close(DATAFILE);
open(AUTHFILE,'authors.txt');
$i = 1;
while (<AUTHFILE>) { $authlist[$i++] = $_; }
close(AUTHFILE);
# Time for fishing against authors (Main event of 8JUN2005)
if ($form_values{'AuthorFishing'} eq 'gofish' ) {
@foobar = grep(/$form_values{'a_SEARCH_VALUE'}/i,@authlist);
$form_values{'a_SEARCH_VALUE'} = '';
foreach $GREPPED (@foobar) {
$form_values{'a_SEARCH_VALUE'} =
$form_values{'a_SEARCH_VALUE'}.substr($GREPPED,34,4)."|";
}
$form_values{'a_SEARCH_VALUE'} = $form_values{'a_SEARCH_VALUE'}."90AZ";
}
# Main searches from here
if ($form_values{'sorttype'} eq 'title' ) {
@foo1 = sort ignore3 @foo0
} else { if ($form_values{'sorttype'} eq 'subject') {
@foo1 = sort {substr($a,62,6) cmp substr($b,62,6)} @foo0
} else {
@foo1 = @foo0
}}
if ($form_values{'searchfields'} eq 'authornums' ) {
@foo2 = grep(/.{50}$form_values{'a_SEARCH_VALUE'}.{6}/,@foo1);
# fut } else { if ($form_values{'searchfields'} eq 'dates' ) {
# @foo2 = grep(/$form_values{'a_SEARCH_VALUE'}/,@foo2);
# } else { if ($form_values{'searchfields'} eq 'subjects' ) {
# fut @foo2 = grep(/$form_values{'a_SEARCH_VALUE'}/,@foo2);
} else {
@foo2 = @foo1
}
if ($form_values{'sensecase'} eq 'usecase' ) {
@foo3 = grep(/$form_values{'a_SEARCH_VALUE'}/,@foo2);
} else {
@foo3 = grep(/$form_values{'a_SEARCH_VALUE'}/i,@foo2);
}
$hits = 1+$#foo3;
print ("<center>This search ( $form_values{'a_SEARCH_VALUE'} ) found $hits
books.</center><p>\n");
if ($form_values{'datetype'} eq 'human' ) {
#only load $months once if required
$months[1] = 'January';
$months[2] = 'February';
$months[3] = 'March';
$months[4] = 'April';
$months[5] = 'May';
$months[6] = 'June';
$months[7] = 'July';
$months[8] = 'August';
$months[9] = 'September';
$months[10] = 'October';
$months[11] = 'November';
$months[12] = 'December';
}
print("<center><table border = 1 cell padding = 1>\n");
print("<th>Title of Book</th><th>Pub. Year</th><th>Date Read</th>\n");
print("<th colspan=\"3\">Authors</th><th colspan=\"3\">Subjects</th>\n");
print("<tr></tr><tr></tr>\n");
foreach $GREPPED (@foo3) {
print("<tr>");
print("<td>",substr($GREPPED,0,40),"</td>\n");
print("<td>",substr($GREPPED,40,4),"</td>");
# print("<td>",substr($GREPPED,44,6),"</td>");
print("<td>".nicedate(substr($GREPPED,44,6))."</td>");
# From here store the authors and get names if required
$auth1 = substr($GREPPED,50,4);
$auth2 = substr($GREPPED,54,4);
$auth3 = substr($GREPPED,58,4);
if ( $form_values{'numorname'} eq 'authnames' ) {
print("<td>".substr($authlist[$auth1],0,34)."<br></td>");
print("<td>".substr($authlist[$auth2],0,34)."<br></td>");
print("<td>".substr($authlist[$auth3],0,34)."<br></td>\n");
} else { if ( $form_values{'numorname'} eq 'both' ) {
print("<td>".substr($authlist[$auth1],0,34)."( $auth1 )<br></td>");
print("<td>".substr($authlist[$auth2],0,34)."( $auth2 )<br></td>");
print("<td>".substr($authlist[$auth3],0,34)."( $auth3 )<br></td>\n");
} else {
print("<td>$auth1<br></td>");
print("<td>$auth2<br></td>");
print("<td>$auth3<br></td>\n");
}}
# print("<td>",substr($GREPPED,50,4),"<br></td>");
# print("<td>",substr($GREPPED,54,4),"<br></td>");
# print("<td>",substr($GREPPED,58,4),"<br></td>\n");
print("<td>",substr($GREPPED,62,2),"<br></td>");
print("<td>",substr($GREPPED,64,2),"<br></td>");
print("<td>",substr($GREPPED,66,2),"<br></td>");
print("</tr>\n\n");
}
print("</table></center>\n");
} else {
print ("Nothing to Search For<br>\n");
}
print '<SCRIPT language="JavaScript">';
print "\n";
print 'threefooter(\'Return
to<br>shanen\\\'s<br>Home\',\'/index.html\',\'procbook\');';
print "\n";
print '</SCRIPT>';
print "\n";
# print "<hr size=8>";
# print "<center><P>End of Search Results<P><br>\nClick to go back to my
Index page </center>\n";
# print "<P><center>";
# print "<A http=\"/index.html\"><img src=\"/images/home.gif\"
border=0></a>";
# print "</center><P>";
# print "<hr size=8><P>";
print "</BODY></HTML>";
exit(0);
sub ignore3 {
# local ($acut,$bcut);
$acut = 0;
$bcut = 0;
if (index($a,'A ') == 0) {$acut = 2}
if (index($a,'An ') == 0) {$acut = 3}
if (index($a,'The ') == 0) {$acut = 4}
if (index($b,'A ') == 0) {$bcut = 2}
if (index($b,'An ') == 0) {$bcut = 3}
if (index($b,'The ') == 0) {$bcut = 4}
#print ("$a1 cmp $b1");
substr($a,$acut) cmp substr($b,$bcut)
# $a cmp $b
}
sub nicedate {
if ($form_values{'datetype'} eq 'human' ) {
if (substr(@_[0],4,1) eq '0') {
$foo = $months[substr(@_[0],2,2)].'
'.substr(@_[0],5,1)
} else {
$foo = $months[substr(@_[0],2,2)].'
'.substr(@_[0],4,2)
};
# Y2K window added to 1960-2059, slj 3/20/2000 including above #s
if (substr(@_[0],0,1) gt '5') {
$foo.', 19'.substr(@_[0],0,2)
} else {
$foo.', 20'.substr(@_[0],0,2)
}
# Yeah, I know it isn't pretty but I haven't touched PERL in a LONG time...
} else {
if ($form_values{'datetype'} eq 'usa' ) {
substr(@_[0],2,2).'/'.substr(@_[0],4,2).'/'.substr(@_[0],0,2)
} else {
@_[0]
}
}
}
sub html_parse
{
local($line, $length, $offset, @pairs);
# if ($ENV{"REQUEST_METHOD"} eq "GET") {
# $line = $ENV{"QUERY_STRING"};
# }
# elsif ($ENV{"REQUEST_METHOD"} eq "POST") {
#
# ($ENV{"CONTENT_TYPE"} ne "application/x-www-form-#urlencoded") &&
# &html_fatal("Illegal Content-Type '" . $ENV
# $length = $ENV{"CONTENT_LENGTH"};
# (($length =~ m/^\d+$/) == 0) &&
# &html_fatal("Content-Length variable not found");
#
# (read(STDIN, $line, $length) == $length) ||
# &html_fatal("Could not read form info from stdin: " .
#$length);
# }
# else {
# &html_fatal("Illegal Request-Method '" . $ENV
#{"REQUEST_METHOD"} . "'");
# }
# Probably not allowed to check ENV so just take the query string?
# Just use a GET and the one line?
$line = $ENV{"QUERY_STRING"};
((!defined($line)) || (length($line) == 0)) &&
&html_fatal("No values passed in");
@pairs = split(/[=&]/, $line);
(substr($line, $length - 1, 1) eq "=") &&
($pairs[@pairs] = "");
$offset = 0;
foreach $value (@pairs) {
($offset++ & 1) &&
&html_unescape($value);
}
return @pairs;
}
sub html_unescape
{
local($offset);
$_[0] =~ s/\+/ /g;
$offset = -1;
while (($offset = index($_[0], "%", $offset + 1)) >= 0) {
substr($_[0], $offset, 3) =
pack("c", hex(substr($_[0], $offset + 1, 2)));
}
}
sub html_fatal
{
print "<H3>Error Detected</H3><P>";
print @_[0], "\n";
exit 0;
}
and a lot of study of the camel book (1991 printing), various manuals and
Web pages, and assorted newsgroup posts, I finally indirectly arrived at
this:
@foo2 = grep(/.{50}$form_values{'a_SEARCH_VALUE'}.{6}/,@foo1);
That's how it appears in the actual file, though for the sake of this
question, a reduced form would be:
@foo = grep(/.{50}$theString.{6},@foo);
The "records" are fixed length, and I want to ignore the first 50 characters
and the last 6. This approach mostly works (unless the string has a | in it,
which is probably a different problem), but it seems to have pegged my
nauseous code detector. I'm interested in constructive suggestions or
bemusing feedback. (Or maybe I should just appeal for a reference to a
source to replace the entire grotesque thing, appealing to the camel page
xviii citation of "laziness" as the first great virtue. Right now this thing
feeds off of another database system that goes back over 20 years, and it
would probably be a good thing to port the entire mess to some reasonable
environment...)
For the sake of context, the entire program appears below. It is actually
live on the Web at shanenj.tripod.com/search0.html. (Yes, I know Tripod has
a brain-damaged interface, but returning the encoded error messages in the
URL is kind of unique. I also know I should move to a decent server.
Consider Perl/CGI on Tripod as one of those semi-religious rituals, rather
like flogging oneself with a chain.)
Warning: If you proceed past this point you are at risk of laughing yourself
sick. By any standard, I'm sure this is some really peculiar old code,
containing about 10 years of occasional hacking and tweaking. I think it was
the Lisp influence that did it... Or maybe the parts that came from the
Chinese guy with the Berkeley Ph.D.? (I'm sure he doesn't want a more
explicit citation for what I've done to his code.) Or the Y2K windowing? Or
something. <Imagine a movie of the "Lost in Space" robot wildly waving its
arms and shouting "Danger, Will Robinson! Danger!">
#!/usr/local/bin/perl
print '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">';
print "Content-type: text/html\n\n";
print "<HTML><HEAD><TITLE>Search Result</TITLE>\n";
print '<SCRIPT language="JavaScript" src="../formfunc.js"></SCRIPT>';
print "\n";
print '<SCRIPT language="JavaScript">var inHTMLdate = "';
print scalar localtime;
print '";</SCRIPT>';
print "\n";
print "</HEAD>\n";
print "<BODY bgcolor=\"#00bfbf\">\n";
print "<hr size=8>\n";
print "<center><h3>BookList Search Form Result</H3></center>\n";
print "<hr size=8>";
%form_values = &html_parse;
if ( $form_values{'a_SEARCH_VALUE'} ) {
open(DATAFILE,'titles.txt');
$i = 0;
while (<DATAFILE>) { $foo0[$i++] = $_; }
close(DATAFILE);
open(AUTHFILE,'authors.txt');
$i = 1;
while (<AUTHFILE>) { $authlist[$i++] = $_; }
close(AUTHFILE);
# Time for fishing against authors (Main event of 8JUN2005)
if ($form_values{'AuthorFishing'} eq 'gofish' ) {
@foobar = grep(/$form_values{'a_SEARCH_VALUE'}/i,@authlist);
$form_values{'a_SEARCH_VALUE'} = '';
foreach $GREPPED (@foobar) {
$form_values{'a_SEARCH_VALUE'} =
$form_values{'a_SEARCH_VALUE'}.substr($GREPPED,34,4)."|";
}
$form_values{'a_SEARCH_VALUE'} = $form_values{'a_SEARCH_VALUE'}."90AZ";
}
# Main searches from here
if ($form_values{'sorttype'} eq 'title' ) {
@foo1 = sort ignore3 @foo0
} else { if ($form_values{'sorttype'} eq 'subject') {
@foo1 = sort {substr($a,62,6) cmp substr($b,62,6)} @foo0
} else {
@foo1 = @foo0
}}
if ($form_values{'searchfields'} eq 'authornums' ) {
@foo2 = grep(/.{50}$form_values{'a_SEARCH_VALUE'}.{6}/,@foo1);
# fut } else { if ($form_values{'searchfields'} eq 'dates' ) {
# @foo2 = grep(/$form_values{'a_SEARCH_VALUE'}/,@foo2);
# } else { if ($form_values{'searchfields'} eq 'subjects' ) {
# fut @foo2 = grep(/$form_values{'a_SEARCH_VALUE'}/,@foo2);
} else {
@foo2 = @foo1
}
if ($form_values{'sensecase'} eq 'usecase' ) {
@foo3 = grep(/$form_values{'a_SEARCH_VALUE'}/,@foo2);
} else {
@foo3 = grep(/$form_values{'a_SEARCH_VALUE'}/i,@foo2);
}
$hits = 1+$#foo3;
print ("<center>This search ( $form_values{'a_SEARCH_VALUE'} ) found $hits
books.</center><p>\n");
if ($form_values{'datetype'} eq 'human' ) {
#only load $months once if required
$months[1] = 'January';
$months[2] = 'February';
$months[3] = 'March';
$months[4] = 'April';
$months[5] = 'May';
$months[6] = 'June';
$months[7] = 'July';
$months[8] = 'August';
$months[9] = 'September';
$months[10] = 'October';
$months[11] = 'November';
$months[12] = 'December';
}
print("<center><table border = 1 cell padding = 1>\n");
print("<th>Title of Book</th><th>Pub. Year</th><th>Date Read</th>\n");
print("<th colspan=\"3\">Authors</th><th colspan=\"3\">Subjects</th>\n");
print("<tr></tr><tr></tr>\n");
foreach $GREPPED (@foo3) {
print("<tr>");
print("<td>",substr($GREPPED,0,40),"</td>\n");
print("<td>",substr($GREPPED,40,4),"</td>");
# print("<td>",substr($GREPPED,44,6),"</td>");
print("<td>".nicedate(substr($GREPPED,44,6))."</td>");
# From here store the authors and get names if required
$auth1 = substr($GREPPED,50,4);
$auth2 = substr($GREPPED,54,4);
$auth3 = substr($GREPPED,58,4);
if ( $form_values{'numorname'} eq 'authnames' ) {
print("<td>".substr($authlist[$auth1],0,34)."<br></td>");
print("<td>".substr($authlist[$auth2],0,34)."<br></td>");
print("<td>".substr($authlist[$auth3],0,34)."<br></td>\n");
} else { if ( $form_values{'numorname'} eq 'both' ) {
print("<td>".substr($authlist[$auth1],0,34)."( $auth1 )<br></td>");
print("<td>".substr($authlist[$auth2],0,34)."( $auth2 )<br></td>");
print("<td>".substr($authlist[$auth3],0,34)."( $auth3 )<br></td>\n");
} else {
print("<td>$auth1<br></td>");
print("<td>$auth2<br></td>");
print("<td>$auth3<br></td>\n");
}}
# print("<td>",substr($GREPPED,50,4),"<br></td>");
# print("<td>",substr($GREPPED,54,4),"<br></td>");
# print("<td>",substr($GREPPED,58,4),"<br></td>\n");
print("<td>",substr($GREPPED,62,2),"<br></td>");
print("<td>",substr($GREPPED,64,2),"<br></td>");
print("<td>",substr($GREPPED,66,2),"<br></td>");
print("</tr>\n\n");
}
print("</table></center>\n");
} else {
print ("Nothing to Search For<br>\n");
}
print '<SCRIPT language="JavaScript">';
print "\n";
print 'threefooter(\'Return
to<br>shanen\\\'s<br>Home\',\'/index.html\',\'procbook\');';
print "\n";
print '</SCRIPT>';
print "\n";
# print "<hr size=8>";
# print "<center><P>End of Search Results<P><br>\nClick to go back to my
Index page </center>\n";
# print "<P><center>";
# print "<A http=\"/index.html\"><img src=\"/images/home.gif\"
border=0></a>";
# print "</center><P>";
# print "<hr size=8><P>";
print "</BODY></HTML>";
exit(0);
sub ignore3 {
# local ($acut,$bcut);
$acut = 0;
$bcut = 0;
if (index($a,'A ') == 0) {$acut = 2}
if (index($a,'An ') == 0) {$acut = 3}
if (index($a,'The ') == 0) {$acut = 4}
if (index($b,'A ') == 0) {$bcut = 2}
if (index($b,'An ') == 0) {$bcut = 3}
if (index($b,'The ') == 0) {$bcut = 4}
#print ("$a1 cmp $b1");
substr($a,$acut) cmp substr($b,$bcut)
# $a cmp $b
}
sub nicedate {
if ($form_values{'datetype'} eq 'human' ) {
if (substr(@_[0],4,1) eq '0') {
$foo = $months[substr(@_[0],2,2)].'
'.substr(@_[0],5,1)
} else {
$foo = $months[substr(@_[0],2,2)].'
'.substr(@_[0],4,2)
};
# Y2K window added to 1960-2059, slj 3/20/2000 including above #s
if (substr(@_[0],0,1) gt '5') {
$foo.', 19'.substr(@_[0],0,2)
} else {
$foo.', 20'.substr(@_[0],0,2)
}
# Yeah, I know it isn't pretty but I haven't touched PERL in a LONG time...
} else {
if ($form_values{'datetype'} eq 'usa' ) {
substr(@_[0],2,2).'/'.substr(@_[0],4,2).'/'.substr(@_[0],0,2)
} else {
@_[0]
}
}
}
sub html_parse
{
local($line, $length, $offset, @pairs);
# if ($ENV{"REQUEST_METHOD"} eq "GET") {
# $line = $ENV{"QUERY_STRING"};
# }
# elsif ($ENV{"REQUEST_METHOD"} eq "POST") {
#
# ($ENV{"CONTENT_TYPE"} ne "application/x-www-form-#urlencoded") &&
# &html_fatal("Illegal Content-Type '" . $ENV
# $length = $ENV{"CONTENT_LENGTH"};
# (($length =~ m/^\d+$/) == 0) &&
# &html_fatal("Content-Length variable not found");
#
# (read(STDIN, $line, $length) == $length) ||
# &html_fatal("Could not read form info from stdin: " .
#$length);
# }
# else {
# &html_fatal("Illegal Request-Method '" . $ENV
#{"REQUEST_METHOD"} . "'");
# }
# Probably not allowed to check ENV so just take the query string?
# Just use a GET and the one line?
$line = $ENV{"QUERY_STRING"};
((!defined($line)) || (length($line) == 0)) &&
&html_fatal("No values passed in");
@pairs = split(/[=&]/, $line);
(substr($line, $length - 1, 1) eq "=") &&
($pairs[@pairs] = "");
$offset = 0;
foreach $value (@pairs) {
($offset++ & 1) &&
&html_unescape($value);
}
return @pairs;
}
sub html_unescape
{
local($offset);
$_[0] =~ s/\+/ /g;
$offset = -1;
while (($offset = index($_[0], "%", $offset + 1)) >= 0) {
substr($_[0], $offset, 3) =
pack("c", hex(substr($_[0], $offset + 1, 2)));
}
}
sub html_fatal
{
print "<H3>Error Detected</H3><P>";
print @_[0], "\n";
exit 0;
}