FILE parsing problems

G

Hi, I'm a newbie to Perl and find regular expressions a mystery.
Anyway, I need to parse a file and display information in an HTML
file. My problem is that I don't know how to parse this stuff. Below
is the sample file showing 2 entries. After that I list my problem
code snip.

Thanks


"Title1" "Title2

" "Page:1" "Date: 12/15/2003 " "Time: 11:28:05AM
" "Sale
Number
" "
Sale Type
" "
Date Assigned
" "
Company Name
" "S034521" "Software Agreement" 11/08/2003 "Joes Garage" "2 Years
support"
"Title1" "Title2

" "Page:1" "Date: 12/15/2003 " "Time: 11:28:05AM
" "Sale
Number
" "
Sale Type
" "
Date Assigned
" "
Company Name
" "S034522" "Hardware" 12/11/2003 "JK & J INC." "Backup Tape"


while (<FILE>) {

#HERE IS WHERE I AM HAVING MY PROBLEM - I can't get a match
if ( ($snum, $type, $date, $comp_name) =
/\"+\"\t+\"+\"\t+\"+\"\t+\"+\"\t+\"+\"\t+\"(.+)\"\t+\"(.+)\"(.+?)\"(.+)\"\s*$/ix
) {

if ($snum !~ /^d/i) { next; }
$date =~ s/^\t*//; $date =~ s/\t*$//;
($ddate) = split(/\t/, $date);
$ddate = "&nbsp;" if (! $ddate);
$_ = <FILE>;
($description) = /^\"(.+)\"\s*$/; # dnum
if ($type !~ /^\s*$/) {
$output .=
PutRec($snum, $type, $ddate, $comp_name, $description);
}
}
}
 
Brian McCauley

I suspect this is the FAQ: "How can I split a [character] delimited
string except when inside [character]?"
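
One way to approach that kind of splitting is the Text::ParseWords module that ships with Perl. The following is only a minimal sketch: it assumes fields are separated by whitespace and contain no embedded quote characters, and the sub name quoted_fields is made up for illustration. Note that quotewords also treats an apostrophe as a quote character, so a value such as "Joe's Garage" would need extra care.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Hypothetical helper: split a line into whitespace-separated fields,
# treating anything inside double quotes as a single field.
sub quoted_fields {
    my ($line) = @_;
    # The second argument (0) asks quotewords to strip the quotes
    # from the fields it returns.
    return quotewords('\s+', 0, $line);
}

my @fields = quoted_fields(
    '"S034522" "Hardware" 12/11/2003 "JK & J INC." "Backup Tape"'
);
print "$_\n" for @fields;
```

Whitespace inside the quotes is kept, so trimming is still up to the caller.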

--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\
 
G

I'm not so sure about that (of course, I'm a newbie and couldn't
understand the answer anyway):

undef @fields;
push(@fields, defined($1) ? $1 : $3)
    while m/"([^"\\]*(\\.[^"\\]*)*)"|([^,]+)/g;

There are no quotes within quotes in my example. These are simply a
number of strings within quotes. Maybe the sample file I'm showing is
too complicated. Say I have a file which contains the entries:
" hello1" "hello2 " "bye3 " " bye4 "

and I want to read the contents of the 2nd and 4th strings, how can I
do it?

Thanks,

G
 
A. Sinan Unur

G wrote:
Say I have a file which contains the entries:
" hello1" "hello2 " "bye3 " " bye4 "

and I want to read the contents of the 2nd and 4th strings, how can I
do it?

use strict;
use warnings;

my $input = '" hello1" "hello2 " "bye3 " " bye4 "';
my @fields = $input =~ /"\s*(\w+)\s*"/g;
print $fields[1], "\n", $fields[3];

__END__

C:\Home> perl t.pl
hello2
bye4

How about perldoc perlre?
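
The \w+ pattern above fits the hello/bye example, but values from the sample file such as "Software Agreement" or "JK & J INC." contain spaces and punctuation. A possible variation, sketched under the assumption that no field contains an embedded double quote (the sub name quoted_strings is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: return the contents of every double-quoted string,
# with leading and trailing whitespace trimmed.
# Assumes there are no double quotes inside a field.
sub quoted_strings {
    my ($text) = @_;
    # [^"]*? matches any run of non-quote characters, including spaces,
    # punctuation and newlines; the \s* on either side does the trimming.
    return $text =~ /"\s*([^"]*?)\s*"/g;
}

my @fields = quoted_strings('"S034522" "Software Agreement" "JK & J INC."');
print "$_\n" for @fields;
```

Anything outside quotes, such as the bare date in the sample file, is skipped by this pattern and would need its own alternative.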
 
Mahesha

G said:
Hi, I'm a newbie to Perl and find regular expressions a mystery.
Anyway, I need to parse a file and display information in an HTML
file. My problem is that I don't know how to parse this stuff. Below
is the sample file showing 2 entries. After that I list my problem
code snip.
while (<FILE>) {

I am not sure line-by-line processing is the right approach in your
case, because some fields span multiple lines, unless you have undef'd
$/ earlier. But then there is no 's' modifier in your RE. Or maybe the
news client wrapped the lines.
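
To illustrate the difference, here is a small sketch of reading the whole input at once by localizing $/. The in-memory filehandle and the sub name slurp are stand-ins invented for the example, not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: read everything from an open filehandle in one go.
sub slurp {
    my ($fh) = @_;
    local $/;               # undefined record separator, only inside this sub
    my $content = <$fh>;    # so a single read returns the entire input
    return $content;
}

# In-memory filehandle standing in for the real FILE handle.
my $sample = qq{"Backup Tape" "2 Years\nsupport"\n};
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $text = slurp($fh);
close $fh;

# A quoted field that spans a line break can now be matched in one piece.
my @fields = $text =~ /"([^"]*)"/g;
print scalar(@fields), " fields\n";
```

A negated class like [^"] matches newlines on its own; the 's' modifier only matters when the pattern relies on '.' to cross line breaks.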
#HERE IS WHERE I AM HAVING MY PROBLEM - I can't get a match
if ( ($snum, $type, $date, $comp_name) =
/\"+\"\t+\"+\"\t+\"+\"\t+\"+\"\t+\"+\"\t+\"(.+)\"\t+\"(.+)\"(.+?)\"(.+)\"\s*$/ix
) {

Please consider the RE in the while condition of the following script.
The rest of the script is just HTML output processing.

If this is a CGI script, I'd prefer using the CGI module.
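
Whichever way the HTML is produced, values such as "JK & J INC." contain characters that are special in HTML and are usually escaped before being printed into a page. The sketch below does this by hand; escape_html is a hypothetical helper, not something from the script that follows.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in
# HTML text and attribute values.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

print escape_html('JK & J INC.'), "\n";
```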

HTH,
Mahesh.
---------------------------
#!/usr/local/bin/perl -w
use strict;

undef $/;

print "Content-type: text/html\n\n";

my $string = <DATA>;
my @fields;
while ($string =~ /\"\s*([^"]*?)\s*\"   # " anything-non-quote "
                   |                    # Or
                   ([\d\/]+)            # that date outside of quotes
                  /xsg) {               # g: resume after the previous match
    push (@fields, (defined $1) ? $1 : $2); # correct me if I am wrong,
                                            # one of $1/$2 will be
                                            # defined if this line is
                                            # reached, no?
}

print qq {<html>
<head><title>Sample table</title>
<style>
body {font-family: verdana;}
table {font-size:10px;border:solid 1px #eeeeee;}
th {background-color:#eeeeee;text-align:left;}
</style>
</head>
<body>};
# Offsets and Indices are a total guesswork
for my $sale (0..1) {
$fields[$sale*14+3] =~ s/.*?://;
$fields[$sale*14+4] =~ s/.*?://;
print qq {<p><table width="350" cellspacing=0 cellpadding=2>
<tr>
<th width="50%">
$fields[$sale*14+5]
</th>
<th width="50%">
$fields[$sale*14+9]
</th>
</tr>
<tr>
<td width="50%">
When
</td>
<td width="50%">
$fields[$sale*14+3],
$fields[$sale*14+4]
</td>
</tr>
<tr>
<td width="50%">
$fields[$sale*14+6]
</td>
<td width="50%">
$fields[$sale*14+10]
</td>
</tr>
<tr>
<td width="50%">
$fields[$sale*14+7]
</td>
<td width="50%">
$fields[$sale*14+11]
</td>
</tr>
<tr>
<td width="50%">
$fields[$sale*14+8]
</td>
<td width="50%">
$fields[$sale*14+12]
</td>
</tr>
<tr>
<td width="50%">
Notes
</td>
<td width="50%">
$fields[$sale*14+13]
</td>
</tr>
</table>
};
}
print qq {</body></html>};

__DATA__
"Title1" "Title2

" "Page:1" "Date: 12/15/2003 " "Time: 11:28:05AM
" "Sale
Number
" "
Sale Type
" "
Date Assigned
" "
Company Name
" "S034521" "Software Agreement" 11/08/2003 "Joe's Garage" "2 Years
support"
"Title1" "Title2

" "Page:1" "Date: 12/15/2003 " "Time: 11:28:05AM
" "Sale
Number
" "
Sale Type
" "
Date Assigned
" "
Company Name
" "S034522" "Hardware" 12/11/2003 "JK & J INC." "Backup Tape"
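
(Not part of the data above.) The script indexes one flat array with offsets of 14 fields per sale. As a rough alternative, the flat list could be cut into one hash per record so the output code can use names instead of offsets. The field names and the group_records sub are assumptions made for this sketch; only the record size of 14 and the order of the last five values come from the script and sample data.

```perl
use strict;
use warnings;

# Assumed names for the last five fields of each record.
my @names = qw(number type date company notes);

# Hypothetical helper: cut a flat field list into records of $size fields
# and give names to the last five values of each record.
sub group_records {
    my ($size, @fields) = @_;
    my @records;
    while (@fields >= $size) {
        my @chunk = splice(@fields, 0, $size);   # take the next record
        my %record;
        @record{@names} = @chunk[-5 .. -1];      # hash slice over the tail
        push @records, \%record;
    }
    return @records;
}

# Nine placeholder header fields followed by the five values of one sale.
my @flat = (('header') x 9,
            'S034522', 'Hardware', '12/11/2003', 'JK & J INC.', 'Backup Tape');
my @records = group_records(14, @flat);
print "$records[0]{number}: $records[0]{company}\n";
```

Any leftover fields that do not fill a complete record are ignored by this sketch.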
 
