touffik
Hi folks,
I'm trying to write a Ruby script that selects the content of an HTML
table in an HTML page.
I used Rubular to test my regexp, which is
/ <td class="TabIntCenContenuto"[^>]*>(.*) /
With Rubular, the result of my expression is:
Result 1
1. 12345678
Result 2
1. SAN FRANCESCO DA PAOLA
Result 3
1. Via San Francesco Da Paola, 10
Result 4
1. 10123
Result 5
1. TORINO
etc....
But with my script:
File.open('D:/testt/1.txt', 'r') do |filein|
  while line = filein.gets
    p line if line =~ /<td class="TabIntCenContenuto"[^>]*>/ .. line =~ /\/A /
  end
  fileout.puts p
end
end
I got this result:
"</td><td class=\"TabIntCenContenuto\">12345678 \n"
"</td><td class=\"TabIntCenContenuto\">SAN FRANCESCO DA PAOLA </td>\n"
"<td class=\"TabIntCenContenuto\">Via San Francesco Da Paola, 10 </td>\n"
"<td class=\"TabIntCenContenuto\">10123 </td>\n"
"<td class=\"TabIntCenContenuto\" align=\"left\">TORINO </td>\n"
I thought the .. between the two "line =~" tests worked like the (...) in
Rubular, which captures the content?
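For comparison, here is a minimal sketch of capturing with parentheses. As far as I understand, `..` between two conditions in an `if` is Ruby's flip-flop operator, which only selects a range of lines and does not capture anything. The sample string is adapted from the output above, and the variable names (`html`, `cell_re`, `cells`) are only illustrative; `[^<]*` is used in place of `.*` so the match stops at the next tag.

```ruby
# Sample input adapted from the output shown above (the real data is in the file).
html = <<~HTML
  </td><td class="TabIntCenContenuto">12345678
  </td><td class="TabIntCenContenuto">SAN FRANCESCO DA PAOLA </td>
  <td class="TabIntCenContenuto">Via San Francesco Da Paola, 10 </td>
  <td class="TabIntCenContenuto">10123 </td>
  <td class="TabIntCenContenuto" align="left">TORINO </td>
HTML

# The parentheses form capture group 1; [^<]* stops at the next "<".
cell_re = /<td class="TabIntCenContenuto"[^>]*>([^<]*)/

# String#scan returns one array of captured groups per match;
# flatten gives a plain list, strip removes surrounding whitespace.
cells = html.scan(cell_re).flatten.map(&:strip)

puts cells
```

This works on the whole file content at once (for example the string returned by `File.read`), so a cell whose closing tag is on the next line is still handled.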
Moreover, I would like to transform this HTML into XML, but I can't
figure out how to transform these HTML lines into XML.
<root>
<number>12345678</number>
But there is no 'name' attribute or whatever in the <td>, so doing a
match/replace would be difficult?
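One possible approach, sketched under the assumption that the cells always appear in the same order, is to name each value by its position instead of by an attribute. Only `<number>` comes from the example above; the other element names (`name`, `address`, `zip`, `city`) and the `xml_escape` helper are hypothetical.

```ruby
# Cell values in page order, taken from the results shown above.
cells = ["12345678", "SAN FRANCESCO DA PAOLA",
         "Via San Francesco Da Paola, 10", "10123", "TORINO"]

# Hypothetical element names, one per column; only "number" is from the post.
names = %w[number name address zip city]

# Escape the characters that are special inside XML text.
xml_escape = ->(text) { text.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;") }

# Pair each name with the cell at the same position and wrap it in a tag.
rows = names.zip(cells).map do |name, value|
  "  <#{name}>#{xml_escape.call(value)}</#{name}>"
end

xml = (["<root>"] + rows + ["</root>"]).join("\n")
puts xml
```

If the table holds several records, the same idea could be applied to each group of five cells, for example with `each_slice(5)`.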
...
So, if someone can help me, I would be very grateful.
Have a nice day.