converting the string

Vlad Smith · Jul 1, 2009

Hi everybody! I'm sorry for asking a silly question. I hope you'll
find some time to assist.

While searching through an HTML body I get this string:
<td class=blk11 ><img
src='https://sc.omniture.com/sc13_5/reports/chart.php?id=CPRIY_6NvJ_kj7s&s=www461.sj2&type=GIF'
width=624 height=280 border=0 align=absmiddle
usemap=#imIY_6NvJ_kj7s></td>

How can I modify it to get just the plain link, without the tags and other attributes?
https://sc.omniture.com/sc13_5/reports/chart.php?id=CPRIY_6NvJ_kj7s&s=www461.sj2&type=GIF

Greg Willits · Jul 1, 2009

You'll have to expand this to take care of all possible scenarios, but
here's an example:

x = "<img src=\"http://some_url_to_scrape\">"
y = x.scan(/src=\"([\S\s]+?)\"/)

That will return an array, and you'll have to fish the string out of the
array with y[0][0].
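
To make the shape of that return value concrete, here is a minimal sketch using the same placeholder URL. The helper names `all_srcs` and `first_src` are made up for illustration. When the pattern has a capture group, `String#scan` returns one inner array per match. `String#[]` with a capture index is an alternative when only the first match matters.

```ruby
# Hypothetical helper: returns every double-quoted src value in the string.
def all_srcs(html)
  # scan yields one inner array per match, holding that match's captures
  html.scan(/src="([\S\s]+?)"/).map { |captures| captures[0] }
end

# Hypothetical helper: returns only the first src value, or nil if there is none.
def first_src(html)
  # String#[] with a capture index returns that group from the first match
  html[/src="([\S\s]+?)"/, 1]
end

x = "<img src=\"http://some_url_to_scrape\">"
p x.scan(/src="([\S\s]+?)"/)  # nested array: one inner array per match
puts first_src(x)
```

Like the original expression, both helpers only recognize double-quoted values.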

That works specifically with double-quoted attribute values, whereas your
example above uses single quotes, but you can either use multiple
expressions, or build a more complex one to cover the various cases of
quotes, href, src, and other attribute names, etc.
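
As a rough sketch of what such a combined expression might look like, the pattern below accepts `src` or `href` with double-quoted, single-quoted, or unquoted values. The constant `ATTR_URL` and the method `attribute_urls` are hypothetical names, and the pattern is an assumption about which cases matter. It is not a complete treatment of HTML attribute syntax.

```ruby
# Hypothetical pattern: src= or href= followed by a double-quoted,
# single-quoted, or unquoted value. Only one of the three groups is
# set for any given match.
ATTR_URL = /\b(?:src|href)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))/i

# Hypothetical helper: returns the attribute values found in the string.
def attribute_urls(html)
  # compact drops the two groups that did not participate in the match
  html.scan(ATTR_URL).map { |captures| captures.compact.first }
end

html = "<td class=blk11 ><img\n" \
       "src='https://sc.omniture.com/sc13_5/reports/chart.php?id=CPRIY_6NvJ_kj7s&s=www461.sj2&type=GIF'\n" \
       "width=624 height=280 border=0 align=absmiddle\n" \
       "usemap=#imIY_6NvJ_kj7s></td>"

puts attribute_urls(html)
```

The result is always an array, so `attribute_urls(html).first` gives the single link in the case above.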

There are other ways to do it; this is just a small example to give you
some ideas.

-- gw

Vlad Smith · Jul 1, 2009

Thanks! That worked!

I also accidentally noticed a great feature taken from Perl that also
worked:

x = %q(<td class=blk11 ><img
src='https://sc.omniture.com/sc13_5/reports/chart.php?id=CPRIY_6NvJ_kj7s&s=www461.sj2&type=GIF'
width=624 height=280 border=0 align=absmiddle
usemap=#imIY_6NvJ_kj7s></td>)
x = $1 if x =~ /.*(https.*GIF).*/
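
One thing to watch with that pattern: the greedy `.*` inside the group runs to the last `GIF` on the line, so it can pick up text past the closing quote. A slightly tighter variant, sketched below, stops at the first quote, whitespace, or `>`. The method name `first_https_url` is hypothetical, and it assumes the link starts with `https://`.

```ruby
# Hypothetical helper: returns the first https URL in the text, or nil.
def first_https_url(text)
  # [^'"\s>]+ stops at a closing quote, whitespace, or the end of the tag
  text[%r{https://[^'"\s>]+}]
end

x = "<td class=blk11 ><img\n" \
    "src='https://sc.omniture.com/sc13_5/reports/chart.php?id=CPRIY_6NvJ_kj7s&s=www461.sj2&type=GIF'\n" \
    "width=624 height=280 border=0 align=absmiddle\n" \
    "usemap=#imIY_6NvJ_kj7s></td>"

puts first_https_url(x)
```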

Aaron Patterson · Jul 1, 2009

Please don't do this. Every time you parse HTML with a regular
expression, a kitten dies.

Instead, try using an HTML parsing library:

require 'nokogiri'

x = <<-eohtml
<td class=blk11 ><img
src='https://sc.omniture.com/sc13_5/reports/chart.php?id=CPRIY_6NvJ_kj7s&s=www461.sj2&type=GIF'
width=624 height=280 border=0 align=absmiddle
usemap=#imIY_6NvJ_kj7s></td>
eohtml

# Parse the markup, find the first img element, and read its src attribute
puts Nokogiri::HTML(x).at('img')['src']
