Substitution with Hash

Lee Jarvis · Sep 11, 2007

OK, I'll try to explain what I mean as well as I can.

Let's say I have a hash like this:

hash = { 'a' => '1' } # just an example; it's actually far bigger

If a user inputs abcdabcd, I want it to sub all of the a's with 1's.

As I said, the hash is far larger, which is why I can't just do it with
gsub.

Any ideas?

Thanks in advance.

Lee

Lionel Bouton · Sep 11, 2007

yourstring.split(//).map{|c| hash[c] || c}.join
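
For anyone who wants to try the one-liner above, here is a minimal self-contained sketch. The mapping contents and the helper name `substitute_chars` are made up for illustration; only the `split(//).map { ... }.join` chain comes from the post.

```ruby
# Hypothetical mapping; the real hash in the question is much larger.
HASH = { 'a' => '1', 'b' => '2' }

# Replace each character that has an entry in the mapping and
# leave every other character unchanged.
def substitute_chars(str, mapping)
  str.split(//).map { |c| mapping[c] || c }.join
end

puts substitute_chars('abcdabcd', HASH)  # prints 12cd12cd
```

Because the string is split into single characters, this only helps when every key in the hash is exactly one character long.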

Lionel Bouton · Sep 11, 2007

Note that if your hash is only used to convert single characters to
single characters, you can use String#tr (or tr!). If you are after
performance, remember that you must first build the two strings that
String#tr expects from your hash, so you'll have to benchmark it to see
whether it's worth it in your use case, even if String#tr is faster in
itself.
If you are processing UTF-8 content, String#tr is probably not safe
(there are libraries out there for fixing this though, IIRC), but my
first answer probably is (assuming $KCODE='u'; require 'jcode'...), as
the regexp processing is UTF-8 aware, so the String#split should be safe.
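
A rough sketch of the String#tr idea, assuming a hash that maps single characters to single characters. The variable names and the sample mapping are illustrative only.

```ruby
# Hypothetical single-character mapping (not from the thread).
mapping = { 'a' => '1', 'b' => '2', 'c' => '3' }

# String#tr takes the characters to replace and their replacements as
# two parallel strings, so the keys and values are joined in hash order.
from = mapping.keys.join    # "abc"
to   = mapping.values.join  # "123"

result = 'abcdabcd'.tr(from, to)
puts result  # prints 123d123d
```

One caveat with this sketch: String#tr gives special meaning to `-` (ranges), a leading `^` (negation) and `\` (escapes), so keys containing those characters would need extra care before being joined.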

Lionel

Lee Jarvis · Sep 11, 2007

Thanks, that worked well. And no, it's not single chars, which is the only
reason I'm doing it this way.

I have to split on whitespace (/ /) because splitting on characters would
obviously split the text I want to transform. That means it won't match
if the text is attached to another word. HTML special chars, for
example:

h = {"&#126;" => "~"}

'hmm &#126;'.split(/ /).map{|c| h[c] || c}.join(' ')

Outputs hmm ~, but obviously doing things like question marks won't work.
Maybe I'll have to use loops and String#tr.

Robert Klemme · Sep 11, 2007

I'd rather not do the split step; IMHO direct replacement will be faster:

h = {"#126" => "~"}
# $1 is the entity without "&" and ";"; unknown entities are left as-is
s.gsub(/&([^;]+);/) { h[$1] || $& }

Btw, I believe there are standard classes that do this type of
replacement (entities in HTML documents) - maybe it's in CGI.
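
For the CGI suggestion, a small sketch using `CGI.unescapeHTML` from Ruby's standard library. The sample string is made up. As far as I know this method only covers the basic named entities plus numeric references, so it is not a full entity decoder.

```ruby
require 'cgi'

# CGI.unescapeHTML decodes the basic named entities (&amp; &lt; &gt; &quot;)
# and numeric character references such as &#126;.
decoded = CGI.unescapeHTML('hmm &#126; &lt;b&gt; &amp; more')
puts decoded  # prints hmm ~ <b> & more
```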

Kind regards

robert

Lionel Bouton · Sep 11, 2007

Robert Klemme said:
I'd rather not do the split step, IMHO direct replacement will be faster:

If it's all for HTML entities, yes. I'm not sure what the actual use
case is, though.

Robert Klemme said:
Btw, I believe there are standard classes that do this type of
replacement (entities in HTML documents) - maybe it's in CGI.

The htmlentities gem (more robust than CGI with UTF-8...) is quite good.

Daniel DeLorme · Sep 12, 2007

If you're just trying to translate numeric HTML entities, it's easy:

str.gsub(/&#(\d+);/){ [$1.to_i].pack('U') }

If you also want named entities, I suggest the htmlentities gem.
If it's for a more general case, how about:

rx = Regexp.new(hash.keys.map{|k|Regexp.escape(k)}.join("|"))
str.gsub(rx){ hash[$&] }
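
The general case above can be sketched as a runnable script. The mapping and input string are invented for illustration, and two things here are additions rather than part of the post: `Regexp.union` as a shorthand for escaping and joining the keys, and sorting the keys longest first. The hash form of gsub needs Ruby 1.9 or later.

```ruby
# Hypothetical mapping with multi-character keys (not from the thread).
mapping = { '&#126;' => '~', '&amp;' => '&', 'a' => '1' }

# Regexp.union escapes each string key and joins them with "|".
# Longer keys go first so that a short key cannot shadow a longer key
# that starts with the same characters.
keys = mapping.keys.sort_by { |k| -k.length }
rx = Regexp.union(keys)

input = 'hmm &#126;? a &amp; b'

# Block form, as in the post above: $& holds the matched text.
with_block = input.gsub(rx) { mapping[$&] }

# Hash form (Ruby 1.9 and later): the matched text is looked up in the hash.
with_hash = input.gsub(rx, mapping)

puts with_block  # prints hmm ~? 1 & b
```

Unlike the split-on-whitespace version, this matches a key wherever it appears, so a key followed directly by a question mark is still replaced.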

Daniel
