Making a random string

Lloyd Linklater

I have been trying to generate a random string. One approach in, say,
pascal would be something like this:

function GetRandomChar: char;
var
  r: integer;
begin
  r := random(36);
  case r of
    0..25: result := chr(ord('a') + r);
    else result := chr(ord('0') + r - 26);
  end;
end;

I know that there is something like "a".next but I need something more
like "a" + some_random_value. Even though it is more terse than the
Pascal, I am trying to avoid something time consuming and inelegant like

s = "a"
rand(26).times { s.next! }

Any suggestions?
 
Michael Kohl

I have been trying to generate a random string. One approach in, say,
pascal would be something like this:

function GetRandomChar: char;
var
r: integer;
begin
r := random(36);
case r of
0..25: result := chr(ord('a') + r);
else : result := chr(ord('0') + r);
end;
end;

'Translating' your Pascal program I'd do something like this:

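def get_random_char
  # In Ruby 1.9+ a character literal is a one-character String, so ord is
  # needed to get its character code before adding the offset.
  (r = rand(36)) < 26 ? ('a'.ord + r).chr : ('0'.ord + r - 26).chr
end

10.times { puts get_random_char } # => 10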

# >> 5
# >> x
# >> p
# >> k
# >> s
# >> x
# >> d
# >> w
# >> 9
# >> t
 
F. Senault

On 29 June 2009 at 17:18, Lloyd Linklater wrote:
Any suggestions?
=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
"n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "0",
"1", "2", "3", "4", "5", "6", "7", "8", "9"]

Or even:
[ *'a'..'z', *'0'..'9' ]
=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
"n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "0",
"1", "2", "3", "4", "5", "6", "7", "8", "9"]

So, if you need to compute a large number of random strings, store the
array aside in a constant:
CHARS = [ *'a'..'z', *'0'..'9' ]
=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
"n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "0",
"1", "2", "3", "4", "5", "6", "7", "8", "9"]

Then, you can use it to build your string:
def build_random_string(len)
  r = ''
  len.times { r << CHARS[rand(36)] }
  r
end
=> nil
build_random_string(10)
=> "fdf93xdwq5"
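
On Ruby 1.9 and later, Array#sample picks a random element directly, so the index arithmetic can be dropped. A minimal sketch, assuming the same 36-character alphabet; the names ALPHABET and random_string are made up for illustration and do not come from the post above:

```ruby
# Hypothetical names; same alphabet as CHARS above.
ALPHABET = [*'a'..'z', *'0'..'9'].freeze

def random_string(len)
  # Array.new with a block calls the block len times; sample picks
  # one element at random on each call.
  Array.new(len) { ALPHABET.sample }.join
end

puts random_string(10)
```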

Fred
 
Michael Kohl

'Translating' your Pascal program I'd do something like this:

Oh yeah, you said you want to build a string:

def generate_string(len)
  raise ArgumentError if len < 1
  (1..len).map{ get_random_char }.join
end

generate_string(10) # => "mjaxig1w35"

or

def generate_string(len)
  raise ArgumentError if len < 1
  s = ''
  len.times { s << get_random_char }
  s
end

generate_string(10) # => "dtuou833gq"
 
Robert Klemme

I have been trying to generate a random string. One approach in, say,
pascal would be something like this:

function GetRandomChar: char;
var
r: integer;
begin
r := random(36);
case r of
0..25: result := chr(ord('a') + r);
else : result := chr(ord('0') + r);
end;
end;

I know that there is something like "a".next but I need something more
like "a" + some_random_value. Even though it is more terse than the
Pascal, I am trying to avoid something time consuming and inelegant like

s = "a"
rand(26).times do {s.next!}

Any suggestions?

Stealing generate_id from my git repo:

http://github.com/rklemme/muppet-la...bf29cee37a0dd75a7869c99b10c7d/bin/test-gen.rb

Kind regards

robert
 
Bill Kelly

Lloyd Linklater said:
I have been trying to generate a random string.

I use:

def gen_random_string(len)
  (0...len).collect{rand(36).to_s(36)}.map{|x| (rand<0.5)?x:x.upcase}.join
end

...for short strings of length 64 or whatever. For very long strings, the
above may be a bit inefficient. (To generate a 1_000_000 character
string takes about 2.4 seconds on my system.)
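
One property of this approach worth knowing: rand(36).to_s(36) picks one of 36 symbols with equal probability, and the coin flip then splits each letter into two cases but leaves digits unchanged. Each digit therefore appears with probability 1/36, while each individual upper- or lower-case letter appears with probability 1/72. If all 62 characters should be equally likely, one option is to index into an explicit alphabet. A minimal sketch (ALNUM62 and uniform_random_string are hypothetical names, not part of the post above):

```ruby
# Hypothetical helper: each of the 62 characters is equally likely.
ALNUM62 = [*'a'..'z', *'A'..'Z', *'0'..'9'].freeze

def uniform_random_string(len)
  (0...len).map { ALNUM62[rand(ALNUM62.size)] }.join
end

puts uniform_random_string(64)
```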


Regards,

Bill
 
Robert Klemme

I use:

def gen_random_string(len)
(0...len).collect{rand(36).to_s(36)}.map{|x| (rand<0.5)?x:x.upcase}.join
end

...for short strings of length 64 or whatever. For very long strings, the
above may be a bit inefficient. (To generate a 1_000_000 character
string takes about 2.4 seconds on my system.)

Benchmark time!

robert

ruby 1.8.7 (2008-08-11 patchlevel 72) [i386-cygwin]
len method user system total real
1 generate_id 0.110000 0.000000 0.110000 ( 0.105000)
1 gen_random_string 0.015000 0.000000 0.015000 ( 0.015000)
1 g3 0.016000 0.000000 0.016000 ( 0.014000)
1 g4 0.000000 0.000000 0.000000 ( 0.007000)
2 generate_id 0.015000 0.000000 0.015000 ( 0.004000)
2 gen_random_string 0.016000 0.000000 0.016000 ( 0.021000)
2 g3 0.031000 0.000000 0.031000 ( 0.026000)
2 g4 0.016000 0.000000 0.016000 ( 0.006000)
4 generate_id 0.000000 0.000000 0.000000 ( 0.010000)
4 gen_random_string 0.031000 0.000000 0.031000 ( 0.030000)
4 g3 0.047000 0.000000 0.047000 ( 0.043000)
4 g4 0.016000 0.000000 0.016000 ( 0.011000)
8 generate_id 0.015000 0.000000 0.015000 ( 0.012000)
8 gen_random_string 0.047000 0.000000 0.047000 ( 0.051000)
8 g3 0.078000 0.000000 0.078000 ( 0.081000)
8 g4 0.032000 0.000000 0.032000 ( 0.021000)
16 generate_id 0.015000 0.000000 0.015000 ( 0.020000)
16 gen_random_string 0.078000 0.000000 0.078000 ( 0.083000)
16 g3 0.157000 0.000000 0.157000 ( 0.150000)
16 g4 0.046000 0.000000 0.046000 ( 0.044000)
32 generate_id 0.032000 0.000000 0.032000 ( 0.038000)
32 gen_random_string 0.156000 0.000000 0.156000 ( 0.160000)
32 g3 0.281000 0.000000 0.281000 ( 0.296000)
32 g4 0.094000 0.000000 0.094000 ( 0.087000)
64 generate_id 0.078000 0.000000 0.078000 ( 0.080000)
64 gen_random_string 0.313000 0.000000 0.313000 ( 0.312000)
64 g3 0.562000 0.000000 0.562000 ( 0.563000)
64 g4 0.188000 0.000000 0.188000 ( 0.177000)
128 generate_id 0.140000 0.000000 0.140000 ( 0.151000)
128 gen_random_string 0.625000 0.000000 0.625000 ( 0.638000)
128 g3 1.156000 0.000000 1.156000 ( 1.211000)
128 g4 0.360000 0.000000 0.360000 ( 0.364000)
256 generate_id 0.328000 0.000000 0.328000 ( 0.322000)
256 gen_random_string 1.172000 0.000000 1.172000 ( 1.236000)
256 g3 2.172000 0.000000 2.172000 ( 2.223000)
256 g4 0.703000 0.000000 0.703000 ( 0.781000)
512 generate_id 0.625000 0.000000 0.625000 ( 0.624000)
512 gen_random_string 2.422000 0.000000 2.422000 ( 2.502000)
512 g3 4.406000 0.000000 4.406000 ( 4.674000)
512 g4 1.406000 0.000000 1.406000 ( 1.453000)
ruby 1.9.1p129 (2009-05-12 revision 23412) [i386-cygwin]
len method user system total real
1 generate_id 0.000000 0.000000 0.000000 ( 0.002000)
1 gen_random_string 0.015000 0.000000 0.015000 ( 0.009000)
1 g3 0.016000 0.000000 0.016000 ( 0.016000)
1 g4 0.016000 0.000000 0.016000 ( 0.006000)
2 generate_id 0.000000 0.000000 0.000000 ( 0.002000)
2 gen_random_string 0.000000 0.000000 0.000000 ( 0.009000)
2 g3 0.031000 0.000000 0.031000 ( 0.018000)
2 g4 0.000000 0.000000 0.000000 ( 0.004000)
4 generate_id 0.000000 0.000000 0.000000 ( 0.003000)
4 gen_random_string 0.016000 0.000000 0.016000 ( 0.018000)
4 g3 0.031000 0.000000 0.031000 ( 0.022000)
4 g4 0.000000 0.000000 0.000000 ( 0.007000)
8 generate_id 0.015000 0.000000 0.015000 ( 0.005000)
8 gen_random_string 0.032000 0.000000 0.032000 ( 0.031000)
8 g3 0.031000 0.000000 0.031000 ( 0.032000)
8 g4 0.016000 0.000000 0.016000 ( 0.012000)
16 generate_id 0.015000 0.000000 0.015000 ( 0.013000)
16 gen_random_string 0.063000 0.000000 0.063000 ( 0.058000)
16 g3 0.047000 0.000000 0.047000 ( 0.054000)
16 g4 0.016000 0.000000 0.016000 ( 0.023000)
32 generate_id 0.031000 0.000000 0.031000 ( 0.029000)
32 gen_random_string 0.109000 0.000000 0.109000 ( 0.104000)
32 g3 0.110000 0.000000 0.110000 ( 0.112000)
32 g4 0.031000 0.000000 0.031000 ( 0.040000)
64 generate_id 0.063000 0.000000 0.063000 ( 0.060000)
64 gen_random_string 0.218000 0.000000 0.218000 ( 0.232000)
64 g3 0.203000 0.000000 0.203000 ( 0.209000)
64 g4 0.094000 0.000000 0.094000 ( 0.095000)
128 generate_id 0.140000 0.000000 0.140000 ( 0.151000)
128 gen_random_string 0.407000 0.000000 0.407000 ( 0.503000)
128 g3 0.375000 0.000000 0.375000 ( 0.380000)
128 g4 0.187000 0.000000 0.187000 ( 0.180000)
256 generate_id 0.313000 0.000000 0.313000 ( 0.365000)
256 gen_random_string 0.812000 0.000000 0.812000 ( 0.806000)
256 g3 0.688000 0.000000 0.688000 ( 0.708000)
256 g4 0.359000 0.000000 0.359000 ( 0.352000)
512 generate_id 0.578000 0.000000 0.578000 ( 0.599000)
512 gen_random_string 1.547000 0.000000 1.547000 ( 1.549000)
512 g3 1.328000 0.000000 1.328000 ( 1.339000)
512 g4 0.735000 0.000000 0.735000 ( 0.807000)


require 'benchmark'

def generate_id(len = 15)
  s = ''
  len.times { s << 97 + rand(26) }
  s.freeze
end

def gen_random_string(len)
  (0...len).collect{rand(36).to_s(36)}.map{|x| (rand<0.5)?x:x.upcase}.join
end

def g3(len)
  s = "." * len
  s.gsub!(/./) { (97 + rand(26)).chr }
  s
end

def g4 len
  s = "." * len
  # overwrite the i-th placeholder character in place
  len.times {|i| s[i] = (97 + rand(26)).chr}
  s
end

REP = 1000

Benchmark.bm 25 do |b|
  len = 1

  while len < 1_000

    b.report '%7d generate_id' % len do
      REP.times do
        generate_id len
      end
    end

    b.report '%7d gen_random_string' % len do
      REP.times do
        gen_random_string len
      end
    end

    b.report '%7d g3' % len do
      REP.times do
        g3 len
      end
    end

    b.report '%7d g4' % len do
      REP.times do
        g4 len
      end
    end

    len <<= 1
  end
end
 
Brian Candler

Lloyd said:
function GetRandomChar: char;
var
r: integer;
begin
r := random(36);
case r of
0..25: result := chr(ord('a') + r);
else : result := chr(ord('0') + r);
end;
end;

In this case, you can just do:

def get_random_char
  rand(36).to_s(36)
end

Since Ruby allows arbitrary bignums, you can also get strings this way
too. e.g. for an 8-digit string:

rand(36 ** 8).to_s(36)

However there's a bug there, because numbers with one or more leading
zeros will be truncated. How to left-pad a non-decimal number with zeros
isn't actually that obvious. Maybe someone can point out something
simpler than this:

("0"*8 + rand(36 ** 8).to_s(36))[-8..-1]

(Unfortunately, "%08s" as a format string pads with spaces not zeros)
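
The truncation is easy to see with fixed values instead of random ones. Values below 36**7 need fewer than eight base-36 digits, and they make up 36**7 / 36**8 = 1/36 of the range, so roughly one result in 36 comes out short. The sketch below uses only Integer#to_s with a radix and the padding expression from above; the variable names are arbitrary:

```ruby
puts 35.to_s(36)               # z        (largest single base-36 digit)
puts 36.to_s(36)               # 10
puts((36**8 - 1).to_s(36))     # zzzzzzzz (largest 8-digit value)

n = 5                          # stands in for a small rand(36 ** 8) result
short  = n.to_s(36)
padded = ("0" * 8 + short)[-8..-1]
puts short                     # 5        (one character, not eight)
puts padded                    # 00000005
```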
 
Robert Klemme

2009/6/30 Brian Candler said:
In this case, you can just do:

def get_random_char
  rand(36).to_s(36)
end

Since Ruby allows arbitrary bignums, you can also get strings this way
too. e.g. for an 8-digit string:

    rand(36 ** 8).to_s(36)

Interesting idea! Does rand have enough precision to fill arbitrarily
large numbers?

However there's a bug there, because numbers with one or more leading
zeros will be truncated. How to left-pad a non-decimal number with zeros
isn't actually that obvious. Maybe someone can point out something
simpler than this:

    ("0"*8 + rand(36 ** 8).to_s(36))[-8..-1]

(Unfortunately, "%08s" as a format string pads with spaces not zeros)

irb(main):012:0> "e".rjust 10, "0"
=> "000000000e"
irb(main):013:0>

Kind regards

robert


--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
 
Brian Candler

Robert said:
irb(main):012:0> "e".rjust 10, "0"
=> "000000000e"
irb(main):013:0>

Neat, thanks. So:

rand(36 ** 8).to_s(36).rjust(8,"0")

Useful for random binary strings too:

rand(2 ** 8).to_s(2).rjust(8,"0")
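
If the strings serve as tokens or identifiers that must be hard to guess, the standard library's SecureRandom may be a better source than Kernel#rand. A sketch that generalises the rjust idiom above to any base from 2 to 36, assuming Ruby 2.5 or later for SecureRandom.alphanumeric; the helper name random_base_string is hypothetical:

```ruby
require 'securerandom'

# Hypothetical helper; len must be at least 1, base between 2 and 36.
def random_base_string(len, base = 36)
  SecureRandom.random_number(base**len).to_s(base).rjust(len, "0")
end

puts random_base_string(8)         # 8 characters from 0-9 and a-z
puts random_base_string(8, 2)      # 8 binary digits
puts SecureRandom.hex(4)           # 8 hex characters (4 random bytes)
puts SecureRandom.alphanumeric(8)  # 8 characters from A-Z, a-z, 0-9
```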
 
Lloyd Linklater

All very groovy stuff! The final version I think I shall use is a bit
of an amalgam of your input.

def randString(len)
  s = ""
  1.upto(len) { s << rand(36).to_s(36) }
  s.upcase
end

At one point I had "unless len < 1" but this functions the same way.
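
The guard is unnecessary because 1.upto(len) simply does not iterate when len is less than 1, so the method falls through and returns an empty string. A small check, using a snake_case copy of the method (the name rand_string is only for this sketch):

```ruby
def rand_string(len)
  s = ""
  1.upto(len) { s << rand(36).to_s(36) }
  s.upcase
end

p rand_string(0)         # => ""
p rand_string(-3)        # => ""
p rand_string(5).length  # => 5
```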

Thanks again, everyone!
 
Robert Klemme

All very groovy stuff! The final version I think I shall use is a bit
of an amalgam of your input.

def randString(len)
  s = ""
  1.upto(len) { s << rand(36).to_s(36) }

I'd prefer len.times but this is just cosmetic.

  s.upcase

You can probably squeeze out a bit of performance, especially for large
strings, by replacing this with "s.upcase!; s".

end

At one point I had "unless len < 1" but this functions the same way.

Thanks again, everyone!

You're welcome!

Kind regards

robert
 
Brian Candler

Robert said:
You can probably squeeze out a bit of performance, especially for large
strings, by replacing this with "s.upcase!; s".

A very bad idea IMO. Why make your code larger and less readable for the
sake of perhaps one microsecond or less? If performance matters on this
microscopic scale, you should be writing in C.
 
Eleanor McHugh

A very bad idea IMO. Why make your code larger and less readable for the
sake of perhaps one microsecond or less? If performance matters on this
microscopic scale, you should be writing in C.


I agree that "s.upcase!; s" is ugly, and I really wish the bang
methods returned self on success and raised an exception on error as
that's closer to how I use them than the current approach, but it's
still a well-established idiom and hardly likely to confuse even a
neophyte so long as they bother to RTFM.

As to the notion that we should only write performant code in C... why
complicate a codebase by using two languages (with all the debugging
nightmare that can entail) if the language you're already working in
is capable of doing the job anyway?


Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-brains.net
 
Brian Candler

Eleanor said:
As to the notion that we should only write performant code in C... why
complicate a codebase by using two languages (with all the debugging
nightmare that can entail) if the language you're already working in
is capable of doing the job anyway?

"Doing the job" is the critical part of that sentence.

In my opinion, if (and only if) your existing program won't do the job
within acceptable parameters, should you start modifying the code to
make it acceptable. Since you should have "done the simplest thing that
will possibly work" in the first place, then by definition, the modified
code will be more complex.

But more importantly: profile first, modify second. I find it highly
unlikely that in a real-world program, removing that one single hidden
string dup will make a noticeable improvement. More likely you'll want
to change your algorithm or data structures.

Of course, if in your particular application this change *does* improve
performance noticeably, then by all means make the change (and add a
comment as to why it was necessary to write it in a non-obvious way, so
that somebody doesn't simplify it back again later). But I think that's
the point: write more complex code *only* if it makes a measurable
improvement, not on the off-chance that it might.
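
In that spirit, a measurement is cheap to set up with the standard benchmark library. The sketch below compares the two variants under discussion; the method names, string length, and repetition count are arbitrary choices for illustration, and timings will differ by machine and Ruby version:

```ruby
require 'benchmark'

# Variant 1: upcase allocates and returns a new string.
def rand_string_copy(len)
  s = ""
  len.times { s << rand(36).to_s(36) }
  s.upcase
end

# Variant 2: upcase! modifies s in place. It returns nil when nothing
# changed, so s is returned explicitly.
def rand_string_in_place(len)
  s = ""
  len.times { s << rand(36).to_s(36) }
  s.upcase!
  s
end

REPS = 1_000

Benchmark.bm(20) do |b|
  b.report("upcase (copy)")      { REPS.times { rand_string_copy(64) } }
  b.report("upcase! (in place)") { REPS.times { rand_string_in_place(64) } }
end
```

Whether any difference shows up above the noise is exactly the question the measurement is meant to answer.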
 
Robert Klemme

2009/7/1 Brian Candler said:
A very bad idea IMO. Why make your code larger and less readable for the
sake of perhaps one microsecond or less?

I do not subscribe to the "less readable" assessment of yours -
uglier, yes. Also note that a microsecond per execution can be
harmful when the method is invoked often and / or the rest of the code
is not much costlier.

If performance matters on this
microscopic scale, you should be writing in C.

I couldn't have put it better than Ellie. Notice that object
allocation is one of the most expensive operations in Ruby. So it may
pay off to save one. Btw, this is also the reason why my solution
only works with Fixnums.

Apart from that I find unnecessary object creation ugly. You may call
that personal taste but with GC in mind there is also a quantifiable
reason to not waste objects.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
 
David A. Black

Hi --

I agree that "s.upcase!; s" is ugly, and I really wish the bang methods
returned self on success and raised an exception on error as that's closer to
how I use them than the current approach, but it's still a well-established
idiom and hardly likely to confuse even a neophyte so long as they bother to
RTFM.

I don't think the distinction is between success and failure, though.
(str="ABC").upcase! succeeds -- it just doesn't change str. (I'm not a
huge fan of the nil returns either, by the way.)

Of course there's always:

str = "ABC"
str.tap(&:upcase!)

:)
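
For readers unfamiliar with the behaviour being discussed: String#upcase! returns nil when the string was already upper case, which is why its return value cannot be used as the method's result. Object#tap yields the receiver to the block and then returns the receiver, regardless of what the block returns. A short illustration with arbitrary variable names:

```ruby
a = "abc"
p a.upcase!          # => "ABC"  (a was modified, so a is returned)

b = "ABC"
p b.upcase!          # => nil    (nothing to change)
p b                  # => "ABC"  (b is still intact)

c = "ABC"
p c.tap(&:upcase!)   # => "ABC"  (tap always returns its receiver)
```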


David

--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Now available: The Well-Grounded Rubyist (http://manning.com/black2)
"Ruby 1.9: What You Need To Know" Envycasts with David A. Black
http://www.envycasts.com
 
Eleanor McHugh

I don't think the distinction is between success and failure, though.
(str="ABC").upcase! succeeds -- it just doesn't change str. (I'm not a
huge fan of the nil returns either, by the way.)

Of course there's always:

str = "ABC"
str.tap(&:upcase!)

:)

I'm still adjusting to these 1.9 conveniences ;)


Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-brains.net
 
Eleanor McHugh

I couldn't have put it better than Ellie. Notice that object
allocation is one of the most expensive operations in Ruby. So it may
pay off to save one. Btw, this is also the reason why my solution
only works with Fixnums.

Apart from that I find unnecessary object creation ugly. You may call
that personal taste but with GC in mind there is also a quantifiable
reason to not waste objects.

I share that view. Being promiscuous with resources just because it's
simple to do so seems like a sure-fire way to build applications with
intrinsic scalability problems. It may save me some effort today, but
experience taught me long ago that it's a decision that will come back
to haunt me.


Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-brains.net
 
