Opened 10 years ago

Closed 8 years ago

#2272 closed defect (fixed)

Improve consistency in random generator usage

Reported by: neteler Owned by: grass-dev@…
Priority: major Milestone: 7.0.5
Component: Default Version: svn-releasebranch70
Keywords: random Cc:
CPU: Unspecified Platform: Unspecified

Description

The usage of drand48()/srand48() should be consolitated, perhaps using G_math_rand() which is provided in lib/gmath/rand1.c:

# incomplete search
find . -name '*.c' | xargs grep -l rand48
./raster/r.mapcalc/xrand.c
./raster/r.mapcalc/evaluate.c
./raster/r.random/creat_rand.c
./lib/gmath/rand1.c
./lib/vector/rtree/rect.c
./vector/v.qcount/findquads.c
./vector/v.random/main.c
./vector/v.kcv/main.c


grep rand48 */*/*.c | grep -v flag
lib/gmath/rand1.c:	srand48(-seed);
lib/gmath/rand1.c:    return (float)drand48();
raster/r.mapcalc/evaluate.c:    srand48(seed_value);
raster/r.mapcalc/xrand.c:#define drand48() ((double)rand()/((double)RAND_MAX + 1))
raster/r.mapcalc/xrand.c:#define mrand48() ((long)rand())
raster/r.mapcalc/xrand.c:		unsigned long x = (unsigned long)mrand48();
raster/r.mapcalc/xrand.c:		double x = drand48();
raster/r.mapcalc/xrand.c:		double x = drand48();
raster/r.random/creat_rand.c:#define lrand48() (((long) rand() ^ ((long) rand() << 16)) & 0x7FFFFFFF)
raster/r.random/creat_rand.c:#define srand48(sv) (srand((unsigned)(sv)))
raster/r.random/creat_rand.c:    return lrand48();
raster/r.random/creat_rand.c:    srand48((long)time((time_t *) 0));
vector/v.kcv/main.c:	rng = drand48;
vector/v.kcv/main.c:	srand48((long)getpid());
vector/v.qcount/findquads.c:   #define R_INIT srand48
vector/v.qcount/findquads.c:   #define RANDOM(lo,hi) (drand48()/R_MAX*(hi-lo)+lo)
vector/v.random/main.c:double drand48()
vector/v.random/main.c:#define srand48(sv) (srand((unsigned)(sv)))
vector/v.random/main.c:	struct Flag *rand, *drand48, *z, *notopo;
vector/v.random/main.c:	rng = drand48;
vector/v.random/main.c:	    srand48((long)seed);
vector/v.random/main.c:	    srand48((long)getpid());

See also r53234 etc.

Attachments (1)

lrand48.c (1.3 KB ) - added by glynn 10 years ago.
lrand48 implementation

Download all attachments as: .zip

Change History (12)

by glynn, 10 years ago

Attachment: lrand48.c added

lrand48 implementation

comment:1 by glynn, 10 years ago

Replying to neteler:

The usage of drand48()/srand48() should be consolitated, perhaps using G_math_rand() which is provided in lib/gmath/rand1.c:

First, G_math_rand() should be fixed. Currently, it returns a value strictly less than 1.0 if drand48 is used, or a value less than or equal to 1.0 if rand() is used.

Also, seeding should be a separate function.

And we should probably have a portable equivalent to lrand48(), i.e. something which returns a non-negative 31-bit integer. The version in r.random is broken, as bit 15 will always be zero if RAND_MAX is 32767 (the minimum value allowed by the ISO C standards).

My suggestion would be to re-implement the LCG used by lrand48() etc. Sample implementation

comment:2 by neteler, 10 years ago

Priority: normalblocker

comment:3 by neteler, 10 years ago

For the record, from grass-dev:

On Sun, Jul 27, 2014 at 1:58 AM, Glynn Clements wrote:

Most uses of rand, random, lrand48, etc have been replaced in r61415 and r61416.

Many of these modules seeded the RNG from either the clock or the PID, with no way to provide an explicit seed. In such cases, the use of G_srand48_auto() has been marked with a "FIXME" comment.

It would be possible to modify G_srand48_auto() to allow the use of an environment variable, but this has problems of its own (e.g. setting the variable manually then forgetting about it).

r.li.daemon/daemon.c uses a hard-coded seed of zero.

r61416 changes readcell.c in r.proj, i.rectify, and i.ortho.rectify. These all used rand() to select a random cache block for ejection.

While this wouldn't have affected the result, only the first RAND_MAX cache blocks would have been used. RAND_MAX is only guaranteed to be at least 32767, which would limit the effective cache size to 1 GiB (each block is 64 * 64 = 4096 "double"s = 32kiB).

Cases which haven't been changed are:

lib/raster3d/test/test_put_get_value_large_file.c. This appears to be a test case; does it matter?

include/iostream/quicksort.h uses random() or rand() to select a pivot for the quicksort algorithm. That file has no dependency on lib/gis (or anything else, except for <stdlib.h> for random/rand), and I didn't want to add one unnecessarily.

Again, this shouldn't affect the result, but there may be performance issues if the size of the array being sorted is significantly larger than RAND_MAX (in this situation, the algorithm will be O(n2) even in the best case).

Unless there's a specific reason not to, it may be better to simply replace all uses of that file with std::sort() from <algorithm>.

comment:4 by neteler, 10 years ago

Priority: blockercritical

Bundled backport to relbranch70 in r61471. Downgrading priority.

in reply to:  3 comment:5 by neteler, 9 years ago

Priority: criticalmajor

Downgrading priority (or to be closed?).

Replying to neteler:

For the record, from grass-dev:

On Sun, Jul 27, 2014 at 1:58 AM, Glynn Clements wrote:

Cases which haven't been changed are:

lib/raster3d/test/test_put_get_value_large_file.c. This appears to be a test case; does it matter?

include/iostream/quicksort.h uses random() or rand() to select a pivot for the quicksort algorithm. That file has no dependency on lib/gis (or anything else, except for <stdlib.h> for random/rand), and I didn't want to add one unnecessarily.

Again, this shouldn't affect the result, but there may be performance issues if the size of the array being sorted is significantly larger than RAND_MAX (in this situation, the algorithm will be O(n2) even in the best case).

Unless there's a specific reason not to, it may be better to simply replace all uses of that file with std::sort() from <algorithm>.

comment:6 by neteler, 9 years ago

By chance I found

raster/simwe/simlib/random.c

/* uniform random number generator (combined type) */

Should that be replaced as well (how)?

in reply to:  6 ; comment:7 by glynn, 9 years ago

Replying to neteler:

By chance I found

raster/simwe/simlib/random.c

/* uniform random number generator (combined type) */

Should that be replaced as well (how)?

ulec() can be replaced by G_drand48().

seedg() isn't used.

seeds() is always called with fixed values (12345 and 67891). Either call G_srand48() with a fixed value (if repeatability is important), or call G_srand48_auto() for a non-deterministic seed (if it isn't), or add a seed= option to the modules.

FWIW, it seems possible for ulec() to return values slightly greater than 1:

    ret_val = (double)iz *4.656613e-10;

    return ret_val;

(231-1) * 4.656613e-10 = 1.0000000267907612

in reply to:  7 comment:8 by annakrat, 9 years ago

Replying to glynn:

Replying to neteler:

By chance I found

raster/simwe/simlib/random.c

/* uniform random number generator (combined type) */

Should that be replaced as well (how)?

ulec() can be replaced by G_drand48().

seedg() isn't used.

seeds() is always called with fixed values (12345 and 67891). Either call G_srand48() with a fixed value (if repeatability is important), or call G_srand48_auto() for a non-deterministic seed (if it isn't), or add a seed= option to the modules.

Done in r63216 and backported. The seed is fixed value.

comment:9 by neteler, 8 years ago

Seems all has been done here?

comment:10 by martinl, 8 years ago

Milestone: 7.0.07.0.5

comment:11 by martinl, 8 years ago

Resolution: fixed
Status: newclosed

Seems to be, closing.

Note: See TracTickets for help on using tickets.