-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow optionally passing a random source to RandomList
, and to Random
for collections
#2204
Conversation
This in case anyone wonders, I kept |
If
|
lib/coll.gd
Outdated
## | ||
## <Description> | ||
## <Index>random seed</Index> | ||
## For a dense list <A>list</A>, | ||
## <Ref Func="RandomList"/> returns a (pseudo-)random element with equal | ||
## distribution. | ||
## <P/> | ||
## This function uses the <Ref Var="GlobalMersenneTwister"/> to produce the | ||
## The random generator <A>rs</A> is used to choose a random number. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generator -> source
hpcgap/lib/coll.gd
Outdated
## | ||
## <Description> | ||
## <Index>random seed</Index> | ||
## For a dense list <A>list</A>, | ||
## <Ref Func="RandomList"/> returns a (pseudo-)random element with equal | ||
## distribution. | ||
## <P/> | ||
## This function uses the <Ref Var="GlobalMersenneTwister"/> to produce the | ||
## The random generator <A>rs</A> is used to choose a random number. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generator -> source
lib/coll.gi
Outdated
local len; | ||
len := Length(args); | ||
if len = 1 then | ||
return (args[1])[Random(GlobalMersenneTwister, 1, Length(args[1]))]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this difficult to read. May I suggest:
InstallGlobalFunction( RandomList, function(args...)
local rs, list, len;
len := Length(args);
if len = 1 then
rs := GlobalMersenneTwister;
list := args[1];
elif len = 2 then
rs := args[1]
list := args[2];
else
Error(...);
fi;
return list[Random(rs, 1, Length(list))];
end );
lib/random.gd
Outdated
{rs,C} -> RandomList(rs, Enumerator( C ) ) ); | ||
|
||
RedispatchOnCondition(Random,true,[IsCollection],[IsFinite],0); | ||
RedispatchOnCondition(Random,true,[IsRandomSource, IsCollection],[IsFinite],0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I am not so happy about moving these collection specific functions out of coll.gi
. Here is a possible alternative, which allows using InstallMethodWithRandomSource
inside coll.gi
:
diff --git a/lib/coll.gi b/lib/coll.gi
index 264a63de0..3f63919b2 100644
--- a/lib/coll.gi
+++ b/lib/coll.gi
@@ -246,11 +246,6 @@ InstallMethod( RepresentativeSmallest,
## an enumerator of <C> and selects a random element of this list using the
## function `RandomList', which is a pseudo random number generator.
##
-if IsHPCGAP then
- MakeThreadLocal( "GlobalMersenneTwister" );
-else
- DeclareGlobalVariable( "GlobalMersenneTwister" );
-fi;
InstallGlobalFunction( RandomList, function(list)
return list[Random(GlobalMersenneTwister, 1, Length(list))];
end );
diff --git a/lib/random.gd b/lib/random.gd
index c582a930c..763c31482 100644
--- a/lib/random.gd
+++ b/lib/random.gd
@@ -39,7 +39,7 @@
## </ManSection>
## <#/GAPDoc>
##
-BindGlobal( "RandomSourcesFamily", NewFamily( "RandomSourcesFamily" ) );
+BIND_GLOBAL( "RandomSourcesFamily", NewFamily( "RandomSourcesFamily" ) );
DeclareCategory( "IsRandomSource", IsComponentObjectRep );
#############################################################################
@@ -105,6 +105,14 @@ DeclareOperation( "Random", [IsRandomSource, IsInt, IsInt] );
## </ManSection>
## <#/GAPDoc>
##
+
+if IsHPCGAP then
+ MakeThreadLocal( "GlobalMersenneTwister" );
+else
+ DeclareGlobalVariable( "GlobalMersenneTwister" );
+fi;
+
+
(function()
local func;
func := function(installType)
diff --git a/lib/read1.g b/lib/read1.g
index b8d21f9c8..a360f1c4b 100644
--- a/lib/read1.g
+++ b/lib/read1.g
@@ -36,6 +36,8 @@ ReadLib( "set.gd" );
ReadLib( "record.gd" );
+ReadLib( "random.gd" );
+
ReadLib( "cache.gi" );
ReadLib( "coll.gi" );
@@ -70,8 +72,6 @@ ReadLib( "info.gi" );
ReadLib( "assert.gi" );
ReadLib( "global.gi" );
-ReadLib( "random.gd" );
-
ReadLib( "options.gd" );
ReadLib( "options.gi" );
To me, it anyway nicer to declare GlobalMersenneTwister
inside random.gd
instead of coll.gi
-- this matches GlobalRandomSource
. The patch above is not quite what I'd do, though: I'd rather move the definition of GlobalMersenneTwister
where it belongs, i.e. next to the def of GlobalRandomSource
(where we actually already have commented out definitions for GlobalMersenneTwister
!). But this means the def for InstallMethodWithRandomSource
has to be moved downwards, which is simple, but leads to a huge diff).
I can also make a PR with these changes now...
900563b
to
be88044
Compare
Codecov Report
@@ Coverage Diff @@
## master #2204 +/- ##
==========================================
- Coverage 70.38% 70.35% -0.04%
==========================================
Files 481 480 -1
Lines 253228 252217 -1011
==========================================
- Hits 178230 177437 -793
+ Misses 74998 74780 -218
|
be88044
to
d3529a3
Compare
468c7b9
to
5164058
Compare
Assuming the tests pass, I'm now happy with this. I use |
lib/coll.gi
Outdated
# In particular, we want Random(SomeRandomSource, IsList) to come higher | ||
# for lists, to avoid an infinite loop. | ||
InstallMethodWithRandomSource( Random, "for a (finite) collection", | ||
[ IsCollection and IsFinite ], -SUM_FLAGS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would avoid extra ranking of methods whenever possible. And the -SUM_FLAGS
does not seem needed here. E.g., in current master branch:
gap> DeclareCategory("IsDice", IsComponentObjectRep and IsDomain);
gap> dice := Objectify(NewType(NewFamily(""), IsDice), rec());
<object>
gap> InstallMethod(Enumerator, [IsDice], d-> [1..6]);
gap>
gap> InstallMethod(Random, ["IsRandomSource", "IsCollection and IsFinite"],
> function(rs, coll) return Random(Enumerator(coll)); end);
gap> Random(dice);
2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Firstly, there is a small bug here, sorry.
This shows the issue I'm trying to fix. Given the following code, then Random(GlobalRandomSource,[1..10])
enters an infinite loop, because this newly installed method ranks more highly than the InstallMethod(Random, [IsGAPRandomSource, IsList],...
in random.gi.
InstallMethod( Random, "for a source and a (finite) collection",
[ IsRandomSource, IsCollection and IsFinite ],
{rs,C} -> RandomList(rs, Enumerator( C ) ) );
MakeReadWriteGlobal("RandomList");
RandomList := function(args...)
local len, source, list;
len := Length(args);
if len = 1 then
source := GlobalMersenneTwister;
list := args[1];
elif len = 2 then
source := args[1];
list := args[2];
else
Error(" Usage: RandomList([rs], list))");
fi;
return list[Random(source, 1, Length(list))];
end;
I am copying this here so it doesn't get lost: This shows the issue I'm trying to fix. Given the following code, then
|
5164058
to
10ba9fe
Compare
lib/coll.gi
Outdated
InstallGlobalFunction( RandomList, function(list) | ||
return list[Random(GlobalMersenneTwister, 1, Length(list))]; | ||
|
||
# RandomList is not a Method to avoid the (often expensive) cost of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Method -> method (only change this if you end up having to update this PR anyway; no need to trigger a full rebuild over such minor thing ;-)/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, strictly speaking, I guess this should be "is not an operation", though?
lib/random.gi
Outdated
@@ -311,3 +311,10 @@ end ); | |||
InstallGlobalFunction("InstallMethodWithRandomSource", func(InstallMethod)); | |||
InstallGlobalFunction("InstallOtherMethodWithRandomSource", func(InstallOtherMethod)); | |||
end)(); | |||
|
|||
# Use -SUM_FLAGS, as this should always be the last choice method. | |||
# In particular, we want Random(SomeRandomSource, IsList) to come higher |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"to come higher" -> "to be ranked higher" ?
tst/testrandom.g
Outdated
fi; | ||
# Perform a variety of tests on Random generators. | ||
# filter: The type of random source we are testing | ||
# globalobj: The name of the global instance of this iterator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"this iterator" ? Huh?
Also, why is this named globalobj
, and in how far does the object passed in have to be "global" ? If I create such an instance on the fly, and also create gmethod on the fly, using this, it would be fine, too?
How about:
globalrs: an instance of a random source in (typically a global instance thereof, like , to be used by gmethod.
tst/testrandom.g
Outdated
else | ||
checkmethod := checkin[1]; | ||
fi; | ||
# Perform a variety of tests on Random generators. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't understand "Random generators" here! First I thought you meant "random source", but that doesn't make sense. My next guess was "methods for the Random method", but that also doesn't seem to make sense, as the test are not restricted to Random
.
Right now I think you might mean: "Perform a variety of tests with a given function that is expected to produce/generate random output, and which should allow passing in a random source as optional first argument."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I keep using random generator, when I mean source. Will generally clean up.
tst/testrandom.g
Outdated
# globalobj: The name of the global instance of this iterator | ||
# gmethod: A single argument method which takes elements of the collection | ||
# using globalobj (This exists to let us test single-argument Random) | ||
# method: Two argument random method, which takes a random source and collection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the name gmethod
? Also, why "method" -- these two can be any function, can't they? And we frequently pass in operations, not methods...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
method is just a bad choice of name, I tend to use method to refer to "any function", when it has a special meaning in GAP.
tst/testrandom.g
Outdated
# checkmethod: a way of checking if a value is a member of collection (usually \in) | ||
|
||
randomTestInner := function(filter, globalobj, gmethod, method, collection, checkmethod) | ||
local test1, test2, test3, test4, test5, test6, localgen; | ||
|
||
# We do a single call first, to deal with calling Random causing extra attributes | ||
# of 'collection' to be set, changing the dispatch | ||
method(collection); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this perhaps be gmethod
? Otherwise, the documentation comment is wrong, as here method
is called with a single argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is definately wrong
tst/testrandom.g
Outdated
# checkmethod: a way of checking if a value is a member of collection (usually \in) | ||
|
||
randomTestInner := function(filter, globalobj, gmethod, method, collection, checkmethod) | ||
local test1, test2, test3, test4, test5, test6, localgen; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So localgen
is the sibling to globalobj
, and also a randomsource... perhaps call them rs
and gobal_rs
then?
tst/testrandom.g
Outdated
fi; | ||
|
||
randomTestInner(IsMersenneTwister, GlobalMersenneTwister, x -> method(x), method, collection, checkmethod); | ||
randomTestInner(IsGAPRandomSource, GlobalRandomSource, x -> method(GlobalRandomSource, x), method, collection, checkmethod); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we once x -> method(x)
and once x -> method(GlobalRandomSource, x)
? Aha, I assume there is an implicit assumption is that if no RS is specified, then GlobalMersenneTwister
is used? The manual only says this (emphasis mine):
Most methods for
Random
in the GAP library use theGlobalMersenneTwister
as random source.
I am happy to enforce this here in the testing code, but I think this should be stated explicitly, at least in a comment right here, e.g. like this:
If no random source is specified, functions producing randomness are expected to default to using
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should possibly strengthen the documentation, to "All methods", I believe the only method which uses anything else is the undocumented RANDOM_LIST.
tst/testrandom.g
Outdated
# A special test for collections of size 1 | ||
randomTestForSizeOneCollection := function(collection, method) | ||
# Here we can't check different seeds produce different answers | ||
# We do check that the random source is not used, for efficency. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While that seems sensible, is this documented anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it seemed to be true in practice, so I decided to test it. I'm not positive I'd want to enforce it, but if it was ever false I'd certainly want to investigate why, as I imagine we'd be doing pointless work.
tst/testrandom.g
Outdated
|
||
if intlist1 <> intlist2 then | ||
Print("Random read from local gen affected global gen: ", collection); | ||
fi; | ||
end;; | ||
|
||
randomTestForSizeOneCollection := function(collection, method, checkin...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, I wonder: is there a strong reason to have randomTestForSizeOneCollection
be a separate function? Couldn't randomTest
choose between calling randomTestInner
and randomTestForSizeOneCollectionInner
based on the collection size? (Even better, it could use IsTrivial
on the collection, which means "size = 1" but may work even on collections which don't know their precise size)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should do that.
5ea707a
to
1e9222b
Compare
I believe this is now cleaned up and all issues fixed. |
@ChrisJefferson Thanks for the example. This is not a specific problem of Still, an inflationary But I would remove the |
@frankluebeck could you elaborate why you would avoid ˋ-SUM_FLAGSˋ, resp. why the alternatives are "better"? (Are they? Why?) I.e., other than "people in the past did it that way". Right now this just seems like an equivalent alternative, not worth making @ChrisJefferson jump through hoops. Some factial justification would be very helpful, and motivating. What I don't like about an hardcoded +100 for the Random method for IsList is that I am concerned that it might shadow some other method somebody might install for special kinds of lists. E.g. I might imstall one for ranges, now I need to know I have to give it a +100. |
The problem with pushing up Random(IsGlobalRandomSource,IsList) is that I don't want it to get too high, else it might get above Random(IsRandomSource,IsX) for some X which implements IsList (I haven't checked everything we implement Random for). Also, we would have to do the same with every current and future added random source (and any random sources anyone has already implemented in their own code). Of course the other option is my other plan, which was to delete Random(IsGlobalRandomSource,IsList) and replace it with Random(IsGlobalRandomSource,IsInt,IsInt), which removes the issue (and also requires changes of course). PS sorry there is several random changes at the same time. When I started I did not expect this to get so complicated. |
tst/testrandom.g
Outdated
# | ||
# Test that 'Random(rs,C)' only uses 'rs', and no other source of random | ||
# | ||
# Test Random, PseudoRandom and RandomList |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, ˋPseudoRandomˋ does not yet allow a random source, and in fact for some methods (e.g. for groups) it may not be possible to reset the state of a random source and then get the same set of pseudo random elements, because there is additional mixing state in the group
tst/testrandom.g
Outdated
# | ||
# Where there is a global instance of a random generator | ||
# (GlobalMersenneTwister and GlobalRandomSource), they produce the same | ||
# sequence of answers as an new instance of the same random source. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An new -> a new
tst/testrandom.g
Outdated
# global_randfunc(C): A one argument function which is equivalent to | ||
# {x} -> rand(global_rs, x). This lets us check 'Random(C)' and | ||
# 'Random(GlobalMersenneTwister,C)' produce the same answer when testing | ||
# GlobalMersenneTwister. For other random generators, this can just |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generator -> source (and perhaps grep for other occurrences?)
tst/testrandom.g
Outdated
Init(GlobalMersenneTwister, 6); | ||
test1 := List([1..1000], x -> method(collection)); | ||
Init(global_rs, 6); | ||
test1 := List([1..1000], x -> global_randfunc(collection)); | ||
# test2 should = test1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This confused me at first, maybe spell out "equal"?
# so this should be the same as intlist1 | ||
intlist2 := List([1..1000], x -> Random([1..10])); | ||
intlist2 := List([1..10], x -> global_randfunc([1..1000])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just query the state before and after, and compare it? Make an immutable copy to make sure it's not been modified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I quite like testing the actual sequence. However, it might be nice to also test the states are equal as well!
tst/testrandom.g
Outdated
GlobalRandomSource, x -> randfunc(GlobalRandomSource, x), randfunc, | ||
collection, checkin); | ||
fi; | ||
end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a trailing newline?
tst/testrandom.g
Outdated
GlobalRandomSource, x -> randfunc(GlobalRandomSource, x), randfunc, | ||
collection, checkin); | ||
else | ||
randomTestInner(IsMersenneTwister, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could avoid some code duplication by adding a local var which you set to either ˋrandomTestInnerˋ or the orher func.
Ad why to avoid SUM_FLAGS, actually, why to avoid any extra ranking in method installation: Now to the specific case: this method is not one of the last resort type (like methods to print certain error messages or to avoid "no method found" errors) and so not a case to use -SUM_FLAGS. The case of installing a method for Now to the methods for If anyone thinks that I'm only writing comments to make poor "@ChrisJefferson jump through hoops" feel free to ignore them. |
@frankluebeck : I don't mind discussing this for a while, this is a sensitive, very old and very well-used bit of code in GAP! Stepping back for a moment, the short-term aim of this PR is to allow calling My final long term aim is to make it easy for users to add new implementations of |
1e9222b
to
39c0a25
Compare
This requires moving the function to random.gi, as we want to use InstallMethodWithRandomSource.
39c0a25
to
88c68e8
Compare
i believe all issues in this PR are now fixed. |
# This method must rank below Random(SomeRandomSource, IsList) | ||
# for any random source SomeRandomSource, to avoid an infinite loop. | ||
InstallMethodWithRandomSource( Random, "for a random source and a (finite) collection", | ||
[ IsRandomSource, IsCollection and IsFinite ], -8, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am sorry for being nitpicky, but: now that value -8
seems rather arbitrary, and I'd really prefer if there was a comment here (resp. before the InstallMethod...
invocation) that explains why it has the value; i.e. to push it below the method for IsList
?!?
I wonder if this value could/should instead be computed? Like, using RankFilter(IsList) - RankFilter(IsCollection and IsFinite) - 6
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, shouldn't those methods in random.gi
use IsList and IsFinite
instead of IsList
anyway? They require finite lists, after all (if Length(list)
returned infinity
, they'd have a problem). Wouldn't that also resolve the rank problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IsList and IsFinite
would fix the generators provided in the library, but still cause an infinite loop in any other random generator (like the one in IO).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't the comment already explain? We need to rank below Random(SomeRandomSource, IsList) for any SomeRandomSource.
RandomList
RandomList
RandomList
, and to Random
for collections
The aim of this PR is to let Random for a collection take a random source. This required adding a random source also to RandomList, and do a little rearranging.