Skip to content

follow the Random API for Poly and Integers via RandomExtensions.make #648

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Oct 12, 2020

Conversation

rfourquet
Copy link
Contributor

As discussed in the Jitsi meeting last Friday. I made an implementation instead of just proposing an API because it's very difficult to know whether the idea of an API will work in practice, and the amount of implementation work was reasonable here.

In this case, "Following the Random API" means that one object is enough to encode all the necessary information for the various rand methods to know how to draw random objects (numbers, polynomials...) together with an RNG.

For example 1:3 or Bool are such objects (they mean draw numbers from the {1, 2, 3} set or from the {false, true} set respectively).
One counter-example is how we draw integers in AA: our API is e.g. rand(ZZ, 1:3) : two objects (ZZ and 1:3) are necessary in order to convey to rand all the information.

One advantage of following the "Random API" is composability. For example, you can create arrays (rand(1:3, 2, 3)) or randomize already existing arrays (rand!([0, 0, 0], 1:3)).

The obvious solution when more than one object are necessary is to pack them into one struct (not into a Tuple, as rand(::Tuple) already has a meaning). This is for example what distributions from the Distributions package do, e.g. with Normal(0.0, 1.0).
But it can be very tedious to define custom structs for all the various rand methods you want to define (both for the implementer, and for the user who has to remember all the ad-hoc names). For example, we would have rand(RandIntegers(ZZ, 1:3)) where RandIntegers is such a custom struct with two fields.

One solution to this problem is to use a custom generic struct which behaves like Tuple, call it Make: Make(a, b, c) contains as its unique field the tuple (a, b, c), and represents a "distribution" (in an extended meaning), containing all the information needed by rand. Then, the implementer doesn't have to define new structs for each rand method, and the user just has to remember one name: Make. For example we would have:

julia>  rand(Make(ZZ, 1:3), 3)
3-element Array{BigInt,1}:
 1
 3
 2

julia> R, x = PolynomialRing(ZZ, "x");

julia> m = Make(R, 0:5, Make(ZZ, -10:10)); 

julia> rand(RandomDevice(), m) # equivalent to our current `rand(RandomDevice(), R, 0:5, -10:10)`
x^4 + 5*x^3 + 7*x^2 - 3*x + 9

julia>  rand(m, 2)
2-element Array{AbstractAlgebra.Generic.Poly{BigInt},1}:
 -9*x + 2
 -3*x^3 - 10*x^2 - 6*x - 8

julia> rand!(ans,  m)
2-element Array{AbstractAlgebra.Generic.Poly{BigInt},1}:
 9*x^5 - 6*x^4 + 5*x^3 - 6*x^2 - x + 10
 2*x^5 - 9*x^3 - x^2 + 8*x + 7

Now, when elements of a tuple t are types, then t's type doesn't know which type it contains (e.g. typeof(1, Bool) == Tuple{Int,DataType}), so this leads to type instabilities. Working around this makes the implementation of Make slightly less straight-forward; one such implementation is in the RandomExtensions.jl package, which exports the make function (instead of Make) as the interface for creating "distributions" (such that the example above works with this PR just by replacing Make by make).

As seen in the above example, recursivity is handled via nested Make/make calls. To mirror the current "flattened" API (e.g. rand(R, 0:5, -10:10)), this PR also defines for convenience a flattened make API (e.g. make(R, 0:5, -10:10)).

RandomExtensions also defines an experimental convenience API using pairs, such that rand(a => b) is interpreted as rand(make(a, b)) and rand(a => (b, c...)) is interpreted as rand(make(a, b, c...)). So you can do

julia> rand(ZZ => 1:3, 3)
3-element Array{BigInt,1}:
 3
 3
 2

julia> rand(R => (0:5, -10:10))
-3*x^2 - 7*x + 7

julia> S, y = PolynomialRing(R, "y");

julia> rand(make(S, 0:5, 5:10, -10:10))
-8*x^8 - x^7 - 8*x^6 + 8*x^5 - 4*x^4 + 4*x^3 + 9*x^2 + 6*x - 8

julia>  rand(S => (0:5, 5:10, -10:10)) # equivalent to rand(S => (0:5, R => (5:10, ZZ => -10:10)))
(3*x^9 - 10*x^8 + 5*x^7 + 6*x^6 - 9*x^5 + 7*x^4 + 2*x^3 + 10*x^2 - 9*x - 9)*y + 7*x^7 - 2*x^6 - 7*x^5 + 6*x^4 + 3*x^2 + 9*x + 4

I have thought a lot about this problem in general (independently of AA/Oscar), and didn't find better than the make framework (started about 3 years ago) from RandomExtensions, which I consider to be quite solid at this point (allowing a non-type object as the first argument of make is a new feature though, and was not yet extensively tested).

cc @fieker @wbhart

RandomExtensions.maketype(S::PolyRing, dr::UnitRange{Int}, _...) = elem_type(S)

# define rand for make(S, deg_range, v)
function rand(rng::AbstractRNG, sp::Random.SamplerTrivial{<:RandomExtensions.Make3{<:RingElement,<:PolyRing,UnitRange{Int}}})
Copy link
Contributor Author

@rfourquet rfourquet Sep 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make3 is an alias for Make objects containing 3 elements; it's first type parameter is the type of produced objects, and the 3 subsequent type parameters correspond to the types of the 3 stored objects.

@@ -2743,16 +2743,38 @@ end
#
###############################################################################

function rand(rng::AbstractRNG, S::AbstractAlgebra.PolyRing, deg_range::UnitRange{Int}, v...)
RandomExtensions.maketype(S::PolyRing, dr::UnitRange{Int}, _...) = elem_type(S)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maketype takes the same arguments as make, and must return the type of produced random elements. We could probably define a catch-all method in AA, like maketype(S::AbstractAlgebra.Set, x...) = elem_type(S).

@wbhart
Copy link
Contributor

wbhart commented Sep 10, 2020

This final line is something I can personally live with, if it solves some other problem you'd like solved:

julia>  rand(S => (0:5, 5:10, -10:10)) # equivalent to rand(S => (0:5, R => (5:10, ZZ => -10:10)))

So if that works, I'm ok with this.

@wbhart
Copy link
Contributor

wbhart commented Sep 10, 2020

And I'm personally ok with adding a dependency on RandomExtensions, so long as it is not doing anything hacky that will need updating with every new version of Julia. I know others don't necessarily agree with adding dependencies. But I'll let them chime in with their opinion on that.

@rfourquet
Copy link
Contributor Author

Note that the => API is experimental, and does somehow commit type piracy: base Julia could decide to give a different meaning to rand(::Pair) (although I don't think this will happen, see e.g. JuliaLang/julia#28704). So IMO library code should refrain (for the time being) from using => and use make instead (but yes I find => clearer, and more practical at the REPL).

I also like the "flattened" API for convenience, although the nested calls provide some clarity in reflecting the nesting structure: it's easy to forget what each argument do in rand(S => (0:5, 5:10, -10:10)), while it's made more explicit in rand(S => (0:5, R => (5:10, ZZ => -10:10))).

And I'm personally ok with adding a dependency on RandomExtensions, so long as it is not doing anything hacky that will need updating with every new version of Julia

Nice! I don't think that anything hacky will cause any problem. My current medium term plan (maybe within 6 months) is to extract out the core feature related to Make in a very small and stable package, which doesn't do any type piracy (so without the => API), and without all the code that RandomExtensions currently has which makes use of Make to define rand on a bunch of Base types, and does also more experimental stuff.

@wbhart
Copy link
Contributor

wbhart commented Sep 10, 2020

My main concern was not rand(S => (0:5, 5:10, -10:10)) vs rand(make(0:5, 5:10, -10:10)), though I certainly find the former more appealing, but in comparison with rand(S => (0:5, R => (5:10, ZZ => -10:10))) which is very often in real user code not at all practical. It'd be better in test code, for sure, as it's clearer. But the typical user won't necessarily even know all the intermediate rings, as they may have been autogenerated by the system.

@rfourquet
Copy link
Contributor Author

So if that works, I'm ok with this.

Note also that I didn't remove the existing rand(S, 0:5, 5:10, -10:10), I just re-define it in terms of the rand method which uses make. I also experimented a lot with this kind of convenience methods, they usually work fine, but I found that it's best to define first rand methods using one object (like with make), and define those using multiple objects in terms of the former.

@rfourquet
Copy link
Contributor Author

But the typical user won't necessarily even know all the intermediate rings, as they may have been autogenerated by the system.

Good point, although a minimal amount of knowledge is necessary to know which arguments to pass.

As a side note, I have a very experimental work in progress to allow the user to not have to give these parameters, for testing purposes. make is replaced by test, and then random objects with non-uniform distribution are drawn, which try to cover all corner cases. The implementer can provide defaults for the parameters, such that in the example above you could do rand(test(S, 0:5, 5:10, -10:10)) or rand(test(S, 0:5)) or even rand(test(S)).

@wbhart
Copy link
Contributor

wbhart commented Sep 10, 2020

Of course that would also be useful from my personal point of view. The better our testing, the better everything is. But I imagine this particular feature is not a high priority item.

@thofma
Copy link
Member

thofma commented Sep 11, 2020

Thanks @rfourquet, I like the interface.

I am not too big of a fan of external dependencies. On the other hand RandomExtensions.jl is quite small and, apart from some generated functions, does not do any super fancy things. So if the others are happy, I am happy too.

@fieker
Copy link
Contributor

fieker commented Sep 18, 2020

fine with me - waiting for the corresponding pull request for Nemo, Singular, (Polymake,) Hecke and Oscar...

@rfourquet rfourquet force-pushed the rf/randmake branch 2 times, most recently from 3d6116f to 29d2b03 Compare September 22, 2020 18:13
@rfourquet
Copy link
Contributor Author

Given the support, I will then add the few remaining needed make and rand methods in this PR and merge.

waiting for the corresponding pull request for Nemo, Singular, (Polymake,) Hecke and Oscar...

I will gladly do! For Singular, Polymake and GAP I didn't find any defined rand methods so there is nothing to do on this front. In Oscar.jl, there are very few rand methods, but they rely on GAP's own non-exposed RNG, so a bit more work has to be done, but this currently doesn't need make methods, as there is already only one argument to rand.

@fieker
Copy link
Contributor

fieker commented Sep 23, 2020 via email

@rfourquet
Copy link
Contributor Author

Singular has them

Ok, so I looked only for Julia code in Singular.jl, where I didn't find any rand method defined. I will look eventually into writing them, but I was planning to first update those which exist already (so mainly AA, Nemo, Hecke for now).

@fieker
Copy link
Contributor

fieker commented Sep 23, 2020 via email

@rfourquet
Copy link
Contributor Author

So this change (and others I haven't yet pushed to this branch) breaks Nemo tests, until Nemo is updated with similar changes. I don't know if this is "considered" as breaking (as no documented behavior is changed AFAICT), so I don't know if this should increment the patch or minor version number. What I would do is to bump the patch version, and depend on that right away in Nemo with the new changes, so that both master branches remain compatible (and probably similarly with Hecke). If the minor version must be bumped, maybe you prefer accumulating more breaking changes before a new version... let me know how to proceed.

@thofma
Copy link
Member

thofma commented Sep 23, 2020

If we know it will break packages upstream, we should bump the minor version. Otherwise, with just a patch version bump, one might end up with a combination of packages which we know will not work.

I will mark this as breaking.

@rfourquet
Copy link
Contributor Author

Yes, good point! A question on style: in this PR there is currently heave signatures like

function rand(rng::AbstractRNG, sp::Random.SamplerTrivial{<:RandomExtensions.Make3{<:RingElement,<:PolyRing,UnitRange{Int}}})

In the corresponding Nemo PR I prepared, I imported SamplerTrivial and Make3, here this would become

function rand(rng::AbstractRNG, sp::SamplerTrivial{<:Make3{<:RingElement,<:PolyRing,UnitRange{Int}}})

Which I find easier on the eyes, but I'm biased as I know where SamplerTrivial and Make3 come from. What do you prefer?

As a side note, I will write a macro eventually to simplify this boilerplate, so this becomes something like

@rand function Make(S::PolyRing, val_range::UnitRange{Int})
   ...
end

@thofma
Copy link
Member

thofma commented Sep 25, 2020

I personally find the shorter signatures better.

Currently we do not have some kind of dev docs, otherwise it would be a good idea to record some of the design decisions. Maybe we can find a place somewhere in the source files and put it as a comment there.

@rfourquet
Copy link
Contributor Author

I personally find the shorter signatures better.

Ok, I will update accordingly.

Currently we do not have some kind of dev docs, otherwise it would be a good idea to record some of the design decisions. Maybe we can find a place somewhere in the source files and put it as a comment there.

Yes good idea. For this particular case, it would be good that RandomExtensions provides better docs explaining the rationale of the make methods, and how this can be used. But yes, I should probably write something in AA docs about that.

@fieker
Copy link
Contributor

fieker commented Sep 25, 2020 via email

@thofma
Copy link
Member

thofma commented Oct 7, 2020

Let's try to merge soon. Is there something that needs to be added @rfourquet?

@rfourquet
Copy link
Contributor Author

Is there something that needs to be added

Ah yes, let me import some names to make signatures shorter...

@rfourquet
Copy link
Contributor Author

Before I push the update: shall I bump the AA version to 0.11.0 here?

@thofma
Copy link
Member

thofma commented Oct 8, 2020

Yes, thanks.

@rfourquet rfourquet merged commit ad6732d into master Oct 12, 2020
@rfourquet rfourquet deleted the rf/randmake branch October 12, 2020 07:36
@thofma
Copy link
Member

thofma commented Oct 13, 2020

Am I doing something wrong?

julia> rand(Make(ZZ, 1:3), 3)
ERROR: UndefVarError: Make not defined
Stacktrace:
 [1] top-level scope at REPL[5]:1

julia> AbstractAlgebra.Make
ERROR: UndefVarError: Make not defined
Stacktrace:
 [1] getproperty(::Module, ::Symbol) at ./Base.jl:26
 [2] top-level scope at REPL[6]:1

@rfourquet
Copy link
Contributor Author

rfourquet commented Oct 13, 2020

So the function to use as a user is make, while Make is meant to implement make but should rarely be needed (So Make is available in the Generic module, because it was needed there). Also, I didn't export make from AbstractAlgebra, so you either do AbstractAlgebra.make(...) or Generic.make(...) or using RandomExtensions; make(...). Maybe make should be exported from AA?

@rfourquet
Copy link
Contributor Author

What is available out of the box is the (experimental) Pair API:

julia> rand(ZZ => 1:3, 3)
3-element Array{BigInt,1}:
 1
 2
 2

@thofma
Copy link
Member

thofma commented Oct 13, 2020

Sorry for my misunderstanding. I thought that if I want to repeatedly call rand with the same object and parameters, I should first construct the object using Make. Or is this not necessary for best performance?

@thofma
Copy link
Member

thofma commented Oct 13, 2020

Ah, sorry, make creates that object. Everything is fine!

@rfourquet
Copy link
Contributor Author

rfourquet commented Oct 13, 2020

EDIT: oh ok, I see things are clear now for you! Anyway, I keep this answer I had written:

I thought that if I want to repeatedly call rand with the same object and parameters, I should first construct the object using Make. Or is this not necessary for best performance?

No, make is strictly a constructor for Make; the difference is that Make(x, y, z) is required to construct a Make object with fields x, y, z, whereas make is meant to be overloaded, but defaults to a call to Make. The idea is that you might want e.g. make(R, n::Int) to describe the same thing as make(R, -n:n), then you write make(R::YourRingType, n::Int) = make(R, -n:n).

So Make describe "distributions", and should always be very cheap to construct. Where performance matters, the important step is when constructing a Sampler object, which allows to cache precomputed values. For example:

julia> r = big(1):big(1000)
1:1000

julia> @btime rand($rng, $r)
  279.647 ns (6 allocations: 96 bytes)
112

julia> s = @btime Sampler(MersenneTwister, $r)
  97.814 ns (2 allocations: 48 bytes)
Random.SamplerBigInt(1, 999, 1, 1, 0x00000000000003ff)

julia> @btime rand($rng, $s);
  179.578 ns (4 allocations: 48 bytes)

So if you are going to call rand(r) repeatedly, better to create a Sampler object s, and call rand(s) repeatedly (this is what happens behind the scene when rand creates an array of objects).

So for AA, I didn't try yet to find optimizations related to creating a custom Sampler object (they are all SamplerTrivial, which doesn't cache anything). I have written hundreds of such methods and it's quite tedious, and adding this to AA/etc will make the code a bit more obscure, so I hope to develop a macro which abstracts away most of the boilerplate.

@thofma
Copy link
Member

thofma commented Oct 14, 2020

Thanks again for clarification, greatly appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants