
include functor #43


@ccasin

ccasin commented Jul 10, 2024

This is a proposal for a new structure and signature item form, include functor.

Rendered version

(Thanks to @OlivierNicole and @goldfirere for help preparing this RFC)

@samsa1

samsa1 commented Jul 11, 2024

In a similar way could we also add a module functor E = F

This might be useful in some other cases and does not seem much work in addition to this feature. The use case that I have in mind would be linked to modular implicits :

module type Eq = sig
    type t
    val eq : t -> t -> bool
end

module type Ord = sig
    type t
    val cmp : t -> t -> int
    module E : Eq with type t = t
end

module F (X : sig type t val cmp : t -> t -> int end) : Eq with type t = X.t = body

module OInt = struct
    type t = int
    let cmp = Int.compare
    module functor E = F
end

This allows for the OInt module (an instance of Ord) to automatically implement an equality that respects ordering.
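
For comparison, here is a minimal sketch of how the same result can be obtained in OCaml today, without `module functor`. It assumes the functor returns an `Eq` and reads the `cmp` field; the names `EqOfCmp` and `Base` are hypothetical and only serve the illustration.

```ocaml
module type Eq = sig
  type t
  val eq : t -> t -> bool
end

module type Ord = sig
  type t
  val cmp : t -> t -> int
  module E : Eq with type t = t
end

(* Hypothetical functor: derives an equality from a comparison function. *)
module EqOfCmp (X : sig type t val cmp : t -> t -> int end) :
  Eq with type t = X.t = struct
  type t = X.t
  let eq a b = X.cmp a b = 0
end

module OInt : Ord with type t = int = struct
  (* Intermediate module that exists only to be passed to the functor;
     this is the boilerplate the proposed form would remove. *)
  module Base = struct
    type t = int
    let cmp = Int.compare
  end
  include Base
  module E = EqOfCmp (Base)
end

let () = assert (OInt.E.eq 3 3)
```

The signature ascription hides `Base`, so users of `OInt` only see `t`, `cmp` and `E`.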

@ccasin
Author

ccasin commented Jul 11, 2024

> In a similar way could we also add a module functor E = F

Indeed. I agree this is a reasonable feature and not much more work. It has occasionally been requested by users of include functor at Jane Street. We've held off on implementing it, but not for any particularly principled reason (mainly: I think it will get a little less use, and I think the meaning of the syntax is slightly less intuitive) but I'm very happy to add it if there is consensus it is desirable.

@chambart

chambart commented Jul 11, 2024

This is a pattern that is actually quite common in the flambda2 code base, in particular:

module T = struct
  module M = struct
    type t = ...
    let compare = ...
  end
  include M
  module Set = Set.Make(M)
end

With module functor, we could get rid of that spurious M module:

module T = struct
  type t = ...
  let compare = ...
  module functor Set = Set.Make
end

One such example in the upstream compiler code base:
https://github.com/ocaml/ocaml/blob/0d18e1287e49e92cf37824559cda5c09a2438b32/typing/shape.ml#L103-L135

@yallop
Member

yallop commented Jul 11, 2024

Have you considered the alternative of giving a name (say "_") to the current module prefix (i.e. "the module up to this point"), so that instead of

module M = struct
  type t = ...
  [@@deriving compare, sexp]
  include functor Comparable.Make
end

you'd write

module M = struct
  type t = ...
  [@@deriving compare, sexp]
  include Comparable.Make(_)
end

?

With that alternative design it'd be possible to refer to the module prefix in arbitrary module expressions rather than always passing it as the argument of a single-parameter functor, so you could also write things like:

include F(_)(X)

and

module type of _

and

open F(_)

and

module E = F(_)

and

include S with module type T = _

and perhaps even

type t = F(_).t

etc.

@alainfrisch
Contributor

> whose parameter can be "filled in" with the previous contents of the module

Just to be sure : do the components used to "fill in" the parameter need to be defined from the current structure, or do they only need to be visible at this point (coming from a surrounding structure or from some open)?

@lthls

lthls commented Jul 11, 2024

We're not very fond of the underscore, so @Ekdohibs suggests module as of then and @chambart proposes virtual module downto begin

@samsa1

samsa1 commented Jul 11, 2024

A better argument against using underscore to denote the beginning of the module comes from the current work on modular implicits. We are currently thinking of defining _ as an arbitrary module expression that should be inferred, but this would be incompatible with the proposal of @yallop.
However I think that his idea is more expressive and should be discussed but with another name in mind.

@ccasin
Author

ccasin commented Jul 11, 2024

> Just to be sure : do the components used to "fill in" the parameter need to be defined from the current structure, or do they only need to be visible at this point (coming from a surrounding structure or from some open)?

They need to be defined from the current structure. One could imagine doing either thing, but this has a nice clear rule, makes it less likely refactorings will cause errors due to what is in scope for include functor changing, and simplifies the implementation.

@goldfirere
Contributor

For syntax, we could use just plain old module. Examples:

include F(module)
module M = F(module)

Or we could be even bolder and use a symbol:

include F(^^)
module M = F(^^)

I think any syntax should not be available in paths.

@lpw25
Contributor

lpw25 commented Jul 15, 2024

Personally, I dislike both the:

module functor E = F

form and mechanisms based on a name for the contents of the current module, and would prefer to push people towards include functor instead. That is because I think it is better to have a name for this interface:

sig
  type t
  module Set : Set.S with type elt = t
end

and use that, rather than having each user choose the name for their set module.

include functor supports that style very naturally. In the Set module you can define:

module type MixS = (X : OrderedType) -> sig module Set : S with type elt = X.t end
module Mix : MixS

and then you can write:

module Foo : sig
  type t
  include functor Set.MixS
end = struct
  type t = [...]
  let compare = [...]
  include functor Set.Mix
end
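
As a rough sketch of what this style amounts to in OCaml today, the `include functor` items can be expanded by hand. Here `MixS` and `Mix` are defined at top level as stand-ins for the hypothetical `Set.MixS` and `Set.Mix`, and the module `T` and the function `of_int` are assumptions added only to make the example usable.

```ocaml
(* Stand-ins for the hypothetical Set.MixS and Set.Mix. *)
module type MixS = functor (X : Set.OrderedType) -> sig
  module Set : Set.S with type elt = X.t
end

module Mix : MixS = functor (X : Set.OrderedType) -> struct
  module Set = Set.Make (X)
end

module Foo : sig
  type t
  val of_int : int -> t
  (* Hand-written expansion of `include functor MixS` in a signature. *)
  module Set : Set.S with type elt = t
end = struct
  (* Intermediate module needed only because the functor argument
     must be named in current OCaml. *)
  module T = struct
    type t = int
    let compare = Int.compare
  end
  include T
  let of_int n = n
  (* Hand-written expansion of `include functor Mix` in a structure. *)
  include Mix (T)
end

let s = Foo.Set.of_list [ Foo.of_int 1; Foo.of_int 1; Foo.of_int 2 ]
let () = assert (Foo.Set.cardinal s = 2)
```

Every module built this way exposes its set under the same name, `Set`, which is the uniformity the comment argues for.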

@yallop
Member

yallop commented Jul 23, 2024

If I understand correctly, this use of include functor in signatures amounts to treating functor types as parameterized signatures. It certainly makes the example look elegant, but it doesn't really seem harmonious with the way that module types work in the rest of the language.

@lpw25
Contributor

lpw25 commented Jul 23, 2024

That is one way to look at it, and from that perspective it does look different from other uses of module types. An alternative, though, is to consider include S to mean "extend the module type as it would be if it had include M done to it for some unknown M : S", and then treat include functor S in the same way: "extend the module type as it would be if it had include functor M done to it for some unknown M : S". I think that is quite a natural way for users to think about it, and there isn't any other obvious way to interpret include functor S in a signature.

@clementblaudeau

If I understand correctly, include functor for modules (not for signatures) is needed when the functor does not re-export its parameter. I.e, the pattern

module F = functor (Y:S) -> struct (* ... *) end  
module Foo = struct
   (* code *)
   include functor F
end

could be replaced by changing F to re-export its argument and putting the application at top-level :

module F = functor (Y:S) -> struct include Y (* ... *) end
module Foo = F(struct 
   (* code *)
end)

Overall, could the role of include functor be taken by having a special mechanism to apply and include argument in the result ? A downside I can see is that it puts the functor application at the beginning of the struct, which has not the same flow as putting include functor at the relevant point inside the structure.

@ccasin
Author

ccasin commented Aug 22, 2024

> If I understand correctly, include functor for modules (not for signatures) is needed when the functor does not re-export its parameter. I.e, the pattern
>
> module F = functor (Y:S) -> struct (* ... *) end
> module Foo = struct
>    (* code *)
>    include functor F
> end
>
> could be replaced by changing F to re-export its argument and putting the application at top-level :
>
> module F = functor (Y:S) -> struct include Y (* ... *) end
> module Foo = F(struct
>    (* code *)
> end)
>
> Overall, could the role of include functor be taken by having a special mechanism to apply and include argument in the result ? A downside I can see is that it puts the functor application at the beginning of the struct, which has not the same flow as putting include functor at the relevant point inside the structure.

I think this is a reasonable idea, but doesn't quite offer the full convenience of include functor. In this example from the RFC:

module M = struct
  module T = struct
    type t = ...
    [@@deriving compare, sexp]
  end

  include T
  include Comparable.Make(T)
end

I think your proposal saves the include T, but not the need to define T in the first place when its only purpose is to be a parameter.

@clementblaudeau

clementblaudeau commented Aug 22, 2024

It can save T by doing a functor call directly on the unnamed structure. To be more precise:
What I had in mind was some new construct to mark functor applications where the functor parameter should be included in the result of the application, something like F [reexport] (M) which is syntactic sugar for

struct
  open (struct module X = M end)
  include X
  include F(X)
end

Then the example of the RFC would become:

module M = Comparable.Make [reexport] (struct
  type t = ...
  [@@ deriving compare, sexp]
end)

I think it provides more or less the same functionality. An upside is that it does not depend on a specific position in the code like include functor does, which I think might be a bit brittle. A downside is that it puts the functor application at the top, not in the flow of the definition of the module like include functor does. I'm not sure how it would support patterns where there are several include functors separated by other bindings, like :

module M = struct
  type t = ...
  include functor F
  let x = 42
  include functor G 
end
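
To make the question concrete, here is a rough sketch of what such a structure looks like when written by hand in OCaml today. The functors `F` and `G`, the `compare` field, and the prefix modules `P1` and `P2` are hypothetical and introduced only for illustration; the expansion actually used by the RFC's implementation may differ.

```ocaml
(* Hypothetical functors, for illustration only. *)
module F (X : sig type t val compare : t -> t -> int end) = struct
  let equal a b = X.compare a b = 0
end

module G (X : sig type t val equal : t -> t -> bool val x : t end) = struct
  let is_x v = X.equal v X.x
end

module M = struct
  (* Everything defined before the first `include functor`. *)
  module P1 = struct
    type t = int
    let compare = Int.compare
  end
  (* Everything defined before the second `include functor`:
     the first prefix, the output of F, and the binding of x. *)
  module P2 = struct
    include P1
    include F (P1)
    let x = 42
  end
  include P2
  include G (P2)
end

let () = assert (M.is_x 42)
```

Each `include functor` needs its own named prefix module in the hand-written version, which is why a single re-exporting application at the top of the structure does not cover this pattern directly.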

@clementblaudeau

Actually a key issue with the re-export pattern I was suggesting is that the functor can only re-export the field indicated in its parameter signature, which seems much more restricted than include functor, for which all fields of the current structure are kept.
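
A small sketch of this restriction in current OCaml, using a hand-written re-exporting functor. The signature `S`, the functor `F`, and the extra field `zero` are hypothetical names chosen for the illustration.

```ocaml
module type S = sig
  type t
  val compare : t -> t -> int
end

(* A functor that re-exports its argument, as in the alternative above. *)
module F (Y : S) = struct
  include Y
  let equal a b = compare a b = 0
end

module Foo = F (struct
  type t = int
  let compare = Int.compare
  let zero = 0 (* not mentioned in S, so F cannot re-export it *)
end)

(* Foo has t, compare and equal, but no zero:
   writing Foo.zero here would be rejected as an unbound value. *)
let () = assert (Foo.equal 1 1)
```

With `include functor`, `zero` would remain part of the enclosing structure, because the items written before the include are kept as they are rather than passed through the parameter signature.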

@ccasin
Author

ccasin commented Feb 21, 2025

This has sat for a while and there is some unresolved debate about the best design. @Octachron, could I request that the language committee take this RFC up? Thanks!

@Octachron
Member

Having the committee relaunch the debate sounds sensible to me, I will keep you updated once we have a shepherd.

@gasche
Member

gasche commented Feb 21, 2025

The proposal suggests allowing include functor FT in signatures, where FT is the type/signature of a functor, but it also suggests that naming functor types is uncommon (this is also my experience), and that the form include functor (module type of F) may be used for a functor F.

Question: have you considered having include functor F in signatures, and maybe something like include functor type FT in addition for the more complex, less common form?

@fpottier

Hello,

Here are my two cents.

Regarding the proposed mechanism, I believe that it is clearly useful. I have commonly felt the need to say "please apply the functor Foo to the types and values that I have defined above".

Regarding the concrete syntax, I rather dislike include functor Foo, because it does not make intuitive sense; include is normally applied to a structure, not to a functor. As Jeremy pointed out on July 11, what this construct does is really include Foo(this), where this denotes the content of the (as yet incomplete) current module, and include is the usual include construct.

This suggests that perhaps the new feature that is really needed is not include functor, but is actually this (or whatever concrete name one chooses for this concept). This construct seems in fact more powerful than the proposed construct, as it allows writing (for example) include Foo(A)(this)(C), or include Foo(struct include this let x = 0 end), whereas include functor cannot easily express these forms, I believe.

@gasche
Member

gasche commented Feb 21, 2025

I find that @yallop's suggestion of naming the "current module so far" has merit, at least when equipped with the decent syntax module proposed by @goldfirere. In modules, one would write include F(module) for what the RFC proposes as include functor F. But how would that work in signature? Would include functor FT be replaced by include module type of FT(module), where module would be understood as "some module whose signature is the current signature so far"? (The current signature is, of course, module type of module.)

@lpw25 if I understand correctly, your argument is that you prefer to extend modules by inclusion, rather than by naming new submodules, because this tends to encourage a coherent style where the same module names are reused consistently. So you like include F(module) better than module Sub = F(module), and you appreciate that the less expressive include functor F syntax can express only the former and not the latter.

From a distance, I'm not convinced:

  1. Extension-by-inclusion can be recommended in style guidelines, it is not clear to me that it is such an important idea that it needs to be enshrined in the language constructs -- as long as it is easy to express, which is the need here.
  2. There are other mechanisms to encourage coherent naming styles for submodules, for example transparent ascription. Your example becomes even more natural to me if you name the HasSet signature instead of the MixS signature that produces this interface, and you ascribe it to your module.

On the other hand, one could argue that the following ought to work, which has comparable expressivity to @yallop's proposal, and could be presented as easier to understand than a magical module keyword:

include functor (functor Self => F(Self)(X))

(This doesn't quite work today because there is no syntax nor bidirectional-propagation mechanism to have the signature of Self inferred from the context. But it sounds doable from a distance.)

@gasche
Member

gasche commented Feb 21, 2025

@fpottier: include Foo(struct include this let x = 0 end): wait a minute, clearly this is the empty module in this context, right? You probably meant this/2 ;-)

@didierremy

> @fpottier: include Foo(struct include this let x = 0 end): wait a minute, clearly this is the empty module in this context, right? You probably meant this/2 ;-)

In fact, rather than a keyword for the name, we might allow naming the current module as we do for objects...
struct (foobar) type t = ... let x = ... include functor F(foobar) end, which would be more consistent.

@fpottier

fpottier commented Feb 21, 2025

> But how would that work in signature?

What problem do you see with signatures? Can't we just use this (or module or whatever concrete syntax you prefer) to designate "the current (incomplete) signature"?

@fpottier

> @fpottier: include Foo(struct include this let x = 0 end): wait a minute, clearly this is the empty module in this context, right? You probably meant this/2 ;-)

You got me. Indeed, in the presence of nested modules, there might conceivably be a need for multiple levels of this.

It is a bit unsettling that this is not synonymous with struct include this end. I believe that currently if M is a structure then M is synonymous with struct include M end (I mean, when used inside a module expression, not a module type).

@fpottier

> In fact, rather than a keyword for the name, we might allow naming the current module

This is an intriguing idea, but I don't think that it is acceptable... the name of the current module would look like a variable, as its name is chosen by the user, but it is not a variable in the usual sense, since its meaning changes every time a new definition is made in the current module.

@didierremy

> This is an intriguing idea, but I don't think that it is acceptable... the name of the current module would look like a variable, as its name is chosen by the user, but it is not a variable in the usual sense, since its meaning changes every time a new definition is made in the current module.

Sure the meaning changes over time, but this is exactly the same problem with this as a keyword: it suggests the current module .... at the end of the struct as one is used to with objects, but it only means up to the current point. If you wish to avoid the ambiguity, the keyword should not be this but the-current-value-of-this.

@gasche
Member

gasche commented Feb 21, 2025

I also thought of struct (Self) ... end as for objects, and I don't see a big issue with the fact that the name refers to the fraction of the module that has already been defined above. But I'm not sure this construction works in signature context: in class type foo = object ('s) ... end, 's is a class type, the natural transposition is to have sig (Self) ... end have a module type Self, but this is not what we want for include functor Foo in signatures, which must translate to include module type of Foo(Self) for a module Self.

@ccasin
Author

ccasin commented Feb 21, 2025

> Question: have you considered having include functor F in signatures, and maybe something like include functor type FT in addition for the more complex, less common form?

Do I understand correctly that your idea is for include functor F to implicitly take a module type of, so that include functor F means the same as include functor type (module type of F)?

I think that's a reasonable idea. I tend to prefer the current form because I think it encourages people to give a real name to these functor types rather than relying on module type of. I expect this to be better behaved in general, considering the various issues with that construct (e.g., ocaml/ocaml#13765 to pick a random recent unrelated example). But I could be convinced!

@goldfirere
Contributor

(Slightly cheeky suggestion.) If folks are worried that module changes meaning over time (I'm not), we could change it to be as above:

include F(as above)
module M = F(as above)

Note that as is already a keyword, unused in module syntax. The above would be a required next token, though we now have unbounded syntactic space, so we could as outer above or as outer outer above or some such. We could also require a word module at the beginning:

include F(module as above)

@lpw25
Contributor

lpw25 commented Feb 21, 2025

include functor is essentially a shallow nonrecursive mixin construct. Mixin constructs have been reinvented for many different languages. People seem to find them easy to understand and easy to use. The counter proposals here are all ad-hoc constructs involving a non-standard form of binding -- either explicit or implicit. They seem likely to be hard to understand and unfamiliar to users.

The references to the this construct in object-oriented languages are interesting, because object-oriented languages use this for method calls; they don't use it for inheritance constructs. I think that is because doing so adds needless complexity.

In exchange for deviating from well-trod paths and introducing binding to the mix -- which is of course notoriously easy to understand -- we get a small improvement in expressivity. This just seems like a bad trade to me.

@gasche
Member

gasche commented Feb 22, 2025

Personally I am rather reassured by the include functor (functor Self => ...) form, which is a clear path to a way to name the current module for extra expressivity, without introducing another binding form. It doesn't need to be supported in a first iteration, but it suggests that the feature as proposed can be made expressive.

@yallop
Member

yallop commented Feb 22, 2025

I don't understand the point about binding. In what sense is include F(module) a binding construct, but include functor F not a binding construct?

@lpw25
Contributor

lpw25 commented Feb 22, 2025

module is clearly behaving as an identifier that is implicitly bound at each signature item.

@yallop
Member

yallop commented Feb 22, 2025

That's the part I understand. But it seems to me that the unnamed argument of include functor F is behaving as an implicitly-bound identifier in exactly the same way, so I don't understand the distinction that's being drawn.

@clementblaudeau

I agree with @lpw25 that a non-recursive shallow mixin construct is useful and well-understood, and should not require a new form of binding. Instead of seeing include functor as relying on a binding of the current module so far, I see it as syntactic sugar for a mixin composition of an anonymous structure and a functor:

module M = struct 
  (* decls1 *)
  include functor F
  (* decls2 *)
end

would be sugar for :

module M = struct
  include (
    struct 
      (* decls1 *) 
    end mixin F )
  (* decls2 *) 
end

Where the structure delimiters struct/end make it clear what modules are mixed with what. To me, the question that remains is to either add include functor directly or add a mixin construct (and include functor as sugar).

Note on mixin: the language is already expressive enough for non-recursive mixins (aka sequential mixins), with the caveat that fresh names must be introduced when handling anonymous structures:
  1. When both modules are (possibly anonymous) structures,
module M = M1 mixin M2  

is sugar for

module M = struct 
  open (struct module Temp1 = M1 module Temp2 = M2 end)
  include Temp1 
  include Temp2 
end
  2. When one is a (possibly anonymous) functor:
module M = M1 mixin F 

is sugar for

module M = struct
  open (struct module Temp1 = M1 module Temp2 = F end)
  include Temp1
  include Temp2(Temp1)
end

However in practice, avoidance and loss of type sharing would probably make such patterns a pain.

@gasche
Member

gasche commented Feb 23, 2025

I am willing to be convinced by include functor in module expressions: we tested alternative proposals and none of them feel like a strong improvement to me, they introduce other issues, and of course they would introduce a migration cost for Jane Street (all things being equal, what is already implemented and used in practice could be preferred).

On the other hand, I am less convinced by the naturality of include functor <functor-type> in signatures, along the argument of @yallop in #43 (comment). I can see that it is useful to have parametrized signatures, but in OCaml I've generally seen this done by having a functor return a signature, and that form is arguably more expressive than having a functor type (even when include functor <FT> is added into the mix): unless I am missing something, we can construct a functor type from a signature-returning functor, but not the other way around, which suggests that functor types are not such an expressive way to manipulate parametrized signatures.

(* the parametrized signature of sets,
   as a signature-returning functor *)
module SetS1 (X : Set.OrderedType) = struct
  module type S = Set.S with type elt = X.t
end

(* the parametrized signature of sets
   as a functor type *)
module type SetS2 = functor (X : Set.OrderedType) ->
  Set.S with type elt = X.t

(* Building a functor type from a
   signature-returning functor seems to work. *)
module type SetS2' = functor (X : Set.OrderedType) ->
  SetS1(X).S
    
(* Building a signature-returning functor
   from a functor type does not work,
   at least the following fails. *)
module SetS1' (X : Set.OrderedType) = struct
  module type S = module type of SetS2(X)
  (* Error: the module type SetS2 is not a functor,
     it cannot be applied. *)
end

(* I think we could do this, but we get
   a wider signature that includes X *)
module SetS1' (X : Set.OrderedType) = struct
  module type S = sig
    include X
    include functor SetS2
  end
end
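
As a usage sketch for the two forms that do work today, the definitions above can be checked against `Set.Make`. The module names `Make1`, `Make2` and `IntSet` are hypothetical, and the example assumes an OCaml version that provides the standard `Int` module.

```ocaml
(* The parametrized signature of sets, as a signature-returning functor. *)
module SetS1 (X : Set.OrderedType) = struct
  module type S = Set.S with type elt = X.t
end

(* The same parametrized signature, as a functor type. *)
module type SetS2 = functor (X : Set.OrderedType) ->
  Set.S with type elt = X.t

(* A functor type built from the signature-returning functor. *)
module type SetS2' = functor (X : Set.OrderedType) -> SetS1(X).S

(* Set.Make can be ascribed either functor type. *)
module Make1 : SetS2 = Set.Make
module Make2 : SetS2' = Set.Make

(* The signature-returning functor can be applied directly
   wherever a module type is expected. *)
module IntSet : SetS1(Int).S = Set.Make (Int)

let () = assert (IntSet.mem 2 (IntSet.of_list [ 1; 2; 3 ]))
```

The last declaration is the step that has no direct counterpart for `SetS2`: a functor type cannot be applied to `Int` to obtain the corresponding signature.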

@gasche
Member

gasche commented Feb 23, 2025

Thinking out loud: if we had a language construct that takes a functor type FT = functor (X : S) -> S' and a specific module M : S, and returns the output signature S'[M/X] (with appropriate strengthening), then we could use functor types as parametrized signatures and it would be rather convenient -- more so than signature-returning functor. It is really tempting to propose FT(M) as syntax for this construct. It is unambiguous in a module-type context (where functor applications are not valid), but a bit confusing as an uppercase identifier could be the name of both a functor and a functor type.

@yallop
Member

yallop commented Feb 24, 2025

> it seems to me that the unnamed argument of include functor F is behaving as an implicitly-bound identifier in exactly the same way

I suppose the practical difference is that (module) is a module expression, but include functor is a structure item, which means that there are fewer places that include functor can appear, making it slightly harder for users to inadvertently capture the wrong context.

@lpw25
Contributor

lpw25 commented Feb 24, 2025

> it seems to me that the unnamed argument of include functor F is behaving as an implicitly-bound identifier in exactly the same way, so I don't understand the distinction that's being drawn.

The distinction I'm drawing is that an identifier can be used deeply within the term. It is this depth which brings in all the various issues around identifiers e.g. if you nest them how does shadowing work etc.

@lpw25
Contributor

lpw25 commented Feb 24, 2025

> On the other hand, I am less convinced by the naturality of include functor in signatures, along the argument of @yallop in #43 (comment).

I am somewhat sympathetic to this point: include functor in signatures is more unusual and, depending on how you look at it, uses functor types in a way they are not used elsewhere in the language. I do stand by my argument above, though: if you consider include S in a signature not as "insert the definitions of S at this point" but as "insert what would happen if you included an anonymous module of type S at this point", then include functor is a natural extension of this idea.

The idea isn't at all unusual when you consider how mixins and inheritance usually work. You have some "mixin type" or "class type" that you use both to classify the things you can inherit and to create signatures for modules/classes constructed using inheritance. For us, the types of our mixins are just functor types, and so the thing used to create signatures for modules constructed using mixins should also be functor types.

@Octachron added the pending-shepherd-recommendation label (An OCaml language committee shepherd is writing a recommendation on the subject) Feb 24, 2025

@fpottier

I wonder if the RFC could be improved a little bit so as to be more explicit about the two proposed features, namely include functor in structures and include functor in signatures.

I think I have understood that include functor in a structure expands to something that can be expressed in the language today. So include functor in a structure is just sugar, in a sense. I would like to see this expansion spelled out in the RFC.

Regarding include functor in a signature, I have not clearly understood whether it is "just sugar" in a similar sense.

@ccasin
Author

ccasin commented Feb 24, 2025

> I think I have understood that include functor in a structure expands to something that can be expressed in the language today. So include functor in a structure is just sugar, in a sense. I would like to see this expansion spelled out in the RFC.

A very reasonable request. I am a bit swamped today, but will flesh out the bit about signatures in the proposal later in the week.

@ccasin
Author

ccasin commented Mar 4, 2025

> > I think I have understood that include functor in a structure expands to something that can be expressed in the language today. So include functor in a structure is just sugar, in a sense. I would like to see this expansion spelled out in the RFC.
>
> A very reasonable request. I am a bit swamped today, but will flesh out the bit about signatures in the proposal later in the week.

@fpottier I've now pushed a commit that says more about signatures - sorry this took me a week to get back to! The new text incorporates some ideas from @gasche and @lpw25 from the discussion above, which I found helpful in explaining this construct.
