Skip to content

Implicit source positions #52

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

OlivierNicole
Copy link

@OlivierNicole OlivierNicole commented Feb 21, 2025

Rendered version

Introduction of the RFC:

A new type of arrow is introduced. Syntactically, it looks like:

val f : loc:[%call_pos] -> ...

in a signature, and

let f = fun ~(loc : [%call_pos]) ... ->
  ...

in an implementation.

The label name doesn’t matter, but there must be a label. The function parameter thus defined has type Lexing.position and, if not specified at a call site, is automatically filled with the position of the call site.

This can serve several purposes:

  • When chaining monadic binds, the usual stack backtraces are very noisy and less useful. You can replace (>>=) : 'a t -> ('a -> 'b t) -> 'b t with (>>=) : loc:[%call_pos] -> 'a t -> ('a -> 'b t) -> 'b t to automatically track binding locations (and then implement a tracing facility).
  • It can be used to improve the usefulness of testing frameworks (see e.g. Best-effort automatic reporting of the location of Alcotest.check mirage/alcotest#366) or tracing systems (such as meio[^1]), or error reporting in embedded languages.

History

A first PR[^2] has been proposed ten years ago by @let-def. After some design amendments, that PR was stalled for a while, until renewed interested was expressed in 2023 by @TheLortex. The feature was then discussed during a developer meeting and gathered consensus on the principle[^3]. It was subsequently implemented at Jane Street and has been in use there for several months.

The goal of this RFC is therefore to present the details of the design in use at Jane Street to make sure there is agreement to port the feature upstream. Jane Street has asked Tarides for help in the summarizing work that led to this RFC.

@OlivierNicole
Copy link
Author

cc @goldfirere @MisterDA

@goldfirere
Copy link
Contributor

This feature has widespread use already within Jane Street. It was originally implemented by @vivianyyd, with much help getting it into production by @Enoumy.

Copy link

@nojb nojb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I am in favour. Incidentally, we had such a facility in LexiFi's compiler fork since 2008 and recently we removed it in order to reduce the diff with upstream. Will be happy to get it back as an official feature! :)


### Controlling file paths in source positions

When implementing this feature, Jane Street also introduced a new compiler flag, `-directory`. When specified, `-directory some/path` causes the filename in automatically inserted positions to be prepended with `some/path`. Jane Street’s build system uses that flag to control the file paths reported by `[%call_pos]`. For example, this can be used to give more informative file paths than just a single `./filename.ml`, all the while ensuring a reproducible result and avoiding to include machine-specific paths like `/home/user/build/.../filename.ml` in the compiled program.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part seems orthogonal and indeed probably best left out of an initial PR. Also, note that in Dune, for example, all compilation invocations are made from the workspace root, so that the paths appearing in compiler locations are relative to the workspace root and not bare filenames.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We found that the feature was not very useful without this -directory bit, hence its inclusion here. But maybe our Dune is different from the upstream Dune in this way and that upstream Dune will work without -directory.

@gasche
Copy link
Member

gasche commented Feb 22, 2025

As mentioned previously, there is broad maintainer consensus in favor of the feature. I suppose that the purpose of the RFC is more to discuss the proposed syntax and the practical details.

The proposed syntax looks fine to me.

The -directory bit (and the partial overlap with BUILD_PATH_PREFIX_MAP) stands out as worth investigating: if we implement the feature without this, how does it work in practice in multi-directories projects? Does it work well with Dune? What needs to be changed for it to work better?

I think that we may need a PR that implements the language-level feature to answer these questions, unless you can use the Jane Street compiler with upstream dune to make tests and report on their behavior.

@dbuenzli
Copy link
Contributor

The -directory bit (and the partial overlap with BUILD_PATH_PREFIX_MAP) stands out as worth investigating: if we implement the feature without this, how does it work in practice in multi-directories projects?

Note that it's hard to answer these questions without fixing BUILD_PATH_PREFIX_MAP first.

My gut feeling is that with a correctly working BUILD_PATH_PREFIX_MAP, you don't want to add yet another compiler flag to manipulates what gets into __FILE__. The specification of __FILE__ should simply be: the file path as specified on the cli as transformed by the BUILD_PATH_PREFIX_MAP active during the invocation.

@nojb
Copy link

nojb commented Feb 22, 2025

Does it work well with Dune?

As mentioned, all Dune compiler invocations are done from the workspace root, so compiler paths are paths relative to the workspace root as well, so I think the answer to this question is "yes".

ocamlbuild displays a similar behaviour, except that the paths are relative to the directory from which ocamlbuild is invoked.

@gasche
Copy link
Member

gasche commented Feb 22, 2025

Thanks! This suggest that we could forget about the -directory part of the RFC, unless I am missing something.

@goldfirere
Copy link
Contributor

Yes, this does suggest we could drop the -directory bit. I'm happy to. The worst case scenario is that we've miscalculated here and that directories won't always show up correctly in some scenarios, in which case we can take another stab at fixing this -- not so bad.

in an implementation.

The label name doesn’t matter, but there must be a label. The function parameter thus defined has type `Lexing.position` and, if not specified at a call site, is automatically filled with the position of the call site.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find RFC quite unclear as to what positions you concretely get in the resulting Lexing.position values and if it is possible to specify multiple implicit source positions. Basically the notion of "position of the call site" is quite unclear to me.

E.g. if I have:

let f ~(loc0 : [%call_pos]) a0 ~(loc1 : [%call_pos]) a1 = …

And call

let () = f "hey" "ho" 

what do loc0 and loc1 point to in f. This:

let () = f "hey" "ho" 
           ^     ^
           |     |
           loc0  loc1

or something else ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think loc0 and loc1 will be equal and refer to the location of the whole expression f "hey" "ho".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And what do I get if I do:

let g = f "hey" 
let () = g "ho"

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would guess that loc0 will be the location of f "hey" and loc1 that of g "ho".

Copy link

@jberdine jberdine Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A question I would have as a user of this feature is how the presence of an implicit source position parameter might affect code transformations the compiler might perform. For example, I would suppose with the example f above that f "hey" "ho" and (f "hey") "ho" might lead to different values of loc0 and loc1. This would make me worry that the implicit source position parameter might block something like contification. Such transformations aren't explained as part of the source language, so are probably out of scope for the RFC, but just as a data point, I would need to figure this out before using the feature.

Copy link
Contributor

@dbuenzli dbuenzli Mar 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that if it's possible to access the location of the application here without fuss (if that's where the magic happens), I would rather suggest to change the terminology to locations (i.e. [%call_loc]) and perhaps simply type those as string * int * int * int to be understood as fname, line_num, start_byte_offset, end_byte_offset with offset relative to the first byte of line_num1; in other words the (multiline aware) format of OCaml error messages.

Having the whole range of the application enables quite a few more tricks :–)

Footnotes

  1. It's not the best format as you have to count lines in the source string to find the line first byte position, but it has the same memory footprint, if people care about that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming and the representation choice came from the fact that this is how the Lexing module presents the information, both in terms of content and name ("position"). I think we should consider a change to the stdlib if we are going to chance the specification of this feature similarly.

Copy link
Contributor

@dbuenzli dbuenzli Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I'm not complaining about the representation choice or information presentation, I'm complaining about the actual bits of information you report: one position versus two positions (a "text location").

In any case I'm not sure I see why a change to the stdlib is needed.

The whole idea of using the Lexing module (which is a runtime module for lexers generated by ocamllex) was a bit odd in the first place. For example the magic debugging variable confusingly named __POS_OF__ simply uses a structural type to report a location.

Copy link
Member

@gasche gasche Mar 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify: having two positions lets us denote a span/interval within the source, rather than just a point. This is the difference between Lexing.position, which is just one point, and Location.t which is an interval.

type t = Warnings.loc =
  { loc_start: position; loc_end: position; loc_ghost: bool }

I would also expect a source location/span/interval as opposed to a source position. (I have the impression that the choice of a source position goes back to ocaml/ocaml#126 .)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FTR Printexc does use a record type for reporting the same kind of information, see Printexc.location. Perhaps something to consider consolidating.

Note that the semantics is not better documented there though. There's a lot to unpack in the format of OCaml error messages –– namely byte start and end positions relative to the first byte of a line identified by one-based line number; with positions understood as defined in the introduction of the OCaml string module. It would be a good idea to once solve ocaml/ocaml#8978 and make all the mecanisms that report this kind of information refer to it.

@goldfirere
Copy link
Contributor

Conversation seems to have died down here, with the following results:

  • There is consensus in favor of adding the feature, with the given semantics (basically, echoing today's optional parameters but with a new behavior in the case that the caller omits the argument).
  • There is consensus around the following bits of syntax:

In types: e.g.

val f : ?loc:[%call_pos] -> unit -> unit

In function definitions: e.g.

let f ?(loc = [%call_pos]) () = ()
(* or *)
let f = fun ?(loc = [%call_pos]) () -> ()

In function call sites: e.g.

let () = f ()   (* omitted *)
let () = f ~loc:some_lexing_position ()   (* supplied *)
let () = let loc = some_lexing_position in f ~(loc : [%call_pos]) ()   (* supplied, with a hint to type inference that [loc] is a call-pos parameter, not a labeled one *)
let () = f ~loc:(some_lexing_position : [%call_pos]) ()  (* more direct syntax for the same thing *)

These last two examples will be very rare in practice, useful only really in higher-order contexts where we can't know the type of f beforehand. Using [%call_pos] as a type in any other place will be an error. (So, for example, let loc : [%call_pos] = some_lexing_position in f ~loc () would be an error.) Support for these last examples was added by demand within Jane Street.

Note also that it is never allowed to pass a call-pos parameter with ?, even though ? is used both in the type and the function definition. (This is not all that different from ordinary optional parameters, which always use ? in types and at definitions, but only rarely at call sites.)

  • There is consensus against adding a -directory flag, as it is not needed with dune.

Does that summarize the state of play well here? Once this gets merged in, we at Jane Street will adjust our use of this feature to match the new syntax.

Thanks!

@dbuenzli
Copy link
Contributor

Does that summarize the state of play well here?

The summary is missing the discussion about:

  1. Positions versus locations.
  2. The actual datatype to use.

Regarding 1. I think this should be implicit source locations which span the application and thus the syntax should rather be [%call_loc].

Regarding 2. I think it would be fine to reuse what has be used for __POS_OF__, which, despite the misleading name represents a location.

@dbuenzli
Copy link
Contributor

(This is not all that different from ordinary optional parameters, which always use ? in types and at definitions, but only rarely at call sites.)

I wouldn't say it's rarely used at call sites, you often use ? in call sites to propagate defaults "upwards", e.g.

let primitive ?(limit = 3) x =let derived_combinator1 ?limit y = … primitive ?limit …
let derived_combinator2 ?limit y = … primitive ?limit …

@goldfirere
Copy link
Contributor

The summary is missing the discussion about:

  1. Positions versus locations.
  2. The actual datatype to use.

Yes, thanks for reminding me about this! I think that (2) determines (1) -- that is, the actual data captured should dictate the name. After asking around within Jane Street, there is considerable enthusiasm for keeping just the single position, rather than the whole span. We don't need the full span, and the extra information is likely to have a performance impact (this surprised me, too). So the thought here is to keep the existing [%call_pos] feature using Lexing.position, but if there is demand, we could have an additional feature [%call_loc] that also grabs the end position.

It would not be hard to take the existing implementation of [%call_pos] and extend it to also support [%call_loc] -- but that step would actually take a little bit of extra design, given that there's no suitable type in the stdlib to track spans instead of single locations. I propose we separate these ideas and focus only on the [%call_pos] idea here. (I would favor [%call_span] over [%call_loc] for the other feature -- to me, "span" more clearly includes an endpoint. But I don't feel strongly.)

@gasche
Copy link
Member

gasche commented Mar 14, 2025

My intuition is that providing a location/span is likely to have various usability benefits in practice for some use-cases. (It's not even obvious what the 'position' of a function call is, I suppose we mean the starting position?) Given that this feature is precisely defined for usability, I would be inclined to prioritize this instead of performance.

I wouldn't believe the performance argument without more explanations. Can you elaborate what the issue would be? Note that Lexing.position is a four-element record that could probably be packed more efficiently if we wanted to consume less space. (It is probably possible to store a location/span in less space than your current proposal uses to represent a single position.)

If performance is expected to be a concern, I would propose using an abstract type for the span/location, with accessors that return various information -- could start as just val start : t -> Lexing.position val end : t -> Lexing.position. This would let us start with whatever representation we like, and optimize it later on without breaking clients.

@dbuenzli
Copy link
Contributor

dbuenzli commented Mar 14, 2025

We don't need the full span, and the extra information is likely to have a performance impact (this surprised me, too).

The runtime representation that __POS_OF__ uses for locations is exactly the same as Lexing.position, namely a quadruplet. So memory consumption is not going to change. Now the thing that changes is that if you want to extract the byte range of the span in the sources you need to count the number of newlines in your sources, I doubt that's going to be a performance hog in practice.

Another take would also be to simple have string * int * int a filename and inclusive byte range. Though that would make it harder for clients to directly spit out a line number which would hinder certain applications where you are not interested in processing the sources but only referring to them using OCaml's error message format (e.g. logging).

Personally I'm not super fond of these ad-hoc primitives so I think we should rather aim for having a single one and one that allows to express the functionality of __POS_OF__ that is the location span of a function call in the format the compiler uses to report errors, warnings and stack traces.

As far as naming goes, text locations vs position seems quite established in the compiler and beyond so I think simply having call_loc would be suitable.

but that step would actually take a little bit of extra design, given that there's no suitable type in the stdlib to track spans instead of single locations.

I think I have already mentioned at least two times that this is a red herring, you can simply reuse what __POS_OF__ uses.

@c-cube
Copy link
Contributor

c-cube commented Mar 15, 2025 via email

@TyOverby
Copy link

TyOverby commented Mar 15, 2025

On the "performance" angle, there's really two issues:

  1. Binary size: @c-cube is right that these values aren't allocating at runtime, but they still take up space in the binary, which does matter for js_of_ocaml programs, especially when you have thousands to tens-of-thousands of instances like we do in some programs.
  2. Conversion: Though you could always convert a span to a position, that conversion will probably allocate at runtime. Also, this language extension is great for reducing boilerplate for callers, it would be a shame if it introduced more boilerplate for callees in the form of these conversions.

Now, I don't expect that my "Conversion" point will be convincing if you think that spans are always preferable, but Lexing.position has served us quite well, with the [%here] ppx, and now with [%call_pos]. In their widespread use at Jane Street, [%call_pos] is almost always used to provide a better debugging and instrumentation experience for end-users, and the position of the invoked function is both unambiguous and desirable (especially when formatting it in a way that most text-editors have good "jump to file/line/column" support for). Although I'm sure there are examples of places where having the whole span would be better, if we only provided [%call_span], I'd expect that 99% of use-sites would immediately convert it into a Lexing.position, and that would be a shame.

Another take would also be to simple have string * int * int a filename and inclusive byte range. Though that would make it harder for clients to directly spit out a line number which would hinder certain applications where you are not interested in processing the sources but only referring to them using OCaml's error message format (e.g. logging).

As mentioned above, surfacing positions for logging/telemetry/debugging tools is definitely the main use case for this feature.

I think I have already mentioned at least two times that this is a red herring, you can simply reuse what __POS_OF__ uses.

I like Lexing.position. Unlike __POS_OF__ which returns (string * int * int * int), Lexing.position is a record, so it's harder to accidentally misuse or make mistakes constructing one (I'm really not a fan of having three ints floating around and having to remember which one corresponds to what.) Also, but it looks like __POS_OF__ doesn't handle expressions that span multiple lines, which are pretty common for function calls.

fwiw, I'm not suggesting that we ignore spans, I think it'd be nice to have both [%call_pos] and [%call_loc]. This proposal is motivated by improving the usability of the language, and I'd be pretty sad if I had to litter my codebase with conversion functions just because it's conceptually cleaner to only support one construct instead of two.

@dbuenzli
Copy link
Contributor

2. Conversion: Though you could always convert a span to a position, that conversion will probably allocate at runtime.

If all you do is format, unlikely. It will not be different from the transform performed by the code you linked below:

(especially when formatting it in a way that most text-editors have good "jump to file/line/column" support for).

That's not a very good format. Unlike OCaml's error and stacktrace format, it will routinely fail to locate the right bit on sources with Unicode characters. Besides personally, in my program, I'd rather not have a different syntax for stacktraces and instrumentation statements – there's no need to overload my brain with different notations. So I wouldn't take that as an argument in favour of the information given by Lexing.position.

Lexing.position is a record, so it's harder to accidentally misuse or make mistakes constructing one (I'm really not a fan of having three ints floating around and having to remember which one corresponds to what.)

I'm not against adding a type for locations as formatted by OCaml. Also as already mentioned there is one in Printexc.location though for reasons that are unknown to me it has one more integer slot.

Also, but it looks like __POS_OF__ doesn't handle expressions that span multiple lines, which are pretty common for function calls.

It absolutely does handle multiple lines. There seems to be misunderstandings about OCaml's error location and stacktrace reporting format which I explained in the discussion here and which is what __POS_OF__ gives you.

@dbuenzli
Copy link
Contributor

If all you do is format, unlikely. It will not be different from the transform performed by the code you linked below:

Just to make that clear with a location __POS_OF__ the snippet code you refered to would just turn into:

(* This is the same function as Ppx_here.lift_position_as_string. *)
let make_location_string (pos_fname, pos_lnum, pos_first, _pos_last) =
  String.concat
    [ pos_fname; ":"; Int.to_string pos_lnum; ":"; Int.to_string pos_first ]
;;

@TyOverby
Copy link

TyOverby commented Mar 15, 2025

Besides personally, in my program, I'd rather not have a different syntax for stacktraces and instrumentation statements – there's no need to overload my brain with different notations.

That's fine; you can always use [%call_loc]!

Also, but it looks like POS_OF doesn't handle expressions that span multiple lines, which are pretty common for function calls.
It absolutely does handle multiple lines.

If I wanted to format a span for people to read and understand, then I'd expect (start-line start-col), (end-line, end-col), not whatever "pos_last" is.

let make_location_string (pos_fname, pos_lnum, pos_first, pos_last) =  
  Core.String.concat [ 
    pos_fname; 
    ":"; Int.to_string pos_lnum; 
    ":"; Int.to_string pos_first; 
    ":"; Int.to_string pos_last ]

let positions, _ = __POS_OF__(
  let x = 5 in 
  let y = 10 in 
  x + y)

make_location_string positions
"//toplevel//:1:19:72"

If I'm reading that string, it's not obvious at all what span of code is covered by the "72" there. Maybe a tool could use that in a useful way, but as I've mentioned, the primary usecase here is for people to read.

  1. Conversion: Though you could always convert a span to a position, that conversion will probably allocate at runtime.
    If all you do is format, unlikely. It will not be different from the transform performed by the code you linked below:

Could you expand on this? It sounds like you're saying that building a Lexing.position record at runtime or doing string concatenation doesn't allocate, which can't actually be your claim.

Unlike OCaml's error and stacktrace format, it will routinely fail to locate the right bit on sources with Unicode characters.

let x = "a" in __POS_OF__ ();
- : (("//toplevel//", 1, 15, 28), ())

let x = "🐫" in __POS_OF__ ();;
- : (("//toplevel//", 1, 18, 31), ())

Looks like __POS_OF__ doesn't do great either in this regard either if you expect it to count visual characters instead of byte offsets. Fwiw, at least vim counts bytes, so I actually prefer this behavior.

Oh also, neither do type error messages:

let x = "a" in __POS_OF__ (1 + "hi");;
Line 1, characters 31-35: Error ...
let x = "🐫" in __POS_OF__ (1 + "hi");;
Line 1, characters 34-38: Error ...

@dbuenzli
Copy link
Contributor

dbuenzli commented Mar 15, 2025

That's fine; you can always use [%call_loc]!

That's not the point, what I'm saying is that the link you gave on how you want to format things for users with a [%call_pos] and Lexing.position is trivially embedded in my proposal with [%call_loc]. It will even save you a subtraction. So I don't see a strong need to add multiple things here.

Could you expand on this? It sounds like you're saying that building a Lexing.position record at runtime or doing string concatenation doesn't allocate, which can't actually be your claim.

What I said, and as shown by my snippet, is that there are going to be no conversion costs between locations as I want them and positions as you want them, as you claimed there would be in your point 2.

Looks like __POS_OF__ doesn't do great either in this regard either if you expect it to count visual characters instead of byte offsets. Fwiw, at least vim counts bytes, so I actually prefer this behavior.

I does if you have a properly configured editor, that understands the semantics. Trying to count visual character in the Unicode world is a doomed endeavour, even counting Unicode scalar values won't do, you need feedback from the rendering layer. Talking in (relative as far as OCaml's format is concerned) bytes with the editor is the only reliable way.

In that respect the GNU standard error format which expects you to count visual columns with wcwidth is broken in the real world of Unicode.

Now the thing with OCaml's error information is that it is in fact a marvel of nice tradeoffs, these 3 integers – which again have the same memory footprint as a Lexing.position and can report the same broken positions – allows you to:

  1. Address a full, precise, multi-line span of Unicode text in a source, for editors that understand the semantics of the format.
  2. Has a graceful degradation for humans if they can't use the machine. They can read the line number and the first column number (exactly like the way you want to format your Lexing.position) which will only be an approximation if your line has Unicode characters (again exactly like a Lexing.position). The last number is a bit more difficult to comprehend humanely but you can still use it if you are in the secrets of the gods.

@OlivierNicole
Copy link
Author

Here is a hopefully fair summary of the debate between “point location” and “span location”. A full location span would be more general than a single position, however @TyOverby expresses concerns about the consequences in terms of binary size and conversion costs.

@dbuenzli makes the point than using the same format as used by __POS_OF__ namely,

  • the filename
  • the line number of the start position, counting from 1
  • the byte offset relative to the beginning of that line
  • the byte offset of the end position relative to the start position

takes up the same size as the currently used Lexing.position and has strictly more expressive power, and the same limitations regarding Unicode. Also it directly contains the three often used values: filename, start line, start “column”.

And indeed with a position encoded in this format, one can do everything that can be done with Lexing.position and more (for example, if the source file contents are available, one can compute the end line and end “column”.

Suggestion: we could make the location type abstract with accessors as suggested by @gasche so that the actual representation is an implementation detail. For starters, we could implement it using the format described above, as it takes up no more space than the current proposal and allows to do the same things without cost, and also to derive a span.

In the future, if needed, we can always switch to a more compact and flexible representation, such as the one used in native debug info (see ocaml/ocaml#10111).

@gasche
Copy link
Member

gasche commented Mar 17, 2025

I'm convinced that we should favor usability over performance here. The performance costs that we are talking about are mostly minor (how large already are the js_of_ocaml outputs that we would be adding tens of thousands of words to?), and they can be further reduced by being clever.

@OlivierNicole
Copy link
Author

OlivierNicole commented Mar 17, 2025

Maybe I’m already tired at this time of the afternoon, but it’s not entirely clear from your message what part of my suggestion you are objecting to, and what you propose to replace it with.

@gasche
Copy link
Member

gasche commented Mar 17, 2025

Your summary is fine! I am clarifying my own opinion: after consideration and the further discussion, I am now convinced that we should use locations and not positions.

@gasche
Copy link
Member

gasche commented Mar 17, 2025

Some further comments.

You suggest to start with the same internal representation as POS_OF, and I think this is very reasonable.

Deciding whether to hide the representation within an abstract type or not is not obvious:

  • As @dbuenzli point out, the current hack of POS_OF has been serving us well, so maybe it would be fine to keep using this. (Maybe it's possible to use labelled tuples here, without breaking POS_OF-using code?)
  • On the other hand, if you want a chance to optimize the representation later on, you need to ask for an abstract type now. If I had JaneStreet-sized binaries I would support this, so my guess would be that you @OlivierNicole should explore this option.

Designing an abstract type for this is not obvious, in particular because for anything non-structural we need this type to be defined in the standard library -- not compiler-libs. Maybe an option would be to introduce a new type in the Lexing module, Lexing.Loc.t or Lexing.Span.t? (This is a hack because Lexing is meant to provide the runtime for ocamllex, not for this, but oh well, aren't those fine names?) For now @dbuenzli is in favor of keeping the trite design that we inherited from POS_OF, but maybe he could be convinced if we do something clearly better; @dbuenzli, you may even have suggestions for an API for such a submodule?

Maybe a dummy proposal to get things going:

module Lexing.Location : sig
  type t
  val file : t -> string

  val start_pos : t -> Lexing.position
  val end_pos : t -> Lexing.position

  val start_line : t -> int
  val end_line : t -> int

  val start_column: t -> int (** TODO document 0- or 1-indexing and that it is a byte offset *)
  val end_column: t -> int

  val start_offset : t -> int (** TODO document that it is a byte offset *)
  val end_offset : t -> int

  val to_POS_tuple : t -> (string * int * int * int) (** See {!__POS_OF__}. *)
end

I am not sure that it is possible to implement end_column without recomputations if we use the POS_OF tuple format, and maybe this would be an argument in favor of not doing that.

Edit: the constructor for this abstract type is missing from this API sketch. There should be one. I suppose that the compiler would be synchronized with the internal implementation, and unsafely inject a structured constant matching this representation.

@dbuenzli
Copy link
Contributor

@OlivierNicole it's not clear what you mean by the start position, but this is not what the OCaml error format does:

  • the byte offset of the end position relative to the start position

Both offsets are relative to the start of the line identified by the line number.

and the same limitations regarding Unicode

For humans. But for machines, if the process in charge of highlighting the location information is smart about it, it allows to overcome the problems posed by Unicode (by specifying the location precisely in the text encoding space rather than in the visual space).

Regarding reusing the format of __POS_OF__ that's not something I'm especially attached to. What I'd like is the location (vs. a Lexing.position).

I did however wonder whether keeping the same format would allow to use __POS__ (which is also a location, albeit always occurs on the same line) and __POS_OF__ in interesting way in an explicit specification of a [%call_loc] argument at the call site. But I don't think so (?).

Maybe a dummy proposal to get things going:

@gasche, note that if you want to keep the memory footprint of __POS_OF__/Lexing.position you can't recover most of the information you mentioned there (e.g. Lexing.position values, end_line). Also I would disagree with quite a few of the names there (e.g. avoid the whole column business, that idea is dead).

Regarding a suggestion, in a few projects now I have been using this for text locations which allows to recover all forms of error reporting (OCaml-like or broken GNU multi-lines) but it's a bit more heavy weight (6 integers, though they could be flatten in a single record).

Something I would like to understand is why two more field in Printexc.location were added in 5.2, because without them, it's actually represents __POS_OF__ exactly and these additions look rather ad-hoc to me.

@gasche
Copy link
Member

gasche commented Mar 18, 2025

I'm not worried about having 6 integers, as one could probably pack them into three Int63 values. (I'm assuming it is reasonable to assume that these line numbers or byte offsets inside one single source file will each fit 31 bits.) If we use an abstract type, it would also be easy for people who maintain a fork of the compiler to choose a different internal representation that preserves less information, and return -1 on some of the accessors.

@dbuenzli
Copy link
Contributor

dbuenzli commented Mar 18, 2025

I'm not worried about having 6 integers, as one could probably pack them into three Int63 values.

Another option would be to have a string with all these numbers packed in sequence one after the other using a simple unsigned variable byte encoding with a continuation bit. This is what is done for example in source maps (signed there apparently and wrapped in a base64 layer).

The nice thing about the Textloc.t type I linked to is that you get absolute byte offsets in the file which means you don't have to perform an elaborate dance whenever you want to find the concrete byte range to perform surgery on the sources.

@jberdine
Copy link

Aside, but if people want to look at some prior work on representations for positions and spans, it might be interesting to look at hack's Pos module and its dependencies. The size of these is significant for hack. There are files with ridiculously long lines, and some with ridiculously large numbers of lines. While the smallest packs a span into a single int, not every span can fit and so there are several representations and a compress operation that selects the smallest usable representation.

@goldfirere
Copy link
Contributor

If your need is that computing the start position is fast, we (or you) can represent source locations as a pair of positions, then computing the start position is just a projection and I don't see performance concerns.

True! But it still means that we have twice as many locations statically allocated in the binary as necessary -- and it requires extra boilerplate in functions to do the conversion. I'm probably more concerned about the latter (manually calling the conversion function) than the former. I agree that this seems to address the runtime performance concern.

I still don't see what's so bad about having two features, catering to different use-cases. We have many integer types (each with its own literal) and two built-in collection types (each with its own literal). Instead, we could have just one integer type and one collection type and then ask programmers to convert -- that would be as expressive. But it's a poor experience, and so we provide multiple different types. That's all I'm arguing for here.

@gasche
Copy link
Member

gasche commented Mar 26, 2025

My impression is that we would not be be providing a better experience to end-users by providing two almost-identical features with related (but not parametrized) syntaxes, but rather pushing complexity costs onto them (and onto future OCaml maintainers and contributions) to avoid making a decision now.

I don't see a scenario where end-users would clearly benefit from having both options available (either they only want one position and either design is just fine, or they want a span and they cannot use %call_pos), and I could see how it would create decision fatigue.

My impression right now is that if we magically decide to only support locations/spans upstream (as I think was originally intended by let-def) and merge a reasonable implementation of that, then Jane Street will be just fine -- there are plenty of options, supporting both features in a transition period or forever, or keeping just your code as you like it, or proposing a refactoring to make it trivial for both features to share as much code as possible. I sort of wish I had been forceful about this three weeks ago when we realized the position-vs-location misunderstanding, we might have been done by now.

@goldfirere
Copy link
Contributor

I still don't see how this is different than with literals -- I could make the same argument around integer literals, keeping only the largest literal form. Those, too, are not parametrized.

It would make me sad to see the conversation go the way you're suggesting. Up until now, all the surface-language features we have written, we have expected to be able to get, in some form, into the upstream language. The truth is that we don't want to maintain a difference forever, but the spans feature simply does not fit our needs. We can, of course, maintain our version. But it's surprising to me that our claims of "this won't work for us" seem not to suggest that it might also not work for others. After all, the goal here is to reduce boilerplate, and the suggestion you're making for this feature would fail to do that (as much as is possible) for our use-case.

@dbuenzli
Copy link
Contributor

But it's surprising to me that our claims of "this won't work for us" seem not to suggest that it might also not work for others.

As far as I read you in this conversation the "this won't work for us" seems rather to be related to a data structure mismatch (Base.Source_code_position.t) that you can't change immediately because you didn't make it abstract in your code base.

This would of course not apply to other users who have never been subjected to the feature. So I don't see any evidence so far "that it might not work for others" too.

I also didn't see exactly what – except for the mismatch problem – in spans does not fit your need. Once you manage to make your Base.Source_code_position.t represent them: it fully includes it (at a space/time cost tradeoff we can change or make exactly as the current one if you desire so).

Could you perhaps be more specific ? So far you have only shown this which is trivial to support with a span.

@goldfirere
Copy link
Contributor

As far as I read you in this conversation the "this won't work for us" seems rather to be related to a data structure mismatch (Base.Source_code_position.t) that you can't change immediately because you didn't make it abstract in your code base.

Yes that's basically true. But I don't see how it invalidates my claim about others -- the base library is open-source, and thus anyone who uses it will also be potentially affected here.

Even if it were abstract, though (and thus could be more easily migrated to have a different representation) I think I'd still want the ability to talk about single positions, which presumably take up less space to encode. My motivation would be weaker, admittedly, but not gone entirely. It seems silly to require us to synthesize full spans throughout our programs if all we want are single locations.

The difference I'm proposing between [%call_pos] and [%call_loc] is just that they would produce different types -- this is a distinction that programmers are very accustomed to. I think the cognitive burden here is very low.

@dbuenzli
Copy link
Contributor

Yes that's basically true. But I don't see how it invalidates my claim about others -- the base library is open-source, and thus anyone who uses it will also be potentially affected here.

These people did not use implicit positions in their code…

Anyways, I don't feel this discussion is productive so I'm going to stop here. I'm not the one making decisions anyways.

But I'll just say that while I generally think Jane Street's work on the language is very beneficial for OCaml I'm not very fond of the way this particular RFC is being handled. The way I personally see it is that it is not being argued from an OCaml system design perspective, it is argued from an "avoid a technical debt for Jane Street" perspective.

@shindere
Copy link

shindere commented Mar 28, 2025 via email

@dbuenzli
Copy link
Contributor

dbuenzli commented Mar 28, 2025

@shindere the whole system is confused by these terms and their semantics and, as this discussion has repeatedly shown, they are not even clear to people working on the compiler.

Here's a glimpse on the confusion:

That being said if people find it would be an opportunity to clarify I wouldn't object to [%call_span] and a corresponding Textspan module1 (EDIT: thinking more about it in relationship to the footnote, not sure in fact, perhaps a Textspan.t should rather represent a string * Textloc.t).

Footnotes

  1. I remember pondering using Cmarkit.Textspan but eventually stelled on Cmarkit.Textloc because the notion of span exists in the commonmark language (code span, math span) and in HTML. Using the notion of location was a way to distinguish between a spans' textual content and its localisation in text.

@shindere
Copy link

shindere commented Mar 28, 2025 via email

@dbuenzli
Copy link
Contributor

I would have thought they refer all to the same thing and that, precisely, would be the reason for usingspan everywhere, because it's widely understood as meaning text region?

I would say that in the end it's not entirely clear. As your program may need vocabulary and types for:

  1. Locating a subrange of bytes. That a specification on how to index a string without having the string.
  2. The actual subrange of bytes. That is a specification on how to index a string with the string itself.

Otherwise said:

type textloc = … 
type textspan = string * textloc 

If you have used textspan for 1. then it becomes difficult to name 2.

@shindere
Copy link

shindere commented Mar 28, 2025 via email

@goldfirere
Copy link
Contributor

I apologize for using “location” where I should have used “position” (which, until this conversation, held the same meaning for me — the SrcLoc type in GHC describes one point in the source, as opposed to SrcSpan). I understand that in this conversation I should use “position” to describe one place in the source code and “location” to describe the range between two places; please forgive the slip in my last message.

On to the substance:

I am arguing for having two different ways of representing implicit source whatevers for two reasons:

  1. To avoid technical debt for Jane Street and anyone who uses our open-source set of packages. While it’s true that non-JS folks won’t be using implicit source positions already, they have to be using explicit ones. This is done through the open-source ppx_here preprocessor, which inserts a value of type Lexing.position into your code in place of [%here] and is a required argument of many functions in our libraries. It’s possible that the change would affect no one beyond Jane Street — I’m murky on how pervasive use of our libraries is outside our walls. But even if it affected only Jane Street, we’re still talking about churn in a codebase used by 1,000+ OCaml programmers.
  2. Even neglecting technical debt, having only a single position would be better for us. We simply don’t need the end of the source code ranges. And so synthesizing less information keeps our binaries smaller and faster. I imagine that, if we had only [%call_loc] in the language (and our libraries worked with it), we would probably seek to add [%call_pos] as an optimization. Furthermore, I would imagine that there exist other users, beyond Jane Street, who also do not care about the end of the range and would be better served by [%call_pos].

My guess is that @gasche, @dbuenzli, and I will not be able to convince each other to change position (or location!) at this point. So I’m curious how others feel here. Please weigh in! If others here agree that having two different possibilities here would really harm the language, I could see myself retreating. But as of right now, I still think having two possibilities would make OCaml a better language.

@nojb
Copy link

nojb commented Mar 28, 2025

My guess is that @gasche, @dbuenzli, and I will not be able to convince each other to change position (or location!) at this point. So I’m curious how others feel here. Please weigh in! If others here agree that having two different possibilities here would really harm the language, I could see myself retreating. But as of right now, I still think having two possibilities would make OCaml a better language.

I am a bit wary of wading into the quicksand :)

From my perspective, the discussion feels a bit "theoretical" in that (at least I) do not have a good picture of what are the potential uses of this feature that may need "locations" instead of "positions". More precise descriptions of these uses may be useful to move the discussion forward. For what is worth at LexiFi we carried for many years a version of this feature that used a "position" type and it worked very well for us (used exclusively for debugging and logging), but these were of course very specific uses and are not representative.

I would not be shocked to have both variants in the language; it does not seem very different from __POS__, __LINE__, __MODULE__, etc. Again, it is hard to agree on what is the "best" variant without talking about the expected usecases. For example, I can imagine users who may want even more information than what is offered by a hypotethical "location" type, and have for example access to the positions/locations of individual arguments... again, hard to know where to draw the line without discussing concrete usecases.

@gasche
Copy link
Member

gasche commented Mar 28, 2025

I can imagine users who may want even more information than what is offered by a hypotethical "location" type, and have for example access to the positions/locations of individual arguments.

I agree, it would also make sense to hope for separate locations for the function being called and for each argument. (The position of the application is presumably a join of that.)

@MisterDA
Copy link

MisterDA commented Mar 28, 2025

I became interested by this feature as a user and occasional maintainer of Alcotest, a testsuite for OCaml. Ideally, it would expose a function Alcotest.check taking this optional parameter, which allows the library to pretty-print the location of a failed assertion in the test program. The interface of Alcotest currently looks like this:

module Source_code_position : sig
  type here = Lexing.position
  (** Location information passed via a [~here] argument, intended for use with
      a PPX such as {{:https://github.com/janestreet/ppx_here} [ppx_here]}. *)

  type pos = string * int * int * int
  (** Location information passed via a [~pos] argument, intended for use with
      the [__POS__] macro provided by the standard library. See the
      documentation of [__POS__] for more information. *)
end

type 'a extra_info =
  ?here:Source_code_position.here -> ?pos:Source_code_position.pos -> 'a
(** The assertion functions optionally take information about the {i location}
    at which they are called in the source code. This is used for giving more
    descriptive error messages in the case of failure. *)

val check : ('a testable -> string -> 'a -> 'a -> unit) extra_info

I've seen in the wild arguments labeled __POS__ to avoid using ~pos:__POS__. If I'm not mistaken Alcotest supports both positions and locations.

Now, as the compiler doesn't (yet) insert source code {posi,loca}tions, I'm walking the callstack to find the first call outside of the library, which must be the site of the check function (callsite_loc.412.ml). I can then use the information to report the site of the failed assertion to the user.

Take this dummy program that raises somewhere down the stack, and look at the printed backtrace:

$ cat >test.ml <<EOF
let f () =
    assert


      false

let g () =
    Sys.opaque_identity (f ())

let () =
    Sys.opaque_identity (g ())
EOF
$ ocamlfind opt -g test.ml && OCAMLRUNPARAM=b ./camlprog.exe
Fatal error: exception Assert_failure("test.ml", 2, 4)
Raised at Test.f in file "test.ml", lines 2-5, characters 4-11
Called from Test.g in file "test.ml" (inlined), line 8, characters 24-30
Called from Test in file "test.ml", line 11, characters 24-30

If I understand correctly, at each point in the callstack the compiler has stored a range (location), except in the exception itself.

I want to replace the problematic walk of the callstack (what if there's no debug info? inlining?) with the implicit source location, but I'd rather not have fewer information than what walking the callstack can give me.

A bit orthogonal to the discussion, but the dream feature would be that Alcotest detects if it's running in a CI. If so, for instance with GitHub Actions, it can print annotations that the CI recognizes. One of them would allow the pretty-printer of the assertion failure to link back to the source code within the GitHub UI (see Workflow commands / Setting an error message). The GitHub workflow language supports code spans ({start,end}×{column,line}). I'd rather use this feature to its full extend.

I'm also wondering how backward-compatibility will be achieved. If a library exposes [%call_pos] in its mli, presumably some sort of preprocessing need to happen to remove it for older OCaml compilers? if so, could be have a "blessed" way of doing that?

@gasche
Copy link
Member

gasche commented Mar 28, 2025

I'm also wondering how backward-compatibility will be achieved. If a library exposes [%call_pos] in its mli, presumably some sort of preprocessing need to happen to remove it for older OCaml compilers?

We could indeed have a ppx that turns [%call_pos] arguments into standard optional arguments with a dummy/none location as the default value, and never synthesizes them when they are missing at callsites. This would result in code not having any actual locations (except when explicitly provided by callers) on older OCaml versions.

If so, could be have a "blessed" way of doing that?

I would personally be happy to provide my blessing to anyone motivated to do it, but I would not count on an "official" implementation as part of the upstream integration work.

@nojb
Copy link

nojb commented Mar 28, 2025

I'm also wondering how backward-compatibility will be achieved. If a library exposes [%call_pos] in its mli, presumably some sort of preprocessing need to happen to remove it for older OCaml compilers? if so, could be have a "blessed" way of doing that?

I don't think we should spend too much time on this: this is a language change, and we don't typically try to make new language features compatible with older compilers.

@dbuenzli
Copy link
Contributor

dbuenzli commented Mar 28, 2025

@nojb said

From my perspective, the discussion feels a bit "theoretical" in that (at least I) do not have a good picture of what are the potential uses of this feature that may need "locations" instead of "positions".

Some have been provided here. In general as @MisterDA hinted it can be useful for precise application-level stacktraces for specific operations in your system.

@gasche said

I agree, it would also make sense to hope for separate locations for the function being called and for each argument. (The position of the application is presumably a join of that.)

I think we are never going to get to something if we start factoring in this in :–) Note that with spans you can already trace your function argument locations modulo a call to a combinator or a high-precedence unary operator:

trace_arg : ?loc:[%call_loc] -> 'a -> 'a * textloc
let ( !! ) = trace_arg

as far as I'm concerned that would be good enough® – and in fact exactly what I would end up doing for my replacement of __POS_OF__ for snapshot testing.

@goldfirere said

2. if we had only [%call_loc] in the language (and our libraries worked with it), we would probably seek to add [%call_pos] as an optimization.

Note that doing this optimization would also entail for you not to depend on Lexing.position as you do now: Lexing.position is already wasteful for how you use it. So if you want to retain only the information you really want you'd also run into the same technical debt problems.

I think it would be worth nailing down a bit some of the possibilities.

Let's say that a representation is:

  • A position if it represents a single byte in the source.
  • A location if it represents a byte range in the source.
  • direct if you can directly index the bytes in the source, indirect if you can find the exact bytes by counting the lines in the source, and lossy if you can no longer index the bytes.

I think either one or both of these two things could be done, each with a different variant (a or b).

  1. a. [%call_pos] is added representing the direct start position of a function call with a value of type Lexing.position
    b. [%call_pos] is added representing the indirect start position of a function call with a new abstract type representing the file, line and byte offset on the line.
  2. a. [%call_loc] is added representing the indirect location of a function call by a value of the type of __POS__. Possibly abstracted into a Textloc module.
    b. [%call_loc] is added representing the direct location of a function call by a new compact type operated upon via a new Textloc module.

A few comments:

  • What seems to be misunderstood is that doing 2a only would not change anything to Jane Street's current performance characteristics, it would even save them a subtraction – but would entail them technical debt.
  • I'd still be curious what a minimalistic 2b. only could give. As far as memory footprint goes I'm pretty sure it would end up being smaller than their current 1a. The question is if it would be competitive (three varint decodes) to render their positions when they need to in the context of a running program. Depends how often you do it and what else your program is doing.

That being said what would really upset me is to do only 1. or only "1. as a start" :–)

@goldfirere
Copy link
Contributor

Thanks @dbuenzli for that analysis -- very helpful.

In talking about this all (with @OlivierNicole) I also came to the conclusion that we would have to pay back that technical debt one way or another. So I may spend part of my afternoon tackling that -- we'll see. But I'm more convinced than I was that I shouldn't worry about that detail.

I'm also convinced that we should not just do (1). My argument at this point is simply that I don't want to just do (2) -- let's do both. I have a harder time choosing between (a) and (b) within (1) and (2). My instinct is that (b) is the right setting within (1). After all, if you want more than just to print a standard <file>:<line>:<column>, you can always use (2). And I think at this point I'm agnostic on the difference between (2a) and (2b).

@dbuenzli
Copy link
Contributor

a standard <file>:<line>:<column>

That reminds me: something that may not be clear from my terminology is that "lossy" implies likely inaccurate in the column, both for the machine and the human, if you have Unicode characters on the line.

And I think at this point I'm agnostic on the difference between (2a) and (2b).

Note that 2b can express 2a. So if we are smart about the way we devise the API for accessing 2a (instead of just using a concrete __POS__ type) we can start with 2a and change it later to 2b without breaking (though that change would likely never happen: everyone would be scared by space/time performance regressions).

Also another thing to consider is that compact varint encodings could be devised for any of 1b/2a/2b. At least for the memory footprint it's quite clear there is a win. E.g. for 1b column numbers SHOULD mostly be <= 80 so they would end up being encoded most of the time on 1 byte instead of 8 (on 64-bit platforms).

@dbuenzli
Copy link
Contributor

dbuenzli commented Mar 28, 2025

a standard <file>:<line>:<column>

That reminds me: something that may not be clear from my terminology is that "lossy" implies likely inaccurate in the column, both for the machine and the human, if you have Unicode characters on the line.

Well in fact lossy doesn't exist. 1b is is not lossy, it's indirect. If you compute <column> like you do now the result is the byte offset on the line, not a GNU visual column (except on ASCII text where both concept coincide).

So if you render that byte offset pretending to be a GNU <file>:<line>:<column> format the interpretation may be inaccurate if you have Unicode characters on the line. But if you can convince your editor to interpret it as <file>:<line>:<byteoffset-on-the-line> it will be perfectly accurate, all of the time.

@goldfirere
Copy link
Contributor

So where do we sit here? It feels like there is support for 1b over 1a. And if we have an abstract interface for 2, then the details matter less there. Are we happy to have both 1 (of some form) and 2 (of some form)? Would it be helpful at this point to ask the committee to guide us to a decision?

@gasche
Copy link
Member

gasche commented Apr 1, 2025

My preference is to do (2a) as a first step, while writing the code in such a way that it's easy for Jane Street to maintain (1) if they want and share most of the code. For example, one concrete way I proposed to do this is to first implement both (1) and (2) in the parsetree and below (I envision a construct that takes a variant type with two possible constructors), and then remove the code for (1) (same construct, same variant type, just one constructor left).

In this direction (which I understand may not be unanimous) the main blocker is that I'm still not sure what API we want to expose for the type of program locations. I'm happy with the idea of exposing a Textloc module that other people can benefit from, but there is a choice between having just Textloc.t, or also having an explicit separate Textloc.Compact.t (which influences the user-facing API design). Finally, we could decide that the compiler-generated type for [%call_pos] is not even Textloc.t but some abstract type application_source_location that can be unpacked into Textloc.t and nothing else, and keeps the possibility of adding more information later if we fancy.

If I wanted to converge quickly in this direction I would try to iterate on a concrete API, or maybe kindly ask @dbuenzli to come up with one and see what this gives (he is often the most demanding API critic, so starting from something he likes saves time).

@OlivierNicole
Copy link
Author

My understanding after discussing with @goldfirere is that Jane Street would rather use a language that is as close as possible as the OCaml that everyone else uses, and maybe one day the same language; so they would rather not use a non-exposed feature of the language.

IIUC @dbuenzli and @gasche are against doing both 1a and 2a, because 2a contains strictly more information than 1a with no performance penalty, and they estimate that avoiding technical debt for Jane Street does not justify having two forms of position arguments.

So maybe a decision that would make everyone slightly happier is to have both 1b and 2b as proposed by @goldfirere, if Jane Street can convince the other maintainers that 2b will not be performant enough for them, and if they don’t mind paying the work of switching away from Lexing.position.

Also noting that depending on which performance indicators should be optimized, Lexing.position may be the best candidate for 1b: as noted somewhere by @dbuenzli, it isn’t the most compact format, but allows to retrieve the line, “column” and byte-offset-in-the-file with no processing or nearly none (only a subtraction for the “column”).

@gasche
Copy link
Member

gasche commented Apr 4, 2025

My general reasoning so far has been that making the type of implicitly-inferred locations abstract can give us leeway to change things in the future, to move to richer representations and/or to more optimized representations. I think that any sensible representation would work in practice, and in particular a pair of Lexing.position would be just fine. But apparently we are unable to converge towards a consensus: Jane Street insists that we really must support exactly what they implemented because moving away to something more general is too difficult for them, and I don't like going away from the simplicity of exposing a single representation to users.

Perhaps we could try, instead of having a single abstract type that strikes a good compromise, to have a GADT that lists several different types. We can then evolve the design by adding more representations later on, which is a user-visible change (people have to change their code to benefit from richer or more efficient representations) but perhaps something that we can compromise on.

For example this could look like this:

type _ implicit_arg =
  | Pos : Lexing.position implicit_arg
  | Loc : Textloc.t implicit_arg
(* maybe someday:
  | Call_loc : Textloc.t list implicit_arg
*)

val f : ?loc:[%implicit_arg Loc] -> ...

let f ?(loc=[%implicit_arg Loc]) = ...
  (* loc has type Textloc.t, derived from the type of Loc *)

@dbuenzli
Copy link
Contributor

dbuenzli commented Apr 4, 2025

I have to say that I'm not super fond of @gasche, "let's not make a choice in the end" proposal.

For me the problem is that we are in total conjecture limbo as far as (non technical debt) costs are concerned. For example I wouldn't be so sure to follow @gasche that a pair of lexing positions would be just fine.

There are two main costs that are interesting to look at:

  1. Increase in executable size.
  2. Costs of using the representation at runtime (note that due to cache effects compact representations may not necessarily be slower, but that needs to be measured).

Isn't there a way to make a few real world measures on JaneStreet's executable to weigh the costs and make informed design decision ? I think the following would be useful:

  1. The exact number of [%call_pos] expansions on JaneStreets' executable.
  2. How exactly JaneStreet uses the data of the call pos ? Do they a) store it away in binary logs ? Do they b) render them immediately to strings ? Do they do that all of the time or is it predicated on a log level ?

Equiped with 1) we can compute precisely the storage costs of each variants. Informed by 2) we can adapt the representation for the common case, for example if we have a Textloc.Compact.t representation we can lay it out so that it is efficient to decode a position <filename>:<line>:<byte_offset> string without going through a full Textloc.t.

@goldfirere
Copy link
Contributor

I appreciate the discussion here, but I'd like to push back slightly on

Jane Street insists that we really must support exactly what they implemented because moving away to something more general is too difficult for them

While I work for Jane Street, I argue here as me -- informed by my knowledge of what my colleagues want. I don't mean to insist on anything; instead, I'm advocating for allowing programmers to use different types for different use-cases (which does not sound so controversial to me). Furthermore, I've agreed that the new syntax proposed in this discussion is better than our currently-implemented one, and I said up-thread that I would explore further the feasibility of changing our representation (indeed, my advocacy for 1b above would require that change of representation). So I think we're all showing evidence of having a give-and-take conversation of what would be best here.

Now, on to technical matters:


I don't love @gasche's new proposal. It definitely would work for Jane Street -- but I think it's a worse design. It exhibits the modest complexity of having multiple different source indicator types, but has extra complexity in the encoding. This design might be good if we expected to have heterogeneous collections of source information, but I doubt we would want that.

@dbuenzli: I appreciate these questions, but answering them is hard. I don't see a good way of assessing this without instrumenting the compiler to count the occurrences and then somehow working with the build system to accumulate the results somehow. Or maybe it would just print out the result and then we'd have some parser to extract those results from the output? Something like that would probably bork our build system's expectations, though. And we have many different executables... which to test? Similarly to figure out how the information is used, among 65+ million lines of OCaml code. I'm not saying these questions are impossible to answer -- just that it would take real time to produce answers, and it's not clear to me yet that I should ask Jane Street to invest those resources. (For example: what answers could I have to these questions that would change the debate?)

Instead, I asked a few colleagues about this, and the general consensus was that it's mostly used for printing errors (as you would expect). The received wisdom was also that changing away from Lexing.position would be hard. I took my own look, though, and I think it's plausible. The most troublesome field of Lexing.position is its pos_bol field, which is used by name about 1,000 times in our codebase. (Around here, 1,000 is low.) And many of the usages are in subtractions with pos_cnum, as you'd expect. So I think it could be managed using an automated tool.

The other datum I have is that [%call_pos] appears 5,827 times in our codebase. Of course, that doesn't say how many times it appears in an executable, because of course usages are implicit -- that's the whole point! Not sure what to do with this, but I thought I'd share.

@gasche
Copy link
Member

gasche commented Apr 4, 2025

("argue here as me": noted, apologies.)

It exhibits the modest complexity of having multiple different source indicator types, but has extra complexity in the encoding.

I'm not sure what you mean by "extra complexity in the encoding". In particular if you meant that we would need bits at runtime to distinguish among the different cases, then I would disagree: the GADT is compile-time information for the compiler to decide which value (a Lexing.position or Textloc.t or something else) to inject at compile-time, and those values are represented directly at their type.

I don't see a good way of assessing this without instrumenting the compiler to count the occurrences and then somehow working with the build system to accumulate the results somehow.

For space, it may be possible to write a tool that looks at the constant sections of your binaries, and counts the number of constant values that look like locations. You would get a sense of the portion of binary size taken by location information, and the result would probably be that it is neglectible.

For time, the easiest way I can think of is to write a function that does a shallow copy of Lexing.position, write a slightly modified compiler that inserts calls to this function at all call sites that elaborate a [%call_pos], and then run benchmarks on this (presumably: a benchmark of a library that heavily uses [%call_pos]) to estimate worst-case slowdown. Again, I would expect that the result is that it is neglectible.

@dbuenzli
Copy link
Contributor

dbuenzli commented Apr 4, 2025

just that it would take real time to produce answers,

Ok no problem. I thought perhaps you had an easy way to test experimental compilers on a build of your code base (and thus just modify the compiler to increment a counter in a global file each time the implicit location writing code path gets hit).

And many of the usages are in subtractions with pos_cnum, as you'd expect.

It would be interesting to investigate when these pos_bol are not used as such because these cases would prevent you from using 1b (or force you into 2 :-). You don't get pos_bol in 1b.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.