Skip to content

[RFC] Allow $ to be override-able for custom types #139

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
awr1 opened this issue Mar 5, 2019 · 14 comments
Closed

[RFC] Allow $ to be override-able for custom types #139

awr1 opened this issue Mar 5, 2019 · 14 comments
Labels

Comments

@awr1
Copy link

awr1 commented Mar 5, 2019

Consider this maybe a saner version of #8023. From Araq's comment in that thread:

I've hit the bug twice now that my own $ failed to be imported and the compiler used system.$ instead which is wrong. This is just too error prone and should be done explicitly via implementToString(MyType) where implementToString is a simple template in system.

I myself have had this problem multiple times - $ is desirable to re-implement in a number of cases where system.$ would not produce a useful result - for instance, objects containing {.importc.} types as fields. One can make a new $ procedure for a type in a module now, sure, and those types will use that new procedure when stringified - only when such stringifications occur inside that module, which is mostly useless. If you import that module and use $ anywhere, we will still always use system.$. The export marker makes no difference.

There is a way you can override $ and have it work when imported - make the type in question distinct. I have tried this, and it's a silly solution.

The alternative, I guess, is to just suck it up and make a toString() proc for a given type and call that manually with the extra explicitness that incurs.

@narimiran narimiran transferred this issue from nim-lang/Nim Mar 6, 2019
@andreaferretti
Copy link

andreaferretti commented Mar 6, 2019

Why do you say this?

only when such stringifications occur inside that module, which is mostly useless

I have no issue in importing $, for instance

# foos.nim

type Foo* = object
  a*: int

proc `$`*(f: Foo): string = "hello"

and then

import foos

let x = Foo(a: 3)

echo x # hello

Even moving the $ to a third, unrelated, module will work just fine.

@Araq
Copy link
Member

Araq commented Mar 8, 2019

The alternative, I guess, is to just suck it up and make a toString() proc for a given type and call that manually with the extra explicitness that incurs.

That's what I do too though I tend to use more descriptive names. Usually these grow a set[Flags] parameter anyway that further describes how to render it, so I made my peace with system.$.

@krux02
Copy link
Contributor

krux02 commented May 8, 2019

I don't see an RFC here. $ can already be overloaded for custom types. When it doesn't work and system.$ is used instead, it is either a bug (please report), or a user error (you might still report it and get some feedback about what went wrong).

@Araq
Copy link
Member

Araq commented May 8, 2019

The RFC is either about removing system.$ for objects or to turn $ into a type-bound operation. Which would make sense for == and hash too. In fact, it makes so much sense that we should do it for Nim version 2 or something.

@krux02
Copy link
Contributor

krux02 commented May 8, 2019

well, what is called template specialization in c++ would solve this problem for Nim. Here you don't add anther overload of $ to the mix that needs to be visible to the compiler at all the right places, instead you specialize the generic proc `$`*[T : object](arg: T) for your own type. The specialization then doesn't need to be exported anymore, it will simply replace the instance proc `$`[MyObject](arg: MyObject for system.`$`

@Araq
Copy link
Member

Araq commented May 8, 2019

The specialization then doesn't need to be exported anymore, it will simply replace the instance proc $[MyObject](arg: MyObject for system.$

I fail to see how this solves much; all specializations would still be tied to a scope, not to a type.

EDIT: I stand corrected, C++ continues to surprise.

@krux02
Copy link
Contributor

krux02 commented May 8, 2019

I don't exactly know what you mean with "tied to a scope", but only the specialization needs to be able to see the generic procedure to be specialized. Other modules don't need to see the specialized function it can be written in an arbitrary cpp file, nothing needs to be added to a header to implement the specialization.

@zah
Copy link
Member

zah commented May 9, 2019

I grappled with this "overload visibility problem" myself in the context of our serialization library. Just like $, it provides a default handling for every type that sometimes ends up activated by accident when you forget to import a certain module defining a proper override.

It feels a bit unfortunate to give up on the "working by default" tenet of the library, so I came up with an alternative idea that might make sense.

What if we expand the meaning of method beyond type hierarchies? If I define a method in a library, I define a type bound operation that can be "filled in" from any module in the project (even in generic context, where the types don't have a common base). The compiler will use the existing pass after sem to properly resolve the final bindings. If two modules try to define the same method for a given type, you'll get a compile-time error after sem. This wider treatment of methods seems consistent with the existing meaning and it solves the problem without breaking changes.

Please note that any solution that requires the type bound operation to be defined in the module of the type itself doesn't quite work for the serialization library, because the author of the type could not possibly imagine all the possible serialization formats that type could support. Defining serialization methods for types in 3rd party libraries is already common-place in our code.

@Araq
Copy link
Member

Araq commented May 10, 2019

The compiler will use the existing pass after sem to properly resolve the final bindings. If two modules try to define the same method for a given type, you'll get a compile-time error after sem. This wider treatment of methods seems consistent with the existing meaning and it solves the problem without breaking changes.

But like multi-methods, this is a non-modular, global mechanism tying us further to the current "always compile the full project" design.

Please note that any solution that requires the type bound operation to be defined in the module of the type itself doesn't quite work for the serialization library, because the author of the type could not possibly imagine all the possible serialization formats that type could support. Defining serialization methods for types in 3rd party libraries is already common-place in our code.

Well that would be exactly what I would propose anyway. It works well enough for every mainstream programming language that I've ever used -- and they can all do partial rebuilds / incremental compilations too, this is not a coincidence.

@zah
Copy link
Member

zah commented May 11, 2019

Well that would be exactly what I would propose anyway. It works well enough for every mainstream programming language that I've ever used -- and they can all do partial rebuilds / incremental compilations too, this is not a coincidence.

Er, I'm not sure my comment was read properly :)
I meant that putting the type bound operation in the same module as the type is not reasonable in the context of serialization. It makes little sense to define Json, Yaml, RLP and SSZ serialization in the module of the Nymcrypto hash type as a concrete example.

Of course, I can just give up on the notion of default serialization and require that a proper overload is always in scope where needed, but I don't think that having a little bit of full-program-dependant codegen is such a deal breaker. It just needs to be kept reasonably small and efficiently updated when it needs to change. Multi-staged programming based on global compile-time variables will need similar mechanisms eventually.

@ghost
Copy link

ghost commented May 13, 2019

The RFC is either about removing system.$ for objects or to turn $ into a type-bound operation. Which would make sense for == and hash too. In fact, it makes so much sense that we should do it for Nim version 2 or something.

How about a pragma which defines "standard" functions:

proc `$`(obj: SomeObj): string {.core.}

The {.core.} pragma would import the function automatically(or throw an error when attempting to use $ while the specialized version is not imported) and would override any default function but the user wouldn't be able to override it. If a type doesn't have a special $ then it will use the system's $. If it has a {.core.} $ then it will use that and if another source would try to define a new one it would create a compilation error to ensure the user that there's at most one specialized $ and if it exists it is in the scope(edit: or it would get implicitly imported when there is a {.core.} function with the same alias). If certain code requires a special $ then it should use a toString instead because the $ is the "standard" way to convert the type to string. {.core.} could be tied to the specific object type in case of inheritance.

edit: forgot to mention that the goal of this approach is to clearly mark an often overloaded function's main implementation as something important to the business logic. $ is mostly used for debugging but there are cases when other functions could depend on its output.

@krux02
Copy link
Contributor

krux02 commented May 14, 2019

To make a final point about template specializations. They are exactly what would be the solution to our problem. As a proof, I made an example in C++ that uses template specialization to solve exactly this problem:

main.cpp

#include "system.hpp"

// extendable for user types (unlike printf)
struct MyStruct {
  float a,b,c;
};

template <>
void dollar(FILE* out, const MyStruct& arg) {
  fprintf(out, "[%f, %f, %f]", arg.a, arg.b, arg.c);
}

struct UnknownStruct {};

int main() {
  // specialization in main.cpp
  MyStruct ms = {1,2,3};
  printLine(stdout, "Hallo ", 123, " ", ms);

  // specialization in system.cpp, no exposure in system.hpp
  SystemStruct tmp = {"Egon", 456789};
  printLine(stdout, tmp);

  // fallback to the default implementation from the header
  UnknownStruct us;
  printLine(stdout, us);
}

system.hpp

#pragma once
#include <cstdio>
#include <utility>
#include <typeinfo>

template <typename T>
void dollar(FILE* out, const T& arg) {
  fprintf(out, "--%s--", typeid(arg).name());
}

// this is a normal overload, not a template specialization.
static inline void dollar(FILE* out, const char* arg) {
  fprintf(out, arg);
}

struct SystemStruct {
  const char* name;
  int value;
};

static inline void printLine(FILE* out) {
  fprintf(out, "\n");
  fflush(out);
}

template <typename Arg, typename... Args>
void printLine(FILE* out, Arg&& arg, Args&&... args)
{
    dollar(out, std::forward<Arg>(arg));
    printLine(out, std::forward<Args>(args)...);
}

system.cpp

#include "system.hpp"

using cstring = const char*;


// the empty template parameter list signals that this is a template
// specialization, not an overload.
template <>
void dollar(FILE* out, const int& arg) {
  fprintf(out, "%d", arg);
}

template <>
void dollar(FILE* out, const float& arg) {
  fprintf(out, "%f", arg);
}

template <>
void dollar(FILE* out, const SystemStruct& arg) {
  fprintf(out, "[name: %s value:%d]", arg.name, arg.value);
};

Makefile:

all: a.out

main.o: main.cpp system.hpp
	c++ -c main.cpp

system.o: system.cpp system.hpp
	c++ -c system.cpp

a.out: main.o system.o
	c++ main.o system.o

output:

$ make && ./a.out
c++ -c main.cpp
c++ -c system.cpp
c++ main.o system.o
Hallo 123 [1.000000, 2.000000, 3.000000]
[name: Egon value:456789]
--13UnknownStruct--

@github-actions
Copy link

github-actions bot commented Jun 2, 2023

This RFC is stale because it has been open for 1095 days with no activity. Contribute a fix or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jun 2, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 9, 2023
@metagn
Copy link
Contributor

metagn commented Jun 9, 2023

removing system.$ for objects

This was implemented as objectdollar

#475 also covers the case of ==

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants