Skip to content

Handling _SocketType and SocketType #2720

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jakkdl opened this issue Jul 29, 2023 · 16 comments · Fixed by #2774
Closed

Handling _SocketType and SocketType #2720

jakkdl opened this issue Jul 29, 2023 · 16 comments · Fixed by #2774
Labels
design discussion typing Adding static types to trio's interface

Comments

@jakkdl
Copy link
Member

jakkdl commented Jul 29, 2023

SocketType is currently an empty type and what's meant for end-users to inherit from to show compatibility with trio sockets, but _SocketType (which inherits from SocketType is what's actually used internally and has all the methods needed to get code to actually pass type checking.

So we probably want to merge these one way or another, I started trying to tackle it in #2705 but all my ideas hit problems and will likely have some effect on user code that inherits from either of them, so I think it deserves it's own issue + PR.

In tests we have three different FakeSocket classes inheriting from SocketType: test_highlevel_socket.py, test_highlevel_open_tcp_stream.py and test_highlevel_open_tcp_listeners.py

Options include:

  1. Make SocketType an abstract base class with abstract methods. Will break any code that has methods inheriting from it and not defined - so would need to only mark the methods that are absolutely required to be abstract, and others could instead raise NotImplementedError.
    *. Will almost certainly raise a lot of type checking errors.
  2. Remove _SocketType, moving it all into SocketType. I think it's impossible to avoid this being a breaking API change.
  3. Don't make any runtime changes at all, but during type checking assign SocketType = _SocketType.
    *. will raise a ton of type checking errors, might be possible to reduce the number with some effort.
  4. Rename _SocketType something like InternalTrioSocketType and have that be the class used in signatures that require some methods to be available.
  5. ...
@TeamSpen210
Copy link
Contributor

I think before deciding how to deal with that, it'd be good to try and categorise the methods - which should an implementation always need to have defined, are there groups which should be defined together? Also how many could have a default implementation, beyond raise NotImplementedError?

A 5th option might be to use a bunch of protocols, though there'd be a little bit of a runtime cost. Make SocketType a Protocol with the most basic methods, then have some additional protocols inheriting from that for any other "group" of methods. Code could use either @runtime_checkable or a TypeGuard using hasattr() to detect if it applies. That won't help with verifying that they're valid for set_custom_socket_factory though.

@jakkdl jakkdl added the typing Adding static types to trio's interface label Jul 29, 2023
@A5rocks
Copy link
Contributor

A5rocks commented Aug 1, 2023

I think 1. is nice, because that's essentially the only method that I can think of that is both palatable and has a solid deprecation path (i.e. make certain methods abstract only at type checking time + check that they're non-abstract at initialization time, otherwise warn).

@jakkdl
Copy link
Member Author

jakkdl commented Aug 1, 2023

Ah, that sounds like a good solution and relatively straightforward.
It will lead to some minor API breakages if people are doing very fancy stuff in their classes inheriting from SocketType, but that shouldn't be overly common?

The best way of categorizing methods is probably just to start writing a PR to get an overview.
If anybody wants to take a stab feel free, otherwise I'll give it a go.

@Fuyukai
Copy link
Member

Fuyukai commented Aug 1, 2023

Option 5: Vaporise SocketType and stop exposing raw BSD sockets. This would have to be done for uring support anyway.

@A5rocks
Copy link
Contributor

A5rocks commented Aug 7, 2023

Huh, what kind of interface changes are we looking at for that sort of removal? I'm not too familiar with what trio offers w/r/t sockets.

@njsmith
Copy link
Member

njsmith commented Aug 7, 2023 via email

@Fuyukai
Copy link
Member

Fuyukai commented Aug 7, 2023

BSD sockets is more than just send/recv, it has the awful type-punning socket address and socket option APIs. But aside from that, uring linked requests is intrinsically incompatible with the socket.socket API of single-call waits. If you extend it, it's no longer a stdlib socket inside a wolf's suit, so why bother pretending to be one?

  • e.g. you can't express a bind -> multishot accept call with the socket API.
  • or a shutdown -> close call which avoids going into the event loop twice to do the same thing (shutdown is not async. good luck doing a deprecation cycle for that!).

If we wanted direct sockets (a good feature), we'd then have to use the io_uring calls for sockopt, sockaddr, etc, which means either copying all the validation logic over from the terrible stringly-typed socket.socket API so that somebody doeesn't pass in a too small buffer and segfaults. Or, better, make a proper strongly-typed API that can be statically typed to avoid these issues but is once again incompatible with stdlib sockets (so we should stop pretending to be one).

@njsmith
Copy link
Member

njsmith commented Aug 7, 2023

(We're not going to remove the current socket api, even if we do add fancy io_uring related features on top of it in the future, so for this issue it doesn't matter too much. @Fuyukai do you want to open a separate issue for possible new socket operations to expose via io_uring? E.g. you can't express a multishot accept with the current API, but we can certainly add a multishot accept API. idk what a shutdown->close is tho. And fyi it's totally legal to have a single socket that's both registered for direct io and has a regular fd for regular syscalls, if that's more convenient.)

Anyway for SocketType, I think we want to basically make it

class SocketType:
    def __init__(self, [appropriate types here]):
        raise NotImplementedError

    def recv(self) -> bytes:
        raise NotImplementedError
# etc.

That makes it legible to type checkers, while keep all the current runtime behavior the same.

@jakkdl
Copy link
Member Author

jakkdl commented Aug 26, 2023

I think before deciding how to deal with that, it'd be good to try and categorise the methods - which should an implementation always need to have defined, are there groups which should be defined together? Also how many could have a default implementation, beyond raise NotImplementedError?

I think SocketType needs to have everything, unless we overhaul how SocketFactory works, since the return type of all methods will be SocketType, and _SocketType will never be seen in the public API, so if they ever want to call e.g. recvmsg_into then it has to be part of the public API of SocketType.

Default implementations could be made for all the pure wrapper methods though, if they check that the user-made SocketType does have an underlying stdlib socket.
Something like

class SocketType:
  def __init__(self):
    if type(self) == SocketType:
      raise TypeError("abstract class, use trio.socket.socket...")
    self._socket: _stdlib_socket.socket | None = None

  def detach(self) -> int:
    if self._sock is None:
      raise NotImplementedError("No default implementation unless self._socket is initialized") 
    return self._socket.detach()

@TeamSpen210
Copy link
Contributor

For that case, what we do is make a protocol that has _socket defined, and annotate self with that. Then type checkers will complain if any of the defaults are used but that attribute is not present.

@jakkdl
Copy link
Member Author

jakkdl commented Aug 28, 2023

For that case, what we do is make a protocol that has _socket defined, and annotate self with that. Then type checkers will complain if any of the defaults are used but that attribute is not present.

@TeamSpen210 just slapping on a HasSocket protocol to the self argument is turning out to be far from trivial, due to lack of Intersection it warrants creating another type SocketTypeWithSocket which looks like it ends up being the type used pretty much everywhere in the internal code, since out of 30 methods on SocketType only 7 of them are more than a simple wrapper and left with a NotImplementedError, and the others all require HasSocket. Pretty much all user code is going to need to be typed as using SocketTypeWithSocket (or _SocketType).

If going in this direction I think it's probably better to just go all the way and fully merge the types, or expose _SocketType as a public type that can be specified.

@jakkdl
Copy link
Member Author

jakkdl commented Aug 28, 2023

That makes it legible to type checkers, while keep all the current runtime behavior the same.

FTR: The raise NotImplementedError certainly doesn't make all runtime behavior the same. Any code doing dynamic inspection on socket types with hasattr/getattr/dir, which isn't that weird given that the current trio tests already do that for platform differences, could get messed up. Also any code that does multiple inheritance and uses super() calls will now raise NotImplementedError.

@jakkdl
Copy link
Member Author

jakkdl commented Aug 28, 2023

Ah, I suppose there's always

class SocketType:
    if TYPE_CHECKING:
        def recv(self, ...) -> bytes:
            ...
        def fileno(self) -> int:
            ...

if we want to guarantee no impact on runtime.

@njsmith
Copy link
Member

njsmith commented Aug 28, 2023 via email

@jakkdl
Copy link
Member Author

jakkdl commented Aug 29, 2023

I'm having trouble following what people are trying to accomplish here – what's wrong with my proposal above?

Your proposal can affect runtime behaviour, though admittedly only in weird cases. But I mostly just see no upside of raise RuntimeError vs hiding behind TYPE_CHECKING, unless you specifically want to catch people doing weird things with SocketType - which might make sense if long-term plan is to make SocketType abstract, or merging the two types.

Going into protocols and/or default implementations seemed like a good idea, but is turning out to be really messy and I'm not going to bother trying that.

@njsmith
Copy link
Member

njsmith commented Aug 30, 2023

Conceptually SocketType is an ABC. It's just that since the API is huge and the main use case for implementing it is for tests, I bowed to practicality and made all the abstract methods optional -- the socket API is huge and any given user only touches a small fraction of it, and when you're writing a test fake, there's no point in making you fill in exotic methods you don't use. Plus, if you accidentally leave out an important method, the exact error doesn't matter so long as you get some error and fix your test; these aren't errors that you're catching and handling at runtime. And since this was before static type checking was a thing, I didn't bother typing in all the optional abstract methods, since all it would do is convert AttributeError into NotImplementedError and who cares?

Anyway that's the background.

Agreed that the difference between raise NotImplementedError and if TYPE_CHECKING is pretty minimal; I was more confused at how we were getting onto discussions about abstract protocols with _socket attributes and stuff :-).

I think I do have a weak preference for raise NotImplementedError over if TYPE_CHECKING, simply because having the same code for runtime and static checking is one less thing you have to think about when figuring out what's going on, and in this case I don't think the runtime difference between AttributeError vs NotImplementedError is important or worth preserving -- arguably it should have been NotImplementedError all along, just I was too lazy to bother :-) But either seems workable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design discussion typing Adding static types to trio's interface
Projects
None yet
5 participants