-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Discard unused body of Unary and ClientStream methods #5331
Conversation
Forgot to regenerate files. Doing it now. |
Missing import for sync.WaitGroup. I don't feel like tinkering with the import machinery - and benchmarks don't show any significant slowdown when using channels instead. I'll update the implementation to use plain channels. |
I'll try to write an integration test for this too. |
Apparently, transactional request-response flow is inherent to Go HTTP, including grpc-gateway runtime. @johanbrandhorst |
Sure, it'd be great to know if we break the websocket proxy again in the future, why not! |
I've opened a repository for the websocket proxy itself: It's fairly covered with tests, which fail on |
Please see: #5338 |
Integration tests are failing even with the new changes :) good thing we decided to include them. Unfortunately, seems like the very act of waiting for This is certainly not desirable behavior - we want clients to be disconnected when server returns. That only leaves us with I will revert the commits replacing the |
Is it only affecting server side streaming methods, or bi-directional methods too? |
It happens on Unary methods and Server Stream methods (i.e. non-client-stream methods). Bidirectional and Client Stream methods don't use this template branch, so the problem doesn't apply - and neither does the root issue in which gateway doesn't read client body... In other news, I'm having trouble with properly testing WebSocket context cancellation:
Since WebSocket proxy layer cannot possibly know which method is going to be executed, we cannot decide what to do on WS close message:
In an ideal world, grpc-gateway would generate the WebSocket proxy itself, and it would generate two different implementations depending on the method type. Maybe that would be a cool feature in the future? So yeah, this is going to be a hairy one. I'll try to figure something out. |
I think @tmc has wanted to integrate the websocket proxy into the gateway for a long time but we just haven't decided the best way forward. I think it would be a bigger task than this issue. If we have to choose between supporting the websocket proxy and removing a bug in the gateway (#5236), then we're going to fix the bug. It might be good enough that we support the websocket proxy only for bi-directional streaming methods? |
Eh, you're right. I'm way too deep in the rabbit hole. So I can use both 1) coder/websocket library, as suggested, and 2) your suggestion to wait for io.Copy at end of the request. For any further bugs and features, I'll open new issues and possibly further PRs. |
Will test how this works with https://github.com/jnis23/grpc-gateway-proxy-example provided by the bug report. |
From example provided by the bug report, even "working" server-stream (with body) doesn't properly close websocket methods. And since server-stream request needs to return before request body is closed (in case of websockets), we should add I'll add that to templates and integration test as well... |
Integration tests are failing on Let me know if there's anything else I need to do. |
I've modified the PR name to better align with changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry but I really think we need to get to the bottom of the behavior here before we can merge it. This proposed solution will affect all users of the gateway (the goroutine), and I want to be sure this is really necessary. The original fix (#5240) fixed a minor connection leak. The impact of that change was very small because it only affected RPCs with no body. Somehow it broke the websocket proxy. That is also probably a pretty small impact. Now fixing both of the issues requires creating a goroutine, which may have a big impact on all users of the gateway. I just want to be sure this is the right solution, and honestly I'm still tempted to say we should just break the websocket proxy rather than impose this on all users.
Fair enough. I'll try to investigate the root cause of this issue. |
I've deduced that the issue appears in the google.golang.org/grpc/internal/transport/clientStream.waitOnHeader() method - when request is not closed (or completely read, in case of no-body methods), the The newClientWithParams function also mentions a comment that may be related: // Possible context leak:
// The cancel function for the child context we create will only be called
// when RecvMsg returns a non-nil error, if the ClientConn is closed, or if
// an error is generated by SendMsg.
// https://github.com/grpc/grpc-go/issues/1818. Indeed, during tests, the context doesn't seem to be closed when connection closes, but instead blocks forever. This is what I have for now. I'll try to figure out why this happens in the first place. |
You're right - it seems that this issue can be fixed in the websocket-proxy itself, instead of in the grpc-gateway. The root cause of this is the conn-read -> req-write loop in the websocket proxy: If nobody is reading the Arguably, it should be the server's responsibility to read requests, even when they are not needed e.g. in order to not block the network buffer. Of course, in this case, the fault is on the websocket-proxy library, since it mistakenly assumes that request write will always pass. This has been a wild ride. There's a lot to consider here. |
Wait, that won't work either. The blocking So either we break HTTP clients that send body when body is not defined or all WS clients when body is not defined... I'd like to point out that grpc-gateway already use a goroutine in the implementation of client-streaming methods (ClientStream and Bidirectional): go func() {
for {
if err := handleSend(); err != nil {
break
}
}
if err := stream.CloseSend(); err != nil {
grpclog.Errorf("Failed to terminate client stream: %v", err)
}
}() For what it's worth, my benchmark with and without
|
So websocket clients with a body defined still work? I'm happy to put constraints on what types of websocket clients we can support, but I'm not really happy breaking any part of the gateway to support the websocket wrapper. |
I understand. Messing with a widely used library is tricky, and one must be careful. Some shortcuts to cut down on repetitive phrases:
However, I'd like to present some more arguments for including
For reproduction of second bug, please see: https://github.com/paskozdilar/bug-report-grpc-gateway/tree/bug-when-body-defined (this line being crucial). So this bug on body-defined methods exists even with HTTP clients. It's just that json.Decode accidentally fixes it for excess data under 512 bytes. We could add a blocking If adding a goroutine is categorically not acceptable, then using a codegen for WebSocket clients will be necessary for it to work. Alternatively, similar mechanism could be used by specifying MethodType query parameter in WebSocket URL, similar to how HTTP method is defined. I'm willing to work on this to help tmc/grpc-websocket-proxy establish a correct behavior. Then, a question comes to mind - if client sends more data than server expects (e.g. more than a single JSON object in case of BDUSM, or anything at all for NBDUSM), should grpc-gateway return an error, e.g. 400 Bad Request? |
Yes, I think this could work. That way we could add blocking I have a proof-of-concept example here: https://github.com/paskozdilar/grpc-gateway-proxy-example So, would it be acceptable to add a blocking If so, I'll implement the integration test for that case, too. |
Thank you for spending so much time investigating this and laying out the options. I absolutely think a blocking defer func() {
io.Copy(io.Discard, req.Body)
req.Body.Close()
}() What do you think? |
No, that would not work.
Considering many HTTP servers would read the body completely before even starting to process the response, I believe the correct behavior would be adding the following lines before stream.Header() call: // NBDUSM
[...]
n, err := io.Copy(io.Discard, req.Body)
if err != nil {
return nil, metadata, status.Errorf(codes.InvalidArgument, "%v", err)
}
if n != 0 {
return nil, metadata, status.Errorf(codes.InvalidArgument, "unexpected data")
}
[...]
// BDUSM
[...]
d := marshaler.NewDecoder(req.Body)
if err := d.Decode(&protoReq); err != nil {
return nil, metadata, status.Errorf(codes.InvalidArgument, "%v", err)
}
if err := d.Decode(&struct{}{}); !errors.Is(err, io.EOF) {
return nil, metadata, status.Errorf(codes.InvalidArgument, "unexpected data")
}
[...] Interestingly, this issue of not finishing
Do you agree with this approach? Pros:
Cons:
We could circumvent the latter by adding a backwards-compatibility |
I have written a proof-of-concept fix with the above code on a brach Running the code prints pretty expected scenario - invalid-body gets rejected, valid body handles contexts nicely:
As for WebSocket issue, would you be open to adding method-type metadata to response headers? That way, tmc/grpc-websocket-proxy could detect the method type automatically and adjust its behavior accordingly - and everything would work correctly! |
This issue (sending body to methods that don't expect it) also seems to appear in some of the tests, e.g. many of the This is a sign that many clients may expect methods to work even with excess body. So the way I see it, there are three choices:
I am currently in favor of the third option - I think strictness-that-breaks-some-clients is better than relaxedness-that-allows-erroneous-behavior-to-go-unnoticed. If that's not acceptable, the second one would be preferable. What do you think? |
Strange, tests are failing on http.MethodOption calls, which I didn't touch at all... I'll investigate this. |
Thanks again for your work on this. Out of the options, I'm afraid that even if the third option is the more correct, we cannot break users, even if they are doing the wrong thing today. Thus I think option 2 is preferred to me. I'm confused about the need for the method type in the response headers. What do yo mean by method type? Could it be an option, or even better, a third party middleware? |
Method type would be one of the strings: "Unary", "ServerStreaming", "ClientStreaming", "DuplexStreaming", signifying the gRPC method type that grpc-gateway is forwarding.
The issue is that:
WebSocket proxy cannot, by itself, know when it is calling a Unary / ServerStream method, and when it is it is calling ClientStream / Bidirectional methods, so it can't know when to close the request writer and when to keep it open. WebSocket proxy could add a query parameter for users to fill in with correct method type, but I figured if it can automatically get that information from the response header itself, that would be more convenient for the users. You can see preliminary implementation here:
It could be an option, but it would have to be codegen, since (I assume?) codegen is the only place where we have data about which method is which type. A third party middleware, I'm not sure how it would work. The tmc/grpc-websocket-proxy is a third party middleware. |
Alright, the tests are passing now. As for the server metadata, since we're pretty liberal about the request body, we only need method type metadata for WebSocket proxy to work fine without changes. I see there's already some metadata header prefixes used in https://github.com/grpc-ecosystem/grpc-gateway/blob/main/runtime/context.go#L23: const MetadataHeaderPrefix = "Grpc-Metadata-"
[...]
const MetadataPrefix = "grpcgateway-"
[...]
const MetadataTrailerPrefix = "Grpc-Trailer-"
[...]
const xForwardedFor = "X-Forwarded-For"
const xForwardedHost = "X-Forwarded-Host" I am not 100% sure what are those prefixes used for, but I'll dig in depeer. We could define a new Assuming everything is fine, you may merge this PR as-is if you want, or we can work on the request metadata in this PR, too. Your call. |
I believe the header prefixes you found are used for parsing incoming headers, I don't know that we set any specific headers in the response. You can write a custom ForwardResponseOption that sets headers in the response (such as cookies), and maybe we could have some way for users to use that to inform the websocket of the type? I also think it's OK for us to require websocket users to set the type of the RPC explicitly. I expect doing it manually won't be that hard because who maintains more than 1-2 websockets around the gateway anyway? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for all your work on this, lets tackle the exact websocket wrapper work in the original issue. I can happily approve this PR knowing that there's nothing particularly bad about draining the request body.
When
io.Copy
is called synchronously on HTTP request body, it stalls the clients who keep the request stream open.This broke WebSocket gateway implementations which rely on Close detection to know when to terminate.
This PR runs the
io.Copy
method in the background.References to other Issues or PRs
Fixes #5326.
Have you read the Contributing Guidelines?
Yes.
Brief description of what is fixed or changed
Added
go
keyword in front ofio.Copy
call in the template whenbody
annotation is not set.Other comments