Skip to content

Is there any way to tell what PeerId has sent Inbound message in InboundUpgrade? #2602

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
folex opened this issue Apr 2, 2022 · 7 comments
Closed

Comments

@folex
Copy link
Contributor

folex commented Apr 2, 2022

Some Context

I'm debugging a heisenbug which happens periodically: peer A sends message to peer B, but peer B is unable to read it due to unexpected end of file error.

And I'm having trouble reproducing and/or figuring out, which peer causes this. It would help me a lot if I could tell who has sent the message, i.e. PeerId of the sender.

I don't see how to get that info inside upgrade_inbound:

fn upgrade_inbound(self, mut socket: Socket, info: Self::Info)

Socket is multistream_select::negotiated::Negotiated<rw_stream_sink::RwStreamSink<Chan>>

The Question

Is there a way to tell sender PeerId inside InboundUpgrade implementation like this one upgrade.rs#L115?

Version

libp2p-core: 0.32
libp2p: 0.43

P.S.

Error happens inside read_length_prefixed. Exception looks like this

Error processing inbound ProtocolMessage: unexpected end of file

     Location:
         /home/circleci/project/particle-protocol/src/libp2p_protocol/upgrade.rs:126:30

     Stack backtrace:
        0: core::ops::function::Fn::call
        1: eyre::capture_handler
        2: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
        3: <libp2p_core::either::EitherFuture2<AFut,BFut> as core::future::future::Future>::poll
        4: futures_util::future::future::FutureExt::poll_unpin
        5: <libp2p_swarm::connection::handler_wrapper::SubstreamUpgrade<UserData,Upgrade> as core::future::future::Future>::poll
        6: futures_util::stream::stream::StreamExt::poll_next_unpin
        7: libp2p_swarm::connection::handler_wrapper::HandlerWrapper<TProtoHandler>::poll
        8: libp2p_swarm::connection::Connection<THandler>::poll
        9: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
       10: futures_executor::thread_pool::PoolState::work
       11: std::sys_common::backtrace::__rust_begin_short_backtrace
       12: core::ops::function::FnOnce::call_once{{vtable.shim}}
       13: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
                  at ./rustc/ec4bcaac450279b029f3480b8b8f1b82ab36a5eb/library/alloc/src/boxed.rs:1854:9
       14: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
                  at ./rustc/ec4bcaac450279b029f3480b8b8f1b82ab36a5eb/library/alloc/src/boxed.rs:1854:9
       15: std::sys::unix::thread::Thread::new::thread_start
                  at ./rustc/ec4bcaac450279b029f3480b8b8f1b82ab36a5eb/library/std/src/sys/unix/thread.rs:108:17
       16: start_thread
       17: clone

Protocol implementation is here https://github.com/fluencelabs/fluence/blob/master/particle-protocol/src/libp2p_protocol/upgrade.rs#L126

@mxinden
Copy link
Member

mxinden commented Apr 6, 2022

InboundUpgrade has no way to tell who the sender is, given that the security handshake, and thus the authentication of the sender might not have happened yet.

The only way I see is logging the PeerId once the security handshake has been successful and then try to correlate the log line with the panic. Though I guess this is happening on a busy machine and thus hard to correlate, correct?

@folex
Copy link
Contributor Author

folex commented Apr 13, 2022

Yes, exactly. It's very hard to correlate.

@folex
Copy link
Contributor Author

folex commented Apr 13, 2022

If it's not possible to provide PeerId information there, maybe it makes sense provide address info? Like multiaddr.

That would allow me to understand who is on the other end of the socket.

If that makes sense, I can try to implement that.

@mxinden
Copy link
Member

mxinden commented Apr 17, 2022

If it's not possible to provide PeerId information there, maybe it makes sense provide address info? Like multiaddr.

Note that we are using the same InboundUpgrade and OutboundUpgrade across connection and stream upgrades. E.g. in the latter case we don't know the address, i.e. the address is not relevant. We could break the trait into two one for streams one for connections, though I am not sure that is worth it?

@folex
Copy link
Contributor Author

folex commented Apr 19, 2022

Well, in OutboundUpgrade case it seems relevant, too, and for similar reason.

Imagine a situation, where you're trying to communicate with some peer, and communication fails at Inbound/Outbound Upgrade level. How do you debug it in an open big network?

Let me try to give more specific arguments.

InboundUpgrade

In my case, some peers close connection before fully writing protocol message. That's InboundUpgrade.

And that's exactly what happened in my case. And I think I have a fix for that (as described here libp2p/rust-yamux#117).

BUT since the network is rather inert when it comes to updates, some peers still follow invalid algorithm for sending data (ie closing too early). So my logs are still full of EOF errors, and it makes me uneasy, because I can't be sure I have fixed the problem.

If I could tell multiaddress/peerid/etc of the sender, I could go to that peer and ask for it's version via other protocol (eg identify), or at least tell if it's one of peers managed by me or not.

OutboundUpgrade

This would be a hypothetical example. Consider the following situation:

I send protocol message to some peer, but sending fails. It throws an EOF error for some peers, and messages do not get delivered.

If PeerId/multiaddress would be possible to learn, then I could pinpoint the peer at failure, and try to debug it somehow. Know its version, maybe talk to hoster of that peer, etc.

Conclusion :)

I see address information as vital for debugging networking problems. It is also feels natural to me to be able to access addresses of sockets, since OS has that information if I'm not mistaken.

WDYT?

Thanks for your openness and readiness to discuss possible changes in such a core API, really appreciate that. Cheers!

@mxinden
Copy link
Member

mxinden commented Apr 23, 2022

I am still a bit reserved to add an additional argument to OutboundInboundUpgrade for the sake of debugging.

Would it not be an option to return an error from OutboundUpgrade or InboundUpgrade and then log the error along with the peer information in ConnectionHandler::inject_dial_upgrade_error or inject_listen_upgrade_error or NetworkBehaviour::inject_dial_failure or inject_listen_failure?

I might be missing something.

@folex
Copy link
Contributor Author

folex commented Apr 25, 2022

Thank you, that sounds like what I originally wanted to know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants