Skip to content

p2p: peer metainfo support #6312

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

algorandskiy
Copy link
Contributor

@algorandskiy algorandskiy commented May 3, 2025

Summary

P2P network was missing peer meta info exchange including peer features like vote compression.

General idea: current p2p /algorand-ws/1.0.0 handler has a stub for meta info exchange (handshake) similar to HTTP: connection initiator writes a single byte, server reads a byte. Unfortunately there is no elegant solution for keeping a single /algorand-ws/1.0.0 but also support both old single byte handshake and multibyte meta info exchange when both old and new node have to communicate each other. The solution is to add a new p2p /algorand-ws/2.2.0 proto with multibyte handshake consisting of 2 bytes of data length and msgp-encoded http.Header-like map as a data portion. p2p nodes can list remote protocols and determine either /algorand-ws/2.2.0 or /algorand-ws/1.0.0 to use.

In this PR:

  1. Generalize setHeader to be able to use in both wsnet and p2pnet
  2. Add /algorand-ws/2.2.0 p2p proto in addition to /algorand-ws/1.0.0
  3. Add meta info exchange into /algorand-ws/2.2.0 handler
  4. Switch from network's Connected notification to EvtPeerIdentificationCompleted in order to be able to communicate with both old /algorand-ws/1.0.0 and new /algorand-ws/2.2.0 nodes
  5. Removed the old telemetryID exchange from p2p since it is racy

Test Plan

Added unit tests:

  1. meta info read write functions
  2. p2p nodes meta info exchange
  3. nodes of different net proto versions

Manual test:
Run a new p2p node vs betanet and testnet to ensure it can fast catchup, catchup and follow the chain.

Copy link

codecov bot commented May 3, 2025

Codecov Report

Attention: Patch coverage is 61.60338% with 91 lines in your changes missing coverage. Please review.

Project coverage is 51.87%. Comparing base (005bf7e) to head (247f48d).
Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
network/p2p/streams.go 0.00% 50 Missing ⚠️
network/p2pNetwork.go 68.11% 15 Missing and 7 partials ⚠️
network/p2p/p2p.go 0.00% 9 Missing ⚠️
network/wsNetwork.go 86.79% 5 Missing and 2 partials ⚠️
network/p2pMetainfo.go 94.54% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6312      +/-   ##
==========================================
+ Coverage   51.85%   51.87%   +0.01%     
==========================================
  Files         652      653       +1     
  Lines       87433    87554     +121     
==========================================
+ Hits        45340    45417      +77     
- Misses      39226    39267      +41     
- Partials     2867     2870       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@algorandskiy algorandskiy force-pushed the pavel/p2p-handshake branch from 7647d0e to 0d674e2 Compare May 6, 2025 21:56
@algorandskiy algorandskiy force-pushed the pavel/p2p-handshake branch from df213ee to 249c77d Compare May 8, 2025 19:16
@algorandskiy algorandskiy requested review from cce and gmalouf and removed request for cce May 8, 2025 19:53
@algorandskiy algorandskiy marked this pull request as ready for review May 8, 2025 19:54
@algorandskiy algorandskiy requested a review from Copilot May 8, 2025 19:55
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds peer meta info exchange support for p2p networks and generalizes header setting to work across both wsnet and p2pnet. Key changes include:

  • Renaming of identifier fields (e.g. GenesisID/RandomID to genesisID/randomID) and updating their usages.
  • Refactoring of header-setting into a generic function using a peerMetadataProvider interface.
  • Addition of support for the new /algorand-ws/2.2.0 protocol in p2p stream handlers and adjusted test timeouts to aid stability in race builds.

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated no comments.

Show a summary per file
File Description
node/node_test.go Updated log messages to include peer IDs in phonebook responses.
network/wsNetwork.go Transitioned from public to lowercase identifier fields and refactored header setting.
network/p2pNetwork.go Introduced support for dual protocol versions and temporary logic with a TODO comment.
network/*_test.go Extended test timeouts for p2p tests to mitigate race conditions.
network/streams.go Refactored stream dispatch to use protocol-specific handler maps.
network/msgp_gen*.go Regenerated msgp code for peer meta headers/values.
Comments suppressed due to low confidence (3)

network/p2pNetwork_test.go:109

  • Increasing the timeout from 2 to 5 seconds can help with race conditions on CI builds, but please confirm that the longer wait is intentional and does not hide potential performance issues.
}, 5*time.Second, 50*time.Millisecond)

network/p2pNetwork.go:292

  • The temporary protocol version selection logic gated by the disableV22Protocol flag is clearly marked with a TODO; ensure that this block is removed once consensus v41 is deployed.
TODO: remove after consensus v41 takes effect.

node/node_test.go:1025

  • [nitpick] Please verify that using ni[0].p2pID for node 0 is the intended behavior and is consistent with the later cases that log ni[1].p2pID and ni[2].p2pID.
t.Logf("Node%d %s phonebook: %s", i, ni[0].p2pID, ni[1].p2pMultiAddr())

@algorandskiy algorandskiy changed the title WIP: p2p: peer metainfo support p2p: peer metainfo support May 8, 2025
@algorandskiy algorandskiy requested a review from jannotti May 8, 2025 20:31
Copy link
Contributor Author

@algorandskiy algorandskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Posted some notes for reviewers

const algorandGUIDProtocolTemplate = algorandGUIDProtocolPrefix + "%s/%s"
// AlgorandWsProtocolV22 defines a libp2p protocol name for algorand's websockets messages
// as a version supporting peer metadata and wsnet v2.2 protocol features
const AlgorandWsProtocolV22 = "/algorand-ws/2.2.0"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: should it somehow in sync with wsnetwork.ProtocolVersion?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally one would reference the other..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going through wsNetwork - it definitely feels like these two constants should be defined together/one should direct reference the other to help keep in sync.

Copy link
Contributor Author

@algorandskiy algorandskiy May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a "problem" with this approach is wsnet v1.0 is not supported but we still need /algorand-ws/1.0.0 in p2p. I suggest to unify 2.2 in later release after the consensus upgrade when /algorand-ws/1.0.0 can be removed

localPeer := n.host.ID()
func (n *streamManager) peerWatcher(ctx context.Context, sub event.Subscription) {
defer sub.Close()
for e := range sub.Out() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the inner body of the for loop is the same as Connected handler with return replaced by continue

@@ -106,7 +106,7 @@ func TestP2PSubmitTX(t *testing.T) {
)
require.Eventually(t, func() bool {
return netA.hasPeers() && netB.hasPeers() && netC.hasPeers()
}, 2*time.Second, 50*time.Millisecond)
}, 5*time.Second, 50*time.Millisecond)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new EvtPeerIdentificationCompleted handler requires more time on CI

Copy link
Contributor

@gmalouf gmalouf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some initial comments, my biggest concerns are the compatibility issues.

@algorandskiy
Copy link
Contributor Author

my biggest concerns are the compatibility issues.

that's the point of an extra p2p proto to have it fully backward compatible (telemetryID stuff is removed b/c it non-working I believe in most cases).

@algorandskiy algorandskiy force-pushed the pavel/p2p-handshake branch from fa45b75 to 610d555 Compare May 13, 2025 15:51
@jannotti
Copy link
Contributor

I am only getting started understanding how this stuff works, but it looks to me like in v1, there is a single byte exchange, but neither side cares what the bytes actually are. It looks like they send ascii "1".

Why couldn't we say "if you send a certain byte, the headers come next"?

@algorandskiy algorandskiy force-pushed the pavel/p2p-handshake branch from 610d555 to 7108389 Compare May 13, 2025 16:22
@algorandskiy
Copy link
Contributor Author

Why couldn't we say "if you send a certain byte, the headers come next"?

because of the following: new client and old server combo: old server can only read one byte and write one byte, and new client initiates the exchange by writing something not knowing what other side expects - one byte or multiple bytes.
Ok, new client can write a byte and read with timeout, if nothing came out then guess it is a new server and push meta info. But it brings 1) connection delay 2) more complex logic rather then two well separated p2p protocols.

return args.Int(0), args.Error(1)
}

func TestReadPeerMetaHeaders(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't have any testing of bad data on the stream. Is the assumption that bad data isn't very dangerous, it'll just cause a error and the stream to close if it's too short, or if the length bytes are wrong, or whatever?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I guess with only a uint16 for length, it's not like a bad peer can make you read a crazy amount of data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair, added test cases covering all the error branches

@jannotti
Copy link
Contributor

jannotti commented May 13, 2025 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants