Limit to_device EDU size to 65536 #18416
base: develop

Conversation
@@ -28,6 +28,7 @@

 # the max size of a (canonical-json-encoded) event
 MAX_PDU_SIZE = 65536
+MAX_EDU_SIZE = 65536
Suggested change (replacing the bare `MAX_EDU_SIZE = 65536` line):

# This isn't spec'ed but is our own reasonable default to play nice with Synapse's
# `max_request_size`/`max_request_body_size`. We chose the same as `MAX_PDU_SIZE` as our
# `max_request_body_size` math is currently limited by 200 `MAX_PDU_SIZE` things. The
# spec for a `/federation/v1/send` request sets the limit at 100 EDUs and 50 PDUs,
# which is below that 200 `MAX_PDU_SIZE` limit (`max_request_body_size`).
#
# Allowing oversized EDUs results in failed `/federation/v1/send` transactions (because
# the request overall can overrun the `max_request_body_size`), which are retried over
# and over and prevent other outbound federation traffic from happening.
MAX_EDU_SIZE = 65536
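For reference, a quick sanity check of the numbers in that comment (plain arithmetic, not code from the PR):

MAX_PDU_SIZE = 65536
MAX_EDU_SIZE = 65536

# Budget implied by the `max_request_body_size` math mentioned above:
request_body_budget = 200 * MAX_PDU_SIZE  # 13,107,200 bytes (~12.5 MiB)

# Worst case the spec allows in a single `/federation/v1/send` transaction:
worst_case_transaction = 100 * MAX_EDU_SIZE + 50 * MAX_PDU_SIZE  # 9,830,400 bytes (~9.4 MiB)

assert worst_case_transaction <= request_body_budget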
edu_contents = get_device_message_edu_contents(
    sender_user_id, message_type, messages, context
)
remote_edu_contents[destination] = edu_contents
Instead of changing the structure of `remote_edu_contents` (from a map of destination to a single EDU's metadata, to a map of destination to a list of EDU metadata), could we just call `add_messages_to_device_inbox(...)` multiple times?
The multi version should have some performance gain, since it runs in a single transaction and pre-allocates all the stream IDs.
I am fine if we decide to keep it simple and sacrifice some perf for that, but I am not sure it's worth it here since it's not overly complicated.
"type": message_type, | ||
"message_id": random_string(16), | ||
} | ||
# This is the size of the full EDU without any messages and without the opentracing context |
Why is the `BASE_EDU_SIZE` calculated without `BASE_EDU_CONTENT["org.matrix.opentracing_context"]`?
To my understanding it's some data that helps report to the right opentracing context through the code, and it's stripped out of the payload before the transaction is sent.
@MatMaul Thanks for linking the context! We should explain this here in the comment.
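Something along these lines could work as that explanatory comment (the wording is just a sketch based on the explanation above):

# Note: `org.matrix.opentracing_context` is deliberately left out of BASE_EDU_SIZE.
# It only carries tracing metadata used to stitch spans together inside Synapse and,
# per the discussion above, it is stripped from the payload before the transaction
# is sent, so it doesn't count towards the on-the-wire EDU size.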
if current_edu_size + message_entry_size > MAX_EDU_SIZE:
    edu_contents.append(current_edu_content)
    logger.debug(
        "Splitting %d device messages from %s into EDU msgid %s, %d EDUs queued",
"Splitting %d device messages from %s into EDU msgid %s, %d EDUs queued", | |
"Splitting %d to-device messages from %s into EDU (message_id=%s), (total EDUs so far: %d)", |
edu_contents = []

current_edu_content: JsonDict = deepcopy(BASE_EDU_CONTENT)
Instead of this cloning, perhaps it's easier to understand if we just have a little helper (maybe performs better as well 🤷):
def create_new_to_device_edu_content() -> JsonDict:
    """Create a new `m.direct_to_device` EDU `content` object with a unique message ID."""
    content = {
        "messages": {},
        "sender": sender_user_id,
        "type": message_type,
        "message_id": random_string(16),
        "org.matrix.opentracing_context": json_encoder.encode(context),
    }
    return content
) -> List[JsonDict]:
    """
    This function takes a dictionary of messages and splits them into several EDUs if needed.
Could use a docstring for the args and return value, and some context for why we care to split, similar to how we explain it for `MAX_EDU_SIZE` above.
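Something like this could serve as a starting point (the argument names come from the call site above; the types and descriptions are my guesses, not necessarily what the PR uses):

from typing import Dict, List, Optional

from synapse.types import JsonDict


def get_device_message_edu_contents(
    sender_user_id: str,
    message_type: str,
    messages: Dict[str, JsonDict],
    context: Optional[Dict[str, str]],
) -> List[JsonDict]:
    """Split a batch of to-device messages into one or more `m.direct_to_device`
    EDU `content` dicts, each no larger than `MAX_EDU_SIZE` when canonical-JSON
    encoded.

    We split because an oversized EDU can push the whole `/federation/v1/send`
    transaction over the receiving server's request size limit, causing the
    transaction to fail and be retried over and over (see `MAX_EDU_SIZE`).

    Args:
        sender_user_id: The user ID of the sender.
        message_type: The `type` of the to-device messages.
        messages: A map from recipient to the message content to deliver.
        context: The opentracing context to attach to each EDU, if any.

    Returns:
        A list of EDU `content` dicts, each within the size limit.
    """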
logger.debug(
    "Queuing last %d device messages from %s into EDU msgid %s, %d EDUs queued",
    len(current_edu_content["messages"]),
    sender_user_id,
    current_edu_content["message_id"],
    len(edu_contents),
)
Suggested change (for the log call above):

logger.debug(
    "Splitting remaining %d device messages from %s into EDU (message_id=%s), (total EDUs so far: %d)",
    len(current_edu_content["messages"]),
    sender_user_id,
    current_edu_content["message_id"],
    len(edu_contents),
)
mock_send_transaction.reset_mock()

# 2 messages, each just big enough to fit in an EDU
Suggested change (for the comment above):

# 2 messages, each just big enough to fit into their own EDU
self.assertEqual(mock_send_transaction.call_count, 2)

# A transaction can contain up to 100 EDUs but synapse reserves 10 EDUs for other purposes
For my own understanding, this happens at:

) = await self.queue._get_to_device_message_edus(edu_limit - 10)

It would be good to label this magic value as a constant which we could also cross-reference here.
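For example (the constant name here is hypothetical, nothing like it exists in the codebase today):

# Hypothetical name for the magic value above: EDU slots that Synapse keeps free per
# transaction for EDUs other than to-device messages ("other purposes" in the test comment).
EDUS_RESERVED_FOR_OTHER_PURPOSES = 10

# ...which would turn the call referenced above into:
# self.queue._get_to_device_message_edus(edu_limit - EDUS_RESERVED_FOR_OTHER_PURPOSES)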
for recipient, message in messages.items():
    # We remove 2 for the curly braces and add 1 for the colon
    message_entry_size = len(encode_canonical_json({recipient: message})) - 2 + 1
Drive-by thought: instead of trying to work out the lengths and calculate the number of messages we can add, it might be easier to just generate the EDU and then check its size. If it's too big, you halve the number of messages and try again.
The common case will be that we don't need to split up the EDU, at the expense of duplicating some work. It feels a bit hacky, but I think it might be a little less brittle?
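A rough sketch of that approach (names and error handling are illustrative only, not actual Synapse code):

from canonicaljson import encode_canonical_json

MAX_EDU_SIZE = 65536  # as introduced by this PR


def split_into_edus(messages: dict, make_edu_content) -> list:
    """Greedily pack messages into EDUs, halving the candidate batch whenever its
    canonical-JSON encoding exceeds MAX_EDU_SIZE."""
    edus = []
    remaining = list(messages.items())
    while remaining:
        batch_size = len(remaining)
        while batch_size > 0:
            candidate = make_edu_content(dict(remaining[:batch_size]))
            if len(encode_canonical_json(candidate)) <= MAX_EDU_SIZE:
                edus.append(candidate)
                remaining = remaining[batch_size:]
                break
            # Too big: halve the batch and try again.
            batch_size //= 2
        else:
            # Even a single message is over the limit; real code would need to
            # decide how to reject or drop it.
            raise ValueError("to-device message too large to fit in an EDU")
    return edus

In the common case the first attempt (all messages in one EDU) fits, so only one serialization happens, at the cost of re-encoding when a split is needed.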
I am not sure it will be that much simpler for comprehension TBH.
If we think we can eat the perf cost, the simpler option is probably to call `encode_canonical_json` on the whole EDU for each added message, and remove the message and start a new EDU if the result is larger than the max.
My calculation tricks were there to avoid doing a full serialization on each added message.
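That version might look roughly like this (again just a sketch; `new_edu_content()` is a stand-in for whatever builds a fresh EDU `content` dict, and `messages`, `encode_canonical_json` and `MAX_EDU_SIZE` come from the surrounding function):

edu_contents = []
current = new_edu_content()
for recipient, message in messages.items():
    current["messages"][recipient] = message
    # Simple but costly: re-encode the whole EDU after every added message.
    if len(encode_canonical_json(current)) > MAX_EDU_SIZE:
        # Roll back the last addition, flush the current EDU, and start a new one
        # containing just this message. (A single message that is itself over the
        # limit would still need separate handling.)
        del current["messages"][recipient]
        if current["messages"]:
            edu_contents.append(current)
        current = new_edu_content()
        current["messages"][recipient] = message
if current["messages"]:
    edu_contents.append(current)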
And add a special case, tried first, where we try to put everything in one EDU?
I don't know TBH, the idea of splitting in 2 is nice too, but I feel like it is going to be quite annoying to implement and hence not simpler.
If a set of messages exceeds this limit, the messages are split across several EDUs.
Should fix #17035.
There is currently no official specced limit for EDUs, but the consensus seems to be that it would be useful to have one to avoid this bug by bounding the transaction size.
As a side effect it also limits the size of a single to-device message to a bit less than 65536 bytes.
This should probably be added to the spec, similarly to the message size limit.
Pull Request Checklist