Batched requests #53
base: main
Conversation
// we group the object mappings by the piece index
const PIECE_INDEX_KEY = 1
const nodes = groupBy(objectMappings, PIECE_INDEX_KEY)
This will work, and once I implement the first stage of autonomys/subspace#3316, it will re-use pieces most of the time.
But if an object crosses multiple pieces, the last piece will be downloaded twice (once at the end of the batch for the first piece, because that object needs data from both pieces, and again at the start of the batch for the last piece).
If you want to re-use even more pieces, you could batch groups with nearby piece indexes together. That way, you'll re-use the pieces in this situation as well:
- There are objects in nearby pieces, and one object crosses multiple pieces
- The first piece is re-used for all the objects in the batch in that piece
- The object that crosses multiple pieces re-uses the first piece, and downloads the later pieces (up to 5)
- The last piece is only downloaded once, and re-used for the next objects in the batch
You can actually combine as many nearby pieces as you like this way. So if you want the most download-efficient code, an alternative algorithm is:
- Sort the mappings by piece index
- Split the batch when the next object couldn't possibly share a piece with the last one
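The sort-and-split algorithm above could be sketched roughly like this. The `ObjectMapping` shape and the `splitIntoBatches` name are hypothetical, just for illustration; the real types in this repo will differ. This uses the simple 5 MB bound (mappings more than 5 piece indexes apart can't share a piece) as the split condition:

```typescript
// Hypothetical mapping shape; the real type in this repo may differ.
interface ObjectMapping {
  pieceIndex: number
}

// Objects are limited to 5 MB, so mappings whose piece indexes differ
// by more than this can't share any pieces.
const MAX_OBJECT_PIECES = 5

// Sort mappings by piece index, then start a new batch whenever the
// next object couldn't possibly share a piece with the previous one.
function splitIntoBatches(mappings: ObjectMapping[]): ObjectMapping[][] {
  const sorted = [...mappings].sort((a, b) => a.pieceIndex - b.pieceIndex)
  const batches: ObjectMapping[][] = []
  let current: ObjectMapping[] = []
  for (const mapping of sorted) {
    const last = current[current.length - 1]
    if (last !== undefined && mapping.pieceIndex - last.pieceIndex > MAX_OBJECT_PIECES) {
      batches.push(current)
      current = []
    }
    current.push(mapping)
  }
  if (current.length > 0) batches.push(current)
  return batches
}
```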
If the piece is already cached, the response will be almost instant, because it is just moving some data around, and doing one blake3 hash. (Or up to 4 hashes if the object crosses segments - but that's rare, and will only happen once per batch at most.)
Here is how you can work out if two objects could share a piece:
- Blocks and objects are limited to 5 MB. So if the difference between the piece indexes in two mappings is greater than 5, they can't share any pieces - and you can split the batch there.
- If you know the size of the object, you can calculate the number of pieces in the object as `(object_size + 100) / (2^15 * 31)`, rounded up. Then split the batch if the difference between the two piece indexes is greater than the number of pieces in the object.
(The calculation is a bit complicated to account for segment padding and headers, and the unused bytes in the cryptographic scalars we use to generate parity pieces.)
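The size-based check could look roughly like this; `maxPiecesForObject` and `couldSharePiece` are hypothetical names, and the constants just restate the formula above (2^15 scalars per piece, 31 usable bytes per scalar, plus 100 bytes of headroom for segment padding and headers):

```typescript
// Usable data bytes per piece: 2^15 scalars, 31 data bytes per 32-byte scalar.
const PIECE_DATA_BYTES = 2 ** 15 * 31

// Upper bound on the number of pieces an object of objectSize bytes spans.
// The +100 accounts for segment padding and headers, as described above.
function maxPiecesForObject(objectSize: number): number {
  return Math.ceil((objectSize + 100) / PIECE_DATA_BYTES)
}

// Split the batch if the gap between the two piece indexes is greater
// than the first object's piece count; otherwise they might share a piece.
function couldSharePiece(
  firstPieceIndex: number,
  firstObjectSize: number,
  secondPieceIndex: number,
): boolean {
  return secondPieceIndex - firstPieceIndex <= maxPiecesForObject(firstObjectSize)
}
```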
I implemented this change in my last commit. As you can see, no piece re-use is attempted when the objects' pieces are not consecutive.
This is because, although objects at the protocol level are limited to 5 MB, @autonomys/auto-dag-data
limits them to 64 KB, since that's the biggest a Bytes
input can be for an extrinsic (which is what System.remark
uses). So for two objects to share a piece, their pieces have to be consecutive.
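Under that 64 KB cap, an object spans at most two pieces, so the share check collapses to a consecutiveness test. A minimal sketch (the `canReusePiece` name is hypothetical):

```typescript
// With objects capped at 64 KB by @autonomys/auto-dag-data, an object
// spans at most two pieces, so two objects can only share a piece when
// their piece indexes are equal or consecutive.
function canReusePiece(pieceIndexA: number, pieceIndexB: number): boolean {
  return Math.abs(pieceIndexA - pieceIndexB) <= 1
}
```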
Thanks! That seems like an annoying limitation 🙂
Hopefully my next PR will help re-use pieces more. If we're splitting into 64 kB objects, that's a lot of small objects in a single piece for a 1 MB file.
…into batched-requests
… for file retrieval service
No description provided.