Skip to content

Improve performance of bdk_file_store::EntryIter #1270

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jan 25, 2024

Conversation

evanlinjin
Copy link
Member

@evanlinjin evanlinjin commented Jan 13, 2024

Description

EntryIter performance is improved by reducing syscalls. The underlying file reader is wrapped with BufReader (to reduce calls to read and seek).

Two new tests are introduced. One ensures correct behavior when the last changeset write is too short. The other ensures the next write position is correct after a short read.

Notes to the reviewers

This is extracted from #1172 as suggested by #1172 (review).

Changelog notice

Changed

  • EntryIter performance is improved by reducing syscalls.

Checklists

All Submissions:

  • I've signed all my commits
  • I followed the contribution guidelines
  • I ran cargo fmt and cargo clippy before committing

New Features:

  • I've added tests for the new feature
  • I've added docs for the new feature

* Wrap file reader with `BufReader`. This reduces calls to `read`.
* Wrap file reader with `CountingReader`. This counts the bytes read by
  the underlying reader. We can rewind without seeking first.
This test simulates a situation where the last write to the db is short.

Aggregating the changeset after reopening the file should return an
error (which includes a partially-aggregated changeset) containing an
aggregation of changesets that were fully written.

At this point, the test re-writes the final changeset (and this time it
successfully writes in full).

The file should be recoverable with all changesets, including the last
one.
// This syncs the underlying file's offset with the buffer's position. This way, we
// maintain the correct position to start the next read/write.
#[allow(clippy::seek_from_current)]
let _ = self.db_file.seek(io::SeekFrom::Current(0));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change this so we instead just seek the underlying file to the stream_position that the BufReader returns (by recording it and then into_inner or whatever to get the file). This is more explicit about what we're trying to do and doesn't have clippy ignores 🙈.

Something feels off about this. It we be cool if we could get rid of the implicit state being stored in the file position and make the user explicitly start writing at a certain position that we've advised instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LLFourn How is this?

        if let Ok(pos) = self.db_file.stream_position() {
            let _ = self.db_file.get_mut().seek(io::SeekFrom::Start(pos));
        }

Can we do explicit position writes in a separate PR? I think it is out of scope for this PR (+ I would like to use these changes here).

Because we use wrap the file with `BufReader` with the `EntryIter`, we
need to sync the `BufReader`'s position with the file's offset when we
drop the `EntryIter`. Therefore we have a custom drop impl for
`EntryIter`.
@evanlinjin evanlinjin force-pushed the file_store_performance branch from d6c5287 to 51bd01b Compare January 25, 2024 08:20
Copy link
Contributor

@LLFourn LLFourn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 51bd01b

@evanlinjin evanlinjin merged commit 07116df into bitcoindevkit:master Jan 25, 2024
@notmandatory notmandatory mentioned this pull request Jan 30, 2024
23 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new feature New feature or request
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants