Skip to content

Use a BufReader in bdk_file_store #1293

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

LLFourn
Copy link
Contributor

@LLFourn LLFourn commented Jan 22, 2024

PR over #1270 to simplify it. Key observation is that you can use stream_position in BufReader without making a system call. I think @evanlinjin introduced CountingReader to keep track of where we are in the stream but BufReader already does this (correct me if I'm wrong).
Take a close look please @evanlinjin. It passes the tests but I might be missing something.

evanlinjin and others added 3 commits January 13, 2024 16:07
* Wrap file reader with `BufReader`. This reduces calls to `read`.
* Wrap file reader with `CountingReader`. This counts the bytes read by
  the underlying reader. We can rewind without seeking first.
This test simulates a situation where the last write to the db is short.

Aggregating the changeset after reopening the file should return an
error (which includes a partially-aggregated changeset) containing an
aggregation of changesets that were fully written.

At this point, the test re-writes the final changeset (and this time it
successfully writes in full).

The file should be recoverable with all changesets, including the last
one.
@LLFourn LLFourn force-pushed the llfourn_file_store_performance branch from 2fd8d53 to aa1b16a Compare January 22, 2024 02:55
@LLFourn LLFourn requested a review from evanlinjin January 22, 2024 04:47
Comment on lines 75 to 72
impl<'t, T> Drop for EntryIter<'t, T> {
fn drop(&mut self) {
if let Some(r) = self.db_file.as_mut() {
// This syncs the underlying file's offset with the buffer's position. This way, no data
// is lost with future reads.
let _ = r.stream_position();
}
}
})()
.transpose()
}
}
Copy link
Member

@evanlinjin evanlinjin Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still need a Drop implementation that syncs the underlying file's offset. However, we should use self.seek(SeekFrom::Current(0)).

Copy link
Contributor Author

@LLFourn LLFourn Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we write a test that demonstrates this need? I think I understand the rationale I'll see if I can come up one.

[EDIT] or maybe we can use a BufReader in the main type (not just the iterator).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LLFourn I don't think using a BufReader with the internal file getting written to is a good idea. We'll need to do some crazy stuff with .get_mut and flush the buffer with every write.

I thought that the test I added demonstrated the need for the Drop impl but clearly not. We need a test to write after a failed read without closing the file in-between.

@evanlinjin
Copy link
Member

@LLFourn I cherry-picked these commits onto #1270. Can we close this PR and continue discussion on #1270?

@evanlinjin
Copy link
Member

This is continued in #1270

@evanlinjin evanlinjin closed this Jan 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants