Commit cd7fade

Add small-copy optimization for io::Cursor
During benchmarking, I found that one of my programs spent between 5 and 10 percent of the time doing memmoves. Ultimately I tracked these down to single-byte slices being copied with a memcopy in io::Cursor::read(). Doing a manual copy if only one byte is requested can speed things up significantly. For my program, this reduced the running time by 20%.

Why special-case only a single byte, and not a "small" slice in general? I tried doing this for slices of at most 64 bytes and of at most 8 bytes. In both cases my test program was significantly slower.
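The fast path described above can be sketched outside the standard library. This is only an illustration under stated assumptions: `MiniCursor` and `read_bytewise` are hypothetical names invented for this example, not part of `std`, and the real `io::Cursor` tracks its position as a `u64` and goes through `fill_buf()` rather than slicing directly.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for `io::Cursor<&[u8]>` (not the std type).
struct MiniCursor<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> MiniCursor<'a> {
    fn new(data: &'a [u8]) -> Self {
        MiniCursor { data, pos: 0 }
    }
}

impl<'a> Read for MiniCursor<'a> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Unread remainder of the input; empty once `pos` reaches the end.
        let mut rest = &self.data[self.pos.min(self.data.len())..];
        let n = if buf.len() == 1 && !rest.is_empty() {
            // Fast path: a single byte is assigned directly.
            buf[0] = rest[0];
            1
        } else {
            // General path: delegate to `<&[u8] as Read>::read`.
            Read::read(&mut rest, buf)?
        };
        self.pos += n;
        Ok(n)
    }
}

/// Hypothetical usage helper: drains the input with one-byte reads,
/// which is the access pattern the fast path targets.
fn read_bytewise(data: &[u8]) -> io::Result<Vec<u8>> {
    let mut cursor = MiniCursor::new(data);
    let mut byte = [0u8; 1];
    let mut out = Vec::with_capacity(data.len());
    while cursor.read(&mut byte)? == 1 {
        out.push(byte[0]);
    }
    Ok(out)
}
```

Both branches return the same results for any input; the single-byte branch only avoids the call into the general slice copy.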
1 parent 8e373b4 commit cd7fade

1 file changed: +15 -3 lines changed
src/libstd/io/cursor.rs

Lines changed: 15 additions & 3 deletions
@@ -219,9 +219,21 @@ impl<T> io::Seek for Cursor<T> where T: AsRef<[u8]> {
 #[stable(feature = "rust1", since = "1.0.0")]
 impl<T> Read for Cursor<T> where T: AsRef<[u8]> {
     fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
-        let n = Read::read(&mut self.fill_buf()?, buf)?;
-        self.pos += n as u64;
-        Ok(n)
+        // First check if the amount of bytes we want to read is small: the read
+        // in the else branch will end up calling `<&[u8] as Read>::read()`,
+        // which will copy the buffer using a memcopy. If we only want to read a
+        // single byte, then the overhead of the function call is significant.
+        let num_read = {
+            let mut inner_buf = self.fill_buf()?;
+            if buf.len() == 1 && inner_buf.len() > 0 {
+                buf[0] = inner_buf[0];
+                1
+            } else {
+                Read::read(&mut inner_buf, buf)?
+            }
+        };
+        self.pos += num_read as u64;
+        Ok(num_read)
     }
 }
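To reproduce the kind of workload this change targets, one could drive `std::io::Cursor` with one-byte reads and time the loop. The following is a rough sketch, not the author's benchmark: `sum_bytewise` and `time_bytewise` are hypothetical helpers, and a wall-clock measurement like this is noisy and subject to compiler optimization, so treat any numbers it yields as indicative only.

```rust
use std::io::{Cursor, Read};
use std::time::{Duration, Instant};

/// Sums all bytes of `data` by issuing one-byte `read` calls,
/// so every call takes the `buf.len() == 1` branch until the input ends.
fn sum_bytewise(data: &[u8]) -> u64 {
    let mut cursor = Cursor::new(data);
    let mut byte = [0u8; 1];
    let mut sum = 0u64;
    // `read` returns Ok(0) once the cursor reaches the end of the input.
    while let Ok(1) = cursor.read(&mut byte) {
        sum += u64::from(byte[0]);
    }
    sum
}

/// Hypothetical helper: returns the sum together with the elapsed
/// wall-clock time of a single pass over `data`.
fn time_bytewise(data: &[u8]) -> (u64, Duration) {
    let start = Instant::now();
    let sum = sum_bytewise(data);
    (sum, start.elapsed())
}
```

Returning the sum alongside the duration keeps the loop's result in use, which makes it less likely that the compiler discards the work being timed.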
