byte_pattern: share the TwoWaySearcher between byte and str #135931
- SearchStep::Reject(a, mut b) => {
+ byte_pattern::SearchStep::Reject(a, mut b) => {
      // skip to next char boundary
      while !self.haystack.is_char_boundary(b) {
          b += 1;
      }
      searcher.position = cmp::max(b, searcher.position);
      SearchStep::Reject(a, b)
  }
- otherwise => otherwise,
+ byte_pattern::SearchStep::Match(a, b) => SearchStep::Match(a, b),
+ byte_pattern::SearchStep::Done => SearchStep::Done,
I duplicated SearchStep because it is a public type and its documentation refers to the Searcher trait. The byte_pattern module will have its own Searcher trait (or ByteSearcher maybe), so that documentation would be misleading one way or the other.
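To illustrate the conversion discussed in this thread, here is a minimal standalone sketch of mapping a byte-level step onto a str-level step. The names `ByteStep`, `StrStep`, and `to_str_step` are hypothetical stand-ins, not the types or functions in this PR, and the sketch assumes the reject end offset never exceeds `haystack.len()`.

```rust
/// Hypothetical stand-in for the byte-level step type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ByteStep {
    Match(usize, usize),
    Reject(usize, usize),
    Done,
}

/// Hypothetical stand-in for the str-level step type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum StrStep {
    Match(usize, usize),
    Reject(usize, usize),
    Done,
}

/// Convert a byte-level step into a str-level step.
///
/// A byte-level reject may end in the middle of a multi-byte character,
/// so its end is advanced to the next char boundary. `position` models
/// the searcher's cursor and is moved forward to match.
/// Assumption: the reject end `b` is at most `haystack.len()`.
fn to_str_step(haystack: &str, position: &mut usize, step: ByteStep) -> StrStep {
    match step {
        ByteStep::Reject(a, mut b) => {
            // Skip to the next char boundary.
            while !haystack.is_char_boundary(b) {
                b += 1;
            }
            *position = (*position).max(b);
            StrStep::Reject(a, b)
        }
        // Matches and Done carry over unchanged.
        ByteStep::Match(a, b) => StrStep::Match(a, b),
        ByteStep::Done => StrStep::Done,
    }
}
```

For example, with the haystack "aéb" (where "é" occupies bytes 1 and 2), a byte-level `Reject(0, 2)` would be widened so that it ends at offset 3. Keeping two separate enums mirrors the reasoning above: each type's documentation can refer to its own searcher trait.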
library/core/src/str/pattern.rs
Outdated
- if let Some(result) = simd_contains(self, haystack) {
+ if let Some(result) = simd_contains(self.as_bytes(), haystack.as_bytes()) {
This function just takes a slice of bytes now. From what I can see, the implementation does not rely on the input being UTF-8 at all.
tracking issue: #134149
An attempt to break up #134350 into more manageable pieces.
From what I can see, the TwoWaySearcher implementation does not have special logic for UTF-8 boundaries, so it should work just as well on any &[u8]. So this PR just moves the TwoWaySearcher implementation to slice/byte_pattern.rs, and then uses it from str/pattern.rs. No functional changes, no additional API surface.
r? @BurntSushi