feat(stdlib): Add Buffer.setChar and Buffer.getChar #2262
base: main
@unsafe
provide let getChar = (index, buffer) => {
  use WasmI32.{ (+), (&), (+), (==), (>) }
  checkIsIndexInBounds(index, 1, buffer)
Why is this 1 if it's UTF-8?
Characters can be between 1 and 4 bytes. We need to ensure the first byte exists so we can check the char size on line 406, and we do an additional length check on line 408 with the actual length.
This is why getChar needs to operate on the bytes directly rather than just using Bytes.getChar like our other helpers.
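The two-stage bounds check described above can be sketched as follows. This is an illustration in Python rather than Grain, and it is not the code from this PR: the helper names `utf8_length_from_lead_byte`, `check_index_in_bounds`, and `get_char` are hypothetical and only mirror the idea of checking for one byte first, reading the lead byte to learn the real size, and then checking again.

```python
def utf8_length_from_lead_byte(lead: int) -> int:
    """Return the encoded length (1 to 4) implied by a UTF-8 lead byte."""
    if (lead & 0x80) == 0x00:  # 0xxxxxxx: single-byte (ASCII)
        return 1
    if (lead & 0xE0) == 0xC0:  # 110xxxxx: two-byte sequence
        return 2
    if (lead & 0xF0) == 0xE0:  # 1110xxxx: three-byte sequence
        return 3
    if (lead & 0xF8) == 0xF0:  # 11110xxx: four-byte sequence
        return 4
    raise ValueError("malformed UTF-8: not a valid lead byte")


def check_index_in_bounds(index: int, size: int, length: int) -> None:
    """Hypothetical stand-in for a bounds check over `size` bytes at `index`."""
    if index < 0 or index + size > length:
        raise IndexError("index out of bounds")


def get_char(index: int, data: bytes) -> str:
    # Stage 1: only the lead byte is known to be needed so far.
    check_index_in_bounds(index, 1, len(data))
    size = utf8_length_from_lead_byte(data[index])
    # Stage 2: re-check now that the actual character size is known.
    check_index_in_bounds(index, size, len(data))
    return data[index:index + size].decode("utf-8")
```

Checking for a single byte first is what makes it safe to read the lead byte at all; the second check catches a multi-byte character that is cut off at the end of the buffer.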
@@ -376,6 +376,68 @@ provide let addString = (string, buffer) => {
  buffer.len += bytelen
}

/**
 * Gets the UTF-8 encoded character at the given byte index.
If the operation is intended to get a character starting at a byte index, what is the expected behavior when the index points into the middle of a UTF-8 character?
I'm pretty sure it would either return a character, if the remaining bytes happen to form a valid char (I can't recall if that's ever the case), or, more likely, result in MalformedUnicode.
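As a point of reference on the encoding itself: in UTF-8, every byte after the first in a multi-byte character is a continuation byte of the form 10xxxxxx, and that pattern never matches a lead byte. The Python sketch below only illustrates this property; the helper names are hypothetical, and it does not verify which error Grain's getChar actually reports.

```python
def is_continuation_byte(byte: int) -> bool:
    """True for bytes of the form 10xxxxxx, which cannot start a character."""
    return (byte & 0xC0) == 0x80


def starts_character(data: bytes, index: int) -> bool:
    """True if the byte at `index` could begin a UTF-8 character."""
    return not is_continuation_byte(data[index])


# "a" is 1 byte, "é" is 2 bytes, "€" is 3 bytes in UTF-8.
data = "aé€".encode("utf-8")
boundaries = [starts_character(data, i) for i in range(len(data))]
```

Here `boundaries` is true only at byte offsets 0, 1, and 3, the offsets where a character begins. Every other offset lands on a continuation byte, so decoding from it cannot produce a valid character.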
This adds Buffer.setChar and Buffer.getChar to complement their bytes counterparts.
Note: We have to read the first byte before getChar so we can correctly do the bounds check, similar to how it's done in Bytes.getChar.
Note: I realized we missed a piece of documentation on Bytes.getChar, so I added it here.