Implement RawValue type (alternative) #485

Merged
merged 17 commits into from
Sep 20, 2018

dtolnay (Member) commented Sep 20, 2018

This PR builds on #480 but eliminates the Cow in favor of distinct &RawValue and Box<RawValue> options.

Documentation of RawValue:


Reference to a range of bytes encompassing a single valid JSON value in the input data.

A RawValue can be used to defer parsing parts of a payload until later, or to avoid parsing it at all in the case that part of the payload just needs to be transferred verbatim into a different output object.

When serializing, a value of this type will retain its original formatting and will not be minified or pretty-printed.

Example

#[macro_use]
extern crate serde_derive;
extern crate serde_json;

use serde_json::{Result, value::RawValue};

#[derive(Deserialize)]
struct Input<'a> {
    code: u32,
    #[serde(borrow)]
    payload: &'a RawValue,
}

#[derive(Serialize)]
struct Output<'a> {
    info: (u32, &'a RawValue),
}

// Efficiently rearrange JSON input containing separate "code" and "payload"
// keys into a single "info" key holding an array of code and payload.
//
// This could be done equivalently using serde_json::Value as the type for
// payload, but &RawValue will perform better because it does not require
// memory allocation. The correct range of bytes is borrowed from the input
// data and pasted verbatim into the output.
fn rearrange(input: &str) -> Result<String> {
    let input: Input = serde_json::from_str(input)?;

    let output = Output {
        info: (input.code, input.payload),
    };

    serde_json::to_string(&output)
}

fn main() -> Result<()> {
    let out = rearrange(r#" {"code": 200, "payload": {}} "#)?;

    assert_eq!(out, r#"{"info":[200,{}]}"#);

    Ok(())
}
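
To make the "range of bytes borrowed from the input and pasted verbatim" idea concrete, here is a std-only sketch. It is not how serde_json implements RawValue; `raw_span` and `wrap_info` are hypothetical helpers invented for illustration, and the scanner only tracks nesting and string escapes without validating the JSON. It assumes `start` is the byte offset of the first byte of a value.

```rust
/// Hypothetical helper (not part of serde_json): returns the slice of `input`
/// covering the one JSON value whose first byte is at offset `start`.
/// It only tracks nesting and string escapes and does not validate the JSON.
fn raw_span(input: &str, start: usize) -> Option<&str> {
    let bytes = input.as_bytes();
    let (mut depth, mut in_string, mut escaped) = (0usize, false, false);
    for i in start..bytes.len() {
        let b = bytes[i];
        if in_string {
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
                if depth == 0 {
                    return Some(&input[start..=i]);
                }
            }
            continue;
        }
        match b {
            b'"' => in_string = true,
            b'{' | b'[' => depth += 1,
            b'}' | b']' if depth > 0 => {
                depth -= 1;
                if depth == 0 {
                    return Some(&input[start..=i]);
                }
            }
            // At depth 0 a delimiter or whitespace ends a bare scalar.
            b'}' | b']' | b',' | b' ' | b'\t' | b'\n' | b'\r' if depth == 0 => {
                return Some(&input[start..i]);
            }
            _ => {}
        }
    }
    // A bare scalar may run to the end of the input.
    if depth == 0 && !in_string && start < bytes.len() {
        Some(&input[start..])
    } else {
        None
    }
}

/// Hypothetical helper: pastes the borrowed slice into the output unchanged,
/// so the payload keeps its original whitespace.
fn wrap_info(code: u32, raw: &str) -> String {
    format!(r#"{{"info":[{},{}]}}"#, code, raw)
}
```

The returned slice points into `input`, so no allocation happens until the output string is built, and the payload is never re-serialized.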

Ownership

The typical usage of RawValue will be in the borrowed form:

#[derive(Deserialize)]
struct SomeStruct<'a> {
    #[serde(borrow)]
    raw_value: &'a RawValue,
}

The borrowed form is suitable when deserializing through serde_json::from_str and serde_json::from_slice which support borrowing from the input data without memory allocation.

When deserializing through serde_json::from_reader you will need to use the boxed form of RawValue instead. This is almost as efficient but involves buffering the raw value from the I/O stream into memory.

#[derive(Deserialize)]
struct SomeStruct {
    raw_value: Box<RawValue>,
}
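
The split between the borrowed and boxed forms can be illustrated without serde_json at all. In this rough sketch, `borrow_raw` and `buffer_raw` are hypothetical stand-ins (not serde_json functions) that treat the whole trimmed input as one raw value: a string input can hand back a slice of itself, while a reader has nothing that outlives the call to borrow from, so the bytes must be copied into owned memory.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for the borrowed form: the result points into `input`.
fn borrow_raw(input: &str) -> &str {
    input.trim()
}

/// Hypothetical stand-in for the boxed form: bytes from the reader are
/// buffered into a String, then the raw text is stored in an owned Box<str>.
fn buffer_raw<R: Read>(mut reader: R) -> io::Result<Box<str>> {
    let mut buf = String::new();
    reader.read_to_string(&mut buf)?;
    Ok(buf.trim().into())
}
```

The only extra cost on the reader path is the buffering step, which matches the "almost as efficient" description above.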

srijs and others added 15 commits September 14, 2018 20:11
This avoids a redundant UTF-8 check in the case of StrRead.

This is what serde_json::Number does in arbitrary precision mode. It is
more robust because it fails fast in formats that are not JSON. In the
previous implementation, it was too easy when mixing different data
formats to end up with a RawValue holding something that is not valid
JSON data:

    extern crate serde;
    extern crate serde_json;

    use serde::de::{Deserialize, IntoDeserializer, value::Error};
    use serde_json::RawValue;

    fn main() -> Result<(), Error> {
        let bad = RawValue::deserialize("~~~".to_owned().into_deserializer())?;
        println!("{}", bad);
        Ok(())
    }

The new implementation detects in this case that we are not
deserializing from a deserializer that speaks the language of
serde_json::RawValue and will fail fast rather than producing an illegal
RawValue.

Unclear in what situations comparing RawValues for equality would be
meaningful.

Before:

    RawValue("{\"k\":\"v\"}")

After:

    RawValue({"k":"v"})
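
One way to get the "After" output is a hand-written Debug impl that writes the JSON text through Display instead of through the string's own Debug. The sketch below uses two stand-in newtypes over String, placed in hypothetical `before` and `after` modules; it illustrates the technique and is not a copy of the serde_json source.

```rust
mod before {
    // Derived Debug formats the inner String with Debug, so quotes are escaped.
    #[derive(Debug)]
    pub struct RawValue(pub String);
}

mod after {
    use std::fmt;

    pub struct RawValue(pub String);

    impl fmt::Debug for RawValue {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            // format_args! yields a value whose Debug output is the Display
            // text, so the JSON appears verbatim inside the tuple.
            f.debug_tuple("RawValue")
                .field(&format_args!("{}", self.0))
                .finish()
        }
    }
}
```

Formatting either type with `{:?}` on the input `{"k":"v"}` reproduces the two outputs shown above.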
dtolnay mentioned this pull request Sep 20, 2018

srijs (Contributor) commented Sep 20, 2018

I like where this is going! Anything I can pick up or help with?

dtolnay merged commit 871e752 into serde-rs:master Sep 20, 2018
dtolnay deleted the raw branch September 20, 2018 17:58
2 participants