Skip to content

Add serialize to all the exported structs of wit-parser #1177

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 34 commits into from

Conversation

Mossaka
Copy link
Member

@Mossaka Mossaka commented Aug 23, 2023

This PR adds Serialize implementation to all the exported structs of the wit-parser crate so that it could output JSON given a WIT package.

This follows the discussion from bytecodealliance/wit-bindgen#640

Since wit-parser uses id-arena for allocating graph objects, which doesn't support serde, I had to fork this project and added Serialize to Id and Arena. See here for my changes: fitzgen/id-arena#14

Usage:

wasm-tools component wit --json

Example Output

Give the following WIT

package foo:foo

interface disambiguate {
  type t = u32
}

world foo {
  type a = u32

  type b = a

  export c: func(a: a) -> b
}
world bar {
  import disambiguate
  use disambiguate.{t}

  export foo: func() -> t
}
world the-test {
  import foo: func(a: a) -> b

  record a {
    x: u32,
  }

  variant b {
    c(a),
  }

  export bar: func(a: a) -> b
}

The corresponding JSON format is

{
  "worlds": [
    {
      "name": "foo",
      "docs": {
        "contents": null
      },
      "imports": {
        "a": {
          "Type": "type-1"
        },
        "b": {
          "Type": "type-2"
        }
      },
      "exports": {
        "c": {
          "Function": {
            "docs": {
              "contents": null
            },
            "name": "c",
            "kind": "Freestanding",
            "params": [
              [
                "a",
                "type-1"
              ]
            ],
            "results": {
              "Anon": "type-2"
            }
          }
        }
      },
      "package": {
        "idx": 0
      }
    },
    {
      "name": "bar",
      "docs": {
        "contents": null
      },
      "imports": {
        "interface-0": {
          "Interface": "interface-0"
        },
        "t": {
          "Type": "type-3"
        }
      },
      "exports": {
        "foo": {
          "Function": {
            "docs": {
              "contents": null
            },
            "name": "foo",
            "kind": "Freestanding",
            "params": [],
            "results": {
              "Anon": "type-3"
            }
          }
        }
      },
      "package": {
        "idx": 0
      }
    },
    {
      "name": "the-test",
      "docs": {
        "contents": null
      },
      "imports": {
        "a": {
          "Type": "type-4"
        },
        "b": {
          "Type": "type-5"
        },
        "foo": {
          "Function": {
            "docs": {
              "contents": null
            },
            "name": "foo",
            "kind": "Freestanding",
            "params": [
              [
                "a",
                "type-4"
              ]
            ],
            "results": {
              "Anon": "type-5"
            }
          }
        }
      },
      "exports": {
        "bar": {
          "Function": {
            "docs": {
              "contents": null
            },
            "name": "bar",
            "kind": "Freestanding",
            "params": [
              [
                "a",
                "type-4"
              ]
            ],
            "results": {
              "Anon": "type-5"
            }
          }
        }
      },
      "package": {
        "idx": 0
      }
    }
  ],
  "interfaces": [
    {
      "name": "disambiguate",
      "docs": {
        "contents": null
      },
      "types": {
        "t": {
          "idx": 0
        }
      },
      "functions": {},
      "package": {
        "idx": 0
      }
    }
  ],
  "types": [
    {
      "docs": {
        "contents": null
      },
      "kind": {
        "Type": "U32"
      },
      "name": "t",
      "owner": {
        "Interface": {
          "idx": 0
        }
      }
    },
    {
      "docs": {
        "contents": null
      },
      "kind": {
        "Type": "U32"
      },
      "name": "a",
      "owner": {
        "World": {
          "idx": 0
        }
      }
    },
    {
      "docs": {
        "contents": null
      },
      "kind": {
        "Type": "type-1"
      },
      "name": "b",
      "owner": {
        "World": {
          "idx": 0
        }
      }
    },
    {
      "docs": {
        "contents": null
      },
      "kind": {
        "Type": "type-0"
      },
      "name": "t",
      "owner": {
        "World": {
          "idx": 1
        }
      }
    },
    {
      "docs": {
        "contents": null
      },
      "kind": {
        "Record": {
          "fields": [
            {
              "docs": {
                "contents": null
              },
              "name": "x",
              "ty": "U32"
            }
          ]
        }
      },
      "name": "a",
      "owner": {
        "World": {
          "idx": 2
        }
      }
    },
    {
      "docs": {
        "contents": null
      },
      "kind": {
        "Variant": {
          "cases": [
            {
              "docs": {
                "contents": null
              },
              "name": "c",
              "ty": "type-4"
            }
          ]
        }
      },
      "name": "b",
      "owner": {
        "World": {
          "idx": 2
        }
      }
    }
  ],
  "packages": [
    {
      "name": "foo:foo",
      "docs": {
        "contents": null
      },
      "interfaces": {
        "disambiguate": {
          "idx": 0
        }
      },
      "worlds": {
        "foo": {
          "idx": 0
        },
        "bar": {
          "idx": 1
        },
        "the-test": {
          "idx": 2
        }
      }
    }
  ],
  "package_names": {
    "foo:foo": {
      "idx": 0
    }
  }
}

@alexcrichton
Copy link
Member

Would it be possible to avoid the need for id-arena to be Serialize/Deserialize? That includes the arena_id in a number of places which is irrelevant for this representation. On one hand I realize that this PR amounts to "let's just dump the internal representation" but at the same time if we can I think it'd be best to do our best to hide the Rust-specific bits that aren't relevant elsewhere.

@Mossaka
Copy link
Member Author

Mossaka commented Aug 23, 2023

Would it be possible to avoid the need for id-arena to be Serialize/Deserialize?

I believe arena_id could be omitted from the output, but idx is a key for looking for types, interfaces, worlds and packages.

@Mossaka
Copy link
Member Author

Mossaka commented Aug 23, 2023

to hide the Rust-specific bits that aren't relevant elsewhere.

What do you consider are Rust-specific bits?

@alexcrichton
Copy link
Member

Yes idx is required but the arena bits are the Rust-specific bits I'm worried about. Other Rust-specific bits could perhaps be serialized enum representations but that depends on the output.

@Mossaka Mossaka force-pushed the serde branch 2 times, most recently from 99b30a8 to fbc2e95 Compare September 1, 2023 23:29
@Mossaka Mossaka changed the title DO NOT MERGE: Add serialize to all the exported structs of wit-parser Add serialize to all the exported structs of wit-parser Sep 3, 2023
@Mossaka Mossaka marked this pull request as ready for review September 3, 2023 03:04
@Mossaka
Copy link
Member Author

Mossaka commented Sep 3, 2023

Just updated the repo thanks to @ydnar . Now the JSON output looks like

{
  "worlds": [
    {
      "name": "foo",
      "docs": {
        "contents": null
      },
      "imports": {
        "a": {
          "type": 1
        },
        "b": {
          "type": 2
        }
      },
      "exports": {
        "c": {
          "function": {
            "docs": {
              "contents": null
            },
            "name": "c",
            "kind": "freestanding",
            "params": [
              {
                "name": "a",
                "type": 1
              }
            ],
            "results": [
              {
                "type": 2
              }
            ]
          }
        }
      },
      "package": 0
    },
    {
      "name": "bar",
      "docs": {
        "contents": null
      },
      "imports": {
        "interface-0": {
          "interface": 0
        },
        "t": {
          "type": 3
        }
      },
      "exports": {
        "foo": {
          "function": {
            "docs": {
              "contents": null
            },
            "name": "foo",
            "kind": "freestanding",
            "params": [],
            "results": [
              {
                "type": 3
              }
            ]
          }
        }
      },
      "package": 0
    },
    {
      "name": "the-test",
      "docs": {
        "contents": null
      },
      "imports": {
        "a": {
          "type": 4
        },
        "b": {
          "type": 5
        },
        "foo": {
          "function": {
            "docs": {
              "contents": null
            },
            "name": "foo",
            "kind": "freestanding",
            "params": [
              {
                "name": "a",
                "type": 4
              }
            ],
            "results": [
              {
                "type": 5
              }
            ]
          }
        }
      },
      "exports": {
        "bar": {
          "function": {
            "docs": {
              "contents": null
            },
            "name": "bar",
            "kind": "freestanding",
            "params": [
              {
                "name": "a",
                "type": 4
              }
            ],
            "results": [
              {
                "type": 5
              }
            ]
          }
        }
      },
      "package": 0
    }
  ],
  "interfaces": [
    {
      "name": "disambiguate",
      "docs": {
        "contents": null
      },
      "types": {
        "t": 0
      },
      "functions": {},
      "package": 0
    }
  ],
  "types": [
    {
      "docs": {
        "contents": null
      },
      "kind": {
        "type": "u32"
      },
      "name": "t",
      "owner": {
        "interface": 0
      }
    },
    {
      "docs": {
        "contents": null
      },
      "kind": {
        "type": "u32"
      },
      "name": "a",
      "owner": {
        "world": 0
      }
    },
    {
      "docs": {
        "contents": null
      },
      "kind": {
        "type": 1
      },
      "name": "b",
      "owner": {
        "world": 0
      }
    },
    {
      "docs": {
        "contents": null
      },
      "kind": {
        "type": 0
      },
      "name": "t",
      "owner": {
        "world": 1
      }
    },
    {
      "docs": {
        "contents": null
      },
      "kind": {
        "record": {
          "fields": [
            {
              "docs": {
                "contents": null
              },
              "name": "x",
              "type": "u32"
            }
          ]
        }
      },
      "name": "a",
      "owner": {
        "world": 2
      }
    },
    {
      "docs": {
        "contents": null
      },
      "kind": {
        "variant": {
          "cases": [
            {
              "docs": {
                "contents": null
              },
              "name": "c",
              "type": 4
            }
          ]
        }
      },
      "name": "b",
      "owner": {
        "world": 2
      }
    }
  ],
  "packages": [
    {
      "name": "foo:foo",
      "docs": {
        "contents": null
      },
      "interfaces": {
        "disambiguate": 0
      },
      "worlds": {
        "foo": 0,
        "bar": 1,
        "the-test": 2
      }
    }
  ]
}

};
let s = serde_json::to_string(&resolve).unwrap();
let json_resolve = JSON_RESOLVE.replace(" ", "").replace("\n", "");
assert_eq!(s, json_resolve);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ydnar I've added a dummy Resolve to match the JSON output

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tests here! I'm a bit worried about the maintainability of this over time though as updating this test is going to be a bit of a pain. Alternatively though could the tests/ui/*.wit tests be all automatically serialized to JSON? Similar to how *.result files are generated for the ui/parse-fail/*.wit tests could the *.json tests be automatically generated for all the WIT successful tests? That way it'd be easy to see changes-at-a-glance and additionally auto-update test expectations when anything changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to generate and commit golden JSON files for all the WIT files in tests/ui, and compare against these.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I will work on that (expecting delayed commits from me as I am attending WasmCon this week)

Copy link
Member Author

@Mossaka Mossaka Sep 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just updated this PR to commit all JSON files corresponding to valid WIT files. The test runner is able to compare and emitted JSON from resolve to commited JSON files similar to how .result files work. Coud you PTAL, @alexcrichton ?

@@ -266,14 +285,24 @@ pub struct IncludeName {

/// The key to the import/export maps of a world. Either a kebab-name or a
/// unique interface.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
#[derive(Debug, Clone, PartialEq, Eq, Hash, Serialize)]
#[serde(into = "String")]
pub enum WorldKey {
/// A kebab-name.
Name(String),
/// An interface which is assigned no kebab-name.
Interface(InterfaceId),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shoot. Forgot to make this one a JSON number like the other enums.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is what we want, because WorldKey is used in World exports and imports IndexMap<WorldKey, WorldItem>.

@cardoso
Copy link
Contributor

cardoso commented Sep 4, 2023

This is great! Any chance this could evolve to export span information at some point? If so, once this is merged, I'd be happy to push a draft PR and discuss further.

I had to pause bytecodealliance/vscode-wit#16 , but with this + span info, the implementation of LSP for WIT would be much more straightforward especially while the language isn't stable.

I really like the approach used here: https://github.com/dtolnay/clang-ast

The purpose here is different, but I guess a lot still applies.

"worlds": [
{
"name": "world1",
"docs": { "contents": null },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could docs be skipped if the contents are null?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Is there any plans to add additional fields to the Docs struct? Could it be an optional string?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there are any current plans, no, but it seemed like something which could be simplified when I was reading over this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, great. Maybe a different PR? PackageName notwithstanding our goal was to not change the public API of this crate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm I don't think this would require a separate PR? I'm hoping to change this output in this PR by dropping the "docs": ... if it's empty. I think that can be done with #[serde(skip_if = "...")] or a similar field attribute name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -134,23 +134,7 @@ harness = false

[features]
# By default, all subcommands are built
default = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the formatting changes in this file be undone?

pub struct PackageName {
/// A namespace such as `wasi` in `wasi:foo/bar`
pub namespace: String,
/// The kebab-name of this package, which is always specified.
pub name: String,
/// Optional major/minor version information.
pub version: Option<Version>,
pub version: Option<SerializableVersion>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using SerializableVersion could this use serialize_with?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the tests, this works with semver::Version without SerializableVersion at all. So I removed it in be17ced.

pub include_names: Vec<Vec<IncludeName>>,
}

#[derive(Debug, Clone)]
#[derive(Debug, Clone, Serialize)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need Serialize if it's serde(skip)'d above?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

fn from(key: WorldKey) -> String {
match key {
WorldKey::Name(name) => name,
WorldKey::Interface(id) => format!("interface-{}", id.index()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may also be what @ydnar is referring to just above this, but I think it'd be good to separate these two ariants a it better than with an interface- prefix since that looks a lot like a kebab-name

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Since this is just an internal reference to an Interface I think this can be serialized to a JSON number.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon further investigation, I think we need to have a unique string-serializable key for the IndexMap<WorldKey, WorldItem> maps in World, given that interfaces may be unnamed:

    /// Optionally listed name of this interface.
    ///
    /// This is `None` for inline interfaces in worlds.
    pub name: Option<String>,

@alexcrichton are you OK with the kebab-case interface-$N form or is there another form you’d prefer?

pub package: Option<PackageId>,
}

#[derive(Debug, Clone, PartialEq)]
#[derive(Debug, Clone, PartialEq, Serialize)]
pub struct TypeDef {
pub docs: Docs,
pub kind: TypeDefKind,
pub name: Option<String>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be serde(skip)'d if the name is None?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done in ydnar@68b52f6

Copy link
Contributor

@ydnar ydnar Sep 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton would reordering some of the fields in the Resolve or child structs be a problem?

I’d like to reorganize a little, moving name above docs, for instance, so the serialization is a bit more human-readable. I’d do this in a follow-up PR.

pub struct TypeDef {
pub docs: Docs,
pub kind: TypeDefKind,
pub name: Option<String>,
pub owner: TypeOwner,
}

#[derive(Debug, Clone, PartialEq)]
#[derive(Debug, Clone, PartialEq, Serialize)]
#[serde(rename_all = "lowercase")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of "lowercase" could "kebab-case" be used for this and all other serialization keys?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does BA have a style guide or doc somewhere on how to represent names and identifiers? e.g. when to use kebab-case vs camel vs snake / screaming snake?

(given most target languages don’t support kebab-case for identifiers)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh I don't think anything is anywhere near official to have these sort of guidelines at a BA level by any means.

In any case though I figure kebab-case might be good to have since it matches the component model.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out all of the fields are single words, so kebab and lowercase are identical. Revisit if/when this changes?

};
let s = serde_json::to_string(&resolve).unwrap();
let json_resolve = JSON_RESOLVE.replace(" ", "").replace("\n", "");
assert_eq!(s, json_resolve);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tests here! I'm a bit worried about the maintainability of this over time though as updating this test is going to be a bit of a pain. Alternatively though could the tests/ui/*.wit tests be all automatically serialized to JSON? Similar to how *.result files are generated for the ui/parse-fail/*.wit tests could the *.json tests be automatically generated for all the WIT successful tests? That way it'd be easy to see changes-at-a-glance and additionally auto-update test expectations when anything changes.

ydnar and others added 18 commits September 13, 2023 20:48
Handle serialization using existing type alias to not break existing public API.
Signed-off-by: Jiaxiao Zhou (Mossaka) <[email protected]>
Signed-off-by: Jiaxiao Zhou (Mossaka) <[email protected]>
This uses the "BLESS=1" env var to tell `cargo test` to force
to write to files.

Signed-off-by: Jiaxiao Zhou (Mossaka) <[email protected]>
Signed-off-by: Jiaxiao Zhou (Mossaka) <[email protected]>
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I realize that there's still a follow-up here or there, but if you'd prefer I'd be ok landing this first and can iterate on it in-tree.

@ydnar
Copy link
Contributor

ydnar commented Sep 14, 2023

@alexcrichton thanks! I have a couple commits to land to address feedback. Mind holding off merging until I land those?

@alexcrichton
Copy link
Member

No worries, sounds good! Let me know when you're ready

@ydnar
Copy link
Contributor

ydnar commented Sep 14, 2023

@alexcrichton @Mossaka is on holiday, so I can’t push commits to his repo (and therefore this PR).

I could open a new PR with this + my subsequent commits, if you want.

https://github.com/bytecodealliance/wasm-tools/compare/main...ydnar:wasm-tools:serde?expand=1

@alexcrichton
Copy link
Member

Sounds good to me!

@ydnar
Copy link
Contributor

ydnar commented Sep 14, 2023

@alexcrichton OK, #1203 is up!

@alexcrichton
Copy link
Member

Merged as part of #1203, so closing. (thanks again!)

@Mossaka Mossaka deleted the serde branch September 14, 2023 23:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants