npm packaging requirements #34

ashleygwilliams · 2018-01-30T15:14:26Z

moving this out of #5 because that's got a lot of discussion that's not quite on this topic.

to package up wasm for npm, we'll need these things for a package.json:

{
   "name": String, // name of the package
   "version": String, // version of the package
   "main": String, // the primary file
   "files": [
        "path/to/js", // list of files to include in the package
        "path/to/otherstuff?",
    ],
   "dependencies" : {
       "pkgname": "version", // list of npm pkgs that the pkg depends on
   }
}

there's other metadata such as repo, author, contributors, that we may also want to consider, but the above is what is needed for bare minimum packaging. for more info: https://docs.npmjs.com/files/package.json

chicoxyzzy · 2018-01-30T21:22:23Z

Is "files" field really necessary here?

Pauan · 2018-01-30T21:51:57Z

@chicoxyzzy No, the files field isn't mandatory, and it is rarely used.

Instead of using files, most projects use .gitignore and/or .npmignore to exclude files from the npm package.

lukewagner · 2018-01-30T22:12:53Z

Below is a fairly idiomatic (relative to webassembly/design/BinaryEncoding.md) binary encoding of the above JSON schema (with my arbitrary choice of package.json as the custom section name open for discussion).

That being said, I wonder whether it's better to simply stick the (UTF-8-encoded bytes of the) JSON blob literally as the payload of the custom section. Then extracting this from a given WebAssembly.Module m is as simple as:

calling WebAssembly.Module.customSections(m, "package.json") to get an array of ArrayBuffers (one per matching section), each holding UTF-8-encoded bytes b
calling JSON.parse(new TextDecoder().decode(b)) to get the JSON.

Thoughts? I'm actually inclined toward the latter... @alexcrichton ?
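For tooling that runs outside a JavaScript engine, the same lookup can be sketched directly against the wasm binary format. This is a minimal sketch using only the Rust standard library; the function names `read_varuint32` and `custom_sections` are made up for illustration, and the section name `package.json` is the one proposed above, not something fixed by any spec. Parsing the returned bytes as JSON would need a JSON library, which is left out here.

```rust
/// Reads an unsigned LEB128 integer (wasm `varuint32`) starting at `*pos`
/// and advances `*pos` past it. Returns `None` on truncated or overlong input.
fn read_varuint32(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut result: u32 = 0;
    let mut shift = 0;
    loop {
        if shift >= 32 {
            return None; // more than five bytes: not a valid varuint32
        }
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        result |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

/// Returns the payload of every custom section called `name`, in file order.
/// Returns `None` if `wasm` is not a well-formed sequence of sections.
fn custom_sections<'a>(wasm: &'a [u8], name: &str) -> Option<Vec<&'a [u8]>> {
    // A module starts with the 4-byte magic "\0asm" and a 4-byte version.
    if wasm.len() < 8 || !wasm.starts_with(b"\0asm") {
        return None;
    }
    let mut pos = 8;
    let mut found = Vec::new();
    while pos < wasm.len() {
        // Every section is: id byte, varuint32 size, then `size` bytes of body.
        let id = wasm[pos];
        pos += 1;
        let size = read_varuint32(wasm, &mut pos)? as usize;
        let end = pos.checked_add(size)?;
        let body = wasm.get(pos..end)?;
        if id == 0 {
            // A custom section body starts with a length-prefixed name.
            let mut p = 0;
            let name_len = read_varuint32(body, &mut p)? as usize;
            let name_end = p.checked_add(name_len)?;
            if body.get(p..name_end)? == name.as_bytes() {
                found.push(&body[name_end..]);
            }
        }
        pos = end;
    }
    Some(found)
}
```

Like the JavaScript API, this returns a list because a module may contain several custom sections with the same name; `std::str::from_utf8` on each payload would give the JSON text.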

npm package custom section

As a custom section,

the id is 0,
the name field is the UTF-8 byte sequence package.json,
the payload_data field contains the following record:

Field	Type	Description
flags	`varuint32`	Bitmask initially required to be `0` that later allows adding fields
name	`string`	name of the package
version	`string`	version of the package
main	`string`	the primary file
num_files	`varuint32`	number of file `string`s that follow
files	`string`*	list of files to include in the package
num_dependencies	`varuint32`	number of dependency `string`s that follow
dependencies	`string`*	list of npm pkgs that the pkg depends on

string

Field	Type	Description
name_len	`varuint32`	length of `name_str` in bytes
name_str	bytes	UTF-8 encoding of string
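To make the record layout concrete, here is a rough sketch of an encoder for it, using only the Rust standard library. The type and function names (`PackageRecord`, `encode_payload`, `encode_custom_section`) are invented for this example. The table does not say how a dependency's version is carried, so the sketch treats each dependency as one opaque string.

```rust
/// Fields of the record from the table above (`flags` is always written as 0).
struct PackageRecord<'a> {
    name: &'a str,
    version: &'a str,
    main: &'a str,
    files: Vec<&'a str>,
    dependencies: Vec<&'a str>,
}

/// Appends `value` as an unsigned LEB128 integer (wasm `varuint32`).
fn write_varuint32(out: &mut Vec<u8>, mut value: u32) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80); // continuation bit: more bytes follow
    }
}

/// Appends a `string`: byte length, then the UTF-8 bytes.
fn write_string(out: &mut Vec<u8>, s: &str) {
    write_varuint32(out, s.len() as u32);
    out.extend_from_slice(s.as_bytes());
}

/// Appends a count followed by that many `string`s.
fn write_string_list(out: &mut Vec<u8>, items: &[&str]) {
    write_varuint32(out, items.len() as u32);
    for item in items {
        write_string(out, item);
    }
}

/// Encodes the `payload_data` record in the field order of the table.
fn encode_payload(record: &PackageRecord<'_>) -> Vec<u8> {
    let mut out = Vec::new();
    write_varuint32(&mut out, 0); // flags
    write_string(&mut out, record.name);
    write_string(&mut out, record.version);
    write_string(&mut out, record.main);
    write_string_list(&mut out, &record.files);
    write_string_list(&mut out, &record.dependencies);
    out
}

/// Wraps the payload in a custom section: id 0, size, section name, payload.
fn encode_custom_section(record: &PackageRecord<'_>) -> Vec<u8> {
    let mut body = Vec::new();
    write_string(&mut body, "package.json");
    body.extend_from_slice(&encode_payload(record));
    let mut section = vec![0u8];
    write_varuint32(&mut section, body.len() as u32);
    section.extend_from_slice(&body);
    section
}
```

Appending the result of `encode_custom_section` to an existing module's bytes should yield a module that carries the metadata, since custom sections may appear anywhere between the other sections.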

alexcrichton · 2018-01-30T22:21:18Z

Heh yeah I'd be totally ok going with just raw JSON here in the custom sections.

One point I'd want to clarify though, where do we think this'll happen? Depending on what happens we may not necessarily want the whole schema in a section of the wasm executable, but I'd wanna gut check my thinking.

Our eventual end state would be something along the lines of:

You're publishing a library to npm, and this library can consist of a bunch of rust crates compiled to wasm.
Any crate in the crate graph could depend on an npm package.
Any crate in the crate graph may also have "custom js" it needs available to it
The final wasm blob is probably created via LLD (or some wasm linker thing)
This tool @ashleygwilliams is thinking of is run at the final point before actually publishing to npm.

Does that sound right? If so I think the only parts we'd need from the wasm blob which may affect package.json are the custom JS files to include and npm dependencies. The name/version/main fields may be generated at the final step (perhaps main through the bindgen business).

In that sense I was thinking that the wasm custom sections would be engineered to be concatenatable where each crate would have an optional custom section listing its "custom js" or npm dependencies, and the linker would naturally binary-concatenate all these sections into one when producing the final output.

I may be confused though!

Pauan · 2018-01-31T02:05:30Z

Am I missing something? Why are we talking about inserting package.json into the WebAssembly?

package.json is a separate file that describes the package metadata, it's the npm equivalent of Cargo.toml

And linking all the npm dependencies together should probably be the responsibility of a bundler like Webpack or Parcel, not Rust.

aturon · 2018-01-31T04:47:53Z

@alexcrichton

Just to check: in the case where we have multiple crates with individual package.json files, it's not enough to simply concatenate their contents; we need to apply a "semver constraint intersection" to "flatten" into a coherent final set of dependencies. Is your idea that this flattening would take place within the publication tool, which would read out a bunch of separate package.json custom sections and flatten them?
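As a rough illustration of what "semver constraint intersection" could mean, the sketch below flattens requests from several crates into one range per npm package. It is deliberately simplified: it only understands caret ranges of the form `^MAJOR.MINOR.PATCH` (npm supports many more forms), and every name in it (`Range`, `parse_caret`, `intersect`, `flatten`) is invented for this example rather than taken from any real tool.

```rust
use std::collections::BTreeMap;

type Version = (u64, u64, u64);

/// Half-open version interval: `min <= v < max`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Range {
    min: Version,
    max: Version,
}

fn parse_version(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let v = (parts.next()??, parts.next()??, parts.next()??);
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some(v)
}

/// Parses `^x.y.z` using npm's caret rule: the upper bound bumps the
/// left-most non-zero component.
fn parse_caret(s: &str) -> Option<Range> {
    let min = parse_version(s.strip_prefix('^')?)?;
    let max = match min {
        (0, 0, patch) => (0, 0, patch + 1),
        (0, minor, _) => (0, minor + 1, 0),
        (major, _, _) => (major + 1, 0, 0),
    };
    Some(Range { min, max })
}

/// Returns the overlap of two ranges, or `None` if they are disjoint.
fn intersect(a: Range, b: Range) -> Option<Range> {
    let min = a.min.max(b.min);
    let max = a.max.min(b.max);
    if min < max {
        Some(Range { min, max })
    } else {
        None
    }
}

/// Merges `(package, caret range)` requests collected from all crates.
fn flatten<'a>(requests: &[(&'a str, &str)]) -> Result<BTreeMap<&'a str, Range>, String> {
    let mut merged: BTreeMap<&'a str, Range> = BTreeMap::new();
    for &(pkg, spec) in requests {
        let range = parse_caret(spec)
            .ok_or_else(|| format!("unsupported range {} for {}", spec, pkg))?;
        let combined = match merged.get(pkg) {
            Some(&existing) => intersect(existing, range)
                .ok_or_else(|| format!("conflicting constraints for {}", pkg))?,
            None => range,
        };
        merged.insert(pkg, combined);
    }
    Ok(merged)
}
```

With this sketch, two crates asking for `^4.17.0` and `^4.17.4` of the same package collapse to the range starting at 4.17.4, while `^3.0.0` and `^4.0.0` produce an error that the publication tool would have to report.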

@Pauan I believe the main rationale for using custom sections to store this data is that we will then be able to avoid a lot of special-case tooling, e.g. when linking crates. Even the npm publication tool is expected to be language-agnostic; each compiler will produce a custom section for dependencies in the form the tool expects.

It's not the bundler's job, because this all needs to happen prior to publishing to npm. That is, we need to determine a package.json for publication, well before a bundler is involved.

est31 · 2018-01-31T05:19:33Z

I don't think the compiler (or cargo) should get involved in choosing javascript package managers. There isn't just npm, there are also other package managers for javascript. This should definitely be a separate thing, and it should be also possible to turn it off completely for the case you don't want to publish on npm but use the wasm file directly, especially if you want to use it without using any bundler or similar.

aturon · 2018-01-31T05:22:59Z

@est31 That's the plan. The idea is just to have a mechanism for recording data into custom sections for consumption by various tools.

alexcrichton · 2018-01-31T06:07:20Z

@aturon

> Just to check: in the case where we have multiple crates with individual package.json files, it's not enough to simply concatenate their contents

Oh sure, of course! I was mostly referring to the literal binary representation where the linker (I'd assume at least) would just bytewise concatenate similarly named sections from each module into one at the end. In that sense raw JSON may not work as they're not byte-wise concatenatable but a binary form with lengths and such should work.

In other words the tool which is taking a wasm file and generating a package.json needs to basically iterate over the requests each of the inputs to the final wasm module had, but after this iteration it'd for sure do the resolution like you're mentioning.
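A small sketch of why a length-prefixed binary form survives bytewise concatenation: if each crate's section payload is just a run of length-prefixed strings, then gluing two payloads together gives another valid run, and the tool can iterate over every request without knowing where one crate's data ended. The entry format used here (one opaque string per request) and the function names are assumptions for illustration, not the layout any linker actually uses.

```rust
/// Appends `value` as an unsigned LEB128 integer (wasm `varuint32`).
fn write_varuint32(out: &mut Vec<u8>, mut value: u32) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Reads a `varuint32` at `*pos`, advancing `*pos`; `None` if malformed.
fn read_varuint32(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut result: u32 = 0;
    let mut shift = 0;
    loop {
        if shift >= 32 {
            return None;
        }
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        result |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

/// What one crate would emit: each entry as length + UTF-8 bytes.
fn encode_entries(entries: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    for entry in entries {
        write_varuint32(&mut out, entry.len() as u32);
        out.extend_from_slice(entry.as_bytes());
    }
    out
}

/// What the packaging tool would do: walk the (possibly concatenated)
/// payload until it is exhausted, collecting every entry.
fn read_entries(payload: &[u8]) -> Option<Vec<&str>> {
    let mut pos = 0;
    let mut entries = Vec::new();
    while pos < payload.len() {
        let len = read_varuint32(payload, &mut pos)? as usize;
        let end = pos.checked_add(len)?;
        entries.push(std::str::from_utf8(payload.get(pos..end)?).ok()?);
        pos = end;
    }
    Some(entries)
}
```

By contrast, concatenating two JSON documents such as `{"a":1}` and `{"b":2}` yields `{"a":1}{"b":2}`, which a standard JSON parser rejects, which is the concern raised above.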

aturon · 2018-01-31T06:33:40Z

@alexcrichton Great, that's exactly what I hoped! Sounds very clean.

Pauan · 2018-02-01T00:24:19Z

@aturon

> I believe the main rationale for using custom sections to store this data is that we will then be able to avoid a lot of special-case tooling, e.g. when linking crates. Even the npm publication tool is expected to be language-agnostic; each compiler will produce a custom section for dependencies in the form the tool expects.

> It's not the bundler's job, because this all needs to happen prior to publishing to npm. That is, we need to determine a package.json for publication, well before a bundler is involved.

I'm sorry, I'm still not understanding, could you elaborate some more?

My understanding is that this is how the process should work:

Let's say somebody wants to write some Rust code and then create a foo package and publish the foo package to npm. They would follow these steps:

Write some Rust code:

// This means that we're exporting a function to WebAssembly
#[wasm_export]
pub extern fn foo() -> i32 {
    0
}

Compile that Rust code to a foo.wasm file.

Create a package.json file which contains the usual npm metadata:

{
  "name": "foo",
  "version": "0.1.0",
  "main": "./foo.wasm"
}

Run the npm publish command, which is the standard way of publishing npm packages.

Okay, great, they're done!

Now, somebody else wants to consume that foo package. There might be all sorts of different consumers: WebAssembly, JavaScript, TypeScript, etc.

For the sake of this example, let's assume that they want to consume that foo package in Rust. They would follow these steps:

Write some Rust code:

// This means that we're importing the `foo` npm package
#[wasm_module = "foo"]
extern {
    fn foo() -> i32;
}

// Use foo in some way

Compile that Rust code to a bar.wasm file.

Create a package.json file which uses the foo package as a dependency:

{
  "name": "bar",
  "version": "0.1.0",
  "main": "./bar.wasm",
  "devDependencies": {
    "foo": "^0.1.0"
  }
}

Run the npm install command.
Bundle the bar.wasm file using Parcel, Webpack, etc.

And that's it, everything should Just Work(tm). All of the packaging, bundling, and linking is done by external tools (npm and Webpack/Parcel/etc.). I think this is the only way that Rust can seamlessly work with npm.

The only thing that rustc needs to do is that when you use #[wasm_export] it creates a wasm export, and when you use #[wasm_module = "foo"] it creates a wasm import. Not some special Rust-specific import/export, just a regular wasm import/export. No custom sections or metadata needed.

What if you don't want to use wasm_export and wasm_module and write all the extern stuff? Well, in that case my recommendation would be to publish a Rust package on crates.io, and then statically link it with other Rust packages (which are also obtained from crates.io).

This linking is handled entirely by rustc / llvm, it's just the normal workflow that Rust programmers are currently using. You can then publish a single .wasm file (which contains all of the statically linked Rust packages) to npm (by following the above steps).

Or perhaps rustc could support a sort of "dynamic linking" where it creates multiple .wasm files (one for each crate), and those .wasm files use wasm imports to import each other. Then you can publish those multiple .wasm files as a single npm package.

But in any case, if you want to use (or publish) a WebAssembly module, you'll have to use the wasm_export and wasm_module stuff, and rely upon an external WebAssembly linker (like Webpack or Parcel).

Is my above understanding correct, or do you have something different in mind?

Pauan · 2018-02-01T16:04:47Z

By the way, what I said above is assuming Rust has no built-in npm integration. If we wanted to integrate npm (and I think we should), then I imagine this is how it would work:

If somebody wants to write some Rust code and publish it as the npm package foo, they would do this:

Write some Rust code:

// This means that we're exporting a function to WebAssembly
#[wasm_export]
pub extern fn foo() -> i32 {
    0
}

Add the following to their Cargo.toml file:
```
[npm]
name = "foo"
version = "0.1.0"
```
Run cargo npm publish --release --target wasm32-unknown-unknown (or whatever other target they want)

And they're done!

When they run cargo npm publish, Cargo will:

Create a folder which is used for the publishing (this would probably be something like target/npm)
Build the project and move the compiled WebAssembly file(s) into target/npm
Create a package.json file in target/npm (which includes the fields specified in [npm])
Run the npm publish command.
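The staging steps listed above could look roughly like the sketch below. `cargo npm publish` is only a proposal in this thread, so everything here is hypothetical: the function names, the `target/npm` layout, and the assumption that the `.wasm` file has already been built. The manifest is written with plain string formatting, which assumes the name and version contain no characters that need JSON escaping.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::{Command, ExitStatus};

/// Creates `<target_dir>/npm`, copies the compiled wasm file into it, and
/// writes a minimal package.json next to it. Returns the staging directory.
fn stage_npm_package(
    target_dir: &Path,
    name: &str,
    version: &str,
    wasm_file: &Path,
) -> io::Result<PathBuf> {
    let staging = target_dir.join("npm");
    fs::create_dir_all(&staging)?;

    let file_name = wasm_file.file_name().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "wasm path has no file name")
    })?;
    fs::copy(wasm_file, staging.join(file_name))?;

    // Assumption: `name` and `version` need no JSON escaping.
    let manifest = format!(
        "{{\n  \"name\": \"{}\",\n  \"version\": \"{}\",\n  \"main\": \"./{}\"\n}}\n",
        name,
        version,
        file_name.to_string_lossy()
    );
    fs::write(staging.join("package.json"), manifest)?;
    Ok(staging)
}

/// Runs `npm publish` inside the staging directory (requires npm on PATH).
fn publish(staging: &Path) -> io::Result<ExitStatus> {
    Command::new("npm").arg("publish").current_dir(staging).status()
}
```

Keeping staging and publishing as separate functions means the generated `target/npm` folder can be inspected, or published with a different client, before anything is uploaded.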

All of this is an implementation detail of Cargo, so the user doesn't (and shouldn't!) need to worry about it. The user simply needs to use cargo npm publish and everything Just Works(tm).

When consuming an npm package in Rust, they would follow these steps:

Write some Rust code:

// This means that we're importing the `foo` npm package
#[wasm_module = "foo"]
extern {
    fn foo() -> i32;
}

// Use foo in some way

Add the following to their Cargo.toml file:
```
[npm.devDependencies]
foo = "^0.1.0"
```
Run cargo build as usual.

And they're done!

When they specify [npm.devDependencies] (or [npm.dependencies], [npm.peerDependencies], [npm.bundledDependencies], [npm.optionalDependencies], etc.) Cargo will:

Create a folder which is used for npm (this would probably be something like target/npm)
Create a package.json file in target/npm which contains the [npm.devDependencies] fields.
Run the npm install command.
Add the package-lock.json file into Cargo.lock (this ensures that npm dependencies are deterministic, just like Cargo dependencies).

If package-lock.json already exists in Cargo.lock then it should copy it into the target/npm folder instead of generating a new package-lock.json file (this copying needs to be done before running npm install).
Build the Rust project and move the WebAssembly file(s) into target/npm
Run Parcel (or Webpack or whatever) to create the final fully-linked output. Parcel/Webpack will automatically resolve the npm packages to the target/npm/node_modules folder (e.g. target/npm/node_modules/foo), so that doesn't need to be done by Rust or Cargo.

Once again, this is all internal implementation details that the user shouldn't need to know about.

Fundamentally, it's doing the same steps that I described in my previous post, except that those steps have been integrated into Cargo so that it's easier for the programmer to use.

Things get trickier when a Rust package qux specifies [npm.devDependencies], and it uses a Rust package corge, and corge also specifies [npm.devDependencies]. There are basically two options:

Concatenate the [npm.devDependencies] together and resolve them as if it were a single npm package. This is the more Rust-y way of doing things, but it means you will get build failures if there is a version conflict (and version conflicts will be very common if you use this approach).
Treat each Rust package as if it were a separate npm package, so the versions get resolved separately. This is how npm does things. It avoids version conflicts, but it can create extra code bloat. Although I don't like this way of doing things, npm generally relies upon this behavior, so this is probably the correct option.

If you don't want to integrate this functionality directly into Cargo, we could make a separate third-party cargo-npm command which does the above steps (similar to how we have cargo-web right now).

Why am I suggesting to use npm install and Parcel/Webpack, rather than handling it entirely in rustc/Cargo? Because a Rust package that wants to integrate with the npm ecosystem must do that.

There are a wide variety of packages on npm: JavaScript code, TypeScript code, WebAssembly created with C++, WebAssembly created with Rust, hand-written WebAssembly, JavaScript code importing WebAssembly code, WebAssembly code importing JavaScript code, code intended for the browser only, code intended for Node only, code that works with the browser or Node, code that relies upon quirky behavior of npm, code that relies upon quirky behavior of Webpack, code that injects JSON/CSS/HTML into the final bundle, code which is dynamically imported at runtime (code-splitting), code which uses Webpack plugins, code which relies upon multiple different versions of npm packages existing simultaneously, etc.

Bundlers (such as Webpack or Parcel) have been designed to deal with this complexity; they are the de facto standard in the npm/JavaScript communities. They handle all of the above situations (and more!). Trying to replicate that behavior in rustc/Cargo is simply not practical. Even just replicating npm's package version resolution mechanism is very tricky (people have tried).

aturon · 2018-02-01T18:24:36Z

@Pauan Thanks for elaborating your thoughts! FWIW, I actually think we are all largely in agreement here, and there's just a bit of context that's missing re: what's being spelled out in this issue.

First, and most important: did you read and understand the pipeline diagram? That's the clearest documentation we currently have for the intended pipeline here, and I think it addresses many of your concerns.

But let me also try to explain things in text form.

First off, we very much want to let npm and the bundlers do their work. The goal of this issue and the related one on expressing imports is just to figure out how to tell npm, and ultimately the bundlers, the information they need to know to do this work.

We envision that process happening in two steps:

As you suggest, we need some way for crates to specify imports from npm. You're suggesting putting that in Cargo.toml. We had been thinking having a separate package.json. But that's a question for the other thread. The point is, in the end, for each crate we have npm dependency information.
At the root crate, when we want to publish up to npm, we need to produce a final package.json and .wasm file (and, with bindgen, some js) and publish that to npm for consumption. That's where the tool described in this issue comes in. We imagined this process itself comprising two steps:
- The npm dependency information for each crate in the graph would be injected as a custom section, which means linking will produce a single .wasm file with all of the dependency information from the crate graph, without Rust tooling needing any special knowledge.
- The tool described in this issue will then remove the custom section, and use it to compute a final package.json. That involves dealing with overlapping and/or conflicting version constraints from the crate graph. The tool will then publish the whole shebang on npm.

The use of custom sections is motivated by decoupling, in two ways:

Keeping the Rust toolchain as oblivious as possible, so instead this information is all handled by wasm-specific tooling.
Allowing the tool described in this issue to be language-agnostic: it works on an arbitrary .wasm file that has the right custom sections, which could've been generated by other languages.

Finally, a bit of broader context: @ashleygwilliams is coming from the npm perspective -- she works at npm -- and we've been discussing the above with the bundlers (we met with Parcel recently).

Hopefully that helps clear some of this up!

Pauan · 2018-02-02T00:31:31Z

@aturon

> First, and most important: did you read and understand the pipeline diagram?

I did see that image, but it's quite large and requires both horizontal and vertical scrolling, so it was hard to read. I have read it more thoroughly now, it does help some.

> The npm dependency information for each crate in the graph would be injected as a custom section, which means linking will produce a single .wasm file with all of the dependency information from the crate graph, without Rust tooling needing any special knowledge.

So let me just verify that I'm understanding you correctly. Let's say you create a Rust project which uses multiple Rust crates.

Each Rust crate might have a package.json. When you build your Rust project, rustc will read the package.json file for each crate, concatenating them together.

In addition, when rustc statically links all the Rust crates together into a single .wasm file, it also injects the concatenated package.json files into a custom section in the .wasm

Then you run a separate tool which will read the .wasm, generate a package.json from the custom section, and then delete the custom section.

Is my understanding correct? If so, then I completely agree with the proposal, it makes perfect sense now.

> Finally, a bit of broader context: @ashleygwilliams is coming from the npm perspective -- she works at npm -- and we've been discussing the above with the bundlers (we met with Parcel recently).

Indeed, I am also coming at it from the npm perspective, because I am a long-time JavaScript and npm user. That's why the suggestion of using WebAssembly custom sections seemed odd to me, since package.json is the normal way of doing things.

I thought that the custom sections would be included in the npm package. But now I understand that the custom sections are simply an implementation detail, not something that is exposed to the programmer: the programmer uses package.json as usual.

> Hopefully that helps clear some of this up!

Yes it does clear it up, thanks a lot!

ashleygwilliams · 2019-07-25T16:34:24Z

we are tracking this in the wasm-pack repo and it has already shipped ;)

ashleygwilliams mentioned this issue Jan 30, 2018

Work out core npm integration story #5

Closed

ashleygwilliams added packaging wasm tooling labels Jan 30, 2018

ashleygwilliams mentioned this issue Jan 30, 2018

wasm-npm-packager: what to write it in #35

Closed

alexcrichton mentioned this issue Apr 20, 2018

Coordinating with wasm-bingden about versions and list of NPM dependencies rustwasm/wasm-pack#101

Closed

ashleygwilliams closed this as completed Jul 25, 2019
