Add test crate to compile DataFusion with wasm-pack #7633

Merged: 8 commits, Sep 25, 2023
23 changes: 23 additions & 0 deletions .github/workflows/rust.yml
@@ -206,6 +206,29 @@ jobs:
          cd datafusion-cli
          cargo doc --document-private-items --no-deps

  linux-wasm-pack:
    name: build with wasm-pack
    runs-on: ubuntu-latest
    container:
      image: amd64/rust
    steps:
      - uses: actions/checkout@v4
      - name: Cache Cargo
        uses: actions/cache@v3
        with:
          path: /github/home/.cargo
          # this key equals the ones on `linux-build-lib` for re-use
          key: cargo-cache-
      - name: Setup Rust toolchain
        uses: ./.github/actions/setup-builder
        with:
          rust-version: stable
      - name: Install wasm-pack
        run: curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh
      - name: Build with wasm-pack
        working-directory: ./datafusion/wasmtest
        run: wasm-pack build --dev

  # verify that the benchmark queries return the correct results
  verify-benchmark-results:
    name: verify benchmark results (amd64)
1 change: 1 addition & 0 deletions Cargo.toml
@@ -30,6 +30,7 @@ members = [
"datafusion/sql",
"datafusion/sqllogictest",
"datafusion/substrait",
"datafusion/wasmtest",
"datafusion-examples",
"test-utils",
"benchmarks",
50 changes: 50 additions & 0 deletions datafusion/wasmtest/Cargo.toml
@@ -0,0 +1,50 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

[package]
name = "datafusion-wasmtest"
description = "Test library to compile datafusion crates to wasm"
version = { workspace = true }
edition = { workspace = true }
readme = { workspace = true }
homepage = { workspace = true }
repository = { workspace = true }
license = { workspace = true }
authors = { workspace = true }
rust-version = "1.70"

[lib]
crate-type = ["cdylib", "rlib",]

[dependencies]

# The `console_error_panic_hook` crate provides better debugging of panics by
# logging them with `console.error`. This is great for development, but requires
# all the `std::fmt` and `std::panicking` infrastructure, so isn't great for
# code size when deploying.
console_error_panic_hook = { version = "0.1.1", optional = true }

datafusion-common = { path = "../common", version = "31.0.0", default-features = false }
datafusion-expr = { path = "../expr" }
datafusion-optimizer = { path = "../optimizer" }
datafusion-physical-expr = { path = "../physical-expr" }
datafusion-sql = { path = "../sql" }

# getrandom must be compiled with js feature
getrandom = { version = "0.2.8", features = ["js"] }
parquet = { version = "47.0.0", default-features = false }
wasm-bindgen = "0.2.87"
61 changes: 61 additions & 0 deletions datafusion/wasmtest/README.md
@@ -0,0 +1,61 @@
<!---
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

## wasmtest

Library crate to verify that various DataFusion crates compile successfully to the `wasm32-unknown-unknown` target with wasm-pack.

Some of DataFusion's downstream projects compile to WASM to run in the browser. Doing so requires care to ensure that library dependencies incompatible with WASM are not pulled into DataFusion.
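
One general Rust mechanism for keeping incompatible code out of a WASM build is conditional compilation on the target. The sketch below is illustrative only: `worker_threads` is a hypothetical helper, not part of DataFusion or this PR, and it assumes the native-only behavior in question is thread discovery.

```rust
// Hypothetical sketch: `worker_threads` is an illustrative name, not a
// DataFusion API.

/// Native builds: ask the OS how much parallelism is available,
/// falling back to 1 if the query fails.
#[cfg(not(target_arch = "wasm32"))]
pub fn worker_threads() -> usize {
    std::thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}

/// WASM builds: threads are not available by default on
/// `wasm32-unknown-unknown`, so report a single worker.
#[cfg(target_arch = "wasm32")]
pub fn worker_threads() -> usize {
    1
}
```

Only one of the two definitions is compiled for any given target, so the native-only call never reaches the WASM build.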

## Setup

First, [install wasm-pack](https://rustwasm.github.io/wasm-pack/installer/).

Then use wasm-pack to compile the crate from within this directory:

```
wasm-pack build
```

## Try it out

The `datafusion-wasm-app` directory contains a simple app (created with [`create-wasm-app`](https://github.com/rustwasm/create-wasm-app) and then manually updated to WebPack 5) that invokes DataFusion and writes results to the browser console.

From within the `datafusion/wasmtest/datafusion-wasm-app` directory:

```
npm install
npm run start
```

Then open http://localhost:8080/ in a web browser and check the console to see the results of using various DataFusion crates.

**Note:** In GitHub Actions we test the compilation with `wasm-pack build`, but we don't currently invoke `datafusion-wasm-app`. In the future we may want to test the behavior of the WASM build using [`wasm-pack test`](https://rustwasm.github.io/wasm-pack/book/tutorials/npm-browser-packages/testing-your-project.html).

## Compatibility

The following DataFusion crates are verified to work in a wasm-pack environment using the default `wasm32-unknown-unknown` target:

- `datafusion-common` with default-features disabled to remove the `parquet` dependency (see below)
- `datafusion-expr`
- `datafusion-optimizer`
- `datafusion-physical-expr`
- `datafusion-sql`

The difficulty with getting the remaining DataFusion crates compiled to WASM is that they have non-optional dependencies on the [`parquet`](https://docs.rs/crate/parquet/) crate with its default features enabled. Several of the default parquet crate features require native dependencies that are not compatible with WASM, in particular the `lz4` and `zstd` features. If we can arrange our feature flags to make it possible to depend on parquet with these features disabled, then it should be possible to compile the core `datafusion` crate to WASM as well.

Contributor: @tustvold do you have any thoughts about finagling the parquet crate's dependencies so it can compile, by default, on wasm? Should we perhaps change datafusion to disable the parquet default features?

Contributor: IIRC it is the compression codecs that have issues with WASM; disabling these by default would, I think, be surprising for users. Further, I'm not sure how useful parquet support would be, given that only the InMemory object_store is supported on WASM, although I may have some time to look into this over the next couple of days.

Contributor Author: Yeah, I don't think we'd want DataFusion's default build to disable the default parquet features. But if we could arrange things so that depending on the datafusion core crate with default-features=false would either remove the parquet dependency altogether or disable the default parquet features, then I think we could get things at least compiling for wasm.

Contributor: Another possibility is to make parquet support itself entirely optional, given that not all DataFusion users want / need such support. I filed #7653 to track that.

I also filed #7652 to track the idea of compiling all the crates for wasm.

I gathered all the WASM-related tickets I could find here: #7651
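
The optional-parquet idea raised in this thread can be sketched on the Rust side as code gated on a Cargo feature. This is a minimal illustration under stated assumptions: the feature name `parquet` and both functions are hypothetical, and the format list is made up for the example; none of it is taken from DataFusion's actual API.

```rust
// Hypothetical sketch: the feature name `parquet` and both functions are
// illustrative assumptions, not DataFusion's actual API.

/// Returns true only when this crate was compiled with the assumed
/// `parquet` Cargo feature enabled.
pub fn parquet_enabled() -> bool {
    cfg!(feature = "parquet")
}

/// Lists the (illustrative) file formats available in this build.
pub fn available_formats() -> Vec<&'static str> {
    let mut formats = vec!["csv"];
    if parquet_enabled() {
        // Reached only in builds that enable the assumed `parquet` feature.
        formats.push("parquet");
    }
    formats
}
```

A build without the feature reports only `csv`; a WASM consumer would then depend on the crate with default features disabled so that the parquet-related code is left out.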

2 changes: 2 additions & 0 deletions datafusion/wasmtest/datafusion-wasm-app/.gitignore
@@ -0,0 +1,2 @@
node_modules
dist
68 changes: 68 additions & 0 deletions datafusion/wasmtest/datafusion-wasm-app/README.md
@@ -0,0 +1,68 @@
<div align="center">

<h1><code>create-wasm-app</code></h1>

<strong>An <code>npm init</code> template for kick starting a project that uses NPM packages containing Rust-generated WebAssembly and bundles them with Webpack.</strong>

<p>
<a href="https://travis-ci.org/rustwasm/create-wasm-app"><img src="https://img.shields.io/travis/rustwasm/create-wasm-app.svg?style=flat-square" alt="Build Status" /></a>
</p>

<h3>
<a href="#usage">Usage</a>
<span> | </span>
<a href="https://discordapp.com/channels/442252698964721669/443151097398296587">Chat</a>
</h3>

<sub>Built with 🦀🕸 by <a href="https://rustwasm.github.io/">The Rust and WebAssembly Working Group</a></sub>

</div>

## About

This template is designed for depending on NPM packages that contain
Rust-generated WebAssembly and using them to create a Website.

- Want to create an NPM package with Rust and WebAssembly? [Check out
  `wasm-pack-template`.](https://github.com/rustwasm/wasm-pack-template)
- Want to make a monorepo-style Website without publishing to NPM? Check out
  [`rust-webpack-template`](https://github.com/rustwasm/rust-webpack-template)
  and/or
  [`rust-parcel-template`](https://github.com/rustwasm/rust-parcel-template).

## 🚴 Usage

```
npm init wasm-app
```

## 🔋 Batteries Included

- `.gitignore`: ignores `node_modules`
- `LICENSE-APACHE` and `LICENSE-MIT`: most Rust projects are licensed this way, so these are included for you
- `README.md`: the file you are reading now!
- `index.html`: a bare bones html document that includes the webpack bundle
- `index.js`: example js file with a comment showing how to import and use a wasm pkg
- `package.json` and `package-lock.json`:
  - pulls in devDependencies for using webpack:
    - [`webpack`](https://www.npmjs.com/package/webpack)
    - [`webpack-cli`](https://www.npmjs.com/package/webpack-cli)
    - [`webpack-dev-server`](https://www.npmjs.com/package/webpack-dev-server)
  - defines a `start` script to run `webpack-dev-server`
- `webpack.config.js`: configuration file for bundling your js with webpack

## License

Licensed under either of

- Apache License, Version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)

at your option.

### Contribution

Unless you explicitly state otherwise, any contribution intentionally
submitted for inclusion in the work by you, as defined in the Apache-2.0
license, shall be dual licensed as above, without any additional terms or
conditions.
22 changes: 22 additions & 0 deletions datafusion/wasmtest/datafusion-wasm-app/bootstrap.js
@@ -0,0 +1,22 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

// A dependency graph that contains any wasm must all be imported
// asynchronously. This `bootstrap.js` file does the single async import, so
// that no one else needs to worry about it again.
import("./index.js")
.catch(e => console.error("Error importing `index.js`:", e));
12 changes: 12 additions & 0 deletions datafusion/wasmtest/datafusion-wasm-app/index.html
@@ -0,0 +1,12 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Hello wasm-pack!</title>
</head>
<body>
<h1>See console</h1>
<noscript>This page contains webassembly and javascript content, please enable javascript in your browser.</noscript>
<script src="bootstrap.js"></script>
</body>
</html>
20 changes: 20 additions & 0 deletions datafusion/wasmtest/datafusion-wasm-app/index.js
@@ -0,0 +1,20 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

import * as wasm from "datafusion-wasmtest";

wasm.try_datafusion();