Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: added D1 integration #246

Closed
wants to merge 8 commits into from
Closed

Conversation

KernelFreeze
Copy link

@KernelFreeze KernelFreeze commented Nov 21, 2022

Added a Cloudflare D1 integration using wasm_bindgen.

Pending work:

  • Add tests to Worker Sandbox.
  • Add meta field.
  • Cleanup.

Copy link
Collaborator

@zebp zebp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a really good start, but I feel that it won't be rust-y enough if we just return JsValues. I think the path forward is to create a D1Deserializer that implements Deserializer. This guide by the Serde developers should be super useful in implementing that, but it still might not be trivial. If you'd like me write that deserializer I'm more than happy to get this PR landed :)

worker/src/d1.rs Outdated
self.0.error()
}

pub fn results(&self) -> Vec<JsValue> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should try to avoid returning regular JsValues where possible. Perhaps we can use a serde Deserializer with D1's implicit type conversion.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea

worker/src/d1.rs Outdated
Ok(D1Result(result.into()))
}

pub async fn raw(&self) -> Result<Vec<JsValue>> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing, let's avoid JsValue.

worker/src/d1.rs Outdated
Ok(())
}

pub async fn first(&self, col_name: Option<&str>) -> Result<JsValue> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid JsValue.

@KernelFreeze
Copy link
Author

I applied your suggestion using Cloudflare's serde_wasm_bindgen, is that ok?

@zebp
Copy link
Collaborator

zebp commented Nov 27, 2022

I applied your suggestion using Cloudflare's serde_wasm_bindgen, is that ok?

Yeah that seems fine! When you think this PR is ready, remove the draft status and give me a ping.

@KernelFreeze
Copy link
Author

I'm now working on improving the error handling. Some users reported to me that they have problems with some queries and they can't know what is the exact problem because right now it only returns "D1Error".

pub struct D1PreparedStatement(D1PreparedStatementSys);

impl D1PreparedStatement {
pub fn bind<T>(&self, value: &T) -> Result<()>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be some variation of pub fn bind<T>(&self, value: &T) -> Result<Result<D1PreparedStatement, JsValue>>? Please correct me if I'm wrong but it looks like we're throwing away the returned statement / error?

Copy link
Contributor

@FlareLine FlareLine Dec 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally replacing JSValue with a more appropriate 'Rust-y' type.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I need to rework the error handling

@FlareLine
Copy link
Contributor

FlareLine commented Dec 19, 2022

Just playing around with this one (as I'm extremely interested in getting in across the line):

Looks like I'm having trouble using parameterized queries. A simple code example:

// ...
.get_async("/endpoint/:id", |_, ctx| async move {
let default = String::from("A");
let id = ctx.param("id").unwrap_or(&default).clone();
let d1 = ctx.env.d1("mydb")?;
let stmt = match d1
  .prepare("SELECT * FROM things WHERE thing_id = ?1")
  .bind(&id)
  {
    Ok(r) => r,
    Err(e) => return Response::error(format!("{:?}", e), 500),
  };

#[derive(Serialize, Deserialize)]
struct Thing {
  thing_id: String,
  desc: String,
  num: u16,
}

Using the above code (and some modifications to the workers create to allow for deeper debugging provides me with this error, irrespective of calls to run()/all()/raw():

D1_ERROR
Error: D1_ERROR
at D1Database._send (d1-beta-facade.entry.js:682:13)
at async D1PreparedStatement.all (d1-beta-facade.entry.js:720:7))")

Using the same structure but with a non-parameterized query doesn't seem to error in the same way

.prepare("SELECT * FROM things WHERE thing_id = 'A'")
.bind(&id)

seems to return the expected result of

> 'Hello world!'

I've taken a look at serde_wasm_bindgen and it doesn't look like there's anything out of the ordinary here. I've also tried several other different commands / structures and the errors I'm getting are super inconsistent.

The errors provided by the binding with JS seem really obscure 🤔

@Alex1607
Copy link

Having the same problem as @FlareLine when trying to bind a parameter.
Funnily enough, the binding works fine when I added quotes around it. Like this: [...] WHERE something = "?";
However, it doesn't work, obviously. I just found that a bit strange.

@FlareLine
Copy link
Contributor

It does appear that most of the unexpected errors occur when you attempt to use parameterized queries. Ruling out all known issues I still get errors when binding a simple string to a parameter in a query.

To get around this issue, I could use simple interpolation instead of parameter binding in my query string, as below:

.get_async("/db/:id", |_, ctx| async move {
  let default = String::from("A");
  let id = ctx.param("id").unwrap_or(&default);
  let d1 = ctx.env.d1("my-db")?;
  let query = format!("SELECT * FROM things WHERE thing_id = '{}'", id);
  let stmt = d1.prepare(&query);
  let result = stmt.all().await?;
  let things = result.results::<Thing>()?;
  Response::from_json::<Vec<Thing>>(&things)
})
.run(request, env)
.await

But this opens up the application to SQLi vulnerabilities and is a 'hacky' workaround.

Upon attempting to diagnose further, it looks like there's a lack of information coming back from the wasm_bindgen call - I'm way out of my depth to know how to diagnose past the Rust code, but does anyone have any doco on the process?

@KianNH KianNH mentioned this pull request Jan 5, 2023
1 task
@FlareLine
Copy link
Contributor

Documenting my findings for reference

I've forked workers-rs and wrangler locally to allow me to debug and make changes to the underlying JS.

I've changed a significant amount of the error handling within the local wrangler fork, in the D1 JS facade (d1-beta-facade.js) to facilitate better error visibility during debugging. This isn't something that we'd do in the JS layer to allow for this Rust D1 functionality, but we would have to work out the binding process to allow this information to be bound and discoverable at the Rust layer.

I've worked out that one of the underlying errors is actually a SQL 9008 Invalid type error as per this statement and response combination:

  • JS query payload: "{\"sql\":\"SELECT * FROM things WHERE thing_id = ?1\",\"params\":[[\"A\"]]}"
  • JS query response: {\"error\":\"ERROR 9008: SQL query error: Invalid type\",\"success\":false,\"serve...

From this, I've noticed that the params field actually contains a nested array? I'm not sure if this is intentional, but performing that same query without using binding results in a successful response.

I'll continue debugging here to see if I can get a consistent pattern working.

#[derive(Debug, Clone, PartialEq, Eq)]
pub type D1PreparedStatement;

#[wasm_bindgen(structural, method, js_class=D1PreparedStatement, js_name=bind, catch)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes one of the issues documented in #246 (comment). 😄

Suggested change
#[wasm_bindgen(structural, method, js_class=D1PreparedStatement, js_name=bind, catch)]
#[wasm_bindgen(structural, method, catch, variadic, js_class=D1PreparedStatement, js_name=bind)]

T: serde::ser::Serialize + ?Sized,
{
let value = serde_wasm_bindgen::to_value(value)?;
let array = Array::of1(&value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will most likely cause issues documented in #246 (comment). We should allow bind<T> to take in an Iterator of T and convert this to an Array instance for use in our self.0.bind(array) call. Despite the variadic fix in the aforementioned comment, I can still only bind a single parameter to my query, and any attempts to call bind again in some chained fashion will fail.

@FlareLine
Copy link
Contributor

@KernelFreeze are you still happy to work on this PR? I'd be keen to get this across the line myself - happy to attribute your contributions thus far? 😄

@KernelFreeze
Copy link
Author

KernelFreeze commented Jan 19, 2023

@KernelFreeze are you still happy to work on this PR? I'd be keen to get this across the line myself - happy to attribute your contributions thus far? 😄

Hey! I don't have too much time right now to finish this, so sure, go ahead ^^ If you want to, I can give you access to my repo or if you prefer you can open your own PR.

@FlareLine
Copy link
Contributor

@zebp @Alex1607 I've created #270 to supersede this PR as per guidance from @KernelFreeze.

@Alex1607
Copy link

@zebp @Alex1607 I've created #270 to supersede this PR as per guidance from @KernelFreeze.

Thank you! I will try it out as soon as I got some time :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants