Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A way to embed the results of a query in a page #2

Open
simonw opened this issue Sep 22, 2021 · 12 comments
Open

A way to embed the results of a query in a page #2

simonw opened this issue Sep 22, 2021 · 12 comments
Labels
enhancement New feature or request

Comments

@simonw
Copy link
Owner

simonw commented Sep 22, 2021

This is the most interesting feature of this tool - without which it's just a very simple wiki. I want to be able to embed the results of a SQL query (or filtered table page) directly into a page of notes.

@simonw simonw added the enhancement New feature or request label Sep 22, 2021
@simonw simonw changed the title Embed results of a query A way to embed the results of a query in a page Sep 22, 2021
@simonw
Copy link
Owner Author

simonw commented Sep 22, 2021

There are two ways this could work:

  • Server-side: the Python code finds those references in the Markdown, executes the queries and includes the results in the HTML
  • Client-side: JavaScript executes after the page has loaded, scans for references and uses fetch() calls to fetch the data

My concern with doing this server-side is that a page with a dozen expensive queries could take a very long time to render.

Doing them client-side avoids this problem: if some of the queries are cheap they'll display quickly, and long-running ones won't block the initial page load.

Interesting twist: with client-side, it could even be possible to embed URLs to queries that live in an entirely separate Datasette instance - the JavaScript could then fetch data (assuming CORS is enabled). That's pretty fun!

(Technically a server-side implementation could do this too, but that's even more likely to cause poor initial page load performance.)

@simonw
Copy link
Owner Author

simonw commented Sep 22, 2021

There is a third route: I could attempt to run the queries server-side with a strict time limit, and then cancel them and return the page (with JavaScript to try running them client-side) if the queries don't execute fast enough.

This is an interesting option, but I think it's probably not worth the added complexity over running everything client-side.

@simonw
Copy link
Owner Author

simonw commented Sep 22, 2021

So I've decided that the first implementation of this will be client-side. I can reuse some of the code from https://github.com/simonw/datasette-search-all

Next question: what should the Markdown format for these be?

I'm inclined to keep this as simple as possible and go with pasted URLs.

@simonw
Copy link
Owner Author

simonw commented Nov 27, 2021

I built a prototype of a <datasette-table url="..."> web component using Lit 2:

<script type="module">
  const { html, LitElement, css } = await import(
    "https://cdn.skypack.dev/[email protected]?min"
  );
  class DatasetteTable extends LitElement {
    static get styles() {
      return css`
        :host {
          font-family: Helvetica, sans-serif;
        }
        div {
          overflow: auto;
          width: 100%;
        }
        th {
          padding-right: 1em;
          text-align: left;
        }
        td {
          border-top: 1px solid #aaa;
          border-right: 1px solid #eee;
          padding: 5px;
          vertical-align: top;
          white-space: pre-wrap;
          line-height: 1.2;
        }
      `;
    }

    static get properties() {
      return {
        url: { type: String },
        data: { attribute: false },
      };
    }

    constructor() {
      super();
    }

    connectedCallback() {
      super.connectedCallback();
      this.fetchData();
    }

    fetchData() {
      fetch(this.url)
        .then((response) => {
          if (!response.ok) {
            throw new Error("Network error");
          }
          return response.json();
        })
        .then((data) => {
          this.data = data;
        })
        .catch((error) => {
          console.error("Error:", error);
        });
    }

    render() {
      if (!this.data) {
        return html` <h4>Loading...</h4> `;
      }
      return html` <div>
        <h3>
          <a href="${this.url.replace(".json", "")}"> ${this.data.table} </a>
        </h3>
        <table>
          <thead>
            <tr>
              ${this.data.columns.map((column) => html`<th>${column}</th>`)}
            </tr>
          </thead>
          <tbody>
            ${this.data.rows.map(
              (row) =>
                html`<tr>
                  ${row.map((cell) => html`<td>${cell}</td>`)}
                </tr>`
            )}
          </tbody>
        </table>
      </div>`;
    }
  }
  window.customElements.define("datasette-table", DatasetteTable);
</script>

<datasette-table
  url="https://covid-19.datasettes.com/covid/ny_times_us_counties.json?_size=1000&county=Greene&state=Mississippi"
></datasette-table>

<datasette-table
  url="https://global-power-plants.datasettes.com/global-power-plants/global-power-plants.json"
></datasette-table>

@simonw
Copy link
Owner Author

simonw commented Nov 27, 2021

Here's an interesting challenge: given a wiki page with a bunch of embedded URLs, how can I detect which ones are URLs to Datasette query or table pages and hence should be enhanced with that web component?

Two cases to consider: URLs on the current site, and URLs to Datasette instances on other sites (as shown in the example above).

I'm going to ignore the URLs-on-other-sites case for the moment.

So, given a bunch of URLs to this site, how can JavaScript spot the following?

  • /database/table
  • /database/table?options=
  • /database?sql=...
  • /database/name-of-canned-query

One trick would be to do this part server-side: parse the page looking for URLs, then for each of those URLs do a check against Datasette's own URL routing to see if it matched a known table or canned query or database.

This feels potentially slow and error-prone though.

A much more interesting option: what if either the server-side code or the JavaScript code could make HEAD requests against those URLs, and Datasette could follow a fast codepath to return an HTTP header indicating if that page corresponds to a query or not (and maybe even returning the JSON URL that should be used to fetch the corresponding data)?

@simonw
Copy link
Owner Author

simonw commented Nov 27, 2021

How about a Link: ... rel="alternate" HTTP header?

Link: <url-to-json>; rel="alternate"; type="application/json+datasette"

Is this an OK thing to do? MIght need to check in with some standards-minded folk.

[UPDATE: apparently application/datasette+json is better here, see also the IANA media types list]

@simonw
Copy link
Owner Author

simonw commented Nov 27, 2021

It's a fun idea though so I'm going to work up a prototype.

simonw added a commit to simonw/datasette that referenced this issue Nov 27, 2021
@simonw
Copy link
Owner Author

simonw commented Nov 27, 2021

https://github.com/simonw/datasette/tree/e0a84691c2959f2d1d76948574c9c4a910c7556c is a prototype of Datasette returning the following HTTP header for a table page:

Link: http://127.0.0.1:8058/fixtures/compound_three_primary_keys.json; rel="alternate"; type="application/json+datasette"

It works with HEAD requests too (currently it still generates the entire page though, so this could be optimized):

~ % curl -I 'http://127.0.0.1:8058/fixtures/compound_three_primary_keys'
HTTP/1.1 200 OK
date: Sat, 27 Nov 2021 20:09:36 GMT
server: uvicorn
Link: http://127.0.0.1:8058/fixtures/compound_three_primary_keys.json; rel="alternate"; type="application/json+datasette"
Referrer-Policy: no-referrer
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: Authorization
Access-Control-Expose-Headers: Link
content-type: text/html; charset=utf-8
Transfer-Encoding: chunked

In JavaScript:

response = await fetch(
  "http://127.0.0.1:8058/fixtures/compound_three_primary_keys?_size=10",
  {
    method: "HEAD"
  }
)
response.headers.get("Link")

@simonw
Copy link
Owner Author

simonw commented Nov 27, 2021

Asked for feedback on the standards bit on Twitter: https://twitter.com/simonw/status/1464688600646184963

@simonw
Copy link
Owner Author

simonw commented Nov 28, 2021

I published the above code as my first ever npm package: https://github.com/simonw/datasette-table

@simonw
Copy link
Owner Author

simonw commented Dec 3, 2021

How about every time a user adds a new hyperlink to a page we do a one-time HEAD request against that URL to see if it has a Link rel Datasette header... and we store the result of that check in a database table! That way we only ever make one request per URL, and we can then serve up a class data-datasette= HTML attribute which the JavaScript can then use.

Could take this in interesting directions - any URL embedded in any page could be fetched by a background task to try and get the page title and social card image and suchlike.

simonw added a commit to simonw/datasette that referenced this issue Feb 2, 2022
Also added Access-Control-Expose-Headers: Link to --cors mode.

Closes #1533

Refs simonw/datasette-notebook#2

LL#	metadata.json.1
@simonw
Copy link
Owner Author

simonw commented Mar 24, 2022

Datasette 0.61 is now out which includes those new headers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant