Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add episode 51 #547

Merged
merged 1 commit into from
Oct 10, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
178 changes: 178 additions & 0 deletions _episodes/51.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,178 @@
---
layout: episode
title: "51: Trino cools off with PopSQL"
date: 2023-07-06
tags: trino popsql query editor collaboration
youtube_id: "w2HrwynI5fk"
wistia_id: "3r0jsnvbmm"
sections:
- title: "Intro"
desc: "Episode 51: Trino cools off with PopSQL"
time: 0
- title: "Trino releases"
desc: "423-427"
time: 108
- title: "Concept of the episode"
desc: "Introducing PopSQL"
time: 480
- title: "Concept of the episode"
desc: "Building a new Node adapter for Trino"
time: 682
- title: "Demo of the episode"
desc: "Demoing PopSQL"
time: 1464
- title: "Demo of the episode"
desc: "Query versioning and scheduling"
time: 1903
- title: "Pull request of the episode"
desc: "Trino Gateway release"
time: 3010
- title: "Rounding out"
desc: "Trino Summit and the SQL training series!"
time: 3197
---

## Hosts

* [Cole Bowden](https://www.linkedin.com/in/cole-m-bowden), Developer Advocate
at [Starburst](https://starburst.io)
* Manfred Moser, Director of Technical Content at
[Starburst](https://starburst.io)
([@simpligility](https://twitter.com/simpligility))

## Guests

* [Jake Peterson](https://www.linkedin.com/in/jakeptrsn/), Head of Customer
Success at [PopSQL](https://popsql.com/)
* Matthew Peveler, Software Engineer at [PopSQL](https://popsql.com/),
[MasterOdin](https://github.com/MasterOdin) on GitHub

## Releases 423-427

Official highlights from Martin Traverso:

[Trino 423](https://trino.io/docs/current/release/release-423.html)

* Schema evolution for nested fields
* Support for comments on materialized view columns
* Support for `CASCADE` option in `DROP SCHEMA` for Clickhouse, MariaDB, MySQL,
Oracle and SingleStore
* Various performance improvements

[Trino 424](https://trino.io/docs/current/release/release-424.html)

* Improved performance for JSON, CSV, text and related formats in Hive
* Support for `CASCADE` in `DROP SCHEMA` for PostgreSQL and Iceberg
* Improved coordinator CPU utilization for large clusters

[Trino 425](https://trino.io/docs/current/release/release-425.html)

* Improved performance of `GROUP BY`.
* Support for check constraints in `MERGE` for Delta Lake connector.
* Support for the Decimal128 in MongoDB connector.

[Trino 426](https://trino.io/docs/current/release/release-426.html)

* Support for `SET/RESET SESSION AUTHORIZATION`.
* Improved performance of aggregations over decimal values.
* Support for `TRUNCATE TABLE` in Delta Lake connector.
* Support for Databricks 13.3 LTS.

[Trino 427](https://trino.io/docs/current/release/release-427.html)

* Improved performance for `GROUP BY` and `DISTINCT`.
* Support for pushing down `UPDATE` statements into connectors.
* Support for reading Delta Lake tables with Deletion Vectors.
* Faster writing to Parquet files in Delta Lake and Iceberg.
* Support for querying tags in Iceberg.

## Concept of the episode: PopSQL

It may be familiar to some of our viewers to describe an environment where
key queries and dashboards are buried in someone's personal workspace, and you
have to go ask them directly every time you want to check on your metrics.
When you're running a world-class, highly-performant query engine like Trino and
investing time and resources into maintaining it, shouldn't you treat your
queries like a first-class, collaborative, versioned system, too?

[PopSQL](https://popsql.com/), a playful spin on the word popsicle, solves the
sadness that is disorganized and siloed insights by centralizing queries into a
platform that has versioning, security, and a suite of collaborative tools
comparable to Google Drive. Want to work with your teammate on a query? You can
open up the same editor and see the same thing. Want to see what that query
someone ran last week was to see how the new feature is doing? It's there. Have
a suggestion to improve something? Leave a comment. Realize your suggestion was
wrong and need to undo the change? You can view past versions of the query.

PopSQL and Trino make sense together. PopSQL provides a best-in-class interface
for organizing, collaborating, and working together on all of your SQL queries
across the business, and Trino handles running those queries at unparalleled
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hahaha .. unparalleled speed .. by parallel processing

speeds. They go hand-in-hand for treating your data and SQL analytics as first
class citizens. In today's episode, we'll be exploring what PopSQL is, how it
integrates with Trino, and how the engineers at PopSQL have done some cool
things with Trino to make the integration better than ever before. We'll start
with that last one, actually.

## Concept of the episode: A new Node.js adapter for Trino
mosabua marked this conversation as resolved.
Show resolved Hide resolved

Trino in the frontend is... a tricky thing. We can go ahead and admit that the
[Trino web UI](/docs/current/admin/web-interface.html) isn't going to win any
awards for design or functionality. And while a couple Node-based libraries
exist out there, including [presto-client-node](https://www.npmjs.com/package/presto-client)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We initially used this library, but in addition to the lack of streaming support, there's also some missing functionality (e.g. 50x errors throw an uncaught error) and some idiosyncrasies in the codebase (e.g. callbacks that have an err variable that's never defined), it's ended up not a great fit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add some of this @colebow ?

and [lento](https://github.com/vweevers/lento). But presto-client-node lacked
support for streaming and had some issues handling 500 errors, and lento doesn't
quite support Trino out of the box and only supports single streams, which
wasn't ideal for PopSQL's distributed architecture. So when PopSQl's engineers
went to build their frontend and integrate with Trino, what did they do? Build
their own adapter.

We'll talk about how it was implemented, what key features it unlocks, and why
it makes using PopSQL with Trino an even better experience.

## Demo of the episode: Using PopSQL with Trino

It's hard to write show notes for a demo, because you can't really experience
the demo by reading about what's happening. But as a surface-level overview,
we'll be going over:

* Setting up a connection
* The schema explorer
* The SQL editor
* Query scheduling
mosabua marked this conversation as resolved.
Show resolved Hide resolved

## PR of the episode: #57 on trino-gateway: Release version 3

Last week, the community officially released the
[trino-gateway](https://github.com/trinodb/trino-gateway), a proxy and load
balancer that enables large operations to run multiple Trino clusters in
harmony with each other to serve big queries and small queries alike. If you or
your organization have a need for more than one Trino cluster and want the
seamless experience of being able to connect to any of them through a single
interface, then check it out! It's the product of many months of effort and
should be a fantastic solution for running Trino at the absolute largest scales.

To learn more about it, you should check out
[the blog post announcing its first release.]({{site.baseurl}}/blog/2023/09/28/trino-gateway)

## Rounding out

Trino Summit, the biggest Trino event of the year, is coming up on the 13th and
14th of December, and like Trino Fest, it'll be fully virtual. If you'd like to
give a talk about anything related to Trino, we're looking for speakers now.
[Submit your talk here!](https://sessionize.com/trino-summit-2023/) If you'd
rather attend, you can also
[go register to attend now](https://www.starburst.io/info/trinosummit2023/?utm_source=trino&utm_medium=website&utm_campaign=NORAM-FY24-Q4-EV-Trino-Summit-2023&utm_content=tcb).

mosabua marked this conversation as resolved.
Show resolved Hide resolved
Prior to Trino Summit, if you'd like to learn about SQL from the absolute
experts, we've also announced the [Trino Training Series](/blog/2023/09/27/training-series)
colebow marked this conversation as resolved.
Show resolved Hide resolved
that we'll be running as a buildup to the summit. Register now and look forward
to four great sessions starting from the ground up and ending with some key
tricks and Trino specifics that even a seasoned SQL veteran may not know about.

If you want to learn more about Trino, get the definitive guide from
O'Reilly. You can download
[the free PDF](https://www.starburst.io/info/oreilly-trino-guide/) or
buy the book online.

Music for the show is from the [Megaman 6 Game Play album by Krzysztof
Slowikowski](https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp).