Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[compute] map LIR to dataflow #29848

Merged
merged 17 commits into from
Nov 6, 2024

Conversation

mgree
Copy link
Contributor

@mgree mgree commented Oct 3, 2024

This PR introduces two new introspection sources and two new introspection views. These novel forms of introspection allow us to map LIR operators down to dataflow operators; we can now attribute existing introspection data about dataflows to LIR operators.

Introspection

The two new sources are ComputeLog sources that run per worker.

mz_introspection.mz_compute_dataflow_global_ids_per_worker

Maps dataflow identifiers to the global IDs used internally for things that get built as dataflows.

   name    | nullable | type  | comment
-----------+----------+-------+---------
 id        | f        | uint8 | dataflow ID
 worker_id | f        | uint8 |
 global_id | f        | text  |

mz_introspection.mz_compute_lir_mapping_per_worker

Tracks attribution information for LIR terms (in terms of FlatPlan).

       name        | nullable | type  | comment
-------------------+----------+-------+---------
 global_id         | f        | text  |
 lir_id            | f        | uint8 | AST node number
 worker_id         | f        | uint8 |
 operator          | f        | text  | rendered string
 parent_lir_id     | t        | uint8 | parent AST node number
 nesting           | f        | uint2 | nesting (used for indentation)
 operator_id_start | t        | uint8 | first dataflow operator (inclusive)
 operator_id_end   | t        | uint8 | last dataflow oeprator (exclusive)

Views

We use two introspection views to work with these per-worker sources. It ought to be the case that all workers agree about this metadata (though they may not agree on, say, the amount of memory a dataflow operator is using!).

So: these are just views that set worker_id = 0.

mz_introspection.mz_dataflow_global_ids

   name    | nullable | type  | comment
-----------+----------+-------+---------
 id        | f        | uint8 |
 global_id | f        | text  |

mz_introspection.mz_lir_mapping

       name        | nullable | type  | comment
-------------------+----------+-------+---------
 global_id         | f        | text  |
 lir_id            | f        | uint8 |
 operator          | f        | text  |
 parent_lir_id     | t        | uint8 |
 nesting           | f        | uint2 |
 operator_id_start | t        | uint8 |
 operator_id_end   | t        | uint8 |

Attributing to LIR

We can see a sample interaction as follows:

CREATE TABLE t(x INT NOT NULL, y INT, z TEXT);
CREATE VIEW v AS
  SELECT t1.x AS x, t1.z AS z1, t2.z AS z2
  FROM t AS t1, t AS t2
  WHERE t1.x = t2.y;
CREATE INDEX v_idx_x ON v(x);

\! sleep 1

SELECT global_id, lir_id, REPEAT(' ', MAX(nesting) * 2) || operator AS operator, SUM(duration_ns) AS duration, SUM(count) AS count
    FROM           mz_introspection.mz_lir_mapping mlm
         LEFT JOIN mz_introspection.mz_compute_operator_durations_histogram mcodh
         ON (mlm.operator_id_start <= mcodh.id AND mcodh.id < mlm.operator_id_end)
GROUP BY global_id, lir_id, operator
ORDER BY global_id, lir_id DESC;

which yields an output like:

 global_id | lir_id |          operator          | duration | count
-----------+--------+----------------------------+----------+-------
 u2        | 4      | Join::Differential 1 » 3   |  1261568 |    17
 u2        | 3      |   Arrange 2                |   466944 |    16
 u2        | 2      |     Get::Collection u1     |    69632 |     7
 u2        | 1      |   Arrange 0                |   417792 |    16
 u2        | 0      |     Get::Collection u1     |    73728 |     7
 u3        | 6      | Arrange 5                  |   454656 |    17
 u3        | 5      |   Get::PassArrangements u2 |          |

A place to store our bicycles

  • Should these be mz_internal or mz_introspection? It needs to be mz_introspection to pass tests.

  • Should I just do the indentation up front?

  • Should I track other metadata (e.g., parents/children of LIR nodes?) I added parent_lir_id to allow for more complex reconstructions.

Motivation

  • This PR adds a known-desirable feature.

The first step in attribution/plan profiling. https://github.com/MaterializeInc/database-issues/issues/6551

Tips for reviewer

I've tried to break things down so that each commit is sensible, but there's a little bit of back-and-forth in exactly what I store in the views.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@mgree mgree force-pushed the lir-to-dataflow-address-mapping branch 7 times, most recently from 1c15839 to 146a170 Compare October 16, 2024 15:15
@mgree mgree force-pushed the lir-to-dataflow-address-mapping branch 4 times, most recently from 002cd64 to a1447ef Compare October 23, 2024 18:07
@antiguru antiguru self-requested a review October 23, 2024 18:34
@sthm
Copy link
Contributor

sthm commented Oct 28, 2024

I'd need to play around with it to get a better intuition how this works in practice. But from the description above, it seems to provide a an easy way to map operator ids back to object ids in a way that preserves the hierarchy and dependencies of operators.

The example shows how this mapping can be used to add operator level statistics mz_compute_operator_durations_histogram back to something that looks like a query plan. From a quick skim of the documentation, it seems like it would also work for mz_arrangement_sizes and even more complicated queries like operator skew from dataflow troubleshooting, which is great.

@mgree mgree force-pushed the lir-to-dataflow-address-mapping branch 2 times, most recently from b42194c to d5ffe7b Compare October 28, 2024 19:38
@antiguru antiguru requested a review from teskje October 29, 2024 08:50
@mgree mgree force-pushed the lir-to-dataflow-address-mapping branch 13 times, most recently from 2eb9ff2 to a79f9e6 Compare October 31, 2024 01:08
@mgree mgree marked this pull request as ready for review October 31, 2024 01:44
@mgree mgree force-pushed the lir-to-dataflow-address-mapping branch 5 times, most recently from 7d11230 to 4ac784a Compare November 4, 2024 19:32
@mgree mgree force-pushed the lir-to-dataflow-address-mapping branch from 4ac784a to eeccbbf Compare November 4, 2024 19:33
@mgree mgree force-pushed the lir-to-dataflow-address-mapping branch from 67b5f26 to e94055a Compare November 4, 2024 20:28
@mgree mgree force-pushed the lir-to-dataflow-address-mapping branch from 21589bc to 1b111af Compare November 4, 2024 21:43
@mgree mgree requested a review from antiguru November 4, 2024 22:37
Copy link
Member

@antiguru antiguru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thank you! Left some inline comments.

src/compute/src/logging/compute.rs Outdated Show resolved Hide resolved
src/compute/src/logging/compute.rs Outdated Show resolved Hide resolved
src/compute/src/logging/compute.rs Outdated Show resolved Hide resolved
src/compute/src/render.rs Outdated Show resolved Hide resolved
src/compute/src/render.rs Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to include your 'Attribution to Lir' example here, without duration/count?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

@mgree mgree merged commit c094caf into MaterializeInc:main Nov 6, 2024
84 checks passed
@mgree mgree deleted the lir-to-dataflow-address-mapping branch November 6, 2024 18:14
def- added a commit to def-/materialize that referenced this pull request Nov 7, 2024
@mgree
Copy link
Contributor Author

mgree commented Dec 3, 2024

Turned back on in #30692.

@mgree mgree mentioned this pull request Dec 23, 2024
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants