feat(airflow): Circuit breaker and python api for Assertion and Operation #5196

Merged: 18 commits, Jul 13, 2022

treff7es
Contributor

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

github-actions bot commented Jun 17, 2022

Unit Test Results (build & test)

401 tests  +12   401 ✔️ +12   9m 22s ⏱️ - 1m 59s
  96 suites +  4       0 💤 ±  0 
  96 files   +  4       0 ±  0 

Results for commit 063895a. ± Comparison against base commit 9f2b3b9.

♻️ This comment has been updated with latest results.

github-actions bot commented Jun 17, 2022

Unit Test Results (metadata ingestion)

       8 files  ±  0         8 suites  ±0   1h 11m 12s ⏱️ - 7m 34s
   598 tests +22     595 ✔️ +24    3 💤 ±0  0  - 2 
1 134 runs  +44  1 090 ✔️ +45  44 💤 +1  0  - 2 

Results for commit 063895a. ± Comparison against base commit 9f2b3b9.

♻️ This comment has been updated with latest results.

task = context["ti"].task
for outlet in task._outlets:
    print(f"Reporting insert operation for {outlet.urn}")
    reporter.report_operation(
Collaborator

This is just an example, right?

Collaborator

Because the row count is just hardcoded

)

# New Datahub Operation Circuit Breaker Sensor
pet_profiles_operation_sensor = DatahubOperationCircuitBreakerSensor(
Collaborator

Comment for everything:

Going forward we should never use Datahub as the variable name (always DataHub)

Collaborator

Are we skipping this one?

on_success_callback=report_operation,
)

# NEW RUNNING GE ASSERTION
Collaborator

Why all caps?

time_delta=datetime.timedelta(days=1),
)

# NEW ASSERTION OPERATOR
Collaborator

same



class AssertionCircuitBreakerConfig(CircuitBreakerConfig):
    check_last_assertion_time: bool = Field(
Collaborator

this name is a bit confusing...

Collaborator

(no specific suggestions yet)

Collaborator

Check

verify_after_last_update?


class AssertionCircuitBreaker(AbstractCircuitBreaker):
    r"""
    Datahub Assertion Circuit Breaker
Collaborator

DataHub

result: bool = True
assertion_last_states: Dict[str, AssertionResult] = {}
for assertion in assertions:
    for run_event in assertion["runEvents"]["runEvents"]:
Collaborator

We may want to ensure there is a runEvent for the assertion.

It's possible to have 0 run events

Collaborator

(make sure to test this against URNs with no assertions defined at all)
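
A minimal sketch of the guard suggested above, not the PR's actual code. It assumes the nested `assertion["runEvents"]["runEvents"]` shape from the snippet under review; the `urn` and `timestampMillis` keys and the function name `latest_run_events` are hypothetical and used only for illustration. An assertion with zero run events maps to `None` rather than raising, and an empty assertion list yields an empty result.

```python
from typing import Any, Dict, List, Optional


def latest_run_events(
    assertions: List[Dict[str, Any]],
) -> Dict[str, Optional[Dict[str, Any]]]:
    """Map each assertion URN to its most recent run event, or None if it never ran."""
    latest: Dict[str, Optional[Dict[str, Any]]] = {}
    for assertion in assertions:
        # Tolerate a missing or null "runEvents" wrapper as well as an empty list.
        run_events = (assertion.get("runEvents") or {}).get("runEvents") or []
        # "urn" and "timestampMillis" are assumed field names for this sketch.
        latest[assertion["urn"]] = max(
            run_events, key=lambda event: event["timestampMillis"], default=None
        )
    return latest
```

Callers can then treat a `None` entry as "no result yet" and decide explicitly whether that should trip the circuit breaker.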


for assertion_urn, last_assertion in assertion_last_states.items():
    if last_assertion.state == "FAILURE":
        print(f"Runevent: {last_assertion.run_event}")
Collaborator

Do we need prints?

for assertion_urn, last_assertion in assertion_last_states.items():
    if last_assertion.state == "FAILURE":
        print(f"Runevent: {last_assertion.run_event}")
        print(
Collaborator

Do we use a logger typically or raw print?
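
For reference, a sketch of what the logger-based variant could look like, using only the standard `logging` module. The function name `any_assertion_failed` is hypothetical, and the input is simplified to a mapping from assertion URN to a state string; the PR's `AssertionResult` objects are not modelled here.

```python
import logging
from typing import Dict

logger = logging.getLogger(__name__)


def any_assertion_failed(last_states: Dict[str, str]) -> bool:
    """Return True if any assertion's last state is FAILURE, logging instead of printing."""
    failed = [urn for urn, state in last_states.items() if state == "FAILURE"]
    for urn in failed:
        # Lazy %-style arguments avoid formatting the message when the level is disabled.
        logger.warning("Assertion %s failed on its last run", urn)
    return bool(failed)
```

With a module-level logger, the Airflow task log and plain CLI usage can both control verbosity through the usual logging configuration.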

datahub_token: Optional[str] = None
timeout: Optional[int] = None


Collaborator

Minor: let's add a class comment!


class OperationCircuitBreaker(AbstractCircuitBreaker):
    r"""
    Datahub Operation Circuit Breaker
Collaborator

DataHub

    operation_type: Optional[str] = None,
) -> bool:
    r"""
    Checks if the circuit breaker is active
Collaborator

Great docs!

payload: Optional[Dict] = None
stats: Optional[Dict] = None
start_time: Optional[datetime] = None
Collaborator

Why is this in here? Please remove


class DatahubAssertionOperator(BaseOperator):
    r"""
    Datahub Assertion Circuit Breaker Operator.
Collaborator

DataHub

Collaborator

skipping?

from datahub_provider.hooks.datahub import DatahubRestHook


class DatahubAssertionOperator(BaseOperator):
Collaborator

DataHub

from datahub_provider.hooks.datahub import DatahubRestHook


class DatahubAssertionSensor(BaseSensorOperator):
Collaborator

same

check_last_assertion_time=check_last_assertion_time,
time_delta=time_delta,
)
self.circuit_breaker = AssertionCircuitBreaker(config=config)
Collaborator

Awesome reuse!!!


class DatahubOperationCircuitBreakerOperator(BaseSensorOperator):
    r"""
    Datahub Operation Circuit Breaker Operator.
Collaborator

Maybe add a sentence about the cases in which you would want to use this!

class DatahubOperationCircuitBreakerSensor(BaseSensorOperator):
    r"""
    Datahub Operation Circuit Breaker Sensor.

Collaborator

Maybe add one sentence about when you'd want to use this!

@@ -0,0 +1,1313 @@
[
{
"auditHeader": null,
Collaborator

Remove?

@@ -117,6 +117,75 @@ def test_bq_usage_source(pytestconfig, tmp_path):
)


@freeze_time(FROZEN_TIME)
def test_bq_usage_source_with_read_events(pytestconfig, tmp_path):
Collaborator

Remove?

data = json.load(f)
mock_gql_client.side_effect = [lastUpdatedResponseAfterLastAssertion, data]

config = AssertionCircuitBreakerConfig(datahub_host="dummy")
Collaborator

Nice test!

import json
from unittest.mock import patch

from datahub.api.circuit_breaker import (
Collaborator

I noticed we are not testing Operation circuit breaker?

Let's try to achieve testing parity between the two.

Collaborator

jjoyce0510 left a comment

There seem to be some extra changes in this PR. Overall, the content looks good. A few meta points:

  1. For new code, use DataHub instead of Datahub (even though some existing APIs use Datahub)
  2. We should have parity in testing between Assertions + Operations circuit breaker
  3. We need an example doc about using these APIs, which can point to or embed the examples you've created for longtailcompanions

@@ -0,0 +1,8 @@
from datahub.api.circuit_breaker.assertion_circuit_breaker import (
    AssertionCircuitBreaker,
    AssertionCircuitBreakerConfig,
Collaborator

Nice extraction of the api!

)
return True
elif last_assertion.state == "SUCCESS":
    print(f"Found successful assertion: {assertion_urn}")
Collaborator

Do we need this print?

status="COMPLETE",
)

if self._check_if_assertion_failed(
Collaborator

Nice!

self.log.info(f"Checking if dataset {self.urn} is ready to be consumed")
ret = self.circuit_breaker.is_circuit_breaker_active(urn=urn)
if ret:
    print(f"Dataset {self.urn} is not in consumable state")
Collaborator

Sometimes we are using prints and other times logs -- is there an intentional pattern going on here?

    source_type=self.source_type,
)
if ret:
    print(f"Dataset {self.urn} is not in consumable state")
Collaborator

Why both print AND raise? is one not enough?
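
One way to address this, sketched under assumptions: carry the message in the exception itself and drop the separate print, so the text appears once in the task log via the traceback. `DatasetNotConsumableError` and `fail_if_circuit_breaker_active` are hypothetical names for illustration; the PR's operator may raise a different exception type.

```python
class DatasetNotConsumableError(Exception):
    """Hypothetical stand-in for whatever exception the operator raises."""


def fail_if_circuit_breaker_active(urn: str, active: bool) -> None:
    """Raise with a descriptive message when the circuit breaker is active."""
    if active:
        # The message travels with the exception, so no extra print is needed.
        raise DatasetNotConsumableError(f"Dataset {urn} is not in consumable state")
```

If an additional log line is still wanted, a single `self.log` call before the raise would keep the output consistent with the rest of the operator.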

Collaborator

jjoyce0510 left a comment

A few remaining comments but all-in-all looks good to me

jjoyce0510 merged commit 4334248 into datahub-project:master on Jul 13, 2022
Santhin pushed a commit to Santhin/datahub that referenced this pull request Jul 26, 2022
maggiehays pushed a commit to maggiehays/datahub that referenced this pull request Aug 1, 2022
neojunjie added a commit to aqden/datahub that referenced this pull request Aug 29, 2022

* fix(quickstart): failure on a path not being present (datahub-project#5554)

* fix(dbt): fix issue of assertion error when stateful ingestion is used with dbt tests (datahub-project#5540)

* fix(dbt): fix issue of dbt stateful ingestion with tests

Co-authored-by: MugdhaHardikar-GSLab <[email protected]>
Co-authored-by: MohdSiddique Bagwan <[email protected]>
Co-authored-by: Ravindra Lanka <[email protected]>

* feat(ui): Batch add & remove Owners to assets via the UI (datahub-project#5552)

* feat(ingestion) Update managed ingestion scheduler to be easier to use (datahub-project#5559)

* fix(ingestion): correct trino datatype handling (datahub-project#5541)

Co-authored-by: Ravindra Lanka <[email protected]>

* feat(ingest) Allow ingestion of Elasticsearch index template (datahub-project#5444)


Co-authored-by: Ravindra Lanka <[email protected]>

* fix(ingest): fix some typos and logging issues (datahub-project#5564)

* feat(transformers): Add domain transformer for dataset (datahub-project#5456)

Co-authored-by: MohdSiddique Bagwan <[email protected]>

* chore(0.8.42): update breaking changes doc (datahub-project#5563)

* fix(ingest): activate mypy support for ParamSpec typing annotation (datahub-project#5551)

* (chore): upgrading ingestion to 0.8.42 (datahub-project#5562)

* fix(gms): ensure directory is present (datahub-project#5568)

* fix(ci): flaky smoke test fix (datahub-project#5569)

* fix(gms): missing directory for gms (datahub-project#5570)

* chore(build): tweak stale issue timing (datahub-project#5571)

* feat(ui): Batch set & unset Domain for assets via the UI (datahub-project#5560)

* extending assertion std model (datahub-project#5575)

* feat(ui): Support batch deprecation from the UI (Batch actions part 6/7) (datahub-project#5572)

* feat(graphql): add MutableTypeBatchResolver (datahub-project#4976)

* feat(ingestion) Implement secrets in new managed ingestion form (datahub-project#5574)

* fix(ui): Fixing batch set domains bug (datahub-project#5580)

* chore(gradle): update node version for docs site (datahub-project#5581)

* feat(test): add read-only smoke tests (datahub-project#5558)

* feat(ingestion) Add Save & Run button to managed ingestion builder (datahub-project#5579)

* fix(ingest): handle when current server version is unavailable (datahub-project#5547)

* feat(ingest): dbt - control over emitting test_results, test_definitions, etc. (datahub-project#5328)

Co-authored-by: Piotr Sierkin <[email protected]>
Co-authored-by: Shirshanka Das <[email protected]>

* feat(datahub-client): add java file emitter (datahub-project#5578)

Co-authored-by: Shirshanka Das <[email protected]>

* feat(ingest): infer aspectName from aspect type in MCP (datahub-project#5566)

* fix(ingest): sql-common - db2, snowflake bug fixes to extract table descriptions (datahub-project#5526)

Co-authored-by: Harshal Sheth <[email protected]>
Co-authored-by: Shirshanka Das <[email protected]>

* fix(ingest): moving delta-lake connector to be 3.7+ only (datahub-project#5584)

* feat(ingest): delta-lake - extract table history into operation aspect (datahub-project#5277)

Co-authored-by: Shirshanka Das <[email protected]>

* fix apache ranger plugin readme file rendering (datahub-project#5585)

* feat(ui): make container description searchable and have description show up in results (datahub-project#5586)

* fix(groups): fix user, search, and preview group membership to be fetched for both external and native group memberships (datahub-project#5587)

* feat(ingest): power-bi - make ownership ingestion optional (datahub-project#5335)


Co-authored-by: MohdSiddique Bagwan <[email protected]>
Co-authored-by: Harshal Sheth <[email protected]>

* Expose catalog_name in athena.py (datahub-project#5548)

* expose catalog_name to the sql alchemy uri that is passed into pyathena

Co-authored-by: Ravindra Lanka <[email protected]>
Co-authored-by: Shirshanka Das <[email protected]>

* Fix profiling when using {table}. (datahub-project#5531)

* profiling fix for when using {table}

Co-authored-by: Shirshanka Das <[email protected]>
Co-authored-by: Ravindra Lanka <[email protected]>

* feat(ui): Support batch deleting from ui (datahub-project#5582)

* feat(ingest): clickhouse - add metadata modification time and data size (datahub-project#5330)

Co-authored-by: Ravindra Lanka <[email protected]>

* feat(ui): Add rich UI ingestion run summary (datahub-project#5577)

* fix(ci): smoke test less flaky, add src, dev dep in smoke image (datahub-project#5594)

* updated mock custom to pass the test suite

* added env for mysql-setup for smoketest to pass

* added env for mysql-setup for smoketest to pass

* added env for mysql-setup for smoketest to pass

* push to heroku repo instead of linkedin

Co-authored-by: Aseem Bansal <[email protected]>
Co-authored-by: Tamas Nemeth <[email protected]>
Co-authored-by: Michael A. Schlosser <[email protected]>
Co-authored-by: John Joyce <[email protected]>
Co-authored-by: Pedro Silva <[email protected]>
Co-authored-by: Chris Collins <[email protected]>
Co-authored-by: Mugdha Hardikar <[email protected]>
Co-authored-by: Shirshanka Das <[email protected]>
Co-authored-by: Chris Collins <[email protected]>
Co-authored-by: Maggie Hays <[email protected]>
Co-authored-by: Ankit keshari <[email protected]>
Co-authored-by: Gabe Lyons <[email protected]>
Co-authored-by: liyuhui666 <[email protected]>
Co-authored-by: Tengis Batsaikhan <[email protected]>
Co-authored-by: Chris Collins <[email protected]>
Co-authored-by: Pedro Silva <[email protected]>
Co-authored-by: Mayuri Nehate <[email protected]>
Co-authored-by: dougpm <[email protected]>
Co-authored-by: Vincent Koc <[email protected]>
Co-authored-by: Aditya Radhakrishnan <[email protected]>
Co-authored-by: Amanda Ng <[email protected]>
Co-authored-by: Chris Collins <[email protected]>
Co-authored-by: Justin Marozas <[email protected]>
Co-authored-by: Sergio Gómez Villamor <[email protected]>
Co-authored-by: Navin Sharma <[email protected]>
Co-authored-by: Aezo <[email protected]>
Co-authored-by: abiwill <[email protected]>
Co-authored-by: Felix Lüdin <[email protected]>
Co-authored-by: Harshal Sheth <[email protected]>
Co-authored-by: Chris Collins <[email protected]>
Co-authored-by: leifker <[email protected]>
Co-authored-by: Alexey Kravtsov <[email protected]>
Co-authored-by: Tim Costa <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Guillaume Gardey <[email protected]>
Co-authored-by: Vishal Shah <[email protected]>
Co-authored-by: mohdsiddique <[email protected]>
Co-authored-by: Salih Can <[email protected]>
Co-authored-by: RyanHolstien <[email protected]>
Co-authored-by: Skyler Sinclair <[email protected]>
Co-authored-by: Dan Andreescu <[email protected]>
Co-authored-by: MohdSiddique Bagwan <[email protected]>
Co-authored-by: Ravindra Lanka <[email protected]>
Co-authored-by: Marcin Szymański <[email protected]>
Co-authored-by: xiphl <[email protected]>
Co-authored-by: NoahFournier <[email protected]>
Co-authored-by: Piotr Sierkin <[email protected]>
Co-authored-by: Jordan Wolinsky <[email protected]>