Releases: datahub-project/datahub
[Known Issues] DataHub v0.9.4
Known Issues
This release includes a major version upgrade of our OIDC SSO library. The newer version of the library has an issue in how it interacts with OIDC providers, which we have addressed in v0.9.5. If your organization actively uses OIDC to manage user authentication, we recommend not upgrading to this version.
Important Release Notes
With this release, if you are using Neo4j as your graph implementation, you need to set:
GRAPH_SERVICE_DIFF_MODE_ENABLED=false
on GMS (or on the MAE Consumer in standalone mode).
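A deployment script could verify this setting before starting GMS. The sketch below is illustrative only: it assumes the graph backend is selected through a `GRAPH_SERVICE_IMPL` environment variable with the value `neo4j`, and `check_graph_diff_mode` is a hypothetical helper, not part of DataHub.

```python
import os
from typing import Mapping, Optional


def check_graph_diff_mode(env: Mapping[str, str]) -> Optional[str]:
    """Return a warning if Neo4j is in use and diff mode is not disabled.

    Assumption: the graph backend is chosen via GRAPH_SERVICE_IMPL=neo4j.
    """
    uses_neo4j = env.get("GRAPH_SERVICE_IMPL", "").lower() == "neo4j"
    diff_mode = env.get("GRAPH_SERVICE_DIFF_MODE_ENABLED", "").lower()
    if uses_neo4j and diff_mode != "false":
        return "Neo4j detected: set GRAPH_SERVICE_DIFF_MODE_ENABLED=false"
    return None


if __name__ == "__main__":
    # Check the environment of the current process (e.g. inside the GMS container).
    warning = check_graph_diff_mode(os.environ)
    print(warning or "Graph diff mode setting looks fine")
```

Running the check in the same environment as GMS (or the MAE Consumer in standalone mode) keeps it close to the process that actually reads the variable.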
What's Changed
- chore(): Updating default CLI version, update updating-datahub.md by @jjoyce0510 in #6590
- fix(ingest): profiling - Profiling failed if column cardinality threw an error by @treff7es in #6582
- fix(actions): add missing datahub-gms-protocol env var by @shirshanka in #6593
- fix(ingest): restrict snowflake-connector-python dependency by @mayurinehate in #6594
- feat(ingest/bigquery): avoid creating/deleting tables for profiling by @hsheth2 in #6578
- fix(ingest): unify emit interface by @hsheth2 in #6592
- fix(security): security version updates by @david-leifker in #6602
- docs: remove Kafka Streams from documentation by @maver1ck in #6596
- refactor(ui): Improving Kafka UI Ingestion Form, Create Domain, Create Secret Modals by @jjoyce0510 in #6588
- fix(ingest): clarify tableau auth error messages by @hsheth2 in #6600
- docs(graphql): fix deleteTest "Create"->"Delete" by @nickwu241 in #6574
- fix(gms/startup): remove set -x from start.sh by @timcosta in #6589
- feat(sql): Add SQL index on createdon field by @pedro93 in #6522
- feat(ml model): updating view of ml model feature list by @gabe-lyons in #6576
- fix(ingest/bigquery): ignore complex types from profiling by @treff7es in #6613
- feat(ingest): add external url for snowflake objects by @mayurinehate in #6580
- chore(ingest): bump and pin mypy by @hsheth2 in #6584
- fix(ingest): only require github_info for lookml and not looker by @hsheth2 in #6608
- docs(ingest): add airflow docs that use the PythonVirtualenvOperator by @hsheth2 in #6604
- fix(ui) Fix double scroll in embedded list search sections by @chriscollins3456 in #6618
- feat(ingest): print detailed GMS error messages by @djordje-mijatovic in #6519
- Townhall agenda wikimedia by @maggiehays in #6622
- fix(analytics): skip ListDomains if user cannot manage domains and have only one loading message by @aditya-radhakrishnan in #6624
- feat(quickstart): add support for passing thru env vars needed by Sla… by @shirshanka in #6591
- docs(actions): slack, teams by @shirshanka in #6632
- fix(logging): Remove lombok as source of slf4j-api by @david-leifker in #6616
- docs: add links from main README to slack, teams actions by @shirshanka in #6633
- feat(ingest): Support config variable for specifying a direct privat… by @mayurinehate in #6609
- Add AWS Postgres Iam Auth jar to GMS by @syedzoherer in #6371
- feat(ingest/snowflake): support filtering by fully qualified schema_pattern by @mayurinehate in #6611
- feat(ingest/kafka-connect): support MongoSourceConnector by @frsann in #6416
- feat(graph) Add createdOn, createdActor, updatedOn, updatedActor to graph edges by @chriscollins3456 in #6615
- refactor(ui): Making improvements to UI ingestion forms, adding MySQL, Trino, Presto, MSSQL, MariaDB forms by @jjoyce0510 in #6607
- perf(ui-ingestion): cache on creation or deletion of ingestion sources to reduce latency by @aditya-radhakrishnan in #6647
- feat(ingest): add dummy data source for automated testing by @anshbansal in #6550
- docs(managed datahub): adding release notes for v0.1.70 by @anshbansal in #6655
- feat(gms): Pluggable Authentication & Authorization Framework by @mohdsiddique in #6634
- docs: move rfcs to separate repo by @laulpogan in #6621
- fix(ingest): fix lingering demo-data source issues by @hsheth2 in #6659
- feat(ingest): bigquery - Running lineage extraction after metadata extraction by @treff7es in #6653
- fix(ingest): issue deprecation warning correctly by @hsheth2 in #6623
- chore(ingest): remove feast-legacy by @hsheth2 in #6661
- fix(ingest/snowflake): support domains for snowflake schema containers by @hsheth2 in #6662
- build(deps): bump decode-uri-component from 0.2.0 to 0.2.2 in /datahub-web-react by @dependabot in #6617
- feat(ingest/dbt): add support for latest DBT version 1.3 by @MatthieuBlais in #6651
- docs: add languages to code highlighting by @hsheth2 in #5576
- docs(typo) Correct typo in domains.md by @maggiehays in #6667
- feat(gms): Enable auth-api publishing to maven by @mohdsiddique in #6671
- fix(ingest/powerbi-report-server): deprecate unused graphql config by @daha in #6630
- fix(docker): Fix datahub-frontend dockerfile by @jjoyce0510 in #6670
- fix(ingest): profiling - Changing profiling defaults by @treff7es in #6640
- feat(ci): add smoke test for domain mutation by @anshbansal in #6641
- fix(datahub-protobuf): fix missing httpclient dependency by @shirshanka in #6672
- feat(ingest): update snowflake docs, add simple validations by @mayurinehate in #6636
- fix(gms): DataHub Auth API java doc fix by @mohdsiddique in #6674
- feat(ingest): run profiler in more cardinality cases by @hsheth2 in #6397
- docs(search) update broken youtube link by @maggiehays in #6678
- docs(protobuf): update examples for protobuf by @david-leifker in #6681
- feat(ingest): support knowledge links in business glossary by @mohdsiddique in #6375
- fix(ingestion/vertica): support columns with timestamp precision by @inancdokurel in #6295
- feat(ingest): add timestamps for snowflake objects by @mayurinehate in #6570
- feat(onboarding): adds framework and some steps for onboarding steps UI by @aditya-radhakrishnan in #6462
- feat(ingest): use entry point for registering transformers by @Masterchen09 in #6628
- chore(ci): update base ingestion image requirements file by @anshbansal in #6687
- fix(ci): reduce warnings due to deprecated action by @anshbansal in #6686
- refactor(ui): Adding caching for users, groups, and roles by @jjoyce0510 in #6673
- fix(ci): revert confluent kafka in base image by @anshbansal in #6690
- fix(security): version bump to latest minor python image by @david-leifker in #6694
- docs(ingest/salesforce): list required permissions by @orlandine in #6610
- feat(ingest): bigquery - option to set on behalf project by @treff7es in #6660
- ci: stop commenting test results on PR by @hsheth2 in #6700
- fix(auth-api): Attempting to fix publish for auth-api by @jjoyce0510 in https:...
DataHub v0.9.3
Release Highlights
Important Release Notes
With this release, if you are using Neo4j as your graph implementation, you need to set:
GRAPH_SERVICE_DIFF_MODE_ENABLED=false
on GMS (or on the MAE Consumer in standalone mode).
User Experience
- Column Level Lineage Impact Analysis is live! Read more about it here
- You can now sort Dataset field names alphabetically - this is super handy for finding columns within wide datasets that may not have an easy-to-follow order by default
- New - an “Explore All” button on the home page, making it easier to jump into the search experience
- Plus! We now have a “Share” button on entity pages, making it easier for you to share DataHub links with others
- [Community Contribution] You can now assign the same user as different owner types - thanks for the contrib, @rtekal!
- [Community Contribution] You can now see recommendations for Recently Edited entities on the homepage! - thanks for the contrib, @CorentinDuhamel
Metadata Ingestion
- Snowflake Automated PII Classification is here! We’re eager for feedback on the utility of this feature - check out this guide, take it for a spin, and let us know what you think!
- NEW! dbt Cloud ingestion is ready for ya - check out the module details here
- We’ve simplified the configs required to add stateful ingestion to an ingestion source - check out the updated docs here
- Speaking of stateful ingestion, it’s now available with:
- Looker & LookML ingestion sources
- [Community Contribution] Container-level ingestion – thanks for the contrib, @wangsaisai!
Developer Experience
- [Community Contribution] For those of you deploying DataHub with Neo4j, we now support Lineage Impact Analysis via Neo4j multihop functionality. Thanks for the contrib, @djordje-mijatovic!
- We’ve loosened our SQLAlchemy dependencies to support Airflow 2.3+
What's Changed
- fix(spark-lineage): Smoke test fix + smoke test m1 support by @treff7es in #6372
- feat(ingest): supports MCEs in domain transformer by @hsheth2 in #6364
- feat(ingest): enable container stateful ingestion by @wangsaisai in #6343
- build(ingest): pin mypy version by @hsheth2 in #6391
- build: use acryl's gradle-avro-plugin by @hsheth2 in #6390
- fix(ingest): unity - add missing date type by @ms32035 in #6385
- fix(ingest): unity-catalog - Removing unneeded sqlalchemy dependency to fix install by @treff7es in #6379
- feat(ingest/tableau): re-authenticate if the token expires by @hsheth2 in #6380
- fix(ingest): use profiler config settings correctly by @hsheth2 in #6354
- fix(ingest): handle error when query returns no columns in snowflake lineage by @mayurinehate in #6404
- fix(ingest): fix missing snowflake lineage when table_pattern is set by @mayurinehate in #6410
- feat(ingest): loosen sqlalchemy dep & support airflow 2.3+ by @hsheth2 in #6204
- fix(ingest/s3): add status aspect for detected s3 datasets by @mayurinehate in #6402
- fix(ingest/snowflake): loosen snowflake connector version requirement by @hsheth2 in #6418
- fix(mysql): fix native data type for mysql set type by @mayurinehate in #6407
- perf(ui): virtualized schema table rows by @stanbaker in #6287
- fix(ui) Improve HoverEntityTooltip and truncate parent glossary nodes by @chriscollins3456 in #6417
- feat(ingest): support incremental lineage to dbt node from external platform by @mayurinehate in #6392
- fix(ingest): init dataset props if missing in transformer by @hsheth2 in #6429
- fix(change-event): remove unnecessary dependencies on EntityChangeEventGeneratorRegistryFactory by @aditya-radhakrishnan in #6431
- build(deps): bump moment-timezone from 0.5.34 to 0.5.35 in /datahub-web-react by @dependabot in #5783
- feat(frontend): Adding support to show externalUrl and institutionalMemoryFields for MLModels by @lurecas in #6053
- feat(model): adds properties, ownership, deprecated, institutional memory and tags as aspects for data platform instance entity by @sgomezvillamor in #5728
- docs(ingest/airflow): clarify docs around 1.x compat by @hsheth2 in #6436
- feat(recommendations): add last edited entities by @CorentinDuhamel in #6329
- fix(ingest): correctly compute entity change percentage by @hsheth2 in #6438
- docs(townhall) Updating Townhall History by @maggiehays in #6336
- Neo4j multihop support by @djordje-mijatovic in #6104
- fix(mae-consumer): Set proper variable expansion for JMX_OPTS and JAVA_OPTS in MAE docker by @skrydal in #6378
- docs(ingest): move prerequisite section before the ingestion recipe example by @mayurinehate in #6341
- fix(dataset): improve glossary term load performance for datasets by @Reilman79 in #6396
- feat(lineage) Implement CLL impact analysis for inputFields by @chriscollins3456 in #6426
- feat(ui) Add upgrade step to enable CLL impact analysis for existing data by @chriscollins3456 in #6427
- Added functionality to copy fieldpath and urn of each column by @Ankit-Keshari-Vituity in #6398
- fix(ingestion): add output converters for ODBC unsuported datatype in… by @LavinaVRovine in #6134
- fix(ui) Fix parentNodes overfetching everywhere it's used by @chriscollins3456 in #6446
- fix(ingest): snowflake - Fixing top query trimming in snowflake by @treff7es in #6447
- feat(elasticsearch): Updates to elasticsearch configuration, dao, tests by @david-leifker in #6269
- chore(ingest): fix mssql lint by @hsheth2 in #6453
- fix(ingest): add cli info to ingestion reporter by @hsheth2 in #6451
- fix(ui) Fix glossary side browser width fluctuating by @chriscollins3456 in #6457
- fix(python): Fix python dependencies for doc generation by @david-leifker in #6460
- docs(website): add homepage links by @jeffmerrick in #6458
- build(ingest): loosen jinja2 dependency for superset by @KulykDmytro in #6433
- fix(ingest): lowercase db name in mssql ingestion by @hsheth2 in #6448
- fix(ingest): handle missing schema in transformer by @hsheth2 in #6445
- feat(ingest): allow specific profiler config fields to override profile_table_level_only by @hsheth2 in #6366
- docs(enrichment) updating enrichment landing page by @maggiehays in #6286
- fix(home-page): remove redundant getAuthenticatedUser query by @aditya-radhakrishnan in #6464
- feat(ingest): detect old or missing docker compose by @hsheth2 in #6466
- feat(ingestion): powerbi # Power BI report support by @mohdsiddique in #6339
- fix(ingest/dbt): disable incremental lineage by default by @hsheth2 in #6467
- fix(loggin): print logging timestamp in ISO8601 format instead of jus… by @szalai1 in #6474
- docs(ingest/trino): add example of http connect...
DataHub v0.9.2
Release Highlights
This is a Bug Fix (non-scheduled) release to address the Known Issues in v0.9.1.
User Experience
- Improvements to Nav Bar UX
- Improvements to filtering for Related Glossary Entities and Tags (migrated to using keyword filters and fixed longstanding urn-related bug)
- Enable providing an Ownership Type for Glossary Terms, Nodes, and Domains
- Allow adding links without a full domain to entity (community requested)
Actions
- Adding new Entity Change Events for DataProcessInstanceRunEvent and AssertionRunEvent
Access Management
- Added new Metadata Privilege called "Manage Children" which permits creating and deleting Glossary Terms and Glossary Nodes inside a particular Node.
Fixes
- Properly escape schema field urns with URN-encoded characters
- Fix around the visibility of deleting a Term Group with children
- Properly show personal access token duration beyond 1 month during creation
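To illustrate why escaping matters for schema field urns, here is a minimal sketch. It assumes the field path is percent-encoded with Python's `urllib.parse.quote` so that characters such as `(`, `)` and `,` cannot be mistaken for urn tuple syntax. The helper names are hypothetical and DataHub's actual encoding rules may differ.

```python
from urllib.parse import quote, unquote


def make_schema_field_urn(dataset_urn: str, field_path: str) -> str:
    """Build a schemaField urn with the field path percent-encoded."""
    # safe="" encodes every reserved character, including "(", ")" and ",".
    encoded_path = quote(field_path, safe="")
    return f"urn:li:schemaField:({dataset_urn},{encoded_path})"


def field_path_from_urn(schema_field_urn: str) -> str:
    """Recover the original field path from a urn built above."""
    # Drop the closing ")" and take everything after the last comma;
    # the encoded path itself can no longer contain a raw comma.
    encoded_path = schema_field_urn[:-1].rsplit(",", 1)[1]
    return unquote(encoded_path)


if __name__ == "__main__":
    dataset = "urn:li:dataset:(urn:li:dataPlatform:hive,db.table,PROD)"
    urn = make_schema_field_urn(dataset, "amount(usd),net")
    print(urn)
    print(field_path_from_urn(urn))
```

Without the encoding step, a comma or parenthesis inside the field path would make the urn ambiguous to parse.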
What's Changed
- feat(change-event): add change events for DataProcessInstanceRunEvent by @aditya-radhakrishnan in #6320
- Worked on the Usage column & Lineage Drawer by @Ankit-Keshari-Vituity in #6290
- refactor(bootstrap data): Adding assertions data to bootstrap by @jjoyce0510 in #6324
- fix(ui) Disable deleting Term Groups with children by @chriscollins3456 in #6332
- feat(ingestion): business-glossary - Add values and relatedTerms support by @mohdsiddique in #6148
- fix(ui): two small ux fixes by @gabe-lyons in #6335
- feat(ingest): add new ingestion source PowerBI Report Server by @alcoccoque in #5369
- feat(ingest): drop plugin support for airflow 1.x by @hsheth2 in #6331
- fix(ingest): fix invalid schema field urns with empty field path by @mayurinehate in #6338
- fix(perf): trim down unnecessary fields from container and domain GraphQl queries by @aditya-radhakrishnan in #6337
- fix(adv search): fixing typo in es utils by @gabe-lyons in #6348
- docs(ingest): clarify that redshift-usage doesn't support column-level usage by @hsheth2 in #6347
- Fix the date issue on Create Access token Modal by @Ankit-Keshari-Vituity in #6342
- Enable Owner Type for Glossary Node + Domain by @jjoyce0510 in #6334
- fix(ui) Fix filters in embedded list search component by @chriscollins3456 in #6350
- docs: add Razer as DataHub adopter by @lvxhnat in #6353
- Allow links without top level domain by @djordje-mijatovic in #6317
- feat(ingest): support reserved keywords in model codegen by @hsheth2 in #6351
- docs(act) update Act on Metadata landing page by @maggiehays in #6288
- feat(build): refactor cypress tests, add some tests for invite users, domain creation by @anshbansal in #6259
- fix(ingest): fix log line interpolation by @hsheth2 in #6349
- fix(schema-history): remove unnecessary margin on version selector by @aditya-radhakrishnan in #6359
- feat(ingest): upgrade feast by @cburroughs in #6186
- fix(apache-ranger): compile ranger plugin for java 8 by @mohdsiddique in #6355
- feat(privileges) Create privileges to allow for managing children of entities by @chriscollins3456 in #6346
- fix(ui) Handle encoded schemaField urns on the frontend by @chriscollins3456 in #6321
- refactor(ui): Refactor the Glossary Related Entities, Tag Profiles to use search filters instead of query API. by @jjoyce0510 in #6352
- fix(ingest): only log vars if requested by @hsheth2 in #6362
- Install openjdk in base ingestion image by @frsann in #6365
- docs(logos) add Hurb and Razer logos to docs site by @maggiehays in #6363
- fix(ingest): bigquery - Sending lineage aspects as patch by @treff7es in #6313
New Contributors
- @alcoccoque made their first contribution in #5369
- @lvxhnat made their first contribution in #6353
- @cburroughs made their first contribution in #6186
Full Changelog: v0.9.1...v0.9.2
DataHub v0.9.1
Release Highlights
Known Issues
- In embedded search experiences (Glossary Terms, Domains, Lineage), filters can become "locked" in place once selected. This is addressed in v0.9.2.
User Experience
- Column-level Impact Analysis is here! You can now see the full end-to-end list of column dependencies; watch the demo here
- When creating a Glossary Term from the UI, you can now add the description in the same step
- We now support adding Domains to Glossary Terms
- You can now preview Entity Names and Types in browser tabs
- Login with SSO button on the login page.
Bug Fixes
- Assertions Tab functionality is restored
- SSO: The continuous login loop reported when the session cookie size exceeds 4096 characters has been addressed.
- The ingestion scheduler is now fixed for deployments with more than 30 ingestion sources. Previously, a bug caused certain ingestion sources to become unscheduled.
Metadata Ingestion
- New Ingestion Source: Databricks Unity Catalog - check out the docs here
- Tableau: Column-level lineage and Stateful Ingestion are now supported
- LookML: Improved column-level lineage
- BigQuery: we have promoted bigquery-beta to bigquery
- Snowflake: Stateful Ingestion now supports deleting Containers
DataHub Docs Site
We continue to push improved feature guides to the DataHub docs site, including:
- Sync Status
- DataHub Roles
- Dataset Usage and Query History
- DataHub Access Policies
- Managed DataHub Metadata Tests
What's Changed
- feat(ui): looker, lookml - add banner to cross-link ingestion by @Ankit-Keshari-Vituity in #6111
- feat(ingest): infer aspect name from type in get_aspect by @hsheth2 in #6033
- feat(ingestion): Tableau stateful ingestion by @amanda-her in #6094
- feat(ingest): include raw s3 paths if s3 source by @hsheth2 in #6168
- feat(secrets) Allow creating secrets with multiline values in the UI by @chriscollins3456 in #6169
- feat(ingest/tableau): support dashboard tags by @hsheth2 in #6185
- feat(ingest): bigquery-beta - Parsing view ddl definition for lineage by @treff7es in #6187
- fix(ingest) - bigquery-beta - Using table ref instead of table id by @treff7es in #6193
- docs(roles): update roles docs to new doc format by @aditya-radhakrishnan in #6175
- docs(posts): add posts feature guide by @aditya-radhakrishnan in #6184
- feat(ingest): include instance in container dataPlatform when provided by @hsheth2 in #6083
- feat(telemetry): add telemetry events to the settings page by @aditya-radhakrishnan in #6198
- Worked to update the ingestion type while editing by @Ankit-Keshari-Vituity in #6156
- fix(ingest): add lower bound for ujson dep version by @hsheth2 in #6189
- feat(ingest/tableau): emit status aspects + streamline stateful ingestion by @hsheth2 in #6188
- feat(ingest): support self-signed certs in Tableau by @hsheth2 in #6172
- fix(ingest): report warning/error counts correctly by @hsheth2 in #6128
- fix(ingest): Closeable as a context manager by @hsheth2 in #6067
- feat(ingestion-ui) Add new form for the bigquery-beta connector by @chriscollins3456 in #6200
- feat(ingest): add platform instance to tableau by @alaponin in #5978
- feat(release): bump CLI version to 0.9.0 by @szalai1 in #6195
- fix(frontend): fix UI message in create group modal by @liyuhui666 in #6205
- docs: dataset usage and query history feature guide by @treff7es in #5900
- fix(glossary) Improve business glossary loading performance by @chriscollins3456 in #6208
- feat(ingest): replace base85's pickle with json by @hsheth2 in #6178
- docs: add sync status feature guide by @hsheth2 in #5897
- feat(frontend): add custom ssl truststore settings by @alexey-kravtsov in #6090
- docs(spark): add configuration instructions for databricks by @mayurinehate in #6206
- fix(ingest): use corpGroup instead of corpgroup by @hsheth2 in #6202
- build: upgrade gradle wrapper by @hsheth2 in #6203
- fix(ingest): catch errors when profiling for sample values by @mayurinehate in #6194
- fix(ingest): only restrict GE version for hive by @hsheth2 in #6170
- feat(ingest/GE): enable debug logs to stdout when DATAHUB_DEBUG env var is set by @mayurinehate in #6192
- feat(ingest): allow selfsigned certificate in s3 source by @mayurinehate in #6179
- build(ingest): remove markupsafe dep and bump pytest-docker by @hsheth2 in #6201
- docs(access policies): Creating Proper Access Policies Guide by @jjoyce0510 in #6001
- feat(ingest): support deletion of containers in snowflake stateful in… by @mayurinehate in #6180
- fix(glossary) Improve performance when getting root glossary terms by @chriscollins3456 in #6214
- fix(ui) Fix bigquery and redshift forms for lineage fields by @chriscollins3456 in #6215
- fix(ui) Properly display column-level lineage with v2 field paths by @chriscollins3456 in #6217
- fix(ingest): bigquery-beta - Add stacktrace to bigquery schema ingest logs by @treff7es in #6226
- tests(embedded search): adding domain & container tests by @gabe-lyons in #6221
- fix(docs): fix pdl link for mxe docs by @aditya-radhakrishnan in #6230
- feat(telemetry): add telemetry events to the glossary, domains, and managed ingestion pages by @aditya-radhakrishnan in #6216
- fix(ingest): bigquery-beta - Adding python 3.8 fix for memory footprint util by @treff7es in #6228
- docs(quickstart): enable slack community link by @jx2lee in #6209
- fix(build): allow image tag via env, fix requirements by @anshbansal in #6237
- fix(ingest): remove back-ticks from table name when creating urn by @mayurinehate in #6236
- feat(ingest): bigquery-beta - Add option to lowercase urns by @treff7es in #6240
- fix(ingest): presto-on-hive - Adding db name to the presto on hive urn by @treff7es in #6024
- Worked on the CSS issue of Add Owners Modal by @Ankit-Keshari-Vituity in #6223
- fix(ingest): stateful-ingestion - keep dataset urn case in checkpoints by @treff7es in #6244
- Create Tag Modal Issue: Clear the input value on press. by @Ankit-Keshari-Vituity in #6212
- feat(build): add cypress tests for glossary and deprecation by @anshbansal in #6249
- feat(ingest): hive-on-presto - Add option to properly filter hive schemas by @treff7es in #6247
- fix(ingest):lookml - better column-level linea...
DataHub v0.9.0
Release Highlights
Known Issues
Assertions Tab UX bug
This release introduced a bug in the assertions tab causing assertion results to be hidden. This will be addressed in the subsequent release.
Release Notes
We’re excited to announce the release of DataHub v0.9.0!
This minor release includes an upgrade to Java 11 and surfacing Column-Level Lineage support within the DataHub UI.
Here are some additional highlights:
User Experience
- Column-Level Lineage is now surfaced within the DataHub UI!
- Advanced Search now supports searching by Column-level details (e.g. name, description, tag), as well as complex AND/OR statements. For example:
- Show results that match any filters
- Show results that match all filters
- Owner is either of Shannon or Mark
- Owner is neither Shannon nor Mark
- Try it in demo here
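The any/all and either/neither semantics listed above can be sketched with a small in-memory example. This is a toy model for illustration, not DataHub's search implementation; the entity fields and helper names are assumptions.

```python
from typing import Callable, Dict, List

Entity = Dict[str, str]
Filter = Callable[[Entity], bool]


def field_in(field: str, *values: str) -> Filter:
    """Pass when the entity's field equals one of the given values."""
    return lambda entity: entity.get(field) in values


def negate(f: Filter) -> Filter:
    """Invert a filter (e.g. 'owner is neither X nor Y')."""
    return lambda entity: not f(entity)


def match_any(filters: List[Filter]) -> Filter:
    """OR: pass when at least one filter passes."""
    return lambda entity: any(f(entity) for f in filters)


def match_all(filters: List[Filter]) -> Filter:
    """AND: pass only when every filter passes."""
    return lambda entity: all(f(entity) for f in filters)


def select(entities: List[Entity], f: Filter) -> List[str]:
    """Return the names of the entities that pass the filter."""
    return [e["name"] for e in entities if f(e)]


# Hypothetical sample data.
ENTITIES: List[Entity] = [
    {"name": "orders", "owner": "Shannon", "tag": "pii"},
    {"name": "users", "owner": "Mark", "tag": "core"},
    {"name": "events", "owner": "Alex", "tag": "pii"},
]

if __name__ == "__main__":
    either = field_in("owner", "Shannon", "Mark")
    print(select(ENTITIES, either))            # owner is either Shannon or Mark
    print(select(ENTITIES, negate(either)))    # owner is neither Shannon nor Mark
    print(select(ENTITIES, match_all([either, field_in("tag", "pii")])))
    print(select(ENTITIES, match_any([field_in("owner", "Mark"), field_in("tag", "pii")])))
```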
- You can now invite users and assign them to a default DataHub Role
- Improvements to site performance during the Browse experience
Developer Experience
- DataHub has been upgraded to Java 11!
- Improved tracking of GraphQL errors for bug resolution
- CorpUser and CorpGroup are now available via the Python SDK
Metadata Ingestion
- Automatically extract Column-Level Lineage from Snowflake & Looker sources
- dbt Meta Mapping is now supported at the Column Level - this means you can automatically extract Tags and Glossary Terms from your dbt model and surface them in DataHub
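As a rough illustration of what a column-level meta mapping does, the sketch below applies match rules to a column's dbt `meta` dictionary and produces tag and term actions. The rule shape (`match`, `operation`, `config`) and the operation names are modeled loosely on the dbt source's mapping options and should be treated as assumptions; `apply_meta_mapping` is a hypothetical helper, not the ingestion code.

```python
import re
from typing import Any, Dict, List

# Hypothetical rules: meta key -> regex to match and the action to emit.
COLUMN_META_MAPPING: Dict[str, Dict[str, Any]] = {
    "is_sensitive": {
        "match": "True",
        "operation": "add_tag",
        "config": {"tag": "sensitive"},
    },
    "glossary_term": {
        "match": ".*",
        "operation": "add_term",
        "config": {"term": "{{ $match }}"},
    },
}


def apply_meta_mapping(meta: Dict[str, Any], rules: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return 'operation:target' strings for every rule the meta satisfies."""
    actions: List[str] = []
    for key, rule in rules.items():
        if key not in meta:
            continue
        value = str(meta[key])
        # The whole meta value must match the rule's regex.
        if re.fullmatch(rule["match"], value) is None:
            continue
        # The single config value names the tag or term; "{{ $match }}"
        # is replaced by the matched meta value.
        target = next(iter(rule["config"].values())).replace("{{ $match }}", value)
        actions.append(f"{rule['operation']}:{target}")
    return actions


if __name__ == "__main__":
    column_meta = {"is_sensitive": True, "glossary_term": "Email"}
    print(apply_meta_mapping(column_meta, COLUMN_META_MAPPING))
```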
What's Changed
- fix(ingest): bigquery-beta - Getting datasets with biquery client by @treff7es in #6039
- feat(roles): add ability to invite users into a role by @aditya-radhakrishnan in #6015
- refactor(java11) - convert most modules to java 11 by @leifker in #5836
- docs(readme): Fixing broken article link by @davrax in #6042
- refactor(ingest): streamline pydantic configs by @hsheth2 in #6011
- docs(ingest): add example of dbt column_meta_mapping by @hsheth2 in #6038
- refactor(ingest): use aspect map in transformers by @hsheth2 in #6040
- feat(ui): Adding placeholder entity for DataPlatform by @jjoyce0510 in #6045
- feat(ingest): implement compression for CheckpointState by @alexey-kravtsov in #6007
- feat(advanced-search): adding select value modal by @gabe-lyons in #6026
- fix(ingest): bigquery-beta - Additional fixes for Bigquery beta by @treff7es in #6051
- feat(advanced search): adding advanced search filter component & prereqs for it by @gabe-lyons in #6055
- docs(ingest): add path spec examples for s3 by @mayurinehate in #6050
- fix(deps): metadata-io - remove parquet dependency by @shirshanka in #6046
- fix(ingestion): Tableau test case execution fix by @mohdsiddique in #6005
- feat(ingest): list referenced env variables in recipe by @hsheth2 in #6043
- fix(ingest): compat with mypy 0.981 by @hsheth2 in #6056
- fix(elasticsearch_index): create datahub_usage_event index where datahub_analytics_enabled set to false by @GyuhoonK in #5974
- docs(approval workflows): adding approval workflow docs by @gabe-lyons in #5896
- feat(retention): disable applying retention on bootstrap by @anshbansal in #6066
- fix(ingest): correct tableau browse paths by @hsheth2 in #6064
- fix(ingest): bigquery-beta - handling complex types properly by @treff7es in #6062
- docs: create SECURITY.md by @laulpogan in #6069
- fix(containers): show soft deleted status of containers by @gabe-lyons in #6072
- docs(ingest): clarify bigquery-beta multiproject setup by @hsheth2 in #6071
- chore(setup): change defaults for partitions by @anshbansal in #6074
- refactor(browse): Improving Browse Feature Performance by @jjoyce0510 in #6073
- feat(ingest): add column-level lineage support for snowflake by @mayurinehate in #6034
- feat(ingest): looker - support for simple column level lineage by @shirshanka in #6084
- fix(elastic-setup) Fixing env var logic by @pedro93 in #6079
- Revert "chore(setup): change defaults for partitions (#6074)" by @pedro93 in #6086
- fix(mae-consumer): fix regression on base64 encoding by @codesorcery in #6061
- fix(elasticsearch) Analytics indices creation on AWS ES by @tomas-kubin in #5502
- docs(ingest): note that Athena doesn't support lineage by @hsheth2 in #6081
- fix(ingest): alias for mssql-odbc source by @hsheth2 in #6080
- fix(ingest): presto-on-hive - Setting display name properly by @treff7es in #6065
- fix(schema filter): fix schema infinite rerender by @gabe-lyons in #6082
- feat(monitoring): track graphql errors in metrics by @szalai1 in #6087
- feat(advanced search): Add component to show all advanced search filters & add new filter by @gabe-lyons in #6058
- fix(ingest): bump lkml version by @hsheth2 in #6091
- fix(ingest): lookml - extract column correctly by @shirshanka in #6093
- feat(retention): change default policy, add API to apply retention by @anshbansal in #6088
- fix(lineage): fix missed casing in lineage registry by @gabe-lyons in #6078
- fix(ingest): bigquery-beta - Lowering a bit memory footprint of bigquery usage by @treff7es in #6095
- feat(ingest): remove hardcoded env variable default for cli version by @shirshanka in #6075
- docs: add information about mapping ports for datahub-gms by @shirshanka in #6092
- chore(deps): upgrade graphql-java deps to 19.0 by @shirshanka in #6099
- chore(deps): upgrade neo4j to 4.4.x by @shirshanka in #6101
- feat(docs): Improve documentation about Search by @szalai1 in #5889
- feat(ingest): add async option to ingest proposal endpoint by @RyanHolstien in #6097
- chore(deps): upgrade opentelemetry dependencies by @shirshanka in #6100
- refactor(recommendations): Bump default max recommendations count for Platforms by @jjoyce0510 in #6113
- feat(ingest): add Sandbox support by @rgudic in #6105
- fix(mae): use JAVA_TOOL_OPTIONS instead of JDK_JAVA_OPTIONS by @szalai1 in #6114
- feat(advanced-search): Complete Advanced Search: backend changes & tying UI together by @gabe-lyons in #6068
- feat(search): improved search snippet FE logic by @gabe-lyons in #6109
- feat(ingest): add CorpUser and CorpGroup to the Python SDK by @ttaubermarshall-stripe in #5930
- fix(ingest): hide deprecated path_spec option from config by @hsheth2 in #5944
- feat(posts): add posts feature to DataHub by @aditya-radhakrishnan in #6110
- fix(ingest): remove unused mysql golden file by @hsheth2 in #6106
- fix(ingestion): fix percent change computation in stale_entity_removal by @rslanka in #6121
- refactor(ingest): use pydantic utilities for NamingPattern by @hsheth2 in #6013
- fix(ingest): presto-on-hive - not failing on Hive type parsing error by @treff7es in #6118
- fix(ingest): ignore usage and operation for snowflake datasets withou… by @mayurinehate in https://github.com...
DataHub v0.8.45
Release Highlights
User Experience
- Allow Term Groups to be the target of permissions
- Customize browser favicon via REACT_APP_FAVICON_URL param
- Some UX improvements for charts & dashboards entity pages to reduce confusion
- Performance improvements on the lineage visualization
- Search bar for dataset schema tab
Developer Experience
- Add REST endpoint for restoring indices of a single entity (/aspects?action=restoreIndices)
- Create new platform instances via CLI
- Improved impact analysis performance due to an added caching layer
- Support for Patch as seen in August 2022 town hall.
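A call to the restore-indices endpoint mentioned above might be prepared as in the sketch below. Only the `/aspects?action=restoreIndices` path comes from these notes; the GMS address, the `urn` payload field, and the absence of authentication headers are assumptions to adapt to your deployment. The request is built but not sent.

```python
import json
from urllib.request import Request


def build_restore_indices_request(gms_url: str, urn: str) -> Request:
    """Build (but do not send) a POST request to restore one entity's indices.

    Assumption: the endpoint accepts a JSON body with a "urn" field.
    """
    body = json.dumps({"urn": urn}).encode("utf-8")
    return Request(
        url=f"{gms_url.rstrip('/')}/aspects?action=restoreIndices",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical GMS address and dataset urn.
    request = build_restore_indices_request(
        "http://localhost:8080",
        "urn:li:dataset:(urn:li:dataPlatform:hive,db.table,PROD)",
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```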
Metadata Ingestion
- Introduces bigquery-beta source
- Looker source memory usage dramatically reduced
- Report memory usage during ingestion
- Improve Tableau lineage
- Usage statistics for Tableau
- LookML can automatically clone your Git repository. LookML is now supported in UI-based ingestion.
- dbt supports column-level meta mappings
- Support for deletion & rollback of time series data
- Upgrade to browse path forms
[see next page for list of commits]
What's Changed
- fix(privileges) Add Term Groups as targetable entities for privileges by @chriscollins3456 in #5806
- fix(javadocs): remove ampersand from pdl causing issue in doc generation for openapi by @RyanHolstien in #5808
- chore(ingest): remove archived docs by @hsheth2 in #5793
- feat(ingest): add rewrite option for metadata file check by @hsheth2 in #5763
- feat(cli): add support for sampled reporting to keep logs manageable by @shirshanka in #5800
- docs(refactor): Refactor Tags Feature Guide by @maggiehays in #5781
- docs(feature-guide) Impact Analysis by @maggiehays in #5765
- feat(theming): set custom favicon via env var by @gabe-lyons in #5810
- test(smoke-test): check debug arg in executor requests by @hsheth2 in #5811
- fix(ingest): bigquery-beta - Fixing dependencies by @treff7es in #5814
- feat(ingest): looker - reduce memory requirements by @shirshanka in #5815
- feat(restore-indices): add endpoint for restore indices, add basic check for graph by @anshbansal in #5805
- fix(frontend): download node only when USE_SYSTEM_NODE is set to false by @szalai1 in #5817
- doc: Make Airflow link clickable by @daha in #5803
- feat(ingest):looker - reduce mem usage, misc reporting improvements by @shirshanka in #5823
- feat(model, ingest): populate sizeInBytes in snowflake, fall back to table level profiling for large tables by @mayurinehate in #5774
- chore(docker): make curl/wget commands quiet in docker by @hsheth2 in #5819
- chore: cleanup references to the old ember app by @hsheth2 in #5797
- fix(ingest): spark-lineage: Adding additional debug logs to spark lineage by @treff7es in #5772
- fix(docker): add missing port mappings for non-neo4j quickstart by @hsheth2 in #5799
- fix(ingest): looker - report dashboard scanning correctly by @shirshanka in #5829
- feat(cli): report memory usage during ingest by @shirshanka in #5828
- fix(ingest): presto-on-hive - Fixing mysql filter by @treff7es in #5825
- docs(big query): add needed delete permission to list by @maaaikoool in #5826
- chore(ingest): set isort combine_as_imports by @hsheth2 in #5820
- fix(ingest): use AwsConnectionConfig instead of AwsSourceConfig by @hsheth2 in #5813
- feat(ingest): looker test connection by @hsheth2 in #5768
- feat(ingest): improve tableau lineage, workbooks query, fix pagination by @mayurinehate in #5756
- fix(ingest): profiling - memory usage reduction by @shirshanka in #5830
- feat(monitoring): enable JMX and OTEL for frontend pods by @szalai1 in #5834
- fix(standalone-consumers): Exclude Solr from spring boot application config & make them run on M1 by @pedro93 in #5827
- feat(hooks): Add toggle for enabling/disabling platform event hook by @pedro93 in #5840
- feat(transformers): Add semantics & transform_aspect support in transformers by @mohdsiddique in #5514
- feat(ci): auto label PRs by @anshbansal in #5839
- feat(inputs): improving clarity on inputs for dashboards by @gabe-lyons in #5841
- feat(ingest): add utility for converting MCEs to MCPs by @hsheth2 in #5812
- chore(smoke): add additional log in smoke test by @hsheth2 in #5842
- fix(ingest): fix doc generation import ordering issue with postgres by @hsheth2 in #5846
- feat(docker) Adds Sasl support to base ingestion image by @pedro93 in #5855
- fix(graphql) Fix null pointer exception when fetching entity aspect via graphql by @chriscollins3456 in #5857
- fix(ingest): reporting should work with timestamps by @shirshanka in #5860
- fix(patch-entity-registry): Remove exception for entities with key aspects. by @pghazanfari in #5831
- fix(browse): Fixing browse path to remove requirement for simple name suffix by @jjoyce0510 in #5634
- fix(ingest): bigquery - Fixing sharded regexp pattern config by @treff7es in #5861
- perf(elastic search graph service): improving perf of lineage query by @gabe-lyons in #5858
- chore(ingest): remove outdated GE compatibility hack by @hsheth2 in #5862
- ci(ingest): test with python 3.10 by @hsheth2 in #5863
- docs: improve doc generation, add better docs for snowflake, looker by @shirshanka in #5867
- feat(ci): tweak auto-label globs by @anshbansal in #5849
- fix(m1): preflight works with brew postgres@14 by @shirshanka in #5868
- feat(smoke-tests) Make smoke tests use standalone consumers by @pedro93 in #5856
- fix(domains): adding 10,000+ text when domain list caps out elastic count capacity by @gabe-lyons in #5838
- docs(notifications): slack notification docs by @anshbansal in #5871
- feat(docker): Update Dockerfiles to use java 11 runtime by @pedro93 in #5853
- Scroll issue on Glossary related entity page by @Ankit-Keshari-Vituity in #5804
- fix(ingest): include urns in rest sink failure logs by @hsheth2 in #5848
- fix(docker): Bumps JRE 11 to latest by @pedro93 in #5875
- feat(ingest): support reading config file from stdin by @hsheth2 in #5847
- fix(ingest): remove dbt delete_tests_as_datasets option by @hsheth2 in #5865
- fix(ingest): avrogen handling for missing fields with default values by @hsheth2 in #5844
- refactor(ingest): add ALL_ENV_TYPES constant by @hsheth2 in #5866
- feat(cli) Make docker compose quiet by @pedro93 in #5869
- feat(datahub-protobuf): add support for shadow jar, publish by @shirshanka in #5882
- feat(jars): better jar versioning for datahub-client, spark-lineage and protobuf by @shirshanka in #5883
- fix(dev-docker): set right context for frontend dev build by @szalai1 in #5885
- fix(ci): fix jar release action dependencies by @shirshanka in #5884
- feat(schema) Add search filter to Schema tab by @chriscollins3456 in #5845
- feat(ui) Add ...
DataHub v0.8.44
Release Highlights
Known Issues
Standalone Kafka Consumers
We have identified that standalone Kafka consumers (MCP/MCL messages) have been broken since v0.8.44. The root cause is a set of Spring bean dependencies that were not correctly excluded.
This went undetected in our testing infrastructure because, until recently, our tests did not run with standalone consumers.
The underlying issue has been fixed by #5827. Since #5856, we run all our smoke tests with standalone consumers to prevent this from happening in the future. The fix will be released in v0.8.46.
[Helm] DataHub Actions Container
We recently rolled out support for running ingestion in debug mode. This requires a bump in the datahub-actions container to either HEAD (latest) or v0.0.7. The correct version is set as the default in v0.2.103.
User Experience
- Improvements to UI-based ingestion: view live logs during execution, view the ingestion summary (i.e., number of entities ingested), and roll back ingestion runs. CLI-run ingestion jobs are now surfaced as well.
- New look on Homepage: Domains have been promoted to the top of the fold, so they are listed above Entity cards and Platform cards
- Improvements to searching for Looker resources - when searching for a measure or dimension, we will now surface Looks & Dashboards that reference those fields
- The DataHub Docs Site has a new look! We are reorganizing content to make it easier and more intuitive for DataHub Developers and End-Users alike to navigate our resources.
- Improved Error Handling on the UI - much nicer messaging when exceptions are caught by the frontend application.
- Misc minor bug fixes and improvements
Developer Experience
- Eternal personal access tokens are now supported
- Deprecated support for Python 3.6 (we expect this to have little-to-no impact on the Community based on pip download data)
Metadata Ingestion
- Improved documentation for Domains transformer
- Stateful Ingestion now supported for Glue
- The data-lake source has been deprecated in favor of the s3 source
- Chart Entity now supports chartUsageStatistics
- dbt ingestion supports auto-extracting owner from the meta block
- Improved Snowflake Connector is now available; we expect this to provide a reduction in ingestion run-time and lower levels of complexity
What's Changed
- chore(ingest): remove orderedset dependency by @hsheth2 in #5591
- refactor(ingest): simplify upgrade version stats by @hsheth2 in #5588
- feat(metadata-service-auth): add support for eternal personal access tokens by @ksrinath in #5433
- fix(ci): paths for github workflows by @anshbansal in #5595
- fix(ingest): Fix ingest Clickhouse without password by @liyuhui666 in #5511
- fix(ci): cleanup sleeps to instead use retries by @anshbansal in #5597
- Kafka form Addition and resolved confilict by @Ankit-Keshari-Vituity in #5598
- fix(ingest): Fix minor logging bug in the glue source. by @rslanka in #5605
- fix(ci): use different image for smoke base image by @anshbansal in #5607
- fix(ci): cancel docker-unified workflow only on PRs on new commits by @anshbansal in #5608
- fix(ci): add env variable for creds smoke test by @anshbansal in #5609
- fix(ui) Followups to recent changes to UI ingestion forms by @chriscollins3456 in #5602
- docs(transformers): Add domain transformer documentation in transformers readme by @mohdsiddique in #5606
- feat(model): adding status aspect to assertions by @shirshanka in #5612
- fix(ingest): use default telemetry ID when config is unwritable by @hsheth2 in #5614
- chore(ingest): drop python 3.6 support by @hsheth2 in #5521
- fix(ui): Split based on Data Platform delimiter in Lineage viz by @jjoyce0510 in #5613
- feat(search): Sticky search filters + misc bug fixes & improvements by @jjoyce0510 in #5601
- fix(graphql): handle null source values in ml features & primary keys by @gabe-lyons in #5626
- fix(graph service): only query for entities that should have lineage [Breaking Change] by @gabe-lyons in #5539
- feat(model): Add optional message field to auditstamp by @gabe-lyons in #5611
- fix(ingest): fix indenting issue in azure ad connector by @aditya-radhakrishnan in #5627
- feat(tokens) Create and display non-expiring tokens on the frontend by @chriscollins3456 in #5630
- Schema tab: Fixed the header issue by @Ankit-Keshari-Vituity in #5622
- build(docs-website): only show release notes for recent releases by @hsheth2 in #5621
- docs(README): update links and reorg content by @maggiehays in #5618
- perf(operations): performance improvement to operations tab via reduced fetching by @gabe-lyons in #5632
- feat(ui) Retrieve last ingested timestamp and display on frontend by @chriscollins3456 in #5600
- Update README.md and maintaining consistency by @hemanthkotaprolu in #5623
- fix(ingest): fix delta-lake dict iteration bug by @hsheth2 in #5625
- fix(ingest): okta - make async loop init more robust by @shirshanka in #5640
- fix(ingest): cli - handle exception in upgrade check by @shirshanka in #5641
- build(docs-website): make codegen script idempotent by @hsheth2 in #5620
- docs(airflow): fix formatting by @hsheth2 in #5617
- fix(ui): Fixing minor search redirect filtering issue introduced by sticky filters by @jjoyce0510 in #5643
- fix(ingestion): Update developer docs by @szalai1 in #5644
- feat(ui): Adding slack handle to corp group info by @jjoyce0510 in #5645
- fix(delta-table): allow env, credential file based s3 auth by @MugdhaHardikar-GSLab in #5636
- feat(GraphQL API): Add "browsePaths" field to browsable entity types by @jjoyce0510 in #5646
- feat(ingest): generate a list of aspects in codegen by @hsheth2 in #5633
- feat(ingestion): Glue stateful ingestion by @amanda-her in #5553
- feat(ingest): add snowflake-beta source by @mayurinehate in #5517
- fix(ingest): remove alphabet field from allow/deny config by @hsheth2 in #5629
- feat(mssql): add multi database ingest support by @MugdhaHardikar-GSLab in #5516
- chore(ingest): drop data-lake source in favor of s3 source by @hsheth2 in #5628
- fix(ingest): use mongodb ping command to test connection by @hsheth2 in #5650
- fix(ingest): remove profile_sql_table event by @hsheth2 in #5616
- fix(ci): use graphql instead of restli by @anshbansal in #5610
- feat(ingest): rest_emitter - Adding option to disable ssl by @szalai1 in #5642
- feat(ingest): GE Profile/Action Trino support by @aezomz in #5361
- Stats Tab: Table and column stats hide when there is no data by @Ankit-Keshari-Vituity in #5651
- fix(ingest): redash - fix redash dashboard url bug by @de-kwanyoung-son in #5500
- Glossary: Worked on the refetching data issue by @Ankit-Keshari-Vituity in #5638
- feat(ingestion) Fetch live logs on an ingestion run from UI by @chriscollins3456 in #5653
- fix(spark-lineage): Create application setup on sqlevent start by @MugdhaHardikar-GSLab in #5657
- fix(ui) Remove constraint for searching with less than 3 characters by @chriscollins3456 in #5654
- docs: adds ABLY as DataHub adopter by @de-...
DataHub v0.8.43
Highlights
User Experience
- Bulk edit support - you can now add or remove Owners, Glossary Terms, Tags, Domains, Deprecation Status to multiple entities with a few clicks!
- Improved user experience to create secrets and ingestion schedules
Developer/Community Experience
- A new Java-based file emitter, generating a JSON file that can be used in the “File” metadata ingestion source
- Delta Lake fixes to make it more stable and to extract table history to populate the operation aspect
Metadata Ingestion
- When ingesting metadata from the DataHub UI, you will now see an “Ingestion Run Summary” which shows the run outcome, number of entities successfully ingested, and the ability to download logs collected during the run
- New Dataset Domain Transformer - assign a Domain to Datasets during ingestion
Full Commit Log
What's Changed
- #5577 @jjoyce0510 feat(ui): Add rich UI ingestion run summary
- #5330 @liyuhui666 feat(ingest): clickhouse - add metadata modification time and data size
- #5582 @jjoyce0510 feat(ui): Support batch deleting from ui
- #5531 @Jiafi Fix profiling when using {table}.
- #5548 @Jiafi Expose catalog_name in athena.py
- #5335 @mohdsiddique feat(ingest): power-bi - make ownership ingestion optional
- #5587 @aditya-radhakrishnan fix(groups): fix user, search, and preview group membership to be fetched for both external and native group memberships
- #5586 @xiphl feat(ui): make container description searchable and have description show up in results
- #5585 @mohdsiddique fix apache ranger plugin readme file rendering
- #5277 @MugdhaHardikar-GSLab feat(ingest): delta-lake - extract table history into operation aspect
- #5584 @shirshanka fix(ingest): moving delta-lake connector to be 3.7+ only
- #5526 @MugdhaHardikar-GSLab fix(ingest): sql-common - db2, snowflake bug fixes to extract table descriptions
- #5566 @hsheth2 feat(ingest): infer aspectName from aspect type in MCP
- #5578 @MugdhaHardikar-GSLab feat(datahub-client): add java file emitter
- #5328 @Santhin feat(ingest): dbt - control over emitting test_results, test_definitions, etc.
- #5547 @hsheth2 fix(ingest): handle when current server version is unavailable
- #5579 @chriscollins3456 feat(ingestion) Add Save & Run button to managed ingestion builder
- #5558 @anshbansal feat(test): add read-only smoke tests
- #5581 @maggiehays chore(gradle): update node version for docs site
- #5580 @jjoyce0510 fix(ui): Fixing batch set domains bug
- #5574 @chriscollins3456 feat(ingestion) Implement secrets in new managed ingestion form
- #4976 @noahfournier feat(graphql): add MutableTypeBatchResolver
- #5572 @jjoyce0510 feat(ui): Support batch deprecation from the UI (Batch actions part 6/7)
- #5575 @gabe-lyons extending assertion std model
- #5560 @jjoyce0510 feat(ui): Batch set & unset Domain for assets via the UI
- #5571 @anshbansal chore(build): tweak stale issue timing
- #5570 @anshbansal fix(gms): missing directory for gms
- #5569 @anshbansal fix(ci): flaky smoke test fix
- #5568 @anshbansal fix(gms): ensure directory is present
- #5562 @gabe-lyons (chore): upgrading ingestion to 0.8.42
- #5551 @hsheth2 fix(ingest): activate mypy support for ParamSpec typing annotation
- #5563 @gabe-lyons chore(0.8.42): update breaking changes doc
- #5456 @mohdsiddique feat(transformers): Add domain transformer for dataset
- #5564 @hsheth2 fix(ingest): fix some typos and logging issues
- #5444 @xiphl feat(ingest) Allow ingestion of Elasticsearch index template
- #5541 @ms32035 fix(ingestion): correct trino datatype handling
- #5559 @chriscollins3456 feat(ingestion) Update managed ingestion scheduler to be easier to use
- #5552 @jjoyce0510 feat(ui): Batch add & remove Owners to assets via the UI
v0.8.42
Highlights
User Experience
- Improved Search Experience - preview cards now display usage and freshness information
- Update to Schema History - incorporated Community feedback to remove “Blame” terminology
- Improved UI-Based Ingestion - easily configure metadata ingestion from Snowflake, BigQuery, Looker, and Tableau with an easy-to-follow form; YAML is still supported!
Developer/Community Experience
- Python 3.6 is no longer supported for ingestion – we expect this to impact fewer than 1% of DataHub users (based on PyPI download stats). Please upgrade to Python 3.7 or newer.
- Update to GitHub Issue management - issues will be marked as “Inactive” after 30 days of no activity and will be automatically closed following an additional 30 days of inactivity
- We’ve updated our Slack Guidelines! Read them here
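The Python 3.7+ requirement for ingestion noted above can be verified before upgrading. The following is a minimal sketch using only the standard library; the names MIN_SUPPORTED and python_is_supported are illustrative and are not part of DataHub.

```python
import sys

# Minimum interpreter version, taken from the release note (Python 3.7 or newer).
MIN_SUPPORTED = (3, 7)


def python_is_supported(version_info=None, minimum=MIN_SUPPORTED):
    """Return True if the given (or current) interpreter version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); patch level does not matter here.
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = "%d.%d" % tuple(sys.version_info[:2])
    if python_is_supported():
        print("Python " + current + " meets the 3.7+ requirement")
    else:
        print("Python " + current + " is too old; upgrade to 3.7 or newer")
```

Running the script with the interpreter used for ingestion prints whether that environment meets the requirement.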
Metadata Ingestion
- You can now test your Snowflake connection via the CLI and UI-based Ingestion to ensure you have proper access levels required for general ingestion, profiling, and usage. We will be expanding this functionality to other cloud-based ingestion sources in upcoming cycles.
- Hard delete will now discover and remove soft deleted entities
- Resolved issue of assertion error with dbt stateful ingestion
Full Commit Log
What's Changed
- feat(quickstart,docs): updates for v0.8.41 by @anshbansal in #5409
- fix(ingest): ensure upgrade checks run async by @shirshanka in #5383
- fix(ingest): pass transport options to usage history looker api calls by @mayurinehate in #5417
- feat(quickstart): moving to official confluent images for m1 by @shirshanka in #5416
- fix(documentation) Fix erratic cursor in documentation editor bug by @chriscollins3456 in #5411
- feat(ui): Supporting enriched search preview + misc improvements by @jjoyce0510 in #5419
- chore: remove unnecessary modules from codebase by @shirshanka in #5420
- fix(ingest): extract usage for dashboards allowed by pattern by @mayurinehate in #5424
- fix(docker): fix kafka-setup command to support same capabilities as … by @shirshanka in #5428
- fix(protobuf): ownership fixes by @leifker in #5425
- fix(ui): add dataset qualifiedName parameter to lineage query by @alexey-kravtsov in #5427
- fix(glossary) Fix dropdown where disabled buttons are still clickable by @chriscollins3456 in #5430
- docs(bigquery): add changelog and unittest for profiling limits by @MugdhaHardikar-GSLab in #5407
- fix(siblings): fixing lineage fetching for siblings & sources by @gabe-lyons in #5415
- fix(ui): Fixing unreleased search preview bugs by @jjoyce0510 in #5432
- feat(ui): Adding Statistics Summary to Dataset + Dashboard Profiles by @jjoyce0510 in #5440
- feat(ingest): add test source connection feature, structured report file by @shirshanka in #5442
- fix(ingest/glue): handle error when generating s3 tags for virtual view tables by @timcosta in #5398
- feat(ingest): model - adding a small extension to support communicati… by @shirshanka in #5429
- fix(bigquery-usage): fix dataset name for sharded table by @MugdhaHardikar-GSLab in #5412
- feat(ingestion) Add new endpoint to test an ingestion connection by @chriscollins3456 in #5438
- feat(cli,build): remove deprecated variables GMS_HOST/_PORT by @anshbansal in #5451
- fix(search): make filters by default an empty list if null by @aditya-radhakrishnan in #5454
- fix(hive): add column comment as a column description by @MugdhaHardikar-GSLab in #5449
- feat(groups): add native groups concept to DataHub by @aditya-radhakrishnan in #5443
- fix(ingest): fix serialization of report to handle nesting by @shirshanka in #5455
- fix(tableau): fix tableau db error, add more logs by @mayurinehate in #5423
- build(deps): bump terser from 5.9.0 to 5.14.2 in /docs-website by @dependabot in #5448
- feat(doc): spark-lineage - Adding spark lineage configuration doc for Amazon EMR by @treff7es in #5459
- feat(schema-history): remove blame language for the schema history feature by @aditya-radhakrishnan in #5457
- Search header: Menu icon alignment by @Ankit-Keshari-Vituity in #5458
- build(deps): bump terser from 4.8.0 to 4.8.1 in /datahub-web-react by @dependabot in #5446
- feat(ingest): snowflake - basic test connection capability by @shirshanka in #5464
- fix(ingest/trino): Avoid exception if $properties table empty or not readable by @glinmac in #5447
- feat(ingest): preflight - Add way to check/upgrade brew package version in preflight if needed by @treff7es in #5435
- fix(build): add base image with gradle wrapper cached by @anshbansal in #5467
- doc(bigquery): groups grants by requirements by @sgomezvillamor in #5468
- fix(docs,build): remove base image not needed, cleanup docs by @anshbansal in #5469
- feat(ui): Partial support for Chart usage by @jjoyce0510 in #5473
- fix(ingest): bigquery: multiproject profiling fix by @treff7es in #5474
- fix(ingest): kafka - revert deps back to < 1.9.0 by @shirshanka in #5476
- feat(docker): support multiplatform image for datahub-upgrade by @shirshanka in #5477
- feat(quickstart): experimental support for backup restore for quickstart by @shirshanka in #5418
- feat(dbt): updating source lineage logic by @gabe-lyons in #5414
- Ingestion: Added form in Big Query type to edit the queries. by @Ankit-Keshari-Vituity in #5431
- docs: fix docsearch config by @hsheth2 in #5479
- Search Results: Added checkbox option to select multiple results at once. by @Ankit-Keshari-Vituity in #5422
- feat(delete): hard delete deletes soft deleted entities by @anshbansal in #5478
- fix(docs): add missing closing marker for note section by @shirshanka in #5480
- fix(build): intermittent failure in github actions by @anshbansal in #5452
- feat(model, ingest): add user email in dashboard user usage counts by @mayurinehate in #5471
- feat(ingest): add support for capability report in snowflake test connection by @mayurinehate in #5472
- feat(build): automatically mark issues as stale to close inactive issues by @anshbansal in #5482
- fix(ingest): loosen confluent-kafka dep requirement by @hsheth2 in #5489
- refactor(ingest): cleanup importlib.import_module calls by @hsheth2 in #5490
- build(ingest): make gradle build less chatty by @hsheth2 in #5491
- fix(ingest): Fixing dbt trino datatypes by @aezomz in #5379
- refactor(ci): use custom action for checking codegen status by @hsheth2 in #5493
- feat(spark-lineage): Support ssl cert disable functionality by @MugdhaHardikar-GSLab in #5488
- docs(auth): fix link to point to new doc by @anshbansal in #5501
- docs(updating-datahub): add note for breaking change in looker usage … by @mayurinehate in #5499
- fix(ingest): cleanup unused flake8 noqa statements by @hsheth2 in #5492
- refactor(ci): refactor Docker build-and-push workflows by @hsheth2 in #5494
- docs(slack) Update to Slack guidelines by @maggiehays in #5504
- feat(cli): dele...
v0.8.41
Highlights
User Experience
- Performance improvements in the UI
- Improvements in CSV connector for easier ingestion - description, ownership, domain support added
- UI form for Snowflake Managed Ingestion so you don't have to make changes in YAML
- Viewing Siblings
Developer Experience
- Ability to stop quickstart instead of nuking
- Customizing mapped ports in quickstart
- New models for dashboard usage
- Circuit breaker and python api for Assertion and Operation
Metadata Ingestion
- Improvements in bigquery connector to only profile some tables
- Intermittent 401 errors during ingestion fixed
- New salesforce connector
What's Changed
- fix(test): add cleanup in tests, make urls configurable by @anshbansal in #5287
- fix(docs,quickstart): release related changes for 0.8.40 by @anshbansal in #5299
- [Deployment]: fix config typo on confluent cloud by @tengis in #5293
- fix(cli): suppress secrets in stacktraces by @anshbansal in #5302
- refactor(ui): Fix settings page divider by @jjoyce0510 in #5292
- fix(cli): timeline - category should be owner not ownership by @shirshanka in #5304
- perf(siblings): reduce data fetched by siblings in lineage by @gabe-lyons in #5308
- fix(ingest): bigquery - Fix for bigquery error when there was no bigquery catalog specified by @treff7es in #5303
- fix(ui) Fix entity profile sidebar width issues by @chriscollins3456 in #5305
- perf(search): Improve search default performance by @jjoyce0510 in #5311
- perf(ui): Performance improvements and misc refactorings in the UI by @jjoyce0510 in #5310
- Modified the drop down of Menu Items by @Ankit-Keshari-Vituity in #5301
- fix(validation) Fail validation error silently instead of crashing by @chriscollins3456 in #5314
- feat(docs) Add documentation on authorization & authentication by @pedro93 in #5265
- fix(ui) Make profile icon clickable to expand header menu by @chriscollins3456 in #5317
- refactor(ui): Extract searchable page into its own component (perf + ux) by @jjoyce0510 in #5318
- fix(gms) Remove auto-creating status aspect if not present when ingesting by @pedro93 in #5315
- fix(ui): Add missing SearchRoutes component by @jjoyce0510 in #5321
- feat(ingest): Ingest Looker dashboard create/update/delete timestamps by @mayurinehate in #5312
- fix(ui): Fix pipeline tasks list loading by @jjoyce0510 in #5332
- feat(ingest): lookml - adding support for only emitting reachable vie… by @shirshanka in #5333
- fix(ingest): omit schema fields when name is absent by @mayurinehate in #5275
- fix(siblings) Combine siblings data but remove duplicate data by @chriscollins3456 in #5337
- Fix typo in metadata-ingestion.md by @dougpm in #5338
- fix(me) Cache the me query for performance reasons by @chriscollins3456 in #5316
- fix(tokens) Adds non-admin tests for access tokens by @pedro93 in #5174
- feat(bigquery): support size, rowcount, lastmodified based table selection for profiling by @MugdhaHardikar-GSLab in #5329
- chore: Refactor Python Codebase by @koconder in #5113
- docs(bigquery): profiling report enhancement by @MugdhaHardikar-GSLab in #5342
- feat(ingest): update CSV source to support description and ownership type by @aditya-radhakrishnan in #5346
- Fixed UI issue: Tags list going outside the container by @Ankit-Keshari-Vituity in #5341
- feat(ingest): add salesforce connector by @mayurinehate in #5104
- feat(bootstrap): create abstract class UpgradeStep to abstract away upgrade logic by @aditya-radhakrishnan in #5349
- fix(bigquery-usage): dataset name fix for sharded tables by @MugdhaHardikar-GSLab in #5347
- docs(features): update grammar on Features overview by @maggiehays in #5350
- fix(ci): fix mysql and kafka-connect ingestion test by @shirshanka in #5352
- feat(ui): add copy function for stats table sample value by @ngamanda in #5331
- fix(ui) Correct show/hide tabs in Settings based on privileges by @chriscollins3456 in #5355
- fix(siblings): add useMutationUrn to domain section by @gabe-lyons in #5270
- feat(schema) Show last observed timestamp in the schema tab by @chriscollins3456 in #5348
- fix(glossary) Fixes a bug for yaml ingested terms without source_url by @chriscollins3456 in #5356
- feat(lineage) Add Lineage tab to Chart and Dashboard entity profiles by @chriscollins3456 in #5357
- fix(cassandra): fix Cassandra queries used by IngestDataPlatformInstancesStep by @justinas-marozas in #5199
- refactor(ui): Use createTag mutation for creating new tags from the UI by @jjoyce0510 in #5359
- Added recommendation on group modal by @Ankit-Keshari-Vituity in #5362
- refactor(ui): Remove unnecessary fields in GraphQL queries by @jjoyce0510 in #5358
- feat(ingest) - add audit actor urn to auditStamp by @neojunjie in #5264
- feat(ingest): Domain ingestion usability by @shirshanka in #5366
- fix(config): fixes config key in DataHubAuthorizerFactory by @sgomezvillamor in #5371
- fix(ingest): domains - check whether urn based domain exists during r… by @shirshanka in #5373
- feat(quickstart): Adding env variables and cli options for customizing mapped ports in quickstart by @NavinSharma13 in #5353
- fix(build): tweak ingestion build by @anshbansal in #5374
- feat(query) Add get_entity_v2 to python package by @aezomz in #5255
- fix(airflow): Fix for failing serialisation when Param was specified + support for external task sensor by @treff7es in #5368
- fix(users): fix to not get invite token unless the invite token modal is visible by @aditya-radhakrishnan in #5380
- fix(gms): Propagate token cache error by @pedro93 in #5381
- fix(bootstrap): skip ingesting data platforms that already exist by @aditya-radhakrishnan in #5382
- fix(cli): respect server telemetry settings correctly by @treff7es in #5384
- fix(ingest): bigquery - Graceful bq partition id date parsing failure by @treff7es in #5386
- feat(airflow): Circuit breaker and python api for Assertion and Operation by @treff7es in #5196
- feat(kafka-setup): add options for sasl_plaintext by @abiwill in #5385
- fix(bigquery): multi-project GCP setup run query through correct project by @anshbansal in #5393
- fix(bigquery): add storage project name by @anshbansal in #5395
- Add Changes to support smoke test on Datahub deployed on kubernetes Cluster by @NavinSharma13 in #5334
- fix(PlayCookie) PLAY_TOKEN cookie rejected because userprofile exceeds 4096 chars by @neojunjie in #5114
- feat(dashboards): add datasets field to DashboardInfo aspect by @Masterchen09 in #5188
- feat(siblings): allow viewing siblings separately by @gabe-lyons in #5390
- Added Cursor pointer to tags by @Ankit-Keshari-Vituity in #5389
- feat(GMS): Adding Dashboard Usage Models by @jjoyce0510 in #5399
- fix(q...