[Integration][Spark] Support adding arbitrary parameters to OpenLineage URL (apache#425)

* Support Arbitrary Parameters in Lineage URL

  Supports extracting parameters from config: spark.openlineage.url
  Supports extracting parameters from config: spark.openlineage.url.param.xxx

  Users can now pass along additional query / url parameters to the openlineage url that is used when emitting lineage. This is useful for passing additional parameters necessary for a non-marquez destination of OpenLineage metadata.

  If using spark.openlineage.url.param.xxx, xxx represents the name of the url parameter you want to include in the lineage url. Any configuration variable passed in beginning with spark.openlineage.url.param. will be used as a url parameter in the lineage url.

  These config settings will ignore api_key if it is specified as a url.param to avoid conflicting with the spark.openlineage.apiKey config.

  Signed-off-by: Will Johnson <[email protected]>

* Adding more idiomatic Java syntax
  Signed-off-by: Will Johnson <[email protected]>

* Even more idiomatic java 8 syntax
  Signed-off-by: Will Johnson <[email protected]>

* [spark] support read/write to kafka (apache#387)

* adding initial test for spark kafka integration (apache#279)

* adding initial test for spark kafka integration
  Signed-off-by: tomassatka <[email protected]>

* [spark] Resolves: apache#280 kafka support
  Signed-off-by: olek <[email protected]>

* [spark] Resolves: apache#280 kafka support
  Signed-off-by: olek <[email protected]>

* [spark] Resolves: apache#280 remove debug line
  Signed-off-by: olek <[email protected]>

* [spark] Resolves: apache#280 javadoc
  Signed-off-by: olek <[email protected]>

* [spark] Resolves: apache#280 review
  Signed-off-by: olek <[email protected]>

* [spark] Resolves: apache#280 review
  Signed-off-by: olek <[email protected]>

* Moved KafkaWriter handling into KafkaRelationVisitor and added support for assign in kafka consumer parsing
  Signed-off-by: Michael Collado <[email protected]>

* Add check for hive classes to avoid NoClassDefFoundErrors
  Signed-off-by: Michael Collado <[email protected]>

* Fix build for kafka integration (apache#411)

* Fix integration tests step to run on raw machine to enable docker tests
  Signed-off-by: Michael Collado <[email protected]>

* Updated kafka tests to use common version and common reference to kafka package
  Signed-off-by: Michael Collado <[email protected]>

* Added reset() call to mock server to avoid results across tests
  Signed-off-by: Michael Collado <[email protected]>

* Added String only constructor to HttpError
  Signed-off-by: Michael Collado <[email protected]>

* Changed build to copy dependencies and use downloaded jars in Spark containers
  Signed-off-by: Michael Collado <[email protected]>

* Fix spark integration test build to use explicit 3.1.2 version
  Signed-off-by: Michael Collado <[email protected]>
  Co-authored-by: Tomas Satka <[email protected]>
  Co-authored-by: Michael Collado <[email protected]>
  Co-authored-by: Michael Collado <[email protected]>
  Signed-off-by: Will Johnson <[email protected]>

* dbt: support dbt build command (apache#398)
  Signed-off-by: Maciej Obuchowski <[email protected]>
  Signed-off-by: Will Johnson <[email protected]>

* great expectations: pin version to the one supported by airflow operator (apache#420)
  Signed-off-by: Maciej Obuchowski <[email protected]>
  Signed-off-by: Will Johnson <[email protected]>

* [SPARK] adding output metrics (apache#361)

* [spark] Resolves: apache#304 adding task metrics collector
  Signed-off-by: olek <[email protected]>

* [spark] Resolves: apache#304 remove OutputDatasetWithMetadataVisitor.java and deprecated "OutputStatisticsFacet.java"
  Signed-off-by: olek <[email protected]>

* [spark] Resolves: apache#304 add entry in CHANGELOG.md
  Signed-off-by: olek <[email protected]>

* [spark] Resolves: apache#304 add test and fix test
  Signed-off-by: olek <[email protected]>

* [spark] Resolves: apache#304 fix test
  Signed-off-by: olek <[email protected]>

* [spark] Resolves: apache#304 review comments
  Signed-off-by: olek <[email protected]>

* [spark] Resolves: apache#304 review comments
  Signed-off-by: olek <[email protected]>

* Fix references to JobMetricsHolder to use singleton, default metric values to null if not present
  Signed-off-by: Michael Collado <[email protected]>
  Co-authored-by: Michael Collado <[email protected]>
  Signed-off-by: Will Johnson <[email protected]>

* dbt: filter non-test nodes while processing assertions (apache#422)
  Signed-off-by: Maciej Obuchowski <[email protected]>
  Signed-off-by: Will Johnson <[email protected]>

* Updating OpenLineage spark integration README with new param
  Signed-off-by: Will Johnson <[email protected]>

* Fixing spotless checks
  Signed-off-by: Will Johnson <[email protected]>

Co-authored-by: OleksandrDvornik <[email protected]>
Co-authored-by: Tomas Satka <[email protected]>
Co-authored-by: Michael Collado <[email protected]>
Co-authored-by: Michael Collado <[email protected]>
Co-authored-by: Maciej Obuchowski <[email protected]>
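
The spark.openlineage.url.param.xxx behaviour described in the first bullet can be sketched roughly as follows. This is only an illustration of the idea, not the code from this commit: the class name LineageUrlParams, both method names, and the use of a plain Map in place of the Spark configuration are hypothetical, and parsing of parameters embedded directly in spark.openlineage.url is not covered.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper; the real integration reads these values from the Spark conf.
final class LineageUrlParams {
    static final String PARAM_PREFIX = "spark.openlineage.url.param.";

    // Collect every "spark.openlineage.url.param.<name>" entry as <name> -> value,
    // skipping api_key so it cannot conflict with spark.openlineage.apiKey.
    static Map<String, String> extractUrlParams(Map<String, String> conf) {
        return conf.entrySet().stream()
            .filter(e -> e.getKey().startsWith(PARAM_PREFIX))
            .filter(e -> e.getKey().length() > PARAM_PREFIX.length())
            .filter(e -> !"api_key".equals(e.getKey().substring(PARAM_PREFIX.length())))
            .collect(Collectors.toMap(
                e -> e.getKey().substring(PARAM_PREFIX.length()),
                Map.Entry::getValue,
                (first, second) -> second,
                TreeMap::new)); // sorted by name so the output order is stable
    }

    // Join the parameters into a URL-encoded query string (without the leading '?').
    static String toQueryString(Map<String, String> params) {
        return params.entrySet().stream()
            .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
            .collect(Collectors.joining("&"));
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

Under these assumptions, a configuration containing spark.openlineage.url.param.code=abc, spark.openlineage.url.param.tenant=t1 and spark.openlineage.url.param.api_key=secret would produce the query string code=abc&tenant=t1, with api_key dropped.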