Version 1.8 Release Notes
David Freels Sr edited this page Jul 6, 2022 · 20 revisions
(#174) Split Step
The split and merge step types let pipeline designers run different step sequences in parallel. The merge step marks where the parallel executions stop and normal processing resumes.
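The split/merge idea can be illustrated with a small, library-independent sketch. This is not the Metalus API: `run_sequence`, `split_and_merge`, and the branch names are hypothetical and only model "run several step sequences in parallel, then wait for all of them before continuing".

```python
from concurrent.futures import ThreadPoolExecutor


def run_sequence(steps, value):
    """Run one step sequence in order, feeding each result to the next step."""
    for step in steps:
        value = step(value)
    return value


def split_and_merge(branches, value):
    """Run every branch in parallel (split), then wait for all of them (merge)."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = {
            name: pool.submit(run_sequence, steps, value)
            for name, steps in branches.items()
        }
        # Merge point: block until every branch has produced its result.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical branches; each is an ordered list of plain functions.
branches = {
    "double_then_inc": [lambda x: x * 2, lambda x: x + 1],
    "square": [lambda x: x * x],
}
results = split_and_merge(branches, 5)
```

Processing after the merge point would then continue with `results`, which holds one entry per branch.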
- (#208) Allow Embedded Fork Steps
(#232) Delta Lake Steps
The Metalus Delta Lake project includes step objects for updating, deleting, and merging Delta Lake data sources.
- (#203) Create Application and Execution Template Metadata Extractor
- (#206) Metadata Extract Fails for certain versions
- (#204) Metalus GCP fails during Step Metadata Extraction
- (#205) Maven Dependency Resolver Should verify MD5 if available
- (#209) Implement Custom Form Support for Metadata Extractors
- Added scopes to the metalus-aws dependencies json to allow easier classpath configuration
- (#215) Streaming Drivers Not Stopping on Failure
- (#211) Implement Event Based PipelineListener (Kinesis, Kafka and Pub/Sub Implementations)
- (#211) Created CombinedPipelineListener to allow more than one listener within an application
- (#202) Expose results from streaming executions to be shared between runs
- Fixed an issue with the KinesisPipelineDriver where the consumer-streams could not be parsed properly
- Audit report is now printed out by the DefaultPipelineListener
- Added AWS and GCP Secrets Manager Credential Providers
- Added CredentialSteps to make working with credentials easier
- (#212) Added Credential Mapping (%) character
- Added step to authenticate (S3) DataFrame outside of normal steps.
- Added a new step object, CatalogSteps, that exposes many of the methods in the Spark Catalog class.
- (#233) Added support for accessing Arrays and Lists elements via index in pipeline mappings
- New step functions added to DataSteps: count, rename column and drop duplicate records
- Added Spark configuration and Spark settings to ExecutorAudits
- (#175) Remove list/object parsing from mapByValue
- (#207) Remove Support for Unsupported Spark (Spark 2.3 and Spark 2.4 with Scala 2.12)
- (#231) Support auto-casting primitives in PipelineStepMapper.castToType
- Spark 2.4 Scala 2.12 has been restored
- Spark 3.1 Scala 2.12 has been added
- New execution templates and pipelines have been added to ingest data into Bronze
- Added support for role based authentication to the AWSCredential trait.
- Added support for role based authentication to the KinesisPipelineDriver.
- (#238) Inline string concatenation fails when mapped parameters are wrapped in multiple options
- (#243) Comma in datatype of scalascript type parameter causes issues
- (#248) Added the ability to escape mapping characters within string templates
- Added PipelineContext to exceptions
- Performance improvements for the S3OutputStream
- (#247) Fixed an issue related to consumerStreams with KinesisPipelineDriver.
- Created an experimental BigQuerySteps step object
- (#255) New Connectors Architecture
- (#252) Generic read and write step that uses the new connectors architecture
- (#258) Create a new copy pipeline and load to bronze that uses the new connector architecture
- (#254) Generic Retry for step on error and a step that will retry a specified number of times.
- (#259) Spark Configuration Steps
- (#256) Streaming Pipeline Drivers (Kinesis and GCP PubSub) now take a credential name from the command line
- (#260) Moved Fork Step value validation to allow the forkMethod to be mapped
- (#253) Secrets Manager Credential Providers not adding default credential parser
- (#257) Fork Step now allows a limit on the number of concurrent threads to be used in parallel
- #263 JDBCDataConnector
- #266 Validated streaming connectors
- #270 Fixes to streaming connectors
- #272 Updated DefaultPipelineDriver to simulate a streaming driver when using structured streaming
- #275 Added ability to start an application at specific executions
- #274 Implemented recursive file listings
- #267 Fixed an issue with the Big Query dependency
- #268 Fixed an issue with the Big Query dependency
- #269 Fixed a bug with application globals mapping introduced in 1.8.3
- #292 Implemented Advanced Execution flow control by supporting execution forking and conditional executions.
- Enhanced FileManagerSteps with additional steps.
- #278 Fixed issues with MongoDataConnector
- #279 SplitFlow now pushes global updates
- #281 Fixes made while testing metalus-gcp against GCP Dataproc and Databricks GCP
- #284 Moved JavascriptSteps and ScalaSteps to the core library
- Added new ExceptionSteps to make throwing exceptions in a pipeline easier.
- #286 Added support for nullable fields in the Schema object
- Bugfix related to S3OutputStream not properly draining the buffer.
- #308 Allow steps to register global links
- Added ability to pass hints to the JSON4S deserializer code within applications
- Added new getStatus function to FileManager and implementations
- Added a new JSON API Connector for basic data interactions
- Created a new step to determine if a value is empty
- Updated Schema to support long, float and boolean types
- Pipeline now includes a description field
- #303 Additional bug fixes for BigQuery support
- #306 Enhancements to streaming support
- Added streaming monitor pipeline to allow executions to easily implement streaming connectors
- Fixed an issue where execution forks do not run the join pipelines
- Added a new property named executionForkValueIndex to each fork execution to indicate the index within the value list of the current execution
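As a rough illustration of the fork-related items above (a cap on concurrent threads and a per-execution index into the value list), the following sketch models the behavior in plain Python. Only the property name executionForkValueIndex comes from these notes; `run_fork`, `max_threads`, and the context dictionary are assumptions made for the example and do not reflect the actual Metalus implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def run_fork(values, step, max_threads):
    """Run `step` once per value, using at most `max_threads` worker threads."""

    def run_one(index, value):
        # Each forked execution is told which position in the value list it handles.
        context = {"executionForkValueIndex": index}
        return step(value, context)

    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        futures = [pool.submit(run_one, i, v) for i, v in enumerate(values)]
        # Collect results in the original value order, not completion order.
        return [future.result() for future in futures]


def label(value, context):
    """Hypothetical step that tags a value with its fork index."""
    return f"{context['executionForkValueIndex']}:{value}"


labels = run_fork(["a", "b", "c"], label, max_threads=2)
```

Because results are gathered in submission order, a downstream join can match each result to its source value through the index.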