update FW branch with latest main, to prep for testing #133
Merged
…led without packages.
* Update CHANGELOG and VERSION.
* Hotfix/optional parquet sources (#86)
* Update optional file check in FileSource to build an empty dataframe if an empty folder is passed.
* Remove explicit file check in compile.
* Re-add filesize check in FileSource.execute().
* Move FtpSource connect from compile to execute.
* Fix attribute naming bug.
* Fix bug.
* Allow filepaths to be passed in optional FileSources, and check the existence of the path before loading the dataframe.
* Update CHANGELOG.
* fix add_columns typo in readme
* update changelog
* Feature/union all columns (#94)
* Add 'fill_missing' optional field to UnionOperation that uses default Pandas concat logic without erroring out. Still raise a debug message when applicable.
* Rename new field to 'fill_missing_columns' for clarity.
* Update dataframe.py Rename fill_missing_columns to fill_missing.
* Update dataframe.py
* Update CHANGELOG.md
* Update CHANGELOG.md
* Rename UnionOperation's fill_missing field to fill_missing_columns; update README.
* Git clone timeout when running `earthmover deps` (#93)
* try using subprocess with timeout
* Update error message
* tweak timeouts
* switch to makedirs
* don't error if dir already exists
* remove package path on failure
* adjust deletes
* typo
* switch to rmtree
* remove gitpython dependency
* remove unused import
* remove unused var
* add optional git timeout config
* reverse accidentally removed kwargs
* add notes on git_auth_timeout config to readme
* code cleanup
* Update README.
---------
Co-authored-by: jayckaiser <[email protected]>
* Update changelog.
* Fix escape chars in output when `linearize: False` (#98)
* fixes a bug where escape characters were present in the output file when linearize is False
* remove unneeded Dask import
* update return value and comment based on notes from Jay
---------
Co-authored-by: Tom Reitz <[email protected]>
* fixing a bug introduced in the last version where nested JSON would be loaded as a stringified Python dictionary, which is difficult to use in downstream Jinja (#97)
Co-authored-by: Tom Reitz <[email protected]>
* Only write `earthmover_compiled.yaml` on compile, not run (#91)
* only write to disk on compile, not run
* update readme with change to earthmover_compiled.yaml
* Add `earthmover clean` command and some CLI error handling (#87)
* add 'clean' command and clean up CLI messaging
* comment justifying dictionary
* update changelog
* remove skip_mkdir, make compiled_yaml_file a class attribute
* replace dict with list of constants
---------
Co-authored-by: Jay Kaiser <[email protected]>
* Update CHANGELOG with new features.
* Fix `__row_data__` in `add_columns` and `modify_columns` operations (#99)
* fix __row_data__ in Jinja expressions of add_columns and modify_columns operations
* update how __row_data__ is added to prevent an error about modifying row
---------
Co-authored-by: Tom Reitz <[email protected]>
* Feature: Refactor Destination Execute (#95)
* Update config parsing to use ErrorHandler.assert_get_key() for all fields; move and unify Jinja template processing to execute.
* Update destination.py
* Update CHANGELOG.
* makes destination template optional (#88)
* makes destination template optional; when not specified, each row is turned into a JSON object where column names become object properties
* implement changes based on feedback from Jay
* bugfix
* Minor cleanup.
---------
Co-authored-by: Tom Reitz <[email protected]>
Co-authored-by: jayckaiser <[email protected]>
* Update CHANGELOG.
* adding the `debug` operation (#100)
* adding debug operation
* Update dataframe.py Refactor code to improve readability and reference to existing Node attributes.
---------
Co-authored-by: Tom Reitz <[email protected]>
Co-authored-by: Jay Kaiser <[email protected]>
* Use Node.full_name in Node.check_expectations(), instead of redefining the string manually.
* Update CHANGELOG.
* Feature/flatten operation whitespace cleanup (#101)
* adding a flatten_operation
* README tweak
* implement changes based on feedback from Jay
* Clean up comments and whitespace in new FlattenOperation.
* Add print statements to debug tuple problem.
* Minor cleanup.
* Minor cleanup.
* Add single quotes to strip and trim variables in FlattenOperation.
* Fix single quote representation in trim_whitespace.
---------
Co-authored-by: Tom Reitz <[email protected]>
* Update CHANGELOG.
---------
Co-authored-by: johncmerfeld <[email protected]>
Co-authored-by: Samantha LeBlanc <[email protected]>
Co-authored-by: Tom Reitz <[email protected]>
Co-authored-by: Tom Reitz <[email protected]>
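For reviewers unfamiliar with the git clone timeout change (#93), here is a rough sketch of the pattern the commit messages describe: run the clone through `subprocess` with a timeout, create the target directory with `makedirs`, and remove the package path with `rmtree` on failure. The function name `clone_with_timeout` and its parameters are hypothetical and are not taken from earthmover's code.

```python
import os
import shutil
import subprocess


def clone_with_timeout(command, package_path, timeout_seconds=60):
    """Run `command` (e.g. a git clone) and clean up `package_path` on failure.

    Hypothetical helper for illustration only; not earthmover's implementation.
    """
    # exist_ok=True means an already-existing directory is not an error.
    os.makedirs(package_path, exist_ok=True)
    try:
        subprocess.run(
            command,
            check=True,                 # non-zero exit raises CalledProcessError
            timeout=timeout_seconds,    # a hung process raises TimeoutExpired
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        # Remove the partially populated package directory.
        shutil.rmtree(package_path, ignore_errors=True)
        return False
    return True
```

A call might look like `clone_with_timeout(["git", "clone", url, path], path, timeout_seconds=60)`, where `url` and `path` are supplied by the caller.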
Hotfix: Resolve incompatible dependencies
updating fix with latest main
fix nested json not working when rendering destination templates
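As background on why nested JSON is awkward once it has been stringified as a Python dictionary (the problem named in #97), the sketch below uses only the standard library. It does not reproduce earthmover's loading or rendering code, and the sample record is made up for illustration.

```python
import json

# Made-up sample record with a nested object.
record = {"name": {"first": "Ada", "last": "Lovelace"}}

# str() produces Python repr syntax (single quotes), which is not valid JSON.
stringified = str(record["name"])

# json.dumps produces text that can be parsed back into a nested structure.
serialized = json.dumps(record["name"])


def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(stringified))       # the repr string fails to parse
print(is_valid_json(serialized))        # the JSON string parses
print(json.loads(serialized)["first"])  # nested keys are reachable again
```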
* Simplify code in FileDestination.render_row() to improve readability.
* Change FileDestination write logic to compute and write each partition, instead of mapping writes over rows.
* Update CHANGELOG and VERSION in preparation for patch.
…type()` and genericize logic using class attributes.
…eration (#106) Co-authored-by: Tom Reitz <[email protected]>
* init
* wip
* wip
* populate starter project
* add error message for invalid names
* reset requirements
* remove memory limit
* need to test on windows
* be more explicit about mkdir
* be more explicit about mkdir
* be more explicit about mkdir
* fix base_dir issues
* cleanup
* remove formatting changes from main
* remove formatting changes from main
* update init readme
* fix typo
* add comment
…_header_footer add support for Jinja in a destination node header and footer
update version and changelog for 0.3.7 release
…ooter Jinja renders on destination write.
…ation_headers Hotfix: Refactor Jinja Destination Headers and Footers
update version and changelog for bugfix release
Feature: Add support for Python 3.12, latest versions of Dask
adds a `--set` flag to the cli to enable overriding values in _compiled_ `earthmover.yml`
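One way to picture overriding values in a compiled configuration is to apply key/value pairs to a nested dictionary. The sketch below assumes dotted key paths such as `destinations.my_dest.extension`; the commit message does not state the flag's argument format, so that syntax, the function name `apply_overrides`, and the sample config are all assumptions rather than earthmover's actual behavior.

```python
def apply_overrides(config, overrides):
    """Apply {"dotted.key.path": value} overrides to a nested dict in place.

    Hypothetical helper; assumes every intermediate value is itself a dict.
    """
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = config
        for key in parents:
            # Create intermediate levels that do not exist yet.
            node = node.setdefault(key, {})
        node[leaf] = value
    return config


# Made-up config for illustration.
config = {"destinations": {"my_dest": {"extension": "jsonl"}}}
apply_overrides(config, {
    "destinations.my_dest.extension": "ndjson",
    "config.log_level": "DEBUG",
})
print(config)
```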
bump version to 0.4.0
No description provided.