- 423 Add seaborn as a domain level extension for visualization
- 422 Add pandas_df.plot as the first namespace extension
- 421 Add the namespace concept to Fugue extensions
- 420 Add is_distributed to engines
- 419 Log transpiled SQL query upon error
- 384 Expanding Fugue API
- 410 Unify Fugue SQL dialect (syntax only)
- 409 Support arbitrary column names in Fugue
- 404 Ray/Dask engines guess optimal default partitions
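The "guess optimal default partitions" idea can be pictured as a simple size-and-CPU heuristic. The function below is an illustrative sketch with made-up thresholds, not the engines' actual logic:

```python
import math
import os

def guess_npartitions(nrows, target_rows=1_000_000):
    """Toy heuristic (not Ray's/Dask's actual logic): aim for roughly
    target_rows rows per partition, use at least one partition per CPU
    for large inputs, and never create more partitions than rows."""
    cpus = os.cpu_count() or 1
    by_size = max(1, math.ceil(nrows / target_rows))
    return min(max(by_size, cpus), max(nrows, 1))

print(guess_npartitions(100))         # small input: bounded by row count
print(guess_npartitions(50_000_000))  # large input: driven by data size
```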
- 403 Deprecate register_raw_df_type
- 392 Fixed intermittent failures of aggregations on Spark dataframes
- 398 Rework API Docs and Favicon
- 393 ExecutionEngine as_context
- 385 Remove DataFrame metadata
- 381 Change SparkExecutionEngine to use pandas udf by default
- 380 Refactor ExecutionEngine (Separate out MapEngine)
- 378 Refactor DataFrame show
- 377 Create bag
- 372 Infer execution engine from input
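Engine inference can be pictured as type-based dispatch on the inputs. The mapping below is a toy sketch; `FakeSparkDF` and the engine names are illustrative, not Fugue's real resolution logic:

```python
# Toy sketch: pick an engine name from the input dataframe's type,
# mirroring the idea of inferring the execution engine from inputs
# instead of requiring the user to specify it explicitly.
def infer_engine(df):
    module = type(df).__module__.split(".")[0]
    mapping = {"pandas": "native", "pyspark": "spark", "dask": "dask"}
    return mapping.get(module, "native")

class FakeSparkDF:
    pass

FakeSparkDF.__module__ = "pyspark.sql"  # pretend it came from pyspark

print(infer_engine([1, 2, 3]))      # builtins -> native
print(infer_engine(FakeSparkDF()))  # -> spark
```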
- 340 Migrate to plugin mode
- 369 Remove execution from FugueWorkflow context manager, remove engine from FugueWorkflow
- 373 Fixed Spark engine rename slowness when there are a lot of columns
- 362 Remove Python 3.6 Support
- 363 Create IbisDataFrame and IbisExecutionEngine
- 364 Enable Map type support
- 365 Support column names starting with numbers
- 361 Better error message for cross join
- 345: Enabled file as input/output for transform and out_transform
- 326: Added tests for Python 3.6 - 3.10 for Linux and 3.7 - 3.9 for Windows. Updated devenv and CICD to Python 3.8.
- 321: Moved out Fugue SQL to https://github.com/fugue-project/fugue-sql-antlr, removed version cap of antlr4-python3-runtime
- 323: Removed version cap of DuckDB
- 334: Replaced RLock with SerializableRLock
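The idea behind a serializable lock fits in a few lines: exclude the lock from the pickled state and recreate a fresh one on load. This is a simplified stand-in for triad's `SerializableRLock`, not its actual implementation:

```python
import pickle
import threading

class SerializableRLock:
    """Sketch of the serializable-lock idea: a plain threading.RLock
    cannot be pickled, so drop it from the pickled state and build a
    fresh lock when the object is deserialized."""
    def __init__(self):
        self._lock = threading.RLock()

    def __enter__(self):
        return self._lock.__enter__()

    def __exit__(self, *exc):
        return self._lock.__exit__(*exc)

    def __getstate__(self):
        return {}  # the lock itself is never serialized

    def __setstate__(self, state):
        self._lock = threading.RLock()

lock = SerializableRLock()
with lock:
    copied = pickle.loads(pickle.dumps(lock))  # works even while held
with copied:
    pass
```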
- 337: Fixed index warning in fugue_dask
- 339: Migrated execution engine parsing to triad conditional_dispatcher
- 341: Added Dask Client to DaskExecutionEngine, and fixed bugs of Dask and Duckdb
- Create a hybrid engine of DuckDB and Dask
- Save Spark-like partitioned parquet files for all engines
- Enable DaskExecutionEngine to transform dataframes with nested columns
- A smarter way to determine default npartitions in Dask
- Support even partitioning on Dask
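"Even partitioning" means splitting rows into partitions whose sizes differ by at most one. The helper below illustrates that goal; it is not Dask's or Fugue's internal code:

```python
def even_partitions(n_items, n_partitions):
    """Split n_items into n_partitions whose sizes differ by at most 1,
    e.g. 10 items into 3 partitions -> [4, 3, 3]."""
    base, extra = divmod(n_items, n_partitions)
    return [base + (1 if i < extra else 0) for i in range(n_partitions)]

print(even_partitions(10, 3))  # [4, 3, 3]
```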
- Add handling of nested ArrayType on Spark
- Change to plugin approach to avoid explicit import
- Fixed Click version issue
- Added version caps for antlr4-python3-runtime and duckdb as they both released new versions with breaking changes.
- Make Fugue exceptions short and useful
- Ibis integration (experimental)
- Get rid of simple assignment (not used at all)
- Improve DuckDB engine to use a real DuckDB ExecutionEngine
- YIELD LOCAL DATAFRAME
- Add an option to transform to turn off native dataframe output
- Add callback parameter to `transform` and `out_transform`
- Support DuckDB
- Create fsql_ignore_case for convenience, make this an option in notebook setup
- Make Fugue SQL error more informative about case issue
- Enable pandas default SQL engine (QPD) to take lower case SQL
- Change pickle to cloudpickle for Flask RPC Server
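Context for the switch: standard `pickle` cannot serialize lambdas and closures, which callback handlers often are, while `cloudpickle` can. A stdlib-only demonstration of the limitation:

```python
import pickle

# Plain pickle fails on a lambda because it serializes functions by
# reference (module + qualified name), and "<lambda>" cannot be looked
# up on unpickling. cloudpickle serializes the code object instead.
callback = lambda x: x + 1
try:
    pickle.dumps(callback)
    ok = True
except Exception:
    ok = False
print("plain pickle handled the lambda:", ok)
# cloudpickle.dumps(callback) would succeed (requires the cloudpickle package)
```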
- Add license to package
- Parsed arbitrary object into execution engine
- Made Fugue SQL accept `+`, `~`, `-` in schema expression
- Fixed transform bug for Fugue DataFrames
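A toy interpreter shows what `+`/`~`/`-` style schema operations can mean. This is an illustrative sketch only; the operator semantics assumed here (add / drop-if-exists / drop) are not Fugue's actual schema grammar:

```python
# Hypothetical semantics for schema-expression operators:
#   +name:type  add a column
#   -name       remove a column (must exist)
#   ~name       remove a column if it exists
def apply_schema_ops(schema, ops):
    schema = dict(schema)  # name -> type
    for op in ops:
        if op.startswith("+"):
            name, typ = op[1:].split(":")
            schema[name] = typ
        elif op.startswith("-"):
            del schema[op[1:]]
        elif op.startswith("~"):
            schema.pop(op[1:], None)
    return schema

print(apply_schema_ops({"a": "int"}, ["+b:str", "~c", "-a"]))  # {'b': 'str'}
```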
- Fixed a very rare bug of annotation parsing
- Added Select, Aggregate, Filter, Assign interfaces
- Made compatible with Windows, added GitHub Actions to test on Windows
- Register built-in extensions
- Accept platform dependent annotations for dataframes and execution engines
- Let SparkExecutionEngine accept empty pandas dataframes
- Move to codecov
- Let Fugue SQL take input dataframes with name such as a.b
- Dask repartitioning improvement
- Separate Dask IO to use its own APIs
- Improved Dask print function by adding back head
- Made `assert_or_throw` lazy
- Improved notebook setup handling for Jupyter Lab
- HOTFIX avro support
- Added built-in Avro support
- Fixed dask print bug
- Added Codacy and Slack channel badges, fixed pylint
- Created transform and out_transform functions
- Added partition syntax sugar
- Fixed FugueSQL `CONNECT` bug
- Fugueless
- Notebook experience and extension
- NativeExecutionEngine: switched to use QPD for SQL
- Spark pandas udf: migrate to applyInPandas and mapInPandas
- Fixed SparkExecutionEngine `take` bug
- Fugue SQL: PRINT ROWS n -> PRINT n ROWS|ROW
- Refactor yield
- Fixed Jinja templating issue
- Changed `_parse_presort_exp` from a private function to a public one
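A simplified stand-in shows what a presort-expression parser does; the real, now-public function lives in Fugue's utilities and handles far more edge cases:

```python
def parse_presort(expr):
    """Parse a presort expression like "a ASC, b DESC" into
    [(column, ascending)] pairs. A minimal sketch; a column with
    no direction defaults to ascending."""
    result = []
    for part in expr.split(","):
        tokens = part.strip().split()
        col = tokens[0]
        ascending = len(tokens) == 1 or tokens[1].lower() == "asc"
        result.append((col, ascending))
    return result

print(parse_presort("a ASC, b DESC, c"))  # [('a', True), ('b', False), ('c', True)]
```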
- Changed the noisy "failure to delete execution temp directory" message to info level
- Limit and Limit by Partition
- Fixed README code so it works as written
- Limit was renamed to take and added to SQL interface
- RPC for Callbacks to collect information from workers in real time
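The callback idea can be sketched without the RPC machinery: workers report through a callable and the driver aggregates in real time. `ProgressCollector` below is hypothetical; Fugue's actual mechanism routes such callbacks over RPC so remote workers (e.g. Spark executors) can reach the driver:

```python
import threading

class ProgressCollector:
    """Toy driver-side aggregator: workers call it with a count and it
    accumulates under a lock, so results arrive while work is running."""
    def __init__(self):
        self._lock = threading.Lock()
        self.rows_done = 0

    def __call__(self, n):
        with self._lock:
            self.rows_done += n

collector = ProgressCollector()

def worker(partition, callback):
    # ... process the partition, then report progress ...
    callback(len(partition))

for part in [[1, 2], [3], [4, 5, 6]]:
    worker(part, collector)
print(collector.rows_done)  # 6
```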
- Changes in handling input dataframe determinism. This fixes a bug related to thread locks with Spark DataFrames because of a deepcopy.
- sample function
- Make CSV schema inference consistent across engines
- Make file loading more consistent across engines
- Support `**kwargs` in interfaceless extensions
- Support `Iterable[pd.DataFrame]` as output type
- Alter column types
- RENAME in Fugue SQL
- CONNECT different SQL service in Fugue SQL
- Fixed Spark EVEN REPARTITION issue
- Add hook to print/show
- Fixed import issue with OutputTransformer
- Added fillna as a built-in transform, including SQL implementation
- Extension validation interface and interfaceless syntax
- Passing dataframes cross workflow (yield)
- OUT TRANSFORM to transform and finish a branch of execution
- Fixed a PandasDataFrame datetime issue that only happened in transformer interface approach
- Unified checkpoints and persist
- Drop columns and na implementations in both programming and sql interfaces
- Presort takes array as input
- Fixed jinja template rendering issue
- Fixed path format detection bug
- Require pandas 1.0 because of parquet schema
- Improved Fugue SQL extension parsing logic
- Doc for contributors to setup their environment
- Added set operations to the programming interface: `union`, `subtract`, `intersect`
- Added `distinct` to the programming interface
- Ensured partitioning follows SQL convention: groups with null keys are NOT removed
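"SQL convention" here means a null grouping key forms its own group instead of being silently dropped. A quick plain-Python illustration (for comparison, pandas' `groupby` drops NaN keys by default unless `dropna=False` is passed):

```python
# Group values by key, keeping None as a legitimate grouping key,
# the way SQL GROUP BY keeps a group for NULL keys.
rows = [("a", 1), (None, 2), ("a", 3), (None, 4)]
groups = {}
for key, val in rows:
    groups.setdefault(key, []).append(val)
print(groups)  # {'a': [1, 3], None: [2, 4]}
```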
- Switched `join`, `union`, `subtract`, `intersect`, `distinct` to QPD implementations, so they follow SQL convention
- Set operations in Fugue SQL can directly operate on Fugue statements (e.g. `TRANSFORM USING t1 UNION TRANSFORM USING t2`)
- Fixed bugs
- Added onboarding document for contributors
- Main features of Fugue core and Fugue SQL
- Support backends: Pandas, Spark and Dask