- Writer.write_intermediate_footer method for ORC library 1.9.0 and newer.
- Python 3.12 wheels.
- Dropped support for Python 3.7.
- ORC C++ Core updated to 1.9.1.
- Python 3.11 wheels. (PR #58, contribution of @dbaxa)
- ORC C++ Core updated to 1.7.7.
- Improved type annotations, set module's __all__ variable.
- Universal2 wheels for macOS. (PR #55, contribution of @dbaxa)
- ORC-517, ORC-203, and ORC-14 versions to WriterVersion enum.
- Dropped support for Python 3.6.
- ORC C++ Core updated to 1.7.5.
- New parameter to Writer: dict_key_size_threshold for setting the threshold for dictionary encoding. (PR #46, contribution of @dirtysalt)
- New parameter to Writer: padding_tolerance for block padding.
- New parameter to Reader and Writer: null_value for changing representation of ORC null value. The value must be a singleton object.
- Type stubs for classes implemented in C++.
- Experimental musllinux and PyPy wheels.
- Writer.writerows method reimplemented in C++.
- Improved type annotations.
- ORC C++ Core updated to 1.7.3.
- Removed build_orc setup.py command, moved the same functionality to build_ext command.
- Unnecessary string casting of values when writing user metadata. (Issue #45)
- Module level variables for the ORC library version: orc_version string and orc_version_info namedtuple.
- New parameter for Writer: row_index_stride.
- New read-only properties for Reader: row_index_stride and software_version.
- Trino and Scritchley writer ids.
- Type annotations support for ORC types.
- Support for timestamp with local time zone type.
- New parameter for Reader and Writer: timezone.
- The backported zoneinfo module dependency prior to Python 3.9.
- Predicate (SearchArgument) support for filtering row groups during ORC file reads. New classes: Predicate and PredicateColumn.
- New parameter for Reader: predicate.
- Build for aarch64 wheels. (PR #43, contribution of @odidev)
- ORC C++ Core updated to 1.7.0. Because many of the new features are not backported to the 1.6 branch, this is currently the minimum required library version.
- TimestampConverter's to_orc and from_orc methods take an extra timezone parameter, which is bound during type conversion to the same ZoneInfo object passed to the Reader or Writer via their timezone parameters.
- Renamed Reader.metadata property and Writer.set_metadata method to user_metadata and set_user_metadata respectively to avoid confusion.
- Experimental Windows support.
- tzdata package dependency on Windows. TZDIR is set automatically to the path of the tzdata package's data directory after importing PyORC.
- Create ORC Type from TypeDescription directly (instead of string parsing) for Writer. (PR #26, contribution of @blkerby)
- Dotted column names can be used in the TypeDescription.find_column_id method by escaping them with backticks.
- ORC C++ Core updated to 1.6.6.
- Handling large negative seconds on Windows for TimestampConverter.from_orc.
- metadata property for Reader and set_metadata method for Writer to handle an ORC file's metadata.
- Meta info attributes like writer_id, writer_version, bytes_length, compression and compression_block_size for Reader.
- New TypeDescription subclasses to represent ORC types.
- Reimplemented TypeDescription in Python.
- ORC C++ Core updated to 1.6.3.
- Converting date from ORC on systems where the system's timezone has a negative UTC offset. (Issue #5)
- Converters for date, decimal and timestamp ORC types in Python and option to change them via Reader's and Writer's converters parameter.
- Column object for accessing statistics about ORC columns.
- An attribute to Reader for selected schema.
- Use timezone-aware datetime objects (in UTC) for ORC timestamps by default.
- Wrapped C++ stripe object to Python Stripe.
- Decrementing reference for bytes object after reading from file stream.
- A Reader object to read ORC files.
- A stripe object to read only a stripe in an ORC file.
- A Writer object to write ORC files.
- A typedescription object to represent the ORC schema.
- Support for representing a struct type as either a Python tuple or a dictionary.
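
The TimestampConverter entry above mentions that to_orc and from_orc receive a timezone argument. The following is a minimal sketch of what a converter with that method shape might look like. It deliberately does not import PyORC: the class name `EpochTimestampConverter` and the parameter name `tz` are hypothetical, and treating naive datetimes as local to `tz` is an assumption of this sketch, not documented library behaviour.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Tuple

# Reference point for ORC-style (seconds, nanoseconds) timestamp pairs.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


class EpochTimestampConverter:
    """Hypothetical converter mirroring the to_orc/from_orc method shape."""

    @staticmethod
    def from_orc(seconds: int, nanoseconds: int, tz: tzinfo) -> datetime:
        # Build an aware datetime from the epoch parts, then express it in tz.
        # Sub-microsecond precision is dropped because datetime cannot hold it.
        value = EPOCH + timedelta(seconds=seconds, microseconds=nanoseconds // 1000)
        return value.astimezone(tz)

    @staticmethod
    def to_orc(obj: datetime, tz: tzinfo) -> Tuple[int, int]:
        # Assumption: a naive datetime is interpreted as local time in tz.
        if obj.tzinfo is None:
            obj = obj.replace(tzinfo=tz)
        delta = obj - EPOCH
        # timedelta keeps seconds and microseconds non-negative, so this
        # yields floored whole seconds even for dates before 1970.
        seconds = delta.days * 86400 + delta.seconds
        return seconds, delta.microseconds * 1000


# Round trip of a UTC timestamp through the sketch.
stamp = datetime(2021, 1, 2, 3, 4, 5, 123456, tzinfo=timezone.utc)
seconds, nanos = EpochTimestampConverter.to_orc(stamp, timezone.utc)
restored = EpochTimestampConverter.from_orc(seconds, nanos, timezone.utc)
```

The sketch uses timedelta arithmetic from a fixed epoch rather than datetime.fromtimestamp, which can fail for large negative values on some platforms.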