File format v2

nightscape released this 23 Aug 19:47

v0.14.0

4c22545

#210 File format v2 (#389)

* register data source for .format("excel")

* ignore .vscode

* V2 with new Spark Data Source API, uses FileDataSourceV2

* set header default to true; first test passes

* ExcelHelper becomes options-aware

* handle string type for error-formula

* PlainNumberReadSuite is good now. Also fixed the issue in #285. This introduces a breaking change (good, I think)

* test-case for issue_285

* Handling Error Cells and Undefined Rows

* Test cases for #52 #74 #97 issues

* format & test cases for column pruning (projection)

* Added more test-cases for numerical types

* Stricter numerical types (Integer, Long and Double) in schema inference. Issue #162

* preparing for final push on writing

* Apply format & Writing is working

* Added excel-row-number column for issues #40 #59 #115 and refactoring

* refactoring unit-tests

* preparing for MR

* Update all test-cases with ScalaTest 3.x

* Writing is aware of dataAddress

* writing with dataAddress; No change on dependencies nor build script

* Schema inference improvement: Iterator instead of Seq; use both samplingRatio and excerptSize

* added more recent spark version to CI/CD

* support Spark 2.4.1 and up

* Fix scalastyle check & enable non-ASCII characters due to the nature of the unit tests

* Update src/main/2.4/scala/com/crealytics/spark/v2/excel/ExcelDataSource.scala

Co-authored-by: Martin Mauch <[email protected]>

* Update src/main/2.4/scala/com/crealytics/spark/v2/excel/ExcelDataSource.scala

Co-authored-by: Martin Mauch <[email protected]>

* spark-excel examples in Jupyter Notebook

Co-authored-by: Martin Mauch <[email protected]>
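
The changes above register the data source under the short name "excel", default header to true, and make both reading and writing aware of dataAddress. The following is a minimal sketch of how that might look from user code, not a verified excerpt from the project. It assumes a Spark build with the spark-excel jar on the classpath; the object name, file paths and sheet name are hypothetical, and the `inferSchema` option is taken from the project's documented options rather than from these notes.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical driver object; requires spark-sql and spark-excel on the classpath.
object ExcelV2Sketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("excel-v2-sketch")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical locations, used only for illustration.
    val inputPath = "/tmp/input.xlsx"
    val outputPath = "/tmp/output.xlsx"

    // Read through the short format name registered by this release.
    val df: DataFrame = spark.read
      .format("excel")
      .option("header", "true")             // stated explicitly; the notes say it now defaults to true
      .option("inferSchema", "true")        // assumption: ask the reader to infer column types
      .option("dataAddress", "'Sheet1'!A1") // hypothetical sheet name and start cell
      .load(inputPath)

    df.printSchema()

    // Write back out, also honouring dataAddress.
    df.write
      .format("excel")
      .option("header", "true")
      .option("dataAddress", "'Sheet1'!A1")
      .mode("overwrite")
      .save(outputPath)

    spark.stop()
  }
}
```

Setting `header` explicitly keeps the sketch readable even though the notes say it now defaults to true.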
