feat(python)!: Change default engine for read_excel to "calamine" #17263

Merged (4 commits) on Jun 30, 2024

stinodego (Member) commented on Jun 28, 2024:

Closes #17177

Changes

Example

Before:

>>> pl.read_excel("data.xlsx", engine_options={"skip_empty_lines": True})

After:

>>> pl.read_excel("data.xlsx", engine_options={"skip_empty_lines": True})
Traceback (most recent call last):
...
TypeError: read_excel() got an unexpected keyword argument 'skip_empty_lines'

Instead, explicitly specify the xlsx2csv engine or omit the engine_options:

>>> pl.read_excel("data.xlsx", engine="xlsx2csv", engine_options={"skip_empty_lines": True})
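For call sites that forward `engine_options` from elsewhere, one way to stay compatible with the new default is to pin the engine explicitly whenever xlsx2csv-specific options are present. The snippet below is only a minimal sketch: `read_excel_kwargs` and `XLSX2CSV_ONLY_OPTIONS` are hypothetical names that are not part of Polars, and the option set contains only the one key used in the example above.

```python
from __future__ import annotations

from typing import Any

# Hypothetical allow-list of option names that only xlsx2csv understands.
# Only "skip_empty_lines" comes from the example above; extend as needed.
XLSX2CSV_ONLY_OPTIONS = {"skip_empty_lines"}


def read_excel_kwargs(
    engine_options: dict[str, Any] | None = None,
    engine: str | None = None,
) -> dict[str, Any]:
    """Build keyword arguments for pl.read_excel with the engine made explicit.

    This helper is illustrative and not part of Polars.
    """
    kwargs: dict[str, Any] = {}
    if engine is not None:
        # An explicitly requested engine always wins.
        kwargs["engine"] = engine
    elif engine_options and XLSX2CSV_ONLY_OPTIONS.intersection(engine_options):
        # xlsx2csv-specific options were passed, so pin that engine
        # instead of relying on the default.
        kwargs["engine"] = "xlsx2csv"
    if engine_options:
        kwargs["engine_options"] = engine_options
    return kwargs
```

A call would then look like `pl.read_excel("data.xlsx", **read_excel_kwargs({"skip_empty_lines": True}))`. When no engine-specific options are passed, the helper returns an empty dict and the call falls through to the default engine.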

github-actions bot added the enhancement (New feature or an improvement of an existing feature) and python (Related to Python Polars) labels on Jun 28, 2024

codecov bot commented Jun 28, 2024

Codecov Report

Attention: Patch coverage is 78.12500% with 7 lines in your changes missing coverage. Please review.

Project coverage is 80.71%. Comparing base (d444b79) to head (5c6151e).
Report is 1 commit behind head on main.

Files                                          Patch %   Lines
py-polars/polars/io/spreadsheet/functions.py   86.20%    3 Missing and 1 partial ⚠️
py-polars/polars/selectors.py                  0.00%     2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #17263      +/-   ##
==========================================
- Coverage   80.72%   80.71%   -0.01%     
==========================================
  Files        1475     1475              
  Lines      193162   193181      +19     
  Branches     2751     2756       +5     
==========================================
- Hits       155922   155919       -3     
- Misses      36732    36752      +20     
- Partials      508      510       +2     


Comment on lines -481 to -485
if is_file and str(source).lower().endswith(".ods"):
    # note: if called from "read_ods" the engine cannot be 'None', hence
    # this check is only triggered when called from "read_excel"
    msg = "OpenDocumentSpreadsheet files require use of `read_ods`, not `read_excel`"
    raise ValueError(msg)

stinodego (Member, Author) commented:

This part of the logic wasn't being hit anymore since engine cannot be None. Not sure if we still need this.

@stinodego changed the title on Jun 28, 2024:
  from: feat(python): Change default engine for read_excel to "calamine"
  to:   feat(python)!: Change default engine for read_excel to "calamine"
github-actions bot added the breaking (Change that breaks backwards compatibility) label on Jun 28, 2024
@ritchie46 merged commit 24e5c2c into main on Jun 30, 2024 (17 checks passed)
@ritchie46 deleted the read-excel-default branch on June 30, 2024 06:49

stinodego (Member, Author) commented:

@alexander-beedie for the columns parameter, what happens when multiple sheets are read? Shouldn't it accept a mapping of sheet to column names?

Labels: breaking (Change that breaks backwards compatibility), enhancement (New feature or an improvement of an existing feature), python (Related to Python Polars)

Linked issue (may be closed by merging this pull request): Switch over default engine for read_excel from "xlsx2csv" to "calamine"

3 participants