feat(docs): add benchmarks and plots in readme #367
Got some pointers from a maintainer of |
calamine vs openpyxl (read_only mode), python3.11 on my PC:
Code:
from openpyxl import load_workbook

wb = load_workbook(filename='NYC_311_SR_2010-2020-sample-1M.xlsx', read_only=True)
ws = wb['NYC_311_SR_2010-2020-sample-1M']
for row in ws.rows:
    _ = row
# Close the workbook after reading
wb.close() |
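For anyone who wants to reproduce comparisons like the one above, here is a minimal sketch of a timing helper. It is not the harness used for the numbers in this PR; `measure_iteration` and `fake_rows` are hypothetical names introduced for illustration, and `fake_rows` is only a stand-in so the snippet runs without the workbook. Note that `tracemalloc` only sees allocations made through Python's allocator and adds overhead of its own, so the peak figure is a rough indicator at best.

```python
import time
import tracemalloc
from typing import Callable, Iterable, Tuple


def measure_iteration(make_rows: Callable[[], Iterable]) -> Tuple[int, float, int]:
    """Iterate over every row once and return (row_count, seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    count = 0
    for _ in make_rows():
        count += 1
    elapsed = time.perf_counter() - start
    # get_traced_memory() returns (current, peak) in bytes
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return count, elapsed, peak


def fake_rows(n: int = 1000):
    # Hypothetical stand-in for a worksheet's row iterator.
    for i in range(n):
        yield (i, str(i))


if __name__ == "__main__":
    count, elapsed, peak = measure_iteration(fake_rows)
    print(f"{count} rows in {elapsed:.4f}s, peak {peak / 1e6:.2f} MB")
```

To time the openpyxl loop instead, something like `measure_iteration(lambda: ws.rows)` should work once the workbook is loaded.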
I couldn't find a way either. With this code, the application allocated over 10 GB of memory and I killed it.
let path = std::path::Path::new("NYC_311_SR_2010-2020-sample-1M.xlsx");
let book = umya_spreadsheet::reader::xlsx::read(path).unwrap();
let sheet = book.get_sheet_by_name("NYC_311_SR_2010-2020-sample-1M").unwrap();
let _ = sheet.get_collection_to_hashmap();
// OR
let path = std::path::Path::new("NYC_311_SR_2010-2020-sample-1M.xlsx");
let book = umya_spreadsheet::reader::xlsx::lazy_read(path).unwrap();
let _ = book.get_lazy_read_sheet_cells(&0).unwrap(); |
The previous `excelize` data was obtained using an improper iterator. The new code comes from [here](qax-os/excelize#1695 (comment)).
What version of Python did you use?
|
@dimastbk |
Thanks. I'm just surprised there is such a big difference between python3.10 and 3.11. |
I'm also interested in how much slower mine is compared to yours. 100 seconds. I'm not even sure what could account for that much difference. |
Thanks! |
Went through and benchmarked some other libraries to see where `calamine` stood compared to other ecosystems. Decided to add it to the docs, and, after seeing the results, to file an issue for `excelize`.

I wanted to add `umya-spreadsheet`, but it didn't seem to have any way to directly iterate over the rows? At least I couldn't tell from the wording in the docs or the function signatures. If you manage to figure out a way to do that, and want another Rust comparison, I don't mind adding it.

Git history is a bit messy with fixes, squashing might be best.