Excessive Memory Usage During the Creation of Large Excel Documents #480

gamerover98 · 2024-09-21T15:57:11Z

Problem

I am experiencing a memory usage issue when creating very large Excel documents using fastexcel. While inserting cells into a worksheet with maximum dimensions (16,384 columns by 1,048,576 rows), I noticed that the memory consumption was extremely high.

Here are some specific observations:

Memory Usage: I measured memory consumption with VisualVM and observed usage of over 8 GB of RAM.
Memory Profiling: The profiling showed more than 180 million instances of org.dhatim.fastexcel.Cell and over 1 million instances of org.dhatim.fastexcel.Cell[].

Code Example to Reproduce the Issue

package org.example;

import lombok.extern.slf4j.Slf4j;
import org.dhatim.fastexcel.Workbook;
import org.dhatim.fastexcel.Worksheet;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

@Slf4j
public class Main {

    private static final String EXCEL_RELATIVE_PATH = "./target/data/example.xlsx";
    private static final int COLUMNS = Worksheet.MAX_COLS; // 16_384 columns
    private static final int ROWS = Worksheet.MAX_ROWS;    // 1_048_576 rows

    public static void main(String... args) throws Exception {
        try (var outputStream = new FileOutputStream(getExcelFile());
             var workbook = new Workbook(outputStream, "MyApplication", "1.0");
             var worksheet = workbook.newWorksheet("Sheet 1")) {

            log.debug("Generating data...");

            for (int columnIndex = 0; columnIndex < COLUMNS; columnIndex++) {
                for (int rowIndex = 0; rowIndex < ROWS; rowIndex++) {
                    worksheet.value(rowIndex, columnIndex, "test");
                }
            }

            log.debug("Waiting...");
            // Stop current thread to see memory usage.
            Thread.sleep(Long.MAX_VALUE);
        }
    }

    public static File getExcelFile() throws IOException {
        var file = new File(EXCEL_RELATIVE_PATH);

        if (file.mkdirs()) {
            log.debug("Excel file created");
        }

        if (file.exists()) {
            Files.delete(Path.of(file.toURI()));
            log.debug("Excel file deleted");
        }

        return file;
    }
}

Proposed Solution

I propose implementing a new version of the Worksheet class that writes cells directly to the document without keeping them in memory for potential future modifications. This sequential writing method is particularly useful for those, like me, who need to create large documents "cascadingly" without needing to know the content of previously written cells.

This solution would significantly reduce memory usage for large documents, improving the library’s efficiency in similar scenarios.

Thanks a lot 💯

The text was updated successfully, but these errors were encountered:

gamerover98 mentioned this issue Sep 21, 2024

Initial implementation of memory-efficient Excel writing for large documents #481

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Excessive Memory Usage During the Creation of Large Excel Documents #480

Excessive Memory Usage During the Creation of Large Excel Documents #480

gamerover98 commented Sep 21, 2024

Excessive Memory Usage During the Creation of Large Excel Documents #480

Excessive Memory Usage During the Creation of Large Excel Documents #480

Comments

gamerover98 commented Sep 21, 2024

Problem

Code Example to Reproduce the Issue

Proposed Solution