-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allowed memory size of X bytes exhausted in PhpSpreadsheet\Worksheheet\Worksheet.php on line 3108 #3814
Comments
Hi petruchek, I have tested the example.xlsx(only data in 1 row) on Php 8.2.
Let me know if this works for you or You can assign me this issue Thanks |
Hi! Actually, it's not an infinite loop. Some kind of optimization and / or fix should be made though. If you use PHP 8.2 or later, and set the huge I'm here from the GitHub search to test the new version of my memory profiler (Reli 0.11.0). "Allowed memory size of" is my personal trending word these days. I've done some investigation about this issue, so I share it. Test scriptcomposer.json{
"require": {
"phpoffice/phpspreadsheet": "^1.29"
}
} analyzer.php<?php
register_shutdown_function(
function (): void {
$error = error_get_last();
if (is_null($error)) {
return;
}
if (strpos($error['message'], 'Allowed memory size of') !== 0) {
return;
}
$pid = getmypid();
$file_opt = '--memory-limit-error-file=' . escapeshellarg($error['file']);
$line_opt = '--memory-limit-error-line=' . escapeshellarg($error['line']);
system("sudo /home/sji/work/reli-prof/reli i:m -p {$pid} --no-stop-process {$file_opt} {$line_opt} >memory_analyzed.json");
}
); test.php<?php
require_once './vendor/autoload.php';
$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load('./example.xlsx');
$names = $spreadsheet->getSheetNames();
$result = [];
for($i=0;$i<$spreadsheet->getSheetCount();$i++)
{
foreach ($spreadsheet->getSheet($i)->toArray() as $row) {
print_r($row);
}
} Run with 8M limit$ php -dmemory_limit=8M -dauto_prepend_file=analyzer.php test.php
PHP Fatal error: Allowed memory size of 8388608 bytes exhausted (tried to allocate 20480 bytes) in /home/sji/work/oss/tmp/phpspreadsheet_test/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Worksheet/Worksheet.php on line 3107
$ ls -la memory_analyzed.json
-rw-rw-r-- 1 sji sji 7318949 Dec 19 04:38 memory_analyzed.json Summary$ cat memory_analyzed.json | jq .summary
[
{
"zend_mm_heap_total": 8388608,
"zend_mm_heap_usage": 7266520,
"zend_mm_chunk_total": 8388608,
"zend_mm_chunk_usage": 7266520,
"zend_mm_huge_total": 0,
"zend_mm_huge_usage": 0,
"vm_stack_total": 262144,
"vm_stack_usage": 1088,
"compiler_arena_total": 2011608,
"compiler_arena_usage": 52872,
"possible_allocation_overhead_total": 953990,
"possible_array_overhead_total": 254560,
"memory_get_usage": 7298856,
"memory_get_real_usage": 8388608,
"cached_chunks_size": 0,
"heap_memory_analyzed_percentage": 99.55697166788877,
"php_version": "v82",
"analyzer": "reli 0.11.0"
}
] Reli successfully analyzes the 99.55% of the heap usage. Extracting top 20 most memory-consuming types of memory locations$ cat memory_analyzed.json | jq .location_types_summary | jq -r '(["location_type", "count", "memory_usage"] | (., map(length*"="))),(to_entries|.[:20]|.[]|[.key,.value.count,.value.memory_usage])|@tsv' | column -t -o ' | '
location_type | count | memory_usage
============= | ===== | ============
ZendArrayTableMemoryLocation | 776 | 3737712
ZendArrayTableOverheadMemoryLocation | 775 | 113200
ZendObjectMemoryLocation | 697 | 109640
RuntimeCacheMemoryLocation | 391 | 52880
ZendArrayMemoryLocation | 778 | 43568
ZendStringMemoryLocation | 418 | 25298
ObjectsStoreMemoryLocation | 1 | 8192
StaticMembersTableMemoryLocation | 16 | 928
CallFrameVariableTableMemoryLocation | 5 | 688
CallFrameHeaderMemoryLocation | 5 | 400
ZendReferenceMemoryLocation | 5 | 160
ZendResourceMemoryLocation | 3 | 72 Array data consumes the most memory. Extracting the references of top 20 largest arrays$ cat memory_analyzed.json | jq '. as $root | path(..|objects|select(."#type"=="ArrayHeaderContext"))| . as $path | $root|getpath($path) as $header | $header.array_elements as $elements | {path: $path|join("."), size: $elements."#locations"[0].size, count: $elements."#count", node_id:$header."#node_id"}' | jq -rs '(["size", "count", "node_id" ,"path"] | (., map(length*"="))),(sort_by(.size) | .[-20:] | reverse | .[] | [.size, .count, .node_id, .path])|@tsv' | column -t -o ' | '
size | count | node_id | path
==== | ===== | ======= | ====
24960 | 524 | 475555 | context.class_table.phpoffice\\phpspreadsheet\\calculation\\calculation.static_properties.phpSpreadsheetFunctions
16392 | 1024 | 452757 | context.call_frames.2.local_variables.returnValue.array_elements.218.value
16392 | 1024 | 450705 | context.call_frames.2.local_variables.returnValue.array_elements.217.value
16392 | 1024 | 448653 | context.call_frames.2.local_variables.returnValue.array_elements.216.value
16392 | 1024 | 446601 | context.call_frames.2.local_variables.returnValue.array_elements.215.value
16392 | 1024 | 444549 | context.call_frames.2.local_variables.returnValue.array_elements.214.value
16392 | 1024 | 442497 | context.call_frames.2.local_variables.returnValue.array_elements.213.value
16392 | 1024 | 440445 | context.call_frames.2.local_variables.returnValue.array_elements.212.value
16392 | 1024 | 438393 | context.call_frames.2.local_variables.returnValue.array_elements.211.value
16392 | 1024 | 436341 | context.call_frames.2.local_variables.returnValue.array_elements.210.value
16392 | 1024 | 434289 | context.call_frames.2.local_variables.returnValue.array_elements.209.value
16392 | 1024 | 432237 | context.call_frames.2.local_variables.returnValue.array_elements.208.value
16392 | 1024 | 430185 | context.call_frames.2.local_variables.returnValue.array_elements.207.value
16392 | 1024 | 428133 | context.call_frames.2.local_variables.returnValue.array_elements.206.value
16392 | 1024 | 426081 | context.call_frames.2.local_variables.returnValue.array_elements.205.value
16392 | 1024 | 424029 | context.call_frames.2.local_variables.returnValue.array_elements.204.value
16392 | 1024 | 421977 | context.call_frames.2.local_variables.returnValue.array_elements.203.value
16392 | 1024 | 419925 | context.call_frames.2.local_variables.returnValue.array_elements.202.value
16392 | 1024 | 417873 | context.call_frames.2.local_variables.returnValue.array_elements.201.value
16392 | 1024 | 415821 | context.call_frames.2.local_variables.returnValue.array_elements.200.value There seems to be a large number of 1024-element arrays as elements of the array Extracting the call trace$ cat memory_analyzed.json | jq -r '(["frame_no", "function", "line"] | (., map(length*"="))),(path(.context.call_frames[]|objects) as $path | [$path[2], getpath($path).function_name, getpath($path).lineno])|@tsv' | column -t
frame_no function line
======== ======== ====
0 system -1
1 {closure}(/home/sji/work/oss/tmp/phpspreadsheet_test/analyzer.php:3-15) 14
2 PhpOffice\\PhpSpreadsheet\\Worksheet\\Worksheet::rangeToArray 3107
3 PhpOffice\\PhpSpreadsheet\\Worksheet\\Worksheet::toArray 3214
4 <main> 12 The executing function of the 3rd call frame is Extracting some local variables$ cat memory_analyzed.json | jq '.context.call_frames."2".local_variables|{row, minRow, maxRow, col, minCol, maxCol, range, rangeStart, rangeEnd}'
{
"row": {
"#node_id": 455852,
"#type": "ScalarValueContext",
"value": 220
},
"minRow": {
"#reference_node_id": 455842
},
"maxRow": {
"#reference_node_id": 455849
},
"col": {
"#node_id": 455855,
"#type": "StringContext",
"#locations": [
{
"address": 140454052395712,
"size": 26,
"refcount": 1,
"type_info": 22,
"value": "SS"
}
]
},
"minCol": {
"#reference_node_id": 5212
},
"maxCol": {
"#node_id": 455850,
"#type": "StringContext",
"#locations": [
{
"address": 140454065514368,
"size": 27,
"refcount": 1,
"type_info": 22,
"value": "AMK"
}
]
},
"range": {
"#node_id": 5417,
"#type": "StringContext",
"#locations": [
{
"address": 140454065211784,
"size": 37,
"refcount": 1,
"type_info": 22,
"value": "A1:AMJ1048576"
}
]
},
"rangeStart": {
"#node_id": 455836,
"#type": "ArrayHeaderContext",
"#locations": [
{
"address": 140454052645000,
"size": 56,
"refcount": 1,
"type_info": 3221336071
}
],
"possible_unused_area": {
"#node_id": 455837,
"#type": "ArrayPossibleOverheadContext",
"#locations": [
{
"address": 140454052589032,
"size": 96,
"used_location": {
"address": 140454052588992,
"size": 40,
"is_packed": true
}
}
]
},
"array_elements": {
"#node_id": 455838,
"#type": "ArrayElementsContext",
"#locations": [
{
"address": 140454052588992,
"size": 40,
"is_packed": true
}
],
"#count": 2,
"0": {
"#node_id": 455839,
"#type": "ArrayElementContext",
"value": {
"#node_id": 455840,
"#type": "ScalarValueContext",
"value": 1
}
},
"1": {
"#node_id": 455841,
"#type": "ArrayElementContext",
"value": {
"#node_id": 455842,
"#type": "StringContext",
"#locations": [
{
"address": 94044376008848,
"size": 25,
"refcount": 2,
"type_info": 342,
"value": "1"
}
]
}
}
}
},
"rangeEnd": {
"#node_id": 455843,
"#type": "ArrayHeaderContext",
"#locations": [
{
"address": 140454065743336,
"size": 56,
"refcount": 1,
"type_info": 3221335047
}
],
"possible_unused_area": {
"#node_id": 455844,
"#type": "ArrayPossibleOverheadContext",
"#locations": [
{
"address": 140454052531208,
"size": 96,
"used_location": {
"address": 140454052531168,
"size": 40,
"is_packed": true
}
}
]
},
"array_elements": {
"#node_id": 455845,
"#type": "ArrayElementsContext",
"#locations": [
{
"address": 140454052531168,
"size": 40,
"is_packed": true
}
],
"#count": 2,
"0": {
"#node_id": 455846,
"#type": "ArrayElementContext",
"value": {
"#node_id": 455847,
"#type": "ScalarValueContext",
"value": 1024
}
},
"1": {
"#node_id": 455848,
"#type": "ArrayElementContext",
"value": {
"#node_id": 455849,
"#type": "StringContext",
"#locations": [
{
"address": 140454065658240,
"size": 31,
"refcount": 2,
"type_info": 22,
"value": "1048576"
}
]
}
}
}
}
} This can be read as like the following:
So, it tries to create 1,048,576 arrays each having 1,024 elements. The size of zval in PHP is 16 bytes each. Each element in packed arrays has an overhead of extra 16 bytes in PHP 8.1 and earlier, while in PHP 8.2+ they only require the same size as the zval. So at least 16 GB is required for PHP 8.2 and later, and 32 GB for PHP 8.1 and earlier, just to hold the contents of this array A possible optimizationCurrently, PHP arrays are CoW, so it's easy to dramatically reduce memory consumption for cases like this. First prepare an array filled with Then the test.php in this comment can run wih 32M of the I'll send a PR later. Is this really a problem to be solved in that way?May or may not be. I don't know the requirement of this library well. This xlsx file has attribute settings in the AME to AMJ columns and also in the row number 1048576. This is what causes The actual range of data in the cells is much smaller, so it is questionable whether Nevertheless, there may be cases where some data is actually located "far away", so it would be not so bad to include such an optimization anyway. |
Oh, did another person see this issue? Anyway, I'll send a PR to reduce memory usage in cases where there are many empty rows. |
That's quite an analysis. Impressive. |
Add an usage example at PHPOffice/PhpSpreadsheet#3814
This is:
What is the expected behavior?
The spreadsheet array containing the textual data from the spreadsheet is initialized.
What is the current behavior?
Fatal error: Allowed memory size of 536870912 bytes exhausted (tried to allocate 36864 bytes) in PhpSpreadsheet\Worksheheet\Worksheet.php on line 3108
What are the steps to reproduce?
If this is an issue with reading a specific spreadsheet file, then it may be appropriate to provide a sample file that demonstrates the problem; but please keep it as small as possible, and sanitize any confidential information before uploading. File attached below.
What features do you think are causing the issue
Does an issue affect all spreadsheet file formats? If not, which formats are affected?
This happens for a specific file, I am attaching it here: example.xlsx
Which versions of PhpSpreadsheet and PHP are affected?
I have tested with 1.29 on PHP 7.4 (BTW, I thought 7.4 is no longer supported?) and PHP 8.2 - happens on both.
I've looked into
rangeToArray
function a little bit, this file has range ofA1:AMJ1048576
, and when we are loopingfor ($col = $minCol; $col !== $maxCol; ++$col) {
we never reach$maxCol
value ofAMK
I understand the file is crazy, yet I believe Reader should not go into infinite loop even for a crazy file like this.
The text was updated successfully, but these errors were encountered: