Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

All Charts Are Removed From Sheet - Removed Part: Drawing shape. #3767

Closed
6 tasks
roland-jungwirth opened this issue Oct 16, 2023 · 10 comments · Fixed by #3771
Closed
6 tasks

All Charts Are Removed From Sheet - Removed Part: Drawing shape. #3767

roland-jungwirth opened this issue Oct 16, 2023 · 10 comments · Fixed by #3771
Labels

Comments

@roland-jungwirth
Copy link

This is:

- [ x] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

When I use an existing Excel file, load it, and then save it, I expect all the graphs to be in the document.

What is the current behavior?

When I use an existing Excel file, load it using IOFactory::load( $filename ), the write it to php://output using the Xlsx writer, then open the file, Excel returns the following error (from Xcode):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<recoveryLog xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"><logFileName>Repair Result to test.xml</logFileName><summary>Errors were detected in file ’~/Downloads/test.xlsx’</summary><removedParts summary="Following is a list of removed parts:"><removedPart>Removed Part: Drawing shape.</removedPart></removedParts></recoveryLog>

and the repaired file does not have any graphs.

What are the steps to reproduce?

Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:

<?php

require __DIR__ . '/vendor/autoload.php';

// Create new Spreadsheet object
$spreadsheet = IOFactory::load( 'test.xlsx' );

// I've also tried:
// 1. application/ms-excel
// 2. application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
header( 'Content-Type: application/vnd.ms-excel' );
header( 'Content-Disposition: attachment; filename="test-output.xlsx"' );
header( 'Cache-Control: max-age=0' );

$writer = new Xlsx($spreadsheet);

// added this, as this claimed to fix this.
ob_end_clean();
// I've found this open as well.
$writer->setIncludeCharts(TRUE);
$writer->save( 'php://output' );
exit;

The attached Excel sheet is just a sample with a - simple - graph. When I pass that to the above workflow, the graph is stripped. I tried opening this up in both Excel, LibreOffice and Google Sheets. The graph is missing in all of these.

What features do you think are causing the issue

  • [x ] Reader
  • [x ] Writer
  • Styles
  • Data Validations
  • Formula Calculations
  • Charts
  • AutoFilter
  • Form Elements

Does an issue affect all spreadsheet file formats? If not, which formats are affected?

test.xlsx

Which versions of PhpSpreadsheet and PHP are affected?

PHP version: 8.1
PhpSpreadsheet: 1.29

@oleibman
Copy link
Collaborator

Thank you for the sample file. This turns out to hit at least 3 different problems, so may take a while to fix.

First, although you are specifying includeCharts for write, you are not doing so for read. That's an easy problem for you to fix. So, the expected outcome is that your output file will not contain charts (because they were ignored on read).

Second, it is, of course, not an expected outcome that Excel will have to repair the output file given your code. That's problem 2, and it's our problem, and I will need to research what needs to be done to avoid that.

Finally, if you do specify includeCharts for read, PhpSpreadsheet will throw a TypeError during the save. Something appears to be wrong with the way it reads the Chart Title, which should a RichText item containing a string but which instead unexpectedly contains an array. That's problem 3, and it's again our problem, and I will need to research ...

@roland-jungwirth
Copy link
Author

roland-jungwirth commented Oct 17, 2023

@oleibman , thank you for your answer. Adding the includeCharts to the reader definitely helped. I have replaced this line

$spreadsheet = IOFactory::load( 'test.xlsx' );

with this

$reader = IOFactory::createReader('Xlsx');
$reader->setIncludeCharts(TRUE);
$spreadsheet = $reader->load( 'test.xlsx' );

and the graphs are there.

I do get the following error when opening the excel file

Excel completed file level validation and repair. Some parts of this workbook may have been repaired or discarded.

but I cannot see which part this pertains to, the file opens and the graphs seem to be correct.

I am not sure whether you want to close this, so I left it open for now.

@oleibman
Copy link
Collaborator

Please leave it open, I am working on a solution to the other problems.

Interesting that you avoided the TypeError for save; see next paragraph. At any rate, I am guessing that the part of the graph which Excel is having problems with is the title, which is set in the original sheet to what must be a default value 'Graph Title'. You'll notice that it's missing after Excel repairs it, but you probably wouldn't notice unless you were looking for it. I don't see anything in the Xml which sets that value in the original.

My results don't entirely match yours. I have been running my tests against master, which is where the TypeError shows up. However, it does not show up when I run against 1.29. Unlike you, I also don't get a corrupt file when I run against 1.29. At any rate, it seems the TypeError was introduced after 1.29 and is in the pending changes for whenever we issue a new release. So fixing it becomes a higher priority. I'm guessing it happened with the elimination of Php7.4 and the subsequent use of typing using Php declarations rather than doc-blocks. I don't know if I'll be able to put my finger on the smoking gun.

@roland-jungwirth
Copy link
Author

Ok, no problem. Just ping me should you want me to test something.

oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue Oct 18, 2023
Fix PHPOffice#3767. The basic problem in that issue was user error, but it exposed a couple of problems in code which are being addressed.

When a spreadsheet with charts is loaded without the includeCharts option and then saved, a corrupt spreadsheet is created. I believe this has been the case for quite some time. Nothing in the test suite covers this scenario. It is, in fact, a difficult thing to test, since the problem is exposed only when the file is opened through Excel. The specific problem is that a rels file generated by the output writer continues to refer to the drawing file which described the chart, but that file is not included (and not needed) in the output spreadsheet. The resolution is kludgey. The information that the file will not be needed is not available when the rels file is written. But, when it comes time to write the drawing file, it is known whether the rels file included it. So, if nothing else has caused the file to be generated, it is written out as a basically empty xml file after all. This solves the problem, but I will continue to search for a less kludgey solution. This solution is, at least, testable; if a different solution is applied later on, the test being introduced here is likely to break so a new one will be needed.

When the provided spreadsheet is loaded with the includeCharts option and then saved, an error is exposed in processing the Chart Title caption when it doesn't exist. The change to Writer/Xlsx/Chart is simple and clearly justifiable. What is peculiar is that the error does not arise with release 1.29, but does arise with master. It is not at all clear to me what has changed since the release to expose the error - the code in question certainly hasn't changed. It is difficult to isolate changes because of the extensive number of changes following on the elimination of Php7.4 as a supported platform.

The provided spreadsheet is unusual in at least two senses. When opened in Excel, it will show a clearly default value for the chart title, namely 'Chart Title'. I cannot find anything in the xml corresponding to that text. Since I have no idea why Excel is using that title, I will not try to duplicate its behavior, so that loading and saving the provided spreadsheet will omit the chart title. I will continue to investigate.

The other sense in which it is unusual is that it includes some style files in the same directory as the chart. I doubt that PhpSpreadsheet looks at these. The styling after load and save seems to mostly match the original, although there is at least one color in the graph which does not match. I imagine it would be pretty complicated to formally support these files.
@oleibman
Copy link
Collaborator

@roland-jungwirth Since your test results didn't completely match mine, please test against 3771 if you can, including both cases (specify or omit includeCharts on read). If everything looks okay on your end, l will merge that change.

@roland-jungwirth
Copy link
Author

@oleibman, thank you - can I do this directly using composer, or do I need to pull it using git?

@oleibman
Copy link
Collaborator

@roland-jungwirth
There may be an easier way, but the following should work:

git clone -b issue3767 --single-branch https://github.com/oleibman/PhpSpreadsheet.git directory_name

Then go to directory_name and use composer to install any dependencies. Point your "require autoload" statement to directory_name, and you can run your tests.

@flyke
Copy link

flyke commented Nov 6, 2023

Hi,
I have the same issue.
I tested manually applying these changes into vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Writer/Xlsx.php and vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Writer/Xlsx/Chart.php.
Still no charts are generated.

See attached .xlsx test file that has 3 charts that are not being generated.
demo-excel.xlsx

use PhpOffice\PhpSpreadsheet\IOFactory as SpreadsheetIOFactory;
use PhpOffice\PhpSpreadsheet\Writer\Pdf\Mpdf as ExcelMPDF;

      // Load the document
      $reader = SpreadsheetIOFactory::createReader('Xlsx');
      $reader->setIncludeCharts(true);
      $document = $reader->load('demo-excel.xlsx);
      // Save as html for testing
      $writer = new \PhpOffice\PhpSpreadsheet\Writer\Html($document);
      $writer->setIncludeCharts(true);
      $writer->writeAllSheets();
      $writer->save( 'test.html');
      // Save as pdf for testing
      $writer = new ExcelMPDF($document);
      $writer->setIncludeCharts(true);
      $writer->save('demo-excel.pdf');

@oleibman
Copy link
Collaborator

oleibman commented Nov 6, 2023

@flyke Your problem is different. You need to specify a renderer when generating charts for Html or Pdf. If you have used Composer to install the suggested package mitoteam/jpgraph:

\PhpOffice\PhpSpreadsheet\Settings::setChartRenderer(
    \PhpOffice\PhpSpreadsheet\Chart\Renderer\MtJpGraphRenderer::class
);

You will now get charts in your Html/Pdf. Even so, I don't think that the rendering for this sheet is good enough. Please open a new issue and I will start looking into it once I see it.

@roland-jungwirth
Copy link
Author

roland-jungwirth commented Nov 9, 2023

@oleibman , sorry for the delay. This branch fixes my issue with the resulting Excel having an error.

For reference, this is the code that I tested it with:

$template_file = 'test.xlsx';;
$filename = 'test_' . date( 'Y-m-d_H-i-s' ) . '.xlsx';

// Load the template file
$reader = IOFactory::createReader('Xlsx');
$reader->setIncludeCharts(TRUE);

$spreadsheet = $reader->load($template_file);

// Set the headers to force a download of the Excel file
header( 'Content-Type: application/vnd.ms-excel' );
header( 'Content-Disposition: attachment; filename="' . $filename . '"' );
header( 'Cache-Control: max-age=0' );

// Create a writer and send the Excel file to the browser
$writer = new Xlsx($spreadsheet);

ob_end_clean();
$writer->setIncludeCharts(TRUE);
$writer->save( 'php://output' );
exit;

and this is the test.xlsx file:
test.xlsx

oleibman added a commit that referenced this issue Nov 9, 2023
* Address Some Chart Problems

Fix #3767. The basic problem in that issue was user error, but it exposed a couple of problems in code which are being addressed.

When a spreadsheet with charts is loaded without the includeCharts option and then saved, a corrupt spreadsheet is created. I believe this has been the case for quite some time. Nothing in the test suite covers this scenario. It is, in fact, a difficult thing to test, since the problem is exposed only when the file is opened through Excel. The specific problem is that a rels file generated by the output writer continues to refer to the drawing file which described the chart, but that file is not included (and not needed) in the output spreadsheet. The resolution is kludgey. The information that the file will not be needed is not available when the rels file is written. But, when it comes time to write the drawing file, it is known whether the rels file included it. So, if nothing else has caused the file to be generated, it is written out as a basically empty xml file after all. This solves the problem, but I will continue to search for a less kludgey solution. This solution is, at least, testable; if a different solution is applied later on, the test being introduced here is likely to break so a new one will be needed.

When the provided spreadsheet is loaded with the includeCharts option and then saved, an error is exposed in processing the Chart Title caption when it doesn't exist. The change to Writer/Xlsx/Chart is simple and clearly justifiable. What is peculiar is that the error does not arise with release 1.29, but does arise with master. It is not at all clear to me what has changed since the release to expose the error - the code in question certainly hasn't changed. It is difficult to isolate changes because of the extensive number of changes following on the elimination of Php7.4 as a supported platform.

The provided spreadsheet is unusual in at least two senses. When opened in Excel, it will show a clearly default value for the chart title, namely 'Chart Title'. I cannot find anything in the xml corresponding to that text. Since I have no idea why Excel is using that title, I will not try to duplicate its behavior, so that loading and saving the provided spreadsheet will omit the chart title. I will continue to investigate.

The other sense in which it is unusual is that it includes some style files in the same directory as the chart. I doubt that PhpSpreadsheet looks at these. The styling after load and save seems to mostly match the original, although there is at least one color in the graph which does not match. I imagine it would be pretty complicated to formally support these files.

* Unused Assignment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging a pull request may close this issue.

3 participants