Skip to content

Commit dfa1d46

Browse files
authored
Merge pull request #4543 from oleibman/blockimages
All Readers - Allow or Forbid Fetching of External Images
2 parents adb7459 + 59e20db commit dfa1d46

File tree

11 files changed

+169
-23
lines changed

11 files changed

+169
-23
lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org).
99

1010
### Added
1111

12+
- Add to all readers the option to allow or forbid fetching external images. This is unconditionally allowed now. The default will be set to "allow", so no code changes are necessary. However, we are giving consideration to changing the default. [PR #4543](https://github.com/PHPOffice/PhpSpreadsheet/pull/4543)
1213
- Address Excel Inappropriate Number Format Substitution. [PR #4532](https://github.com/PHPOffice/PhpSpreadsheet/pull/4532)
1314

1415
### Removed

docs/references/features-cross-reference.md

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,17 @@
3434
<td style="text-align: center;">N/A</td>
3535
<td style="text-align: center; color: green;">✔</td>
3636
</tr>
37+
<tr>
38+
<td style="padding-left: 1em;">Read Charts</td>
39+
<td style="text-align: center; color: red;">✖</td>
40+
<td style="text-align: center; color: green;">✔</td>
41+
<td style="text-align: center; color: red;">✖</td>
42+
<td style="text-align: center; color: red;">✖</td>
43+
<td style="text-align: center; color: red;">✖</td>
44+
<td style="text-align: center;">N/A</td>
45+
<td style="text-align: center;">N/A</td>
46+
<td style="text-align: center; color: red;">✖</td>
47+
</tr>
3748
<tr>
3849
<td style="padding-left: 1em;">Read Data Only (no formatting)</td>
3950
<td style="text-align: center; color: green;">✔</td>
@@ -67,6 +78,17 @@
6778
<td style="text-align: center; color: red;">✖</td>
6879
<td style="text-align: center; color: green;">✔</td>
6980
</tr>
81+
<tr>
82+
<td style="padding-left: 1em;">Read External Images</td>
83+
<td style="text-align: center; color: red;">✖</td>
84+
<td style="text-align: center; color: green;">✔ <a href="#footnote9"><sup>9</sup></a></td>
85+
<td style="text-align: center; color: red;">✖</td>
86+
<td style="text-align: center; color: red;">✖</td>
87+
<td style="text-align: center; color: red;">✖</td>
88+
<td style="text-align: center;">N/A</td>
89+
<td style="text-align: center;">N/A</td>
90+
<td style="text-align: center; color: green;">✔ <a href="#footnote9"><sup>9</sup></a></td>
91+
</tr>
7092
<tr>
7193
<td style="padding-left: 0.5em;"><strong>Document Properties</strong></td>
7294
<td style="text-align: center; color: orange;">●</td>
@@ -1006,6 +1028,7 @@
10061028
6. <span id="footnote6">There is very limited support for reading styles from an Ods spreadsheet. Writing styles has better support, although Number Format is incomplete.</span>
10071029
7. <span id="footnote7">In most cases, Html reader processes only inline styles; styles provided by Css classes may be ignored.</span>
10081030
8. <span id="footnote8">Code must [opt in](../topics/recipes.md#array-formulas) to array output.</span>
1031+
9. <span id="footnote9">Starting with release 4.5, code can allow or not external images. In release 4.5 (and in earlier releases which do not offer an option), default is to allow it.</span>
10091032

10101033
## Writers
10111034

docs/topics/reading-and-writing-to-file.md

Lines changed: 17 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -1114,7 +1114,10 @@ $reader->load("spreadsheetWithCharts.xlsx", $reader::LOAD_WITH_CHARTS);
11141114
If you wish to use the IOFactory `load()` method rather than instantiating a specific Reader, then you can still pass these flags.
11151115

11161116
```php
1117-
$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load("spreadsheetWithCharts.xlsx", \PhpOffice\PhpSpreadsheet\Reader\IReader::LOAD_WITH_CHARTS);
1117+
$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load(
1118+
"spreadsheetWithCharts.xlsx",
1119+
\PhpOffice\PhpSpreadsheet\Reader\IReader::LOAD_WITH_CHARTS
1120+
);
11181121
```
11191122

11201123
Flags that are available that can be passed to the Reader in this way include:
@@ -1123,17 +1126,19 @@ Flags that are available that can be passed to the Reader in this way include:
11231126
- $reader::READ_DATA_ONLY
11241127
- $reader::IGNORE_EMPTY_CELLS
11251128
- $reader::IGNORE_ROWS_WITH_NO_CELLS
1126-
1127-
| Readers | LOAD_WITH_CHARTS | READ_DATA_ONLY | IGNORE_EMPTY_CELLS | IGNORE_ROWS_WITH_NO_CELLS |
1128-
|----------|------------------|----------------|--------------------|---------------------------|
1129-
| Xlsx | YES | YES | YES | YES |
1130-
| Xls | NO | YES | YES | NO |
1131-
| Xml | NO | NO | NO | NO |
1132-
| Ods | NO | YES | NO | NO |
1133-
| Gnumeric | NO | YES | NO | NO |
1134-
| Html | N/A | N/A | N/A | N/A |
1135-
| Slk | N/A | NO | NO | NO |
1136-
| Csv | N/A | NO | NO | NO |
1129+
- $reader::ALLOW_EXTERNAL_IMAGES (starting with release 4.5)
1130+
- $reader::DONT_ALLOW_EXTERNAL_IMAGES (starting with release 4.5)
1131+
1132+
| Readers | LOAD<br>CHARTS | DATA<br>ONLY | IGNORE<br>EMPTY | IGNORE<br>ROWS | EXTERNAL<br>IMAGES |
1133+
|----------|--------|------|--------|--------|--------|
1134+
| Xlsx | YES | YES | YES | YES | YES |
1135+
| Xls | NO | YES | YES | NO | NO |
1136+
| Xml | NO | NO | NO | NO | NO |
1137+
| Ods | NO | YES | NO | NO | NO |
1138+
| Gnumeric | NO | YES | NO | NO | NO |
1139+
| Html | N/A | N/A | N/A | N/A | YES |
1140+
| Slk | N/A | NO | NO | NO | N/A |
1141+
| Csv | N/A | NO | NO | NO | N/A |
11371142

11381143
Likewise, when saving a file using a Writer, loaded charts will not be saved unless you explicitly tell the Writer to include them:
11391144

src/PhpSpreadsheet/Reader/BaseReader.php

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -47,6 +47,13 @@ abstract class BaseReader implements IReader
4747
*/
4848
protected bool $ignoreRowsWithNoCells = false;
4949

50+
/**
51+
* Allow external images. Use with caution.
52+
* Improper specification of these within a spreadsheet
53+
* can subject the caller to security exploits.
54+
*/
55+
protected bool $allowExternalImages = true;
56+
5057
/**
5158
* IReadFilter instance.
5259
*/
@@ -149,6 +156,23 @@ public function setReadFilter(IReadFilter $readFilter): self
149156
return $this;
150157
}
151158

159+
/**
160+
* Allow external images. Use with caution.
161+
* Improper specification of these within a spreadsheet
162+
* can subject the caller to security exploits.
163+
*/
164+
public function setAllowExternalImages(bool $allowExternalImages): self
165+
{
166+
$this->allowExternalImages = $allowExternalImages;
167+
168+
return $this;
169+
}
170+
171+
public function getAllowExternalImages(): bool
172+
{
173+
return $this->allowExternalImages;
174+
}
175+
152176
public function getSecurityScanner(): ?XmlScanner
153177
{
154178
return $this->securityScanner;
@@ -177,6 +201,12 @@ protected function processFlags(int $flags): void
177201
if (((bool) ($flags & self::IGNORE_ROWS_WITH_NO_CELLS)) === true) {
178202
$this->setIgnoreRowsWithNoCells(true);
179203
}
204+
if (((bool) ($flags & self::ALLOW_EXTERNAL_IMAGES)) === true) {
205+
$this->setAllowExternalImages(true);
206+
}
207+
if (((bool) ($flags & self::DONT_ALLOW_EXTERNAL_IMAGES)) === true) {
208+
$this->setAllowExternalImages(false);
209+
}
180210
}
181211

182212
protected function loadSpreadsheetFromFile(string $filename): Spreadsheet

src/PhpSpreadsheet/Reader/Html.php

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1129,7 +1129,7 @@ private function insertImage(Worksheet $sheet, string $column, int $row, array $
11291129
$name = $attributes['alt'] ?? null;
11301130

11311131
$drawing = new Drawing();
1132-
$drawing->setPath($src, false);
1132+
$drawing->setPath($src, false, allowExternal: $this->allowExternalImages);
11331133
if ($drawing->getPath() === '') {
11341134
return;
11351135
}

src/PhpSpreadsheet/Reader/IReader.php

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,14 @@ interface IReader
3333
*/
3434
public const IGNORE_ROWS_WITH_NO_CELLS = 8;
3535

36+
/**
37+
* Allow external images. Use with caution.
38+
* Improper specification of these within a spreadsheet
39+
* can subject the caller to security exploits.
40+
*/
41+
public const ALLOW_EXTERNAL_IMAGES = 16;
42+
public const DONT_ALLOW_EXTERNAL_IMAGES = 32;
43+
3644
public function __construct();
3745

3846
/**
@@ -132,6 +140,15 @@ public function getReadFilter(): IReadFilter;
132140
*/
133141
public function setReadFilter(IReadFilter $readFilter): self;
134142

143+
/**
144+
* Allow external images. Use with caution.
145+
* Improper specification of these within a spreadsheet
146+
* can subject the caller to security exploits.
147+
*/
148+
public function setAllowExternalImages(bool $allowExternalImages): self;
149+
150+
public function getAllowExternalImages(): bool;
151+
135152
/**
136153
* Loads PhpSpreadsheet from file.
137154
*
@@ -141,6 +158,9 @@ public function setReadFilter(IReadFilter $readFilter): self;
141158
* self::READ_DATA_ONLY Read only data, not style or structure information, from the file
142159
* self::IGNORE_EMPTY_CELLS Don't read empty cells (cells that contain a null value,
143160
* empty string, or a string containing only whitespace characters)
161+
* self::IGNORE_ROWS_WITH_NO_CELLS Don't load any rows that contain no cells.
162+
* self::ALLOW_EXTERNAL_IMAGES Attempt to fetch images stored outside the spreadsheet.
163+
* self::DONT_ALLOW_EXTERNAL_IMAGES Don't attempt to fetch images stored outside the spreadsheet.
144164
*/
145165
public function load(string $filename, int $flags = 0): Spreadsheet;
146166
}

src/PhpSpreadsheet/Reader/Xlsx.php

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1518,7 +1518,7 @@ protected function loadSpreadsheetFromFile(string $filename): Spreadsheet
15181518
);
15191519
if (isset($images[$linkImageKey])) {
15201520
$url = str_replace('xl/drawings/', '', $images[$linkImageKey]);
1521-
$objDrawing->setPath($url, false);
1521+
$objDrawing->setPath($url, false, allowExternal: $this->allowExternalImages);
15221522
}
15231523
if ($objDrawing->getPath() === '') {
15241524
continue;
@@ -1622,7 +1622,7 @@ protected function loadSpreadsheetFromFile(string $filename): Spreadsheet
16221622
);
16231623
if (isset($images[$linkImageKey])) {
16241624
$url = str_replace('xl/drawings/', '', $images[$linkImageKey]);
1625-
$objDrawing->setPath($url, false);
1625+
$objDrawing->setPath($url, false, allowExternal: $this->allowExternalImages);
16261626
}
16271627
if ($objDrawing->getPath() === '') {
16281628
continue;

src/PhpSpreadsheet/Worksheet/Drawing.php

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -92,7 +92,7 @@ public function getPath(): string
9292
*
9393
* @return $this
9494
*/
95-
public function setPath(string $path, bool $verifyFile = true, ?ZipArchive $zip = null): static
95+
public function setPath(string $path, bool $verifyFile = true, ?ZipArchive $zip = null, bool $allowExternal = true): static
9696
{
9797
$this->isUrl = false;
9898
if (preg_match('~^data:image/[a-z]+;base64,~', $path) === 1) {
@@ -107,6 +107,9 @@ public function setPath(string $path, bool $verifyFile = true, ?ZipArchive $zip
107107
if (!preg_match('/^(http|https|file|ftp|s3):/', $path)) {
108108
throw new PhpSpreadsheetException('Invalid protocol for linked drawing');
109109
}
110+
if (!$allowExternal) {
111+
return $this;
112+
}
110113
// Implicit that it is a URL, rather store info than running check above on value in other places.
111114
$this->isUrl = true;
112115
$ctx = null;

tests/PhpSpreadsheetTests/Reader/Html/HtmlHelper.php

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -18,9 +18,12 @@ public static function createHtml(string $html): string
1818
return $filename;
1919
}
2020

21-
public static function loadHtmlIntoSpreadsheet(string $filename, bool $unlink = false): Spreadsheet
21+
public static function loadHtmlIntoSpreadsheet(string $filename, bool $unlink = false, ?bool $allowExternalImages = null): Spreadsheet
2222
{
2323
$html = new Html();
24+
if ($allowExternalImages !== null) {
25+
$html->setAllowExternalImages($allowExternalImages);
26+
}
2427
$spreadsheet = $html->load($filename);
2528
if ($unlink) {
2629
unlink($filename);

tests/PhpSpreadsheetTests/Reader/Html/HtmlImage2Test.php

Lines changed: 37 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
1111

1212
class HtmlImage2Test extends TestCase
1313
{
14-
public function testCanInsertImageGoodProtocol(): void
14+
public function testCanInsertImageGoodProtocolAllowed(): void
1515
{
1616
if (getenv('SKIP_URL_IMAGE_TEST') === '1') {
1717
self::markTestSkipped('Skipped due to setting of environment variable');
@@ -23,16 +23,32 @@ public function testCanInsertImageGoodProtocol(): void
2323
</tr>
2424
</table>';
2525
$filename = HtmlHelper::createHtml($html);
26-
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true);
26+
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true, true);
2727
$firstSheet = $spreadsheet->getSheet(0);
2828

2929
/** @var Drawing $drawing */
3030
$drawing = $firstSheet->getDrawingCollection()[0];
3131
self::assertEquals($imagePath, $drawing->getPath());
3232
self::assertEquals('A1', $drawing->getCoordinates());
33+
$spreadsheet->disconnectWorksheets();
3334
}
3435

35-
public function testCantInsertImageNotFound(): void
36+
public function testCanInsertImageGoodProtocolNotAllowed(): void
37+
{
38+
$imagePath = 'https://phpspreadsheet.readthedocs.io/en/stable/topics/images/01-03-filter-icon-1.png';
39+
$html = '<table>
40+
<tr>
41+
<td><img src="' . $imagePath . '" alt="test image voilà"></td>
42+
</tr>
43+
</table>';
44+
$filename = HtmlHelper::createHtml($html);
45+
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true, false);
46+
$firstSheet = $spreadsheet->getSheet(0);
47+
self::assertCount(0, $firstSheet->getDrawingCollection());
48+
$spreadsheet->disconnectWorksheets();
49+
}
50+
51+
public function testCantInsertImageNotFoundAllowed(): void
3652
{
3753
if (getenv('SKIP_URL_IMAGE_TEST') === '1') {
3854
self::markTestSkipped('Skipped due to setting of environment variable');
@@ -44,10 +60,27 @@ public function testCantInsertImageNotFound(): void
4460
</tr>
4561
</table>';
4662
$filename = HtmlHelper::createHtml($html);
47-
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true);
63+
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true, true);
64+
$firstSheet = $spreadsheet->getSheet(0);
65+
$drawingCollection = $firstSheet->getDrawingCollection();
66+
self::assertCount(0, $drawingCollection);
67+
$spreadsheet->disconnectWorksheets();
68+
}
69+
70+
public function testCantInsertImageNotFoundNotAllowed(): void
71+
{
72+
$imagePath = 'https://phpspreadsheet.readthedocs.io/en/latest/topics/images/xxx01-03-filter-icon-1.png';
73+
$html = '<table>
74+
<tr>
75+
<td><img src="' . $imagePath . '" alt="test image voilà"></td>
76+
</tr>
77+
</table>';
78+
$filename = HtmlHelper::createHtml($html);
79+
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true, false);
4880
$firstSheet = $spreadsheet->getSheet(0);
4981
$drawingCollection = $firstSheet->getDrawingCollection();
5082
self::assertCount(0, $drawingCollection);
83+
$spreadsheet->disconnectWorksheets();
5184
}
5285

5386
#[DataProvider('providerBadProtocol')]

0 commit comments

Comments
 (0)