Commit 2d67646

Merge pull request #4532 from oleibman/builtinnumfmt
Excel Inappropriate Number Format Substitution
2 parents a106886 + 8fb10a6 commit 2d67646

5 files changed: +134 -2 lines changed

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org).
 
 ### Added
 
-- Nothing yet.
+- Address Excel Inappropriate Number Format Substitution. [PR #4532](https://github.com/PHPOffice/PhpSpreadsheet/pull/4532)
 
 ### Removed

docs/topics/Excel Anomalies.md

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
# Excel Anomalies

This is documentation for some behavior in Excel itself which we
just do not understand, or which may come as a surprise to the user.

## Date Number Format

My system short date format is set to `yyyy-mm-dd`. Excel, for a very long time, did not include that amongst its formatting choices for dates, so it needed to be added as a custom format - no big deal. It has recently been added to the list of date formats, but ...

I used Excel to create a spreadsheet, and included some dates, specifying `yyyy-mm-dd` formatting. When I looked at the resulting spreadsheet, I was surprised to see that Excel had stored the style not as `yyyy-mm-dd`, but rather as builtin style 14 (system short date format). Apparently the fact that the Excel styling matched my system choice was sufficient for it to override my choice! This is an astonishingly user-hostile implementation. Even though there are formats which, by design, "respond to changes in regional date and time settings", and even though the format I selected was not among those, Excel decided it was appropriate to vary the display even when I said I wanted an unvarying format. I assume, but have not confirmed, that this applies to formats other than `yyyy-mm-dd`.

Note that this is not a problem when using PhpSpreadsheet to set the style, only when you let Excel do it. And, in that case, after a little experimentation, I figured out a format that Excel doesn't sabotage: `[Black]yyyy-mm-dd`.

If you have a spreadsheet that has been altered in this way, it can be fixed with the following PhpSpreadsheet code:
```php
foreach ($spreadsheet->getCellXfCollection() as $style) {
    $numberFormat = $style->getNumberFormat();
    // okay to use NumberFormat::SHORT_DATE_INDEX below
    if ($numberFormat->getBuiltInFormatCode() === 14) {
        $numberFormat->setFormatCode('yyyy-mm-dd');
    }
}
```
Starting with PhpSpreadsheet 4.5.0, this can be simplified to:
```php
$spreadsheet->replaceBuiltinNumberFormat(
    \PhpOffice\PhpSpreadsheet\Style\NumberFormat::SHORT_DATE_INDEX,
    'yyyy-mm-dd'
);
```
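
As a minimal sketch, the fix can be wrapped in a load/modify/save round trip. The helper name `restoreShortDateFormat` and its parameters are hypothetical, and the sketch assumes that Composer's autoloader has already been loaded and that PhpSpreadsheet 4.5.0 or later is installed.

```php
<?php

use PhpOffice\PhpSpreadsheet\IOFactory;
use PhpOffice\PhpSpreadsheet\Style\NumberFormat;

/**
 * Hypothetical helper (not part of PhpSpreadsheet): load a workbook,
 * turn builtin style 14 back into an explicit format code, and save
 * the result as a new Xlsx file.
 */
function restoreShortDateFormat(
    string $inputPath,
    string $outputPath,
    string $formatCode = 'yyyy-mm-dd'
): void {
    // IOFactory::load() picks a reader based on the input file.
    $spreadsheet = IOFactory::load($inputPath);

    // Replace every cell style that uses builtin format 14.
    $spreadsheet->replaceBuiltinNumberFormat(
        NumberFormat::SHORT_DATE_INDEX,
        $formatCode
    );

    IOFactory::createWriter($spreadsheet, 'Xlsx')->save($outputPath);

    // Break cyclic references so the workbook can be garbage-collected.
    $spreadsheet->disconnectWorksheets();
}
```

A call might look like `restoreShortDateFormat('input.xlsx', 'output.xlsx');`, where both file names are placeholders.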

## Negative Time Intervals

You have a time in one cell and a time in another, and you want to subtract one from the other and display the result in `h:mm` format. No problem if the result is positive. But, if it's negative, Excel just fills the cell with `#`. There is a solution of sorts: if you use a 1904 base date (default on Mac), the negative interval will work just fine. Alas, no dice if you use a 1900 base date (default on Windows). No idea why they can't fix that - the existing implementation can't really be something that anybody actually wants. Note that it is *not* safe to change the base date for an existing spreadsheet, so, if this is something you want to do, make sure you change the base date before populating any data.
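
If the workbook has to stay on the 1900 base date, one possible workaround is to format the interval in PHP rather than with an Excel number format. The sketch below is plain PHP. `formatSignedDuration` is a hypothetical helper, not part of PhpSpreadsheet, and it assumes the interval is held the way Excel stores times, as a fraction of a day.

```php
<?php

/**
 * Hypothetical helper: render a (possibly negative) fraction of a day
 * as a signed h:mm string, rounded to the nearest minute.
 */
function formatSignedDuration(float $dayFraction): string
{
    // Work with the magnitude; the sign is added back at the end.
    $totalMinutes = (int) round(abs($dayFraction) * 24 * 60);

    // Suppress the sign when the value rounds to zero minutes.
    $sign = ($dayFraction < 0 && $totalMinutes > 0) ? '-' : '';

    return sprintf('%s%d:%02d', $sign, intdiv($totalMinutes, 60), $totalMinutes % 60);
}

// 08:00 minus 17:30, both expressed as fractions of a day.
echo formatSignedDuration(8 / 24 - 17.5 / 24), PHP_EOL; // -9:30
```

The result is a string, so it would need to be written to the cell as text; it no longer behaves as a number inside Excel.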

## Long-ago Dates

Excel does not support dates before either 1900-01-01 (Windows default) or 1904-01-01 (Mac default). For the 1900 base year, there is the additional problem that the non-existent date 1900-02-29 is squeezed between 1900-02-28 and 1900-03-01.
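
To make the 1900 quirk concrete, here is a small plain-PHP sketch that maps whole-number serials in the 1900 date system to real calendar dates. `excel1900SerialToIso` is a hypothetical helper for illustration only. It is not how PhpSpreadsheet converts dates, and it ignores time-of-day fractions.

```php
<?php

/**
 * Hypothetical helper: convert a whole-number serial in Excel's
 * 1900 date system to an ISO date, or null if no real date exists.
 */
function excel1900SerialToIso(int $serial): ?string
{
    if ($serial < 1) {
        return null; // before 1900-01-01: not representable
    }
    if ($serial === 60) {
        return null; // Excel's 1900-02-29, a date that never existed
    }

    // Serial 1 is 1900-01-01. Serials above 60 are one day "late"
    // because of the phantom leap day, so subtract one more.
    $offset = $serial < 60 ? $serial - 1 : $serial - 2;

    return (new DateTimeImmutable('1900-01-01', new DateTimeZone('UTC')))
        ->modify("+{$offset} days")
        ->format('Y-m-d');
}

echo excel1900SerialToIso(59), PHP_EOL; // 1900-02-28
echo excel1900SerialToIso(61), PHP_EOL; // 1900-03-01
```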

## Weird Fractions

Similar fraction formats have inconsistent results in Excel. For example, if a cell contains the value 1 and the cell's format is `0 0/0`, it will display as `1 0/1`. But, if the cell's format is `? ??/???`, it will display as `1`. See [this issue](https://github.com/PHPOffice/PhpSpreadsheet/issues/3625), which remains open because, in the absence of usable documentation, we aren't sure how to handle things.

## COUNTIF and Text Cells

In Excel, COUNTIF appears to ignore text cells, behavior which doesn't seem to be documented anywhere. See [this issue](https://github.com/PHPOffice/PhpSpreadsheet/issues/3802), which remains open because, in the absence of usable documentation, we aren't sure how to handle things.

docs/topics/The Dating Game.md

Lines changed: 1 addition & 1 deletion
@@ -184,7 +184,7 @@ MS Excel allows any separator character between hours/minutes/seconds; PhpSpread
 
 ### Duration (Elapsed Time)
 
-Excel also supports formatting a value as a duration; a total number of hours, minutes or seconds rather than a time of day.
+Excel also supports formatting a value as a duration; a total number of hours, minutes or seconds rather than a time of day. However, please note that negative durations are supported only if using base year 1904 (Mac default).
 
 | Code | Description | Displays as |
 |---------|----------------------------------------------------------------|-------------|

src/PhpSpreadsheet/Spreadsheet.php

Lines changed: 17 additions & 0 deletions
@@ -1784,4 +1784,21 @@ public function mergeDrawingCellsForPdf(): void
             }
         }
     }
+
+    /**
+     * Excel will sometimes replace user's formatting choice
+     * with a built-in choice that it thinks is equivalent.
+     * Its choice is often not equivalent after all.
+     * Such treatment is astonishingly user-hostile.
+     * This function will undo such changes.
+     */
+    public function replaceBuiltinNumberFormat(int $builtinFormatIndex, string $formatCode): void
+    {
+        foreach ($this->cellXfCollection as $style) {
+            $numberFormat = $style->getNumberFormat();
+            if ($numberFormat->getBuiltInFormatCode() === $builtinFormatIndex) {
+                $numberFormat->setFormatCode($formatCode);
+            }
+        }
+    }
 }
Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
<?php

declare(strict_types=1);

namespace PhpOffice\PhpSpreadsheetTests\Reader\Xlsx;

use PhpOffice\PhpSpreadsheet\Spreadsheet;
use PhpOffice\PhpSpreadsheet\Style\NumberFormat;
use PhpOffice\PhpSpreadsheetTests\Functional\AbstractFunctional;

class ReplaceBuiltinNumberFormatTest extends AbstractFunctional
{
    private ?Spreadsheet $spreadsheet = null;

    private ?Spreadsheet $reloadedSpreadsheet = null;

    protected function tearDown(): void
    {
        if ($this->spreadsheet !== null) {
            $this->spreadsheet->disconnectWorksheets();
            $this->spreadsheet = null;
        }
        if ($this->reloadedSpreadsheet !== null) {
            $this->reloadedSpreadsheet->disconnectWorksheets();
            $this->reloadedSpreadsheet = null;
        }
    }

    public function testReplaceBuiltinNumberFormat(): void
    {
        $spreadsheet = $this->spreadsheet = new Spreadsheet();
        $sheet = $this->spreadsheet->getActiveSheet();
        $sheet->fromArray([45486, 1023, 45487, 45488, 45489]);
        $sheet->getStyle('A1')->getNumberFormat()
            ->setBuiltInFormatCode(NumberFormat::SHORT_DATE_INDEX);
        $sheet->getStyle('B1')->getNumberFormat()
            ->setFormatCode('#,##0.00');
        $sheet->getStyle('C1')->getNumberFormat()
            ->setBuiltInFormatCode(NumberFormat::SHORT_DATE_INDEX);
        $sheet->getStyle('D1')->getNumberFormat()
            ->setFormatCode('dd-MMM-yyyy');
        $sheet->getStyle('E1')->getNumberFormat()
            ->setBuiltInFormatCode(16);
        $values = $sheet->toArray();
        $expected = [[
            '7/13/2024', // builtin style 14
            '1,023.00', // #,##0.00
            '7/14/2024', // builtin style 14
            '15-Jul-2024', // dd-MMM-yyyy
            '16-Jul', // builtin style 16
        ]];
        self::assertSame($expected, $values);
        $this->reloadedSpreadsheet = $this->writeAndReload($spreadsheet, 'Xlsx');
        $this->reloadedSpreadsheet->replaceBuiltinNumberFormat(
            NumberFormat::SHORT_DATE_INDEX,
            'yyyy-mm-dd'
        );
        $rsheet = $this->reloadedSpreadsheet->getActiveSheet();
        $newValues = $rsheet->toArray();
        $newExpected = [[
            '2024-07-13', // yyyy-mm-dd changed from builtin style 14
            '1,023.00', // unchanged #,##0.00
            '2024-07-14', // yyyy-mm-dd changed from builtin style 14
            '15-Jul-2024', // unchanged dd-MMM-yyyy
            '16-Jul', // unchanged builtin style 16
        ]];
        self::assertSame($newExpected, $newValues);
    }
}
