Commit ace1755

Excel Inappropriate Number Format Substitution
My system short date format is set to `yyyy-mm-dd`. I used Excel to create a spreadsheet, and included some dates, specifying `yyyy-mm-dd` formatting. When I looked at the resulting spreadsheet, I was surprised to see that Excel had stored the style not as `yyyy-mm-dd`, but rather as builtin style 14 (system short date format). Apparently the fact that the Excel styling matched my system choice was sufficient for it to override my choice! This is an astonishingly user-hostile implementation. Even though there are formats which, by design, "respond to changes in regional date and time settings", and even though the format I selected was not among those, Excel decided it was appropriate to vary the display even when I said I wanted an unvarying format.

This PR adds a new method `replaceBuiltinNumberFormat` to undo the damage that Excel does in such a situation. It also adds an `Excel Anomalies` document to the formal documentation, just to make situations like this readily available to the community.

BTW, Excel's sabotage can be avoided by using a number format style like `[Black]yyyy-mm-dd`.
1 parent 747ccd1 commit ace1755

4 files changed: +131 -1 lines changed

docs/topics/Excel Anomalies.md

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
# Excel Anomalies

This is documentation for some behavior in Excel itself which we
just do not understand, or which may come as a surprise to the user.

## Date Number Format

My system short date format is set to `yyyy-mm-dd`. Excel, for a very long time, did not include that amongst its formatting choices for dates, so it needed to be added as a custom format - no big deal. It has recently been added to the list of date formats, but ...

I used Excel to create a spreadsheet, and included some dates, specifying `yyyy-mm-dd` formatting. When I looked at the resulting spreadsheet, I was surprised to see that Excel had stored the style not as `yyyy-mm-dd`, but rather as builtin style 14 (system short date format). Apparently the fact that the Excel styling matched my system choice was sufficient for it to override my choice! This is an astonishingly user-hostile implementation. Even though there are formats which, by design, "respond to changes in regional date and time settings", and even though the format I selected was not among those, Excel decided it was appropriate to vary the display even when I said I wanted an unvarying format. I assume, but have not confirmed, that this applies to formats other than `yyyy-mm-dd`.

Note that this is not a problem when using PhpSpreadsheet to set the style, only when you let Excel do it. And, in that case, after a little experimentation, I figured out a format that Excel doesn't sabotage: `[Black]yyyy-mm-dd`.

If you have a spreadsheet that has been altered in this way, it can be fixed with the following PhpSpreadsheet code:
```php
foreach ($spreadsheet->getCellXfCollection() as $style) {
    $numberFormat = $style->getNumberFormat();
    if ($numberFormat->getBuiltInFormatCode() === 14) {
        $numberFormat->setFormatCode('yyyy-mm-dd');
    }
}
```
Starting with PhpSpreadsheet 4.5.0, this can be simplified to:
```php
$spreadsheet->replaceBuiltinNumberFormat(
    14,
    'yyyy-mm-dd'
);
```

## Negative Time Intervals

You have a time in one cell, and a time in another, and you want to subtract and display the result in `h:mm` format. No problem if the result is positive. But, if it's negative, Excel just fills the cell with `#`. There is a solution of sorts. If you use a 1904 base date (default on Mac), the negative interval will work just fine. Alas, no dice if you use a 1900 base date (default on Windows). No idea why they can't fix that - the existing implementation can't really be something that anybody actually wants. Note that it is *not* safe to change the base date for an existing spreadsheet, so, if this is something you want to do, make sure you change the base date before populating any data.

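To illustrate why switching the base date on a populated spreadsheet is unsafe, here is a minimal sketch in plain PHP. The helper `serialToDateInSystem` is a hypothetical name used only for this illustration, not part of PhpSpreadsheet, and it assumes whole-day serial numbers above 60 when the 1900 system is used. The same stored number lands 1462 days later once it is reinterpreted under the 1904 system:

```php
/**
 * Hypothetical helper (not a PhpSpreadsheet function): interpret a whole-day
 * serial number under either the 1900 or the 1904 date system.
 * Assumption: for the 1900 system, only serials above 60 are passed in.
 */
function serialToDateInSystem(int $serial, bool $use1904): string
{
    // 1904 system: serial 0 is 1904-01-01.
    // 1900 system: serials above 60 count days from 1899-12-30.
    $base = new DateTimeImmutable(
        $use1904 ? '1904-01-01' : '1899-12-30',
        new DateTimeZone('UTC')
    );

    return $base->modify("+{$serial} days")->format('Y-m-d');
}

echo serialToDateInSystem(45486, false), PHP_EOL; // 2024-07-13
echo serialToDateInSystem(45486, true), PHP_EOL;  // 2028-07-14
```

Cells that already hold data keep their stored serial numbers, so every existing date would shift by that amount if the base date were changed afterwards.
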
## Long-ago Dates

Excel does not support dates before either 1900-01-01 (Windows default) or 1904-01-01 (Mac default). For the 1900 base year, there is the additional problem that the non-existent date 1900-02-29 is squeezed between 1900-02-28 and 1900-03-01.

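The phantom leap day can be sketched in plain PHP as follows. `excel1900SerialToDate` is a hypothetical helper written for this illustration, not a PhpSpreadsheet function; it handles whole-day serials in the 1900 system only, and assumes serial 1 corresponds to 1900-01-01.

```php
/**
 * Hypothetical helper (not a PhpSpreadsheet function): convert a whole-day
 * serial number in Excel's 1900 date system to a 'Y-m-d' string.
 */
function excel1900SerialToDate(int $serial): string
{
    if ($serial < 1) {
        throw new InvalidArgumentException('Serials below 1 are not dates in the 1900 system.');
    }
    if ($serial === 60) {
        // Excel displays this serial as 1900-02-29, a day that never existed.
        return '1900-02-29';
    }
    // Serials 1..59 count days from 1899-12-31. Serials after the phantom day
    // are one higher than the real day count, so they count from 1899-12-30.
    $base = new DateTimeImmutable(
        $serial < 60 ? '1899-12-31' : '1899-12-30',
        new DateTimeZone('UTC')
    );

    return $base->modify("+{$serial} days")->format('Y-m-d');
}

echo excel1900SerialToDate(59), PHP_EOL;    // 1900-02-28
echo excel1900SerialToDate(60), PHP_EOL;    // 1900-02-29 (phantom day)
echo excel1900SerialToDate(61), PHP_EOL;    // 1900-03-01
echo excel1900SerialToDate(45486), PHP_EOL; // 2024-07-13
```

The special case for serial 60 is needed because PHP's calendar, correctly, has no 1900-02-29 to map it to.
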
## Weird Fractions

Similar fraction formats have inconsistent results in Excel. For example, if a cell contains the value 1 and the cell's format is `0 0/0`, it will display as `1 0/1`. But, if the cell's format is `? ??/???`, it will display as `1`. See [this issue](https://github.com/PHPOffice/PhpSpreadsheet/issues/3625), which remains open because, in the absence of usable documentation, we aren't sure how to handle things.

## COUNTIF and Text Cells

In Excel, COUNTIF appears to ignore text cells, behavior which doesn't seem to be documented anywhere. See [this issue](https://github.com/PHPOffice/PhpSpreadsheet/issues/3802), which remains open because, in the absence of usable documentation, we aren't sure how to handle things.

docs/topics/The Dating Game.md

Lines changed: 1 addition & 1 deletion
@@ -184,7 +184,7 @@ MS Excel allows any separator character between hours/minutes/seconds; PhpSpread

 ### Duration (Elapsed Time)

-Excel also supports formatting a value as a duration; a total number of hours, minutes or seconds rather than a time of day.
+Excel also supports formatting a value as a duration; a total number of hours, minutes or seconds rather than a time of day. However, please note that negative durations are supported only if using base year 1904 (Mac default).

 | Code | Description | Displays as |
 |---------|----------------------------------------------------------------|-------------|

src/PhpSpreadsheet/Spreadsheet.php

Lines changed: 17 additions & 0 deletions
@@ -1784,4 +1784,21 @@ public function mergeDrawingCellsForPdf(): void
             }
         }
     }
+
+    /**
+     * Excel will sometimes replace user's formatting choice
+     * with a built-in choice that it thinks is equivalent.
+     * Its choice is often not equivalent after all.
+     * Such treatment is astonishingly user-hostile.
+     * This function will undo such changes.
+     */
+    public function replaceBuiltinNumberFormat(int $builtinFormatIndex, string $formatCode): void
+    {
+        foreach ($this->cellXfCollection as $style) {
+            $numberFormat = $style->getNumberFormat();
+            if ($numberFormat->getBuiltInFormatCode() === $builtinFormatIndex) {
+                $numberFormat->setFormatCode($formatCode);
+            }
+        }
+    }
 }
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
<?php

declare(strict_types=1);

namespace PhpOffice\PhpSpreadsheetTests\Reader\Xlsx;

use PhpOffice\PhpSpreadsheet\Spreadsheet;
use PhpOffice\PhpSpreadsheetTests\Functional\AbstractFunctional;

class ReplaceBuiltinNumberFormatTest extends AbstractFunctional
{
    private ?Spreadsheet $spreadsheet = null;

    private ?Spreadsheet $reloadedSpreadsheet = null;

    protected function tearDown(): void
    {
        if ($this->spreadsheet !== null) {
            $this->spreadsheet->disconnectWorksheets();
            $this->spreadsheet = null;
        }
        if ($this->reloadedSpreadsheet !== null) {
            $this->reloadedSpreadsheet->disconnectWorksheets();
            $this->reloadedSpreadsheet = null;
        }
    }

    public function testJustifyLastLine(): void
    {
        $spreadsheet = $this->spreadsheet = new Spreadsheet();
        $sheet = $this->spreadsheet->getActiveSheet();
        $sheet->fromArray([45486, 1023, 45487, 45488, 45489]);
        $sheet->getStyle('A1')->getNumberFormat()
            ->setBuiltInFormatCode(14);
        $sheet->getStyle('B1')->getNumberFormat()
            ->setFormatCode('#,##0.00');
        $sheet->getStyle('C1')->getNumberFormat()
            ->setBuiltInFormatCode(14);
        $sheet->getStyle('D1')->getNumberFormat()
            ->setFormatCode('dd-MMM-yyyy');
        $sheet->getStyle('E1')->getNumberFormat()
            ->setBuiltInFormatCode(16);
        $values = $sheet->toArray();
        $expected = [[
            '7/13/2024', // builtin style 14
            '1,023.00', // #,##0.00
            '7/14/2024', // builtin style 14
            '15-Jul-2024', // dd-MMM-yyyy
            '16-Jul', // builtin style 16
        ]];
        self::assertSame($expected, $values);
        $this->reloadedSpreadsheet = $this->writeAndReload($spreadsheet, 'Xlsx');
        $this->reloadedSpreadsheet->replaceBuiltinNumberFormat(
            14,
            'yyyy-mm-dd'
        );
        $rsheet = $this->reloadedSpreadsheet->getActiveSheet();
        $newValues = $rsheet->toArray();
        $newExpected = [[
            '2024-07-13', // yyyy-mm-dd changed from builtin style 14
            '1,023.00', // unchanged #,##0.00
            '2024-07-14', // yyyy-mm-dd changed from builtin style 14
            '15-Jul-2024', // unchanged dd-MMM-yyyy
            '16-Jul', // unchanged builtin style 16
        ]];
        self::assertSame($newExpected, $newValues);
    }
}
