Commit 368bc96

authored
Merge 2023-02 LWG Motion 11
P2572R1 std::format fill character allowances
2 parents 96fb041 + b4fbb5d commit 368bc96

1 file changed

+115
-74
lines changed

source/utilities.tex

Lines changed: 115 additions & 74 deletions
@@ -14774,7 +14774,7 @@
1477414774
\fmtgrammarterm{std-format-spec}.
1477514775
\begin{note}
1477614776
The format specification can be used to specify such details as
14777-
field width, alignment, padding, and decimal precision.
14777+
minimum field width, alignment, padding, and decimal precision.
1477814778
Some of the formatting options
1477914779
are only supported for arithmetic types.
1478014780
\end{note}
@@ -14823,63 +14823,108 @@
1482314823
\end{ncbnf}
1482414824

1482514825
\pnum
14826+
Field widths are specified in \defnadj{field width}{units};
14827+
the number of column positions required to display a sequence of
14828+
characters in a terminal.
14829+
The \defnadj{minimum}{field width}
14830+
is the number of field width units a replacement field minimally requires of
14831+
the formatted sequence of characters produced for a format argument.
14832+
The \defnadj{estimated}{field width} is the number of field width units
14833+
that are required for the formatted sequence of characters
14834+
produced for a format argument independent of
14835+
the effects of the \fmtgrammarterm{width} option.
14836+
The \defnadj{padding}{width} is the greater of \tcode{0} and
14837+
the difference of the minimum field width and the estimated field width.
14838+
14839+
\begin{note}
14840+
The POSIX \tcode{wcswidth} function is an example of a function that,
14841+
given a string, returns the number of column positions required by
14842+
a terminal to display the string.
14843+
\end{note}
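
The definition of padding width above can be sketched as a small function. This is only an illustrative model under stated assumptions: `padding_width` is a hypothetical helper name, not a standard library facility, and both inputs are assumed to be already expressed in field width units.

```cpp
#include <cstddef>

// Hypothetical helper (not part of the standard library).
// Models: padding width = max(0, minimum field width - estimated field width).
std::size_t padding_width(std::size_t minimum_field_width,
                          std::size_t estimated_field_width) {
    // The operands are unsigned, so compare before subtracting
    // to avoid wrap-around when the estimate exceeds the minimum.
    return minimum_field_width > estimated_field_width
               ? minimum_field_width - estimated_field_width
               : 0;
}
```

For instance, a minimum field width of 6 and an estimated field width of 2 give a padding width of 4, while an estimated field width of 8 gives 0.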
14844+
14845+
\pnum
14846+
The \defnadj{fill}{character} is the character denoted by
14847+
the \fmtgrammarterm{fill} option or,
14848+
if the \fmtgrammarterm{fill} option is absent, the space character.
14849+
For a format specification in UTF-8, UTF-16, or UTF-32,
14850+
the fill character corresponds to a single Unicode scalar value.
1482614851
\begin{note}
14827-
The \fmtgrammarterm{fill} character can be any character
14828-
other than \tcode{\{} or \tcode{\}}.
14829-
The presence of a fill character is signaled by the
14830-
character following it, which must be one of the alignment options.
14852+
The presence of a \fmtgrammarterm{fill} option
14853+
is signaled by the character following it,
14854+
which must be one of the alignment options.
1483114855
If the second character of \fmtgrammarterm{std-format-spec}
1483214856
is not a valid alignment option,
14833-
then it is assumed that both the fill character and the alignment option are
14834-
absent.
14857+
then it is assumed that
14858+
the \fmtgrammarterm{fill} and \fmtgrammarterm{align} options
14859+
are both absent.
1483514860
\end{note}
1483614861

1483714862
\pnum
14838-
The \fmtgrammarterm{align} specifier applies to all argument types.
14863+
The \fmtgrammarterm{align} option applies to all argument types.
1483914864
The meaning of the various alignment options is as specified in \tref{format.align}.
1484014865
\begin{example}
14866+
%FIXME: example is incomplete, sB and sC result in:
14867+
%Error: Invalid UTF-8 byte sequence.
1484114868
\begin{codeblock}
1484214869
char c = 120;
14843-
string s0 = format("{:6}", 42); // value of \tcode{s0} is \tcode{"\ \ \ \ 42"}
14844-
string s1 = format("{:6}", 'x'); // value of \tcode{s1} is \tcode{"x\ \ \ \ \ "}
14845-
string s2 = format("{:*<6}", 'x'); // value of \tcode{s2} is \tcode{"x*****"}
14846-
string s3 = format("{:*>6}", 'x'); // value of \tcode{s3} is \tcode{"*****x"}
14847-
string s4 = format("{:*^6}", 'x'); // value of \tcode{s4} is \tcode{"**x***"}
14848-
string s5 = format("{:6d}", c); // value of \tcode{s5} is \tcode{"\ \ \ 120"}
14849-
string s6 = format("{:6}", true); // value of \tcode{s6} is \tcode{"true\ \ "}
14870+
string s0 = format("{:6}", 42); // value of \tcode{s0} is \tcode{"\ \ \ \ 42"}
14871+
string s1 = format("{:6}", 'x'); // value of \tcode{s1} is \tcode{"x\ \ \ \ \ "}
14872+
string s2 = format("{:*<6}", 'x'); // value of \tcode{s2} is \tcode{"x*****"}
14873+
string s3 = format("{:*>6}", 'x'); // value of \tcode{s3} is \tcode{"*****x"}
14874+
string s4 = format("{:*^6}", 'x'); // value of \tcode{s4} is \tcode{"**x***"}
14875+
string s5 = format("{:6d}", c); // value of \tcode{s5} is \tcode{"\ \ \ 120"}
14876+
string s6 = format("{:6}", true); // value of \tcode{s6} is \tcode{"true\ \ "}
14877+
string s7 = format("{:*<6.3}", "123456"); // value of \tcode{s7} is \tcode{"123***"}
14878+
string s8 = format("{:02}", 1234); // value of \tcode{s8} is \tcode{"1234"}
14879+
string s9 = format("{:*<}", "12"); // value of \tcode{s9} is \tcode{"12"}
14880+
string sA = format("{:*<6}", "12345678"); // value of \tcode{sA} is \tcode{"12345678"}
1485014881
\end{codeblock}
1485114882
\end{example}
1485214883
\begin{note}
14853-
Unless a minimum field width is defined, the field width is determined by
14854-
the size of the content and the alignment option has no effect.
14884+
The \fmtgrammarterm{fill}, \fmtgrammarterm{align}, and \tcode{0} options
14885+
have no effect when the minimum field width
14886+
is not greater than the estimated field width
14887+
because padding width is \tcode{0} in that case.
14888+
Since fill characters are assumed to have a field width of \tcode{1},
14889+
use of a character with a different field width can produce misaligned output.
14890+
%FIXME: cannot show clown face character below.
14891+
The \unicode{1f921}{clown face} character has a field width of \tcode{2}.
14892+
The examples above that include that character
14893+
illustrate the effect of the field width
14894+
when that character is used as a fill character
14895+
as opposed to when it is used as a formatting argument.
1485514896
\end{note}
1485614897

1485714898
\begin{floattable}{Meaning of \fmtgrammarterm{align} options}{format.align}{lp{.8\hsize}}
1485814899
\topline
1485914900
\lhdr{Option} & \rhdr{Meaning} \\ \rowsep
1486014901
\tcode{<} &
14861-
Forces the field to be aligned to the start of the available space.
14902+
Forces the formatted argument to be aligned to the start of the field
14903+
by inserting $n$ fill characters after the formatted argument
14904+
where $n$ is the padding width.
1486214905
This is the default for
1486314906
non-arithmetic non-pointer types, \tcode{charT}, and \tcode{bool},
1486414907
unless an integer presentation type is specified.
1486514908
\\ \rowsep
1486614909
%
1486714910
\tcode{>} &
14868-
Forces the field to be aligned to the end of the available space.
14911+
Forces the formatted argument to be aligned to the end of the field
14912+
by inserting $n$ fill characters before the formatted argument
14913+
where $n$ is the padding width.
1486914914
This is the default for
1487014915
arithmetic types other than \tcode{charT} and \tcode{bool},
1487114916
pointer types,
1487214917
or when an integer presentation type is specified.
1487314918
\\ \rowsep
1487414919
%
1487514920
\tcode{\caret} &
14876-
Forces the field to be centered within the available space
14921+
Forces the formatted argument to be centered within the field
1487714922
by inserting
1487814923
$\bigl\lfloor \frac{n}{2} \bigr\rfloor$
14879-
characters before and
14924+
fill characters before and
1488014925
$\bigl\lceil \frac{n}{2} \bigr\rceil$
14881-
characters after the value, where
14882-
$n$ is the total number of fill characters to insert.
14926+
fill characters after the formatted argument, where
14927+
$n$ is the padding width.
1488314928
\\
1488414929
\end{floattable}
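
The three rows of the table can be modeled roughly as follows. This is a sketch, not how any implementation is required to work: `apply_align` is a hypothetical name, the fill character is assumed to be a single `char` with a field width of 1, and the padding width is assumed to have been computed already.

```cpp
#include <cstddef>
#include <string>

// Hypothetical model of the align options (not a standard library function).
// `padding` is the padding width n; `fill` is assumed to be one unit wide.
std::string apply_align(const std::string& formatted, char align, char fill,
                        std::size_t padding) {
    std::size_t before = 0;
    switch (align) {
    case '>': before = padding;     break;  // all fill characters go before
    case '^': before = padding / 2; break;  // floor(n/2) before
    default:  before = 0;           break;  // '<': all fill characters go after
    }
    const std::size_t after = padding - before;  // for '^' this is ceil(n/2)
    return std::string(before, fill) + formatted + std::string(after, fill);
}
```

With a padding width of 5 and `'*'` as the fill character, the model yields `"x*****"`, `"*****x"`, and `"**x***"` for `<`, `>`, and `^` respectively, which agrees with `s2`, `s3`, and `s4` in the example.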
1488514930

@@ -14955,50 +15000,45 @@
1495515000
trailing zeros are not removed from the result.
1495615001

1495715002
\pnum
14958-
A zero (\tcode{0}) character
14959-
preceding the \fmtgrammarterm{width} field
14960-
pads the field with leading zeros (following any indication of sign or base)
14961-
to the field width,
14962-
except when applied to an infinity or NaN.
14963-
This option is only valid for
14964-
arithmetic types other than \tcode{charT} and \tcode{bool}
14965-
or when an integer presentation type is specified.
14966-
If the \tcode{0} character and an \fmtgrammarterm{align} option both appear,
14967-
the \tcode{0} character is ignored.
15003+
The \tcode{0} option is valid for arithmetic types
15004+
other than \tcode{charT} and \tcode{bool} or
15005+
when an integer presentation type is specified.
15006+
For formatting arguments that have a value
15007+
other than an infinity or a NaN,
15008+
this option pads the formatted argument by
15009+
inserting the \tcode{0} character $n$ times
15010+
following the sign or base prefix indicators (if any)
15011+
where $n$ is \tcode{0} if the \fmtgrammarterm{align} option is present and
15012+
is the padding width otherwise.
1496815013
\begin{example}
1496915014
\begin{codeblock}
1497015015
char c = 120;
1497115016
string s1 = format("{:+06d}", c); // value of \tcode{s1} is \tcode{"+00120"}
1497215017
string s2 = format("{:#06x}", 0xa); // value of \tcode{s2} is \tcode{"0x000a"}
14973-
string s3 = format("{:<06}", -42); // value of \tcode{s3} is \tcode{"-42\ \ \ "} (\tcode{0} is ignored because of \tcode{<} alignment)
15018+
string s3 = format("{:<06}", -42); // value of \tcode{s3} is \tcode{"-42\ \ \ "} (\tcode{0} has no effect)
15019+
string s4 = format("{:06}", inf); // value of \tcode{s4} is \tcode{"\ \ \ inf"} (\tcode{0} has no effect)
1497415020
\end{codeblock}
1497515021
\end{example}
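
The insertion rule for the \tcode{0} option can be illustrated with a simplified model. The sketch below makes several assumptions that are not in the wording: `apply_zero_option` is a hypothetical name, the formatted argument is ASCII with one field width unit per character, it represents a finite value, and only the sign characters and the `0x`/`0X`/`0b`/`0B` prefixes are recognized.

```cpp
#include <cstddef>
#include <string>

// Hypothetical model of the 0 option (not a standard library function).
// `formatted` is the formatted argument before padding, e.g. "+120" or "0xa".
std::string apply_zero_option(const std::string& formatted,
                              std::size_t minimum_field_width,
                              bool align_present) {
    // Assumption: ASCII only, so the estimated field width is the length.
    const std::size_t estimated = formatted.size();
    const std::size_t padding =
        minimum_field_width > estimated ? minimum_field_width - estimated : 0;
    // n is 0 if an align option is present, the padding width otherwise.
    const std::size_t n = align_present ? 0 : padding;

    // Find the position just after the sign and base prefix, if any.
    std::size_t pos = 0;
    if (pos < formatted.size() &&
        (formatted[pos] == '+' || formatted[pos] == '-' || formatted[pos] == ' ')) {
        ++pos;
    }
    if (pos + 1 < formatted.size() && formatted[pos] == '0') {
        const char p = formatted[pos + 1];
        if (p == 'x' || p == 'X' || p == 'b' || p == 'B') {
            pos += 2;
        }
    }

    std::string result = formatted;
    result.insert(pos, n, '0');  // insert the 0 character n times
    return result;
}
```

Under these assumptions the model turns `"+120"` into `"+00120"` and `"0xa"` into `"0x000a"` for a minimum field width of 6, and leaves `"-42"` unchanged when an align option is present; the remaining padding in that case comes from the fill character, not from this option.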
1497615022

15023+
\pnum
15024+
The \fmtgrammarterm{width} option specifies the minimum field width.
15025+
If the \fmtgrammarterm{width} option is absent,
15026+
the minimum field width is \tcode{0}.
15027+
1497715028
\pnum
1497815029
If \tcode{\{ \opt{\fmtgrammarterm{arg-id}} \}} is used in
14979-
a \fmtgrammarterm{width} or \fmtgrammarterm{precision},
14980-
the value of the corresponding formatting argument is used in its place.
15030+
a \fmtgrammarterm{width} or \fmtgrammarterm{precision} option,
15031+
the value of the corresponding formatting argument is used as the value of the option.
1498115032
If the corresponding formatting argument is
1498215033
not of standard signed or unsigned integer type, or
1498315034
its value is negative,
1498415035
an exception of type \tcode{format_error} is thrown.
1498515036

1498615037
\pnum
1498715038
% FIXME: What if it's an arg-id?
14988-
The \fmtgrammarterm{positive-integer} in
14989-
\fmtgrammarterm{width} is a decimal integer defining the minimum field width.
14990-
If \fmtgrammarterm{width} is not specified,
14991-
there is no minimum field width, and
14992-
the field width is determined based on the content of the field.
14993-
14994-
\pnum
14995-
\indextext{string!width}%
14996-
The \defn{width} of a string is defined as
14997-
the estimated number of column positions appropriate
14998-
for displaying it in a terminal.
14999-
\begin{note}
15000-
This is similar to the semantics of the POSIX \tcode{wcswidth} function.
15001-
\end{note}
15039+
If \fmtgrammarterm{positive-integer} is used in a
15040+
\fmtgrammarterm{width} option, the value of the decimal integer
15041+
is used as the value of the option.
1500215042

1500315043
\pnum
1500415044
For the purposes of width computation,
@@ -15019,44 +15059,45 @@
1501915059
\end{note}
1502015060

1502115061
\pnum
15022-
For a string in UTF-8, UTF-16, or UTF-32,
15023-
implementations should estimate the width of a string
15024-
as the sum of estimated widths of
15025-
the first code points in its extended grapheme clusters.
15026-
The extended grapheme clusters of a string are defined by \UAX{29}.
15027-
The estimated width of the following code points is 2:
15062+
For a sequence of characters in UTF-8, UTF-16, or UTF-32,
15063+
an implementation should use as its field width
15064+
the sum of the field widths of the first code point
15065+
of each extended grapheme cluster.
15066+
Extended grapheme clusters are defined by \UAX{29} of the Unicode Standard.
15067+
The following code points have a field width of 2:
1502815068
\begin{itemize}
1502915069
\item
15030-
Any code point with the \tcode{East_Asian_Width="W"} or
15070+
any code point with the \tcode{East_Asian_Width="W"} or
1503115071
\tcode{East_Asian_Width="F"} Derived Extracted Property as described by
15032-
\UAX{44}
15072+
\UAX{44} of the Unicode Standard
1503315073
\item
1503415074
\ucode{4dc0} -- \ucode{4dff} (Yijing Hexagram Symbols)
1503515075
\item
1503615076
\ucode{1f300} -- \ucode{1f5ff} (Miscellaneous Symbols and Pictographs)
1503715077
\item
1503815078
\ucode{1f900} -- \ucode{1f9ff} (Supplemental Symbols and Pictographs)
1503915079
\end{itemize}
15040-
The estimated width of other code points is 1.
15080+
The field width of all other code points is 1.
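
As a rough illustration of the list above, the per-code-point rule might be written as below. This is a sketch under a loud assumption: looking up the \tcode{East_Asian_Width} property needs Unicode data tables that are not reproduced here, so the hypothetical function `code_point_field_width` takes the result of that lookup as a caller-supplied flag and only encodes the three explicit ranges itself.

```cpp
// Hypothetical helper (not a standard library function).
// `east_asian_wide_or_fullwidth` is assumed to be supplied by the caller
// from a Unicode property lookup (East_Asian_Width "W" or "F").
int code_point_field_width(char32_t cp, bool east_asian_wide_or_fullwidth) {
    if (east_asian_wide_or_fullwidth) return 2;
    if (cp >= 0x4DC0 && cp <= 0x4DFF) return 2;    // Yijing Hexagram Symbols
    if (cp >= 0x1F300 && cp <= 0x1F5FF) return 2;  // Miscellaneous Symbols and Pictographs
    if (cp >= 0x1F900 && cp <= 0x1F9FF) return 2;  // Supplemental Symbols and Pictographs
    return 1;                                      // all other code points
}
```

For example, U+1F921 (clown face) falls in the last listed range and so has a field width of 2, while `U'x'` has a field width of 1.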
1504115081

1504215082
\pnum
15043-
For a string in neither UTF-8, UTF-16, nor UTF-32,
15044-
the width of a string is unspecified.
15083+
For a sequence of characters in neither UTF-8, UTF-16, nor UTF-32,
15084+
the field width is unspecified.
1504515085

1504615086
\pnum
15047-
% FIXME: What if it's an arg-id?
15048-
The \fmtgrammarterm{nonnegative-integer} in
15049-
\fmtgrammarterm{precision} is a decimal integer defining
15050-
the precision or maximum field size.
15051-
It can only be used with floating-point and string types.
15052-
For floating-point types this field specifies the formatting precision.
15053-
For string types, this field provides an upper bound
15054-
for the estimated width of the prefix of
15055-
the input string that is copied into the output.
15056-
For a string in UTF-8, UTF-16, or UTF-32,
15057-
the formatter copies to the output
15058-
the longest prefix of whole extended grapheme clusters
15059-
whose estimated width is no greater than the precision.
15087+
The \fmtgrammarterm{precision} option is valid
15088+
for floating-point and string types.
15089+
For floating-point types,
15090+
the value of this option specifies the precision
15091+
to be used for the floating-point presentation type.
15092+
For string types,
15093+
this option specifies the longest prefix of the formatted argument
15094+
to be included in the replacement field such that
15095+
the field width of the prefix is no greater than the value of this option.
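
The string case of the precision rule can be sketched as a prefix search. The model below is an assumption-laden illustration: `longest_prefix_within` is a hypothetical name, and the input is assumed to be already split into indivisible units of text (for example extended grapheme clusters), each paired with its field width.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model (not a standard library function).
// Each element pairs one pre-segmented unit of text with its field width.
std::string longest_prefix_within(
    const std::vector<std::pair<std::string, std::size_t>>& units,
    std::size_t precision) {
    std::string prefix;
    std::size_t used = 0;
    for (const auto& unit : units) {
        // Stop before the first unit that would exceed the precision.
        if (used + unit.second > precision) break;
        prefix += unit.first;
        used += unit.second;
    }
    return prefix;
}
```

With units of widths 1, 2, and 1, a precision of 2 keeps only the first unit, because adding the second would bring the field width of the prefix to 3.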
15096+
15097+
\pnum
15098+
If \fmtgrammarterm{nonnegative-integer} is used in
15099+
a \fmtgrammarterm{precision} option,
15100+
the value of the decimal integer is used as the value of the option.
1506015101

1506115102
\pnum
1506215103
When the \tcode{L} option is used, the form used for the conversion is called
