Commit 103144c

Create 0000-additional-float-types.md
1 parent ad4f78f commit 103144c

1 file changed

+139
-0
lines changed

text/0000-additional-float-types.md

@@ -0,0 +1,139 @@
- Feature Name: `additional-float-types`
- Start Date: 2023-06-28
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

This RFC proposes adding the new floating-point types `f16` and `f128` to the core language and standard library. It also introduces `f80`, `doubledouble`, and `bf16` in `core::arch` for interoperability with existing native code.

# Motivation
[motivation]: #motivation

The IEEE-754 standard defines binary floating-point formats, including binary16, binary32, binary64, and binary128. binary32 and binary64 correspond to the `f32` and `f64` types in Rust, while binary16 and binary128 are used in several domains (machine learning, scientific computing, etc.) and are supported by some modern architectures, either in hardware or through software emulation.

C and C++ already have types representing these formats, along with legacy non-standard types specific to certain platforms. Introducing them in a limited way would improve FFI with such code.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

`f16` and `f128` are primitive floating-point types that can be used just like `f32` or `f64`. They always conform to the binary16 and binary128 formats defined in IEEE-754, so `f16` is always 16 bits wide and `f128` is always 128 bits wide.

```rust
let val1 = 1.0; // Default type is still f64
let val2: f128 = 1.0;
let val3: f16 = 1.0;
let val4 = 1.0f128; // Suffix of f128 literal
let val5 = 1.0f16; // Suffix of f16 literal

println!("Size of f128 in bytes: {}", std::mem::size_of_val(&val2)); // 16
println!("Size of f16 in bytes: {}", std::mem::size_of_val(&val3)); // 2
```

Because not every target supports `f16` and `f128`, the compiler provides conditional compilation guards for them:

```rust
#[cfg(target_has_f128)]
fn get_f128() -> f128 { 1.0f128 }

#[cfg(target_has_f16)]
fn get_f16() -> f16 { 1.0f16 }
```

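As a rough sketch of how downstream code might use such a guard, the snippet below picks the widest available float type and falls back to `f64` otherwise. The names `Widest` and `sum_widest` are hypothetical, and `target_has_f128` is the cfg proposed here rather than one any released compiler defines, so today the fallback branch is the one compiled (newer compilers may warn about the unknown cfg name). The `f128` branch additionally assumes the `From<f64>` impl listed in the reference-level explanation and a `Sum` impl matching the one `f64` has.

```rust
// Hypothetical cfg from this RFC: on current compilers it is never set,
// so this alias is compiled out and `f128` is never resolved.
#[cfg(target_has_f128)]
pub type Widest = f128;

// Fallback for targets (and compilers) without `f128`.
#[cfg(not(target_has_f128))]
pub type Widest = f64;

/// Sums `values` in the widest float type selected above.
pub fn sum_widest(values: &[f64]) -> Widest {
    values.iter().map(|&v| Widest::from(v)).sum()
}
```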
All operators, constants, and math functions defined for `f32` and `f64` in `core` are also defined for `f16` and `f128`, behind the respective conditional guards.

The `f80` type is defined in `core::arch::{x86, x86_64}`. The `doubledouble` type is defined in `core::arch::{powerpc, powerpc64}`. The `bf16` type is defined in `core::arch::{arm, aarch64, x86, x86_64}`. These types have no literal representation.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

## `f16` type

`f16` consists of 1 sign bit, 5 exponent bits, and 10 mantissa bits.

The following `From` and `TryFrom` traits are implemented for conversion between `f16` and other types:

```rust
impl From<f16> for f32 { /* ... */ }
impl From<f16> for f64 { /* ... */ }
impl From<bool> for f16 { /* ... */ }
impl From<u8> for f16 { /* ... */ }
impl From<i8> for f16 { /* ... */ }
```

`f16` generates the `half` type in LLVM IR.

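To make the 1/5/10 bit layout concrete, here is a minimal sketch that decodes a raw binary16 bit pattern into an `f32` on stable Rust. The function name `f16_bits_to_f32` is hypothetical and not part of this proposal; the exponent bias of 15 comes from IEEE-754 binary16, not from the text above.

```rust
/// Decodes an IEEE-754 binary16 bit pattern into the `f32` with the same value.
/// Illustrative helper only; not an API proposed by the RFC.
pub fn f16_bits_to_f32(bits: u16) -> f32 {
    let sign = if bits >> 15 == 1 { -1.0f32 } else { 1.0f32 };
    let exponent = u32::from((bits >> 10) & 0x1f); // 5 exponent bits, bias 15
    let mantissa = f32::from(bits & 0x3ff); // 10 mantissa bits

    match exponent {
        // Zero and subnormals: mantissa * 2^-10 * 2^-14 = mantissa * 2^-24.
        0 => sign * mantissa / 16_777_216.0,
        // All-ones exponent: infinity if the mantissa is zero, NaN otherwise.
        0x1f => {
            if mantissa == 0.0 {
                sign * f32::INFINITY
            } else {
                f32::NAN
            }
        }
        // Normal numbers: (1 + mantissa / 2^10) * 2^(exponent - 15).
        e => {
            // Build 2^(e - 15) exactly by writing the f32 exponent field (bias 127).
            let scale = f32::from_bits((e + 127 - 15) << 23);
            sign * (1.0 + mantissa / 1024.0) * scale
        }
    }
}
```

Every binary16 value is exactly representable in `f32`, which is consistent with the lossless `From<f16> for f32` impl listed above.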
## `f128` type

`f128` consists of 1 sign bit, 15 exponent bits, and 112 mantissa bits.

`f128` is available on targets that have (1) hardware instructions or software emulation for a 128-bit float type; (2) backend support for the `f128` type on the target; and (3) the essential target features enabled (if any).

The list of targets supporting `f128` may change over time. Initially, it includes `powerpc64le-*`.

The following traits are also implemented for conversion between `f128` and other types:

```rust
impl From<f16> for f128 { /* ... */ }
impl From<f32> for f128 { /* ... */ }
impl From<f64> for f128 { /* ... */ }
impl From<bool> for f128 { /* ... */ }
impl From<u8> for f128 { /* ... */ }
impl From<i8> for f128 { /* ... */ }
impl From<u16> for f128 { /* ... */ }
impl From<i16> for f128 { /* ... */ }
impl From<u32> for f128 { /* ... */ }
impl From<i32> for f128 { /* ... */ }
impl From<u64> for f128 { /* ... */ }
impl From<i64> for f128 { /* ... */ }
```

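The integer lists stop at 8-bit sources for `f16` and at 64-bit sources for `f128`. The RFC does not state the rule, but one plausible reading is that `From` is provided only when every source value converts exactly, as with the existing integer conversions into `f32` and `f64`. The sketch below checks that reading against the significand widths (11 bits for binary16 and 113 bits for binary128, counting the implicit leading bit). The helper `fits_significand` and both constants are hypothetical names, and the check deliberately ignores exponent range.

```rust
/// Significand precision in bits, including the implicit leading bit.
pub const F16_PRECISION: u32 = 11;
pub const F128_PRECISION: u32 = 113;

/// Returns true if the magnitude `value` can be stored without rounding in a
/// significand of `precision_bits` bits. Exponent range is not checked.
pub fn fits_significand(value: u128, precision_bits: u32) -> bool {
    if value == 0 {
        return true;
    }
    // Distance from the highest set bit to the lowest set bit, inclusive.
    let span = 128 - value.leading_zeros() - value.trailing_zeros();
    span <= precision_bits
}
```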
`f128` generates the `fp128` type in LLVM IR.

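The 1/15/112 layout described for `f128` can be illustrated on stable Rust by splitting a raw `u128` bit pattern into its fields. This is only a sketch: `Binary128Fields` and `split_binary128` are hypothetical names, and the exponent bias of 16383 comes from IEEE-754 binary128 rather than from this RFC.

```rust
/// Fields of an IEEE-754 binary128 bit pattern (illustrative type only).
#[derive(Debug, PartialEq, Eq)]
pub struct Binary128Fields {
    /// Sign bit (bit 127).
    pub negative: bool,
    /// 15-bit exponent field, stored with a bias of 16383.
    pub biased_exponent: u16,
    /// Low 112 bits holding the fraction.
    pub mantissa: u128,
}

/// Splits a raw 128-bit pattern into sign, exponent, and mantissa fields.
pub fn split_binary128(bits: u128) -> Binary128Fields {
    Binary128Fields {
        negative: (bits >> 127) == 1,
        biased_exponent: ((bits >> 112) & 0x7fff) as u16,
        mantissa: bits & ((1u128 << 112) - 1),
    }
}
```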
`std::simd` defines new vector types with `f16` or `f128` elements: `f16x2`, `f16x4`, `f16x8`, `f16x16`, `f16x32`, `f128x2`, and `f128x4`.

For the `doubledouble` type, conversion intrinsics are available under `core::arch::{powerpc, powerpc64}`. For the `f80` type, conversion intrinsics are available under `core::arch::{x86, x86_64}`.

## Architecture-specific types

As for the non-standard types, `f80` generates `x86_fp80`, `doubledouble` generates `ppc_fp128`, and `bf16` generates `bfloat`.

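The RFC does not spell out the `bf16` layout. Under the usual bfloat16 definition (1 sign bit, 8 exponent bits, 7 mantissa bits, i.e. the upper half of a binary32 value), conversion to and from `f32` bit patterns reduces to a 16-bit shift. The sketch below uses hypothetical function names and truncates instead of rounding to nearest, so it is not necessarily what a conversion intrinsic in `core::arch` would do.

```rust
/// Widens a bfloat16 bit pattern to `f32` by restoring the low 16 bits as zeros.
pub fn bf16_bits_to_f32(bits: u16) -> f32 {
    f32::from_bits(u32::from(bits) << 16)
}

/// Narrows an `f32` to a bfloat16 bit pattern by dropping the low 16 bits.
/// This truncates toward zero; hardware conversions may round instead.
pub fn f32_to_bf16_bits_truncating(value: f32) -> u16 {
    (value.to_bits() >> 16) as u16
}
```

Because the exponent field is the same width as in `f32`, the narrowing step loses precision but not range.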
# Drawbacks
[drawbacks]: #drawbacks

Unlike `f32` and `f64`, not every target supports these two types natively with regard to the ABI, even though platform-independent implementations of the supplementary intrinsics exist. Handling the different cases will be a challenge.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

There are some crates aiming for similar functionality:

- [f128](https://github.com/jkarns275/f128) provides bindings to the `__float128` type in GCC.
- [half](https://github.com/starkat99/half-rs) provides implementations of the binary16 and bfloat16 types.

However, besides the inconsistency in usage between primitive types and types from a crate, there are still issues with those bindings.

The availability of additional float types depends heavily on the CPU, OS, ABI, and features of each target. The evolution of LLVM may also make these types possible on new targets. Implementing them in the compiler handles these concerns in the most appropriate place.

Most such crates define their types on top of C bindings, but the definition of extended float types in C is complex and confusing: the meaning of `long double` and `_Float128` varies by target and compiler options. Implementing the types in the Rust compiler helps maintain a stable codegen interface.

Finally, since third-party tools also rely on Rust internal code, implementing additional float types in the compiler helps those tools recognize them.

# Prior art
[prior-art]: #prior-art

There is a previous proposal for an `f16b` type to represent `bfloat16`: https://github.com/joshtriplett/rfcs/blob/f16b/text/0000-f16b.md

# Unresolved questions
[unresolved-questions]: #unresolved-questions

This proposal does not introduce a `c_longdouble` type for FFI, because it would mean one of `f128`, `doubledouble`, `f64`, or `f80` depending on the target. The same applies to `c_float128`.

# Future possibilities
[future-possibilities]: #future-possibilities

More functions will be added for these platform-dependent float types, such as casting between `f128` and `doubledouble`.

For targets that do not support `f16` or `f128`, we may be able to introduce a "limited mode", in which the types are not fully functional but users can still load and store them and call functions that take them as arguments.