-
Notifications
You must be signed in to change notification settings - Fork 114
WIP: add support to bigger ints (256 bit) #1552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
e9902c0
to
0e71d98
Compare
5dfab3f
to
1afe75b
Compare
*rounded_num_len = rounded / sizeof(intn_digit_t); | ||
} | ||
|
||
static term make_bigint(Context *ctx, uint32_t fail_label, uint32_t live, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it bigint
or big_int
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm using bigint now.
src/libAtomVM/bif.c
Outdated
} | ||
} | ||
|
||
static void pair_to_big_int(term arg1, term arg2, intn_digit_t *tmp_buf1, intn_digit_t *tmp_buf2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why this exists as a function. Nothing seems shared between each pair member.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All binary operations (add, mul, div, etc...) have the same repeated pattern, so I was factoring it.
Honestly I think I should factor in a macro also this part before that is going to be repeated:
698 intn_digit_t tmp_buf1[INTN_INT64_LEN];
699 intn_digit_t tmp_buf2[INTN_INT64_LEN];
700
701 intn_digit_t *bn1;
702 size_t bn1_len;
703 intn_digit_t *bn2;
704 size_t bn2_len;
Anyway that function is required to handle all the combos such as: small-int * big-int
, big-int * small-int
and big-int * big-int
.
src/libAtomVM/intn.h
Outdated
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ | ||
ret = (((uint64_t) num[1] << 32) | (uint64_t) num[0]); | ||
#else | ||
#error "Unsupporoted endianess" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsupported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
src/libAtomVM/intn.c
Outdated
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ | ||
#error "Big endian not yet supported" | ||
#else | ||
#error "Unsupproted endiness" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsupported endianness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Expected_INT64_MIN = ?MODULE:shrink(), | ||
A = ?MODULE:mul(16#10101010CAFECAFE, 16#AABB), | ||
Square = ?MODULE:mul(A, A), | ||
<<"2559181265480533323615999200984578944503596644">> = erlang:integer_to_binary(Square), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2559181265480533323615999200984578944503596644 = Square,
We also need support for literal decoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, we are getting there step by step.
start() -> | ||
test_mul(). | ||
|
||
test_mul() -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't only about mul, the function could be split into several.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not sure about this comment, can you explain further?
@@ -2184,42 +2185,11 @@ static term nif_erlang_atom_to_list_1(Context *ctx, int argc, term argv[]) | |||
return ret; | |||
} | |||
|
|||
static size_t lltoa(avm_int64_t int_value, unsigned base, char *integer_string) | |||
static term integer_to_buf(Context *ctx, int argc, term argv[], char *tmp_buf, size_t tmp_buf_size, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused about the signature, returned value is [] or invalid_term, isn't it?
Couldn't we return just a bool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ugly signature indeed, that function raises different kinds of errors for both integer_to_binary
and integer_to_list
src/libAtomVM/nifs.c
Outdated
bool needs_cleanup; | ||
term maybe_fail_ret | ||
= integer_to_buf(ctx, argc, argv, tmp_buf, tmp_buf_size, &int_buf, &int_len, &needs_cleanup); | ||
if (UNLIKELY(term_is_invalid_term(maybe_fail_ret))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since maybe_fail_ret is not used, UNLIKELY could test the returned value of integer_to_buf directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is actually used:
2254 if (UNLIKELY(term_is_invalid_term(maybe_fail_ret))) {
2255 return maybe_fail_ret;
2256 }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the confusion. What I mean is that we only test result being invalid term or not. When it's invalid term we return invalid term. In this respect we don't really use this value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, right, got it.
#include <stdlib.h> | ||
#include <string.h> | ||
|
||
#define USE_64BIT_MUL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a compile-time test for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to test different combos indeed.
Aren't mbedtls_mpi_* functions linked into AtomVM if SSL is enabled? By default I think they are 4096 bits. I would favor these (and I also have a branch where I vendor crypto functions as we discussed in Fosdem for wasm). |
Yes, they are, but there are is some additional complexity.
So we can use it, but it is not that trivial, and I'm not sure it is worth it. As an additional note, in our implementation the most complex part are div and mul algorithms, that have been copied from Hacker's delight, and based on Knuth work. They are pretty efficient and well tested. |
6de1d25
to
3bb28df
Compare
Noteworthy additional changes:
The binary_to_integer function was not compliant with OTP behavior, it was accepting trailing spaces, and base prefixes (such as 0x).
It replaces our implementation, with an optimized one. |
ba9e3a3
to
fe978bf
Compare
nlz function is used from `divmnu` function. Use compiler builtin when available instead of C implementation. Signed-off-by: Davide Bettio <davide@uninstall.it>
559f3fc
to
8aa1058
Compare
This function is a first round of integration with intn bigint implementation. - Allow printing big integers with `erlang:display/1` and in general with `term_display` functions. - Allow converting big integers to binaries and lists using `erlang:integer_to_binary/1` and `erlang:integer_to_list/1`. Signed-off-by: Davide Bettio <davide@uninstall.it>
Implement a first arithmetic operation that uses `intn_mulmns` in order to validate the whole approach. Signed-off-by: Davide Bettio <davide@uninstall.it>
Add first iteration on bigint tests, starting with tests for `erlang:*/2` and `integer_to_binary/2`. Signed-off-by: Davide Bettio <davide@uninstall.it>
Just use `intn_parse` function. Signed-off-by: Davide Bettio <davide@uninstall.it>
Negative boxed integers have 3rd bit set (b1s00). Also introduce new defines: - TERM_BOXED_NEGATIVE_INTEGER - TERM_BOXED_INTEGER_SIGN_BIT - TERM_BOXED_INTEGER_SIGN_BIT_POS Signed-off-by: Davide Bettio <davide@uninstall.it>
Add functions for checking if a term is a positive integer, and etc... Function names are inspired to Erlang typespecs (such as non_neg_integer). Signed-off-by: Davide Bettio <davide@uninstall.it>
Start moving existing code to predicates such as `term_is_any_non_neg_integer(t)`. Signed-off-by: Davide Bettio <davide@uninstall.it>
Some operations in 2-complement turns to be quite complex, they require more code, and also more stack space for storing abs value. Hence using a dedicated sign bit (as Erlang does) turns to be an easier and pragmatic approach. This approach makes possible having sign bit outside the numeric payload, so the supported range is -(2^256 - 1)..+(2^256 - 1). They might be called int257, but it would be quite confusing. Sign bit is stored in boxed header, outside of the numeric payload. Also add a valgrind supression file, in order to ignore a bogus warning about overlapping memory in memcpy when executing memmove (that allows overlapping memory). Signed-off-by: Davide Bettio <davide@uninstall.it>
On 32-bit systems, use `make_maybe_boxed_int64` in `neg_boxed_helper` since `-(INT32_MAX + 1)` is `INT32_MIN` that fits into a 32-bit boxed integer. Before of this change `make_boxed_int64` was used, making a 64-bit boxed integer for an int32 value. New `term_compare` implementation will check size and sign metadata before performing any actual comparison, so all value must be in their "minimal canonical form". Signed-off-by: Davide Bettio <davide@uninstall.it>
Refactor term_compare to use metadata such as size and sign before performing any integer comparison (that might be expensive for big integers). Perform digit by digit comparison for big integers only when size and sign are equal. Signed-off-by: Davide Bettio <davide@uninstall.it>
Add functions that do not rely on undefined behavior for converting unsigned to signed negative integers (and viceversa), for checking if conversion overflows and for conditionally negate. Start using newly introduced utilities in both intn and externalterm (an old macro is removed). Signed-off-by: Davide Bettio <davide@uninstall.it>
Refactor term_conv_to_float in order to use intn_to_double function. Also make sure in opcodesswitch that term_conv_to_float() returns a finite value. Signed-off-by: Davide Bettio <davide@uninstall.it>
Add helper function float_to_integer_helper, that checks the float result of functions such as floor, round, trunc, etc... instead of the arguments in advance. Furthermore a better upper and limit for safe double to int64 conversion has been found. Signed-off-by: Davide Bettio <davide@uninstall.it>
Allow functions such as trunc, round, etc... to return a big integer, when a number above 2^63 or below -2^63 is given. Signed-off-by: Davide Bettio <davide@uninstall.it>
Test for overflows and for not-yet-overflowed values, when converting from float to big int. Also test comparison between floats and big ints. Signed-off-by: Davide Bettio <davide@uninstall.it>
This function converts n-bytes in either big or little endian format, signed / unsigned, and converts them into a intn integer. Signed-off-by: Davide Bettio <davide@uninstall.it>
This function is required in any place a new bigint term needs to be created. Also rename it to `term_intn_to_term_size`. Signed-off-by: Davide Bettio <davide@uninstall.it>
Add support in SMALL_BIG_EXT parsing to big integers (that means integers that are >= 8 bytes and <= 32 bytes). Signed-off-by: Davide Bettio <davide@uninstall.it>
Allow literals bigger than 64 bit, such as: 16#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF As a side note, bigger literals than (2^256 - 1) are encoded as external terms. Signed-off-by: Davide Bettio <davide@uninstall.it>
Add functions useful for writing a big integer back to a buffer, as a little/big-endian integer. Signed-off-by: Davide Bettio <davide@uninstall.it>
Allow calling term_to_binary for serializing big integers. Signed-off-by: Davide Bettio <davide@uninstall.it>
This PR introduces support to signed 256 bit integers.
Briefly:
intn.c
), limited to 256 bit for the sake of simplicity, but it might be expanded to generic big integers with some additional workinteger_to_binary
/_to_list
in order to reduce both code duplication and make it simpler adding big integers supportbinary_to_integer
in to make it compliant with OTP (binaries such as<<"0xCAFE">>
or <<" 42">> must be rejected).lltoa
with more performant functions that do not rely on slow helpers, specially for base 10 and 16Notes:
term.h
, so functions interm.h
handle them as opaques storing bytesValue bor 0x80000000000000000000000000000000
will cause a sign flip, unlike OTPTodos:
0xCAFE1234CAFECAFE = calculate()
cat not work)These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).
SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later