Skip to content

Add support for atoms longer than 255 bytes (but less than 256 chars) #1586

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

pguyot
Copy link
Collaborator

@pguyot pguyot commented Mar 16, 2025

Also optimize storage and computation of atoms here and there.
Also fix hash of atom table (last character wasn't hashed).

See #1536

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

@pguyot pguyot force-pushed the w11/optimize-atoms branch 5 times, most recently from 2f3c4e9 to ec78e91 Compare March 18, 2025 07:13
Copy link
Collaborator

@bettio bettio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First round of comments here.


atom_ref_t atom_ref
= atom_table_get_atom_ptr_and_len(ctx->global->atom_table, atom_index, &atom_len);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this change is future-proof: but I think I need to explain first why atom_table_get_atom_ptr_and_len is required.

We'll need to allow purging old modules, as soon as our code loading support gets better.
By design atom strings are stored in the module atom table, so we cannot just unmap/free module memory, but we need to deal with atoms, since atom unloading is not an option.

So an option could be: during module unload, each HNode that is still pointing to old module memory is either updated to use a newer module or the atom string is strduped, so after that module memory can be freed.

Here is the reason we have atom_table_get_atom_ptr_and_len: it allows to have some kind of persistent reference (HNodes is never deleted/moved), and to copy atom string bytes in a thread safe manner.

Having pointers to atom strings around makes the unloading operation I described harder.

Copy link
Collaborator Author

@pguyot pguyot Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misunderstood this comment, I thought you wanted to keep atom_ref_t and atom_table_get_atom_ptr_and_len, which I resurrected in latest version, even if it's not called.

I believe we could have some kind of locks that could be required when calling atom_table_get_atom_string or consider cases where the pointer is in ROM. But this sounds like premature optimization so far. We should only address this after we optimize ROM-based atoms.

Before this PR, atom_table_get_atom_string wasn't safe either and was called by globalcontext_atomstring_from_term. This PR replaces some cases where an atom_index_t is sufficient, but there still are other cases than this function where atom_table_get_atom_string is called and that would require a lock.

@pguyot pguyot force-pushed the w11/optimize-atoms branch 5 times, most recently from 833adb7 to 7eeab10 Compare March 23, 2025 07:41
@pguyot pguyot force-pushed the w11/optimize-atoms branch from 7eeab10 to f019778 Compare March 30, 2025 05:54
@pguyot
Copy link
Collaborator Author

pguyot commented Apr 15, 2025

CI is broken with 32 bits.

@pguyot pguyot force-pushed the w11/optimize-atoms branch 3 times, most recently from 314d282 to ef806cb Compare April 18, 2025 06:11
@pguyot
Copy link
Collaborator Author

pguyot commented Apr 18, 2025

CI is broken with 32 bits.

Last issue is fixed in #1634

@pguyot pguyot force-pushed the w11/optimize-atoms branch from ef806cb to 88abd1a Compare April 25, 2025 05:47
pguyot added 7 commits May 9, 2025 06:02
Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Signed-off-by: Paul Guyot <pguyot@kallisys.net>
`atom_index_t` type represents a valid atom index.

Signature of functions that returned an atom index that could be an error
were modified consequently, introducing a type for the error results of
`atom_table_ensure_atoms` and `atom_table_ensure_atom` and removing
now redundant `globalcontext_insert_atom` in favor of `globalcontext_make_atom`
that returns a term.

Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Update signature of `atom_write_mfa` so it takes atom length and data separately
to prepare for longer atoms.

Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Change signature of `globalcontext_get_module` to get an `atom_index_t` to
refer to a given module.

Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Update atom_table API so it uses pointers to characters with a length, and
reduce further usage of AtomString which represent shorter atoms

Signed-off-by: Paul Guyot <pguyot@kallisys.net>
@pguyot pguyot force-pushed the w11/optimize-atoms branch from 88abd1a to 2f44f18 Compare May 9, 2025 04:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants