Make API more idiomatic, simplify code, improve style #612

H2CO3 · 2025-06-03T05:50:48Z

What does this PR do?

There are a number of issues with the current code style, for example:

The API sometimes forces sub-optimal choices (e.g. the use of &Option<T> forces the caller to have an owned T, whereas Option<&T> doesn't, and can be obtained from an Option<T> via Option::<T>::as_ref())
The implementation was sometimes overly verbose (e.g. heavily nested if let "triangles of hell", which can be replaced by let … else and early returns)
There have been some notable omissions (e.g. the source() method on impl Error for SafeTensorsError)
Some dependencies (notably, Criterion) were out-of-date and the code used deprecated functions
Other, minor things such as unnecessary allocation of Strings in value formatting, extensively commented-out code (that could just be deleted since it's already in version control history), or manual, verbose, inline impls of Option combinators such as map_or.

This PR fixes the aforementioned stylistic and API issues. Note that some of these are technically breaking change, but since the library is still relatively young, and hasn't reached 1.0 yet, this should not be a problem.

… size check and slice indexing with `.get()`; refactor hard-coded header size into common constant; remove unconditional cloning of metadata in deserialization; remove double negation in `if-else`; etc.

…e tests

…ary allocation; improve benchmark code

H2CO3 · 2025-06-04T07:14:32Z

@Narsil I believe this PR is now complete.

H2CO3 · 2025-06-10T05:39:06Z

@Narsil Hey, can I possibly get some feedback on this? Thanks!

H2CO3 · 2025-06-12T12:01:16Z

This clearly isn't going anywhere; closing.

Narsil · 2025-06-12T13:16:48Z

Hey,

First of all, sorry to not have seen that PR before. I just don't read PRs on every repo every day.

Now, thanks a lot for the propositions. I'll break down my comments here:

There are a lot of stylistic choices. I tend to defer anything stylistic to linters or similar tool, that avoids the bikeshedding. If they don't complain, then the code is fine.
You claim there are some over allocation, those would be really interesting to fix, but since there are so many changes, I'm not sure which are actually allocation related, could you split PRs into small unitary PRs?
For the error source + general display, do you have any specific examples where the displays where annoying to work with in particular ? Especially since we have to deal mostly with a python surface, it's important to have good error messages in Python too.

In order to merge PRs, I think having very small ones would be an easier start. This crate and python package is relied upon by many, so I try to make changes as few as possible (so breaking is much less likely).

H2CO3 · 2025-06-12T13:48:00Z

Hi there!

There are a lot of stylistic choices. I tend to defer anything stylistic to linters or similar tool, that avoids the bikeshedding. If they don't complain, then the code is fine.

Most of the improvements in this PR are not, in fact, purely stylistic, perhaps except for some newlines that make trait impls with many consecutive fns easier on the eye, and the removal of superfluous parentheses in a few places (2, IIRC). The rest of the changes make the code objectively better, in the ways described in my very first comment.

You claim there are some over allocation, those would be really interesting to fix, but since there are so many changes, I'm not sure which are actually allocation related

Again, please see the thread starter comment. It's mostly the forcing of &Option<T> instead of Option<&T>, which forces allocation as explained there in more detail. Unfortunately, this is a breaking change on the Rust side, although not on the Python side. (And it's a lot less annoying in Rust, since Cargo's strict adherence to semver ensures that nothing that already depends on the package will break anyway.)

There's also this unnecessary .into_iter().collect(), which doesn't do anything except for allocating a new vector and invalidating the old one.

For the error source + general display, do you have any specific examples where the displays where annoying to work with in particular ? Especially since we have to deal mostly with a python surface, it's important to have good error messages in Python too.

The error messages (and in some cases, Display impls) were mostly just deferred to the automatically-derived Debug impls, which is a practice that the official Rust API guidelines explicitly recommend against, because:

Debug formatting was never meant to be shown to users.
Debug formatting is not stable (it can change between Rust versions, toolchains, etc.), whereas user-facing Display and Error impls are part of the API and should be stable, therefore they can't ever be implemented correctly by just delegating to Debug (unless you know it's stable because you implemented it manually, but then it's better to implement Display by hand and delegate Debug to Display instead).

In order to merge PRs, I think having very small ones would be an easier start. This crate and python package is relied upon by many, so I try to make changes as few as possible (so breaking is much less likely).

I don't think the API of the Python package is affected in any way. Please see the individual commits for different aspects of the changes. I'm sorry but you also have to understand that it would be really hard for me to basically un-do everything but some very specific changes. The changes in this PR are coherent, and you don't have to take them at face value – feel free to ask any other experienced Rust programmer and they'll confirm that they improve the code.

Narsil · 2025-06-12T14:19:28Z

Do you mind opening small PRs instead of this giant one.

I see a few things slip which are not exactly obvious and making sure none slips review is hard.

Keep

write!(f, "{}...")

everywhere. It's just easier to read when code is uniform.
index.fmt(..)-> Same comment.
Those are the first ones I saw.

Anything that touches comments is very easy to review since it doesn't touch any logic.
Upgrading the deps is easy (if it's only number bumping) If it modifies any of the underlying code, it's easier to showcase it. For instance I don't see why the extra black_box is required. (To be fair some benchmarks are not really used anyway, the python ones are the relevant ones since they are actually doing comparisons).

There's also this unnecessary .into_iter().collect(), which doesn't do anything except for allocating a new vector and invalidating the old one.

Definitely a leftover, thanks for flagging.

Ah and another one, remove the Into<Vec<usize>>. I tried to avoid accepting traits anywhere not strictly necessary, I think it is cleaner to avoid the generic bloats everywhere.

The commented out code for the TensorIndexer should be left alone (yes it is in version control), but it's also purposefully not enabled since most tensors never exceed certain dimensions, however, it should be trivial to add for users that want to play around. There is no downside to it.

H2CO3 · 2025-06-12T16:58:56Z

I see a few things slip which are not exactly obvious and making sure none slips review is hard.
Keep

write!(f, "{}...")

everywhere. It's just easier to read when code is uniform.
§index.fmt(..)§-> Same comment.
Those are the first ones I saw.

Oh, that's definitely intentional, because the formatting mechanism is slow, so it's unnecessarily wasteful to write!() when there's nothing to format. A great example is all-unit variant enums such as Framework, which always write a constant string, so there's exactly one single call to Formatter::write_str() needed.

Ah and another one, remove the Into<Vec>. I tried to avoid accepting traits anywhere not strictly necessary, I think it is cleaner to avoid the generic bloats everywhere.

Okay, I can do that.

H2CO3 · 2025-06-13T14:05:39Z

@Narsil I have split this into 4 separate PRs, split by the "theme" of the PR:

I hope this is now acceptable, and you can review and hopefully merge them.

H2CO3 added 6 commits June 3, 2025 06:54

Update dependencies; remove deprecated 3rd-party black_box() function

a63b4eb

Code simplifications, stylistic, API, and performance improvements

c51983c

Further simplifications: remove redundant lifetimes, replace explicit…

79a46f4

… size check and slice indexing with `.get()`; refactor hard-coded header size into common constant; remove unconditional cloning of metadata in deserialization; remove double negation in `if-else`; etc.

Improve error handling, Display and Error impls, and simplify som…

7930a43

…e tests

Make data type alignment comparison more explicit; remove an unnecess…

3c9b6ee

…ary allocation; improve benchmark code

Adapt Python bindings to API changes

0a88f32

H2CO3 closed this Jun 12, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Make API more idiomatic, simplify code, improve style #612

Make API more idiomatic, simplify code, improve style #612

Uh oh!

H2CO3 commented Jun 3, 2025 •

edited

Loading

Uh oh!

H2CO3 commented Jun 4, 2025

Uh oh!

H2CO3 commented Jun 10, 2025

Uh oh!

H2CO3 commented Jun 12, 2025

Uh oh!

Narsil commented Jun 12, 2025

Uh oh!

H2CO3 commented Jun 12, 2025 •

edited

Loading

Uh oh!

Narsil commented Jun 12, 2025 •

edited

Loading

Uh oh!

H2CO3 commented Jun 12, 2025

Uh oh!

H2CO3 commented Jun 13, 2025

Uh oh!

Uh oh!

Make API more idiomatic, simplify code, improve style #612

Make API more idiomatic, simplify code, improve style #612

Uh oh!

Conversation

H2CO3 commented Jun 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Uh oh!

H2CO3 commented Jun 4, 2025

Uh oh!

H2CO3 commented Jun 10, 2025

Uh oh!

H2CO3 commented Jun 12, 2025

Uh oh!

Narsil commented Jun 12, 2025

Uh oh!

H2CO3 commented Jun 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Narsil commented Jun 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

H2CO3 commented Jun 12, 2025

Uh oh!

H2CO3 commented Jun 13, 2025

Uh oh!

Uh oh!

H2CO3 commented Jun 3, 2025 •

edited

Loading

H2CO3 commented Jun 12, 2025 •

edited

Loading

Narsil commented Jun 12, 2025 •

edited

Loading