Skip to content

Conversation

phw
Copy link
Member

@phw phw commented Sep 18, 2025

Summary

  • This is a…
    • Bug fix
    • Feature addition
    • Refactoring
    • Minor / simple change (like a typo)
    • Other
  • Describe this change in 1-2 sentences:

Problem

This applies some changes to the custom column sorting:

  • Removes the "descending" variations. These are not needed and even confusing. Sort order can be changed by the user in the UI. The descending ones then behave exactly the opposite, with no advantage to the user.
  • Makes sorting locale aware by default (just like the standard columns). Only special sorting cases, like by length, do not need to be locale aware
  • Removes the unused ReverseAdapter. This does not work with locale aware sorting keys anyway.

Solution

Action

Additional actions required:

  • Update Picard documentation (please include a reference to this PR)
  • Other (please specify below)

@phw phw requested a review from zas September 18, 2025 13:28
@phw phw force-pushed the custom-columns-sorting-tweaks branch from 2aea604 to 646a85e Compare September 18, 2025 13:40
@phw phw force-pushed the custom-columns-sorting-tweaks branch from 646a85e to 4c86cdf Compare September 18, 2025 14:10
This is unused and doesn't work with localized sorting anyway
@phw phw force-pushed the custom-columns-sorting-tweaks branch from 4c86cdf to adb71b4 Compare September 18, 2025 14:14
Copy link
Collaborator

@rdswift rdswift left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done. This makes it a lot clearer, especially from the user's perspective. This is the way it should have been done right from the start.

@rdswift
Copy link
Collaborator

rdswift commented Sep 18, 2025

Also a "heads up" that the documentation changes in metabrainz/picard-docs#234 will need to be updated as well.

@phw
Copy link
Member Author

phw commented Sep 19, 2025

Also a "heads up" that the documentation changes in metabrainz/picard-docs#234 will need to be updated as well.

It's good we have this PR already, but I think we should only merge it once we have fully finished the implementation of the new columns (including unifying the code for the predefined columns). There will probably be more changes to this coming.

@phw
Copy link
Member Author

phw commented Sep 19, 2025

This is the way it should have been done right from the start.

Yes, but sometimes it is easier to get a complex PR like #2714 merged first and make such small adjustments afterwards, where they are much easier to review. We already had 220 comments on the initial PR.

@phw phw requested review from knguyen1 and rdswift September 19, 2025 08:32
@phw
Copy link
Member Author

phw commented Sep 19, 2025

I actual wonder if we need both NumericSortAdapter and NaturalSortAdapter. They are kind of interchangable. NumericSortAdapter will first try to convert the whole value to a number, then falls back to natural sorting. Natural sorting will sort full numbers inside as a string by their value. I'm unsure whether the NumericSortAdapter's initial step is an actual optimization or unnecessary extra work. But in the end you'll essentially get the same results with both adapaters.

@knguyen1 your thoughts?

Copy link
Collaborator

@zas zas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, good move


# Build mapping dynamically from available class names in SORTING_ADAPTER_NAMES
# This ensures we only try to map classes that are actually defined in the UI
available_class_names = {name for name in SORTING_ADAPTER_NAMES.keys() if name}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this a generator instead?

def available_class_names():
    for name in SORTING_ADAPTER_NAMES.keys():
        if name:
           yield name

....
for class_name in available_class_names():

@rdswift
Copy link
Collaborator

rdswift commented Sep 19, 2025

Also a "heads up" that the documentation changes in metabrainz/picard-docs#234 will need to be updated as well.

It's good we have this PR already, but I think we should only merge it once we have fully finished the implementation of the new columns (including unifying the code for the predefined columns). There will probably be more changes to this coming.

I agree completely. I'm in no rush to merge the documentation PRs for these large UI changes for exactly that reason.

@rdswift
Copy link
Collaborator

rdswift commented Sep 19, 2025

I actual wonder if we need both NumericSortAdapter and NaturalSortAdapter. They are kind of interchangable. NumericSortAdapter will first try to convert the whole value to a number, then falls back to natural sorting. Natural sorting will sort full numbers inside as a string by their value. I'm unsure whether the NumericSortAdapter's initial step is an actual optimization or unnecessary extra work. But in the end you'll essentially get the same results with both adapaters.

I know you didn't specifically ask for my thoughts on this, but here they are anyway. 😜

If we can remove one of the selection options and still get the same results from one of the other options, I think we should be doing exactly that. The simpler we can make this for the user the better.

Copy link
Collaborator

@rdswift rdswift left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Comment on lines +329 to +336
"NullsLastAdapter": SortingAdapterDisplayInfo(
display_name=N_("Empty values first"),
tooltip=N_("Empty/whitespace values sort first."),
),
"NullsFirstAdapter": SortingAdapterDisplayInfo(
display_name=N_("Empty values last"),
tooltip=N_("Empty/whitespace values sort last."),
),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, wouldn't it be better if these two (mutually exclusive) options could be applied to any of the other sorting options? Perhaps not in the selection list, but as a separate radio button as something like:

Show empty values: ○ first, ◉ last

@knguyen1
Copy link
Contributor

knguyen1 commented Sep 21, 2025

I actual wonder if we need both NumericSortAdapter and NaturalSortAdapter. They are kind of interchangable. NumericSortAdapter will first try to convert the whole value to a number, then falls back to natural sorting. Natural sorting will sort full numbers inside as a string by their value. I'm unsure whether the NumericSortAdapter's initial step is an actual optimization or unnecessary extra work. But in the end you'll essentially get the same results with both adapaters.

Numeric and Natural are not interchangeable; they encode different semantics.

Below is LLM claim. See #2735 (comment) for proof.

What differs

  • NumericSortAdapter:

    • Tries to parse the entire value as a number; if successful returns a key like (0, number) so numbers sort first and numerically.
    • If parsing fails, falls back to natural sort key with a (1, ...) flag so non-numeric entries come after numeric ones, still sorted naturally.
    • Supports a custom parser (e.g., durations “3:30” → seconds), negative numbers, and decimals consistently.
    • Returns normalized tuple keys to avoid mixed-type comparison issues.
  • NaturalSortAdapter:

    • Always uses locale-aware natural collation on the string, treating digit runs inside text numerically (e.g., “Track 2” < “Track 10”).
    • Does not force “numbers first” semantics and doesn’t accept a custom numeric parser.
    • Key shape is whatever the collator returns; semantics differ across platforms/locales.

Why keep both

  • Semantics: Numeric enforces “numbers-first, numeric order” for whole-value numerics; Natural provides “human” alphanumeric order within text. These yield different outcomes for negatives, decimals, equal numeric values with different formatting (“000”, “01”, “1”), and custom formats (mm:ss).
  • Features: Only Numeric supports a custom parser and explicit numeric-first grouping.
  • Stability: Numeric’s tuple keys avoid cross-type comparison issues; tests rely on this behavior.

Performance note

  • The numeric-first parse is not just an optimization; it provides distinct semantics. It may also avoid building a collator key for purely numeric values, which can be marginally faster, but correctness is the main reason.

Recommendation

  • Use NumericSortAdapter for primarily numeric columns or when you need numbers-first or custom parsing.

  • Use NaturalSortAdapter for general text with embedded numbers where “Track 2, Track 10” style ordering is desired.

  • Kept both adapters; clarified they serve different semantics:

    • NumericSortAdapter: numbers-first, supports custom parser, normalized tuple keys.
    • NaturalSortAdapter: locale-aware natural order within strings, no numbers-first guarantee.

@knguyen1
Copy link
Contributor

You can use this python script to test why NumericSortAdapter should still be kept.

  • Numbers-first semantics across mixed values: ensures all numeric entries sort before non-numeric text.
  • Whole-value numeric parsing: supports negatives, decimals, scientific notation; natural sort does not parse signs/exponents as numbers.
  • Domain-specific numeric formats via custom parser: durations (mm:ss), localized numbers (“1,200”), file sizes, bitrates.
  • Stable, normalized keys: returns tuple keys that avoid mixed-type comparisons.
#!/usr/bin/env python3
from __future__ import annotations

from typing import Callable

from picard.item import Item
from picard.ui.itemviews.custom_columns.sorting_adapters import (
    NaturalSortAdapter,
    NumericSortAdapter,
)


class _ValueItem(Item):
    def __init__(self, value: str) -> None:
        super().__init__()
        self.value = value


class _Provider:
    def __init__(self, value: str) -> None:
        self._value = value

    def evaluate(self, _obj: Item) -> str:
        return self._value


def _sorted_values_with_adapter(
    adapter_factory: Callable[[object], object], values: list[str]
) -> list[str]:
    adapters = [adapter_factory(_Provider(v)) for v in values]
    keys = [a.sort_key(_ValueItem(v)) for a, v in zip(adapters, values, strict=True)]
    # Pair the original value with its key and sort by the produced key
    pairs = sorted(zip(values, keys, strict=True), key=lambda p: p[1])
    return [v for v, _ in pairs]


def _print_case(title: str, values: list[str]) -> None:
    print(f"\n=== {title} ===")
    nat = _sorted_values_with_adapter(lambda base: NaturalSortAdapter(base), values)
    num = _sorted_values_with_adapter(lambda base: NumericSortAdapter(base), values)
    print("Input:     ", values)
    print("Natural:   ", nat)
    print("Numeric:   ", num)


def _print_case_with_parser(
    title: str,
    values: list[str],
    parser: Callable[[str], float],
    parser_label: str,
) -> None:
    print(f"\n=== {title} ({parser_label}) ===")
    nat = _sorted_values_with_adapter(lambda base: NaturalSortAdapter(base), values)
    num = _sorted_values_with_adapter(lambda base: NumericSortAdapter(base, parser=parser), values)
    print("Input:     ", values)
    print("Natural:   ", nat)
    print("Numeric*:  ", num)


def _mmss_parser(s: str) -> float:
    # e.g. "3:30" -> 210.0
    total = 0
    for part in s.split(":"):
        total = total * 60 + int(part)
    return float(total)


def _comma_number_parser(s: str) -> float:
    # Handle localized/thousands separators like "1,200"
    return float(s.replace(",", "").strip())


def main() -> None:
    # 1) Mixed numerics, negatives, decimals, scientific notation, and text
    #    NaturalSortAdapter treats digits inside text as numbers, but not signs/exponents.
    #    NumericSortAdapter parses the entire value (float), so "-1" and "1e3" are truly numeric.
    _print_case(
        "Mixed: integers, decimals, negatives, scientific notation, and text",
        ["10", "2", "3.5", "x", "-1", "7a", "1e3", "3e2"],
    )

    # 2) Duration-like values need domain parsing. NumericSortAdapter with a custom parser
    #    sorts by actual time; NaturalSortAdapter only sees text + digit runs.
    _print_case_with_parser(
        "Durations requiring domain parsing",
        ["3:30", "2:05", "1:00", "x", "10:00", "0:59"],
        parser=_mmss_parser,
        parser_label="custom mm:ss parser",
    )

    # 3) Localized numbers with thousands separators. NumericSortAdapter can parse them
    #    (with a custom parser); NaturalSortAdapter compares as text with digits.
    _print_case_with_parser(
        "Localized numbers with commas",
        ["1,200", "300", "20", "1,050", "x"],
        parser=_comma_number_parser,
        parser_label="custom comma-aware parser",
    )

    # 4) Whitespace and formatting differences. float() handles surrounding whitespace;
    #    natural sorting treats whitespace as text.
    _print_case(
        "Whitespace and formatting",
        ["  12  ", "2", " 10", "x", "000", "01", "1", "1.0"],
    )


if __name__ == "__main__":
    main()

The Output

$ uv run python test_sort.py

=== Mixed: integers, decimals, negatives, scientific notation, and text ===
Input:      ['10', '2', '3.5', 'x', '-1', '7a', '1e3', '3e2']
Natural:    ['-1', '1e3', '2', '3.5', '3e2', '7a', '10', 'x']
Numeric:    ['-1', '2', '3.5', '10', '3e2', '1e3', '7a', 'x']

=== Durations requiring domain parsing (custom mm:ss parser) ===
Input:      ['3:30', '2:05', '1:00', 'x', '10:00', '0:59']
Natural:    ['0:59', '1:00', '2:05', '3:30', '10:00', 'x']
Numeric*:   ['0:59', '1:00', '2:05', '3:30', '10:00', 'x']

=== Localized numbers with commas (custom comma-aware parser) ===
Input:      ['1,200', '300', '20', '1,050', 'x']
Natural:    ['1,050', '1,200', '20', '300', 'x']
Numeric*:   ['20', '300', '1,050', '1,200', 'x']

=== Whitespace and formatting ===
Input:      ['  12  ', '2', ' 10', 'x', '000', '01', '1', '1.0']
Natural:    ['  12  ', ' 10', '000', '01', '1', '1.0', '2', 'x']
Numeric:    ['000', '01', '1', '1.0', '2', ' 10', '  12  ', 'x']

@phw phw merged commit 5e8a3da into metabrainz:master Sep 22, 2025
45 checks passed
@phw phw deleted the custom-columns-sorting-tweaks branch September 22, 2025 16:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants