Skip to content

Conversation

harahu
Copy link
Contributor

@harahu harahu commented Oct 5, 2025

Relax Subalgebra operations to work over CommSemiring structures instead of demanding commutative rings.


Found by asking OpenAI Codex to search for low-hanging generalization opportunities. I don't have any usage in mind. Just wanted to explore what Codex would find.

Open in Gitpod

@github-actions github-actions bot added the t-algebra Algebra (groups, rings, fields, etc) label Oct 5, 2025
Copy link

github-actions bot commented Oct 5, 2025

PR summary bfc2799b7b

Import changes for modified files

No significant changes to the import graph

Import changes for all files
Files Import difference

Declarations diff

No declarations were harmed in the making of this PR! 🐙

You can run this locally as follows
## summary with just the declaration names:
./scripts/declarations_diff.sh <optional_commit>

## more verbose report:
./scripts/declarations_diff.sh long <optional_commit>

The doc-module for script/declarations_diff.sh contains some details about this script.


No changes to technical debt.

You can run this locally as

./scripts/technical-debt-metrics.sh pr_summary
  • The relative value is the weighted sum of the differences with weight given by the inverse of the current value of the statistic.
  • The absolute value is the relative value divided by the total sum of the inverses of the current values (i.e. the weighted average of the differences).

@grunweg grunweg changed the title feat(Algebra): generalize subalgebra operations to semirings chore(Algebra): generalize subalgebra operations to semirings Oct 9, 2025
@ocfnash
Copy link
Contributor

ocfnash commented Oct 13, 2025

I appreciate your intentions are good but IMHO PRs like this are unhelpful to our mission.

The reason is that the growth of Mathlib is currently heavily resource-contrained on review bandwidth. Even a small PR like this takes time to review which could otherwise be spent on reviewing PRs which genuinely advance the borders of Mathlib.

I'll repeat that I recognise you have good intentions but please be mindful of my words above. If you wish to make AI-driven contributions it would be much more valuable if you could filter for those which have a demonstrable mathematical value, and only open PRs in such cases.

Having said all that, now that I've reviewed these changes, we might as well apply them.

bors merge

@leanprover-community-mathlib4-bot leanprover-community-mathlib4-bot added the ready-to-merge This PR has been sent to bors. label Oct 13, 2025
mathlib-bors bot pushed a commit that referenced this pull request Oct 13, 2025
Relax Subalgebra operations to work over `CommSemiring` structures instead of demanding commutative rings.
@mathlib-bors
Copy link
Contributor

mathlib-bors bot commented Oct 13, 2025

Pull request successfully merged into master.

Build succeeded:

@mathlib-bors mathlib-bors bot changed the title chore(Algebra): generalize subalgebra operations to semirings [Merged by Bors] - chore(Algebra): generalize subalgebra operations to semirings Oct 13, 2025
@mathlib-bors mathlib-bors bot closed this Oct 13, 2025
@harahu harahu deleted the feat/generalize-1 branch October 13, 2025 11:27
@harahu
Copy link
Contributor Author

harahu commented Oct 13, 2025

Thank you for you for taking the time to explain your perspective on PRs like this one, I appreciate it. I am aware that mathlib is constrained on review, and my intention is not to add additional noise for you to have to deal with.

If you wish to make AI-driven contributions it would be much more valuable if you could filter for those which have a demonstrable mathematical value, and only open PRs in such cases.

I don't think I have the ability to perform such a filtration. That is, I don't have a well-developed sense of what "mathematical value" is. In my mind, there is no discernible difference between generalizing these results and any other result I could have generalized. Hence, I'd like to ask you for advice on how to go about this in practice.

@ocfnash
Copy link
Contributor

ocfnash commented Oct 13, 2025

Thank you for you for taking the time to explain your perspective on PRs like this one, I appreciate it. I am aware that mathlib is constrained on review, and my intention is not to add additional noise for you to have to deal with.

If you wish to make AI-driven contributions it would be much more valuable if you could filter for those which have a demonstrable mathematical value, and only open PRs in such cases.

I don't think I have the ability to perform such a filtration. That is, I don't have a well-developed sense of what "mathematical value" is. In my mind, there is no discernible difference between generalizing these results and any other result I could have generalized. Hence, I'd like to ask you for advice on how to go about this in practice.

I suggest that you adopt the null hypothesis that what the AI has created is not worth it. I think for now the best policy would be that unless you can see mathematical value, then you should assume it is not there. No doubt this will generate some false negatives but there is a real cost to reviewing and triaging these PRs versus a very theoretical benefit to these generalisations so I think this is the right policy.

@harahu
Copy link
Contributor Author

harahu commented Oct 13, 2025

Hmm. Effectively, I think this would result in me creating no generalization PRs at all, for the time being. I think it would be exceedingly rare for me to be in a position to challenge that null hypothesis, even when there exists objective reason to do so. I'm an engineer, not a mathematician.

I could reluctantly accept that conclusion, but I would find it sad. I view the creation of these PRs as an interesting exercise, both in agent usage and in learning Lean.

I do wonder whether all generalization PRs are equally hard to review. You closed #30252 because a proof was modified in nontrivial ways. Closing that PR made sense to me in hindsight, based on your reasoning. Whereas this PR achieved generalization without modifying any proofs. To me, even if there isn't a clear use case, that seems like a clear benefit regardless, since the hypotheses now more clearly express the intrinsic generality of the proofs.

Are there still nontrivial concerns to consider in reviewing a generalization PR like this one, that just relaxes variables without touching anything else?

I'm asking because I want to better understand which types of changes are perceived as more or less burdensome to review.

@kbuzzard
Copy link
Member

Hi Harald.

First, let me thank you a lot for your work making our library a lot more grammatically accurate, consistent etc.

My impression is that the value of your PRs started off high and now is tailing off. Your early PRs such as #28595 , #28607 etc and essentially all of the first few were very easy to review and clearly of benefit. However now your PRs are harder to review and less clearly of some benefit, especially given that you are not a mathematician and are sometimes dealing with things that you may not fully understand. As you might have noticed we have a big problem with PRs in this repo. I wonder if it is best to pause for a while, whilst we try and get the PR backlog under control. I would imagine that there are many other repositories on github that you could experiment with using your methods.

@riccardobrasca
Copy link
Member

Let me add that in the mathlib3 days we had a tool to find such generalizations, and the results were most of the time not very interesting. The point is that mathlib has a lot of classes that are mathematically irrelevant (but are useful to set up the beginning of the hierarchy) and one should probably avoid generalizing theorems to such classes only for the sake of generalization (the statement becomes for example more cryptic).

But let me also thank you for your work, that has been mostly very useful!

@harahu
Copy link
Contributor Author

harahu commented Oct 13, 2025

@kbuzzard I take it that you're asking me to be picky in what I PR to mathlib for the time being. More of the obvious improvements that require little mental overhead to review. Less of the long tail of more nuanced improvements (and perhaps non-improvements). I don't mind you asking for that.

At the same time I don't really feel bad about having generated a few PRs that can be considered hard to review. I see that as a necessary cost of learning. If I were too timid in proposing changes that I think are for the best, then you wouldn't have received the PRs you are thanking me for either. That would also be suboptimal for mathlib.

I don't mind simply being told when a PR is not seen as useful, like I was in this case, so that I can adjust my output based on it. Like I mentioned above, my conclusion is that I most likely won't be attempting generalization PRs for a while.

@harahu
Copy link
Contributor Author

harahu commented Oct 15, 2025

Let me add that in the mathlib3 days we had a tool to find such generalizations, and the results were most of the time not very interesting. The point is that mathlib has a lot of classes that are mathematically irrelevant (but are useful to set up the beginning of the hierarchy) and one should probably avoid generalizing theorems to such classes only for the sake of generalization (the statement becomes for example more cryptic).

@riccardobrasca Thank you for taking the time to tell me about this! I wasn't aware there was such a tool, that is interesting to hear. It's also surprising to me to hear it didn't turn up much interesting generalization opportunities. I have a few follow-up questions, if you have the time and motivation to indulge my curiosity.

Do you know if the tool only flagged generalizations that could be done immediately, independently of any other ones? Or did it report the results you would get if you also were to also generalize, as far as possible, all the results a proof depends on? This would be equivalent to just iteratively applying immediate generalizations across mathlib until a steady state is reached, where no more generalizations are possible. I would naively expect this latter approach to be able to produce some quite interesting results.

Also, do you have some concrete examples of these mathematically irrelevant classes one does not want to generalize to? Just so that I can sharpen my intuition for what to filter out if I ever attempt generalization PRs again.

@riccardobrasca
Copy link
Member

If I remember correctly it only found direct generalization, meaning that one had to run several times.

Mathlib's hierarchy is huge, take for example DivisionMonoid: this extremely close to group, and probably quite a few results for groups can be generalized (I am not sure about this precise example, but it doesn't matter).

@harahu
Copy link
Contributor Author

harahu commented Oct 15, 2025

Your example makes sense to me, I think. A generalization from Group to DivisionMonoid is a fairly small increase in generality, that also happens to move from a well-known type, Group, to a type, DivisionMonoid, that I've never heard of before now. For the untrained eye, it might not even be clear anymore that the result applies to Group because you would have to know that a Group is also a DivisionMonoid, which you might not be aware of. Hence, the relatively small increase in generality could cost more than it is worth.

Is this an accurate understanding of the situation?

@riccardobrasca
Copy link
Member

Your example makes sense to me, I think. A generalization from Group to DivisionMonoid is a fairly small increase in generality, that also happens to move from a well-known type, Group, to a type, DivisionMonoid, that I've never heard of before now. For the untrained eye, it might not even be clear anymore that the result applies to Group because you would have to know that a Group is also a DivisionMonoid, which you might not be aware of. Hence, the relatively small increase in generality could cost more than it is worth.

Is this an accurate understanding of the situation?

Yes, that's the point.

@grunweg
Copy link
Collaborator

grunweg commented Oct 15, 2025

You could start a thread on zulip asking if there are any generalisations that are always wanted. (I have some around ENorm and its related typeclasses: knowing which lemmas are "free generalisations" can be a useful starting point. There might be more.) And then only look for generalisations on this allow-list.

@harahu
Copy link
Contributor Author

harahu commented Oct 16, 2025

I don't have a strong intuition for whether that would be a good idea or not. What do the other people in this thread think? @ocfnash do you think a hypothetical whitelist, like @grunweg is entertaining, could serve as an acceptable proxy for mathematical value? That is, do you have hope that it could bring the false positive rate down to a level you'd be comfortable with?

@ocfnash
Copy link
Contributor

ocfnash commented Oct 16, 2025

@harahu I'm going to be even more direct: people should not attempt mathematical changes unless they understand the mathematics.

I am running out of ways to say this. Please let this go.

@harahu
Copy link
Contributor Author

harahu commented Oct 16, 2025

I clearly already had let this go. Then @grunweg put forth a suggestion which I was merely attempting to politely entertain, even going out of my way in order to give you room to oppose the idea. In light of that, I think you are being unduly crass with me.

@kbuzzard
Copy link
Member

I don't think we're interested in arbitrary changes to the system with no applications to mathematics and no tests to see if this has any kind of performance hit.

@kbuzzard
Copy link
Member

kbuzzard commented Oct 16, 2025

@harahu Mathlib is an extremely complex library and you are getting AI to suggest arbitrary nontrivial changes to fundamental files without making any checks to see if performance has fallen off a cliff or if it breaks any downstream repos. Such PRs are definitely unwelcome. They are also taking up maintainers' time.

@harahu
Copy link
Contributor Author

harahu commented Oct 16, 2025

Thanks for pointing out the fact that performance might also be impacted by generalizations, and that one should test for performance regressions when doing so. I was not aware that might also be an issue.

For downstream repos, in the case of mere generalizations, would one expect potential for any problems beyond performance issues?

@ocfnash
Copy link
Contributor

ocfnash commented Oct 16, 2025

@harahu I stand by my assertion that mathematical changes are unwelcome unless people understand the mathematics.

@leanprover-community leanprover-community locked as too heated and limited conversation to collaborators Oct 16, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

ready-to-merge This PR has been sent to bors. t-algebra Algebra (groups, rings, fields, etc)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants