Add support for signing during sync #4075

jdieter · 2025-08-14T16:03:48Z

This commit adds a new stage, RpmArtifactSigningStage, to synchronize.py that signs the rpms as they're being synchronized from an upstream repo. The new stage only does the signing if both a signing service and fingerprint have been set for the repo.

This is fully functional, but there are some questions I have on how best to move forward before moving this out of draft.

At the moment, signing will happen if the repo has a signing service and fingerprint, but this would be a behavior change with the potential of surprising users. Should this instead be activated as a new sync policy or two ("additive_sign", "mirror_content_only_sign")? Or should it be activated some other way?
How do we ensure that we're not resyncing packages each time a sync is run? The checksums will obviously be different after redoing the signing, so should we add a field to the database that stores the original checksum?

I'm not sure if solving 2 is a hard requirement for this PR, but we really should decide on something for 1.

jdieter · 2025-08-20T13:07:34Z

Hey, I've gone ahead and moved forward with new sync policies to solve (1), and it turns out that Pulp RPM synchronization already handles (2) by looking at the NEVRA, not the checksums.

This should be ready for review now.

pedro-psb

Sorry for not answering the question (1) and (2) before (you started working) and thanks for the contribution! We were in a rush for the last weeks.

I've commented on a few points already, feel free to ask for any clarification.
Also, I'd like to discuss your questions a bit more.

About (1):
I've heard (and agreed) a general preference of the team to define the "sign policy" in the repository level instead of a per-call basis. Would that work for you or is there a case where you would need per-call control over it?

If that works, there are some options. First, how do you think the "auto-sign" behavior (like we have in upload) would be a surprise? It's not something that can be configured by accident. Anyway, if we agree its a surprise, an alternative could be adding a package_signing_enable or similar to the repository, wdyt?

About (2):
I'm not convinced it won't download and sign it again... You could write a test for this, asserting that no new packages are added to the repository on a second sync.

It's worth mentioning that we have a recently opened issue for this (on the upload case) here. Ideally we should share these kind of package signing handlers between upload and sync, but I don't really think we have a solution in place for this. I would be surprised if we do!

edit: IHO (2) is not a hard requirement for the PR.

pulp_rpm/app/tasks/synchronizing.py

pulp_rpm/tests/functional/api/test_package_signing.py

jdieter · 2025-08-22T11:46:24Z

Sorry for not answering the question (1) and (2) before (you started working) and thanks for the contribution! We were in a rush for the last weeks.

That's no problem at all! Thanks so much for the detailed review!

I've commented on a few points already, feel free to ask for any clarification. Also, I'd like to discuss your questions a bit more.

About (1): I've heard (and agreed) a general preference of the team to define the "sign policy" in the repository level instead of a per-call basis. Would that work for you or is there a case where you would need per-call control over it?

This would be absolutely fine. We have no requirement for per-call control over this. I was trying to make minimal structural changes, and sync_policy seemed to be the simplest place to change it. Having said that, I do agree that it makes more sense to be a repository level setting.

If that works, there are some options. First, how do you think the "auto-sign" behavior (like we have in upload) would be a surprise? It's not something that can be configured by accident. Anyway, if we agree its a surprise, an alternative could be adding a package_signing_enable or similar to the repository, wdyt?

My main concern is that this would be a behavior change. Previously, if someone synced a repo, even if it had the signing service setup, nothing got signed. Now it would be signed. Having said that, there is a good argument that if someone has setup a signing service on a repo, it should be signed, so maybe I'm overthinking this. If you think it's not a big deal, I think I'm OK with dropping my second commit.

About (2): I'm not convinced it won't download and sign it again... You could write a test for this, asserting that no new packages are added to the repository on a second sync.

It sure didn't seem to download and sign again when running the test suite. Packages that had been synced unsigned in a previous test were not replaced with the signed versions in the signing test, and I spent way too long fighting with it before I realized it's because the old unsigned packages were still in the database. That's why I've added delete_orphans_pre to the tests I created.

As you suggest, it would probably be worth adding an explicit test for this.

It's worth mentioning that we have a recently opened issue for this (on the upload case) here. Ideally we should share these kind of package signing handlers between upload and sync, but I don't really think we have a solution in place for this. I would be surprised if we do!

edit: IHO (2) is not a hard requirement for the PR.

Yeah, let me think on this a bit more.

I'll also address the individual issues you brought up.

jdieter · 2025-09-08T11:10:05Z

About (2): I'm not convinced it won't download and sign it again... You could write a test for this, asserting that no new packages are added to the repository on a second sync.

It sure didn't seem to download and sign again when running the test suite. Packages that had been synced unsigned in a previous test were not replaced with the signed versions in the signing test, and I spent way too long fighting with it before I realized it's because the old unsigned packages were still in the database. That's why I've added delete_orphans_pre to the tests I created.

So, after using this a bit in our Pulp instance, I've figured out what was happening. It turns out that it won't download and sign again if and only if the repodata hasn't changed at all. If even one package has been updated in the repo, all the of the packages will be re-signed.

I think the correct solution is to store the fingerprint of any RPM signatures in the database, so we can then ensure that we don't re-sign if the content hash hasn't changed and the correct RPM signature is in the database. What do you think?

dralley · 2025-09-10T03:15:28Z

So, after using this a bit in our Pulp instance, I've figured out what was happening. It turns out that it won't download and sign again if and only if the repodata hasn't changed at all. If even one package has been updated in the repo, all the of the packages will be re-signed.

That makes sense, we do have an optimization to skip basically the whole sync process if the metadata checksum is unchanged from the previous sync.

I think the correct solution is to store the fingerprint of any RPM signatures in the database, so we can then ensure that we don't re-sign if the content hash hasn't changed and the correct RPM signature is in the database. What do you think?

I assume what you mean is something along the lines of "keep a mapping of the pre-signed package checksums to the new, signed packages"?

josephtate · 2025-09-11T17:54:44Z

I'm pitching in on @jdieter's behalf. We work together.

I think the correct solution is to store the fingerprint of any RPM signatures in the database, so we can then ensure that we don't re-sign if the content hash hasn't changed and the correct RPM signature is in the database. What do you think?

I assume what you mean is something along the lines of "keep a mapping of the pre-signed package checksums to the new, signed packages"?

Right. Store the checksum of the synced package (pre-signing) so that if we see that package again during sync it gets skipped. That's one option.

The other option is to look only at the NEVRA, and if it is already present, don't sign, don't import. It's kind of a feature that old packages don't get sly-updated without extra work; broken packages would have to be deleted from the repository before a new sync to update them. This is a workaround I can get behind, because we wouldn't want things to update unexpectedly anyway.

jdieter · 2025-09-12T12:11:54Z

The other option is to look only at the NEVRA, and if it is already present, don't sign, don't import. It's kind of a feature that old packages don't get sly-updated without extra work; broken packages would have to be deleted from the repository before a new sync to update them. This is a workaround I can get behind, because we wouldn't want things to update unexpectedly anyway.

As part of this PR, I've implemented this option. If you want me to do the full "keep a mapping of the pre-signed package checksums to the new, signed packages", I can do it, but it will be more invasive. UPDATED: I'm using skip package incorrectly. This is not ready for review yet.

This commit adds a new stage, RpmArtifactSigningStage, to synchronize.py that signs the rpms as they're being synchronized from an upstream repo. The new stage only does the signing if both a signing service and fingerprint have been set for the repo. Signed-off-by: Jonathan Dieter <jdieter@ciq.com>

This commit adds two new sync policies, `additive_sign` and `mirror_content_only_sign`, which are identical to the non `_sign` versions, but will sign packages during the sync process. It also adds a test to ensure that signing *doesn't* happen when not using a `*_sign` policy. And, because sync is smart enough to skip packages that have identical nevras, we need to ensure that all packages are garbage collected before running the tests, so we're adding the delete_orphans_pre fixture to both tests. Signed-off-by: Jonathan Dieter <jdieter@ciq.com>

As mentioned in the PR review, it's far simpler to just create a new artifact that handles all the hashing instead of manually creating the hashes we think we should need. Signed-off-by: Jonathan Dieter <jdieter@ciq.com>

We need to ensure that the size and pkgId are updated correctly, so pass the content instead of the artifact to signing function and update the content size and pkgId. Signed-off-by: Jonathan Dieter <jdieter@ciq.com>

As requested in the review, the logic on whether or not to do package signing during sync has been moved out of RpmArtifactSigningStage, greatly simplifying the stage. Signed-off-by: Jonathan Dieter <jdieter@ciq.com>

As discussed in the review, adding new sync policies is the wrong place for ensuring that signing happens only when necessary. For the moment, we are going to assume that if a user has set a signing service and fingerprint for a repository, then they want it re-signed on sync. Signed-off-by: Jonathan Dieter <jdieter@ciq.com>

The "mirror_complete" sync policy or the "on_demand" remote policy are both not supported when doing signing during syncing, so raise errors if either policy is set during the sync process. Signed-off-by: Jonathan Dieter <jdieter@ciq.com>

Currently, we start synchronizing packages even if the package is unchanged from a previous sync. This means that we end up re-signing packages each time we sync when signing during sync. This commit fixes that problem by re-using the already synced version of the package for packages that already exist in the latest version of the repository with the same NEVRA. Note that this means that a package that has been changed without changing the NEVRA will *not* be resynced anymore. This may be seen as a security feature where packages can't be silently changed. To sync a package with the same NEVRA, the original package must first be removed from the repository and then the repository must be synced again. Signed-off-by: Jonathan Dieter <jdieter@ciq.com>

Make it easier to detect when the repos aren't being populated as expected by checking whether the repo_version has changed between each step of the test. This ensures that the test fails fast when things start going off the rails rather than waiting until the end. Signed-off-by: Jonathan Dieter <jdieter@ciq.com>

We need more details in the DeclarativeContent, so let's create it properly and only use it there's at least one artifact. Also, add a "already_in_repo" flag into `extra_data` to ensure that we don't try to re-sign packages that were already in the repository (and should already be signed). Signed-off-by: Jonathan Dieter <jdieter@ciq.com>

Create a test that syncs a repository, then syncs a completely different repository, and then re-syncs the first. Artifacts will be created during the first sync that will be orphaned (since signing creates a new artifact), and then, on the third sync, packages will be deduped to these artifacts. This test verifies that the artifact downloading in _sign_rpm_content works with these deduped artifacts. Signed-off-by: Jonathan Dieter <jdieter@ciq.com>

jdieter · 2025-09-26T16:12:32Z

The other option is to look only at the NEVRA, and if it is already present, don't sign, don't import. It's kind of a feature that old packages don't get sly-updated without extra work; broken packages would have to be deleted from the repository before a new sync to update them. This is a workaround I can get behind, because we wouldn't want things to update unexpectedly anyway.

Ok, this has been implemented (correctly, this time, I think). Tests are all passing, and I've added a couple more, though I also had to disable two that were expecting the package to change when the NEVRA is the same.

I'm fine if we want to go down the route of adding a field for the original pkgId (and we'll probably also need to add a field for the signing key, so when looking through artifacts with the same originalPkgId, we can determine which signing key was used on each of them), but I wanted to make sure that the logic in here is sane and that what I've done isn't totally off-the-wall.

daviddavis · 2025-10-14T19:30:45Z

I'm fine if we want to go down the route of adding a field for the original pkgId (and we'll probably also need to add a field for the signing key, so when looking through artifacts with the same originalPkgId, we can determine which signing key was used on each of them), but I wanted to make sure that the logic in here is sane and that what I've done isn't totally off-the-wall.

I think this would be great to have. We're also interested in not re-signing packages again. However, we've encountered the situation where two packages had the same NEVRA but different contents unfortunately. So checking by package hash digest would work for us whereas NEVRA wouldn't.

github-actions bot added no-changelog no-issue multi-commit Added when a PR consists of more than one commit labels Aug 14, 2025

jdieter force-pushed the allow-signing-on-sync branch from 7c862a6 to 9767c43 Compare August 18, 2025 13:22

github-actions bot added multi-commit Added when a PR consists of more than one commit and removed multi-commit Added when a PR consists of more than one commit labels Aug 18, 2025

jdieter force-pushed the allow-signing-on-sync branch from 6ee4eeb to a16ced7 Compare August 20, 2025 10:42

jdieter changed the title ~~[Draft] Initial support for signing during sync~~ Add support for signing during sync Aug 20, 2025

jdieter force-pushed the allow-signing-on-sync branch from a16ced7 to 7837aa8 Compare August 20, 2025 11:37

jdieter marked this pull request as ready for review August 20, 2025 13:07

pedro-psb requested changes Aug 21, 2025

View reviewed changes

jdieter force-pushed the allow-signing-on-sync branch from c591d53 to 5185221 Compare September 8, 2025 11:48

jdieter force-pushed the allow-signing-on-sync branch 4 times, most recently from 99eb6ae to ef23b7d Compare September 15, 2025 13:38

dralley marked this pull request as draft September 17, 2025 22:33

jdieter added 7 commits September 26, 2025 11:09

Create a new artifact rather than updating old one

c20ad03

As mentioned in the PR review, it's far simpler to just create a new artifact that handles all the hashing instead of manually creating the hashes we think we should need. Signed-off-by: Jonathan Dieter <jdieter@ciq.com>

Sign the content, not the artifact

ffdfec0

We need to ensure that the size and pkgId are updated correctly, so pass the content instead of the artifact to signing function and update the content size and pkgId. Signed-off-by: Jonathan Dieter <jdieter@ciq.com>

Move logic on whether or not to sign out of RpmArtifactSigningStage

9d5067b

As requested in the review, the logic on whether or not to do package signing during sync has been moved out of RpmArtifactSigningStage, greatly simplifying the stage. Signed-off-by: Jonathan Dieter <jdieter@ciq.com>

jdieter added 2 commits September 26, 2025 11:09

jdieter force-pushed the allow-signing-on-sync branch from ef23b7d to 9a704d2 Compare September 26, 2025 14:41

jdieter force-pushed the allow-signing-on-sync branch from 9a704d2 to d5528e6 Compare September 26, 2025 14:45

jdieter force-pushed the allow-signing-on-sync branch from 7af4e7b to 2ff3f4e Compare September 26, 2025 15:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add support for signing during sync #4075

Add support for signing during sync #4075

Uh oh!

jdieter commented Aug 14, 2025

Uh oh!

jdieter commented Aug 20, 2025

Uh oh!

pedro-psb left a comment •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jdieter commented Aug 22, 2025

Uh oh!

jdieter commented Sep 8, 2025

Uh oh!

dralley commented Sep 10, 2025

Uh oh!

josephtate commented Sep 11, 2025

Uh oh!

jdieter commented Sep 12, 2025 •

edited

Loading

Uh oh!

jdieter commented Sep 26, 2025 •

edited

Loading

Uh oh!

daviddavis commented Oct 14, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Add support for signing during sync #4075

Are you sure you want to change the base?

Add support for signing during sync #4075

Uh oh!

Conversation

jdieter commented Aug 14, 2025

Uh oh!

jdieter commented Aug 20, 2025

Uh oh!

pedro-psb left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jdieter commented Aug 22, 2025

Uh oh!

jdieter commented Sep 8, 2025

Uh oh!

dralley commented Sep 10, 2025

Uh oh!

josephtate commented Sep 11, 2025

Uh oh!

jdieter commented Sep 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jdieter commented Sep 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

daviddavis commented Oct 14, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

pedro-psb left a comment •

edited

Loading

jdieter commented Sep 12, 2025 •

edited

Loading

jdieter commented Sep 26, 2025 •

edited

Loading