Copy-on-Write (COW) Dump Implementation for Process duplication #2813

asafpamzn · 2025-11-06T14:57:55Z

Summary

I'm implementing a COW-based live migration feature for CRIU that uses userfaultfd write-protection to track memory modifications while the process continues running. The goal is to combine it with the existing lazy support so that a process can be duplicated to a remote instance with less downtime than the traditional dump modes.

Overview

- Write-protecting all writable memory pages using userfaultfd
- Resuming the process immediately after protection
- Capturing page contents on write faults before they're modified
- Transferring pages to the destination while the process continues running
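
The first step of the overview can be illustrated with the kernel's userfaultfd write-protect interface. This is only a minimal sketch and not the code from the branch: the helper names `cow_uffd_open` and `cow_wp_protect_range` are hypothetical, and it assumes kernel headers that provide `UFFD_FEATURE_PAGEFAULT_FLAG_WP` (Linux 5.7 or newer). It also operates on the calling process's own memory, whereas in CRIU the userfaultfd would have to be created inside the target process, presumably by the parasite.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a userfaultfd and negotiate the
 * write-protect feature. Returns the descriptor, or -1 when
 * userfaultfd or the feature is not available.
 */
int cow_uffd_open(void)
{
	struct uffdio_api api;
	int fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);

	if (fd < 0)
		return -1;

	memset(&api, 0, sizeof(api));
	api.api = UFFD_API;
	api.features = UFFD_FEATURE_PAGEFAULT_FLAG_WP;
	if (ioctl(fd, UFFDIO_API, &api) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: register [addr, addr + len) for write-protect
 * tracking and then arm the protection. After this call, the first
 * write to any page in the range raises a userfaultfd event instead
 * of modifying the page. Returns 0 on success, -1 on failure.
 */
int cow_wp_protect_range(int uffd, void *addr, size_t len)
{
	struct uffdio_register reg;
	struct uffdio_writeprotect wp;

	memset(&reg, 0, sizeof(reg));
	reg.range.start = (unsigned long)addr;
	reg.range.len = len;
	reg.mode = UFFDIO_REGISTER_MODE_WP;
	if (ioctl(uffd, UFFDIO_REGISTER, &reg) < 0)
		return -1;

	memset(&wp, 0, sizeof(wp));
	wp.range.start = (unsigned long)addr;
	wp.range.len = len;
	wp.mode = UFFDIO_WRITEPROTECT_MODE_WP;
	if (ioctl(uffd, UFFDIO_WRITEPROTECT, &wp) < 0)
		return -1;

	return 0;
}
```

Both `addr` and `len` must be page-aligned, and unprivileged use of userfaultfd may be restricted by the `vm.unprivileged_userfaultfd` sysctl, so callers should be prepared for either helper to return -1.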

High level flow

Instead of dumping the entire memory, the VMAs are marked as write-protected, in
https://github.com/asafpamzn/criu/blob/criu-cow/criu/cr-dump.c#L1720

A new parasite is added to do the job:
https://github.com/asafpamzn/criu/blob/criu-dev/criu/cow-dump.c#L197C1-L198C1
https://github.com/asafpamzn/criu/blob/a59a151c1e2fb6edfe899ab940698c5a412f75b1/criu/pie/parasite.c#L963

Question: I want to dump small VMAs as usual and write-protect only the large VMAs. How can I do that? I don't fully understand how to combine the two kinds of VMAs, since they are all pushed to the same page image file.
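
To make the question concrete, the split I have in mind can be sketched as below. This only illustrates the selection policy, not how the pages would be laid out in the page image file, which is the part I'm unsure about. Everything in it is hypothetical: `struct cow_vma` is a simplified stand-in and not CRIU's VMA structure, and the 2 MiB cut-off is an arbitrary placeholder.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a VMA (not CRIU's own structure). */
struct cow_vma {
	unsigned long start;
	unsigned long end; /* exclusive */
	bool writable;
};

/* Hypothetical cut-off; 2 MiB is an arbitrary value for illustration. */
#define COW_MIN_WP_SIZE (2UL << 20)

/*
 * True if the VMA should be write-protected and tracked, false if it
 * should be dumped up front. Read-only VMAs are never tracked because
 * writes to them cannot happen without a protection change.
 */
bool cow_vma_wants_wp(const struct cow_vma *vma, unsigned long min_size)
{
	return vma->writable && (vma->end - vma->start) >= min_size;
}

/*
 * Walk the VMA list, return how many VMAs would be write-protected,
 * and report through eager_bytes how much memory would still be
 * dumped up front.
 */
size_t cow_partition(const struct cow_vma *vmas, size_t n,
		     unsigned long min_size, unsigned long *eager_bytes)
{
	size_t tracked = 0;
	size_t i;

	*eager_bytes = 0;
	for (i = 0; i < n; i++) {
		if (cow_vma_wants_wp(&vmas[i], min_size))
			tracked++;
		else
			*eager_bytes += vmas[i].end - vmas[i].start;
	}
	return tracked;
}
```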

Next, a new thread receives the page faults and transfers the process.
https://github.com/asafpamzn/criu/blob/criu-cow/criu/cr-dump.c#L1728
https://github.com/asafpamzn/criu/blob/a59a151c1e2fb6edfe899ab940698c5a412f75b1/criu/cow-dump.c#L423
https://github.com/asafpamzn/criu/blob/a59a151c1e2fb6edfe899ab940698c5a412f75b1/criu/cow-dump.c#L444
Then the source process is woken up:
https://github.com/asafpamzn/criu/blob/a59a151c1e2fb6edfe899ab940698c5a412f75b1/criu/cow-dump.c#L414
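
The per-fault work of that thread can be sketched roughly as follows. This is a simplified illustration under assumptions, not the code linked above: the names `cow_wp_fault_page` and `cow_wp_unprotect_page` are hypothetical, and reading the page contents out of the target process before the protection is dropped is left out.

```c
#include <linux/userfaultfd.h>
#include <stdbool.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: if msg (as read from the userfaultfd) reports a
 * write-protect fault, store the page-aligned faulting address in
 * *page_addr and return true. page_size must be a power of two.
 */
bool cow_wp_fault_page(const struct uffd_msg *msg, unsigned long page_size,
		       unsigned long *page_addr)
{
	if (msg->event != UFFD_EVENT_PAGEFAULT)
		return false;
	if (!(msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP))
		return false;

	*page_addr = (unsigned long)msg->arg.pagefault.address & ~(page_size - 1);
	return true;
}

/*
 * Hypothetical helper: drop write protection from a single page. This
 * should run only after the old page contents have been saved. Because
 * UFFDIO_WRITEPROTECT_MODE_DONTWAKE is not set, the kernel also wakes
 * the thread that was blocked on the fault.
 */
int cow_wp_unprotect_page(int uffd, unsigned long page_addr,
			  unsigned long page_size)
{
	struct uffdio_writeprotect wp;

	memset(&wp, 0, sizeof(wp));
	wp.range.start = page_addr;
	wp.range.len = page_size;
	wp.mode = 0; /* clear the WP bit and wake waiters */

	return ioctl(uffd, UFFDIO_WRITEPROTECT, &wp) < 0 ? -1 : 0;
}
```

A handler loop would `read()` a `struct uffd_msg` from the descriptor, call `cow_wp_fault_page()`, save the page, and only then call `cow_wp_unprotect_page()`, so that the captured copy always reflects the contents from before the write.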

I'm in the early stages of learning the code, so I would be happy to get some guidance and advice.
Please let me know if this makes sense. I'm most concerned about how to combine the memory areas, as I want to write-protect only large VMAs.

This reverts commit 302bbf3.

This reverts commit 3c211ca.

This reverts commit 24f95b2.

This reverts commit dca82a0.

rst0git · 2025-11-08T15:37:29Z

@asafpamzn There are too many patches in this pull request and it would be difficult for someone to comment on the changes.

The following document provides more information on how to contribute to CRIU:
https://github.com/checkpoint-restore/criu/blob/criu-dev/CONTRIBUTING.md

> I'm implementing a COW-based live migration feature for CRIU that uses userfaultfd write-protection to track memory modifications while the process continues running. The goal is to combine it with the lazy support in order to be able to duplicate a process to remote instance while minimizing downtime compared to traditional dump modes.

I believe Mike Rapoport (@rppt) might be able to provide some advice about the idea.

asafpamzn · 2025-11-08T16:25:16Z

Thanks @rst0git,

Since it is a big change, I would like to get advice about the general direction before starting to implement. I can provide a design doc if that works better. What is the best path going forward?
Should I consult with @rppt?

rst0git · 2025-11-08T17:18:22Z

> Since it is a big change I would like to get advice about the general direction before starting to implement. I can provide a design doc if it works better. What is the best path going forward?

Creating a GitHub issue with more information about the use-case and why this functionality is important will help us to understand the proposed design.

> Should I consult with @rppt ?

There are multiple people in the community who can provide feedback. Mike is an MM maintainer for the Linux kernel and contributed many of the patches that enable post-copy migration with userfaultfd.

avagin · 2025-11-09T16:14:03Z

> Since it is a big change I would like to get advice about the general direction before starting to implement. I can provide a design doc if it works better. What is the best path going forward?

Let's start with a design doc.

asafpamzn added 30 commits November 4, 2025 15:30

2fc4741 cow first commit
2b7bf40 cow first commit
576a302 cow first commit
2fafafa cow first commit
5a53f67 cow first commit
3afd39a cow first commit
d2ad223 upadtes
ec9b6ce upadtes
5ee615b upadtes
3a9974a upadtes
17940ff upadtes
2635a92 upadtes
8546bee upadtes
db6ab73 cow first commit
8ed73e9 cow first commit
2404739 upadtes
3b49be0 upadtes
fcf9ffc upadtes
4cca948 upadtes
df73d47 upadtes
5b175e1 upadtes
f61e188 upadtes
dd5da23 upadtes
d30ab66 upadtes
1821c34 upadtes
296ac16 upadtes
4ab6275 upadtes
402e3ef upadtes
09a7368 upadtes
b8d3e6f upadtes

asafpamzn added 26 commits November 6, 2025 13:50

89bd2f6 Revert "remove dump failed vms" (This reverts commit 302bbf3.)
bc7da7a Revert "remove dump failed vms" (This reverts commit 3c211ca.)
dca82a0 Revert "remove dump failed vms" (This reverts commit 24f95b2.)
f75682e Reapply "remove dump failed vms" (This reverts commit dca82a0.)
a89b91b remove dump failed vms
be59e2d remove dump failed vms
cd45bbe remove dump failed vms
509618f remove dump failed vms
f979921 remove dump failed vms
b6fdb6a remove dump failed vms
48f4055 remove dump failed vms
67b94c2 remove dump failed vms
02b2ea3 remove dump failed vms
a541ec4 remove dump failed vms
feb1a51 remove dump failed vms
c5ec624 remove dump failed vms
31cf2c9 remove dump failed vms
9de8b20 remove dump failed vms
dfbe1bd remove dump failed vms
c918153 remove dump failed vms
43908ea remove dump failed vms
2f9e29e remove dump failed vms
e814b74 remove dump failed vms
f7c512b remove dump failed vms
a59a151 remove dump failed vms
e0881d7 cleanup
