
Use eatmydata in CI #33108


Open

def- wants to merge 5 commits into main from pr-dnm

Conversation

@def- (Contributor) commented Jul 22, 2025

Based on a comparison against Nightly on main, this seems to speed tests up by ~1 minute on average. The main change is using https://github.com/stewartsmith/libeatmydata to make fsync a no-op, since we don't persist anything of importance in tests anyway.
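For context, a stripped-down sketch of the interposition trick libeatmydata uses (illustrative only; the real library also intercepts sync, msync, and O_SYNC/O_DSYNC opens): a shared object loaded via LD_PRELOAD shadows libc's definitions, so every flush in the test process becomes a successful no-op. The `eatmydata` wrapper script just sets up this LD_PRELOAD for the wrapped command.

```c
/* nosync.c: a minimal illustration of the libeatmydata technique.
 * Loaded via LD_PRELOAD, these definitions shadow libc's, so calls
 * to fsync/fdatasync succeed immediately without touching the disk.
 * Build: gcc -shared -fPIC -o nosync.so nosync.c
 * Use:   LD_PRELOAD=$PWD/nosync.so some-test-command */
#include <unistd.h>

int fsync(int fd) {
    (void)fd;   /* claim the data reached stable storage */
    return 0;
}

int fdatasync(int fd) {
    (void)fd;
    return 0;
}
```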

The feature benchmark run does a good job of showing the difference eatmydata makes:
Google Sheet: https://docs.google.com/spreadsheets/d/1ZOOZJBM0OmsyjsoEIdn1lT_crjDQUfjmMdqs04RJlbc/edit?gid=2146535294#gid=2146535294

NAME                                | TYPE            |      THIS       |      OTHER      |  UNIT  | THRESHOLD  |  Regression?  | 'THIS' is
--------------------------------------------------------------------------------------------------------------------------------------------------------
AccumulateReductions                | wallclock       |          34.892 |          44.828 |   s    |    10%     |      no       | better: 22.2% faster
ExactlyOnce                         | wallclock       |           3.106 |          21.221 |   s    |    10%     |      no       | better:  6.8 times faster
GroupByMaintained                   | wallclock       |          12.760 |          18.703 |   s    |    10%     |      no       | better: 31.8% faster
KafkaUpsertUnique                   | wallclock       |           2.207 |           2.167 |   s    |    10%     |      no       | worse:   1.8% slower
MySqlStreaming                      | wallclock       |           6.216 |           6.021 |   s    |    10%     |      no       | worse:   3.2% slower
StartupLoaded                       | wallclock       |           2.948 |           2.224 |   s    |    10%     |    !!YES!!    | worse:  32.6% slower
OptbenchTPCHQ01                     | wallclock       |     9376764.000 |    11384984.000 |   ns   |    20%     |      no       | better: 17.6% faster
OptbenchTPCHQ09                     | wallclock       |    13143297.000 |    36802913.000 |   ns   |    20%     |      no       | better:  2.8 times faster
OptbenchTPCHQ17                     | wallclock       |    30118337.000 |    36298172.000 |   ns   |    20%     |      no       | better: 17.0% faster
SubscribeParallelKafka              | wallclock       |           1.854 |           1.945 |   s    |    10%     |      no       | better:  4.7% faster
ConnectionLatency                   | wallclock       |           0.614 |           0.626 |   s    |    10%     |      no       | better:  1.9% faster
FastPathFilterIndex                 | wallclock       |           3.918 |           4.642 |   s    |    10%     |      no       | better: 15.6% faster
HydrateIndex                        | wallclock       |           1.453 |           2.012 |   s    |    10%     |      no       | better: 27.8% faster
MFPPushdown                         | wallclock       |           1.386 |           1.534 |   s    |    10%     |      no       | better:  9.6% faster
OrderBy                             | wallclock       |           7.368 |           9.839 |   s    |    10%     |      no       | better: 25.1% faster
StartupTpch                         | wallclock       |           2.287 |           2.643 |   s    |    10%     |      no       | better: 13.5% faster
OptbenchTPCHQ02                     | wallclock       |    41051217.000 |    69637745.000 |   ns   |    20%     |      no       | better: 41.1% faster
OptbenchTPCHQ10                     | wallclock       |    10286092.000 |    28581466.000 |   ns   |    20%     |      no       | better:  2.8 times faster
OptbenchTPCHQ18                     | wallclock       |    19115322.000 |    23105689.000 |   ns   |    20%     |      no       | better: 17.3% faster
SubscribeParallelTable              | wallclock       |          58.958 |          58.127 |   s    |    10%     |      no       | worse:   1.4% slower
CountDistinct                       | wallclock       |           1.114 |           1.514 |   s    |    10%     |      no       | better: 26.4% faster
FastPathFilterNoIndex               | wallclock       |           0.975 |           1.353 |   s    |    10%     |      no       | better: 27.9% faster
Insert                              | wallclock       |           1.747 |           2.143 |   s    |    10%     |      no       | better: 18.5% faster
ManyKafkaSourcesOnSameCluster       | wallclock       |           7.476 |           9.076 |   s    |    10%     |      no       | better: 17.6% faster
PgCdcInitialLoad                    | wallclock       |           1.197 |           1.135 |   s    |    10%     |      no       | worse:   5.4% slower
SwapSchema                          | wallclock       |           0.103 |           0.139 |   s    |    10%     |      no       | better: 25.3% faster
OptbenchTPCHQ03                     | wallclock       |     7932300.000 |     9545046.000 |   ns   |    20%     |      no       | better: 16.9% faster
OptbenchTPCHQ11                     | wallclock       |    27357031.000 |    30745480.000 |   ns   |    20%     |      no       | better: 11.0% faster
OptbenchTPCHQ19                     | wallclock       |    41683181.000 |    36250752.000 |   ns   |    20%     |      no       | worse:  15.0% slower
SubscribeParallelTableWithIndex     | wallclock       |           1.960 |           2.048 |   s    |    10%     |      no       | better:  4.3% faster
CreateIndex                         | wallclock       |           0.685 |           0.598 |   s    |    10%     |    !!YES!!    | worse:  14.5% slower
FastPathLimit                       | wallclock       |           0.288 |           0.681 |   s    |    10%     |      no       | better:  2.4 times faster
InsertAndSelect                     | wallclock       |           2.837 |           2.798 |   s    |    10%     |      no       | worse:   1.4% slower
ManySmallInserts                    | wallclock       |          18.845 |          19.468 |   s    |    10%     |      no       | better:  3.2% faster
PgCdcStreaming                      | wallclock       |           1.985 |           1.906 |   s    |    10%     |      no       | worse:   4.2% slower
Update                              | wallclock       |           1.082 |           1.403 |   s    |    10%     |      no       | better: 22.9% faster
OptbenchTPCHQ04                     | wallclock       |    21141052.000 |    13318064.000 |   ns   |    20%     |    !!YES!!    | worse:  58.7% slower
OptbenchTPCHQ12                     | wallclock       |    15935714.000 |    18575912.000 |   ns   |    20%     |      no       | better: 14.2% faster
OptbenchTPCHQ20                     | wallclock       |    42990831.000 |    46441218.000 |   ns   |    20%     |      no       | better:  7.4% faster
CrossJoin                           | wallclock       |           1.802 |           2.061 |   s    |    10%     |      no       | better: 12.6% faster
FastPathOrderByLimit                | wallclock       |           0.786 |           1.072 |   s    |    10%     |      no       | better: 26.7% faster
InsertBatch                         | wallclock       |          12.923 |          14.649 |   s    |    10%     |      no       | better: 11.8% faster
ManySmallUpdates                    | wallclock       |          12.227 |          13.648 |   s    |    10%     |      no       | better: 10.4% faster
QueryLatency                        | wallclock       |           1.981 |           2.285 |   s    |    10%     |      no       | better: 13.3% faster
UpdateMultiNoIndex                  | wallclock       |           2.434 |           2.630 |   s    |    10%     |      no       | better:  7.5% faster
OptbenchTPCHQ05                     | wallclock       |    29098557.000 |    29307687.000 |   ns   |    20%     |      no       | better:  0.7% faster
OptbenchTPCHQ13                     | wallclock       |    21626966.000 |    12290246.000 |   ns   |    20%     |    !!YES!!    | worse:  76.0% slower
OptbenchTPCHQ21                     | wallclock       |    47502145.000 |    45925976.000 |   ns   |    20%     |      no       | worse:   3.4% slower
DeltaJoin                           | wallclock       |           1.436 |           2.060 |   s    |    10%     |      no       | better: 30.3% faster
FinishOrderByLimit                  | wallclock       |           0.689 |           0.935 |   s    |    10%     |      no       | better: 26.4% faster
InsertMultiRow                      | wallclock       |           0.126 |           0.118 |   s    |    10%     |      no       | worse:   6.8% slower
MinMax                              | wallclock       |           0.971 |           1.288 |   s    |    10%     |      no       | better: 24.6% faster
ReplicaExpiration                   | wallclock       |           1.019 |           1.071 |   s    |    10%     |      no       | better:  4.8% faster
ParallelDataflows                   | wallclock       |          31.385 |          43.262 |   s    |    10%     |      no       | better: 27.5% faster
OptbenchTPCHQ06                     | wallclock       |     5703310.000 |     7674017.000 |   ns   |    20%     |      no       | better: 25.7% faster
OptbenchTPCHQ14                     | wallclock       |     7619306.000 |    22555308.000 |   ns   |    20%     |      no       | better:  3.0 times faster
OptbenchTPCHQ22                     | wallclock       |    43704169.000 |    58185431.000 |   ns   |    20%     |      no       | better: 24.9% faster
DifferentialJoin                    | wallclock       |           0.684 |           0.933 |   s    |    10%     |      no       | better: 26.6% faster
GroupBy                             | wallclock       |           3.245 |           4.534 |   s    |    10%     |      no       | better: 28.4% faster
KafkaUpsert                         | wallclock       |           1.955 |           2.219 |   s    |    10%     |      no       | better: 11.9% faster
MySqlInitialLoad                    | wallclock       |           1.087 |           2.067 |   s    |    10%     |      no       | better: 47.4% faster
StartupEmpty                        | wallclock       |           0.236 |           0.235 |   s    |    10%     |      no       | worse:   0.6% slower
CustomerWorkload1                   | wallclock       |           4.826 |           6.800 |   s    |    10%     |      no       | better: 29.0% faster
OptbenchTPCHQ08                     | wallclock       |    29870301.000 |    46195016.000 |   ns   |    20%     |      no       | better: 35.3% faster
OptbenchTPCHQ16                     | wallclock       |    30408688.000 |    31187488.000 |   ns   |    20%     |      no       | better:  2.5% faster
SkewedJoin                          | wallclock       |           3.660 |           3.835 |   s    |    10%     |      no       | better:  4.6% faster

Run: https://buildkite.com/materialize/nightly/builds/12652#01983260-bac3-452a-ae1e-db06ddb71b08

This actually makes me wonder: are we maybe fsyncing too much? If we are only writing temporary state that a restarted Materialize won't use, maybe we should use eatmydata in production? :D

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@def- def- requested a review from a team as a code owner July 22, 2025 10:08
@def- def- force-pushed the pr-dnm branch 4 times, most recently from 3912933 to 6a6f301 on July 22, 2025 13:17
@def- def- requested review from a team as code owners July 22, 2025 13:17
@def- def- requested a review from aljoscha July 22, 2025 13:17
@def- def- force-pushed the pr-dnm branch 2 times, most recently from 7d08525 to a40247e on July 22, 2025 14:51
@def- def- changed the title from "DNM: Try out new agents" to "Use eatmydata in CI and some test fixes I found while using" Jul 22, 2025
@def- def- force-pushed the pr-dnm branch 2 times, most recently from ab0f0e9 to e584425 on July 22, 2025 16:51
@def- def- enabled auto-merge July 22, 2025 17:07
@def- def- requested review from ptravers, antiguru and ggevay July 22, 2025 17:08
@ggevay (Contributor) left a comment

This would be amazing if it works reliably!

But I'm wondering about those tests that kill containers. I guess not all of them use S3, right? Where do these write their Persist data?

@def- def- disabled auto-merge July 23, 2025 10:35
@def- (Contributor, Author) commented Jul 23, 2025

I have not seen any misbehavior in tests that kill containers. I could disable eatmydata for them if we are worried.

@ggevay (Contributor) commented Jul 23, 2025

> I have not seen any misbehavior in tests that kill containers.

Well, it might be that it works most of the time but sometimes produces mysterious CI failures. So I'd say we need to make the tests that kill containers not use eatmydata.
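Mechanically, one hypothetical way to let individual tests opt out without unwrapping them (a sketch only; neither this PR nor libeatmydata is confirmed to work this way, and the NOSYNC_DISABLE variable name is invented): the preloaded shim can consult the environment and forward to the real libc fsync when disabled.

```c
/* Hypothetical per-process opt-out for a preloaded fsync shim.
 * If NOSYNC_DISABLE is set (invented name), look up the real libc
 * fsync with dlsym(RTLD_NEXT, ...) and call it; otherwise no-op.
 * Build with: gcc -shared -fPIC -o nosync.so nosync.c -ldl */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdlib.h>
#include <unistd.h>

int fsync(int fd) {
    if (getenv("NOSYNC_DISABLE") != NULL) {
        int (*real_fsync)(int) = (int (*)(int))dlsym(RTLD_NEXT, "fsync");
        return real_fsync ? real_fsync(fd) : 0;
    }
    (void)fd;     /* opted in: report success without flushing */
    return 0;
}
```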

@def- def- changed the title from "Use eatmydata in CI and some test fixes I found while using" to "Use eatmydata in CI" Jul 23, 2025
@def- (Contributor, Author) commented Jul 23, 2025

I'm not sure. Maybe it's an even better test to not have fsync in those tests and make sure Materialize can still recover? fsync is not always implemented properly across the OS/filesystem/disk stack. Otherwise we are baking the behavior of our specific CI agents (ext4 with ordered writes, reliable SSDs, etc.) into our assumptions.

(Apparently the POSIX definition allows fsync to just do nothing, which is kind of the motivation for Eat My Data. Fun video, but unfortunately bad quality: https://www.youtube.com/watch?v=LMe7hf2G1po)
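A well-known illustration of how slippery fsync semantics are (a sketch assuming a macOS target; `durable_sync` is a name introduced here, not an existing API): on macOS, fsync(2) pushes data to the drive but does not flush the drive's write cache, so durability-critical software such as SQLite falls back to fcntl(F_FULLFSYNC).

```c
/* Sketch: request real durability where plain fsync may not provide it.
 * On macOS, fsync() only hands data to the drive; F_FULLFSYNC asks the
 * drive itself to flush its write cache. */
#include <fcntl.h>
#include <unistd.h>

int durable_sync(int fd) {
#ifdef F_FULLFSYNC
    if (fcntl(fd, F_FULLFSYNC) == 0)
        return 0;
    /* F_FULLFSYNC is not supported on all filesystems; fall back. */
#endif
    return fsync(fd);
}
```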
