Commit 8292c74

Have the reindexer reindex more river_job indexes (#963)
Here, expand the scope of the reindexer so that beyond the pair of indexes it reindexes right now, it also reindexes all the other ones in the `river_job` table. This won't make a huge difference for many uses of River, but it'll make a particular difference in cases where `river_job` became very large for some amount of time, but has since contracted to a much more modest size. B-trees expand but never shrink, so in that case the indexes will be left very large despite holding very little, and the only way to get them back down to size is to rebuild them.

More reindexes will always put more load on the database, but we're still doing that as carefully as we can: indexes are rebuilt one by one, using `CONCURRENTLY`. Back in #935 we added protection so that if index builds become too expensive and start to fail, the reindexer will bail out instead of trying and failing over and over again.
1 parent 9466aa0 commit 8292c74

4 files changed: +29 -3 lines changed
CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 - Remove unecessary transactions where a single database operation will do. This reduces the number of subtransactions created which can be an operational benefit it many cases. [PR #950](https://github.com/riverqueue/river/pull/950)
 - Bring all driver tests into separate package so they don't leak dependencies. This removes dependencies from the top level `river` package that most River installations won't need, thereby reducing the transitive dependency load of most River installations. [PR #955](https://github.com/riverqueue/river/pull/955).
+- The reindexer maintenance service now reindexes all `river_job` indexes, including its primary key. This is expected to help in situations where the jobs table has in the past expanded to a very large size (which makes most indexes larger), is now a much more modest size, but has left the indexes in their expanded state. [PR #963](https://github.com/riverqueue/river/pull/963).
 - The River CLI now accepts a `--target-version` of 0 with `river migrate-down` to run all down migrations and remove all River tables (previously, -1 was used for this; -1 still works, but now 0 also works). [PR #966](https://github.com/riverqueue/river/pull/966).
 
 ### Fixed

client_test.go

Lines changed: 7 additions & 1 deletion
@@ -6428,7 +6428,13 @@ func Test_NewClient_Defaults(t *testing.T) {
 	require.False(t, enqueuer.StaggerStartupIsDisabled())
 
 	reindexer := maintenance.GetService[*maintenance.Reindexer](client.queueMaintainer)
-	require.Equal(t, []string{"river_job_args_index", "river_job_metadata_index"}, reindexer.Config.IndexNames)
+	require.Contains(t, reindexer.Config.IndexNames, "river_job_args_index")
+	require.Contains(t, reindexer.Config.IndexNames, "river_job_kind")
+	require.Contains(t, reindexer.Config.IndexNames, "river_job_metadata_index")
+	require.Contains(t, reindexer.Config.IndexNames, "river_job_pkey")
+	require.Contains(t, reindexer.Config.IndexNames, "river_job_prioritized_fetching_index")
+	require.Contains(t, reindexer.Config.IndexNames, "river_job_state_and_finalized_at_index")
+	require.Contains(t, reindexer.Config.IndexNames, "river_job_unique_idx")
 	now := time.Now().UTC()
 	nextMidnight := time.Date(now.Year(), now.Month(), now.Day(), 0, 0, 0, 0, time.UTC).AddDate(0, 0, 1)
 	require.Equal(t, nextMidnight, reindexer.Config.ScheduleFunc(now))

internal/maintenance/reindexer.go

Lines changed: 9 additions & 1 deletion
@@ -26,7 +26,15 @@ const (
 	ReindexerTimeoutDefault = 1 * time.Minute
 )
 
-var defaultIndexNames = []string{"river_job_args_index", "river_job_metadata_index"} //nolint:gochecknoglobals
+var defaultIndexNames = []string{ //nolint:gochecknoglobals
+	"river_job_args_index",
+	"river_job_kind",
+	"river_job_metadata_index",
+	"river_job_pkey",
+	"river_job_prioritized_fetching_index",
+	"river_job_state_and_finalized_at_index",
+	"river_job_unique_idx",
+}
 
 // Test-only properties.
 type ReindexerTestSignals struct {

internal/maintenance/reindexer_test.go

Lines changed: 12 additions & 1 deletion
@@ -127,7 +127,7 @@ func TestReindexer(t *testing.T) {
 		require.True(t, requireReindexOne(indexName))
 	})
 
-	t.Run("ReindexesEachIndex", func(t *testing.T) {
+	t.Run("ReindexesMinimalSubsetofIndexes", func(t *testing.T) {
 		t.Parallel()
 
 		svc, bundle := setup(t)
@@ -163,6 +163,17 @@ func TestReindexer(t *testing.T) {
 		}
 	})
 
+	t.Run("ReindexesDefaultIndexes", func(t *testing.T) {
+		t.Parallel()
+
+		svc, _ := setup(t)
+
+		svc.Config.ScheduleFunc = runImmediatelyThenOnceAnHour()
+
+		require.NoError(t, svc.Start(ctx))
+		svc.TestSignals.Reindexed.WaitOrTimeout()
+	})
+
 	t.Run("ReindexDeletesArtifactsWhenCancelledWithStop", func(t *testing.T) {
 		t.Parallel()
 