
Merge delta->main indexer crashing #3833

@nojaneus

Bug Description:

Hi!
I merge delta into main every 5 minutes with:

sudo -u manticore indexer --merge ns_fantlab_editions_idx ns_fantlab_editions_delta_idx --rotate

Sometimes indexer crashes:

using config file '/etc/manticoresearch/manticore.conf'...
merging table 'ns_fantlab_editions_delta_idx' into table 'ns_fantlab_editions_idx'...
*** Oops, indexer crashed! Please send the following report to developers.
Manticore 13.13.0 e5465fe44@25100704 (columnar 8.1.0 e1522a2@25100213) (secondary 8.1.0 e1522a2@25100213) (knn 8.1.0 e1522a2@25100213) (embeddings 1.0.1)
-------------- report begins here ---------------
Current document: docid=0, hits=0
Current batch: minid=0, maxid=0
Hit pool start: docid=0, hit=0
-------------- backtrace begins here ---------------
Program compiled with Clang 16.0.6
Configured with flags: Configured with these definitions: -DDISTR_BUILD=bullseye -DUSE_SYSLOG=1 -DWITH_GALERA=1 -DWITH_RE2=1 -DWITH_RE2_FORCE_STATIC=1 -DWITH_STEMMER=1 -DWITH_STEMMER_FORCE_STATIC=1 -DWITH_NLJSON=1 -DWITH_UNIALGO=1 -DWITH_ICU=1 -DWITH_ICU_FORCE_STATIC=1 -DWITH_JIEBA=1 -DWITH_SSL=1 -DWITH_ZLIB=1 -DWIT
Built on Linux x86_64 (bullseye) (cross-compiled)
Stack bottom = 0x0, thread stack size = 0x20000
Trying system backtrace:
begin of system symbols:
/usr/bin/indexer(_Z12sphBacktraceib+0x227)[0x56537e5c2ab7]
/usr/bin/indexer(_Z7sigsegvi+0xbb)[0x56537e47432b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x14140)[0x7f6516f00140]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x1c)[0x7f6516db173c]
/usr/bin/indexer(_ZN10CSphReaderD2Ev+0x22)[0x56537e7952b2]
/usr/bin/indexer(_ZN13CSphIndex_VLN10MergeWordsI16DiskIndexQword_cILb1ELb0EES2_EEbPKS_S4_11VecTraits_TIjES6_P14CSphHitBuilderR10CSphStringR17CSphIndexProgress+0x8f0)[0x56537e559770]
/usr/bin/indexer(_ZN13CSphIndex_VLN7DoMergeEPKS_S1_PK10ISphFilterR10CSphStringR17CSphIndexProgressbb+0x62a)[0x56537e4c01da]
/usr/bin/indexer(_ZN13CSphIndex_VLN5MergeEP9CSphIndexRK11VecTraits_TI18CSphFilterSettingsEbR17CSphIndexProgress+0x110)[0x56537e4bf370]
/usr/bin/indexer(_Z7DoMergeRK17CSphConfigSectionPKcS1_S3_RN3sph8Vector_TI18CSphFilterSettingsNS4_13DefaultCopy_TIS6_EENS4_14DefaultRelimitENS4_16DefaultStorage_TIS6_EEEEbb+0xac6)[0x56537e4738c6]
/usr/bin/indexer(main+0x3085)[0x56537e477aa5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea)[0x7f6516d4dd0a]
/usr/bin/indexer(_start+0x2a)[0x56537e4694da]
Trying boost backtrace:
 0# sphBacktrace(int, bool) in /usr/bin/indexer
 1# sigsegv(int) in /usr/bin/indexer
 2# 0x00007F6516F00140 in /lib/x86_64-linux-gnu/libpthread.so.0
 3# cfree in /lib/x86_64-linux-gnu/libc.so.6
 4# CSphReader::~CSphReader() in /usr/bin/indexer
 5# bool CSphIndex_VLN::MergeWords<DiskIndexQword_c<true, false>, DiskIndexQword_c<true, false> >(CSphIndex_VLN const*, CSphIndex_VLN const*, VecTraits_T<unsigned int>, VecTraits_T<unsigned int>, CSphHitBuilder*, CSphString&, CSphIndexProgress&) in /usr/bin/indexer
 6# CSphIndex_VLN::DoMerge(CSphIndex_VLN const*, CSphIndex_VLN const*, ISphFilter const*, CSphString&, CSphIndexProgress&, bool, bool) in /usr/bin/indexer
 7# CSphIndex_VLN::Merge(CSphIndex*, VecTraits_T<CSphFilterSettings> const&, bool, CSphIndexProgress&) in /usr/bin/indexer
 8# DoMerge(CSphConfigSection const&, char const*, CSphConfigSection const&, char const*, sph::Vector_T<CSphFilterSettings, sph::DefaultCopy_T<CSphFilterSettings>, sph::DefaultRelimit, sph::DefaultStorage_T<CSphFilterSettings> >&, bool, bool) in /usr/bin/indexer
 9# main in /usr/bin/indexer
10# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6
11# _start in /usr/bin/indexer
-------------- backtrace ends here ---------------
Please, create a bug report in our bug tracker (https://github.com/manticoresoftware/manticore/issues)
and attach there:
a) searchd log, b) searchd binary, c) searchd symbols.
Look into the chapter 'Reporting bugs' in the manual
(https://manual.manticoresearch.com/Reporting_bugs)
Dump with GDB via watchdog

After the crash, the command

sudo -u manticore indextool --check ns_fantlab_editions_idx

reports:

Manticore 13.13.0 e5465fe44@25100704 (columnar 8.1.0 e1522a2@25100213) (secondary 8.1.0 e1522a2@25100213)
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2025, Manticore Software LTD (https://manticoresearch.com)
using config file '/etc/manticoresearch/manticore.conf'...
checking table 'ns_fantlab_editions_idx'...
checking schema...
checking dictionary...
FAILED, wrong word-delta (pos=7066253, word=work, len=4, begin=140, delta=112)
FAILED, word order decreased (pos=7066253, word=work, prev=work)
FAILED, wrong word-delta (pos=7066371, word=work, len=4, begin=12, delta=1)
FAILED, word order decreased (pos=7066371, word=work, prev=work)
FAILED, wrong word-delta (pos=7066377, word=work, len=4, begin=7, delta=3)
FAILED, word order decreased (pos=7066377, word=work, prev=work)
FAILED, wrong word-delta (pos=7066387, word=work, len=4, begin=6, delta=5)
FAILED, word order decreased (pos=7066387, word=work, prev=work)
FAILED, wrong word-delta (pos=7066399, word=work, len=4, begin=5, delta=6)
FAILED, word order decreased (pos=7066399, word=work, prev=work)
FAILED, wrong word-delta (pos=7066412, word=work, len=4, begin=6, delta=4)
FAILED, word order decreased (pos=7066412, word=work, prev=work)
FAILED, wrong word-delta (pos=7066423, word=work, len=4, begin=6, delta=4)
FAILED, word order decreased (pos=7066423, word=work, prev=work)
FAILED, wrong word-delta (pos=7066434, word=work, len=4, begin=6, delta=4)
FAILED, word order decreased (pos=7066434, word=work, prev=work)
FAILED, wrong word-delta (pos=7066445, word=work, len=4, begin=5, delta=4)
FAILED, word order decreased (pos=7066445, word=work, prev=work)
FAILED, wrong word-delta (pos=7066456, word=work, len=4, begin=6, delta=4)
FAILED, word order decreased (pos=7066456, word=work, prev=work)
FAILED, wrong word-delta (pos=7066467, word=work, len=4, begin=7, delta=3)
FAILED, word order decreased (pos=7066467, word=work, prev=work)
FAILED, wrong word-delta (pos=7066477, word=work, len=4, begin=6, delta=5)
FAILED, word order decreased (pos=7066477, word=work, prev=work)
FAILED, wrong word-delta (pos=7066489, word=work, len=4, begin=7, delta=4)
FAILED, word order decreased (pos=7066489, word=work, prev=work)
FAILED, wrong word-delta (pos=7066500, word=work, len=4, begin=5, delta=6)
FAILED, word order decreased (pos=7066500, word=work, prev=work)
FAILED, wrong word-delta (pos=7066513, word=work, len=4, begin=7, delta=4)
FAILED, word order decreased (pos=7066513, word=work, prev=work)
FAILED, wrong word-delta (pos=7066524, word=work, len=4, begin=7, delta=3)
FAILED, word order decreased (pos=7066524, word=work, prev=work)
FAILED, wrong word-delta (pos=7066534, word=work, len=4, begin=6, delta=5)
FAILED, word order decreased (pos=7066534, word=work, prev=work)
FAILED, wrong word-delta (pos=7066546, word=work, len=4, begin=6, delta=4)
FAILED, word order decreased (pos=7066546, word=work, prev=work)
FAILED, wrong word-delta (pos=7066557, word=work, len=4, begin=5, delta=4)
FAILED, word order decreased (pos=7066557, word=work, prev=work)
FAILED, wrong word-delta (pos=7066568, word=work, len=4, begin=7, delta=3)
FAILED, word order decreased (pos=7066568, word=work, prev=work)
FAILED, wrong word-delta (pos=7066578, word=work, len=4, begin=9, delta=1)
FAILED, word order decreased (pos=7066578, word=work, prev=work)
FAILED, wrong word-delta (pos=7066586, word=work, len=4, begin=6, delta=5)
FAILED, word order decreased (pos=7066586, word=work, prev=work)
FAILED, wrong word-delta (pos=7066598, word=work, len=4, begin=7, delta=3)
FAILED, word order decreased (pos=7066598, word=work, prev=work)
FAILED, wrong word-delta (pos=7066608, word=work, len=4, begin=6, delta=4)
FAILED, word order decreased (pos=7066608, word=work, prev=work)
FAILED, wrong word-delta (pos=7066619, word=work, len=4, begin=7, delta=4)
FAILED, word order decreased (pos=7066619, word=work, prev=work)
FAILED, wrong word-delta (pos=7066630, word=work, len=4, begin=5, delta=6)
FAILED, word order decreased (pos=7066630, word=work, prev=work)
FAILED, wrong word-delta (pos=7066643, word=work, len=4, begin=6, delta=3)
FAILED, word order decreased (pos=7066643, word=work, prev=work)
FAILED, wrong word-delta (pos=7066653, word=work, len=4, begin=6, delta=5)
FAILED, word order decreased (pos=7066653, word=work, prev=work)
FAILED, wrong word-delta (pos=7066665, word=work, len=4, begin=6, delta=5)
FAILED, word order decreased (pos=7066665, word=work, prev=work)
FAILED, wrong word-delta (pos=7066677, word=work, len=4, begin=5, delta=5)
FAILED, word order decreased (pos=7066677, word=work, prev=work)
FAILED, wrong word-delta (pos=7066689, word=work, len=4, begin=6, delta=5)
FAILED, word order decreased (pos=7066689, word=work, prev=work)
FAILED, wrong word-delta (pos=7066701, word=work, len=4, begin=7, delta=2)
FAILED, word order decreased (pos=7066701, word=work, prev=work)
FAILED, wrong word-delta (pos=7066710, word=work, len=4, begin=8, delta=3)
FAILED, word order decreased (pos=7066710, word=work, prev=work)
FAILED, wrong word-delta (pos=7066720, word=work, len=4, begin=6, delta=5)
FAILED, word order decreased (pos=7066720, word=work, prev=work)
FAILED, wrong word-delta (pos=7066732, word=work, len=4, begin=6, delta=4)
FAILED, word order decreased (pos=7066732, word=work, prev=work)
FAILED, wrong word-delta (pos=7066743, word=work, len=4, begin=6, delta=5)
FAILED, word order decreased (pos=7066743, word=work, prev=work)
FAILED, wrong word-delta (pos=7066755, word=work, len=4, begin=6, delta=5)
FAILED, word order decreased (pos=7066755, word=work, prev=work)
FAILED, unexpected checkpoint (pos=7066767, word=764599, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7067446, word=764663, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7068132, word=764727, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7068905, word=764791, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7069585, word=764855, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7070293, word=764919, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7070942, word=764983, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7071628, word=765047, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7072259, word=765111, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7072890, word=765175, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7073580, word=765239, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7074303, word=765303, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7074912, word=765367, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7075570, word=765431, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7076221, word=765495, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7076887, word=765559, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7077554, word=765623, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7078196, word=765687, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7078868, word=765751, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7079499, word=765815, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7080129, word=765879, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7080759, word=765943, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7081439, word=766007, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7082102, word=766071, words=55, expected=64)
FAILED, unexpected checkpoint (pos=7082755, word=766135, words=55, expected=64)
checking data...
WARNING, multiple tail hits (wordid=0(work), rowid=724, hit=0x2ffffc2, last=0x2ffffbe)
checking rows...
checking attribute blocks index...
checking kill-list...
checking docstore...
checking dead row map...
checking doc-id lookup...
check FAILED, 99 of 8393 failures reported, 2.1 sec elapsed

The problems began after I upgraded from my old SphinxSearch to Manticore Search 13.13.0.
All my delta indexes are merged in a loop, one after another, with no sleep between them. I have noticed that if I change the order in which the indexes are updated, the error either becomes less frequent or disappears altogether. If I update only this one index, the error never occurs. I only see the error when I merge a delta immediately after merging the delta of another index.
After a crash, I can only rebuild the main index. Unfortunately, I haven't found input data that reproduces the problem reliably; there seems to be some randomness involved. Thank you for your work! Alexey.
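
For anyone trying to narrow this down, here is a minimal sketch of one step of such a loop that checks the main table right after each merge, so the first merge that leaves a broken table is easy to spot. It reuses only the two commands shown above and looks for the `check FAILED` summary line from the indextool output. The `INDEXER` and `INDEXTOOL` variables and the `merge_and_check` function are hypothetical names for this sketch, not Manticore settings. It also assumes the script already runs as the manticore user. Note that with `--rotate` the new files may not be in place yet when the check runs, so treat the result as a rough diagnostic.

```shell
#!/bin/sh
# Sketch only: merge one delta into its main table, then check the main table.
# INDEXER / INDEXTOOL are hypothetical override variables for this script.
INDEXER="${INDEXER:-indexer}"
INDEXTOOL="${INDEXTOOL:-indextool}"

merge_and_check() {
    main="$1"
    delta="$2"

    # Same merge invocation as in the report.
    if ! "$INDEXER" --merge "$main" "$delta" --rotate; then
        echo "merge failed: $delta -> $main" >&2
        return 1
    fi

    # Same check invocation as in the report; keep the output for inspection.
    report=$("$INDEXTOOL" --check "$main" 2>&1) || true

    # The failing run above ends with a "check FAILED, ..." summary line.
    if printf '%s\n' "$report" | grep -q 'check FAILED'; then
        echo "check failed after merge: $main" >&2
        return 2
    fi

    echo "ok: $main"
}

# Example call with the tables from this report:
# merge_and_check ns_fantlab_editions_idx ns_fantlab_editions_delta_idx || exit 1
```

Calling this once per main/delta pair, in the same order as the existing loop, and stopping at the first non-zero return should show which preceding merge the failure follows.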

Manticore Search Version:

13.13.0

Operating System Version:

Debian GNU/Linux 11

Have you tried the latest development version?

Yes

Internal Checklist:

To be completed by the assignee. Check off tasks that have been completed or are not applicable.

  • Implementation completed
  • Tests developed
  • Documentation updated
  • Documentation reviewed
