LSMTree compaction creates duplicate timestamped indexes that are not cleaned up #2701

## Description
When creating indexes on large datasets (33.8M records), ArcadeDB's LSMTree compaction process creates multiple timestamped duplicate indexes that persist in the database instead of being cleaned up after compaction completes.
## Steps to Reproduce
1. Import a large dataset (e.g., MovieLens ml-latest with 33,832,163 ratings)
2. Create indexes on the imported data:
```sql
CREATE INDEX ON Movie (movieId) UNIQUE
CREATE INDEX ON Rating (userId) NOTUNIQUE
CREATE INDEX ON Rating (movieId) NOTUNIQUE
CREATE INDEX ON Link (movieId) UNIQUE
CREATE INDEX ON Tag (movieId) NOTUNIQUE
```
3. Query the schema to see all indexes:
```sql
SELECT name, typeName, properties, unique, automatic
FROM schema:indexes
ORDER BY typeName, name
```
## Expected Behavior
Five indexes in total, one per `CREATE INDEX` command.
## Actual Behavior
Found **80 indexes** instead of 5, with 15+ timestamped duplicates per table:
- `Movie[movieId]` (expected)
- `Movie_0_172987397898984` (duplicate)
- `Movie_1_172987421520553` (duplicate)
- `Movie_2_172987445142122` (duplicate)
- ... (12+ more duplicates)
All duplicates are marked as `automatic=true`.
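To make the count concrete, the rows returned by the schema query can be grouped by type. This is a minimal sketch, assuming the rows are already available as dictionaries with `name` and `typeName` keys and that the timestamped entries report the same `typeName` as the main index. The sample rows are illustrative placeholders, not the actual query output.

```python
from collections import Counter

def indexes_per_type(rows):
    """Count index entries per typeName in rows shaped like schema:indexes output."""
    return Counter(row["typeName"] for row in rows)

# Illustrative rows only (hypothetical, shaped like the listing above).
rows = [
    {"name": "Movie[movieId]", "typeName": "Movie"},
    {"name": "Movie_0_172987397898984", "typeName": "Movie"},
    {"name": "Movie_1_172987421520553", "typeName": "Movie"},
    {"name": "Rating[userId]", "typeName": "Rating"},
]

counts = indexes_per_type(rows)
for type_name, n in sorted(counts.items()):
    print(f"{type_name}: {n} index entries")
```

With the full result set, 80 entries spread over 5 created indexes works out to 16 entries per index on average.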
## Analysis
Based on source code review:
1. **LSMTreeIndexMutable.java (line 168)**:
```java
public LSMTreeIndexCompacted createNewForCompaction() {
  final String newName = componentName.substring(0, last_) + "_" + System.nanoTime();
  return new LSMTreeIndexCompacted(..., newName, ...);
}
```
2. **LSMTreeIndex.java (line 548)**:
```java
protected LSMTreeIndexMutable splitIndex(...) {
  final String newName = mutable.getName().substring(0, last_) + "_" + System.nanoTime();
  final LSMTreeIndexMutable newMutableIndex = new LSMTreeIndexMutable(..., newName, ...);
  // ...
}
```
These timestamped index files are created during compaction but appear not to be properly cleaned up after compaction completes.
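The naming step in both excerpts can be mirrored in a few lines of Python. This is a sketch, not the ArcadeDB implementation: it assumes `last_` holds the position of the last underscore in the component name, and it uses `time.monotonic_ns()` as a rough analogue of `System.nanoTime()`. The function name is hypothetical.

```python
import time
from typing import Optional

def name_for_new_component(component_name: str, nanos: Optional[int] = None) -> str:
    """Keep everything before the last '_' and append a fresh numeric suffix."""
    # Assumption: `last_` in the Java excerpt is the index of the last underscore.
    last_ = component_name.rfind("_")
    if last_ < 0:
        raise ValueError(f"no '_' in component name: {component_name!r}")
    if nanos is None:
        # Monotonic counter with an arbitrary origin, like System.nanoTime().
        nanos = time.monotonic_ns()
    return f"{component_name[:last_]}_{nanos}"

# The numeric suffix is replaced; the "<type>_<number>" prefix is preserved.
print(name_for_new_component("Movie_0_111", nanos=222))
```

Under that assumption, every call produces a new name with the same prefix, which matches the `Movie_0_...`, `Movie_1_...` entries listed above.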
## Impact
- **Functional**: ✅ Queries work correctly using the main indexes
- **Performance**: ⚠️ Duplicates don't affect query speed but waste disk space
- **Storage**: ❌ Roughly 16x as many indexes as expected (80 vs. 5), with the corresponding overhead in index files
## Environment
- Dataset: MovieLens ml-latest (33,832,163 ratings, 86,538 movies, 2,328,316 tags, 9,742 links)
- ArcadeDB: Python bindings via arcadedb_embedded
- JVM Heap: 8GB
- Database: Embedded mode
## Logs
During index creation on the large dataset:
```
⚠️ Index creation failed: Command failed: com.arcadedb.exception.NeedRetryException:
Cannot create a new index while asynchronous tasks are running (LSMTreeIndexCompactor)
```
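Since the exception is a `NeedRetryException`, one way to get past it on the client side is to retry the command with a growing delay until the compactor has finished. The sketch below is generic and does not call ArcadeDB: `run_with_retry` and `make_flaky` are hypothetical helpers, and detecting the retryable case by looking for `NeedRetryException` in the message text is an assumption about how the Python bindings surface the error.

```python
import time

def run_with_retry(action, attempts=5, initial_delay=1.0, sleep=time.sleep):
    """Call action(); on a retryable failure, wait and retry with a doubled delay."""
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:
            # Assumption: the retryable case is recognisable from the message text.
            retryable = "NeedRetryException" in str(exc)
            if not retryable or attempt == attempts:
                raise
            sleep(delay)
            delay *= 2

def make_flaky(failures):
    """Stand-in for the real index-creation call: fails a few times, then succeeds."""
    state = {"left": failures}
    def action():
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("com.arcadedb.exception.NeedRetryException: compactor running")
        return "index created"
    return action

waits = []  # record the delays instead of actually sleeping
result = run_with_retry(make_flaky(2), sleep=waits.append)
print(result, waits)
```

In real use, `action` would wrap the `CREATE INDEX` command issued through the bindings, and the default `time.sleep` would be left in place.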
LSMTree compaction logs show:
```
LSMTreeIndex 'Movie[movieId]' compacted 50 pages, remaining 0 pages
(totalKeys=289037 totalValues=2251732)
```
## Questions
1. Are timestamped index files intended to be temporary during compaction?
2. Should they be automatically cleaned up after compaction completes?
3. Is there a configuration to control compaction cleanup behavior?
## Suggested Fix
After compaction completes, cleanup logic should:
1. Identify timestamped index files matching pattern `{indexName}_\d+`
2. Remove them from schema if they're marked as temporary/compaction artifacts
3. Delete the corresponding physical files
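Step 1 could be sketched as a name filter. The pattern here is an assumption inferred from names such as `Movie_0_172987397898984` (a prefix, a small integer, then a long run of digits) and is deliberately stricter than `{indexName}_\d+`. It has not been checked against ArcadeDB's own naming rules, and a match only marks a name as a candidate: the pattern alone cannot tell a leftover from an index that is still in use.

```python
import re

# Assumed shape: <prefix>_<integer>_<at least 10 digits>. Hypothetical, inferred
# from the names observed in this report.
CANDIDATE_RE = re.compile(r"(?P<prefix>.+)_(?P<ordinal>\d+)_(?P<suffix>\d{10,})")

def split_candidates(names):
    """Partition index names into (timestamp-style candidates, everything else)."""
    candidates, others = [], []
    for name in names:
        if CANDIDATE_RE.fullmatch(name):
            candidates.append(name)
        else:
            others.append(name)
    return candidates, others

names = [
    "Movie[movieId]",
    "Movie_0_172987397898984",
    "Rating[userId]",
    "my_index_2",  # hypothetical user-defined name that must not match
]
candidates, others = split_candidates(names)
print("candidates:", candidates)
print("others:", others)
```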
## Workaround
Users can manually drop timestamped indexes:
```sql
DROP INDEX Movie_0_172987397898984;
-- Repeat for all timestamped duplicates
```
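For a long list, the statements can be generated rather than typed by hand. This is a small sketch that only formats text and does not talk to the database. `drop_statements` is a hypothetical helper, and it expects a list of names that has already been reviewed by hand.

```python
import re

# Accept only plain identifiers (letters, digits, underscore) so that an
# unexpected name never ends up inside a generated statement.
SAFE_NAME_RE = re.compile(r"\w+")

def drop_statements(reviewed_names):
    """Return one DROP INDEX statement per manually reviewed index name."""
    statements = []
    for name in reviewed_names:
        if not SAFE_NAME_RE.fullmatch(name):
            raise ValueError(f"refusing to generate DROP for unusual name: {name!r}")
        statements.append(f"DROP INDEX {name};")
    return statements

for statement in drop_statements(["Movie_0_172987397898984"]):
    print(statement)
```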
However, this requires knowing which indexes are duplicates vs. legitimate user-created indexes.

Tested on `25.10.1-SNAPSHOT`.
