Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
93 changes: 93 additions & 0 deletions npc_disappearing_analysis.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,93 @@
# Disappearing NPC Bug Analysis: Root Cause Investigation

## Problem Statement

Multiple server administrators have reported NPCs randomly not spawning for players after server restarts when upgrading from v0.9.0-beta-161 to v0.9.0-beta-162. The issue affects multiple servers running Typewriter and appears to be introduced by specific changes in beta-162.

## Timeline Correction

**v0.9.0-beta-161**: NPCs worked correctly without spawning issues
**v0.9.0-beta-162**: Introduced disappearing NPC bugs

## Root Cause Analysis: Changes That Introduced the Bug

Through analysis of the commits between v0.9.0-beta-161 and v0.9.0-beta-162, several changes in the entity-extension have been identified as the likely culprits:

### 1. Entity Movement Packet Changes ❌ PROBLEMATIC

**Commit**: `4cf2bd44613b7325f42d47ed7a82598d39a52398`

**Analysis**: This commit disabled `WrapperPlayServerEntityRelativeMoveAndRotation` packets and forced all entity movement to use `WrapperPlayServerEntityPositionSync` instead.

**Why this could cause issues**:
- Changed fundamental entity movement behavior that was working in beta-161
- May have introduced timing issues with entity position synchronization
- Could cause entities to not appear if absolute positioning has different packet timing requirements
- The switch from relative to absolute positioning may have unintended side effects on entity visibility

```kotlin
// Working in beta-161: Relative movement
WrapperPlayServerEntityRelativeMoveAndRotation(entityId, delta.x, delta.y, delta.z, ...)

// Changed in beta-162: Absolute positioning (potentially problematic)
WrapperPlayServerEntityPositionSync(entityId, EntityPositionData(...))
```

### 2. Threading System Overhaul ❌ INTRODUCED RACE CONDITIONS

**Commits**: `07ed2dd7bf4e6100cac894d110b7f9f6e91b2bde` and `733e115e0b5ead7b235123f33836ad991ab3b51c`

**Analysis**: Major changes to thread pool management with new `MAX_PLATFORM_THREADS`, `MAX_VIRTUAL_THREADS`, and `CORE_POOL_SIZE = 6`.

**Why this could cause issues**:
- Threading changes during server startup can introduce race conditions
- Entity loading timing may have changed, causing entities to not spawn properly
- The new thread pool behavior might not properly handle entity initialization during server restarts
- Virtual threads might have different timing characteristics affecting entity spawning

### 3. Entity Lifecycle Changes ❌ BROKE STATE MANAGEMENT

**Commits**: `53877a4d24b3dc1c481d0f8901359bfa6b779d6f` and `e71553f2c67ada751fd2e957f8f52833b0059e81`

**Analysis**: Multiple changes to entity lifecycle management including:
- Added `EntityFrame.clean()` method
- Changed asset loading from `fetchAsset()` to `fetchStringAsset()`
- Modified arm swing handling with MainHand awareness
- Altered setup() call order in `EntityCinematicAction`

**Why this could cause issues**:
- The new `EntityFrame.clean()` method might be clearing entity state too aggressively
- Asset loading changes could cause entities to fail loading their appearance data
- Setup order changes could break entity initialization sequences
- These changes modified working entity lifecycle management from beta-161

## Why The Bug Appears Random

The disappearing NPC bug manifests randomly due to the timing-dependent nature of the changes:

1. **Threading Race Conditions**: The new thread pool implementation creates variable timing during server startup
2. **Network Packet Timing**: The switch to absolute positioning packets may have different network timing characteristics
3. **Entity Lifecycle Timing**: Changes to asset loading and initialization order create timing dependencies
4. **Server Performance Variations**: Different server performance affects the timing of these race conditions

## Impact Assessment

**Introduced Problems in beta-162:**
- ❌ Entity movement packet behavior changes (commit 4cf2bd44)
- ❌ Threading instability during startup (commits 07ed2dd7, 733e115e)
- ❌ Entity lifecycle state management disruption (commits 53877a4d, e71553f2)

## Recommended Solutions

To resolve the disappearing NPC issue, consider:

1. **Revert Entity Movement Changes**: Consider reverting to the relative movement packet system that worked in beta-161
2. **Review Threading Changes**: Analyze if the new thread pool implementation properly handles entity spawning during server initialization
3. **Audit Entity Lifecycle Changes**: Ensure the new `EntityFrame.clean()` and asset loading changes don't interfere with entity persistence
4. **Add Defensive Coding**: Implement additional checks and retries for entity spawning to handle timing-dependent failures

## Conclusion

The disappearing NPC issue was introduced by changes made between v0.9.0-beta-161 and v0.9.0-beta-162. The primary suspects are the entity movement packet changes, threading system overhaul, and entity lifecycle modifications. These changes, while potentially well-intentioned, disrupted the stable entity spawning behavior that worked correctly in beta-161.

To resolve the issue, a careful review and potential rollback of these changes should be considered, followed by more gradual implementation with proper testing across server restart scenarios.