Significant Performance Regression in Code Editing: Request to Restore DeepSeek-Coder-V2 as Primary Coder Model #80

@Victorcorcos

Summary

The merge of DeepSeek-Coder-V2 and DeepSeek-V2-Chat into DeepSeek-V2.5 has caused a sharp performance regression on code editing tasks. I request that the deepseek/deepseek-coder API endpoint be redirected back to the original DeepSeek-Coder-V2 model instead of the merged V2.5 version.

Performance Impact

Aider Benchmark Results

The regression is most evident in the Aider code editing benchmark, which evaluates LLMs' ability to modify existing code:

  • DeepSeek-Coder-V2 (original): 73.7% pass rate - 1st position on leaderboard
  • DeepSeek-V2.5 (current deepseek/deepseek-coder): 17.8% pass rate - Significant drop in ranking

This is a drop of 55.9 percentage points, or roughly 76% relative to the original score, on code editing tasks.
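To make the arithmetic behind that figure easy to check, here is a minimal sketch that recomputes it from the two pass rates quoted above. The function name `relative_change` is only an illustrative helper, not part of Aider or any DeepSeek tooling.

```python
def relative_change(baseline: float, current: float) -> float:
    """Return the change from baseline to current as a percentage of baseline."""
    return (current - baseline) / baseline * 100.0


# Pass rates as quoted in this issue (percent of benchmark tasks passed).
coder_v2_pass_rate = 73.7
v2_5_pass_rate = 17.8

# Absolute drop in percentage points.
drop_points = coder_v2_pass_rate - v2_5_pass_rate

# Relative change with Coder-V2 as the baseline (negative means a regression).
relative = relative_change(coder_v2_pass_rate, v2_5_pass_rate)

print(f"absolute drop: {drop_points:.1f} points")
print(f"relative change: {relative:.1f}%")
```

The relative change works out to about -75.8%, which this issue rounds to 76%.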

Benchmark Comparison

| Model | Aider Score | Ranking | Performance Change |
|---|---|---|---|
| DeepSeek-Coder-V2 (original) | 73.7% | 1st | Baseline |
| DeepSeek-V2.5 (merged) | 17.8% | Low | -76% ↓ |

Current Issue

When developers use aider --model deepseek/deepseek-coder, they expect to get the best coding-focused model from DeepSeek. However, they're now receiving the merged V2.5 model, which has significantly degraded code editing capabilities compared to the original specialized Coder-V2 model.

Impact on Developer Experience

  1. Unexpected Performance Drop: Developers who relied on DeepSeek-Coder-V2 for its strong code editing results now get much worse results, with no warning that the model behind the endpoint changed
  2. Tool Integration Issues: Code editing tools like Aider that recommended DeepSeek-Coder-V2 as a top choice now perform poorly with the merged model
  3. Loss of Specialized Capabilities: The code editing abilities of the original DeepSeek-Coder-V2 appear to have been diluted in the general-purpose merged model

Proposed Solutions

Option 1 (Preferred): Restore Original Coder Model

  • Redirect deepseek/deepseek-coder API endpoint back to DeepSeek-Coder-V2-Instruct
  • Keep DeepSeek-V2.5 available as deepseek/deepseek-chat for general-purpose tasks
  • This maintains the principle that specialized models should excel in their domains
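As a rough illustration of Option 1, the change amounts to repointing one alias while leaving the other untouched. The sketch below uses a plain dictionary as a stand-in routing table. The table, `apply_option_1`, and `resolve` are hypothetical names for this example only and do not describe how DeepSeek actually routes requests.

```python
# Hypothetical alias table. The names mirror the ones used in this issue,
# not DeepSeek's real serving configuration.
CURRENT_ROUTES = {
    "deepseek/deepseek-coder": "DeepSeek-V2.5",
    "deepseek/deepseek-chat": "DeepSeek-V2.5",
}


def apply_option_1(routes: dict[str, str]) -> dict[str, str]:
    """Return a copy of the table with the coder alias restored (Option 1)."""
    updated = dict(routes)  # copy so the caller's table is not mutated
    updated["deepseek/deepseek-coder"] = "DeepSeek-Coder-V2-Instruct"
    return updated


def resolve(routes: dict[str, str], alias: str) -> str:
    """Look up the model served for an alias, failing loudly on unknown names."""
    try:
        return routes[alias]
    except KeyError:
        raise ValueError(f"unknown model alias: {alias}") from None


proposed = apply_option_1(CURRENT_ROUTES)
print(resolve(proposed, "deepseek/deepseek-coder"))
print(resolve(proposed, "deepseek/deepseek-chat"))
```

Under this sketch, existing callers of deepseek/deepseek-coder get the specialized model back without changing their configuration, and deepseek/deepseek-chat keeps serving V2.5.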

Option 2: Provide Clear Model Differentiation

  • Create a new endpoint like deepseek/deepseek-coder-v2-original for the original model
  • Update documentation to clearly explain the performance differences
  • Provide migration guidance for users who need the original performance
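On the client side, the migration guidance in Option 2 could be as small as a lookup that swaps the alias for the pinned name and warns about it. This is a sketch under assumptions: deepseek/deepseek-coder-v2-original is only the endpoint name proposed in this issue and does not exist today, and `migrate_model_name` is a hypothetical helper.

```python
import warnings

# Hypothetical mapping from a moving alias to the pinned endpoint proposed
# in Option 2. The pinned name is a proposal in this issue, not a real endpoint.
PINNED_ALTERNATIVES = {
    "deepseek/deepseek-coder": "deepseek/deepseek-coder-v2-original",
}


def migrate_model_name(requested: str, want_original: bool) -> str:
    """Return the model name to use, switching to the pinned endpoint on request."""
    pinned = PINNED_ALTERNATIVES.get(requested)
    if want_original and pinned is not None:
        # Tell the user why the name changed instead of switching silently.
        warnings.warn(
            f"{requested} now serves the merged model; using {pinned} instead",
            stacklevel=2,
        )
        return pinned
    # No pinned alternative, or the caller is fine with the current model.
    return requested


print(migrate_model_name("deepseek/deepseek-coder", want_original=True))
print(migrate_model_name("deepseek/deepseek-chat", want_original=True))
```

Making the switch opt-in keeps the default behavior unchanged for users who are happy with V2.5, while the warning makes the performance difference visible to those who are not.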

Option 3: Improve V2.5 Code Editing Performance

  • Address the specific regression in code editing capabilities in the V2.5 model
  • Ensure the merged model doesn't sacrifice specialized performance for generality

Technical Context

According to your official documentation, DeepSeek-V2.5 is a "powerful combination" that "retains the robust code processing power of the Coder model." However, the Aider results above show this is not the case for code editing specifically, where the original DeepSeek-Coder-V2 significantly outperformed the merged version.

The Aider benchmark specifically tests:

  • Code modification and editing capabilities
  • Ability to understand and apply code changes accurately
  • Performance on real-world coding assistance scenarios

Request

I respectfully request that DeepSeek consider restoring the original DeepSeek-Coder-V2 as the model served by the deepseek/deepseek-coder endpoint, or at minimum, provide a clear path for developers to access the original high-performance code editing model.

This would:

  • Restore trust in DeepSeek's commitment to specialized model performance
  • Maintain compatibility for existing tools and workflows
  • Ensure developers get the best coding assistance experience


Thank you for considering this request. DeepSeek-Coder-V2 was an exceptional model for code editing, and I hope to see its capabilities restored for developers who depend on high-quality code assistance.
