
Conversation

@zhtmike (Collaborator) commented Oct 16, 2025

What does this PR do?

Make examples/janus runnable with MindSpore 2.6/2.7.

Fixes # (issue)

  • The updated Llama model in Transformers requires setting _support_cache_class=False to be compatible with graph mode. We set this attribute manually in inference.py and generation_inference.py (sketched after this list).
  • The legacy enable_compile_cache=True setting caused unexplained errors, so we removed it.
  • CI reported code-formatting errors caused by an internal .toml file, which we resolved by removing that file.
  • For text-only training, we fixed a broadcast error related to the image mask (also sketched below).
  • For training with mixed tasks, we dropped support for graph mode with dynamic inputs because: 1) PyNative mode offers better performance; 2) the graph-mode interface for dynamic inputs is broken in MS 2.7.
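
For context, a minimal sketch of the graph-mode compatibility tweaks described above. The load_model helper, the vl_gpt name, and the language_model attribute path are placeholders rather than the exact identifiers in inference.py; only the _support_cache_class attribute, the removal of enable_compile_cache, and the choice of execution mode come from this PR.

```python
import mindspore as ms

# Graph mode, as used by the inference scripts. The legacy
# enable_compile_cache=True flag is no longer passed because it triggered
# errors on MS 2.6/2.7.
ms.set_context(mode=ms.GRAPH_MODE)

# Hypothetical loader standing in for the example's own model setup.
vl_gpt = load_model("deepseek-ai/Janus-Pro-7B")

# The updated Llama implementation in Transformers defaults to a cache-class
# path that is not graph-mode friendly; force the legacy path instead.
vl_gpt.language_model._support_cache_class = False

# For mixed-task training, the examples run in PyNative mode instead, since
# graph mode with dynamic inputs is broken on MS 2.7:
# ms.set_context(mode=ms.PYNATIVE_MODE)
```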
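
And a rough, self-contained illustration of the image-mask broadcast issue fixed for text-only training. This is not the actual patch; the tensor names and shapes are invented, and it only shows why expanding the mask so it broadcasts lets the same code path handle batches that contain no image tokens.

```python
import mindspore as ms
from mindspore import ops

# Toy shapes: batch of 2, sequence length 4, hidden size 8.
text_embeds = ops.zeros((2, 4, 8), ms.float32)
image_embeds = ops.ones((2, 4, 8), ms.float32)

# In a text-only batch the image-token mask is all False.
images_seq_mask = ops.zeros((2, 4), ms.float32).astype(ms.bool_)

# Expanding the mask to (2, 4, 1) lets it broadcast against the embeddings,
# so the same merge works whether or not the batch contains image tokens.
mask = ops.expand_dims(images_seq_mask, -1)
inputs_embeds = ops.where(mask, image_embeds, text_embeds)
```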

Adds # (feature)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you make sure to update the documentation with your changes? E.g. record bug fixes or new features in What's New. Here are the documentation guidelines.
  • Did you build and run the code without any errors?
  • Did you report the running environment (NPU type/MS version) and performance in the doc? (better record it for data loading, model inference, or training tasks)
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@xxx

@zhtmike zhtmike self-assigned this Oct 16, 2025
@zhtmike zhtmike added the bug Something isn't working label Oct 16, 2025
@zhtmike zhtmike marked this pull request as ready for review October 17, 2025 06:35
@zhtmike (Collaborator, Author) commented Oct 17, 2025

/gemini review

@gemini-code-assist gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request makes the Janus examples compatible with MindSpore 2.6/2.7. The changes include updating environment and performance documentation, applying necessary compatibility fixes for graph mode in newer MindSpore versions, resolving a broadcast error during training, and cleaning up script arguments and configurations. The changes are well-justified and align with the PR's objective. I've added one comment regarding a minor naming inconsistency to improve future code maintainability.

@zhtmike zhtmike changed the title [Bug fix] Janus-Pro runnable with MS2.6/2.7 [Bug Fix] Janus-Pro runnable with MS2.6/2.7 Oct 17, 2025
@vigo999 vigo999 added this to mindone Oct 17, 2025
@vigo999 vigo999 moved this to In Progress in mindone Oct 17, 2025
@vigo999 vigo999 self-requested a review October 18, 2025 03:13
@vigo999 vigo999 added this pull request to the merge queue Oct 18, 2025
Merged via the queue into mindspore-lab:master with commit 06421ef Oct 18, 2025
3 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in mindone Oct 18, 2025
@zhtmike zhtmike deleted the janus_fix branch October 20, 2025 02:53
vigo999 added a commit that referenced this pull request Nov 2, 2025
- Added PR links to model components where specific PRs exist (#1288, #1148)
- Added PR links to examples models that have individual PRs (#1378, #1233, #1363, #1243, #687, #1362, #1227, #1346, #1200, #1369)
- Noted that some components were added as part of broader pipeline implementations
- Improved traceability for specific model additions

Labels

bug Something isn't working

Projects

Status: Done


3 participants