
Observing mem access fault and page faults with model migx_inference_sd21_benchmarks #3933


Open
eddieliao opened this issue Apr 7, 2025 · 5 comments

@eddieliao
Contributor

SD21 seems to be running out of memory at higher batch sizes on Navi4x systems.

eddieliao added the bug and DeepLearningModels labels Apr 7, 2025
eddieliao self-assigned this Apr 7, 2025
@eddieliao
Contributor Author

Rejected the ticket because the issue is not reproducible on the latest build.

@eddieliao
Contributor Author

Reopened because the issue is still reproducible according to the reporter. Looking into potential differences between XTX and XTW.

@eddieliao
Contributor Author

Fixed with #3995.

@eddieliao
Contributor Author

Still reported as reproducible; reopening to track.

eddieliao reopened this May 14, 2025
@eddieliao
Contributor Author

Cannot reproduce on any GPU besides GPU 0; waiting on a response from the reporter to verify whether the issue is observed elsewhere.
