
[Bug]: Loading the 0.5B PP-UIE with AutoModelForCausalLM consumes 56 GB of NPU memory #11094

@zhiyongLiu1114

Description


Software environment

- paddlepaddle: 3.2.0
- paddle-custom-npu: 3.2.0
- paddlenlp: 3.0.0b4
- NPU: 910B
- Server: Ascend

Duplicate issues

  • I have searched the existing issues

Bug description

On an otherwise empty 910B NPU, loading the 0.5B PP-UIE with AutoModelForCausalLM consumes 56 GB of NPU memory. Running the same script a second time to load the 0.5B PP-UIE consumes only 4 GB (because that is all a 910B has left after the first 56 GB is taken), yet inference still works normally.

Steps to reproduce & code

from paddlenlp.transformers import AutoModelForCausalLM
from paddlenlp.transformers import AutoTokenizer
from paddlenlp.generation import GenerationConfig

model_id = "paddlenlp/PP-UIE-0.5B"

# On an empty 910B, this single load consumes 56 GB of NPU memory.
model = AutoModelForCausalLM.from_pretrained(model_id, use_flash_attention=False)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side="left")
generation_config = GenerationConfig.from_pretrained(model_id)
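As an aside on the snippet above: the tokenizer is created with padding_side="left", which is the usual choice for decoder-only generation, since it keeps every sequence's real tokens adjacent to the position where new tokens are generated. A minimal pure-Python sketch of the effect (independent of PaddleNLP; `left_pad` and `pad_id` are illustrative names, not library API):

```python
def left_pad(batch, pad_id=0):
    """Left-pad variable-length token-id sequences to equal length,
    mimicking what a tokenizer with padding_side="left" produces."""
    width = max(len(seq) for seq in batch)
    return [[pad_id] * (width - len(seq)) + list(seq) for seq in batch]

padded = left_pad([[11, 12, 13], [21, 22]], pad_id=0)
print(padded)  # [[11, 12, 13], [0, 21, 22]]
```

With right padding the shorter sequence would end in pad tokens, and the model would be asked to continue from a pad position; left padding avoids that.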

Metadata

Labels

bug: Something isn't working
