Simplify the Config code #2770
Conversation
Thanks for your contribution!
Pull Request Overview
This PR simplifies and standardizes configuration handling by:
- Renaming configuration fields (e.g., max_length → max_model_len, tensor_parallel_degree → tensor_parallel_size, num_layers → num_hidden_layers)
- Consolidating many per-config dataclasses into __init__(args)-based classes (see the sketch after this list)
- Updating all references across runners, models, and layers to use the new field names
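To make the consolidation pattern concrete, here is a minimal before/after sketch; the class names and field defaults are illustrative assumptions, not the exact FastDeploy definitions:

```python
from dataclasses import dataclass


# Before (illustrative): one dataclass per config, defaults in the field declarations.
@dataclass
class OldStyleParallelConfig:
    tensor_parallel_degree: int = 1
    block_size: int = 64


# After (illustrative): defaults declared in __init__, then overridden from an
# args dict, so every config class is constructed the same way.
class NewStyleParallelConfig:
    def __init__(self, args: dict):
        self.tensor_parallel_size: int = 1  # renamed from tensor_parallel_degree
        self.block_size: int = 64
        for key, value in args.items():
            if hasattr(self, key):
                setattr(self, key, value)


cfg = NewStyleParallelConfig({"tensor_parallel_size": 8})
print(cfg.tensor_parallel_size, cfg.block_size)  # 8 64
```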
Reviewed Changes
Copilot reviewed 28 out of 29 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| fastdeploy/worker/xpu_model_runner.py | Updated max token length and layer count references |
| fastdeploy/worker/worker_process.py | Switched all TP/EP degree names; overhauled initialize_fd_config |
| fastdeploy/worker/vl_gpu_model_runner.py | Aligned VL runner to new config constructors and naming |
| fastdeploy/worker/gpu_model_runner.py | Mirrored the XPU runner updates for the GPU runner |
| fastdeploy/spec_decode/mtp.py | Replaced num_layers with num_hidden_layers in MTP proposer |
| fastdeploy/rl/rollout_model.py | Aligned MoE config references to new model_config fields |
| fastdeploy/model_executor/models/tp_utils.py | Renamed TP degree check |
| fastdeploy/model_executor/models/*.py | Bulk replace of tensor_parallel_degree, num_layers, and model config checks across all model implementations |
| fastdeploy/config.py | Major rewrite: removed dataclasses, introduced args-driven inits |
Comments suppressed due to low confidence (1)
fastdeploy/model_executor/models/model_base.py:56
- The docstring mentions 'vocab_size, use_topp_sampling, etc.' but the initializer only takes a single 'configs' dict. Update the doc to reflect the actual signature, e.g., 'configs (dict): configuration parameters including ...'.
vocab_size, use_topp_sampling, etc.
# Set default block num for profile run
self.max_block_num: int = 2000
# block size
self.block_size: int = 64
ParallelConfig.__init__ sets block_size twice (first to 16, then to 64), leading to conflicting defaults. Remove the redundant assignment to ensure a single authoritative default.
done
LGTM
#
eos_tokens_lens: int = 2
# Enable chunked prefill
enable_chunked_prefill: str = "store_true"
Why isn't this a bool?
It is a bool; if you look at my updated code, it has been changed to a bool.
Thanks for pointing this out; the bool got overwritten earlier while resolving a merge conflict.
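For context on where a stray "store_true" string typically comes from, a minimal sketch (assumed argument names, not the PR's actual parsing code): "store_true" is an argparse action, while the parsed value, and therefore the config field, should be a plain bool:

```python
import argparse

# Sketch only: "store_true" is the argparse action name; the resulting
# attribute is a bool, which is what the config field should store.
parser = argparse.ArgumentParser()
parser.add_argument("--enable-chunked-prefill", action="store_true")

args = parser.parse_args(["--enable-chunked-prefill"])
enable_chunked_prefill: bool = args.enable_chunked_prefill
print(type(enable_chunked_prefill), enable_chunked_prefill)  # <class 'bool'> True
```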
for key, value in args.items():
    if hasattr(self, key):
        setattr(self, key, value)
capture_size = [i for i in range(1, max_capture_batch_size + 1)]
self.init_with_cudagrpah_size(cudagraph_capture_sizes=capture_size)
self.use_cudagraph = use_cudagraph
# TODO(wangmingkai02): change graph_opt_level=2 when using static mode with cinn
if enable_static_graph_inference:
    self.graph_opt_level = 1
- Do the parameters passed in via args come from config.json or from the launch arguments?
- What is the goal of putting all parameter defaults inside __init__()? The original style effectively kept the parameter declarations, the default-value settings, and the settings derived from user configuration (config.json, launch arguments) separate, which made the code easier to tell apart when reading.
1. args comes from the launch arguments. 2. It is mainly to support unified, global management of the config. For example, some RL scenarios do not pass args and instead supply a few parameters of their own; without defaults, the corresponding fields would be missing and the code that runs later would break.
- done. 2. So this is to satisfy the hasattr() check, right?
Yes.
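A small hypothetical sketch of the point agreed on here (TinySchedulerConfig and max_num_seqs are made-up names): because every field gets a default in __init__, a caller such as an RL scenario can pass only the parameters it cares about, the remaining fields still exist, and the hasattr() guard has something to check against:

```python
# Hypothetical mini-config: defaults first, then hasattr()-guarded overrides.
class TinySchedulerConfig:
    def __init__(self, args: dict):
        self.max_num_seqs: int = 256              # default always present
        self.enable_chunked_prefill: bool = False
        for key, value in args.items():
            if hasattr(self, key):                # only declared fields are overridden
                setattr(self, key, value)


# An RL-style caller passes only a subset of the parameters.
cfg = TinySchedulerConfig({"enable_chunked_prefill": True})
print(cfg.max_num_seqs, cfg.enable_chunked_prefill)  # 256 True
```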
self.args.max_capture_batch_size)
self.args.enable_static_graph_inference,
self.args.max_capture_batch_size,
vars(self.args))
Parameters like use_cudagraph are now all passed through args. If a user-facing argument name does not match the config field name, there is a risk.
Here we require the argument names that are passed in to match the field names in the config; anything that does not match simply will not take effect. This is also the convention that the refactored code now enforces.
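A short hypothetical sketch of the risk and of the convention being enforced (TinyGraphOptConfig and the misspelled flag are made up): an argument whose name does not match a config field is silently dropped by the hasattr() guard, so matching names is mandatory:

```python
from types import SimpleNamespace


# Hypothetical config: only keys that match declared fields take effect.
class TinyGraphOptConfig:
    def __init__(self, args: dict):
        self.use_cudagraph: bool = False
        for key, value in args.items():
            if hasattr(self, key):
                setattr(self, key, value)


launch_args = SimpleNamespace(use_cuda_graph=True)   # name does not match the field
cfg = TinyGraphOptConfig(vars(launch_args))
print(cfg.use_cudagraph)  # False: the mismatched argument was silently ignored
```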
This PR does the following: