UPN in three-stage training process #18

gongchenting · 2025-04-16T02:29:01Z

In the three-stage training process of ChatRex, will the parameters of UPN be updated? Additionally, how should the GT boxes be mixed with the UPN boxes? Will the boxes from UPN be filtered first, just like in the testing examples, before being mixed with the GT boxes?

Mountchicken · 2025-04-16T07:01:07Z

Hi @gongchenting
UPN is not trained together with the MLLM; it is trained separately and is only used to provide proposal boxes. The GT boxes are directly mixed with the UPN-generated boxes, and we remove any UPN boxes whose IoU with GT boxes is greater than 0.9.
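
The IoU-based removal described above could be sketched as follows. This is a minimal illustration rather than code from the ChatRex repository: it assumes boxes are `(x1, y1, x2, y2)` tuples, and the names `iou` and `drop_near_duplicates` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def drop_near_duplicates(upn_boxes, gt_boxes, iou_thresh=0.9):
    """Keep only UPN boxes whose IoU with every GT box is <= iou_thresh.

    Hypothetical helper: the 0.9 threshold comes from the reply above,
    the box format and function name are assumptions.
    """
    return [
        u for u in upn_boxes
        if all(iou(u, g) <= iou_thresh for g in gt_boxes)
    ]
```

With this sketch, a UPN box that almost exactly covers a GT box is dropped, while partially overlapping or disjoint proposals are kept alongside the GT boxes.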

gongchenting · 2025-04-16T08:08:26Z

Thank you for your reply. Here are a few more questions I would like to confirm, and I look forward to your further response.

After removing the UPN boxes whose IoU with the GT boxes is greater than 0.9, will the remaining UPN boxes undergo a second filtering based on the 0.3 threshold? Or will they be selected by score until the total number of boxes reaches 100?
When constructing the training inputs, will the combined GT and UPN boxes be shuffled before generating the corresponding object tokens?

Mountchicken · 2025-04-16T08:12:55Z

The process is as follows: given an image, we first pass it through the UPN and select boxes with a score greater than 0.3. These selected UPN boxes are then combined with the ground truth (GT) boxes. If the total number of boxes after merging exceeds 100, we remove some of the UPN boxes so that the final total (UPN + GT) is 100. If the total number of boxes is less than or equal to 100, we make no changes.
After merging the UPN boxes with the GT boxes, we randomly shuffle the combined boxes and update the object indices accordingly.
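
The steps above could be sketched roughly as below. This is an illustration under stated assumptions, not the ChatRex implementation: the function name `build_training_boxes` and its parameters are hypothetical, and the reply does not say which UPN boxes are removed when the total exceeds 100, so dropping the lowest-scoring ones is an assumption made here. The sketch also assumes GT boxes are never dropped.

```python
import random


def build_training_boxes(upn_boxes, upn_scores, gt_boxes,
                         score_thresh=0.3, max_boxes=100, rng=None):
    """Merge score-filtered UPN proposals with GT boxes, cap, and shuffle.

    Returns (boxes, gt_index_map), where gt_index_map[i] is the position
    of the i-th GT box in the shuffled list.
    """
    rng = rng or random.Random()

    # 1. Keep UPN proposals with a score greater than the threshold.
    kept = [(b, s) for b, s in zip(upn_boxes, upn_scores) if s > score_thresh]

    # 2. If UPN + GT would exceed max_boxes, remove UPN boxes only.
    #    Assumption: the lowest-scoring proposals are the ones removed.
    budget = max(0, max_boxes - len(gt_boxes))
    if len(kept) > budget:
        kept = sorted(kept, key=lambda bs: bs[1], reverse=True)[:budget]

    # 3. Merge, remembering each GT box's original index (None for UPN).
    merged = [(b, i) for i, b in enumerate(gt_boxes)]
    merged += [(b, None) for b, _ in kept]

    # 4. Shuffle, then update the object indices of the GT boxes.
    rng.shuffle(merged)
    boxes = [b for b, _ in merged]
    gt_index_map = {old: new for new, (_, old) in enumerate(merged)
                    if old is not None}
    return boxes, gt_index_map
```

The returned `gt_index_map` is one way to realize "update the object indices accordingly": any annotation that referred to GT box `i` would refer to position `gt_index_map[i]` after shuffling.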

gongchenting · 2025-04-16T08:21:48Z

Thank you very much! It was very helpful for me to understand ChatRex.
