Visual Agent RL Training
Attempted RL training on GUI tasks using AgentCPM-GUI; identified dataset limitations as the key bottleneck.
Current Work:
- SFT + RL fine-tuning of Qwen2.5-VL with VERL for 3D reasoning tasks such as coordinate estimation.