Visual Agent RL Training

Attempted RL training on GUI tasks using AgentCPM-GUI; identified dataset limitations as the key bottleneck.

Current Work:

  • SFT and RL fine-tuning of Qwen2.5-VL with VERL for 3D reasoning tasks such as coordinate estimation.