Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback

2025-09-19 by AiNEWS2025

[Submitted on 26 May 2025 (v1), last revised 18 Sep 2025 (this version, v2)]

WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback, by Minda Hu and 9 other authors

Abstract: Web agents powered by Large Language Models (LLMs) show promise for next-generation AI, but their limited reasoning in uncertain, dynamic web environments hinders robust deployment. In this paper, we identify key reasoning skills essential for effective web agents, i.e., reflection & lookahead, branching, and rollback, and curate trajectory data that exemplifies these abilities by reconstructing the agent’s (inference-time) reasoning algorithms into chain-of-thought rationales. We conduct experiments in the agent self-improving benchmark, OpenWebVoyager, and demonstrate that distilling salient reasoning patterns into the backbone LLM via simple fine-tuning can substantially enhance its performance. Our approach yields significant improvements across multiple benchmarks, including WebVoyager, Mind2web-live, and SimpleQA (web search), highlighting the potential of targeted reasoning skill enhancement for web agents.

Submission history

From: Minda Hu
[v1] Mon, 26 May 2025 14:03:37 UTC (528 KB)
[v2] Thu, 18 Sep 2025 11:32:15 UTC (307 KB)
