ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning Paper • 2606.03503 • Published 28 days ago • 25
π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows Paper • 2605.14678 • Published May 19 • 108
HyperTree Planning: Enhancing LLM Reasoning via Hierarchical Thinking Paper • 2505.02322 • Published May 5, 2025 • 1
Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation Paper • 2509.14760 • Published Sep 18, 2025 • 53