ads
sxcasf
AI & ML interests
None yet
Recent Activity
upvoted a paper about 1 month ago
Multi-Objective and Mixed-Reward Reinforcement Learning via Reward-Decorrelated Policy Optimization
commented on a paper about 2 months ago
Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation