RL

Ai2's new Olmo3.1 extends reinforcement learning training for stronger reasoning benchmarks

December 12, 2025

The AI industry’s biggest week: Google’s rise, RL mania, and a party boat

December 10, 2025

Beyond math and coding: New RL framework helps train LLM agents for complex, real-world tasks

November 29, 2025

Prime Intellect debuts INTELLECT-3, an RL-trained 106B parameter open source MOE model it claims outperforms larger models across math, code, science, reasoning (Prime Intellect)

November 28, 2025

Meta’s DreamGym framework trains AI agents in a simulated world to cut reinforcement learning costs

November 20, 2025

Meta AI releases DreamGym, a textual experience synthesizer for reinforcement learning (RL) agents

November 17, 2025
