Ai2's new Olmo 3.1 extends reinforcement learning training for stronger reasoning benchmarks


The Allen Institute for AI (Ai2) recently released what it calls its most powerful family of models yet, Olmo 3. But Ai2 kept iterating on the models, extending its reinforcement learning (RL) runs to create Olmo 3.1.

The new Olmo 3.1 models focus on efficiency, transparency, and control for enterprises. Ai2 updated two of the three versions of Olmo 3: Olmo 3.1 Think 32B, the flagship model optimized for advanced research, and Olmo 3.1 Instruct 32B, designed for instruction-following, multi-turn dialogue, and tool use. Olmo 3 has a third version, Olmo 3-Base, for programming, comprehension, and math. It also works well for continued fine-tuning.

Ai2 said that to upgrade Olmo 3 Think 32B to Olmo 3.1, its researchers extended its best RL run with a longer training schedule. “After the original Olmo 3 launch, we resumed our RL training run for Olmo 3 32B Think, training for an additional 21 days on 224 GPUs with extra epochs over our Dolci-Think-RL dataset,” Ai2 said in a blog post.
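
For a rough sense of scale, the figures in that quote can be converted into GPU-days and GPU-hours. The sketch below assumes all 224 GPUs ran continuously for the full 21 days, which the quote does not state, so the result is best read as an upper bound. The function name `gpu_time` is illustrative and not taken from Ai2.

```python
def gpu_time(days: int, gpus: int) -> tuple[int, int]:
    """Return (GPU-days, GPU-hours) for a run.

    Assumption: every GPU is busy for the whole period,
    so this is an upper bound on compute actually used.
    """
    gpu_days = days * gpus
    gpu_hours = gpu_days * 24
    return gpu_days, gpu_hours


# Figures quoted by Ai2 for the extended Olmo 3.1 Think 32B RL run.
extra_gpu_days, extra_gpu_hours = gpu_time(days=21, gpus=224)
print(f"{extra_gpu_days:,} GPU-days, {extra_gpu_hours:,} GPU-hours")
```

Under that assumption, the extended run comes to at most 4,704 GPU-days, or 112,896 GPU-hours, on top of the original Olmo 3 training.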
