分析譛滄俣: 2025年08月06日 | 逕滓・: 2025年08月06日 23:00
邱剰ィ・strong>5莉カたョ鬮伜刀雉ェ險倅コ九r分析た励∝スア髻ソ蠎ヲ繧定ゥ穂セ。
蟷ウ蝮・う繝ウ繝代け繝医せ繧ウア: 84.0/100
Recent releases focus on specialized models; slide with timelines and benchmarks.
New benchmarks address long-context and agents; include radar charts.
隕∫エ・ NASA & IBM released Surya, the first open-source AI foundation model for heliophysics on Hugging Face.
蜃コ蜈ク: @ClementDelangue (2025-08-20 23:55:31 JST)
隕∫エ・ IBM & NASA released Surya 1.0, an open-source foundation model for heliophysics to forecast space weather.
蜃コ蜈ク: @rohanpaul_ai (2025-08-22 14:04:02 JST)
隕∫エ・ ByteDance released Seed-OSS 36B LLM on Hugging Face, with strong long-context and reasoning capabilities.
蜃コ蜈ク: @HuggingPapers (2025-08-21 01:38:52 JST)
讎りヲ・ NASA & IBMた碁幕逋コた励◆螟ェ髯ス迚ゥ逅・ュヲ蛻・㍽蛻昴・オ繝シ繝励Φソース蝓コ逶、繝「テΝ
実用諤ァ: 莠コ蟾・陦帶弌繝サ髮サ蜉帙う繝ウ繝輔Λたョ菫晁ュキたォ逶エ邨舌☆繧矩㍾隕∵橿陦・/p>
信頼諤ァ: NASA/IBM たョ蜈ャ蠑城」謳コ繝励Ο繧ク繧ァ繧ッ繝医・ォ倥>信頼諤ァ
諢冗セゥ: 螳・ョ吝、ゥ豌嶺コ亥アたョ邊セ蠎ヲ蜷台ク翫↓繧医j縲∫樟莉」遉セ莨壹・イ繝ウ繝輔Λ菫晁ュキた悟、ァ蟷・↓蜷台ク・/p>
讎りヲ・ LLM蜃コ蜉帙rPytest繝ゥイ繧ッたォテせ繝医〒た阪kオ繝シ繝励Φソース評価フレームワーク
謚陦楢ゥ穂セ。: LLM評価テ・繝ォたォた翫¢繧鬼OTA縲∵悽譬シ驕狗畑たォ驕ゥた励◆実用諤ァ
繧ウ繝溘Ηニユ繧」: GitHub 10.3K スタ繝シ迯イ蠕励∵エサ逋コたェ髢狗匱邯咏カ壻クュ
ス繝斐・繧ォ繝シ繝弱・テ 髢狗匱閠・↓たィた」たヲたョLLMテせ繝医・邁。譏捺ァ繧貞シキ隱ソ
This month saw a surge in open-source LLM releases, emphasizing hybrid architectures and specialized applications like space weather forecasting, signaling a shift toward efficient, domain-specific models. Benchmarks evolved to address real-world agent performance and long-context reasoning, highlighting gaps in current evaluations and pushing for more dynamic assessments.
隕∵ア・ "・代°譛亥・たョAIニΗ繝シスたョ豈取律たョた倥gい⊇いr繧上◆す・たァた昴l繧偵・繝シスたォHTMLたョス繝ゥイ繝峨r豈取律譖エ譁ーた励※た上□た輔>"
実装 讒矩蛹悶ョ繝シタた九iたョ閾ェ蜍菱TMLス繝ゥイ繝臥函謌舌すステΒ螳梧・
🔧 驕狗畑貅門y螳御コ・ 蜈・蜉帙ョ繝シタたョ蠖「蠑上↓蜷医oた帙※縲∵律谺。繝サ譛域ャ。たョ閾ェ蜍輔せ繝ゥイ繝臥函謌舌′蜿ッ閭スたァい/p>