Telegram Group & Telegram Channel
Forwarded from Garyの梦呓
DeepScaleR-1.5B-Preview

DeepScaleR-1.5B-Preview is an LLM fine-tuned with RL on top of DeepSeek-R1-Distill-Qwen-1.5B, using only 3,800 A100 GPU-hours (~$4,500).

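The model reaches 43.1% Pass@1 accuracy on AIME 2024, about 14 percentage points above the base model (28.8%), and surpasses o1-preview with only 1.5B parameters.
(In Arena Math: R1 > Gemini 2 Thinking > o1-preview > Gemini 2 Pro)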
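The numbers in the post can be sanity-checked with a few lines of arithmetic. This is only a rough sketch: the constants are the figures quoted above, while the function names are made up for illustration and are not part of the DeepScaleR codebase. It separates the absolute gain (percentage points) from the relative gain (percent of the base score), which are easy to mix up.

```python
# Figures quoted in the post; everything else here is illustrative.
GPU_HOURS = 3800      # A100 GPU-hours used for RL fine-tuning
COST_USD = 4500       # approximate total cost in USD
BASE_PASS1 = 28.8     # base model Pass@1 on AIME 2024, in percent
TUNED_PASS1 = 43.1    # DeepScaleR Pass@1 on AIME 2024, in percent


def implied_hourly_rate(cost_usd: float, gpu_hours: float) -> float:
    """USD per A100 GPU-hour implied by the quoted totals."""
    return cost_usd / gpu_hours


def absolute_gain(base: float, tuned: float) -> float:
    """Improvement in percentage points."""
    return tuned - base


def relative_gain(base: float, tuned: float) -> float:
    """Improvement as a percentage of the base score."""
    return (tuned - base) / base * 100


if __name__ == "__main__":
    print(f"Implied rate: ${implied_hourly_rate(COST_USD, GPU_HOURS):.2f} per A100-hour")
    print(f"Absolute gain: {absolute_gain(BASE_PASS1, TUNED_PASS1):.1f} percentage points")
    print(f"Relative gain: {relative_gain(BASE_PASS1, TUNED_PASS1):.1f}% over the base model")
```

The "14%" in the post corresponds to the absolute gain; relative to the base score the improvement is considerably larger.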

Open-sourced dataset, code, training logs, and models
GitHub: github.com/agentica-project/deepscaler
Inference GGUF
#AI



group-telegram.com/Laoself/10333

BY Laoself 🙂‍↕️





