⭐️ Simple GRPO

You can run GRPO (Group Relative Policy Optimization, the core algorithm behind DeepSeek-R1) on 8B-parameter models using GPU hardware that costs $10/hour.

4xH100 is enough to train Llama 3.1 8B, and the algorithm works very well.
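
For readers new to GRPO, here is a minimal sketch of the "group relative" step: several completions are sampled for the same prompt, and each completion's reward is normalized against the mean and standard deviation of its own group. This is not taken from the linked repository. The function name, the `eps` constant, the use of the population standard deviation, and the example rewards are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    Hypothetical helper: `eps` guards against division by zero when every
    completion in the group received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four completions sampled from a single prompt
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's average get a positive advantage and the rest get a negative one, so no separate value model is needed to provide a baseline.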

Code: https://github.com/minosvasilias/simple_grpo

@data_analysis_ml

#grpo #deepseek #reasoning
group-telegram.com/data_analysis_ml/3156
BY Анализ данных (Data analysis)



