⭐️ Simple GRPO

You can run GRPO (Group Relative Policy Optimization, the core algorithm behind DeepSeek R1) on 8B-parameter models using GPUs that cost about $10/h.

4xH100 is enough to train Llama 3.1 8B, and the algorithm works very well.
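
For readers new to GRPO, the "group relative" part can be illustrated in a few lines: several completions are sampled for one prompt, and each completion's reward is scored against the mean and spread of its own group, so no separate value model is needed. This is only a minimal sketch, not the code from the linked repository. The function name, the example rewards, and the choice of population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the other samples in its group.

    Hypothetical helper: `rewards` holds one scalar reward per completion
    sampled for the same prompt. `eps` avoids division by zero when every
    completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four completions of a single prompt
# (1.0 = correct answer, 0.0 = incorrect answer).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that beat their group's average get a positive advantage and are reinforced, and the rest get a negative one. If all completions in a group score the same, every advantage is zero and that prompt contributes no learning signal.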

Code: https://github.com/minosvasilias/simple_grpo

@data_analysis_ml

#gpro #deepseek #reasoning



group-telegram.com/data_analysis_ml/3156

BY Анализ данных (Data analysis)



