⭐️ Simple GRPO

You can run GRPO (Group Relative Policy Optimization, the core algorithm behind DeepSeek R1) on 8B-parameter models using GPUs that cost about $10/hour.

4xH100 is enough to train Llama 3.1 8B, and the algorithm works very well.

Code: https://github.com/minosvasilias/simple_grpo
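
For readers new to GRPO, here is a minimal sketch of its "group relative" step: several completions are sampled for one prompt, and each completion's reward is normalized against the mean and standard deviation of its own group. This is an illustration only and is not taken from the linked repository; the function name, the `eps` constant, and the choice of population standard deviation are assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of completions sampled for ONE prompt.

    Hypothetical helper for illustration; not the repository's API.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std within the group
    # eps avoids division by zero when every completion got the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to the same prompt, scored 1.0 (correct) or 0.0 (wrong)
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Correct answers end up with positive advantages and wrong ones with negative advantages, so the policy update can favor the better completions in each group without a separate value model.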

@data_analysis_ml

#gpro #deepseek #reasoning

group-telegram.com/data_analysis_ml/3156

BY Анализ данных (Data analysis)