By the way, the MOOC course Advanced LLM Agents is currently running in the Bay Area,
with lectures on YouTube that anyone can watch (just the way we like it: no sign-up, no SMS).

One of those lectures is today: "Learning to Self-Improve & Reason with LLMs", 4pm SF time, though you can also watch it later. They often start late.

Reposting.
Our 2nd lecture will be happening today @4:00pm PST! You can find the livestream here: https://www.youtube.com/live/_MNlLhU33H0

Today, our amazing guest speaker Jason Weston will be presenting, "Learning to Self-Improve & Reason with LLMs."

We describe some recent methods for LLMs whereby they can self-learn how to perform better at tasks relevant to human users, from reasoning or math tasks to creative tasks. In particular we describe the methods of Iterative DPO (https://arxiv.org/abs/2312.16682), Self-Rewarding LLMs (https://arxiv.org/abs/2401.10020), Iterative Reasoning Preference Optimization (https://arxiv.org/abs/2404.19733), Thinking LLMs (https://arxiv.org/abs/2410.10630), Meta-Rewarding LLMs (https://arxiv.org/abs/2407.19594), and more!



group-telegram.com/tatiwonderland/68
BY Tati's Wonderland



