Tati's Wonderland

By the way, the MOOC Advanced LLM Agents is currently running in the Bay Area,
with lectures on YouTube that anyone can watch (just the way we like it: no registration, no SMS).

One of those lectures is today: "Learning to Self-Improve & Reason with LLMs", 4pm SF time, though you can also watch it afterwards. They often start late.

Reposting the announcement.
Our 2nd lecture will be happening today @4:00pm PST! You can find the livestream here: https://www.youtube.com/live/_MNlLhU33H0.

Today, our amazing guest speaker Jason Weston will be presenting, "Learning to Self-Improve & Reason with LLMs."

We describe some recent methods for LLMs whereby they can self-learn how to perform better at tasks relevant to human users, from reasoning or math tasks to creative tasks. In particular we describe the methods of Iterative DPO (https://arxiv.org/abs/2312.16682), Self-Rewarding LLMs (https://arxiv.org/abs/2401.10020), Iterative Reasoning Preference Optimization (https://arxiv.org/abs/2404.19733), Thinking LLMs (https://arxiv.org/abs/2410.10630), Meta-Rewarding LLMs (https://arxiv.org/abs/2407.19594), and more!

CS 194/294-280 (Advanced LLM Agents) - Lecture 2, Jason Weston

1.2K views | Tanya, edited Feb 3 at 17:59

group-telegram.com/tatiwonderland/68

Create: 2025-02-03
Last Update: 2025-02-16 08:28:25
