Tati's Wonderland | Telegram Webview: tatiwonderland/68 -

Telegram Group & Telegram Channel

Tati's Wonderland

Кстати, сейчас в bay area проходит mooc курс Advanced LLM agents
с лекциями на youtube, которые могут смотреть все (как мы любим, без регистрации и смс).

Сегодня как раз одна такая лекция "Learning to Self-Improve & Reason with LLMs", 4pm SF time, но посмотреть можно и потом. Они часто начинают позднее.

Перепосчу.
Our 2nd lecture will be happening today @4:00pm PST! You can find the livestream here: https://www.youtube.com/live/_MNlLhU33H0.

Today, our amazing guest speaker Jason Weston will be presenting, "Learning to Self-Improve & Reason with LLMs."

We describe some recent methods for LLMs whereby they can self-learn how to perform better at tasks relevant to human users, from reasoning or math tasks to creative tasks. In particular we describe the methods of Iterative DPO (https://arxiv.org/abs/2312.16682), Self-Rewarding LLMs (https://arxiv.org/abs/2401.10020), Iterative Reasoning Preference Optimization (https://arxiv.org/abs/2404.19733), Thinking LLMs (https://arxiv.org/abs/2410.10630), Meta-Rewarding LLMs (https://arxiv.org/abs/2407.19594), and more!

CS 194/294-280 (Advanced LLM Agents) - Lecture 2, Jason Weston

www.group-telegram.com/jp/tatiwonderland.com/68

1.2K viewsTanya, edited Feb 3 at 17:59

group-telegram.com/tatiwonderland/68

Create: 2025-02-03
Last Update: 2025-02-16 08:35:49

Кстати, сейчас в bay area проходит mooc курс Advanced LLM agents
с лекциями на youtube, которые могут смотреть все (как мы любим, без регистрации и смс).

Сегодня как раз одна такая лекция "Learning to Self-Improve & Reason with LLMs", 4pm SF time, но посмотреть можно и потом. Они часто начинают позднее.

Перепосчу.
Our 2nd lecture will be happening today @4:00pm PST! You can find the livestream here: https://www.youtube.com/live/_MNlLhU33H0.

Today, our amazing guest speaker Jason Weston will be presenting, "Learning to Self-Improve & Reason with LLMs."

We describe some recent methods for LLMs whereby they can self-learn how to perform better at tasks relevant to human users, from reasoning or math tasks to creative tasks. In particular we describe the methods of Iterative DPO (https://arxiv.org/abs/2312.16682), Self-Rewarding LLMs (https://arxiv.org/abs/2401.10020), Iterative Reasoning Preference Optimization (https://arxiv.org/abs/2404.19733), Thinking LLMs (https://arxiv.org/abs/2410.10630), Meta-Rewarding LLMs (https://arxiv.org/abs/2407.19594), and more!

BY Tati's Wonderland

Share with your friend now:
group-telegram.com/tatiwonderland/68

Open in Telegram

Telegram | DID YOU KNOW?

Date: 2025-02-16|

'Wild West' "Russians are really disconnected from the reality of what happening to their country," Andrey said. "So Telegram has become essential for understanding what's going on to the Russian-speaking world." For Oleksandra Tsekhanovska, head of the Hybrid Warfare Analytical Group at the Kyiv-based Ukraine Crisis Media Center, the effects are both near- and far-reaching. "We're seeing really dramatic moves, and it's all really tied to Ukraine right now, and in a secondary way, in terms of interest rates," Octavio Marenzi, CEO of Opimas, told Yahoo Finance Live on Thursday. "This war in Ukraine is going to give the Fed the ammunition, the cover that it needs, to not raise interest rates too quickly. And I think Jay Powell is a very tepid sort of inflation fighter and he's not going to do as much as he needs to do to get that under control. And this seems like an excuse to kick the can further down the road still and not do too much too soon." Investors took profits on Friday while they could ahead of the weekend, explained Tom Essaye, founder of Sevens Report Research. Saturday and Sunday could easily bring unfortunate news on the war front—and traders would rather be able to sell any recent winnings at Friday’s earlier prices than wait for a potentially lower price at Monday’s open.
from jp

Telegram Tati's Wonderland
FROM American