group-telegram.com/tatiwonderland/68
By the way, the MOOC "Advanced LLM Agents" is currently running in the Bay Area,
with lectures on YouTube that anyone can watch (just the way we like it: no registration, no SMS).
One of those lectures, "Learning to Self-Improve & Reason with LLMs", is today at 4pm SF time, but you can also watch it later. They often start later than scheduled.
Reposting.
Our 2nd lecture will be happening today at 4:00pm PST! You can find the livestream here: https://www.youtube.com/live/_MNlLhU33H0.
Today, our amazing guest speaker Jason Weston will be presenting, "Learning to Self-Improve & Reason with LLMs."
We describe some recent methods for LLMs whereby they can self-learn how to perform better at tasks relevant to human users, from reasoning or math tasks to creative tasks. In particular we describe the methods of Iterative DPO (https://arxiv.org/abs/2312.16682), Self-Rewarding LLMs (https://arxiv.org/abs/2401.10020), Iterative Reasoning Preference Optimization (https://arxiv.org/abs/2404.19733), Thinking LLMs (https://arxiv.org/abs/2410.10630), Meta-Rewarding LLMs (https://arxiv.org/abs/2407.19594), and more!
BY Tati's Wonderland