Kali Novskaya | Telegram Webview: rybolos_channel/1344 -

🌸NeurIPS roundup: LLM papers 🌸
#nlp #про_nlp #nlp_papers

NeurIPS 2024, the largest machine learning conference, has wrapped up. Below is a short selection of the papers I found most interesting. Some of them definitely deserve a separate review.

Agents
🟣StreamBench: Towards Benchmarking Continuous Improvement of Language Agents arxiv
🟣SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering arxiv
🟣AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents arxiv
🟣DiscoveryWorld: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents arxiv

Benchmarks
🟣DevBench: A multimodal developmental benchmark for language learning arxiv
🟣CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark arxiv
🟣LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages arxiv
🟣CLUE - Cross-Linked Unified Embedding for cross-modality representation learning arxiv
🟣EmoBench: Evaluating the Emotional Intelligence of Large Language Models arxiv

LLM
🟣The PRISM Alignment dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models arxiv
🟣UniGen: A Unified Framework for Textual Dataset Generation via Large Language Models arxiv
🟣A Watermark for Black-Box Language Models arxiv

StreamBench: Towards Benchmarking Continuous Improvement of Language Agents

Recent works have shown that large language model (LLM) agents are able to improve themselves from experience, which is an important ability for continuous enhancement post-deployment. However,...

12.1K views, Dec 16 at 11:56

group-telegram.com/rybolos_channel/1344

Create: 2024-12-16
Last Update: 2025-01-11 15:53:17

BY Kali Novskaya
