LLaMa-2: лучшая опенсорсная языковая модель (by Meta)
Авторы обновили обучающий датасет, сделав его чище и больше (2T токенов), добавили более быстрый grouped-query attention, удлинили контекст до 4k токенов и учили в несколько этапов: pretraining, supervised fine-tuning, RLHF.
Из интересных наблюдений — RL не просто портит калибровку вероятностей (что первыми заметили openAI), а на самом деле корректирует температуру, балансируя между фактологической точностью и креативностью, в зависимости от промпта.
LLaMa-2: лучшая опенсорсная языковая модель (by Meta)
Авторы обновили обучающий датасет, сделав его чище и больше (2T токенов), добавили более быстрый grouped-query attention, удлинили контекст до 4k токенов и учили в несколько этапов: pretraining, supervised fine-tuning, RLHF.
Из интересных наблюдений — RL не просто портит калибровку вероятностей (что первыми заметили openAI), а на самом деле корректирует температуру, балансируя между фактологической точностью и креативностью, в зависимости от промпта.
'Wild West' "We as Ukrainians believe that the truth is on our side, whether it's truth that you're proclaiming about the war and everything else, why would you want to hide it?," he said. What distinguishes the app from competitors is its use of what's known as channels: Public or private feeds of photos and videos that can be set up by one person or an organization. The channels have become popular with on-the-ground journalists, aid workers and Ukrainian President Volodymyr Zelenskyy, who broadcasts on a Telegram channel. The channels can be followed by an unlimited number of people. Unlike Facebook, Twitter and other popular social networks, there is no advertising on Telegram and the flow of information is not driven by an algorithm. On December 23rd, 2020, Pavel Durov posted to his channel that the company would need to start generating revenue. In early 2021, he added that any advertising on the platform would not use user data for targeting, and that it would be focused on “large one-to-many channels.” He pledged that ads would be “non-intrusive” and that most users would simply not notice any change. DFR Lab sent the image through Microsoft Azure's Face Verification program and found that it was "highly unlikely" that the person in the second photo was the same as the first woman. The fact-checker Logically AI also found the claim to be false. The woman, Olena Kurilo, was also captured in a video after the airstrike and shown to have the injuries.
from us