The Shape of Learning: Intrinsic Dimensions in Transformer-Based Models

A preprint of our new paper! It turns out that language models "pack" their representations into a very small space with an intrinsic dimension of no more than 60. At the same time, anisotropy in the middle layers of transformer decoders approaches one! In other words, embeddings from the middle of the model lie almost along a single line.
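To make "anisotropy approaches one" concrete, here is a minimal sketch. It assumes one common definition of anisotropy: the share of total variance carried by the top singular direction of the centered embedding matrix. The function name and the toy data are illustrative and not taken from the paper's code.

```python
import numpy as np


def anisotropy(embeddings: np.ndarray) -> float:
    """Share of variance along the top singular direction.

    Assumed definition: sigma_1^2 / sum_i sigma_i^2 for the centered
    (n_samples, n_features) embedding matrix. A value near 1 means the
    points lie almost on a single line.
    """
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    energy = singular_values ** 2
    return float(energy[0] / energy.sum())


rng = np.random.default_rng(0)

# Toy stand-in for isotropic embeddings: no preferred direction.
isotropic = rng.normal(size=(2000, 64))

# Toy stand-in for "line-like" embeddings: one dominant direction plus small noise.
direction = rng.normal(size=(1, 64))
line_like = 10.0 * rng.normal(size=(2000, 1)) * direction
line_like += 0.1 * rng.normal(size=(2000, 64))

print("isotropic:", anisotropy(isotropic))
print("line-like:", anisotropy(line_like))
```

On real models the input would be hidden states from a given layer, stacked into one matrix; the toy arrays above only show the two extremes.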

Another interesting observation: LLM training splits into two phases, an expansion of the activations followed by their compression (see the figure). And just before loss spikes, their dimension grows slightly.
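Tracking how the dimension of activations changes across checkpoints needs an intrinsic dimension estimator. Below is a rough sketch of one well-known option, the TwoNN maximum-likelihood estimator based on the ratio of distances to the two nearest neighbours. This is an assumption chosen for illustration, not necessarily the estimator used in the paper, and `two_nn_dimension` is a hypothetical helper name.

```python
import numpy as np


def two_nn_dimension(points: np.ndarray) -> float:
    """TwoNN-style intrinsic dimension estimate (illustrative sketch).

    For each point, take mu = r2 / r1, the ratio of distances to its
    second and first nearest neighbours. The maximum-likelihood estimate
    is d = N / sum(log mu). Assumes there are no duplicate points.
    """
    sq_norms = np.sum(points ** 2, axis=1)
    # Pairwise squared distances via the Gram matrix.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * points @ points.T
    np.maximum(sq_dists, 0.0, out=sq_dists)   # clip tiny negative round-off
    np.fill_diagonal(sq_dists, np.inf)        # exclude self-distances
    # After partitioning, columns 0 and 1 hold the two smallest values in order.
    nearest = np.partition(sq_dists, 1, axis=1)[:, :2]
    log_mu = 0.5 * np.log(nearest[:, 1] / nearest[:, 0])
    return float(len(points) / log_mu.sum())


rng = np.random.default_rng(0)

# Toy data: a 5-dimensional Gaussian cloud embedded isometrically in 50 dimensions.
latent = rng.normal(size=(1000, 5))
basis, _ = np.linalg.qr(rng.normal(size=(50, 5)))  # orthonormal columns
low_dim = latent @ basis.T

# Toy data: a cloud that fills all 50 dimensions.
full_dim = rng.normal(size=(1000, 50))

print("embedded 5-d cloud:", two_nn_dimension(low_dim))
print("full 50-d cloud:  ", two_nn_dimension(full_dim))
```

Applied to activations saved at successive training checkpoints, an estimator like this would give the kind of dimension-versus-step curve the post describes; with few samples it tends to underestimate high dimensions, so only relative changes should be trusted.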

UPD: accepted to EACL 🎉

Paper



group-telegram.com/abstractDL/250
BY AbstractDL



