Forwarded from MiaoTony's Box (MiaoTony 🐱)
Arch Linux: Recent news updates
Glibc 2.41 corrupting Discord installation
We plan to move glibc and its friends to stable later today, Feb 3. After installing the update, the Discord client will show a red warning that the installation is corrupt.
This issue has been fixed in the Discord canary build. If you rely on audio connectivity, please use the canary build, log in via browser, or use the flatpak version until the fix hits the stable Discord release.
There have been no reports that (written) chat connectivity is affected.
source
(author: Frederik Schwan)
Jiege's {Ops, Programming, Board Hacking} Notes (杰哥的{运维,编程,调板子}小笔记)
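Exporting a Feishu Calendar to iCalendar Format
Background
I used Feishu Calendar for a while and wanted to export its events as a backup. Feishu's built-in export turned out to be too limited, so I followed "从飞书导出日历到 Fastmail - Xuanwo's Blog" (Exporting a calendar from Feishu to Fastmail) to attempt an export.
Export method
The article above syncs the calendar over CalDAV, so my first step was also to set up Feishu's CalDAV service:
1. Open the Feishu client
2. Click Settings
3. Click Calendar
4. Set up CalDAV sync
Following the on-screen instructions to configure CalDAV sync gives you the CalDAV domain, username, and password. If all you need is a subscription, you can stop here and sync with any CalDAV client. I wanted to go one step further and get a calendar file in iCalendar format.
So I followed the approach described in the comments on that article: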
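@jason5ng32 (Oct 28, 2024): Sharing my approach:
1. Install vdirsyncer on a server. This tool syncs CalDAV content; in the sync settings there is no need to look up the UUID first, you can use the URL Feishu provides directly.
2. Write a Python script that merges what vdirsyncer synced into a single ics file.
3. Put the ics file in an HTTP directory with a somewhat hard-to-guess address so that it is reachable from outside.
4. Write a run.sh script and have crontab run the vdirsyncer sync and the calendar merge every 10 minutes.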
In other words: use vdirsyncer to sync the calendar to local storage, then convert it into an iCalendar calendar file. With the vdirsyncer documentation at hand, this is not complicated:
1. Install vdirsyncer:
    pip3 install vdirsyncer
2. Edit ~/.vdirsyncer/config and fill in the username and password obtained from Feishu:
    [general]
    status_path = "~/.vdirsyncer/status/"

    [pair my_contacts]
    a = "my_contacts_local"
    b = "my_contacts_remote"
    collections = ["from a", "from b"]

    [storage my_contacts_local]
    type = "filesystem"
    path = "~/.contacts/"
    fileext = ".vcf"

    [storage my_contacts_remote]
    type = "caldav"
    url = "https://caldav.feishu.cn"
    username = "REDACTED"
    password = "REDACTED"
3. Once configured, run the sync:
    vdirsyncer discover && vdirsyncer sync
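At this point the ~/.contacts directory already contains many vcf files, each corresponding to one event in the calendar. These files are in fact already in iCalendar format, except that each file holds only a single event. To get one .ics file containing every event in the calendar, I wrote a script that processes each vcf file, drops the BEGIN:VCALENDAR and END:VCALENDAR lines at its start and end, concatenates the middle parts, and finally adds the opening and closing lines back:
    import sys

    # Start the merged calendar with a single opening line.
    all_lines = ["BEGIN:VCALENDAR"]
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as f:
            content = f.read().strip()
        # Drop the file's own BEGIN:VCALENDAR / END:VCALENDAR wrapper lines.
        all_lines += content.splitlines()[1:-1]
    all_lines += ["END:VCALENDAR"]
    print("\n".join(all_lines))
Run the script:
    python3 dump.py ~/.contacts/*/*.vcf > dump.ics
The resulting .ics file can be imported directly into calendar software.
source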
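The script above keeps everything between the wrapper lines of each file, so per-file properties such as VERSION or PRODID may end up repeated in the merged output. As a hedged alternative, the sketch below copies only the VEVENT blocks and writes a single header. The function names and the PRODID value are made up for illustration, and it assumes every event sits in its own BEGIN:VEVENT ... END:VEVENT block. It also drops VTIMEZONE blocks, which may matter if events reference a TZID.

```python
def extract_components(ics_text, name="VEVENT"):
    """Return the BEGIN:<name> ... END:<name> blocks found in one iCalendar text."""
    blocks, current = [], None
    for line in ics_text.splitlines():
        if line.strip() == f"BEGIN:{name}":
            current = [line]          # start collecting a new block
        elif current is not None:
            current.append(line)
            if line.strip() == f"END:{name}":
                blocks.append("\r\n".join(current))
                current = None        # block finished
    return blocks


def merge_calendars(texts, prodid="-//example//merged//EN"):
    """Build one VCALENDAR holding the VEVENT blocks of all input texts.

    The prodid default is a placeholder, not a value used by any real tool.
    """
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", f"PRODID:{prodid}"]
    for text in texts:
        lines.extend(extract_components(text))
    lines.append("END:VCALENDAR")
    # iCalendar content lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"
```

Reading each file passed on the command line and printing `merge_calendars(...)` of their contents would give a drop-in variant of dump.py.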
Forwarded from 一个存在的世界 (Miao Wu)
https://ericrotenberg.wordpress.ncsu.edu/cbp2025/ 6th Championship Branch Prediction (CBP2025)
Eric Rotenberg
NC State ECE
Daniel Lemire's blog
Thread-safe memory copy
A common operation in software is the copy of a block of memory. In C/C++, we often call the function memcpy for this purpose.
But what happens if, while you are copying the data, another thread is modifying either the source or the destination? The result is fundamentally unpredictable and almost surely a programming error.
Why would you ever code a copy function in such a way given that it is an error? Suppose you are implementing a JavaScript engine in C++, like Google's V8. In JavaScript, we have SharedArrayBuffer instances that can be modified and copied from different threads. As the engineer working on the JavaScript engine, you cannot always prevent users from writing buggy code.
In any case, you get a data race: two or more threads access the same memory location simultaneously, where at least one of the accesses is a write operation, without a synchronization mechanism to ensure that these operations occur in a specific order.
What happens? The C++ standard states that a data race results in undefined behavior. In effect, the C++ language does not tell you what happens. A crash might occur. Of course, the JavaScript engineer would rather not see a crash.
Importantly, ‘undefined behavior’ also does not tell you that there is necessarily an error. Effectively, it tells you that, as the programmer, you acquire the additional responsibility to ensure that the code is safe. There is no warranty coming from the programming language itself.
Why do languages like C and C++ leave undefined behavior?
A good analogy is an organization with many sub-components, where new sub-components could be added at any time. Think of an interstellar federation of planets. The interstellar federation can specify overall laws that are well defined, but there will be remaining corner cases that are specific to which planet you reside in.
That’s the spirit of C and C++: these programming languages can target a very wide range of platforms. For some of these platforms, a data race is without consequence… for others, it could be highly problematic or just slow. Also, by not specifying the behavior, it allows the compiler designer some options. So the programming language leaves it up to you to check.
Consider a conflictual memory copy where you, for example, copy from array A to array B while another thread copies from array B to array A. Under most platforms, this will not cause a crash or anything especially dangerous. You might get garbage data in your arrays, in the worst case.
But if you use automated sanitizer tools, you may still get a warning regarding the data race, even when it is inconsequential. You can silence the warning by telling the tools that you have checked that the copy is safe.
Instead, you could roll your own ‘safe’ memory copy, where you load the content byte by byte (for example) in an atomic fashion. A possible solution in C++20 looks like so:
#include <atomic>
#include <cstddef>

// Copy byte by byte, using relaxed atomic loads and stores (C++20 std::atomic_ref).
void safe_memcpy(char *dest, const char *src, size_t count) {
  for (size_t i = 0; i < count; ++i) {
    char input =
        std::atomic_ref<const char>(src[i])
            .load(std::memory_order_relaxed);
    std::atomic_ref<char>(dest[i])
        .store(input, std::memory_order_relaxed);
  }
}
We have now done away with any kind of undefined behavior. The code ought to be perfectly ‘safe’: there is no more data race.
So why not always use this safe approach?
Because it can be 40 times slower than a conventional memory copy.
It becomes an engineering question. Sometimes performance really does not matter.
In programming, there is practically never a free lunch. It is common that you have to take your pick: aim for high performance but acquire more responsibilities, or sacrifice performance for the sake of having fewer worries.
source
Chips and Cheese
Intel’s Battlemage Architecture
#ChipAndCheese
Telegraph | source
(author: Chester Lam)
Telegraph
Intel’s Battlemage Architecture
Intel’s Alchemist architecture gave the company a foot in the door to the high performance graphics segment. The Arc A770 proved to be a competent first effort, able to run many games with credible performance. Now, Intel is passing the torch to a new graphics…
Forwarded from Phoronix
GNU Gold Linker Is Deprecated & Will Be Gone For Good Without New Developers
https://www.phoronix.com/news/GNU-Gold-Linker-Deprecated
Phoronix
GNU Gold Linker Is Deprecated & Will Be Gone For Good Without New Developers
With the recent GNU Binutils 2.44 release, one of the changes is worth calling out in its own article: the GNU Gold linker is now officially deprecated and is now being segregated to its own extra Binutils package but risks being removed altogether without…
Forwarded from 一个存在的世界 (Miao Wu)
This is a fun idea
Forwarded from tsuThoughts
Social Stockfish
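Like a chess analysis engine, it predicts the next 5 exchanges with the person you are talking to and tells you the best reply to give right now.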
https://fixvx.com/eddybuild/status/1889908182501433669
vxTwitter / fixvx
💖 12.96K 🔁 660
Eddy Xu (@eddybuild)
built an ai that sees 5 moves ahead in any conversation and tells you the optimal thing to say
Daniel Lemire's blog
AVX-512 gotcha: avoid compressing words to memory with AMD Zen 4 processors
The recent AMD processors (Zen 4) provide extensive support for the powerful AVX-512 instructions. AVX-512 (Advanced Vector Extensions 512) is an extension to the x86 instruction set architecture (ISA) introduced by Intel. These instructions enhance the capabilities of processors by allowing for more data to be processed in parallel. You can process registers made of 64 bytes!
One neat trick is that, given a mask, you can ‘compress’ words. Suppose that you have a vector made of thirty-two 16-bit words and you want to keep only the second and third ones: you can use the vpcompressw instruction with the mask 0b110. It will produce a register where the second and third words are placed in the first and second positions.
An even nicer trick is that you can use this instruction to write just these two words out to memory. You can invoke this functionality with the _mm512_mask_compressstoreu_epi16 function intrinsic (the 512-bit form; _mm_mask_compressstoreu_epi16 is the 128-bit counterpart).
This works well on recent Intel processors, but not so well on AMD Zen 4 processors.
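To make the compress semantics concrete, here is a small pure-Python model of the operation. It is only an illustration of what the instruction computes, not of how to call it: the function names are hypothetical, and the register form models the variant that zeroes the unused upper positions.

```python
def compress_words(words, mask):
    """Keep words[i] for every set bit i of mask, packed in order."""
    return [w for i, w in enumerate(words) if (mask >> i) & 1]


def compress_register(words, mask):
    """Register form: selected words move to the front, the rest are zeroed."""
    kept = compress_words(words, mask)
    return kept + [0] * (len(words) - len(kept))


def compress_store(memory, offset, words, mask):
    """Memory form: write only the selected words, contiguously, at offset.

    Returns the number of words written; memory past that point is untouched.
    """
    kept = compress_words(words, mask)
    if offset + len(kept) > len(memory):
        raise ValueError("not enough room in memory")
    memory[offset:offset + len(kept)] = kept
    return len(kept)


# Thirty-two 16-bit words; the mask 0b110 selects the second and third.
words = list(range(100, 132))
print(compress_register(words, 0b110)[:4])  # [101, 102, 0, 0]
```

The memory form is the one discussed here: with the mask 0b110 it writes exactly two words and leaves the rest of the destination alone.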
We have a fast function in the simdjson library to minify a file (remove unnecessary spaces).
https://github.com/simdjson/simdjson/pull/2335
source
Arch Linux: Recent news updates
Cleaning up old repositories
Around two years ago, we merged the [community] repository into [extra] as part of the git migration. In order to not break user setups, we kept these repositories around in an unused and empty state. We're going to clean up these old repositories on 2025-03-01.
On systems where /etc/pacman.conf still references the old [community] repository, pacman -Sy will return an error on trying to sync repository metadata.
The following deprecated repositories will be removed: [community], [community-testing], [testing], [testing-debug], [staging], [staging-debug].
Please make sure to remove all use of the aforementioned repositories from your /etc/pacman.conf (for which a .pacnew was shipped with pacman>=6.0.2-7)!
source
(author: Sven-Hendrik Haase)
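One way to check a configuration ahead of the cleanup is to scan it for section headers that name a deprecated repository. The following is a minimal sketch, not an official tool: the function name is hypothetical, and it assumes each repository section is declared on its own line as [name]. Commented-out headers such as #[testing] are ignored because they do not match the pattern.

```python
import re

# Repository names listed in the announcement.
DEPRECATED_REPOS = {
    "community", "community-testing",
    "testing", "testing-debug",
    "staging", "staging-debug",
}

# A section header is a line consisting only of "[name]".
SECTION_RE = re.compile(r"^\s*\[([^\]]+)\]\s*$")


def find_deprecated_repos(conf_text):
    """Return (line_number, repo_name) for each active deprecated section."""
    hits = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        match = SECTION_RE.match(line)
        if match and match.group(1) in DEPRECATED_REPOS:
            hits.append((lineno, match.group(1)))
    return hits


sample = """[options]
HoldPkg = pacman glibc

[core]
Include = /etc/pacman.d/mirrorlist

#[testing]
#Include = /etc/pacman.d/mirrorlist

[community]
Include = /etc/pacman.d/mirrorlist
"""
print(find_deprecated_repos(sample))  # [(10, 'community')]
```

To check a real system, pass the contents of /etc/pacman.conf to `find_deprecated_repos` and remove each reported section together with its Include or Server lines.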
Forwarded from Yingchi Long
vllm can also be llvm
Forwarded from lycwww
If you want to become moe, you first have to study MoE
Forwarded from Fugoes In Mirror
Chips and Cheese
Zen 5's AVX-512 Frequency Behavior
#ChipAndCheese
Telegraph | source
(author: Chester Lam)
Telegraph
Zen 5's AVX-512 Frequency Behavior
Zen 5 is AMD's first core to use full-width AVX-512 datapaths. Its vector execution units are 512 bits wide, and its L1 data cache can service two 512-bit loads per cycle. Intel went straight to 512-bit datapaths with Skylake-X back in 2017, and used fixed…