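Extending Dijkstra's Algorithm

We know that Dijkstra's algorithm is an efficient single-source shortest path (SSSP) algorithm, and this post will not go over its details again. But Dijkstra's algorithm is also a dynamic programming algorithm. Its correctness stems from several properties of graphs without negative edge weights. If a problem itself satisfies these properties, then even if it is not a graph-theoretic shortest-path problem, it can still be solved with Dijkstra's algorithm. So what are these properties?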



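Manacher's Algorithm for Counting Palindromes

Below, string indices start from $0$, and the substring notation $s[i..j]$ is closed on both ends.

Given a string $s$ of length $n$, Manacher's algorithm finds all palindromic substrings of $s$ in $O(n)$ time.

Let us first take odd-length substrings as an example. The first thing to make clear is that if the longest palindromic substring of $s$ centered at the $i$-th character has length $d=2p+1$, then all of the following are also palindromic substrings of $s$:

$$s[i-(p-1)..i+(p-1)],\ldots, s[i-1..i+1], s[i..i]$$

Therefore, we only need to find, for every index $i$, the length $2p_i+1$ of the longest palindromic substring centered at $s[i]$ in order to know all palindromic substrings of $s$.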
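The odd-length case can be sketched in Go as follows. This is a minimal illustration rather than the post's own implementation: the names `oddRadii` and `countOddPalindromes` are hypothetical, and the string is indexed byte by byte, which assumes single-byte characters.

```go
package main

import "fmt"

// oddRadii returns p, where p[i] is the largest radius such that
// s[i-p[i]..i+p[i]] is a palindrome (so its length is 2*p[i]+1).
// Hypothetical helper; indexes s by byte.
func oddRadii(s string) []int {
	n := len(s)
	p := make([]int, n)
	// [l, r] is the palindrome found so far whose right end is furthest right.
	l, r := 0, -1
	for i := 0; i < n; i++ {
		k := 0
		if i <= r {
			// i lies inside [l, r]; its mirror position is l+r-i.
			// Reuse the mirror's radius, capped so we stay inside [l, r].
			k = p[l+r-i]
			if k > r-i {
				k = r - i
			}
		}
		// Extend by direct comparison beyond what is already known.
		for i-k-1 >= 0 && i+k+1 < n && s[i-k-1] == s[i+k+1] {
			k++
		}
		p[i] = k
		if i+k > r {
			l, r = i-k, i+k
		}
	}
	return p
}

// countOddPalindromes counts odd-length palindromic substrings:
// center i contributes one palindrome per radius 0..p[i].
func countOddPalindromes(s string) int {
	total := 0
	for _, k := range oddRadii(s) {
		total += k + 1
	}
	return total
}

func main() {
	fmt.Println(oddRadii("ababa"))
	fmt.Println(countOddPalindromes("ababa"))
}
```

Each center contributes $p_i+1$ odd-length palindromes, which is exactly the list of nested substrings given above plus the longest one itself.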



Go Fact: Zero-sized Field at the Rear of a Struct Has Non-zero Size

Go has a concept called the zero-sized type (ZST): a type whose values take up zero bytes of memory. One of them is the famous struct{}, and people often use map[string]struct{} to emulate a set efficiently. Others include zero-length arrays such as [0]int, which, albeit not very common, are used to enforce certain properties of a custom type.
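As a small sketch of both ideas, the snippet below checks the sizes of two zero-sized types with `unsafe.Sizeof` and builds a set on top of `map[string]struct{}`. The `Set` type and the `zeroSizes` helper are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Set is a hypothetical string set backed by zero-sized map values.
type Set map[string]struct{}

// Add inserts k into the set.
func (s Set) Add(k string) { s[k] = struct{}{} }

// Has reports whether k is in the set.
func (s Set) Has(k string) bool {
	_, ok := s[k]
	return ok
}

// zeroSizes returns the sizes of struct{} and [0]int in bytes.
func zeroSizes() (uintptr, uintptr) {
	return unsafe.Sizeof(struct{}{}), unsafe.Sizeof([0]int{})
}

func main() {
	a, b := zeroSizes()
	fmt.Println(a, b) // both are 0

	s := Set{}
	s.Add("go")
	fmt.Println(s.Has("go"), s.Has("rust")) // true false
}
```

Because the map values occupy no memory, the set only pays for its keys and the map's own bookkeeping.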



Display *big.Rat Losslessly and Smartly in Golang

Floating-point numbers, as we know, are notorious for losing precision when their values become too large or too small. They are also bad at representing decimals accurately, causing confusion such as 0.1 + 0.2 != 0.3 for every beginner in Programming 101.

Imprecise as they are, floats are good enough for most everyday scenarios. For the rest, Go provides *big.Rat to the rescue. Rats are designed to represent rational numbers with arbitrary precision, addressing most flaws of floats, at the cost of much slower computation and a bigger memory footprint. For example, with Rats we can confidently compare 0.1 + 0.2 to 0.3 without worrying about tolerance.
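The comparison mentioned above might look like the following sketch. The helper names `ratSumEquals` and `floatSumEquals` are hypothetical. The float version takes typed float64 parameters on purpose, because Go evaluates untyped constant expressions such as `0.1 + 0.2 == 0.3` exactly at compile time.

```go
package main

import (
	"fmt"
	"math/big"
)

// ratSumEquals parses a, b and want as exact rationals and reports
// whether a + b == want. It returns false if any input fails to parse.
func ratSumEquals(a, b, want string) bool {
	x, ok1 := new(big.Rat).SetString(a)
	y, ok2 := new(big.Rat).SetString(b)
	z, ok3 := new(big.Rat).SetString(want)
	if !ok1 || !ok2 || !ok3 {
		return false
	}
	return new(big.Rat).Add(x, y).Cmp(z) == 0
}

// floatSumEquals does the same comparison with float64 arithmetic.
func floatSumEquals(a, b, want float64) bool {
	return a+b == want
}

func main() {
	fmt.Println(floatSumEquals(0.1, 0.2, 0.3))     // false
	fmt.Println(ratSumEquals("0.1", "0.2", "0.3")) // true
}
```

Parsing from decimal strings matters here: building a Rat from a float64 that already holds 0.1 would carry the float's rounding error into the exact representation.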



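Code Ceremony

In English-speaking communities we often come across the term coding ceremony, which can be rendered in Chinese as 代码的仪式. There is a Stack Overflow question titled What does “low ceremony” mean?, in which the author asked: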

In the Trac Main Features page https://trac.edgewall.org/wiki/TracFeatures, Trac is said to emphasize “ease of use and low ceremony”. Can someone please explain what “ceremony” means in the context of software usage?

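Low ceremony is listed side by side with ease of use, which shows that in the context of software development, code ceremony is not a term of praise: too much ceremony does no good. User Rowan Freeman gave this answer: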

Low ceremony means a small amount of code to achieve something. It means you don’t need to set up a lot of things in order to get going.

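As the answer says, code ceremony is the extra preparation needed to accomplish a piece of functionality. The less ceremony there is, the simpler the preparation and the easier the task.

Code ceremony is called ceremony because, like the dances of ancient sacrificial rites or the red tape of traditional celebrations, it contributes little to the goal and yet remains an indispensable step. Elaborate ceremony is long and tedious, but we still have to put up with the boredom and carry it out with painstaking care, not a detail out of place. This also explains why most people dislike code ceremony.

Different Kinds of Code Ceremony

By the form in which it appears, code ceremony falls into two categories: writing ceremony and running ceremony.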



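A Study of Chinese Kinship Terms, Part 1: Constructing a Semigroup

A while ago I came across an interesting question in a V2EX thread, /t/943948:

In the Chinese kinship terminology system, “father's grandfather” (爸爸的爷爷) and “grandfather's father” (爷爷的爸爸) are the same person; that is, the two terms “father” (爸爸) and “grandfather” (爷爷) commute. Can we then find a criterion that characterizes all such commuting pairs of terms?

To solve this problem, we can first model kinship relations using group theory, then analyze the resulting algebraic structure, and from there derive the conditions under which kinship relations commute. This series aims to lay out the theory in two posts. This is the first one, which introduces the construction of the kinship semigroup and the related properties of this algebraic structure.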



Diving from the CUDA Error 804 into a bug of libnvidia-container

Several users reported encountering "Error 804: forward compatibility was attempted on non supported HW" while using some customized PyTorch Docker images on our GPU cluster.

At first glance I took the culprit to be a version mismatch between the driver installed on the host and the driver required by the image. The broken images, as the users described them, were built targeting CUDA == 11.3 with a corresponding driver version == 465, while some of our hosts ship with driver version 460. As a solution I told them to downgrade the target CUDA version by choosing a base image such as nvidia/cuda:11.2.0-devel-ubuntu18.04, which indeed solved the problem.

But later on I began to doubt that this hypothesis was the real cause. One observed counterexample was that another line of Docker images targeting an even higher CUDA version ran normally on those hosts, for example the latest ghcr.io/pytorch/pytorch:2.0.0-devel built for CUDA == 11.7. This would not be the case if the CUDA version mismatch truly mattered.

Afterwards I did a bit of research on the problem and learnt some interesting things, which this post is going to share. In short, the recently released minor version compatibility allows applications built for a newer CUDA to run on machines with some older drivers, but libnvidia-container doesn’t handle it correctly due to a bug, which eventually leads to this error.

For a thorough understanding, this post first introduces the components that make up CUDA, then the compatibility policy among those components, and finally unravels the bug and devises a workaround for it. But before diving deep, I’ll give two Dockerfile samples to illustrate the problem.



Modern Cryptography, GPG and Integration with Git(hub)

GPG (the GNU Privacy Guard) is a complete and free implementation of the OpenPGP standard. With a range of mature algorithms to choose from, GPG serves as a convenient tool for everyday cryptographic communication.

GPG has two primary functions: (1) it encrypts and signs your data for secure transfer and verifiable integrity, and (2) it features a versatile key management system for building and promoting a web of trust. GPG also has a well-designed command line interface for easy integration with other applications such as git.

This article briefly explains some key concepts and usage of GPG, and then demonstrates how to cryptographically sign git commits with the help of GPG.
