# Diving from the CUDA Error 804 into a bug of libnvidia-container

Several users reported encountering "Error 804: forward compatibility was attempted on non supported HW" when using some customized PyTorch Docker images on our GPU cluster.

At first glance I took the culprit to be a version mismatch between the driver installed on the host and the driver required by the image. The problematic images, as the users described, were built targeting CUDA == 11.3 with a corresponding driver version == 465, while some of our hosts ship with driver version 460. As a solution I told them to downgrade the target CUDA version by choosing a base image such as nvidia/cuda:11.2.0-devel-ubuntu18.04, which indeed solved the problem.

But later on I began to doubt that the above hypothesis was the real cause. A counterexample I observed was that another line of Docker images targeting an even higher CUDA version ran normally on those hosts, for example the latest ghcr.io/pytorch/pytorch:2.0.0-devel built for CUDA == 11.7. This would not happen if the CUDA version mismatch truly mattered.

Afterwards I did some research on the problem and learned some interesting things, which this post is going to share. In short, the recently released minor version compatibility allows applications built for a newer CUDA to run on machines with certain older drivers, but libnvidia-container doesn't handle it correctly due to a bug, which eventually leads to this error.

For a thorough understanding, this post will first introduce the components of CUDA, then the compatibility policies between those components, and finally unravel the bug and devise a workaround for it. But before diving deep, I'll give two Dockerfile samples to illustrate the problem.

# Reproduction Samples

The host reported as problematic has 8x GeForce RTX 3090 with driver version 460.67 and CUDA 11.2. Here is an image with torch == 1.12.1 built for CUDA 11.3 that fails on the host:
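
The original Dockerfile is not reproduced verbatim here; a minimal equivalent along these lines triggers the error (the pip wheel index URL is the standard PyTorch one, and the exact package list is an assumption on my part):

```dockerfile
FROM nvidia/cuda:11.3.0-cudnn8-devel-ubuntu20.04
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 python3-pip && rm -rf /var/lib/apt/lists/*
RUN pip3 install torch==1.12.1+cu113 \
        --extra-index-url https://download.pytorch.org/whl/cu113
# Touching the GPU fails on the host with Error 804
CMD ["python3", "-c", "import torch; print(torch.zeros(1).cuda())"]
```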

By contrast, below is an image with torch == 2.0.0 built for CUDA 11.7 that runs normally:
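
A sketch of the corresponding Dockerfile, assuming the upstream image already bundles torch 2.0.0 built for CUDA 11.7:

```dockerfile
FROM ghcr.io/pytorch/pytorch:2.0.0-devel
# Runs normally on the same host
CMD ["python", "-c", "import torch; print(torch.zeros(1).cuda())"]
```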

For convenience I also wrote a Makefile to combine the process of building and running either image:
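
Mine looked roughly like this (the image tags and Dockerfile names are my own choices):

```makefile
good:
	docker build -t good -f Dockerfile.good .
	docker run --rm --gpus all good

bad:
	docker build -t bad -f Dockerfile.bad .
	docker run --rm --gpus all bad

.PHONY: good bad
```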

With the Makefile you can run make good or make bad to see the respective results:

We start our tour with the components of CUDA.

# Components of CUDA

When talking about the term "CUDA", two concepts, the "CUDA Toolkit" and the "NVIDIA Display Driver", are often mixed up. This figure illustrates their distinction as well as their cascading relationship:

The driver, at the low level, bridges the communication between software and the underlying NVIDIA hardware. The toolkit lies at a higher level and provides conveniences for GPU programming.

If we take a closer look at the driver, we can see it decomposes into two components: the user-mode driver, or UMD (libcuda.so), and the kernel-mode driver, or KMD (nvidia.ko). The KMD runs in the OS kernel and has the most intimate contact with the hardware, while the UMD provides an API abstraction for communicating with the kernel driver.

Generally, applications compiled by the CUDA toolkit dynamically search for and link libcuda.so during startup, which under the hood dispatches user requests to the kernel, as illustrated below:

So far so good, as long as the compiler in the toolkit agrees on the API with the targeted driver.

Sadly, that is not the norm. In the real world, developers compile programs on one machine and ship them to run on others, expecting that programs compiled by a specific version of the CUDA toolkit will run on a wide variety of hardware; otherwise users would complain about broken binaries.

To provide this guarantee, several compatibility policies are introduced.

# CUDA Compatibility Policies

Before introducing the policies, we should know how the components are versioned. The CUDA toolkit and the drivers adopt different version schemes: the toolkit is versioned like 11.2 and drivers like 460.65. Therefore, "driver 460.65" refers to the version of libcuda.so and nvidia.ko; similarly, when somebody says "CUDA 11.2", it's the toolkit version being mentioned.

NVIDIA devises multiple rules to ensure user binaries work on a wide range of driver-hardware combinations. They can be grouped into two categories, i.e., toolkit-driver compatibility and UMD-KMD compatibility.

## Toolkit-driver compatibility

These policies define which driver versions a binary compiled by a specific CUDA toolkit can run on.

Basically we have "Backward Compatibility". Each CUDA toolkit has a so-called toolkit driver version. Binaries compiled by that toolkit are guaranteed to run on drivers no older than the toolkit driver version. For example, the toolkit driver version of CUDA 11.2 is 460.27.03, which means binaries compiled by CUDA 11.2 should work on any driver >= 460.27.03. This is the most fundamental and long-standing policy.

From CUDA 11 onwards, another policy named "Minor Version Compatibility" was introduced. This policy allows binaries compiled by toolkits with the same major version to share the same driver version requirement. For example, binaries compiled by CUDA 11.0 work on driver >= 450.36.06. Since CUDA 11.2 has the same major version as CUDA 11.0, binaries compiled by CUDA 11.2 also work on driver >= 450.36.06.

The backward compatibility ensures compiled binaries work on machines shipped with drivers of future versions, while the minor version compatibility reduces the need to upgrade drivers to run newly compiled binaries. Generally, a binary compiled by CUDA toolkit $X.Y$ should work with a driver of version $M$ if either of the following holds:

1. CUDA toolkit $X.Y$ has toolkit driver version $N$ and $M \geq N$;
2. $X \geq 11$ and a CUDA toolkit $X.Y_2$ has toolkit driver version $N_2$ and $M \geq N_2$.

However, the above policies only consider the relationship between the CUDA toolkit and the drivers. What if the user-mode and kernel-mode drivers have diverged versions? This is where UMD-KMD compatibility applies.

## UMD-KMD compatibility

In the ideal case, a kernel-mode driver always works with the user-mode driver of the same version. But upgrading kernel-mode drivers is sometimes tricky and troublesome, a risk some users, such as data center admins, cannot take. With this in mind, NVIDIA devised "Forward Compatibility" to allow an old KMD to cooperate with a newer UMD under certain circumstances.

Specifically, a kernel-mode driver supports all user-mode drivers released during its lifetime. For instance, driver 418.x reaches end of life (EOL) in March 2022; driver 460.x was released before that, so KMD 418.x works with UMD 460.x. This compatibility does not involve anything at a higher level such as the CUDA toolkit.

It's worth noting that this policy does not apply to all GPUs, only a fraction of them. NVIDIA has limited forward compatibility to systems with NVIDIA Data Center GPUs (the Tesla branch) or NGC Server Ready SKUs of RTX cards. If you own a GeForce RTX 3090, as in my scenario, you don't enjoy this feature.

## Summary of Compatibility

Let's quickly review the various compatibility policies. If you have a binary compiled by CUDA $X.Y$, and a host with UMD (libcuda.so) version $M$ and KMD (nvidia.ko) version $M'$, then they work together if both of the following conditions hold:

1. The UMD and KMD are compatible. Specifically, either
1. the GPU supports forward compatibility (Tesla branch or NGC Server Ready), and driver $M$ was released before the EOL of driver $M'$ (the forward compatibility); or
2. $M = M'$.
2. The CUDA toolkit and UMD are compatible. Specifically, either
1. CUDA toolkit $X.Y$ has toolkit driver version $N$ and $M \geq N$ (the backward compatibility); or
2. major version $X \geq 11$ and there exists another toolkit $X.Y_2$ with toolkit driver version $N_2$ and $M \geq N_2$ (the minor version compatibility).
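
To make the two conditions concrete, here is a sketch in Python. The toolkit-driver table is illustrative and incomplete (versions taken from the examples in this post), and the EOL bookkeeping of forward compatibility is elided:

```python
# Illustrative, incomplete toolkit driver versions (from this post's examples).
TOOLKIT_DRIVER_VERSION = {
    (11, 0): (450, 36, 6),   # CUDA 11.0 -> driver 450.36.06
    (11, 2): (460, 27, 3),   # CUDA 11.2 -> driver 460.27.03
    (11, 3): (465, 19, 1),   # CUDA 11.3 -> driver 465.19.01
}

def toolkit_umd_compatible(major, minor, umd):
    """Condition 2: backward compatibility or minor version compatibility."""
    n = TOOLKIT_DRIVER_VERSION.get((major, minor))
    if n is not None and umd >= n:          # 2.1 backward compatibility
        return True
    if major >= 11:                         # 2.2 minor version compatibility
        return any(umd >= n2
                   for (x, _), n2 in TOOLKIT_DRIVER_VERSION.items()
                   if x == major)
    return False

def umd_kmd_compatible(umd, kmd, forward_compat_gpu=False):
    """Condition 1: same version, or forward compatibility on supported GPUs."""
    if umd == kmd:                              # 1.2 identical versions
        return True
    return forward_compat_gpu and umd >= kmd    # 1.1 (EOL check elided)

def binary_runs(major, minor, umd, kmd, forward_compat_gpu=False):
    return (umd_kmd_compatible(umd, kmd, forward_compat_gpu)
            and toolkit_umd_compatible(major, minor, umd))

# torch built for CUDA 11.3 on driver 460.67: fine via minor version compat.
print(binary_runs(11, 3, (460, 67), (460, 67)))      # True
# ...but not when a 465.19.01 UMD is forced onto a 460.67 KMD (RTX 3090,
# so no forward compatibility).
print(binary_runs(11, 3, (465, 19, 1), (460, 67)))   # False
```

The second call is exactly the broken situation this post analyzes below.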

Generally, checking the above conditions should help whenever you run into any compatibility problem.

# Back to Our Problem

So, what's wrong with the Docker image bad? With the above rules in hand we can perform a simple analysis.

Could it be a toolkit-driver incompatibility? Probably not. According to Table 1 here, minor version compatibility applies with CUDA 11.x and driver >= 450.80.02, which our driver version 460 satisfies; besides, a binary compiled by CUDA 11.7 works like a charm in the case of the image good.

It should therefore be a UMD-KMD incompatibility, namely, the versions of libcuda.so and nvidia.ko are incompatible. Since forward compatibility is not applicable to the RTX 3090, we expect condition 1.2 to hold, where libcuda.so and nvidia.ko have the same version, and this obviously was not the case.

## How does the NVIDIA driver work with Docker?

A process in a container is technically just a special process on the host, and it interacts with GPU drivers through the same model as other processes. Since the KMD runs in the kernel and is not affected by user space, all programs, whether on the host or in containers, communicate with the same KMD.

By contrast, a program can flexibly choose which user-mode driver to link against. It can either link to the UMD installed along with the KMD on the host, or bring its own UMD during packaging and distribution.

We can list all the UMDs in a running good container with the command:
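
The exact command is not preserved above; one way to do it (the find invocation and the image tag good are my own choices), with output reconstructed from what I observed:

```shell
$ docker run --rm --gpus all good sh -c 'find / -name "libcuda.so*" 2>/dev/null'
/usr/lib/x86_64-linux-gnu/libcuda.so
/usr/lib/x86_64-linux-gnu/libcuda.so.1
/usr/lib/x86_64-linux-gnu/libcuda.so.460.67
```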

It looks like there is only one copy of libcuda.so, lying in /usr/lib/x86_64-linux-gnu/ with version 460.67. However, this libcuda.so was not packed into the Docker image in the first place. The library disappears if you omit the --gpus argument:

In fact, the library exists on the host and is injected into the container by the Docker runtime during startup. This post demonstrates the injection process by viewing Docker's log. Mounting libcuda.so from the host maximally ensures that the KMD and UMD stay aligned.

Given that the Docker runtime chooses the native UMD, why did the image bad fail?

## The internals of image bad

We can likewise check the UMDs in a running bad container as below:
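
Again the exact listing is not preserved; with the same find invocation the result looked roughly like this (reconstructed; paths and version numbers as observed):

```shell
$ docker run --rm --gpus all bad sh -c 'find / -name "libcuda.so*" 2>/dev/null'
/usr/local/cuda-11.3/compat/libcuda.so.465.19.01
/usr/lib/x86_64-linux-gnu/libcuda.so
/usr/lib/x86_64-linux-gnu/libcuda.so.1
/usr/lib/x86_64-linux-gnu/libcuda.so.460.67
/usr/lib/x86_64-linux-gnu/libcuda.so.465.19.01
```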

OOPS!!! There is a big difference here. We can derive two observations from the result:

1. There is already a libcuda.so bundled inside the image at /usr/local/cuda-11.3/compat/libcuda.so.465.19.01, with the higher version 465.19.01.
2. During startup, both the native libcuda.so.460.67 and the bundled libcuda.so.465.19.01 are symlinked under /usr/lib/x86_64-linux-gnu/, and most importantly, it's the bundled one that gets linked as libcuda.so and chosen by the program.

And that is why the Docker image bad violates UMD-KMD compatibility!

# The bug of libnvidia-container

This misbehavior is the consequence of a bug in libnvidia-container. But before we talk about it, let's take a step back and see what the directory /usr/local/cuda-X/compat does and why it exists.

The compat directory is part of the CUDA compat package, which, according to the official docs, exists to support forward compatibility. The official base image nvidia/cuda:11.3.0-cudnn8-devel-ubuntu20.04 has this package built in; it contains a higher-version UMD, libcuda.so.465.19.01, in case an older KMD runs on the host. As mentioned earlier, applying forward compatibility imposes requirements on the underlying hardware. When those requirements are not satisfied, as with our RTX 3090 GPUs, the libcuda.so from the compat package should not be linked against.

Unfortunately, the current release of nvidia-docker blindly attempts to apply forward compatibility, regardless of whether the GPUs meet the restriction.

The problem was encountered and studied by Gemfield, who posted an article, "PyTorch's CUDA error: Error 804: forward compatibility was attempted on non supported HW" (originally in Chinese), as an explanation. Gemfield observed that nvidia-docker simultaneously symlinks both the native UMD on the host and the compat UMD in the Docker image under /usr/lib/x86_64-linux-gnu/, and bluntly chooses the one with the higher version as libcuda.so.1, against which user programs link.

Obviously this behavior complies with neither forward compatibility nor minor version compatibility. Gemfield opened an issue, NVIDIA/nvidia-docker#1515, for discussion, where the author guessed it was a bug of libnvidia-container and referred to another issue, NVIDIA/libnvidia-container#138. Neither issue has been resolved as of this writing.

The workaround is simple: if there's no compat package, the compat UMD won't be applied. We can either remove the compat package or simply delete the /usr/local/cuda-X/compat directory:
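
For instance, in the Dockerfile of the bad image (the package name cuda-compat-11-3 is my assumption for CUDA 11.3; adjust it to your toolkit version):

```dockerfile
FROM nvidia/cuda:11.3.0-cudnn8-devel-ubuntu20.04
# Option 1: purge the compat package (name is version-specific) ...
RUN apt-get purge -y cuda-compat-11-3 || true
# Option 2: ... or simply delete the directory.
RUN rm -rf /usr/local/cuda-11.3/compat
```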

# Epilogue

This article elaborates the cause of and workaround for CUDA Error 804 when NVIDIA GPUs work with Docker. As background, I introduced the components of CUDA, the various categories of CUDA compatibility policies, and how the Docker runtime deals with the GPU driver. The culprit turned out to be a bug, or deficiency, of libnvidia-container, which mishandles forward compatibility and minor version compatibility and has not yet been fixed. As a workaround, one can remove the CUDA compat package inside the image to prevent forward compatibility from being applied and fall back to minor version compatibility.

# Modern Cryptography, GPG and Integration with Git(hub)

GPG (the GNU Privacy Guard) is a complete and free implementation of the OpenPGP standard. Based on various mature algorithms to select from, GPG acts as a convenient tool for daily cryptographic communication.

GPG has two primary functionalities: (1) it encrypts and signs your data for secure transferring and verifiable information integrity, and (2) it features a versatile key management system to construct and promote a web of trust. GPG also has a well-designed command line interface for easy integration with other applications such as git.

This article briefly elaborates some key concepts and usage of GPG, and then demonstrates how to cryptographically sign git commits with the help of GPG.

# Modern Cryptography 101

To understand how GPG and other privacy tools work, I should first introduce some basic ideas of modern cryptography. Let's start with the two primary problems of secure communication: data encryption and data integrity/authenticity verification.

## Data Encryption

Peer-to-peer data encryption aims to prevent the message from being spied on by a potential third party, especially when the two parties are communicating over a channel open to the public. Imagine Alice and Bob are mailing through pigeons, with the message unencrypted and clearly written on paper. It is possible for a third person, Blake, to intercept the pigeon, open the attached mailbox and read the message inside, without Alice and Bob knowing of his existence.

Data encryption is introduced to defend against such attacks. For secure data exchange, Alice and Bob should agree on some kind of invertible message processing pipeline. The sender preprocesses (encrypts) the message before it is attached to the pigeon, and the recipient performs the inverse process (decrypts) to read the clear message. In terms of cryptography, such a pipeline is called a cryptographic algorithm, or a cipher.

A cipher usually works with a key (or several keys). With the cipher fixed, a message encrypted with one key can only be decrypted with the same one. Modern ciphers are carefully designed so that it is hard for Blake to decrypt without the key, even if he knows the full details of the cipher. Under this assurance, Alice and Bob only have to choose a specific algorithm from the public list as their cipher, and agree on the key before communication. This simplifies negotiation, as they don't have to discuss the sophisticated implementation of the cipher.

The currently available ciphers can be roughly categorized into two families: symmetric ciphers and public-key ciphers.

## Symmetric Ciphers

Symmetric ciphers encrypt and decrypt messages using the same key. They date back far into human history. You might have heard of the Caesar cipher, which replaces each plaintext letter with the one a fixed number of places down the alphabet; it is a famous example of this category. For the Caesar cipher, the key is the number of positions shifted, like 3 for the transformation A->D, B->E.
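
As a toy sketch (shifting over the 26-letter alphabet and leaving other characters alone), the Caesar cipher can be written as:

```python
def caesar(text: str, key: int) -> str:
    """Shift each letter `key` places down the alphabet; keep other chars."""
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            out.append(chr((ord(ch) - base + key) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

print(caesar("ABC", 3))                           # DEF
# Decryption is just shifting back with the negated key.
print(caesar(caesar("attack at dawn", 3), -3))    # attack at dawn
```

Note how the same key (here, 3) serves both directions, which is exactly what makes the cipher symmetric.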

Symmetric ciphers expose several drawbacks in realistic usage. First, they provide no defense once the key is stolen: if Blake somehow learns the key, he can both spy on and forge the messages sent between Alice and Bob. Also, pairwise communication among $n$ persons requires $n(n-1)/2$ keys, increasing the expense of key exchange and the opportunity for leakage.

## Public-key Ciphers

By contrast, public-key ciphers mitigate these problems by adopting a pair of keys instead of just one. A message encrypted by one key can only be decrypted with the other, and vice versa.

In practice we call one of them the public key and the other the secret key. The public key is published to whomever we want to communicate with, while the secret key is kept locally and must be known only to ourselves. When Alice sends a message to Bob, she encrypts it with Bob's public key, and Bob decrypts it with his own private key upon receipt.

Public-key ciphers reduce the adverse impact of public key leakage: an attacker with Alice's public key in hand is unable to decrypt messages sent by others to her. Also, only $n$ public keys have to be exchanged for $n$-person pairwise communication. These advantages result in a lower key exchange expense and the growing popularity of public-key ciphers in real life.

## Digital Signatures for Data Integrity

Ciphers solve the problem of data encryption, preventing transferred messages from being spied on by a third party, but they do not guarantee the integrity and authenticity of the data. Bob cannot tell whether a message he received was truly sent by Alice, since his public key is known to the world. For this purpose, the concept of digital signatures is introduced.

Digital signatures employ the idea of hashing. In cryptography, hashing is a technique to generate a digest for a piece of message. The digest must be almost unique; that is, two different messages should ideally have unequal digests. Also, it should be guaranteed that no one can recover the original plaintext from the digest.

In practice, Alice the sender attaches an encrypted digest as a digital signature along with the message, produced by first hashing the message and then encrypting the digest with her own private key. Anyone can decrypt the signature with Alice's public key to verify that the message was truly signed by Alice and arrived as-is. Since no one else knows Alice's private key, the signature cannot be forged and hence is a mighty tool for assuring authenticity.
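
The hash-then-sign idea can be sketched in a few lines. This is a toy illustration only: textbook RSA with tiny primes and no padding, nothing like the vetted algorithms GPG actually uses.

```python
import hashlib

# A tiny RSA key pair: n = p*q, and e*d = 1 (mod lcm(p-1, q-1)).
p, q = 61, 53
n = p * q     # public modulus (3233)
e = 17        # public exponent
d = 413       # private exponent: 17 * 413 = 7021 = 9 * 780 + 1

def _digest(message: bytes) -> int:
    # Hash the message, then reduce it into the RSA group.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    # "Encrypt" the digest with the private key.
    return pow(_digest(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    # "Decrypt" the signature with the public key and compare digests.
    return pow(signature, e, n) == _digest(message)

sig = sign(b"hello world")
print(verify(b"hello world", sig))   # True
print(verify(b"hello w0rld", sig))   # tampered message: verification fails
```

Only the digest is signed, not the whole message, which keeps signatures short regardless of the message size.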

# GPG

Ciphers and digital signatures form the foundation of modern cryptography, upon which OpenPGP was proposed and GPG built as a high-level tool for convenient daily usage. This post will not explain the full details of GPG, but its basic ideas and some frequently used operations as a tutorial.

Compared with a basic public-key system, OpenPGP adopts a more sophisticated design. It uses the concept of a "user" to distinguish identities. A user is uniquely identified by a real name and email, and can own a primary key pair plus an optional collection of subkey pairs, each key pair with potentially different capabilities such as encryption or signing. This separation of responsibilities lets one revoke a compromised key without affecting the validity of the others, leading to more flexible key management.

## Key Generation

To create a user and generate the key pair, we can use the gpg --generate-key command
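
gpg --generate-key walks you through an interactive dialog; an equivalent non-interactive invocation looks like the following (a throwaway home directory keeps the demo away from your real keyring, and the empty passphrase is for demonstration only):

```shell
# Use a scratch GPG home so nothing touches the real keyring.
export GNUPGHOME="$(mktemp -d)"
chmod 700 "$GNUPGHOME"

# Create the user FooBar <foobar@foobar> with default algorithms,
# no expiration, and (for the demo) no passphrase.
gpg --batch --passphrase '' \
    --quick-generate-key 'FooBar <foobar@foobar>' default default never

gpg --list-keys
```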

In this example, we've created a user with real name FooBar and email foobar@foobar. During the process, the program prompts a dialog asking you to enter a passphrase, which acts as the main guardian of access to your secret key.

By default GPG generates two keys with different capabilities. The primary key, prefixed with pub, is for signing (S) and certifying (C), and a subkey, prefixed with sub, for encrypting (E). With the gpg --list-keys and gpg --edit-key commands, we can inspect the keys stored in our local database and edit one or more of them.

## Basic Document Signing

When posting a document to the public, one would like to claim authorship and expect no one to be able to tamper with the content, which can be achieved by digitally signing the document. Let's look at an example.

Here we create a file named doc with the string "hello world" as its content. gpg --sign -u FooBar signs the given document with user FooBar's secret key, writing the bundled result to a new file doc.gpg. A person knowing FooBar's public key can verify its integrity with --verify

or directly decrypt it with --decrypt
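
Put together, the whole flow looks like this (again using a throwaway keyring and passphrase-less demo key):

```shell
# Scratch GPG home and working directory for the demo.
export GNUPGHOME="$(mktemp -d)"; chmod 700 "$GNUPGHOME"
cd "$(mktemp -d)"
gpg --batch --passphrase '' \
    --quick-generate-key 'FooBar <foobar@foobar>' default default never

echo "hello world" > doc
gpg --sign -u FooBar doc     # writes the bundled signature + content to doc.gpg
gpg --verify doc.gpg         # checks the signature only
gpg --decrypt doc.gpg        # prints the content and verifies the signature
```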

If the content of doc.gpg is tampered with, either of the above operations will fail.

GPG provides several flags to customize the generated digital signature. For instance, the flag --clearsign keeps the plain text readable and attaches the signature separately after it, which is more convenient for scenarios like sending via e-mail.

With -o <filename> the output is directed to <filename> instead of the default file name doc.gpg.

## Document Encryption and Web of Trust

A document signed with the above method can be read by a wide audience, as long as they have user FooBar's public key. For more restricted usage, where the document should be seen only by a specific recipient, say user BazBaz, we should encrypt it with BazBaz's public key.

The command gpg --export -u BazBaz > bazbaz.gpg dumps the public keys of user BazBaz to the file bazbaz.gpg, which can be distributed and imported by other users across the web. As an example, user FooBar imports the file into his local database:
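
The exchange can be simulated on one machine with two throwaway home directories standing in for the two users:

```shell
cd "$(mktemp -d)"
FOO="$(mktemp -d)"; BAZ="$(mktemp -d)"; chmod 700 "$FOO" "$BAZ"

# BazBaz's keyring with a demo key (no passphrase, for the demo only).
gpg --homedir "$BAZ" --batch --passphrase '' \
    --quick-generate-key 'BazBaz <bazbaz@bazbaz>' default default never

# BazBaz exports his public key ...
gpg --homedir "$BAZ" --export --armor BazBaz > bazbaz.gpg
# ... and FooBar imports it; the new uid shows up as [unknown].
gpg --homedir "$FOO" --import bazbaz.gpg
gpg --homedir "$FOO" --list-keys
```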

As we can see, the public key of BazBaz now shows up in the local list, but its uid is labeled [unknown] instead of [ultimate] as FooBar's is.

The label [unknown] indicates that GPG distrusts newly imported keys by default. OpenPGP comes with a multi-level trust model as a defense against someone impersonating another's identity, with [unknown] being the least trusted level. GPG prompts us if we attempt to encrypt with an [unknown] key

This mechanism protects us from accidentally sending secret information to a forged identity.

To tell GPG that the identity is really trusted, we can sign the public key to raise its trust level. Remember this must only be done after you have actually verified the identity through direct contact with that person. The --sign-key flag serves this purpose

Checking the list again, we can see that the trust level of BazBaz's key has changed from [unknown] to [full].

OpenPGP's trust model allows trust to propagate over the web, which eases the overhead of verifying key identities. In short, if user A trusts user B's identity, and user B has signed the public key of user C, then user A transitively trusts user C's identity. This way, user A has no need to individually verify the identity of every imported key and therefore enjoys an easier key management scheme.

# GPG and Git/Github Integration

GPG can be employed to assert the authenticity of your code by digitally signing your Git commits. Since Git itself uses the email address to identify authors, it's possible to commit under other people's identities. A story described how one could push code to GitHub under the identity of Linus Torvalds. This vulnerability can be exploited to disseminate malicious code or false information over the internet.

## Integrate GPG with Git

The GitHub Docs has a series of posts guiding you through commit signing and GitHub interoperation. To start with, we should tell Git about our signing key:
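
Something like the following, where the key id is a placeholder (find your own with gpg --list-secret-keys --keyid-format=long):

```shell
# The key id below is a placeholder, not a real key.
git config --global user.signingkey 'ABCD1234DEADBEEF!'
```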

With the ! suffix, the key takes precedence over others and will always be used. We can alternatively configure Git to sign commits by default:
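
That is a single configuration switch:

```shell
# Sign every commit without having to pass -S each time.
git config --global commit.gpgsign true
```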

As a demonstration, let's switch to the workspace of a git repository and commit code as we usually do

Afterwards, we can inspect the history and see a digital signature attached

which indicates the commit was signed successfully.

## Integrate GPG with Github

GPG-signed commits are highlighted with a Verified label displayed alongside on GitHub, as showcased in the image below,

from which other people can learn and trust the authenticity of the commit. To this end, one should associate one's GPG keys with one's GitHub profile. As instructed in "Adding a GPG Key", the GPG public key is first exported from the command line in text form as
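
In real life the export is just `gpg --armor --export <your-uid-or-key-id>`; the demo below generates a throwaway key first so it is self-contained:

```shell
# Scratch keyring with a demo key, just so the export has something to show.
export GNUPGHOME="$(mktemp -d)"; chmod 700 "$GNUPGHOME"
gpg --batch --passphrase '' \
    --quick-generate-key 'FooBar <foobar@foobar>' default default never

# Export the public key in ASCII-armored text form.
gpg --armor --export foobar@foobar
```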

which should be copied to the clipboard with the separators included. Then, in the upper-right corner of any page on GitHub, click the profile avatar and select Settings -> Access -> SSH and GPG keys -> New GPG key, paste the previously copied content into the box, and confirm with the Add GPG Key button to finish the association.

# Conclusion

GPG is a convenient piece of software for cryptography jobs and key management. While its history goes back to the old days and the UX might look weird, it still stands as one of the de facto standards in the modern world. This article explained the fundamental ideas of modern cryptography on which GPG is based, demonstrated some everyday GPG usage, and showed how to integrate it with external tools and services such as Git and GitHub. Hopefully it has enlightened you about ways to carry out secure message exchange in daily life.

# Move the Root Partition of Ubuntu

Some days ago, I decided to shrink the footprint of the Windows system on my laptop and reallocate the disk space to the Ubuntu system residing next to it. Ubuntu is competent for my daily use of programming and web browsing, so I have hardly launched the OEM-shipped Windows since the laptop was bought. Windows takes up a not-so-small portion of my SSD space, which could be better utilized instead of wasted in vain.

As planned in the diagram above, 136 GB of space would be reclaimed from the Windows C: partition and merged into the root partition of Ubuntu. I had experience adjusting the size of disk partitions, but this time the job was a little riskier, since it involved moving the starting point of the Linux root partition. Linux relies on special information under /boot/efi to boot itself, and if that information is not updated accordingly during the move, the entire system becomes unbootable.

To avoid catastrophic consequences, I did some research beforehand and read detailed guidance on AskUbuntu. It turns out the tweak requires two steps. The first is to adjust the partition sizes with the GParted tool, as I used to do for ordinary data partitions. The GParted system has to reside on and be booted from a separate USB device, so that the hard disks in my laptop can be fully unmounted for manipulation. This is the easiest part thanks to the straightforward GUI partition editor provided by GParted, with which I could do the adjustment in a few clicks.

Each disk partition is assigned a UUID, or serial number, like b424102c-a5a6-489f-b0bd-0ea0fc3be7c3 to uniquely identify itself, which changes when the partition is moved or resized. So the next step is to rebuild the grub configuration to ensure it contains the new serial number of my root partition. But before running grub-install, I had to emulate the directory hierarchy of my Ubuntu system by mounting the relevant partitions to form the root directory and using chroot to start an interactive shell in it
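
The exact commands are not preserved here; the sequence was roughly the following (device names are examples, check yours with lsblk; /tmp/mydir is the mount point I used):

```shell
$ sudo mount /dev/nvme0n1p5 /tmp/mydir        # the Ubuntu root partition
$ for d in dev proc sys; do sudo mount --bind /$d /tmp/mydir/$d; done
$ sudo chroot /tmp/mydir
```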

Following the guidance, however, chroot did not succeed, complaining that /bin/bash could not be found. I checked the corresponding directory /tmp/mydir/bin and found it was a broken symbolic link

It appears that /bin is a symlink to usr/bin, but my /usr directory resides on another partition that was not yet mounted. With the directory /usr mounted, the chroot command worked as desired.
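
That extra mount is a one-liner (again, the device name is an example):

```shell
$ sudo mount /dev/nvme0n1p6 /tmp/mydir/usr    # the separate /usr partition
```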

The spawned interactive shell allows commands to run as if in my Ubuntu system. Typing grub-install /dev/nvme0n1 writes in the new serial number of the root partition. It's worth noting that the argument /dev/nvme0n1 passed to grub-install is the name of the hard disk device to write to, not a partition name like /dev/nvme0n1p1.

Oops, the command failed; something was still going wrong. After some inspection, I found the culprit: the directory /boot/efi was empty. It should normally be a mount point for partition /dev/nvme0n1p1, but it was not mounted properly. This was solved with another mount command
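
Inside the chroot shell, that amounts to:

```shell
# Mount the EFI system partition, then retry the install.
mount /dev/nvme0n1p1 /boot/efi
grub-install /dev/nvme0n1
```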

With that, the boot information was finally updated. I rebooted my laptop and everything worked as before.

So the takeaway from this tweak is that some special directories like /usr or /boot/efi may reside in partitions outside of the root partition /, as on my laptop. If you are fixing grub and come across similar error reports, be sure to correctly mount all the partitions concerned to form the filesystem hierarchy.

# A New Programmer Kicks a Roadblock

The first time I composed a program dates back to my junior high school years. It was the first day of PC lessons, and everybody crowded into the computer classroom. We were told we would learn "programming" there. The kids who proved talented would be selected and trained for OI. The others would go to an ordinary class and learn something more general.

I was anxious. Before then I had no concept of what "programming" was, nor had I ever gone through a real PC lesson. The PC lessons in my primary school barely taught anything; most of the time the teachers let us play games instead. I could type merely a dozen characters per minute, since I'd never received thorough typing training. I was ignorant of what was inside the metal box. I was a complete computer idiot.

But some of my classmates weren't. They typed swiftly like the wind, they knew how to play with the operating system, and what's more, they chattered excitedly about things like "the C language", "arrays" or "for-loops", words I'd never heard of.

I sat in front of a monitor and the class began. The teacher said we were going to learn a language named "Pascal", and she instructed us to open the "Free Pascal IDE". I followed a few clicks through a cascaded menu and finally reached the item. A window popped up.

The screenshot was taken on my Ubuntu recently, but at the time it was on Windows 7 and looked slightly different. Not many people these days have heard of the Pascal language, and fewer have seen this antique interface.

It was the weirdest interface I had ever seen. The IDE was like another system trapped in a small, unresizable window, with queerly rendered icons and widgets. The menus wouldn't expand on cursor hover. The editor wouldn't scroll when I wheeled my mouse. And most importantly, there was English everywhere, which frightened me.

The teacher then showed us our first program to type. It was a simple one that reads an integer from one file and writes its square to another. The code was like
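
(a reconstruction from memory; the exact identifiers may have differed)

```pascal
program program1;
var
  f, g: text;
  n: longint;
begin
  assign(f, 'program1.in');   { bind the file variables to file names }
  assign(g, 'program1.out');
  reset(f);                   { open for reading }
  rewrite(g);                 { open for writing }
  readln(f, n);
  writeln(g, n * n);
  close(f);
  close(g);
end.
```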

It took me quite a while to put those lines onto the screen, and more time to "save the code as a file". Before that day I had no idea what a "file" was, and the IDE's file selector was not ergonomic at all. After saving, I noticed an icon titled program1.pas pop up in the Windows file explorer. Then I hit the Compile menu entry. More icons popped up, including one named program1.exe, and that was my program.

The next thing to figure out was how to run the program, which comprised several complicated steps.

First I had to right-click in the file explorer, select the "New -> Text Document" entry, and rename the file to program1.in. The OS would prompt me that I'd changed the file extension, but I should click "Yes". Then right-click on the created file, select "Open with…" and choose "Notepad" in the dialog. In Notepad, type an integer like 3, then save and close it.

By now the input was prepared, and I should double-click the program1.exe file to execute the program. A black window flashed by, and one more icon, titled program1.out, appeared. I opened it with the same trick as the input file, and there I saw the result number 9.

Woah, that was amazing. Within 40 minutes I’d created something “intelligent”, albeit excessively simple, working faithfully with whatever number I fed it.

Along with the flood of joy, however, came frustration. It aroused a feeling that programming is as complicated as ordering a banquet for a serious occasion, with so much doctrinal detail to care about. What upset me the most was that I spent most of my time fighting irrelevant issues, yet caught little of the idea of true programming throughout the class hour. The reason is that I lacked a certain understanding of the OS beneath, without which one could go nowhere on the trip towards programming.

And there exists another question — is interacting with a program always so painful? Of course not, but not until several months later did I realize that the assign(...) statements were not a necessity of a complete program, and that there is a so-called “command line interface” where you can type the input easily and immediately get the result. The awkward interaction bridged by files was actually dedicated to OI evaluation, as I learned afterwards. It took me one year to understand the ABC of GUI programs, when I built my first Form-based application with Delphi. My program no longer shipped with an ugly black window! And it unlocked interactions as varied as those of daily-use software. After more years, with the broader understanding of programming I gained, I became able to create websites, mobile apps or anything that fits my requirements. But for a 12-year-old kid at the time, the first program was just not appealing and NOT COOL at all.

The class, of course, was not designed for teaching cool things. It was for selecting talented students towards a specific target. But over the years, I kept seeing people who were new to programming and struggled halfway, for one reason or another. This makes me ponder the root obstacles a newcomer faces when learning programming.

The way they are taught is no doubt a fundamental factor. Learners should be motivated so as to earn confidence. I used to know some power users who got started with programming smoothly and swiftly. They had clear goals for programming, some to tweak the system behavior and others to automate daily work. They learn the minimal knowledge from documentation or blog posts, and then come up with a prototype program that accomplishes the job. The entire process is interesting and fulfilling.

But for elementary learners, the ones who know little or nothing about computers, this does not always apply. Most of them are aimless, having no idea what programming can be used for. What’s worse, they are taught to use inappropriate tooling, deteriorating the learning into a boring and painful nightmare.

I can remember that in the Programming 101 of my college around six years ago, we were taught C and to use the obsolete Visual C++ 6.0 IDE. The compulsory course was rather like a math one, where most time was spent in the classroom reading slides, and homework was handwritten to figure out the results of code fragments. The mere four coding tasks were to implement some algorithms and data structures, fairly dull. Some of the classmates had no deep knowledge of computers, or had never even used one before (mobile devices having been popularized). They went through a hard time understanding low-level concepts like pointers, and were desperate to finish the coding tasks. They learnt for the exams, with little or no interest, and soon forgot everything within one or two semesters.

I am not claiming that low-level languages like C are not suitable as a first language — for those who will major in computer science, they demonstrate well how the machine works. But other learners deserve a much more modern language at a higher level, plus a coding environment that hides obscure machine details. The language and tooling should lower the hurdle to creating appealing projects.

The language we do have: Python, for example. It had better be young or carefully designed, so that backward compatibility won’t cause too much confusing syntax. It should support the imperative paradigm so as not to blow the learner’s brain (unless they are mathematicians), but not be limited to it, for going further. And most importantly, it should conceal the low-level stuff to better illustrate the basic ideas of programming.

But the tooling we don’t, at least not yet in perfect shape. Lots of work should be done to create such a layer between the OS and ignorant learners, and it should be done perfectly, without bugs. I’ve seen buggy programming environments leak details about the underlying machinery, leaving their users frustrated and frightened.

The needs of domain-specific learners should also be noticed. Some people learn programming to improve productivity in their fields of expertise, e.g., data analysis or financial trading. Like power users, they would be fulfilled if their first few programs could assist their jobs, but from time to time that’s not the case. The guidance and tools for them, however, are often poorly crafted, probably because there are few professional programmers in the field.

Over time I have witnessed programming languages and toolchains evolving, which gives skilled developers the chance to build faster and safer programs more easily, but the 0-to-1 learning curve has not benefited much from the trend. I am expecting to see that change in the future.

# Git-based Dependencies in Dart and Go

Both Dart and Go support decentralized package distribution. One is able to directly adopt an existing git repository as a dependency, easing the effort of distributing packages.

Sometimes we might expect more fine-grained control over what to pull from a git repository. For example, to lock a package’s version, we would specify a particular tag, commit or branch name to pull from. Or, if it’s a mono-repo, we would choose a sub-directory from the repository root. This post summarizes how to achieve these purposes in both languages.

## Dart

dart pub add has several options related to adding a git repository as a dependency. To start with, one should specify the repository’s URL with the --git-url argument
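A sketch of the command (the URL is the placeholder used below):

```
dart pub add repo --git-url=https://github.com/user/repo.git
```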

This command adds a dependency named repo by pulling from https://github.com/user/repo.git. The --git-path argument can be provided to specify which sub-directory of the repository dart should read from
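For instance (the sub-directory name is hypothetical):

```
dart pub add repo --git-url=https://github.com/user/repo.git --git-path=pkgs/repo
```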

Dart can also read from a specific git commit or branch (but not tags!), for which one should supply the --git-ref argument
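For instance (the branch name is hypothetical; a commit hash works in the same position):

```
dart pub add repo --git-url=https://github.com/user/repo.git --git-ref=dev
```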

## Go

Go modules are based on git repositories by design. We regularly use go get to add a dependency
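A typical invocation (the module path is a placeholder):

```
go get github.com/user/repo
```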

, which pulls a remote repository, and parses its content as a Go module. A tag @vX.Y.Z can be suffixed to specify a particular git tag like
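A reconstruction of the tagged form, with vX.Y.Z as in the text:

```
go get github.com/user/repo@vX.Y.Z
```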

, which instead pulls from a tag named vX.Y.Z. There’s a detailed description on version tag’s semantics at Go Modules Reference - Versions. Straightforwardly, we can append its path after the repository’s name if a sub-directory is to be used
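With the sub-directory appended, the command becomes (subdir being the placeholder name used below):

```
go get github.com/user/repo/subdir
```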

Things become a little trickier when both a sub-directory and a tag are wanted. Literally, we might type a command as below
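That is, naively combining the two forms:

```
go get github.com/user/repo/subdir@vX.Y.Z
```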

It, however, will fail with a complaint go: github.com/user/repo/subdir@vX.Y.Z: invalid version: unknown revision subdir/vX.Y.Z. What happens is that, when a sub-directory is involved, Go modules look for a tag with a pattern like subdir/vX.Y.Z, instead of the aforementioned vX.Y.Z. This enables multiple sub-repos in a large mono-repo to individually tag their own versions. We are hence required to rename the tag to subdir/vX.Y.Z, after which everything works as intended.

# Reversy Naming

I have always been a dedicated fan of writing naturally readable code – by “naturally readable” I mean that one can read a line of code as if it were a sentence of English (or maybe another human language). It’s believed that the practice encourages more self-explanatory code, as the code reads more like a human-composed article, instead of gibberish only recognizable by machines.

The practice recommends naming functions or variables following the word order of human language; for English, that is, objects come after verbs, and adjectives go before the nouns they modify. The samples below showcase how it guides naming in a program (please hold your opinions about the casing)

• append_to_list(lst, item). A function that appends an item to a list, which can read as “append to the list (specified by name lst) with the item”.
• register_service_notifier(func). A function that registers another function as a service notifier, which can read as “register a service notifier with the function func“.
• UserFollowersListView. The name of a web component which is a list view to display followers for a user.

It plays well and improves my developing experience most of the time, but there is no silver bullet, just as with other practices or guidelines. Sometimes I found the readability even degraded: I kept skimming the lines and just couldn’t locate an item efficiently.

For a brief period, I thought it was caused by the verbosely long word sequences, since compared with shorter ones, they took more time to recognize. But after some investigation, I realized it was not.

The true culprit is that such “naturally readable” naming displaces the emphasis from the beginning of the word sequence. The emphasis of a name is its highest-level words, usually the most general ones. For instance, append_to_list emphasizes list, which is however placed at the rear of the name.

As a human, at least for me, a name with its emphasis at the front is more recognizable than one without. When skimming through a screen of code, my sight focuses on token boundaries like whitespace, hopping from one to another. In the meantime, I glimpse at one or two words next to the boundary, which usually sit at the fore of a name, and subconsciously match them against what I am looking for.

The matching process itself speeds up if I first meet the words at the leading position of the name or phrase. My mental model resolves a concept by first the general idea and then the descriptive details. This, however, is the opposite of most human languages I know of, whose grammar puts general words after the modifiers.

And thus I came up with a new practice – Reversy Naming – to accord with my mental model: place the emphasis first in a name, and then the lower-level words. To illustrate, I apply the style to the three names above as an example

• append_to_list -> list_append
• register_service_notifier -> service_notifier_register
• UserFollowersListView -> ListViewUserFollowers

Probably weird at first sight, but despite the inverted word order, they are not difficult to read. In fact, several additional benefits come along.

Firstly, it conforms with the qualified syntax in most programming languages, which most people are used to. A programming language with an object-oriented paradigm usually supports a syntax like object.method. In Python I have written things like list.append() for years, which is similar to the aforementioned list_append, and I have never had any readability problem with it.

The next point is that the names align well when they appear in consecutive lines. Suppose there are many functions operating a service; with “naturally readable” naming, we have
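Hypothetical names, for illustration:

```
StartService()
StopService()
QueryServiceStatus()
RegisterServiceNotifier()
UnregisterServiceNotifier()
```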

It is not evident at a glance that these functions are for manipulating the same type, although they share a common word “Service” in the middle. But with Reversy Naming, we could have
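The same hypothetical names, reversed:

```
ServiceStart()
ServiceStop()
ServiceStatusQuery()
ServiceNotifierRegister()
ServiceNotifierUnregister()
```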

Now they share a common prefix, which indicates their affiliation: crystal clear and neat. If there’s a secondary emphasis, “Service -> Notifier” for instance, they will also align in a good manner.

I’ve also spotted similar naming rules in the wild, which suggests Reversy Naming is an acceptable and even recommended practice in some scenarios. For example, the Style Guide - Vue.js reads

### Order of words in component names

Component names should start with the highest-level (often most general) words and end with descriptive modifying words.

Since editors typically organize files alphabetically, all the important relationships between components are now evident at a glance.

# Humans Fall Flat（人类一败涂地）

“Child, that would be Uranus out there, wouldn’t it?”

“I saw Uranus once, back in elementary school.” Great-grandfather kept gazing out the window. “In a popular science magazine. Uranus was cyan, and its rings stood almost upright, like the skirt of a ballet girl. But what I see now doesn’t quite match the book…”

“That’s because this ship isn’t passing by on the ecliptic plane.” I was a little impatient. Having missed another, faster ship, I had to spend twice as long on the way, and nothing could be worse than that. “If we came in along the ecliptic, Uranus might well look like that.”

“Ah, yes, of course…” Great-grandfather seemed to sense my impatience; he shrank his neck timidly and fell quiet. I buried myself in my lecture notes again.

“Still, what a wonderful age this is!” great-grandfather murmured to himself. “To Uranus in the blink of an eye.”

“When I was young, I would travel to see your great-grandmother. Cities were small back then; by today’s standards they would seem almost glued together. Yet even in a tin-can automobile, the trip meant a good hour of chugging along. In summer, even with the cooling on, the inside of the car felt like something fermenting, like that canned dace… such things no longer exist.”

“Getting around was hard in those days, wasn’t it?” I threw in a remark without thinking, and regretted it at once, worried that great-grandfather would chatter on without end. I had no interest in ancient times, and more urgent matters at hand.

“You youngsters probably couldn’t bear it; our generation was simply used to it.” Great-grandfather seemed delighted to have someone to talk with. “A car ran at barely a hundred kilometers an hour, slow as a snail by today’s eyes. Well, airplanes were fast: within half a day one could cross an ocean. But a plane was not something you could simply hop onto. People first had to reach the edge of the city, then pass through a maze of procedures before boarding. Sometimes the preparation for a flight took even longer than the flight itself, as absurd as ending a long evening prayer with a bowl of instant noodles.”

“I still remember my first flight, on a trip with my mother. It was a tender evening. When our plane rose out of the clouds, I saw the sunset from that angle for the first time, and I nearly screamed with excitement. The sky was so close to me, and yet the sky was endlessly high. I was wrapped in a beautiful, profound sense of mystery.

“People today may find that excitement hard to understand, but in that era the airplane was a thing filled with hope.” The look in great-grandfather’s eyes grew distant, as if he were straining to taste the feeling once more. “Our ancestors had lived on the land for thousands of years without ever rising even a few feet above it. This kind of…”

“How wonderful. Was it the same feeling when you first went to space?” I had grown interested in great-grandfather’s story and simply put down my computer. There was still more than an hour left; a little chat would be a good way to relax.

“Space… that came much, much later.” Great-grandfather’s gaze turned distant, as if he were trying hard to remember. “To be honest, the first time I entered space I was nowhere near as excited as on that first flight. You could even say I was calm.”

“Why is that?” I couldn’t help finding it strange. I grew up on a space station, and space is as ordinary to me as breathing. But to great-grandfather, who migrated from Earth, space should have been overwhelming.

“Rather than calm, call it relief,” great-grandfather went on. “What I thought at the time was: if someone like me, who had lived out most of his life, could make it to space, then humanity still had some hope after all.”

“Child, though I lived most of my life on Earth, my memories of space began very early.

“I was born into an age full of hope. Thirty years before my birth, mankind had landed on the Moon. In the science fiction films of my childhood, our pioneers were already homesteading in space. In the not-so-distant future we would have flying cars and hovering skateboards. In elementary school I wrote an essay, ‘My Hometown in Ten Years,’ brimming with unbridled fantasies about the future.

“But through my entire youth, mankind never again set foot on the Moon. Indeed, the people who had been to space could be counted on two hands.”

“That first Moon landing was just a meaningless race between two nations.” I tried to recall the modern history we had learned in class. “And people later proved it: a far voyage demands thorough technical preparation, and setting out in a lone skiff was radical and reckless.”

“But that landing made humanity believe the space age had arrived,” great-grandfather murmured. “People celebrated the future in films and stories, and longed for it. People had faith in the future, and that faith gave them optimism and courage, which was then worn down to nothing over the better part of the following century. The fantasies of my childhood were still fantasies in my old age.”

“And of course, the worst of it was the depression.” At these words, the light in great-grandfather’s eyes dimmed.

“The depression… that was in the twenties, right?”

“That’s right. There had been some signs before the twenties, but the real decline, it seemed, began with a plague… It is hard to imagine now that a plague could have such an impact.”

“Indeed… modern medicine is all but omnipotent.”

“What is harder to imagine is that the people of that time had confidence in their medicine too, confidence hardly less than today’s. They were convinced they could contain the spread of the plague.”

“And you did. Compared with the plagues of older times, that one was a drop in the bucket.”

“No, no, you don’t understand.” Great-grandfather shook his head. “Those were ignorant ages, different from ours in essence. We could have done better.

“We were the species that had walked on the Moon; we had created the glory of this planet. And yet, at the time, plenty of absurd things still happened, things that must seem incomprehensible today. Radicals and conservatives, everyone alike. The whole world turned grotesque.

“The plague did not infect everyone, but it made everyone feel suffocated. The suffocation came from every direction, like an invisible rope quietly tightening around people’s necks. People bound up their hands and feet, and bound up their thoughts as well, each fearing for themselves, the same at every latitude and longitude. Though the latter may have had nothing to do with the plague; the feeling was merely amplified by it.”

“How strange.” I tried to put myself into that feeling and found I could not; they were, after all, people from so long ago. But the conversation with great-grandfather fascinated me. In middle school I had found history dull; now I understood that what it lacked was a sense of being there.

“Child, I wonder if you can imagine it.” Great-grandfather turned his gaze out the window. “I was born at the end of a prosperous era, and stepped into adulthood carrying dreams of the future, only to watch the world slide into decline. Had the decline been purely due to external causes, it would have been one thing. But this glorious species had met the very same crisis before, and still handled it badly, learning no lesson at all. The history I had read became the history of my every day.

“From then on, I no longer held hope for humanity. Not a few people thought the same at the time. When a person settles on a belief in his youth, he settles on it for life.

“Sometimes I also wonder: what of those who were born before me? Perhaps they went through far more suffering, and their state of mind was no worse than mine. I seemed to have no reason to be dejected, and yet the decline of the era stood squarely in my way, undeniably real. Later I thought it over: that notion of ‘you have no right to be dejected’ was itself more like a dogma, a dogma handed down from the previous generation. The previous generation cannot inhabit our dejection, just as we cannot inhabit their suffering. Many years from now, when we become the elders, the ideas our era shaped in us will become the new dogma.”

“Child, do you have faith in humanity?” On the platform, great-grandfather suddenly asked me.

“I do,” I answered.

“Good.” Great-grandfather smiled. “Off you go, child. I’ll have a look around here by myself, and head back to Jupiter on my own when I’m tired.”

“Will you be all right on your own?” I was a little worried.

“Don’t worry. In myself, at least, I still have faith.”

# Invalid Golang Pointers Can Bite You Even If You Don't Dereference

In Golang, if you coerce a uintptr variable into unsafe.Pointer (or further, into some *T), the linter will warn with the message “possible misuse of unsafe.Pointer”. This makes sense because the uintptr variable may contain an address that points to a piece of invalid memory, and dereferencing such a pointer is catastrophic (it usually aborts the program).

I was always aware of the above discipline, but I thought it would be OK to hold such pointers as long as I did not dereference them. This is true in C/C++, but not in Golang, which I did not realize until recently.

In fact, the program can panic even if you just keep an invalid pointer on the stack!

## A strange invalid pointer panic

The story dates back to an attempt at interoperation between Golang and the JVM, when I was working on a Go-written dynamic library which needed to operate a Bluetooth socket on Android. Android does not provide any native interface for Bluetooth, so I had to call into the JVM and invoke Java APIs.

I had learned JNI beforehand, which is an interface designed for interacting with the JVM from native code. Since JNI is provided to programmers as C++ header files, I had to seek a Golang binding. I then noticed xlab/android-go, which, among its utilities, encapsulates the full list of JNI types and functions. The project had been out of maintenance for a while, but using only the JNI pieces should be fine.

With the help of xlab/android-go, I quickly finished a prototype of my library; so far, so good. I bundled the library into an APK file and ran it on my phone, but unfortunately it crashed with the stack trace

I was not frightened, since no code succeeds in one go. But the error report did frustrate me in two respects

1. It involved one of my stack frames (kcore_android/bluetooth.ioWorker.Loop), but the panic was thrown from source code that lies outside my codebase (runtime/stack.go).
2. It was caused by an invalid pointer, whose value was 0x1.

I guessed the pointer was returned from the Java side, and for some unknown reason it had the weird value 0x1. But what I didn’t understand was how it could crash my program. I had tried carefully to avoid dereferencing any non-Go pointer in my code.

Also, the mismatch between the stack frame and the source code made it really difficult for me to locate the problem. For a time I thought goroutine 51 had stopped at the scene where the pointer made trouble, as its stack trace contained the aforementioned frame bluetooth.ioWorker.Loop, but it hadn’t. In fact, the goroutine stopped at another line when I restarted the program! This was annoying.

It took me almost half a day to resolve and understand the problem. I will first explain the origin of the invalid pointer, and then show how it would crash the program.

## The origin of 0x1 pointer

In JNI, the C type jobject acts as a handle for a Java object, and is technically an alias of void*. Such handles can be created by calling most JNI functions, like JNIEnv->CallObjectMethod.

Although it is a pointer type, a jobject variable is not necessarily a valid pointer. To understand this, one should know that there exist two kinds of object references in JNI: local references and global references. Local references are recycled at the end of a Java frame, while global references survive longer, until you delete them.

They not only differ semantically, but also diverge in their practical values. Local references often contain small values like 0x01 or 0x75, yet global references have values like 0x7dbeffc1cf. I guess local references are not actual pointers but indices into some internal object table.

Symmetrically, xlab/android-go defines a Jobject type which is an alias for unsafe.Pointer. So if you receive a local reference from JNI functions, you end up holding an invalid pointer on the Go side.

## Go runtime checks invalid pointers during stack growth

What’s interesting is that goroutines do not statically allocate their stacks. Instead, they are able to grow or shrink the stack as needed. I will not dive into the details of this mechanism, which you may read about in the article Go: How does the goroutine stack size evolve? if you are interested.

My panic was thrown by an invalid pointer check during stack growth. Why should the Go runtime check for invalid pointers here? Because growing a stack involves memory re-allocation, and the runtime must ensure no pointer is invalidated after the potential move.

To see how a move could invalidate pointers, let’s consider an example. Say we have a goroutine whose stack occupies the address range 0x8000 - 0x8800. An integer i int is stored at 0x8000, and a pointer ptr *int referencing that integer is stored at 0x8004, holding the value 0x8000. Now we grow the stack by moving it to the address range 0xA000 - 0xB000. If ptr retained its old value, it would no longer point to i, since i has been moved to 0xA000! Therefore, during a stack growth, the Go runtime must also check for such pointers, and adjust their values accordingly.
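For intuition, here is a self-contained sketch (not from the original post; the recursion depth and pad size are arbitrary choices to force a stack copy). It records the numeric address of a stack variable, forces a growth, and observes that the address changed while the value survived:

```go
package main

import (
	"fmt"
	"unsafe"
)

//go:noinline
func grow(n int) {
	var pad [512]byte // a biggish frame, so recursion soon outgrows the stack
	pad[0] = byte(n)
	if n > 0 {
		grow(n - 1)
	}
}

// stackMoveDemo reports whether the address of a stack variable changed
// after a forced stack growth, together with its (intact) value.
func stackMoveDemo() (bool, int) {
	i := 42
	// A uintptr is just a number: the runtime does NOT adjust it on moves.
	before := uintptr(unsafe.Pointer(&i))
	grow(512) // force the stack to be copied to a larger region
	after := uintptr(unsafe.Pointer(&i))
	// &i itself was adjusted during the copy, so i still reads 42,
	// but its numeric address differs from the one recorded earlier.
	return before != after, i
}

func main() {
	moved, val := stackMoveDemo()
	fmt.Println(moved, val)
}
```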

However, the Go runtime does more than check whether a pointer value falls in the old address range. It also checks and complains about pointers with small values
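The check in question looks roughly like this (abridged from memory; exact details vary across Go versions):

```go
if f.valid() && 0 < p && p < minLegalPointer && debug.invalidptr != 0 {
	// Looks like a junk value in a pointer slot.
	// Live analysis is wrong?
	getg().m.traceback = 2
	print("runtime: bad pointer in frame ", funcname(f), " at ", pp, ": ", hex(p), "\n")
	throw("invalid pointer found on stack")
}
```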

The above snippet can be found at runtime/stack.go. If a pointer value is less than minLegalPointer (which is 4096), the runtime will also panic! And that’s the culprit for my case.

## Conclusion

Now I know that the panic stems from two aspects. First, I held an invalid pointer due to FFI, although I never meant to dereference it. The Go runtime, however, does more than I thought behind the scenes. It moves the goroutine stack when necessary, during which it checks and complains about invalid pointers.

This reminds me not to coerce foreign pointer-like values into Go pointers, even if you won’t dereference them on the Go side. The safest practice is to keep them as uintptr. As a solution, I patched and slimmed xlab/android-go into hsfzxjy/android-jni-go, which works like a charm.

I also created a minimal example to reproduce the above problem, for anyone interested in investigating. In this example, the main goroutine stack grows during the invocation of foo() -> bar() -> baz(), during which the Go runtime encounters the crafted pointer ptr, and eventually panics.
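A sketch of such a reproduction (reconstructed, not the author's exact code; when run, it should die with fatal error: invalid pointer found on stack instead of exiting normally):

```go
package main

import (
	"runtime"
	"unsafe"
)

//go:noinline
func baz() {
	var pad [4096]byte // frame large enough to force a stack growth
	pad[0] = 1
}

//go:noinline
func bar() { baz() }

//go:noinline
func foo() { bar() }

func main() {
	// A crafted invalid pointer, mimicking a JNI local reference.
	ptr := unsafe.Pointer(uintptr(1))
	foo() // the stack copy scans main's frame and trips over ptr
	runtime.KeepAlive(ptr)
}
```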

# Side Project（副业）

A side project is, first of all, practical, which follows from the nature of computer professionals. In the information age, they are the forgers of everyday tools, the blacksmiths and carpenters of the digital world. Their creations span a wide range of themes, from games to productivity software and even smart hardware, every one of which can solve a real problem. Unlike craftsmen in the traditional world, however, one does not need many finely specialized roles to complete a finished work. Once an idea arrives, a single person can act on it immediately and, with the help of thorough documentation, quickly build a usable prototype. Compared with other industries, computer professionals can complete a project with far fewer resources.

A side project is also an important source of pride. To create something by your own effort and use it to solve a real problem is remarkable in itself. If that creation then becomes known to others, the pride surpasses even a creator god’s; after all, no one shares that kind of joy with a god. Pride is not always easy to obtain, yet it is a necessity of life. If you cannot derive it from your day job, you will be more inclined to seek the experience from a side project.

# A Flaw of Promoting Complex Trait Bounds in Rust

Days ago, for some reason, I was trying to implement a function that can polymorphize over its return type. The solution is simple, but my brain was jammed at the time, trapped in complicated typing tricks for hours.

During the struggle, I coincidentally ran into something that is, for now, a flaw in the current Rust compiler implementation. In some cases, the compiler is not smart enough to promote known trait bounds, and we have to replicate them again and again. Although the problem afterwards proved to be a useless “X-Y Problem”, I would still like to share the story.

## The Problem

Let’s say we are going to write a function that digests a given &[u8] slice and computes a hash value. The function would adopt either of two different algorithms, and a u64 or u128 integer is returned as the hash result.

Trivially, this can be achieved by splitting it into two functions get_hash_u64() and get_hash_u128(). But I prefer to have a single and unified interface, so concretely, I am expecting a function that polymorphizes over its return type, with the following signature
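A hypothetical reconstruction of that signature; the two ??? marks are the blanks discussed next:

```rust
fn get_hash<T>(b: &[u8]) -> T
// where ??? -- (1) trait bounds on T, as concise as possible
{
    // ??? -- (2) a body that behaves differently for different T
    let _ = b;
    todo!()
}
```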

Two things I should fill in for the above snippet

1. The where-clause. Some trait bounds might be required of the typevar T, and I expect them to be as concise as possible, for less verbosity in callers.
2. The body. The function should behave differently for different typevars T.

In order to emulate the effect of choosing different hashing algorithms, we expect a different numeric value to be returned when a different typevar T is supplied. Also, since the argument b: &[u8] is irrelevant to our problem, I will omit it in the following text for brevity. So overall, I would like the two assertions below to hold

Before stepping further, I will place a simple and straightforward solution up front, in case anybody takes the same wrong path.

Specifically, we can define a trait, say HashVal, as the upper bound of all possible return types for get_hash.

For each possible type, such as u64 or u128, we place the corresponding hashing algorithm in HashVal::digest

get_hash<T>() is now polymorphized over the return type T. A user might select a 64-bit hashing algorithm via a call like get_hash::<u64>().
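Assembled into one piece, a sketch of this solution (the constants 64 and 128 are made-up stand-ins for real hashing algorithms):

```rust
trait HashVal: Sized {
    fn digest() -> Self;
}

impl HashVal for u64 {
    fn digest() -> Self {
        64 // imagine a real 64-bit hashing algorithm here
    }
}

impl HashVal for u128 {
    fn digest() -> Self {
        128 // imagine a real 128-bit hashing algorithm here
    }
}

fn get_hash<T: HashVal>() -> T {
    T::digest()
}

fn main() {
    assert_eq!(get_hash::<u64>(), 64);
    assert_eq!(get_hash::<u128>(), 128);
}
```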

This solution is neat and, most importantly, the prerequisite T: HashVal is concise and self-explanatory, which saves a lot of verbosity in callers’ where-clauses. However, this didn’t come to my mind at the time, and I instead chose a more complicated solution.

In this version, I start with a dummy struct Hasher and a trait HashDispatcher<T>.

The struct Hasher implements HashDispatcher<T> for different type T with corresponding algorithm filled in digest() method

Function get_hash<T>() delegates the call to Hasher::digest(), which requires a verbose trait bound Hasher: HashDispatcher<T>. In order to reduce the boilerplate, I sought to write another trait, also named HashVal, such that for all T being a HashVal, the trait bound Hasher: HashDispatcher<T> holds, or formally $$\text{T: HashVal} \Rightarrow \text{Hasher: HashDispatcher<T>}$$. If achieved, the signature of get_hash could be largely simplified into
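A sketch of this Hasher/HashDispatcher version, with the same stand-in constants:

```rust
struct Hasher;

trait HashDispatcher<T> {
    fn digest() -> T;
}

impl HashDispatcher<u64> for Hasher {
    fn digest() -> u64 {
        64 // stand-in for a real 64-bit algorithm
    }
}

impl HashDispatcher<u128> for Hasher {
    fn digest() -> u128 {
        128 // stand-in for a real 128-bit algorithm
    }
}

// The delegation replicates the verbose bound in the signature.
fn get_hash<T>() -> T
where
    Hasher: HashDispatcher<T>,
{
    <Hasher as HashDispatcher<T>>::digest()
}

fn main() {
    assert_eq!(get_hash::<u64>(), 64);
    assert_eq!(get_hash::<u128>(), 128);
}
```

The hoped-for simplified signature would then be something like fn get_hash<T: HashVal>() -> T.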

## The Incorrect Attempt for HashVal

The first attempt I made was to place the bound in the where-clause of a generic impl
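A self-contained sketch of that incorrect attempt (again with stand-in constants):

```rust
struct Hasher;

trait HashDispatcher<T> {
    fn digest() -> T;
}

impl HashDispatcher<u64> for Hasher {
    fn digest() -> u64 {
        64
    }
}

trait HashVal {}

// Reads: "for every T such that Hasher: HashDispatcher<T> holds, T is a
// HashVal" -- the converse of the wanted implication.
impl<T> HashVal for T where Hasher: HashDispatcher<T> {}

// u64 receives HashVal through the blanket impl above:
fn requires_hashval<T: HashVal>() {}

fn main() {
    requires_hashval::<u64>();
}
```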

I mistakenly thought this would fulfill my purpose. The statement should instead be read as “for every T that satisfies Hasher: HashDispatcher<T>, T is a HashVal”, which delivers an implication converse to what I expected. Thanks to u/schungx for pointing this out in the reddit thread. As a counter-example, one is able to impl HashVal for other types without ensuring that they satisfy my bound

## The Correct yet Flawed Attempt

u/SkiFire13 mentioned that the trait bound should be placed at the definition of HashVal to meet my requirement like this
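A sketch of that correct version (stand-in constants as before):

```rust
struct Hasher;

trait HashDispatcher<T> {
    fn digest() -> T;
}

impl HashDispatcher<u64> for Hasher {
    fn digest() -> u64 {
        64
    }
}

// The bound now sits in HashVal's own definition: any T implementing
// HashVal must satisfy Hasher: HashDispatcher<T> beforehand.
trait HashVal
where
    Self: Sized,
    Hasher: HashDispatcher<Self>,
{
}

impl HashVal for u64 {} // OK: Hasher: HashDispatcher<u64> holds
// impl HashVal for String {} // error: Hasher: HashDispatcher<String> unsatisfied

fn main() {
    assert_eq!(<Hasher as HashDispatcher<u64>>::digest(), 64);
}
```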

I had no memory of seeing a where-clause in a trait definition before. The syntax is not introduced by “The Book”, but rather mentioned in the RFC of where clauses.

The where-clause for traits is not a new concept. In fact, the “supertrait” bound can be regarded as a specialized version of a where-style bound
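For instance (trait and type names here are hypothetical), the two declarations below are equivalent:

```rust
// A supertrait bound...
trait Event: Clone {}

// ...is sugar for a where-style bound on Self.
trait EventAlt
where
    Self: Clone,
{
}

#[derive(Clone, PartialEq, Debug)]
struct Tick(u32);

impl Event for Tick {}
impl EventAlt for Tick {}

// Either bound lets us clone through the trait:
fn dup<T: Event>(t: &T) -> T {
    t.clone()
}

fn main() {
    assert_eq!(dup(&Tick(7)), Tick(7));
}
```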

More generally, the where-clause is used to elaborate the constraints that the typevars (or the special Self) should satisfy. If SomeT: Trait holds, the type SomeT should meet all the requirements in Trait’s where-clause.

As for our case, the where-clause grants an upper bound for HashVal – any type T implementing HashVal should satisfy Hasher: HashDispatcher<T> beforehand, which is precisely our requirement.

With this declaration, however, we still cannot reduce the trait bound of get_hash to T: HashVal, due to a flaw in the current compiler. A long discussion, “where clauses are only elaborated for supertraits, and not other things”, can be found on GitHub, dating back to 2015.

In short, apart from some simple constraints like supertraits, the constraints in a trait’s where-clause are only respected within the trait definition (to ensure some type-checks in the trait can pass), and are not promoted anywhere else.

The flaw is quite annoying: we still have to replicate the verbose trait bounds here and there. Hopefully it will be fixed in the future.