Go Fact: Zero-sized Field at the Rear of a Struct Has Non-zero Size

See the discussion of this article on Reddit.

Go has a concept called the zero-sized type (ZST): a type whose values occupy zero bytes of memory. The best-known example is struct{}, which people often use in map[string]struct{} to emulate a set efficiently. Zero-length arrays such as [0]int are also ZSTs. They are less common, but are sometimes used to enforce certain properties of a custom type, such as making it incomparable or preventing conversions between otherwise identical structs.

type UserName struct {
    name string
    // ZST [0]func() makes the surrounding type incomparable
    _phantom [0]func()
}
println(UserName{name: "a"} == UserName{name: "b"})
// invalid operation: UserName{…} == UserName{…}
// (struct containing [0]func() cannot be compared)

// Differently named zero-sized marker fields keep Tag and Id
// from being converted to each other.
type Tag struct {
    Value string
    _tag  struct{}
}

type Id struct {
    Value string
    _id   struct{}
}

var a = Tag{Value: "Go"}
var b = Id(a)
// cannot convert a (variable of type Tag) to type Id
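
The two compile errors above can also be observed at run time through the reflect package. The following is a minimal sketch, not part of the original post: the helper names isComparable and isConvertible are made up for illustration, and the types are repeated so that the program is self-contained.

```go
package main

import (
	"fmt"
	"reflect"
)

// UserName carries a zero-length array of funcs. Func values are not
// comparable, so the whole struct is not comparable either.
type UserName struct {
	name     string
	_phantom [0]func()
}

// Tag and Id differ only in the name of their zero-sized marker field,
// which is enough to give them different underlying types.
type Tag struct {
	Value string
	_tag  struct{}
}

type Id struct {
	Value string
	_id   struct{}
}

// isComparable (hypothetical helper) reports whether values of v's type support ==.
func isComparable(v interface{}) bool {
	return reflect.TypeOf(v).Comparable()
}

// isConvertible (hypothetical helper) reports whether a value of from's type
// can be converted to to's type.
func isConvertible(from, to interface{}) bool {
	return reflect.TypeOf(from).ConvertibleTo(reflect.TypeOf(to))
}

func main() {
	fmt.Println(isComparable(UserName{}))   // false
	fmt.Println(isComparable(Tag{}))        // true
	fmt.Println(isConvertible(Tag{}, Id{})) // false
}
```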

One might expect a ZST field to always occupy zero bytes of space, but that is not the case. For example, the result of unsafe.Sizeof on UserName may surprise you:

println(unsafe.Sizeof(UserName{}), unsafe.Sizeof("")) // 24 16

On a 64-bit platform, UserName is 8 bytes larger than a string, which means the [0]func() field at the rear effectively costs 8 bytes of padding.
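
To see where the extra bytes go, one can print the offset and size of the trailing field next to the size of the whole struct. This is a minimal sketch, not from the original post, and the helper name layout is made up for illustration. With the standard toolchain on a 64-bit platform it should report an offset of 16, a field size of 0 and a total size of 24, so the difference is trailing padding rather than storage for the field itself.

```go
package main

import (
	"fmt"
	"unsafe"
)

type UserName struct {
	name     string
	_phantom [0]func()
}

// layout (hypothetical helper) returns the offset of the trailing zero-sized
// field, the size of that field, and the size of the whole struct.
func layout() (offset, fieldSize, total uintptr) {
	var u UserName
	return unsafe.Offsetof(u._phantom), unsafe.Sizeof(u._phantom), unsafe.Sizeof(u)
}

func main() {
	offset, fieldSize, total := layout()
	fmt.Println("offset of _phantom:", offset)
	fmt.Println("size of _phantom:  ", fieldSize)
	fmt.Println("size of UserName:  ", total)
	// Whatever lies between the end of the field and the end of the struct
	// is padding added by the compiler.
	fmt.Println("trailing padding:  ", total-offset-fieldSize)
}
```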

Well, this is heavy. ZSTs are meant to provide type safety without any runtime overhead, so a trick that enlarges the memory footprint becomes much less attractive. Why does this happen?

There’s an explanation from the Go team on GitHub:

[…] this means it (a pointer to ZST field) would actually point outside of the allocation.
Unlike many other languages, (C, C++, Rust, …) where creating pointer past the allocation (but not using it) is legal. In go it isn’t because the GC may at anytime inspect any pointer.

This design is a precaution that keeps the garbage collector from getting confused. Roughly speaking, Go organizes allocated objects into memory blocks, and it lets users create pointers to those objects or into their interior. The garbage collector inspects these pointers from time to time, so each of them must point to a valid location inside a memory block. Now consider an edge case:

          memory block
 ______________/\______________
/                              \
+-----+------------------------+----------------------+
| ... |          name          | <unallocated memory> |
+-----+------------------------+----------------------+
       \__________  __________/|
                  \/           |
           u := UserName{}     |
                              \|/
                      p := &(u._phantom)

In this case, an object u of type UserName is placed at the very end of a memory block, and a pointer p points to its _phantom field. If the _phantom field took up zero bytes, p would hold the address of the right boundary of the memory block, that is, the first byte past the allocation. Once the GC followed p, it would end up in memory that does not belong to the block and may be unallocated. Hence, the Go compiler reserves additional bytes at the end of the UserName struct to rule out such cases.
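
The effect of the reserved bytes can be checked directly: the address of the trailing field should fall strictly before the end of the struct. The sketch below is not from the original post, and the helper name phantomInside is made up for illustration. It converts pointers to uintptr only to compare addresses, and it is expected to print true with the standard toolchain.

```go
package main

import (
	"fmt"
	"unsafe"
)

type UserName struct {
	name     string
	_phantom [0]func()
}

// phantomInside (hypothetical helper) reports whether the address of
// u._phantom lies strictly before the end of the memory occupied by *u.
func phantomInside(u *UserName) bool {
	start := uintptr(unsafe.Pointer(u))
	end := start + unsafe.Sizeof(*u) // first address past the struct
	p := uintptr(unsafe.Pointer(&u._phantom))
	return start <= p && p < end
}

func main() {
	u := new(UserName)
	// Without trailing padding, &u._phantom would be equal to the end address.
	fmt.Println(phantomInside(u))
}
```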

That said, there is a way to eliminate this overhead. As discussed in the same GitHub issue, moving the ZST field away from the end of the struct, to the front or the middle, removes the need for the extra reserved bytes, because a pointer to the field can then never land on the allocation boundary.

type UserName2 struct {
    _phantom [0]func()
    name     string
}
println(unsafe.Sizeof(UserName2{})) // 16

Hence, whenever we want to employ the “ZST trick”, it is crucial to place the ZST field at the front or in the middle of the struct, never at the end.
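
The placements can be compared side by side. The following is a minimal sketch, not from the original post: the struct names rear, front, middle and plain and the helper sizes are made up for illustration. With the standard toolchain, only the rear variant is expected to be larger than the same struct without the ZST field.

```go
package main

import (
	"fmt"
	"unsafe"
)

// rear puts the zero-sized field last, so the compiler appends padding.
type rear struct {
	name     string
	_phantom [0]func()
}

// front puts the zero-sized field first.
type front struct {
	_phantom [0]func()
	name     string
}

// middle puts the zero-sized field between two ordinary fields.
type middle struct {
	name     string
	_phantom [0]func()
	id       int
}

// plain has the same ordinary fields as middle, without the zero-sized field.
type plain struct {
	name string
	id   int
}

// sizes (hypothetical helper) returns the sizes of the four layouts above.
func sizes() (r, f, m, p uintptr) {
	return unsafe.Sizeof(rear{}), unsafe.Sizeof(front{}),
		unsafe.Sizeof(middle{}), unsafe.Sizeof(plain{})
}

func main() {
	r, f, m, p := sizes()
	fmt.Println("rear:  ", r)
	fmt.Println("front: ", f)
	fmt.Println("middle:", m)
	fmt.Println("plain: ", p)
}
```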


Author: hsfzxjy.
Link: .
License: CC BY-NC-ND 4.0.
All rights reserved by the author.
Commercial use of this post in any form is NOT permitted.
Non-commercial use of this post should be attributed with this block of text.
