- There are several advantages to intrusive node-based data structures that I haven't seen stated in the article or the comments:
- the same object can move between containers with no allocation and no need for a dedicated complex API
- the same object can be part of multiple containers at once; particularly useful for intrusive binary trees, for indexing data with different criteria.
- the container can be fully polymorphic, no need for all elements to be the same dynamic type.
- no need for complex allocators, you can just store the objects as you see fit.
- > The new version isn't generic. Rather, you embed the linked list node with your data. This is known as an intrusive linked list and tends to perform better and require fewer allocations.
I don’t get this. In the generic version the data is embedded with the linked list node, so there’s only one allocation per node. ‘T’ isn’t a pointer (or doesn’t have to be) there’s no indirection between data and node.
by steventhedev
6 subcomments
- This feels like a net negative result. It removes some of the complexity of using generics, but it couples between the data type and the collections it can be indexed by.
What are the benefits of this approach? Is it limited to data alignment, or is it out of a greater desire to remove generics?
by lightingthedark
4 subcomments
- Can someone explain how the claim of higher performance works here? In C, which lacks generics, an intrusive list is preferred because otherwise you end up with each node having a void pointer to the data it holds. The previous Zig version was a generic, so the data type could easily be a value type. Given that the compiler is allowed to rearrange the layout of both structs unless data is packed or extern (in which case it almost certainly won't want to have a linked list node in the middle anyway) isn't the resulting type exactly the same in memory unless you intentionally made T a pointer type?
- I don't use Zig, but one advantage of having a genericised intrusive linked list where the next pointer doesn't have to be the first thing in the structure is when you want to use larger types, such as 128-bit fields. Sticking a pointer at the beginning would mean the compiler would have to insert alignment padding after the pointer or break the default alignment.
- For better or worse, the Zig community has long been embracing the `@fieldParentPtr` builtin as the primary way to do things like OOP-ish inheritance, so this change feels pretty in-character for the language to me.
- Isn't the new one also horribly type-unsafe? Since you can put (parent) objects of any kind into the list without any mechanism to detect which is which?
- Couldn't we have a function to return the data like this?
pub const SinglyLinkedList(comptime T: type) type {
return struct {
first: ?*Node = null,
pub const Node = struct {
next: ?*Node = null,
};
const Self = @This();
pub fn data(self: Self) *T {
return @fieldParentPtr("node", self);
}
};
by alexvitkov
0 subcomment
- Please just use a pointer. Use metaprogramming when it makes sense, not to obfuscate something that was figured out 60 years ago.
by DeathArrow
3 subcomments
- I wonder why should a language implement in its libraries a SingleLinkedList and a DoubleLinkedList.
I do get why you need an array-like list, a dictionary/hashmap, a stack and a queue. I got the impression that linked lists aren't used very often. Or maybe it's a different case with Zig?
- > We give @fieldParentPtr the name of the field, a pointer to that field as well as a return type (which is inferred above by the assignment to a *User variable), and it gives us back the instance that contains that field.
That seems like a refactoring nightmare. If I say:
const user: *NotAUser = @fieldParentPtr("node", n);
Will it calculate this incorrectly? I mean, if `NotAUser` has a "node" field and `n` is actually from the `User` type (and it has a different offset in each type)?
- Not super uncommon in C land, I learned the technique from the Linux kernel.
https://github.com/codr7/hacktical-c/tree/main/list
Haven't come across another language (besides Zig maybe) low level enough for it to make sense. I've tried in C++ and Go, but it gets so ugly that it loses its appeal.
- This looks somewhat horrifying. How do I safely write a function that takes a linked list as a parameter?
by RKFADU_UOFCCLEL
0 subcomment
- This is in fact a com.on technique in the Linux kernel:
https://en.wikipedia.org/wiki/Offsetof#Usage
- Hmph, is there any rationale against a generic wrapper over it? Exposing the internal details can be irky in higher level application development. It would not hurt to have some inline functions.
by RcouF1uZ4gsC
0 subcomment
- Awesome! In my opinion, an intrusive linked-list is almost always what you really want when you reach for a linked list. An external linked list is often worse than a vector.
- I read Zig's new LinkedIn API, and I was intrigued
by budmichstelk
0 subcomment
- [dead]
- Linked lists should be obscure niche data structures for when you absolutely need their unique characteristics, not some front-and-center default.
- A few notes I would make as an outsider:
- conventionally we put the `node' struct at the start of the `user' struct. The advantage is that we can convert from node* to user* with a simple cast instead of messing around with offsets. Knowing the offset stuff is still potentially useful in case you want multiple `node's in a single struct, but that is seldom the case so you can almost always get by with just a cast.
- this strategy is, especially with the node at the start of the user, equivalent to inheritance. I take it from the fine article that zig doesn't have inheritance? If they are planning on adding more interfaces like this linked list one, then inheritance would be a good idea. We often treat inheritance using analogies to the animal kingdom or to geometric shapes, but data structures like linked lists are (to my understanding) the problem that inheritance is actually designed to solve, and it is really the best approach to solving problems like in the OP.
- just lol on zig speed running being a crappier C++.