Hacker News new | past | comments | ask | show | jobs | submit login
Simple Trie Implementation in C++11 (xorz57.netlify.com)
21 points by xorz57 29 days ago | hide | past | web | favorite | 61 comments

Like someone else mentioned, there's no need for Node pointers at all here, storing Nodes by value cuts away a decent chunk of complexity and should run plenty faster (especially when threads are linked).

I would consider switching the unordered_map to a sorted vector<pair<T, Node>>; or even better, move the key into Node and simply store as vector<Node>. Hash tables are relatively expensive to create in exchange for excellent performance for large tables, and you're creating a lot of tiny tables. A regular map would likely fall somewhere in between performance-wise since they're cheaper to create.

I don't think Node can contain a std::vector<Node>, since STL containers cannot have incomplete value types. But maybe I'm misunderstanding you?

I think absl::flat_hash_map<T, std::unique_ptr<Node>> would be worth considering in this application. Keys and values are stored inline, so the memory and creation costs should be comparable to std::vector - https://abseil.io/docs/cpp/guides/container#abslflat_hash_ma...

Not all containers though, since C++17 std::vector, std::list and std::forward_list do support incomplete element types as long as the allocator meets certain requirements. So std::vector<Node> is definitely OK.

Thanks for pointing this out! I didn't know that C++17 had relaxed this restriction.

Very nice, had no idea.

Nodes it is then :)

Do you have any links to share describing the change?


True enough, my mistake :)

Just getting rid of the hash buckets and shared_ptrs would be a giant leap in the right direction.

Nice. I have no experience from absl, but I've sucessfully used sorted vectors to speed up these kinds of solutions on several occasions [0].

[0] https://github.com/codr7/cidk/blob/master/src/cidk/env.hpp

> STL containers cannot have incomplete value types

As of C++17, certain containers have minimal support for incomplete types.

I did a quick re-implementation [0] taking these ideas into account. It also avoids assuming anything about inputs except iterating the key type and moves most functionality into Node to support more use cases.

[0] https://github.com/codr7/stash/blob/master/trie.cpp

Aren't the second and third lines of the insert/search implementations superfluous? Casting root to a boolean will never return false, since it is initialized. The insert implementation doesn't update root, so if root were nullptr, insert would be throwing away its work.

Yes, indeed!

I really love this! I saw your github repo "forest" as well. I wish there were more bundled resources or code examples like this. Not because I think everything should be out there in the open, free to get. But more as a way of retrieving examples about (maybe complex) algorithms.

Traditional books fail me in most of the cases as I want modern techniques combined with (maybe) "old" algorithms.

I would love it if people could show me modern examples and implementations (and reasoning) about let's say things from clrs for example:

- trees (red/black, splay, B, quadtrees, k-trees

- dynamic programming

- hasing (extendible, linear)

- pairing heaps, binomial queues

- shortest distances with dijkstra, johnson, bellman-ford

- finite state machines etc

- string algorithms like boyer-moore, knuth-morris-pratt

- (...)

FYI: I had no account prior to this comment, created one just for you ;) Keep up the good work OP!

I've implemented a red-black tree a while ago and wrote a post about it: http://jstimpfle.de/blah/rbtree/main.html

Technically it's been a solid implementation and the post probably contains a few helpful bits. But I overengineered the implementation by splitting in too many source files, and relying on source code generation.

A much cleaner "no bullshit" version of it now lives as part of my current project: https://github.com/jstimpfle/astedit . I've also added augmentation facilities (not heavily tested). I think it's pretty good, I just might open up the internals even more than they already are.

(I think the std::map approach from STL definitely has merits, especially when one needs just an associative container, quickly. But it's not a very efficient nor flexible approach.)

That project also has a text rope implementation that uses the red-black tree. I think a rope is much like a B-tree, but I might be wrong.

Also have a look at the linux kernel rbtree implementation. But I think it's not easily usable outside the linux tree.

I am pleased to hear you like my work! I have written a Red/Black and a Splay Tree Implementation as part of the forest repository but I wasen't satisfied enough with those so I removed them. I am planning to implement more stuff in the future so stay tuned!

Ohn it would have been cool to have a look. I have some start questions and code from a few years back at uni but I really disliked the courses, prof and way of programming. If you would be interested I can share them to gain some insights or motivations on how to (maybe not) do it :)

I feel you.

Is `shared_ptr` just used for convenience? I think `unique_ptr` would be excellent for this tree structure.

It's not too bad to traverse the tree using non-owning raw pointers.

I don't think there's need for smart pointers at all, really. The "children" member could just be an unordered_map<T, Node>.

Edit: on second thought, having a class contain a container of itself actually isn't allowed (though you would think it would be in cases like this).

Edit again: apparently it depends on the container. unordered_map does not allow instantiation with an incomplete type, but other containers do (e.g. vector or map).

Yes, I was playing around with it as well. It seems the support has been out there for a while, but officially since C++17 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n437... was accepted, s.t. std::vector, std::list and std::forward_list can work with incomplete types:

   struct example {
     std::vector<example> xs;

I believe containers can contain incomplete types as of C++17, but you cannot use their members.

Yes `unique_ptr` would be better!

I will probably include this implementation in a repository of mine called "forest". You can check it out here https://github.com/xorz57/forest Or you could copy paste it and open a pull request with additional features!

Doesn't using a map to implement a trie defeat the purpose? I mean, would such an implementation make sense in some situation?

Yes, it will, because map has O(1) GET operations, and that's all the trie needs.

In terms of space complexity, it does consume more space than a single element / node tree, (especially for maps with super low load factor) but trie data structure can accommodate all that extra space easily.

Do you mean using an array instead of map for storing pointer to the next node? Conceptually, an array is just a map that uses an integer and identity function and the hashing function.

In practice, the map implementation trades speed for space efficiency.

For an array implementation, the entire array of pointers has to be allocated for all possible character set for each node. This leads to wasted space since trie nodes usually only have child pointers for a subset of character sets. Map implementation don’t have to allocate space to store pointers for the entire character set, so it can be more space efficient. However, the hash function is probably an order of magnitude more cycle than that for array (usually just the ascii code with offset and some bound checking).

Well, an unordered_map / hash map is in the end backed by a dynamically sized array, so "Map implementation don’t have to allocate space to store pointers for the entire character set" might not be true for two reasons: (1) there _is_ an additional allocation* that is not present in the std::array case for the dynamic array and (2) there is a default size of this unordered_map when nothing has been inserted yet, are you sure it's smaller than 256 elements?

* Well, maybe there is something like small vector optimization going on for small unordered_maps, though.


For the standard implementation, the dynamically allocated array may be bigger than 256. I would imagine, since it’s C++, it’s probably possible to set the initial array size.

What is the best alternative still supporting UTF8? Yeah, sure, if you only support ASCII input you could use std::array<std::unique_ptr<Node>, 128>.

Edit: well, of course you could break up the UTF8 string in bytes and stick with a `std::array<std::unique_ptr<Node>,256>` structure. Is this how people do it (supposing normalizing the string is not an option)?.

> well, of course you could break up the UTF8 string in bytes and stick with a `std::array<std::unique_ptr<Node>,256>` structure. Is this how people do it (supposing normalizing the string is not an option)?

There are lots of options, but to a first approximation, yes, that's standard.

The key is to think of a trie as a special case of a finite state machine:

1. Your transition table should be a single contiguous block of memory. Namely, going from state s1 for byte b should be something like `trans[s1 * 256 + b]`. This reduces the amount of pointer chasing at search time.

2. With a small runtime cost (typically imperceptible for a large enough FSM), you can map your 256-sized alphabet down to a set of equivalence classes. e.g., Only to a set of bytes that discriminate a match. So your transition lookup becomes `trans[si * num_equiv_classes + equiv_class(b)]`. Although, this is harder to pull off if your trie supports mutation.

3. Instead of storing state identifiers in your transition table, you can store premultiplied state identifiers, which avoids the additional stride calculation at search time. So your transition lookup then becomes `trans[si + equiv_class(b)]`.

Those all mostly apply to dense representations. But if memory is more important, then you can use a sparse representation. In a sparse representation, the size of each state is variable, which means moving to the next transition is more costly. But it uses much less space.

I have shipped a trie into production that used std::unordered_map to store children and it performed quite well.

That's great!

This implementation lacks operations like clear() or remove(). I could create a repository if you would like to contribute.

I would love to contribute. Ping me if you get the repo setup.


I couldn't come up with such a concise and succinct implementation even if I tried.

I tried giving this a run from CLion and it crashes if I search first, see

#include "trie.hpp" #include <iostream>

int main() { trie<char16_t> trie;

    // Greek
    std::cout << trie.search(u"υπολογιστής") << std::endl;
    std::cout << trie.search(u"υπολογιστης") << std::endl;

    // English
    std::cout << trie.search(u"computer") << std::endl;
    std::cout << trie.search(u"compute") << std::endl;

    // Greek
  trie.insert(u"υπολογιστής"); //crash

  // English

  // Greek
  std::cout << trie.search(u"υπολογιστής") << std::endl;
  std::cout << trie.search(u"υπολογιστης") << std::endl;

  // English
  std::cout << trie.search(u"computer") << std::endl;
  std::cout << trie.search(u"compute") << std::endl;

  return 0;

My apologies. I updated the post! Now it works just fine!

There's still an issue if you insert something you've searched for first, e.g.:

  trie<char16_t> trie;


I verify this is true! I am working on it!

On a side note, I recently started competitive programming in my college, and while preparing for the upcoming xCPC competitions, I was wondering if I should use "auto" and a more functional approach in C++ (instead of using int, double, etc. and OOP). What do you think?

You mean for competitions, or in general? For competitions, go with whatever is faster/reduces your typing. In general, I think a lot of people would the same thing, but I'm in the (apparently small?) camp that thinks auto is massively overused. I use typedefs liberally instead of using auto all over the place. It prevents some implicit casts from being inhibited (which are rare, but which can cause correctness issues), and it documents things better. The cost is more typing, which in my view is just something you have to learn C++ is not intended to minimize.

I think typing isn't my issue (so far), but rather the performance. On one hand, it's one less thing to think about while figuring out a problem, but on the other hand, it might reduce the performance of my solution, and in that case, going back and refactoring the code to use the correct types would be a waste of time...

You seem to have a misconception of what `auto` entails... It just means "deduce type from the right-hand side of the assignment statement", instead of having to type it explicitly. Doesn't make your program dynamically typed or runtime typed all of a sudden.

Oh interesting. Why do you think auto would affect run-time performance? Could you describe an example?

I think auto won't have any run-time performance since type deduction happens at compile-time.

Also read this https://stackoverflow.com/questions/19618759/c-11-auto-compi...

I'm already well aware of this. I'm asking why the OP thinks otherwise. It's not outright impossible to make it affect performance [1] but I'm trying to see what the concern is.

[1] Like by taking the approach that results in a cast, e.g. here: https://news.ycombinator.com/item?id=20493263

Oh, I see. I didn't know that type deduction happens at compile-time :o

Yeah, that's how type systems work in most languages. That's why we have them: they prevent mistakes before you even run your code :)

Why would you think that using auto would reduce the performance of your code?

I usually explicitly type my integer constants based on the problem description (e.g. if N < 1e9 I would use int_fast32_t) and then let auto/decltype do most of the work after that.

I don't like auto over int, and other primitives. But auto in for loops or with iterators seems the way to go and is a bless!

With begin()/end() it's probably fine, but for-each loops it's not a great idea since then you run into problems when there was a cast intended. The main (only?) example of this pattern in the standard is vector<bool>, so of course people's immediate reaction to blame this on vector<bool>. However that's just the example currently in the standard, and people use it for other things too. By using auto you end up writing code that can break. Example:

  typedef std::vector<bool> Vec;
  Vec f(1, false);
  for (typename Vec::value_type &&x : f) { std::cout << typeid(x).name() << std::endl; }
  for (auto &&x : f) { std::cout << typeid(x).name() << std::endl; }

Do you have examples of cases other than vector<bool> where this is an issue?

Thanks, those were interesting to read about. It doesn't seem like something that would come up in the for loop case, though.

I mean, you can imagine proxies for stuff beside bools and matrices...

  template<class It> struct iterator_range { It b, e; It begin() const { return b; } It end() const { return e; } };
  template<class It> iterator_range<It> make_iterator_range(It b, It e) { iterator_range<It> r = { b, e }; return r; }

  template<class T> struct endian_reversed_iterator {
   T *p;
   struct reference {
    T *p;
    operator T() const { return boost::endian::endian_reverse(*p); }
    reference &operator =(T const &other) const { *p = boost::endian::endian_reverse(other); }
   endian_reversed_iterator(T *p) : p(p) { }
   reference operator *() { reference r = { p }; return r; }
   endian_reversed_iterator &operator++() { ++p; return *this; }
   bool operator!=(endian_reversed_iterator const &other) const { return p != other.p; }

  int main() {
   unsigned arr[] = { 1, 2 };
   for (auto x : make_iterator_range<endian_reversed_iterator<unsigned> >(&arr[0], &arr[2])) {
    ++x;  // whatever

That's what I've been doing. My issue is whether using auto would affect performance.

Just keep in mind that auto is different than const auto. Performance won't be an issue.

Personally, I use auto only when I have to write a long typename (ex. Iterators).

And what about object-oriented programming vs functional programming? Performance-wise, that is.

I am not sure about that. I am fairly new to functional programming :(.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact