So on eway to think of a hash table/map is a set of tuples of <key, value>. Imag...

eru · on Oct 18, 2019

Correct me, if I am wrong: take everything you described and remove the (logical) timestamp, and all you'd be losing out on would be the iteration in insertion order?

So how does the timestamp have any impact on persistence?

(We agree on basically everything you write.)

> [1]: And you can use an increment-only mutation counter instead of a timestamp to make this pure.

Agreed. That's what I'd call a logical timestamp or a logical clock.

> IIRC, if you're table can assume only insertions, this actually becomes really, really nice as a persistent data structure, since you can replace the backing vector with a backing linked list, and you only really lose out on iteration speed. Then you further extend that by swapping the linked list to a tree, and you share the entire backing structure.

That still doesn't tell you anything at all about how you make the hashtable itself persistent.

One simple way would be to just introduce an arbitrary order on your hashes, eg compare them as if they were ints, and then stick them in an ordered map container. Of course, that's just a round-about way to impose an arbitrary order on your keys. Works perfectly well, it's just not too interesting.

To come back to your original question:

> What's the value of a sorted map over an ordered map? > I don't think I've ever cared to iterate over map values based on the alphanumeric ordering of their keys.

In the cases we talked about making the keys comparable is only a means to an end, and any arbitrary order will do. That's the most common use case. Iterating over the keys in insertion order is also often useful.

In practice, I did come across some use cases that make genuine use of the order of keys. Eg when I was interested in (automatically) keeping track of the highest or lowest key or quantiles.

You can use a sorted map as a priority queue easily. A min heap would give you O(1) access to the minimum item, but if you actually want to pop it, it's O(log n) anyway. A sorted map as tree gives you O(log n) for basically all operations. For most uses that's either good enough, or doesn't even make a difference at all compared to a heap. Using your sorted map as a priority queue gives you arbitrary priority updates in O(log n) for no extra implementation complexity.

Sorted maps also came in handy when I was implementing geometric algorithms where I wanted to sweep a scanline over points. Or for divide and conquer algorithms over space.

A sorted map is also useful as a basic datastructure to build step functions on top of. (https://en.wikipedia.org/wiki/Step_function) A simple application is solving the infamous skyline problem.

When you want to store your data structure, having similar keys close together can help with compression. Google uses sorted string tables (sstable) show that principle quite well.