It's not really geospatial; you just have multidimensional indices and an intersect operation. I'm not dissing you, indices like that can be extremely useful and tricky to implement!
That said, why only 4D? 5D is very useful, but no one seems to support it :( (x,y,z,time,value)
Thank you for the feedback. I like the idea of lifting the cap on the number of dimensions. I hardcoded a limit of 4 to handle the standard XYZM, but the implementation could technically handle around 20.
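To illustrate why the dimension count is mostly a bookkeeping detail, here is a minimal Go sketch of an intersect test that works for any number of dimensions, including the 5D (x,y,z,time,value) case. The Rect type and Intersects function are hypothetical names made up for this example; this is not BuntDB's actual code or API.

```go
package main

// Rect is a hypothetical axis-aligned box in N dimensions.
// Min and Max must have the same length, which is the dimension count.
type Rect struct {
	Min, Max []float64
}

// Intersects reports whether a and b overlap in every dimension.
// Boxes that only touch on a boundary count as intersecting.
// Boxes with mismatched dimension counts are treated as not intersecting.
func Intersects(a, b Rect) bool {
	n := len(a.Min)
	if len(a.Max) != n || len(b.Min) != n || len(b.Max) != n {
		return false
	}
	for i := 0; i < n; i++ {
		// A gap on any single axis is enough to rule out an overlap.
		if a.Max[i] < b.Min[i] || b.Max[i] < a.Min[i] {
			return false
		}
	}
	return true
}
```

The loop is the whole story: one comparison pair per axis, so going from 4 to 5 dimensions only adds one more iteration.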
I really like Bolt. It's a wonderful library with a good API. I was inspired by the simplicity of its transaction model.
Both Bolt and Bunt are ACID, and both persist data to disk. The biggest difference between them is that Bolt reads and writes from disk, while Bunt reads and writes from memory (and has an append-only file for durability).
Therefore the amount of data that Bolt can handle is limited by the size of the disk, while Bunt is limited by the amount of RAM.
A general purpose database user will likely see a bump in performance by moving from Bolt to Bunt. What I'm seeing for my projects is about 2x on reads and about 40x on writes. I wrote a Raft store implementation that is a drop-in replacement for the Bolt version. Here's a comparison benchmark: https://github.com/tidwall/raft-boltdb#benchmarks
It really comes down to what you need. Lots of data, or lots of speed.
> Both Bolt and Bunt are ACID, and both persist data to disk. The biggest difference between them is that Bolt reads and writes from disk, while Bunt reads and writes from memory (and has an append-only file for durability).
Just to be sure: does this mean that Bunt has a window of time where data is only in RAM, and it is eventually persisted? Because the description made me think that BuntDB was purely in-memory. Is there some upper limit on how much time an object may be in memory but not persisted yet?
On another note, congrats for this project. I see that you changed the default "Set" to use strings instead of bytes, this was a bit of a pain point when I used BoltDB. Indexes should also be interesting.
Bunt is a purely in-memory database, but it also persists to disk so that the database can be reopened. It's a lot like Redis in this manner.
Basically, BuntDB requires that data be persisted prior to completing a transaction. There is no window of time where there is data in memory and not on disk. It's designed so that there is no way for data to exist in memory and not be on disk.
I decided that strings were a better way to go because 1) the string is the most common type in a key/value database, 2) strings take up less memory than a byte slice, and 3) strings are just bytes anyhow so they can always be converted using []byte(str).
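A small Go sketch of points 2 and 3, for anyone curious. The helper names are made up for this example. With the standard Go compiler on a 64-bit platform, a string header is a pointer plus a length (16 bytes), while a slice header also carries a capacity (24 bytes); the payload bytes are the same size either way.

```go
package main

import "unsafe"

// HeaderSizes returns the size in bytes of a string header and of a
// []byte header on the current platform (payload bytes not included).
func HeaderSizes() (str, slice uintptr) {
	var s string
	var b []byte
	return unsafe.Sizeof(s), unsafe.Sizeof(b)
}

// RoundTrip converts a string to bytes and back. Each conversion copies
// the data, so the result is equal to, but independent of, the input.
func RoundTrip(s string) string {
	b := []byte(s)
	return string(b)
}
```

The conversions are not free since each one copies, so this is a convenience rather than a zero-cost trick.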
Thanks for the kind words and I hope you give it a try.
I can understand why they only allow one read/write transaction at a time.
However, could they implement multiple concurrent read/write transactions by having the transaction fail if it writes to any key modified by any other concurrent transaction?
Like if writer X modifies a key at time t1, but writer Y opens a transaction at time t0 and tries to modify the same key at time t2, Y is told their transaction is invalid and should restart their operation from the beginning.
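A minimal Go sketch of that scheme, to make the question concrete. The DB and Tx types are hypothetical and have nothing to do with BuntDB's API. Writes are buffered in the transaction, and at commit time the transaction fails if any key it wrote was committed by someone else after it began.

```go
package main

import (
	"errors"
	"sync"
)

// ErrConflict tells the caller to restart the transaction from the beginning.
var ErrConflict = errors.New("key modified by a concurrent transaction")

// DB is a hypothetical store that remembers which commit last wrote each key.
type DB struct {
	mu      sync.Mutex
	clock   uint64 // incremented on every successful commit
	data    map[string]string
	version map[string]uint64 // commit number that last wrote the key
}

// Tx buffers writes until Commit.
type Tx struct {
	db     *DB
	start  uint64 // value of db.clock when the transaction began
	writes map[string]string
}

func NewDB() *DB {
	return &DB{data: map[string]string{}, version: map[string]uint64{}}
}

// Begin records the current commit number as the transaction's start time.
func (db *DB) Begin() *Tx {
	db.mu.Lock()
	defer db.mu.Unlock()
	return &Tx{db: db, start: db.clock, writes: map[string]string{}}
}

// Set only touches the transaction's private buffer.
func (tx *Tx) Set(key, value string) { tx.writes[key] = value }

// Commit validates the write set, then applies it atomically.
func (tx *Tx) Commit() error {
	db := tx.db
	db.mu.Lock()
	defer db.mu.Unlock()
	for k := range tx.writes {
		if db.version[k] > tx.start {
			return ErrConflict // someone committed this key after we began
		}
	}
	db.clock++
	for k, v := range tx.writes {
		db.data[k] = v
		db.version[k] = db.clock
	}
	return nil
}
```

Note that this only checks write-write conflicts, as described above. It does not track reads, so it would not catch every anomaly that running transactions one at a time rules out.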
Sometimes this is slower than serialization. In fact, when you’re doing KV-CRUD work on in-memory data, it’s often slower than serialization. Keeping RW-sets is non-trivial overhead compared to the hardest typical part of KV-CRUD, tree or hash lookups.
Many many systems have more parallelism, but less throughput.
Now, if you want to prevent one bad-actor transaction from blocking the system, then RW-sets, timeouts and OCC/MVCC might be a good idea; they just won’t be faster.
Is it common for this sort of database to not expose an interface over IP? It seems to me that a local-only database would severely restrict the use-cases, but maybe I'm just ignorant of many local-only uses. Or should another program handle the networking, with BuntDB as a backend?
It's common for programs to embed a database engine. Informally, you're doing this every time you write any structured data to a file.
Baking a database into your application drastically simplifies distribution/deployment and avoids network bottlenecks (at the cost of restricting your choice of storage engine, making it harder to hire staff experienced with your tech, etc).