The original post: /r/linux by /u/diagraphic on 2025-01-20 20:31:37.

Hey everyone, I hope you’re all well. I’d like to share progress on TidesDB. If you don’t know, TidesDB is an open-source library that provides an embedded key-value database built for fast write throughput, implementing a unique log-structured merge tree. We are currently at 2 months of active development. I’d love to hear your feedback, insights, and more!

Here are some of the current features:

  • ACID transactions are atomic, consistent, isolated, and durable. Transactions are tied to their respective column family.
  • Concurrent access lets multiple threads read from and write to the storage engine. Column families use a read-write lock, allowing multiple readers and a single writer per column family. On commit and rollback, a transaction blocks other threads from reading or writing to the column family until it has completed. A transaction in itself is also thread safe.
  • Column Families store data in separate key-value stores. Each column family has its own memtable and sstables.
  • Atomic Transactions commit or roll back multiple operations atomically. When a transaction fails, it rolls back all committed operations.
  • Cursors iterate over key-value pairs forward and backward.
  • WAL (write-ahead logging) for durability. Column families replay their WAL on startup. This reconstructs the memtable if the column family did not reach its flush threshold prior to shutdown.
  • Multithreaded Compaction is a manual, multi-threaded, paired-and-merged compaction of sstables. When run, for example, 10 sstables compact into 5 as they’re paired and merged. Each thread is responsible for one pair; you can set the number of threads to use for compaction.
  • Background Partial Merge Compaction can optionally be started. If started, the system incrementally merges sstables in the background, from oldest to newest, once a column family’s sstables have reached a specified limit. Merges are done every n seconds. Merges are not done in parallel but incrementally.
  • Bloom Filters reduce disk reads by reading initial blocks of sstables to check key existence.
  • Compression is achieved with Snappy, LZ4, or ZSTD. SSTable entries can be compressed, as can WAL entries.
  • TTL (time-to-live) for key-value pairs.
  • Configurable column families can be set up with a memtable flush threshold, memtable data structure, max level and probability (when the data structure is a skip list), compression, and bloom filters.
  • Error Handling API functions return an error code and message.
  • Easy API that is simple to use.
  • Multiple Memtable Data Structures are supported; a memtable can be a skip list or a hash table.
  • Multiplatform, with Linux, macOS, and Windows support.
  • Logging system logs debug messages to a log file. This can be disabled. The log file is created in the database directory.
  • Block Indices are enabled by default (TDB_BLOCK_INDICES is set to 1). This means that for each column family sstable there is a last block containing a sorted binary hash array. This compact data structure gives us the ability to retrieve the specific offset for a key and seek to its containing key-value pair block within an sstable without having to scan the entire sstable. If TDB_BLOCK_INDICES is set to 0, block indices are neither used nor created, and reads are slower and consume more IO and CPU because they have to scan and compare.
  • Statistics, configs, and other column family information can be retrieved through the public API.
  • Range queries are supported. You can retrieve a range of key-value pairs.
  • Filter queries are supported. You can filter key-value pairs based on a filter function.
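
To make the paired-and-merged compaction idea from the list above a bit more concrete, here is a minimal Python sketch. It is not TidesDB's code or API: the function names, the in-memory list representation of an sstable, and the rule that an unpaired sstable is carried over unchanged are all assumptions made for illustration. It only shows the general shape: adjacent sstables are paired oldest first, each pair is merged on its own worker thread, and on a duplicate key the newer value wins.

```python
from concurrent.futures import ThreadPoolExecutor


def merge_pair(older, newer):
    """Merge two sorted (key, value) lists; on duplicate keys the newer value wins."""
    merged = []
    i = j = 0
    while i < len(older) and j < len(newer):
        if older[i][0] < newer[j][0]:
            merged.append(older[i])
            i += 1
        elif older[i][0] > newer[j][0]:
            merged.append(newer[j])
            j += 1
        else:
            merged.append(newer[j])  # same key: keep the newer entry
            i += 1
            j += 1
    merged.extend(older[i:])
    merged.extend(newer[j:])
    return merged


def paired_compaction(sstables, max_threads=2):
    """Pair adjacent sstables (oldest first) and merge each pair on a worker thread."""
    pairs = [(sstables[k], sstables[k + 1]) for k in range(0, len(sstables) - 1, 2)]
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        compacted = list(pool.map(lambda p: merge_pair(*p), pairs))
    if len(sstables) % 2 == 1:
        compacted.append(sstables[-1])  # assumption: an unpaired sstable is kept as-is
    return compacted


# Ten tiny sstables, oldest first; each holds the shared key "a" and its own "k<n>".
sstables = [[("a", n), (f"k{n}", n)] for n in range(10)]
result = paired_compaction(sstables, max_threads=4)
print(len(result))  # 5
print(result[0])    # [('a', 1), ('k0', 0), ('k1', 1)]
```

A real engine would also have to deal with tombstones, expired TTLs, and writing the merged output back to disk, none of which this sketch attempts.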

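The block index feature in the list above describes a sorted hash array that maps a key to the offset of its block. A rough Python sketch of that lookup pattern follows. It is an illustration only and not TidesDB's on-disk format: the CRC32 hash, the function names, and the (hash, offset) tuple layout are assumptions.

```python
import bisect
import zlib


def key_hash(key: bytes) -> int:
    # Stand-in hash for illustration; not necessarily what TidesDB uses.
    return zlib.crc32(key)


def build_block_index(entries):
    """entries: iterable of (key, block_offset). Returns (hash, offset) pairs sorted by hash."""
    return sorted((key_hash(key), offset) for key, offset in entries)


def candidate_offsets(index, key: bytes):
    """Binary-search the sorted array and return every offset stored under the key's hash."""
    h = key_hash(key)
    # Offsets are non-negative, so (h, -1) sorts before any real entry with hash h.
    pos = bisect.bisect_left(index, (h, -1))
    offsets = []
    while pos < len(index) and index[pos][0] == h:
        offsets.append(index[pos][1])
        pos += 1
    return offsets


index = build_block_index([(b"apple", 0), (b"banana", 4096), (b"cherry", 8192)])
print(candidate_offsets(index, b"banana"))
```

Because two different keys can hash to the same value, a reader following this pattern would still compare the actual key stored at each candidate offset before returning a value. The point is that the search costs a handful of comparisons instead of a scan over the whole sstable.
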
It’s a passion project I started! I’ve been researching and writing database internals and log-structured merge trees for a long while. It’s something I do daaiiillyy!

GITHUB

https://github.com/tidesdb/tidesdb

Thank you for checking out my thread! :)