simple key-value storage api

minimalkv is an API for key-value stores of binary data. Due to its basic interface, it is easy to implement a large number of backends. minimalkv’s origins are in storing user-uploaded files on websites, but its low overhead and simple design should make it applicable to numerous other problems; one example is a session backend for the Flask framework.

Built upon the solid foundation are a few optional bells and whistles, such as automatic ID generation/hashing (in minimalkv.idgen). A number of backends are available, ranging from FilesystemStore to support for Amazon S3 and Google Storage through BotoStore.

A faster in-memory store suitable for session management and caching is supported through RedisStore.

Example

Here’s a simple example:

from minimalkv.fs import FilesystemStore

store = FilesystemStore('./data')

store.put(u'key1', b'hello')

# will print the stored bytes (b'hello' on Python 3)
print(store.get(u'key1'))

# move the contents of a file to "key2" as efficiently as possible
store.put_file(u'key2', '/path/to/data')

Note that by changing the first two lines to:

from minimalkv.memory.redisstore import RedisStore
import redis

store = RedisStore(redis.StrictRedis())

you could use the code exactly the same way, this time storing data inside a Redis database.

Store factories (from configurations)

There are two ways to create a store from a configuration:

  1. Use a dictionary with configuration data (e.g. loaded from an ini file)

from minimalkv import get_store

params = {
    'account_name': 'test',
    'account_key': 'XXXsome_azure_account_keyXXX',
    'container': 'my-azure-container',
}
store = get_store('azure', **params)
store.put(u'key', b'value')
assert store.get(u'key') == b'value'

  2. Use a URL to specify the configuration

from minimalkv import get_store_from_url, get_store

store = get_store_from_url('azure://test:XXXsome_azure_account_keyXXX@my-azure-container')
store.put(u'key', b'value')
assert store.get(u'key') == b'value'

URL and store types:

  • In memory: memory:// and hmemory://.

  • Redis: redis://[[password@]host[:port]][/db] and hredis://[[password@]host[:port]][/db]

  • Filesystem: fs:// and hfs://, e.g., hfs:///home/user/data

  • Amazon S3: s3://access_key:secret_key@endpoint/bucket[?create_if_missing=true] and hs3://access_key:secret_key@endpoint/bucket[?create_if_missing=true]

  • Azure Blob Storage (azure:// and hazure://):
    • with storage account key: azure://account_name:account_key@container[?create_if_missing=true][?max_connections=2], e.g., azure://MYACCOUNT:passw0rd!@bucket_66?create_if_missing=true

    • with SAS token: azure://account_name:shared_access_signature@container?use_sas&create_if_missing=false[?max_connections=2&socket_timeout=(20,100)]

    • with SAS and additional parameters: azure://account_name:shared_access_signature@container?use_sas&create_if_missing=false[?max_connections=2&socket_timeout=(20,100)][?max_block_size=4*1024*1024&max_single_put_size=64*1024*1024]

  • Google Cloud Storage (gcs:// and hgcs://):
    • hgcs://<base64 encoded credentials JSON>@bucket_name[?create_if_missing=true&bucket_creation_location=EUROPE-WEST3]

      Get the encoded credentials as a string like so:

      from pathlib import Path
      import base64
      json_as_bytes = Path(<path_to_json>).read_bytes()
      json_b64 = base64.urlsafe_b64encode(json_as_bytes).decode()
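
As a rough, self-contained illustration of why the URL-safe variant of base64 is used here: the encoded credentials end up inside a URL, where characters such as + and / would be ambiguous. The credentials dictionary below is made up for the example (a real service-account JSON has more fields), and the bucket name is the placeholder from the URL pattern above.

```python
import base64
import json

# Hypothetical, made-up credentials used only for this illustration.
credentials = {"type": "service_account", "project_id": "my-project"}

# Same steps as above, but starting from an in-memory JSON document.
json_as_bytes = json.dumps(credentials).encode("utf-8")
json_b64 = base64.urlsafe_b64encode(json_as_bytes).decode()

# The encoded string takes the place of the credentials part of the URL.
store_url = "hgcs://{}@bucket_name".format(json_b64)

# Decoding reverses the process and yields the original JSON document.
decoded = json.loads(base64.urlsafe_b64decode(json_b64))
```

The URL-safe alphabet uses - and _ instead of + and /, so the encoded credentials cannot be mistaken for a path separator inside the store URL.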
      

Storage URLs whose scheme starts with an h select a store with an extended set of allowed key characters. This allows the usage of slashes and spaces in blob names. URL parts shown in [] are optional; the brackets themselves must be removed when writing the URL.
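
To make the anatomy of these URLs more concrete, here is a small sketch that splits a store URL into its parts using only the standard library. It is an illustration, not minimalkv's own parser: the function name describe_store_url and the returned dictionary layout are assumptions made for this example.

```python
from urllib.parse import parse_qs, urlsplit

# Base schemes listed above; each one also has an "h"-prefixed variant.
BASE_SCHEMES = {"memory", "redis", "fs", "s3", "azure", "gcs"}


def describe_store_url(url):
    """Split a store URL into its parts (illustration only)."""
    parts = urlsplit(url)
    scheme = parts.scheme
    # An "h" prefix in front of a known scheme marks the extended-key variant.
    extended = scheme.startswith("h") and scheme[1:] in BASE_SCHEMES
    base = scheme[1:] if extended else scheme
    if base not in BASE_SCHEMES:
        raise ValueError("unknown store type: %r" % scheme)
    # Keep option names without a value (such as "use_sas") as empty strings.
    query = parse_qs(parts.query, keep_blank_values=True)
    return {
        "store_type": base,
        "extended_keys": extended,
        "username": parts.username,
        "password": parts.password,
        "host": parts.netloc.rpartition("@")[2],
        "path": parts.path,
        "options": {name: values[-1] for name, values in query.items()},
    }


azure = describe_store_url(
    "azure://MYACCOUNT:passw0rd!@bucket_66?create_if_missing=true"
)
local = describe_store_url("hfs:///home/user/data")
```

Here azure["host"] is the container name and azure["options"] holds the optional query parameters, while local["extended_keys"] is true because of the h prefix.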

Why you should use minimalkv

no server dependencies

minimalkv depends only on Python and, if you want to use extra features, possibly a few libraries that are easily fetched from PyPI. You do not have to install or run any server software to use minimalkv (but you can add one at any later point).

specializes in (even large!) blobs

The fastest, most basic minimalkv backend implementation stores files on your hard drive and is just as fast. This underlines the focus on storing big blobs without overhead or metadata. A typical use case is starting out small with local files and then migrating all your binary data to something like Amazon’s S3.

The core API

class minimalkv._key_value_store.KeyValueStore

Class to access a key-value store.

Supported keys are ASCII strings of at most 250 characters that contain only alphanumeric characters or symbols from minimalkv._constants.VALID_NON_NUM. Values (or records) are stored as raw bytes.

__contains__(key: str) -> bool

Check if the store has an entry at key.

Parameters

key: str

The key whose existence should be verified.

Raises

ValueError

If the key is not valid.

IOError

If there was an error accessing the store.

__iter__() -> Iterator[str]

Iterate over all keys in the store.

Raises

IOError

If there was an error accessing the store.

delete(key: str) -> str | None

Delete data at key.

Does not raise an error if the key does not exist.

Parameters

key: str

The key of data to be deleted.

Raises

ValueError

If the key is not valid.

IOError

If there was an error deleting.

get(key: str) -> bytes

Return data at key as a bytestring.

Parameters

key: str

The key to be read.

Returns

data: bytes

Value associated with the key as a bytes object.

Raises

ValueError

If the key is not valid.

IOError

If the file could not be read.

KeyError

If the key was not found.

get_file(key: str, file: str | BinaryIO) -> str

Write data at key to file.

Like put_file(), this method allows backends to implement a specialized function if data needs to be written to disk or streamed.

If file is a string, contents of key are written to a newly created file with the filename file. Otherwise, the data will be written using the write method of file.

Parameters

key: str

The key to be read.

file: BinaryIO or str

Output filename or file-like object with a write method.

Raises

ValueError

If the key is not valid.

IOError

If there was a problem reading or writing data.

KeyError

If the key was not found.
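
The filename versus file-object distinction described for get_file() can be sketched with the standard library alone. This is not the library's implementation; write_value is a hypothetical helper that only mirrors the dispatch rule stated above (a string is a filename, anything else is written to through its write method).

```python
import io
import os
import tempfile


def write_value(data, file):
    """Write bytes to a filename or to an object with a write method."""
    if isinstance(file, str):
        # A string is treated as a filename; the file is newly created.
        with open(file, "wb") as handle:
            handle.write(data)
    else:
        # Anything else is assumed to be file-like and is written to directly.
        file.write(data)


# The same bytes end up in an in-memory buffer and in a file on disk.
buffer = io.BytesIO()
write_value(b"hello", buffer)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.bin")
    write_value(b"hello", path)
    with open(path, "rb") as handle:
        from_disk = handle.read()
```

Passing a file-like object is useful when the data should be streamed somewhere other than the local disk, for example into a buffer or a network response.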

iter_keys(prefix: str = '') -> Iterator[str]

Iterate over all keys in the store starting with prefix.

Parameters

prefix: str, optional, default = ''

Only iterate over keys starting with prefix. Iterate over all keys if empty.

Raises

IOError

If there was an error accessing the store.

keys(prefix: str = '') -> List[str]

List all keys in the store starting with prefix.

Parameters

prefix: str, optional, default = ''

Only list keys starting with prefix. List all keys if empty.

Raises

IOError

If there was an error accessing the store.

open(key: str) -> BinaryIO

Open record at key.

Parameters

key: str

Key to open.

Returns

file: BinaryIO

Read-only file-like object for reading data at key.

Raises

ValueError

If the key is not valid.

IOError

If the file could not be read.

KeyError

If the key was not found.

put(key: str, data: bytes) -> str

Store bytestring data at key.

Parameters

key: str

The key under which the data is to be stored.

data: bytes

Data to be stored at key, must be of type bytes.

Returns

str

The key under which data was stored.

Raises

ValueError

If the key is not valid.

IOError

If storing failed or the file could not be read.

put_file(key: str, file: str | BinaryIO) -> str

Store contents of file at key.

Store data from a file into key. file can be a string, which will be interpreted as a filename, or an object with a read() method.

If file is a filename, the file might be removed while storing to avoid unnecessary copies. To prevent this, pass the opened file instead.

Parameters

key: str

Key under which to store the data from file.

file: BinaryIO or str

A filename or a file-like object with a read method.

Returns

key: str

The key under which data was stored.

Raises

ValueError

If the key is not valid.

IOError

If there was a problem moving the file in.
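
The note that a file passed by name "might be removed" is easiest to see with a small experiment. The sketch below is a hypothetical filesystem-style helper, not minimalkv code: it always moves a file given by name and copies from an open file object, which is one possible way a backend could behave.

```python
import os
import shutil
import tempfile


def put_file(store_dir, key, file):
    """Store file under key inside store_dir (hypothetical helper)."""
    target = os.path.join(store_dir, key)
    if isinstance(file, str):
        # Filename: moving avoids a copy, but the source path disappears.
        shutil.move(file, target)
    else:
        # Open file object: the data is copied and the source is left alone.
        with open(target, "wb") as out:
            shutil.copyfileobj(file, out)
    return key


with tempfile.TemporaryDirectory() as tmp:
    store_dir = os.path.join(tmp, "store")
    os.mkdir(store_dir)

    moved_src = os.path.join(tmp, "a.bin")
    kept_src = os.path.join(tmp, "b.bin")
    for path in (moved_src, kept_src):
        with open(path, "wb") as handle:
            handle.write(b"payload")

    put_file(store_dir, "key1", moved_src)   # passes a filename
    with open(kept_src, "rb") as handle:
        put_file(store_dir, "key2", handle)  # passes an open file

    moved_src_exists = os.path.exists(moved_src)
    kept_src_exists = os.path.exists(kept_src)
    with open(os.path.join(store_dir, "key2"), "rb") as handle:
        stored = handle.read()
```

After this runs, the file passed by name is gone from its original location, while the file passed as an open object is still in place.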

Some backends support an efficient copy operation, which is provided by a mixin class:

class minimalkv._mixins.CopyMixin

Mixin to expose a copy operation supported by the backend.

copy(source: str, dest: str) -> str

Copy data at key source to key dest.

The destination is overwritten if it already exists.

Parameters

source: str

The source key of data to copy.

dest: str

The destination for the copy.

Raises

ValueError

If source is not a valid key.

ValueError

If dest is not a valid key.

Returns

key: str

The destination key.
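
Since only some backends offer copy(), calling code may want a fallback. The following is a minimal sketch under the assumption that a store without the mixin simply has no copy attribute; copy_key and the two stand-in classes are hypothetical and exist only so the example runs without minimalkv installed.

```python
def copy_key(store, source, dest):
    """Copy source to dest, preferring the backend's own copy operation."""
    if hasattr(store, "copy"):
        # The backend exposes a copy operation: let it do the work.
        return store.copy(source, dest)
    # Fallback: read the value and store it again, overwriting dest.
    return store.put(dest, store.get(source))


class PlainStore:
    """Hypothetical stand-in with only get/put, backed by a dict."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data[key]

    def put(self, key, data):
        self.data[key] = data
        return key


class CopyingStore(PlainStore):
    """Same stand-in, plus a copy method that counts its calls."""

    def __init__(self):
        super().__init__()
        self.copy_calls = 0

    def copy(self, source, dest):
        self.copy_calls += 1
        self.data[dest] = self.data[source]
        return dest


plain, copying = PlainStore(), CopyingStore()
for store in (plain, copying):
    store.put("src", b"value")
    copy_key(store, "src", "dst")
```

The fallback transfers the whole value through the client, so it is noticeably slower than a backend-side copy for large blobs.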

In addition to that, a mixin class is available for backends that provide a method to support URL generation:

class minimalkv._key_value_store.UrlKeyValueStore

Class is deprecated. Use the UrlMixin instead.

Deprecated since version 0.9.

Some backends support setting a time-to-live on keys for automatic expiration; this is represented by the TimeToLiveMixin.

minimalkv._constants.VALID_KEY_REGEXP = '^[\\\\`\\\\!"\\#\\$%\\&\'\\(\\)\\+,\\-\\.<=>\\?@\\[\\]\\^_\\{\\}\\~0-9a-zA-Z]+$'

This regular expression tests if a key is valid. Allowed are all alphanumeric characters, as well as !"`#$%&'()+,-.<=>?@[]^_{}~.

minimalkv._constants.VALID_KEY_RE = re.compile('^[\\\\`\\\\!"\\#\\$%\\&\'\\(\\)\\+,\\-\\.<=>\\?@\\[\\]\\^_\\{\\}\\~0-9a-zA-Z]+$')

A compiled version of VALID_KEY_REGEXP.
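
As a quick way to see which keys pass, the sketch below copies the pattern shown above and combines it with the 250-character limit stated for KeyValueStore keys. The helper is_valid_key and the constant MAX_KEY_LENGTH are names invented for this example; in the library itself, key checks go through _check_valid_key().

```python
import re

# Pattern copied from VALID_KEY_REGEXP as documented above.
VALID_KEY_REGEXP = '^[\\\\`\\\\!"\\#\\$%\\&\'\\(\\)\\+,\\-\\.<=>\\?@\\[\\]\\^_\\{\\}\\~0-9a-zA-Z]+$'
VALID_KEY_RE = re.compile(VALID_KEY_REGEXP)

# Length limit stated in the KeyValueStore description (assumed name).
MAX_KEY_LENGTH = 250


def is_valid_key(key):
    """Return True if key matches the pattern and the length limit."""
    return (
        isinstance(key, str)
        and len(key) <= MAX_KEY_LENGTH
        and VALID_KEY_RE.match(key) is not None
    )


results = {
    key: is_valid_key(key)
    for key in ("key1", "user@example.com", "a/b", "with space", "")
}
```

Slashes and spaces are rejected by this pattern, which is the restriction that the h-prefixed store URLs relax.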

Implementing a new backend

Subclassing KeyValueStore is the fastest way to implement a new backend. It suffices to override the _delete(), iter_keys(), _open() and _put_file() methods, as all the other methods have default implementations that call these.

After that, you can override any number of underscore-prefixed methods with more specialized implementations to gain speed improvements.
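
The division of labour between public methods and underscore-prefixed methods can be sketched as follows. So that the example runs without minimalkv installed, it uses SketchStore, a deliberately simplified stand-in base class written for this illustration; it is not the real KeyValueStore, and its default implementations are assumptions. With the actual library you would subclass KeyValueStore and override the same four methods.

```python
import io


class SketchStore:
    """Simplified stand-in for a KeyValueStore-like base class."""

    def _check_valid_key(self, key):
        if not isinstance(key, str) or not key:
            raise ValueError("invalid key: %r" % (key,))

    # Public methods validate the key, then delegate to protected methods.
    def get(self, key):
        self._check_valid_key(key)
        return self._open(key).read()

    def put(self, key, data):
        self._check_valid_key(key)
        return self._put_file(key, io.BytesIO(data))

    def delete(self, key):
        self._check_valid_key(key)
        return self._delete(key)

    def keys(self, prefix=""):
        return list(self.iter_keys(prefix))


class DictStore(SketchStore):
    """Toy backend: only the four required methods are implemented."""

    def __init__(self):
        self._data = {}

    def _delete(self, key):
        # Deleting a missing key is not an error.
        self._data.pop(key, None)

    def iter_keys(self, prefix=""):
        return (key for key in sorted(self._data) if key.startswith(prefix))

    def _open(self, key):
        # A missing key surfaces as KeyError.
        return io.BytesIO(self._data[key])

    def _put_file(self, key, file):
        self._data[key] = file.read()
        return key


store = DictStore()
store.put("key1", b"hello")
store.put("other", b"world")
```

Because get() and put() are built on _open() and _put_file() in this sketch, the toy backend gets them for free; a real backend could later override further underscore-prefixed methods where a faster path exists.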

Default implementation

Classes derived from KeyValueStore inherit a number of default implementations for the core API methods. Specifically, the delete(), get(), get_file(), keys(), open(), put() and put_file() methods each call the _check_valid_key() method if a key has been provided and then call one of the following protected methods:

KeyValueStore._check_valid_key(key: str) -> None

Check if a key is valid and raise a ValueError if it is not.

Always use this method to check whether a key is valid.

Parameters

key: str

The key to be checked.

Raises

ValueError

If the key is not valid.

KeyValueStore._delete(key: str)

Delete the data at key in store.

KeyValueStore._get(key: str) -> bytes

Read data at key in store.

Parameters

key: str

Key of value to be retrieved.

KeyValueStore._get_file(key: str, file: BinaryIO) -> str

Write data at key to file-like object file.

Parameters

key: str

Key of data to be written to file.

file: BinaryIO

File-like object with a write method to which the data is written.

KeyValueStore._get_filename(key: str, filename: str) -> str

Write data at key to file at filename.

Parameters

key: str

Key of data to be written to file at filename.

filename: str

Name of file to be written.

KeyValueStore._has_key(key: str) -> bool

Check the existence of key in store.

Parameters

key: str

Key to check the existence of.

KeyValueStore._open(key: str) -> BinaryIO

Open record at key.

Parameters

key: str

Key of record to open.

Returns

file: BinaryIO

Opened file.

KeyValueStore._put(key: str, data: bytes) -> str

Store bytestring data at key.

Parameters

key: str

Key under which data should be stored.

data: bytes

Data to be stored.

Returns

key: str

Key where data was stored.

KeyValueStore._put_file(key: str, file: BinaryIO) -> str

Store data from file-like object at key.

Parameters

key: str

Key at which to store contents of file.

file: BinaryIO

File-like object to store data from.

Returns

key: str

Key where data was stored.

KeyValueStore._put_filename(key: str, filename: str) -> str

Store data from file at filename at key.

Parameters

key: str

Key under which data should be stored.

filename: str

Filename of file to store.

Returns

key: str

Key where data was stored.

Atomicity

Every call to a method on a KeyValueStore results in a single operation on the underlying backend. No guarantees are made beyond that: if you check whether a key exists and then try to retrieve it, it may already have been deleted in between. Instead, retrieve the value and catch the exception.
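
The "retrieve and catch the exception" advice can be sketched like this. TinyStore is a hypothetical dict-backed stand-in whose get() raises KeyError for missing keys, as described for the core API; get_or_default is a helper name invented for the example.

```python
class TinyStore:
    """Hypothetical stand-in: get() raises KeyError for missing keys."""

    def __init__(self, data=None):
        self._data = dict(data or {})

    def __contains__(self, key):
        return key in self._data

    def get(self, key):
        return self._data[key]


def get_or_default(store, key, default=None):
    """Fetch in a single operation and handle a missing key afterwards."""
    try:
        return store.get(key)
    except KeyError:
        return default


# Racy alternative (avoid): another client may delete the key between
# the membership test and the read.
#     if key in store:
#         value = store.get(key)

store = TinyStore({"session": b"data"})
found = get_or_default(store, "session")
fallback = get_or_default(store, "expired", b"")
```

With this pattern there is only one call to the backend per lookup, so there is no window in which the key can disappear between a check and the read.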

Python 3

All of the examples are written in Python 2. However, Python 3 is fully supported and tested. When using minimalkv in a Python 3 environment, the only important thing to remember is that keys are always strings (str) and values are always bytes objects.
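
In practice this means text or structured data has to be encoded before it is stored and decoded after it is read. A minimal sketch, assuming UTF-8 encoded JSON as the serialization format (the helper names to_value and from_value are made up for this example):

```python
import json


def to_value(obj):
    """Serialize a JSON-compatible object to bytes for storing."""
    return json.dumps(obj).encode("utf-8")


def from_value(data):
    """Decode bytes read from a store back into a Python object."""
    return json.loads(data.decode("utf-8"))


record = {"user": "alice", "visits": 3}
value = to_value(record)       # bytes, suitable as the data argument of put()
restored = from_value(value)   # back to a dict after get()
```

Any other serialization works equally well, as long as what is handed to the store is a bytes object.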
