simple key-value storage api

minimalkv is an API for key-value stores of binary data. Due to its basic interface, it is easy to implement a large number of backends. minimalkv’s origins are in storing user-uploaded files on websites, but its low overhead and design should make it applicable to numerous other problems. One example is a session backend for the Flask framework.

Built upon the solid foundation are a few optional bells and whistles, such as automatic ID generation/hashing (in minimalkv.idgen). A number of backends are available, ranging from FilesystemStore to support for Amazon S3 and Google Storage through BotoStore.

A faster in-memory store suitable for session management and caching is supported through RedisStore.

Example

Here’s a simple example:

from minimalkv.fs import FilesystemStore

store = FilesystemStore('./data')

store.put(u'key1', b'hello')

# prints the stored value: "hello" on Python 2, b'hello' on Python 3
print(store.get(u'key1'))

# move the contents of a file to "key2" as efficiently as possible
store.put_file(u'key2', '/path/to/data')

Note that by changing the first two lines to:

from minimalkv.memory.redisstore import RedisStore
import redis

store = RedisStore(redis.StrictRedis())

you could use the code exactly the same way, this time storing data inside a Redis database.

Store factories (from configurations)

There are two possibilities to generate stores from configurations:

  1. Use a dictionary with configuration data (e.g. loaded from an ini file)

from minimalkv import get_store

params = {
    'account_name': 'test',
    'account_key': 'XXXsome_azure_account_keyXXX',
    'container': 'my-azure-container',
}
store = get_store('azure', **params)
store.put(u'key', b'value')
assert store.get(u'key') == b'value'

  2. Use a URL to specify the configuration

from minimalkv import get_store_from_url

store = get_store_from_url('azure://test:XXXsome_azure_account_keyXXX@my-azure-container')
store.put(u'key', b'value')
assert store.get(u'key') == b'value'
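
For the dictionary variant, the parameters can come from an ini file. The following is a minimal sketch using only the standard library; the section name `store`, the `type` option and the helper `load_store_config` are assumptions made for illustration and are not part of minimalkv.

```python
import configparser

# Hypothetical ini content; section and option names are illustrative only.
INI_TEXT = """
[store]
type = azure
account_name = test
account_key = XXXsome_azure_account_keyXXX
container = my-azure-container
"""


def load_store_config(ini_text, section="store"):
    """Return (store_type, params) read from one ini section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    params = dict(parser[section])
    # The store type is passed separately from the keyword parameters.
    store_type = params.pop("type")
    return store_type, params


store_type, params = load_store_config(INI_TEXT)
# With minimalkv installed, the result could then be handed to the factory:
# store = get_store(store_type, **params)
```

Keeping the type in the same section as the parameters means a single section fully describes one store.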

URL and store types:

  • In memory: memory:// and hmemory://.

  • Redis: redis://[[password@]host[:port]][/db] and hredis://[[password@]host[:port]][/db]

  • Filesystem: fs:// and hfs://, e.g., hfs:///home/user/data

  • Amazon S3: s3://access_key:secret_key@endpoint/bucket[?create_if_missing=true] and hs3://access_key:secret_key@endpoint/bucket[?create_if_missing=true]

  • Azure Blob Storage (azure:// and hazure://):
    • with storage account key: azure://account_name:account_key@container[?create_if_missing=true][?max_connections=2], e.g., azure://MYACCOUNT:passw0rd!@bucket_66?create_if_missing=true

    • with SAS token: azure://account_name:shared_access_signature@container?use_sas&create_if_missing=false[?max_connections=2&socket_timeout=(20,100)]

    • with SAS and additional parameters: azure://account_name:shared_access_signature@container?use_sas&create_if_missing=false[?max_connections=2&socket_timeout=(20,100)][?max_block_size=4*1024*1024&max_single_put_size=64*1024*1024]

  • Google Cloud Storage (gcs:// and hgcs://):
    • hgcs://<base64 encoded credentials JSON>@bucket_name[?create_if_missing=true&bucket_creation_location=EUROPE-WEST3]

      Get the encoded credentials as a string like so:

      from pathlib import Path
      import base64
      json_as_bytes = Path(<path_to_json>).read_bytes()
      json_b64 = base64.urlsafe_b64encode(json_as_bytes).decode()
      

Storage URLs starting with an h indicate extended allowed characters. This allows the use of slashes and spaces in blob names. URL options shown in [] are optional; remove the [] when writing the URL.
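
To see how the parts of such a URL line up, the sketch below splits one with the standard library. It only illustrates the URL layout described above; it is not how minimalkv itself parses URLs, and `describe_store_url` and `BASE_TYPES` are hypothetical names.

```python
from urllib.parse import parse_qs, urlsplit

# Base store types taken from the URL list above.
BASE_TYPES = {"memory", "redis", "fs", "s3", "azure", "gcs"}


def describe_store_url(url):
    """Split a store URL into its parts (hypothetical helper, not minimalkv API)."""
    parts = urlsplit(url)
    scheme = parts.scheme
    # A leading "h" on a known type marks the extended-character variant.
    extended = scheme.startswith("h") and scheme[1:] in BASE_TYPES
    return {
        "type": scheme[1:] if extended else scheme,
        "extended_keys": extended,
        "user": parts.username,
        "secret": parts.password,
        "host": parts.hostname,
        "path": parts.path,
        # Keep the last value if an option is repeated.
        "options": {k: v[-1] for k, v in parse_qs(parts.query).items()},
    }


info = describe_store_url(
    "hs3://access_key:secret_key@endpoint/bucket?create_if_missing=true"
)
```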

Why you should use minimalkv

no server dependencies

minimalkv depends only on Python and, if you want to use extra features, possibly a few libraries that are easily fetched from PyPI. You do not have to install or run any server software to use minimalkv (but you can at any later point).

specializes in (even large!) blobs

The fastest, most basic minimalkv backend implementation stores files on your hard drive and is just as fast as working with those files directly. This underlines the focus on storing big blobs without overhead or metadata. A typical use case is starting out small with local files and then migrating all your binary data to something like Amazon’s S3.

The core API

class minimalkv._key_value_store.KeyValueStore

Class to access a key-value store.

Supported keys are ASCII strings containing alphanumeric characters or symbols from minimalkv._constants.VALID_NON_NUM, with a length of at most 250. Values (or records) are stored as raw bytes.

__contains__(key: str) -> bool

Check if the store has an entry at key.

key: str

The key whose existence should be verified.

ValueError

If the key is not valid.

IOError

If there was an error accessing the store.

__iter__() -> Iterable[str]

Iterate over all keys in the store.

IOError

If there was an error accessing the store.

delete(key: str) -> Optional[str]

Delete data at key.

Does not raise an error if the key does not exist.

key: str

The key of data to be deleted.

ValueError

If the key is not valid.

IOError

If there was an error deleting.

get(key: str) -> bytes

Return data at key as a bytestring.

key: str

The key to be read.

data: bytes

Value associated with the key as a bytes object.

ValueError

If the key is not valid.

IOError

If the file could not be read.

KeyError

If the key was not found.

get_file(key: str, file: Union[str, IO]) -> str

Write data at key to file.

Like put_file(), this method allows backends to implement a specialized function if data needs to be written to disk or streamed.

If file is a string, contents of key are written to a newly created file with the filename file. Otherwise the data will be written using the write method of file.

key: str

The key to be read.

file: file-like or str

Output filename or file-like object with a write method.

ValueError

If the key is not valid.

IOError

If there was a problem reading or writing data.

KeyError

If the key was not found.
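
The filename-versus-file-object behaviour described for get_file() can be sketched as a small dispatch on the type of the argument. This is an illustration under that description only; `write_record` is a hypothetical helper, not a minimalkv function.

```python
import io
import os
import tempfile


def write_record(data, file):
    """Write bytes to a filename or to a file-like object (hypothetical helper)."""
    if isinstance(file, str):
        # A string is treated as the name of a file to create.
        with open(file, "wb") as handle:
            handle.write(data)
    else:
        # Anything else only needs a write() method.
        file.write(data)


# Target given as a file-like object.
buffer = io.BytesIO()
write_record(b"hello", buffer)

# Target given as a filename.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.bin")
    write_record(b"hello", path)
    with open(path, "rb") as handle:
        on_disk = handle.read()
```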

iter_keys(prefix: str = '') -> Iterator[str]

Iterate over all keys in the store starting with prefix.

prefix: str, optional, default = ''

Only iterate over keys starting with prefix. Iterate over all keys if empty.

IOError

If there was an error accessing the store.

keys(prefix: str = '') -> List[str]

List all keys in the store starting with prefix.

prefix: str, optional, default = ''

Only list keys starting with prefix. List all keys if empty.

IOError

If there was an error accessing the store.

open(key: str) -> IO

Open record at key.

key: str

Key to open.

file: file-like

Read-only file-like object for reading data at key.

ValueError

If the key is not valid.

IOError

If the file could not be read.

KeyError

If the key was not found.

put(key: str, data: bytes) -> str

Store bytestring data at key.

key: str

The key under which the data is to be stored.

data: bytes

Data to be stored at key, must be of type bytes.

str

The key under which data was stored.

ValueError

If the key is not valid.

IOError

If storing failed or the file could not be read.

put_file(key: str, file: Union[str, IO]) -> str

Store contents of file at key.

Store data from a file into key. file can be a string, which will be interpreted as a filename, or an object with a read() method.

If file is a filename, the file might be removed while storing to avoid unnecessary copies. To prevent this, pass the opened file instead.

key: str

Key where to store data in file.

file: file-like or str

A filename or a file-like object with a read method.

key: str

The key under which data was stored.

ValueError

If the key is not valid.

IOError

If there was a problem moving the file in.
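
The note about filenames being removed can be illustrated with a toy directory-backed store. `ToyDirStore` below is a stand-in written for this example and is not minimalkv's FilesystemStore; it assumes a backend that moves a file when given its name and copies when given an open file.

```python
import os
import shutil
import tempfile


class ToyDirStore:
    """Toy directory store (illustration only, not minimalkv's FilesystemStore)."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def put_file(self, key, file):
        target = os.path.join(self.root, key)
        if isinstance(file, str):
            # Filename: move instead of copy, so the source path disappears.
            shutil.move(file, target)
        else:
            # File-like object: copy its contents, leaving the source alone.
            with open(target, "wb") as out:
                shutil.copyfileobj(file, out)
        return key


with tempfile.TemporaryDirectory() as tmp:
    store = ToyDirStore(os.path.join(tmp, "store"))

    moved = os.path.join(tmp, "moved.bin")
    kept = os.path.join(tmp, "kept.bin")
    for path in (moved, kept):
        with open(path, "wb") as handle:
            handle.write(b"payload")

    store.put_file("key1", moved)        # passed as a filename
    with open(kept, "rb") as handle:
        store.put_file("key2", handle)   # passed as an open file

    moved_still_exists = os.path.exists(moved)
    kept_still_exists = os.path.exists(kept)
```

In this sketch only the file passed by name is gone afterwards, which is why passing an opened file is the safe choice when the source must survive.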

Some backends support an efficient copy operation, which is provided by a mixin class:

class minimalkv._mixins.CopyMixin

Mixin to expose a copy operation supported by the backend.

copy(source: str, dest: str) -> str

Copy data at key source to key dest.

The destination is overwritten if it already exists.

source: str

The source key of data to copy.

dest: str

The destination for the copy.

ValueError

If source is not a valid key.

ValueError

If dest is not a valid key.

key: str

The destination key.
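
For a backend without a native copy operation, a similar result can be approximated with get() and put(). The sketch below assumes only those two methods; `DictStore` and `copy_via_get_put` are hypothetical names, and unlike a backend-side copy this version transfers the data through the client and is not a single operation.

```python
class DictStore:
    """Dict-backed stand-in exposing only get() and put() (not a minimalkv class)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data[key]  # raises KeyError for a missing key

    def put(self, key, data):
        self._data[key] = data
        return key


def copy_via_get_put(store, source, dest):
    """Copy source to dest by reading and re-writing; return the destination key."""
    return store.put(dest, store.get(source))


store = DictStore()
store.put("a", b"1")
store.put("b", b"old")
copied_to = copy_via_get_put(store, "a", "b")  # overwrites the old value at "b"
```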

In addition to that, a mixin class is available for backends that provide a method to support URL generation:

class minimalkv._key_value_store.UrlKeyValueStore

Class is deprecated. Use the UrlMixin instead.

Deprecated since version 0.9.

Some backends support setting a time-to-live on keys for automatic expiration. This is represented by the TimeToLiveMixin.

minimalkv._constants.VALID_KEY_REGEXP = '^[\\\\`\\\\!"\\#\\$%\\&\'\\(\\)\\+,\\-\\.<=>\\?@\\[\\]\\^_\\{\\}\\~0-9a-zA-Z]+$'

This regular expression tests if a key is valid. Allowed are all alphanumeric characters, as well as !"`#$%&'()+,-.<=>?@[]^_{}~.

minimalkv._constants.VALID_KEY_RE = re.compile('^[\\\\`\\\\!"\\#\\$%\\&\'\\(\\)\\+,\\-\\.<=>\\?@\\[\\]\\^_\\{\\}\\~0-9a-zA-Z]+$')

A compiled version of VALID_KEY_REGEXP.
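
As a rough sketch, the regular expression can be combined with the 250-character limit described for KeyValueStore to test keys by hand. The pattern is copied from above; `is_valid_key` and `MAX_KEY_LENGTH` are hypothetical names, and whether the library performs the length check in exactly this way is an assumption.

```python
import re

# Pattern copied from minimalkv._constants.VALID_KEY_REGEXP as documented above.
VALID_KEY_REGEXP = '^[\\\\`\\\\!"\\#\\$%\\&\'\\(\\)\\+,\\-\\.<=>\\?@\\[\\]\\^_\\{\\}\\~0-9a-zA-Z]+$'
VALID_KEY_RE = re.compile(VALID_KEY_REGEXP)

# Assumed limit, taken from the KeyValueStore description.
MAX_KEY_LENGTH = 250


def is_valid_key(key):
    """Return True if key is a string that matches the pattern and length limit."""
    return (
        isinstance(key, str)
        and len(key) <= MAX_KEY_LENGTH
        and VALID_KEY_RE.match(key) is not None
    )


checks = {
    "key1": is_valid_key("key1"),            # plain alphanumeric key
    "a/b": is_valid_key("a/b"),              # slash is not in the allowed set
    "has space": is_valid_key("has space"),  # neither is a space
}
```

Slashes and spaces are rejected here, which matches the note that only the h-prefixed store variants extend the allowed characters.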

Implementing a new backend

Subclassing KeyValueStore is the fastest way to implement a new backend. It suffices to override the _delete(), iter_keys(), _open() and _put_file() methods, as all the other methods have default implementations that call these.

After that, you can override any number of underscore-prefixed methods with more specialized implementations to gain speed improvements.
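
The pattern can be sketched without minimalkv installed. `SketchStore` below is a simplified stand-in for the base class, written only to show how public methods delegate to the four overridable ones; its key check and method bodies are assumptions, not the library's code. `DictBackend` is a hypothetical backend that keeps everything in a dict.

```python
import io


class SketchStore:
    """Simplified stand-in for a KeyValueStore-like base (not minimalkv's code)."""

    # Public methods validate the key, then delegate to the hooks below.
    def put(self, key, data):
        self._check_valid_key(key)
        if not isinstance(data, bytes):
            raise IOError("data must be of type bytes")
        return self._put_file(key, io.BytesIO(data))

    def get(self, key):
        self._check_valid_key(key)
        return self._open(key).read()

    def delete(self, key):
        self._check_valid_key(key)
        return self._delete(key)

    def keys(self, prefix=""):
        return list(self.iter_keys(prefix))

    def _check_valid_key(self, key):
        # Deliberately loose check; the real rules are stricter.
        if not isinstance(key, str) or not key:
            raise ValueError("%r is not a valid key" % (key,))

    # The four methods a backend has to provide.
    def _delete(self, key):
        raise NotImplementedError

    def iter_keys(self, prefix=""):
        raise NotImplementedError

    def _open(self, key):
        raise NotImplementedError

    def _put_file(self, key, file):
        raise NotImplementedError


class DictBackend(SketchStore):
    """Hypothetical backend that keeps all records in a dict."""

    def __init__(self):
        self._data = {}

    def _delete(self, key):
        self._data.pop(key, None)  # a missing key is not an error

    def iter_keys(self, prefix=""):
        return (k for k in list(self._data) if k.startswith(prefix))

    def _open(self, key):
        return io.BytesIO(self._data[key])  # KeyError if the key is missing

    def _put_file(self, key, file):
        self._data[key] = file.read()
        return key


store = DictBackend()
stored_key = store.put("a1", b"x")
value = store.get("a1")
```

With the four hooks in place, put(), get(), delete() and keys() work without further code in the subclass.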

Default implementation

Classes derived from KeyValueStore inherit a number of default implementations for the core API methods. Specifically, the delete(), get(), get_file(), keys(), open(), put() and put_file() methods each call the _check_valid_key() method if a key has been provided and then call one of the following protected methods:

KeyValueStore._check_valid_key(key: str) -> None

Check if a key is valid and raise a ValueError if it is not.

Always use this method to check whether a key is valid.

key: str

The key to be checked.

ValueError

If the key is not valid.

KeyValueStore._delete(key: str)

Delete the data at key in store.

KeyValueStore._get(key: str) -> bytes

Read data at key in store.

key: str

Key of value to be retrieved.

KeyValueStore._get_file(key: str, file: IO) -> str

Write data at key to file-like object file.

key: str

Key of data to be written to file.

file: file-like

File-like object with a write method to be written.

KeyValueStore._get_filename(key: str, filename: str) -> str

Write data at key to file at filename.

key: str

Key of data to be written to file at filename.

filename: str

Name of file to be written.

KeyValueStore._has_key(key: str) -> bool

Check the existence of key in store.

key: str

Key to check the existence of.

KeyValueStore._open(key: str) -> IO

Open record at key.

key: str

Key of record to open.

file: file-like

Opened file.

KeyValueStore._put(key: str, data: bytes) -> str

Store bytestring data at key.

key: str

Key under which data should be stored.

data: bytes

Data to be stored.

key: str

Key where data was stored.

KeyValueStore._put_file(key: str, file: IO) -> str

Store data from file-like object at key.

key: str

Key at which to store contents of file.

file: file-like

File-like object to store data from.

key: str

Key where data was stored.

KeyValueStore._put_filename(key: str, filename: str) -> str

Store data from file at filename at key.

key: str

Key under which data should be stored.

filename: str

Filename of file to store.

key: str

Key where data was stored.

Atomicity

Every call to a method on a KeyValueStore results in a single operation on the underlying backend. No guarantees are made beyond that: if you check whether a key exists and then try to retrieve it, it may already have been deleted in between. Instead, retrieve it and catch the exception.
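
The retrieve-and-catch approach can be sketched as a small helper. `get_or_default` is a hypothetical name, and `DictStore` is a dict-backed stand-in used only so the example runs on its own; the helper relies solely on get() raising KeyError for a missing key, as documented above.

```python
class DictStore:
    """Dict-backed stand-in for a store (illustration only)."""

    def __init__(self):
        self._data = {}

    def put(self, key, data):
        self._data[key] = data
        return key

    def get(self, key):
        return self._data[key]  # raises KeyError for a missing key


def get_or_default(store, key, default=None):
    """Fetch a value in one call instead of checking for the key first."""
    try:
        return store.get(key)
    except KeyError:
        # The key was missing, or was deleted before the read happened.
        return default


store = DictStore()
store.put("present", b"data")
found = get_or_default(store, "present")
missing = get_or_default(store, "absent", default=b"")
```

Because only one store call is involved, there is no window between a check and the read in which another client could delete the key.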

Python 3

The examples were originally written for Python 2. However, Python 3 is fully supported and tested. When using minimalkv in a Python 3 environment, the only important thing to remember is that keys are always strings and values are always bytes objects.
