Storage

We use RocksDB to store the data. Create the storage object first; you can then create embedding objects.

When creating the storage object, pass data-dir and ttl. data-dir is the directory where the data is saved. ttl is the time to live in seconds, a feature provided by RocksDB.

condition

Conditions are wrapped in a Parameters object.

expire_days

If the last update time of a key is earlier than expire_days days ago (that is, the key has not been updated within the last expire_days days), the key is ignored.

Configuration parameters:

expire_days: int type

group

If group is set (0 <= group < 2^16), only keys that belong to that group are dumped.

Configuration parameters:

group: int type

min_count

If a key has been updated fewer than min_count times, the key is ignored.

Configuration parameters:

min_count: int type
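
To make the three conditions concrete, here is a minimal sketch of how they might combine into one filter. It is not damo's implementation: the `KeyRecord` fields and the `should_dump` helper are hypothetical, and it assumes that a condition left unset is skipped and that all set conditions must hold for a key to be dumped.

```python
from dataclasses import dataclass
from typing import Optional

SECONDS_PER_DAY = 86400


@dataclass
class KeyRecord:
    """Hypothetical per-key metadata; not damo's actual record layout."""
    key: int
    group: int
    update_count: int
    last_update_ts: float  # Unix timestamp in seconds


def should_dump(rec: KeyRecord, now: float,
                expire_days: Optional[int] = None,
                min_count: Optional[int] = None,
                group: Optional[int] = None) -> bool:
    """Return True if the key passes every condition that is set (assumed AND)."""
    # expire_days: ignore keys not updated within the last expire_days days.
    if expire_days is not None and rec.last_update_ts < now - expire_days * SECONDS_PER_DAY:
        return False
    # min_count: ignore keys updated fewer than min_count times.
    if min_count is not None and rec.update_count < min_count:
        return False
    # group: keep only keys that belong to the requested group.
    if group is not None and rec.group != group:
        return False
    return True
```

Under these assumptions, a key in group 0 that was updated 5 times and last touched 10 days ago passes expire_days=100, min_count=3, group=0, while the same key last touched 200 days ago does not.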

dump

When model training finishes, you can dump the weights of the keys that satisfy the given conditions.

file format

The first part stores the id, dim, and key count of each group. The second part stores every key together with its group and weights.

first part

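| type    | size    | length | description                |
|---------|---------|--------|----------------------------|
| int32   | 4 bytes | 1      | size (number of groups)    |
| int32   | 4 bytes | size   | group ids                  |
| int32   | 4 bytes | size   | group dims                 |
| int64_t | 8 bytes | size   | key count of each group    |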

second part

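| type    | size    | length            | description      |
|---------|---------|-------------------|------------------|
| int64_t | 8 bytes | 1                 | key value        |
| int32   | 4 bytes | 1                 | group of the key |
| float   | 4 bytes | dim of this group | weight of the key |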
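
The layout above can be read back with Python's standard `struct` module. The following is a sketch rather than an official reader: `read_header` and `read_records` are hypothetical helper names, and it assumes little-endian byte order, no padding between fields, and that the key records follow the header immediately.

```python
import struct
from typing import BinaryIO, Dict, Iterator, List, Tuple


def read_header(f: BinaryIO) -> Dict[int, Tuple[int, int]]:
    """Read the first part and return {group_id: (dim, key_count)}."""
    (size,) = struct.unpack("<i", f.read(4))                      # number of groups
    group_ids = struct.unpack(f"<{size}i", f.read(4 * size))      # int32 x size
    dims = struct.unpack(f"<{size}i", f.read(4 * size))           # int32 x size
    counts = struct.unpack(f"<{size}q", f.read(8 * size))         # int64 x size
    return {g: (d, c) for g, d, c in zip(group_ids, dims, counts)}


def read_records(f: BinaryIO,
                 groups: Dict[int, Tuple[int, int]]) -> Iterator[Tuple[int, int, List[float]]]:
    """Yield (key, group, weights) for every record in the second part."""
    total = sum(count for _, count in groups.values())
    for _ in range(total):
        key, group = struct.unpack("<qi", f.read(12))             # int64 key + int32 group
        dim = groups[group][0]                                    # weight length depends on group
        weights = struct.unpack(f"<{dim}f", f.read(4 * dim))      # float32 x dim
        yield key, group, list(weights)
```

A typical use would be to open the dumped file in binary mode, call `read_header` once, and then iterate over `read_records` with the returned mapping. If the file was written on a big-endian machine or with padding, the format strings would need adjusting.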

Example

import damo

# first param: data directory
# second param: TTL in seconds (here, 100 days)
storage = damo.PyStorage("/tmp/data_dir", 86400*100)


cond = damo.Parameters()
cond.insert("expire_days", 100)
cond.insert("min_count", 3)
cond.insert("group", 0)

storage.dump("/tmp/weight.dat", cond)