Research computing often generates temporary data, either information collected from an external data source or the results of intermediate computations. It is also common in research for computations to crash or fail. Because temporary data can be expensive to recompute, it is often desirable to pick up where a computation left off. To do this, we need data structures that can easily reload their contents so that the last viable state of the program can be reconstructed. This is what the autocache library aims to do.
Autocache provides wrappers around common Python data structures such as lists, sets, and dictionaries. These wrappers continuously store their contents to disk with high performance. As a result, the state of these data structures can be easily reconstructed without any extra programmer effort.
If you implement long-running computations or data acquisition routines, then autocache may be a useful tool for you.
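To make the idea concrete, here is a minimal sketch of a list that mirrors itself to disk and reloads on construction. It is not autocache's implementation or API: the class name JournaledList, the JSON-lines file format, and the resume_demo helper are all assumptions made for illustration, using only the standard library.

```python
import json
import os
import tempfile


class JournaledList:
    """Hypothetical append-only list mirrored to a JSON-lines file.

    Illustration only; this is NOT the autocache API.
    """

    def __init__(self, path):
        self.path = path
        self._items = []
        # Reload whatever a previous run managed to write.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as fh:
                for line in fh:
                    line = line.strip()
                    if line:
                        self._items.append(json.loads(line))

    def append(self, item):
        # Write to disk before updating memory, so the file never lags behind.
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(item) + "\n")
            fh.flush()
            os.fsync(fh.fileno())
        self._items.append(item)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __iter__(self):
        return iter(self._items)


def resume_demo(path, total=5, crash_after=None):
    """Square the numbers 0..total-1, skipping work already on disk."""
    results = JournaledList(path)
    for n in range(len(results), total):
        if crash_after is not None and n == crash_after:
            raise RuntimeError("simulated crash")
        results.append(n * n)
    return list(results)
```

Calling resume_demo with crash_after set, and then calling it again on the same path without it, resumes from the first item that was not yet written rather than starting over. A real library would also need to handle deletions, in-place updates, and partially written records, which this sketch ignores.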
Autocache is an open-source library under active development and is hosted on Google Code.
At present, please download the cutting-edge version (straight off the trunk) using SVN:
svn checkout http://py-autocache.googlecode.com/svn/trunk py-autocache
Autocache is built and installed using distutils. From within the src directory, run the command:
python setup.py install
Once the library has been installed, you can run the provided tests using the command:
python -m autocache.test
If everything is installed correctly, all tests should pass.
We’re still working on getting official documentation up. In principle, however, the library is extremely easy to use, since the main classes (such as alist) work exactly like their native Python counterparts. For some examples, take a look at the test code in the autocache.tests package.
If you have feedback, suggestions, bug fixes, or new features to contribute, we would love to hear from you! Please email us!
Frequently Asked Questions
How is this project different from Shelve?
Shelve is a module included in the standard Python distribution that provides a dictionary-like object for storing arbitrary objects to disk. While shelve has many valuable uses, it is not designed for high-performance mirroring of various data structures (lists, sets, and dictionaries) on disk so that crashes and partial computations can be easily recovered from.
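One concrete way to see the difference, using only the standard library: with shelve's default settings, mutating a list stored in a shelf changes a temporary copy, and the change reaches disk only after the value is explicitly reassigned to its key. The sketch below demonstrates that behavior; the function name shelve_roundtrip and the key "results" are made up for this example.

```python
import os
import shelve
import tempfile


def shelve_roundtrip(path):
    """Show that in-place edits to shelved values are not saved by default."""
    # Store a list, then try to extend it in place.
    with shelve.open(path) as db:
        db["results"] = [1, 2]
        # db["results"] returns a fresh copy, so this append is lost
        # (the default is writeback=False).
        db["results"].append(3)

    with shelve.open(path) as db:
        after_inplace = db["results"]

    # The change persists only when the value is reassigned to its key.
    with shelve.open(path) as db:
        items = db["results"]
        items.append(3)
        db["results"] = items

    with shelve.open(path) as db:
        after_reassign = db["results"]

    return after_inplace, after_reassign
```

Opening the shelf with writeback=True avoids the explicit reassignment, but shelve then caches every accessed entry in memory and writes them all back on sync or close, which can be slow for large structures.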