Appistry Storage is software that creates a scalable, reliable and highly cost-effective file storage system with no single points of failure, using only commodity servers and networking.
The system is composed of multiple computers at one or more data centers. Each computer in the system is called a worker. Workers join to create a single instance of the storage system. Each worker contains at least one processor (CPU), storage device and network adapter. The workers may have heterogeneous hardware profiles. Each worker may belong to only one cloud storage system at a time. The system is fully distributed and each worker acts as a complete cloud storage system unto itself, but is aware of other members of the same instance and shares storage responsibilities accordingly. The cloud system will coordinate all the workers and logically join their attached storage to expose a single, logical data store to the API. By default, all storage workers in the same fabric will auto-join the storage system for that fabric.
Moving Work to the Data
The processing elements of the workers in the system are expected to be used for computational efforts. Most processing systems typically move large data files over expensive networking and interconnects to the processing elements. A storage system that combines processing elements with the storage device allows applications to move the work to the data, reducing costs and improving performance.
Multiple Data Centers
Upon administrative request, workers in the cloud system can be logically segmented into territories. Territories provide an organizational unit to distribute files and data across the network. It is the expectation that territories will be segmented by network segment or data center to provide data reliability.
All client access to the cloud system is performed via RESTful HTTP API calls, including file additions, modifications, deletions and administrative commands. Every worker in the cloud system is capable of handling any HTTP call.
The stor utility is a command-line client that enables you to form storage worker systems and territories, as well as list, get, put and remove storage files. It connects to the fabric using the Storage API.

Architecture Overview
The Storage system consists of a single executable service (fabric_storage), which is installed and managed with all worker services (see Platform Overview and Architecture for more information).

Building a Storage System
Building a storage system in which all storage nodes communicate on the same fabric address is a simple process. When a worker joins a fabric, the fabric keeper service automatically directs its storage service to join the storage system of any active workers on that fabric. After installation, view the active workers using the Management Console.
If you are building a standalone storage system, consult Standalone Storage Installation for more information.
Workers can be viewed and configured with the following operations:
- Management Console
- Storage API
| Resource | Description |
| /workers/ | GET a JSON array listing all of the workers in the storage system. |
| /workers/[IP] | GET, from the connected worker's storage system, a JSON object with information on a specified worker; PUT (add) or DELETE (remove) an IP to/from the connected worker's storage system. |
| /workers/[IP]/config | GET a JSON object detailing the configuration settings for the worker. |
| /workers/[IP]/controller | POST a 'start' or 'stop' command (in the body) to a specified worker in the storage system. |
| /workers/[IP]/disk-usage | GET a JSON object reporting the disk usage for all workers in the system as currently known by a specified worker in the storage system. |
| /workers/[IP]/filestats | GET a JSON object reporting the number of files tracked by a specified worker in the storage system. |
| /workers/[IP]/log | POST the message body to storage's logging system on the specified worker. |
| /workers/[IP]/state | GET a JSON object with the current state of a specified worker in the storage system. |
| /workers/[IP]/territory | GET (from the storage system) a JSON object with the territory of the specified worker, or PUT (into the storage system) body text with a new territory for the specified worker. |
- stor
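As a rough illustration of the worker resources listed above, the following Python sketch builds requests for a few of them without sending anything. The base address `http://10.0.0.5:8080` and the helper name `worker_request` are assumptions made for this example; substitute the address and port of any worker in your own system.

```python
import urllib.request

# Hypothetical address of any storage worker; host and port are assumptions.
BASE_URL = "http://10.0.0.5:8080"


def worker_request(ip=None, resource=None, method="GET", body=None):
    """Build (but do not send) a request for a /workers/ resource."""
    path = "/workers/"
    if ip is not None:
        path += ip
        if resource is not None:
            path += "/" + resource
    data = body.encode("utf-8") if body is not None else None
    return urllib.request.Request(BASE_URL + path, data=data, method=method)


# GET the list of all workers.
list_workers = worker_request()
# GET the state of one worker.
worker_state = worker_request("10.0.0.7", "state")
# POST a 'stop' command in the body to one worker's controller.
stop_worker = worker_request("10.0.0.7", "controller", method="POST", body="stop")

# To send a request against a live system:
# import json
# with urllib.request.urlopen(list_workers) as resp:
#     workers = json.load(resp)
```

Because every worker can handle any HTTP call, the base address may point at whichever worker is most convenient to reach.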
Setting up territories can be accomplished by the following means:
- Management Console
- Storage API
| Resource | Description |
| /territories | GET, from the connected storage system, a JSON array of existing territories. |
| /territories/[territoryID] | GET, from the storage system, a JSON array of the given territory's workers. |
| /workers/[IP]/territory | GET (from the storage system) a JSON object with the territory of the specified worker or PUT (into the storage system) body text with a new territory for the specified worker. |
- stor only allows you to view territories
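The territory resources can be exercised in the same way. The sketch below prepares a PUT that assigns a worker to a territory and a GET that lists a territory's workers. The base address, the territory name `dc-east`, and the helper names are hypothetical; only the resource paths and HTTP methods come from the table above.

```python
import urllib.request

# Hypothetical address of any storage worker; host and port are assumptions.
BASE_URL = "http://10.0.0.5:8080"


def set_territory_request(worker_ip, territory):
    """Build a PUT whose body text is the new territory for the worker."""
    url = f"{BASE_URL}/workers/{worker_ip}/territory"
    return urllib.request.Request(url, data=territory.encode("utf-8"), method="PUT")


def territory_members_request(territory_id):
    """Build a GET for the JSON array of workers in a territory."""
    url = f"{BASE_URL}/territories/{territory_id}"
    return urllib.request.Request(url, method="GET")


assign = set_territory_request("10.0.0.7", "dc-east")
members = territory_members_request("dc-east")
# Send with urllib.request.urlopen(assign) against a live system.
```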
Filesystem Mechanics
Storage is a distributed key-value store; it does not track directories as their own entities. The directories that the storage system physically creates on the drive are incidental to the filepath of a file given in the name of a file written to storage. Storage doesn't have any directory operations except filelist. A directory comes into existence when you put a file in it and goes away when all the files in it are gone.
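The idea that directories exist only as a side effect of file names can be illustrated with a small in-memory model. This is a sketch of the concept only, not the storage service's implementation; the class `FlatStore` and its methods are hypothetical, and paths are assumed to have no leading slash.

```python
class FlatStore:
    """Toy key-value store in which directories are implied by file paths."""

    def __init__(self):
        self._files = {}  # full path -> file contents

    def put(self, path, data):
        self._files[path] = data

    def delete(self, path):
        del self._files[path]

    def filelist(self, directory=""):
        """List the immediate entries under a directory prefix."""
        prefix = directory.strip("/")
        prefix = prefix + "/" if prefix else ""
        entries = set()
        for path in self._files:
            if path.startswith(prefix):
                head, sep, _ = path[len(prefix):].partition("/")
                # A trailing slash marks an implied sub-directory.
                entries.add(head + "/" if sep else head)
        return sorted(entries)


store = FlatStore()
store.put("logs/2020/a.txt", b"x")
listing_with_file = store.filelist("logs")  # the '2020/' directory is implied
store.delete("logs/2020/a.txt")
listing_after_delete = store.filelist()     # no files, so no directories
```

There is no call to create or remove a directory: the listing is derived entirely from the keys that are currently present.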
The structure of files in the storage system is limited by the file system on the mounted drive. For example, using ext3, there is a limit of 31998 sub-directories per one directory, stemming from its limit of 32000 links per inode. Use your file system's limitations as a rough guide for the limitations of storage, even though the number of nodes in the system and the replication setting (N) for the files will both play a factor in the number of files that the system will allow you to write.
The diagram below depicts the file access methods for adding and reading files in the cloud storage system.
While writing and reading files in the system, the user is able to specify three parameters to the operation:
- R - The minimum number of workers queried in a successful read operation.
- W - The minimum number of workers updated in a successful write operation.
- N - The number of copies of a file to be held.
During a file modification operation:
- If N < W + R, then file consistency is guaranteed.
- If N >= W + R, then the system is running in eventual consistency mode. The user will always be able to write to an available worker; any version inconsistencies that are created will be detected by the system.
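The two cases above can be expressed as a small check. The function name `consistency_mode` and its return strings are hypothetical; the rule itself is the one stated in the list. The intuition, assuming all copies of a file live on the same set of N workers, is that when W + R exceeds N, any R workers that are read must include at least one of the W workers that received the latest write.

```python
def consistency_mode(n, w, r):
    """Classify an (N, W, R) setting using the rule N < W + R."""
    if min(n, w, r) < 1:
        raise ValueError("N, W and R must all be at least 1")
    return "consistent" if n < w + r else "eventual"


strict = consistency_mode(n=3, w=2, r=2)    # 3 < 4
relaxed = consistency_mode(n=3, w=1, r=1)   # 3 >= 2
boundary = consistency_mode(n=3, w=1, r=2)  # 3 >= 3
```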
During a write operation, the system will respond with completion as soon as the file has been written W times (i.e. to W workers). The receiving worker will query the distributed hash table to determine where to write the file (it may write locally, it may not), write the files to the designated workers in parallel and then write to the hash table that the operation has been completed.
When writing files:
- The file will not be successfully written unless W copies can be written.
- Files will be written round-robin to as many workers as possible (<=N).
- When the storage system has multiple territories, files will be written round-robin to as many territories as possible, as evenly as possible.
- A file's RWN cannot be changed once the file is in the system, unless the file is purged and rewritten.
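The write rules above can be sketched as a simple placement routine. This is an illustration under assumptions, not the algorithm the storage service uses (which consults its distributed hash table): the function names and the alphabetical ordering of territories are invented for the example.

```python
from itertools import zip_longest


def choose_targets(workers_by_territory, n):
    """Pick up to n workers, taking one worker per territory in each round."""
    columns = [workers_by_territory[t] for t in sorted(workers_by_territory)]
    targets = []
    for one_round in zip_longest(*columns):
        for worker in one_round:
            if worker is not None and len(targets) < n:
                targets.append(worker)
    return targets


def write_succeeds(targets, w):
    """A write succeeds only if at least W copies can be written."""
    return len(targets) >= w


workers = {"east": ["10.0.0.1", "10.0.0.2"], "west": ["10.1.0.1"]}
targets = choose_targets(workers, n=3)
ok = write_succeeds(targets, w=2)
```

With two territories and N = 3, the first two copies land in different territories before a second copy is placed in either one.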
During a read operation, the system will query the hash table for where the files are written, validate R entries and deliver one file to the client. The R value may also be specified on an individual read, in which case it applies to that read only.

The 'fabric_storage' service communicates directly with the OS's file system API, with the networking stack, and with the other Storage workers that it has been made aware of. Communication between the client and the fabric_storage service is RESTful, using HTTP and JSON. Communication between Storage workers uses a mix of TCP (primarily), HTTP and UDP.
| Interface | Description | Worker to Worker Communication | Client to Worker Communication | Estimated Bandwidth |
| HTTP | All management operations are accomplished using RESTful resources. | | | <1KB plus message body |
| TCP | Files exchanged between workers. | | | <1KB plus file payload |
| UDP | Hash table synchronization. | | | 137 bytes per machine per second |
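For capacity planning, the UDP figure in the table can be turned into a rough system-wide estimate. The sketch below assumes the 137 bytes per machine per second simply scales linearly with the number of workers; that interpretation, and the function name, are assumptions rather than documented behavior.

```python
UDP_BYTES_PER_WORKER_PER_SECOND = 137  # figure from the table above
SECONDS_PER_DAY = 24 * 60 * 60


def hash_sync_bytes_per_second(worker_count):
    """Estimate aggregate hash table synchronization traffic."""
    return worker_count * UDP_BYTES_PER_WORKER_PER_SECOND


per_second = hash_sync_bytes_per_second(100)  # 13,700 bytes per second
per_day = per_second * SECONDS_PER_DAY        # 1,183,680,000 bytes per day
```

Under this assumption, a 100-worker system would generate a little under 1.2 GB of synchronization traffic per day, which is small next to the file payloads carried over TCP.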
Spaces
In Appistry Storage, files are written into logical containers called "spaces". By default, the storage system has a "default" space that is owned by "nobody" and governed by an access control list. New spaces may be created by administrators and will be owned, by default, by the creator. Ownership of a space may be assigned to any user in the system.
"default" Space
Space Operations
Access Control Lists
Access to storage entities is governed by ownership (implicit "read" and "write" permission) and access control lists. The access control list will "allow" a user or group "read" and "write" access to a given entity. There is currently no "disallow". Currently, only spaces are assigned ACLs: file access is inherited from the user's or group's permission on the space.

Users
Individual users can be assigned read and write permission to a given space.
Groups
Groups can be assigned read and write permission to a given space. If a user is added to the group, the user will inherit the group's permissions.
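The access rules described above (implicit owner access, allow-only ACL entries for users and groups, and inheritance through group membership) can be modeled in a few lines. This is a conceptual sketch only; the `Space` class, its field names, and `is_allowed` are hypothetical and do not reflect the product's internal data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Space:
    name: str
    owner: str
    # ACL entries only "allow"; there is no "disallow".
    user_acl: dict = field(default_factory=dict)   # user -> set of permissions
    group_acl: dict = field(default_factory=dict)  # group -> set of permissions


def is_allowed(space, user, user_groups, permission):
    """Return True if the user may 'read' or 'write' in the space."""
    if user == space.owner:
        return True  # ownership implies read and write
    if permission in space.user_acl.get(user, set()):
        return True
    # A user inherits the permissions of every group they belong to.
    return any(permission in space.group_acl.get(g, set()) for g in user_groups)


reports = Space(
    name="reports",
    owner="alice",
    user_acl={"bob": {"read"}},
    group_acl={"storage-user": {"read", "write"}},
)
bob_can_write = is_allowed(reports, "bob", [], "write")
carol_can_write = is_allowed(reports, "carol", ["storage-user"], "write")
```

Because file access is inherited from the space, a check like this on the space is the only one needed before reading or writing a file in it.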
"default" Space
If a fabric is upgraded from v4.3.X.X, where there is a group.cfg with a "storage-user" group, the "storage-user" group will be added to the access control list for the "default" space with read and write permission. Otherwise, a similar group can be easily added to provide users with easy access to the "default" space or any other space in the system.
Editing an Access Control List
An access control list can be edited during space creation or at any time afterward.
File Operations
The following methods are available to interact with the storage system.
- stor
- Storage API
| Resource | Description |
| /spaces/[spaceID]/filelist/[filepath] or /filelist/[filepath] | GET, from a specified space in the storage system, a JSON array (or object) listing the files/directories in the given path. |
| /spaces/[spaceID]/files/[filename] or /files/[filename] | GET (read), PUT (write, overwrite, copy), MOVE (rename), or DELETE a specified file in a specified space of the storage system. |
| /spaces/[spaceID]/location/[filename] or /location/[filename] | GET, from a specified space, a JSON array of IPs:ports of the most recent version (if there are old copies present in the system, those locations do not show up) of the file at the given path. |
| /spaces/[spaceID]/metadata/[filename] or /metadata/[filename] | GET, from a specified space, a JSON array of everywhere the file is found (IP addresses) along with its metadata. |
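The file resources follow one pattern: an optional /spaces/[spaceID] prefix, a resource kind, and the file path. The sketch below builds such URLs and prepares a PUT without sending it. The base address and helper names are assumptions; how R, W and N are passed on a request is not shown here because it is not covered in this section.

```python
import urllib.request
from urllib.parse import quote

# Hypothetical address of any storage worker; host and port are assumptions.
BASE_URL = "http://10.0.0.5:8080"
KINDS = {"files", "filelist", "location", "metadata"}


def file_url(kind, path, space=None):
    """Build the URL for a file resource, with or without a space prefix."""
    if kind not in KINDS:
        raise ValueError(f"unknown resource kind: {kind}")
    prefix = f"/spaces/{quote(space, safe='')}" if space else ""
    # quote() keeps '/' so the file path structure is preserved.
    return f"{BASE_URL}{prefix}/{kind}/{quote(path)}"


def put_file_request(path, data, space=None):
    """Build (but do not send) a PUT that writes bytes to a file."""
    return urllib.request.Request(file_url("files", path, space), data=data, method="PUT")


upload = put_file_request("logs/a b.txt", b"hello", space="reports")
where = file_url("location", "logs/a b.txt")
# Send with urllib.request.urlopen(upload) against a live system.
```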