Swift, OpenStack Swift
OpenStack Swift is an open-source object storage system designed, along with the rest of OpenStack, with the cloud in mind. Its functionality (and likely some of its design) is similar to Amazon S3's. It supports storing and retrieving objects represented as binary blobs of data of arbitrary size. Object data is replicated to multiple machines in a cluster for redundancy, and a consistency service ensures that the replicas are kept in sync.
With OpenStack Swift, users (typically other cloud services that build upon it) can store and retrieve object data in a 2-level file-system-like storage space. Each user is associated with a storage account that provides an abstraction similar to a user's home folder on a Unix system (e.g., two different accounts can use the same names for objects/containers). Each storage account also has metadata associated with it, such as the number of objects and containers currently used by the account, the total size of the objects stored in the account, and quota information (limits on the number of objects, containers, and storage the account may use).
Each storage account can contain one or more object containers, which are the equivalent of folders for a normal file system and used for grouping the objects together. Each container is identified by a unique name within the storage account. The containers contain one or more objects. Within the container, each object is identified by a unique name that is used to retrieve and store/update the data associated with it.
An overview of the three abstractions is presented in Fig. 1.
Fig. 1. Swift object storage abstractions
Swift provides a RESTful HTTP API for managing containers and objects. All requests that are sent to Swift are composed of an HTTP verb (e.g., GET, PUT, DELETE) indicating the action to take on the object or container, authentication information, storage URL, and any data or metadata to be written.
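As a sketch of what composing such a request looks like, the helper below assembles the pieces described above: an HTTP verb, a storage URL, and an auth-token header. The cluster hostname `swift.example.com` and the token value are hypothetical placeholders; no request is actually sent.

```python
# Sketch: assemble the components of a Swift API request (verb,
# storage URL, authentication header, optional body). The endpoint
# and token below are made-up placeholders, not a real cluster.

def build_swift_request(verb, account, container=None, obj=None,
                        token="AUTH_tk_example", data=None):
    """Compose a Swift request: verb, storage URL, auth header, body."""
    url = "https://swift.example.com/v1/" + account
    if container:
        url += "/" + container
    if obj:
        url += "/" + obj
    return {
        "method": verb,              # e.g. GET, PUT, DELETE
        "url": url,                  # cluster location + storage location
        "headers": {"X-Auth-Token": token},  # authentication information
        "body": data,                # data/metadata to be written, if any
    }

# A PUT that would store an object, and a GET that would read it back:
put_req = build_swift_request("PUT", "myaccount", "mycontainer",
                              "myobject", data=b"hello")
get_req = build_swift_request("GET", "myaccount", "mycontainer", "myobject")
print(put_req["url"])  # https://swift.example.com/v1/myaccount/mycontainer/myobject
```

In a real client, the dictionary fields would map directly onto an HTTP library call (or a `curl` invocation) against the cluster's proxy.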
A storage URL in Swift for an object has the following structure:

swift.ixiacom.com/v1/myaccount/mycontainer/myobject

The storage URL has two basic parts: the cluster location and the storage location. Using the example above, we can break the storage URL into its two main parts:
- Cluster location: swift.ixiacom.com/v1/
- Storage location (for an object): /myaccount/mycontainer/myobject
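The split above can be expressed in a few lines of code. This is a minimal sketch using Python's standard library, with the account/container/object names taken from the example URL:

```python
# Minimal sketch: split a Swift storage URL into its cluster location
# and storage location parts, mirroring the breakdown above.
from urllib.parse import urlsplit

def split_storage_url(url):
    parts = urlsplit(url)
    # The path looks like /v1/<account>[/<container>[/<object>]]
    segments = parts.path.lstrip("/").split("/", 2)
    version = segments[0]
    cluster_location = f"{parts.netloc}/{version}/"
    storage_location = "/" + "/".join(segments[1:])
    return cluster_location, storage_location

cluster, storage = split_storage_url(
    "https://swift.ixiacom.com/v1/myaccount/mycontainer/myobject")
print(cluster)   # swift.ixiacom.com/v1/
print(storage)   # /myaccount/mycontainer/myobject
```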
A storage location can have three formats:
- The account storage location is a uniquely named storage area that contains the metadata (descriptive information) about the account itself as well as the list of containers in the account.
- Note that in Swift, an account is not a user identity. When you hear account, think storage area.
- The container storage location is the user-defined storage area within an account where metadata about the container itself and the list of objects in the container will be stored.
- The object storage location is where the data object and its metadata will be stored.
In Swift, objects are protected by storing multiple copies of data so that, in the case of a node failure, the data can be retrieved from another node. A client wanting to make requests to Swift will issue the request to a proxy that sits in front of the nodes storing the actual data. The proxy will look at the request and determine (based on the storage location) which node or nodes it needs to send the request to.
In the case of an object read (GET) request, the request is served from just one replica that contains the object data. In the case of an object write (PUT) request, multiple replicas are selected (typically a replica count of three) and the write request is forwarded to each of them.
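The proxy's behavior just described can be sketched as follows. The node-selection function here is a deliberately simplified stand-in for a real ring lookup (covered below), and the node names are made up:

```python
# Hedged sketch of proxy replica selection: a GET is served from one
# replica, a PUT is fanned out to all replicas (three by default).
# nodes_for() is a toy stand-in for a real ring lookup.
import hashlib
import random

REPLICA_COUNT = 3

def nodes_for(storage_location, all_nodes, replica_count=REPLICA_COUNT):
    """Pick replica_count consecutive nodes, starting at a position
    derived from the storage location's hash (toy placement scheme)."""
    start = int(hashlib.md5(storage_location.encode()).hexdigest(), 16) % len(all_nodes)
    return [all_nodes[(start + i) % len(all_nodes)] for i in range(replica_count)]

def handle_request(verb, storage_location, all_nodes):
    replicas = nodes_for(storage_location, all_nodes)
    if verb == "GET":
        return [random.choice(replicas)]  # a read needs only one replica
    if verb == "PUT":
        return list(replicas)             # a write goes to every replica
    raise ValueError("unsupported verb: %s" % verb)

nodes = ["node-%d" % i for i in range(5)]
print(handle_request("PUT", "/myaccount/mycontainer/myobject", nodes))
print(handle_request("GET", "/myaccount/mycontainer/myobject", nodes))
```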
To determine the replicas to which an object will be written, or from which it will be read, Swift uses a distributed hash table similar in concept to Chord.
When the proxy or the consistency services need to locate data, they look at the storage location (account, container, or object location) and consult the corresponding ring: the account ring, the container ring, or the object ring.
Each Swift ring is a modified consistent hashing ring that is distributed to every node in the cluster. Boiled down, a modified consistent hashing ring contains a pair of lookup tables that Swift processes and services use to determine data locations: one table holds information about the drives in the cluster, and the other is used to look up where any piece of account, container, or object data should be placed.
Swift wants to store data uniformly across the cluster and have it be available quickly for requests. When a process, like a proxy server process, needs to find where data is stored for a request, it will call on the appropriate ring to get a value that it needs to correctly hash the storage location. The hash value of the storage location will map to a partition value.
The “consistent” part of a modified consistent hashing ring is where partitions come into play. The hashing ring is chopped up into a number of parts, each of which is assigned a small range of the hash values. These parts are the partitions in Swift.
One of the modifications that makes Swift’s hash ring a modified consistent hashing ring is that its partitions are fixed in number and uniform in size. As a ring is built, the partitions are assigned to drives in the cluster. The implementation is conceptually simple – a partition is just a directory sitting on a disk, with a corresponding hash table of what it contains.
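The hash-to-partition mapping can be sketched in a few lines, in the spirit of Swift's approach: hash the storage location with MD5 and keep only the top bits of the result, where a "partition power" controls how many partitions exist. The partition power of 8 below is an illustrative value, not a recommendation:

```python
# Sketch of mapping a storage location to a partition, in the spirit
# of Swift's ring: MD5 the path, keep the top bits of a 32-bit value.
# PART_POWER = 8 here is purely illustrative (2**8 = 256 partitions).
import hashlib
import struct

PART_POWER = 8                   # number of partitions = 2**PART_POWER
PART_SHIFT = 32 - PART_POWER     # drop the low bits of a 32-bit hash

def partition_for(account, container=None, obj=None):
    """Hash a storage location and map it to one of 256 uniform partitions."""
    path = "/" + "/".join(p for p in (account, container, obj) if p)
    digest = hashlib.md5(path.encode("utf-8")).digest()
    # Interpret the first 4 bytes as a big-endian integer, then shift
    # so only the top PART_POWER bits select the partition.
    return struct.unpack(">I", digest[:4])[0] >> PART_SHIFT

part = partition_for("myaccount", "mycontainer", "myobject")
print(part)  # a value in 0..255, stable for this storage location
```

The ring's second lookup table would then map this partition number to the drives holding each of its replicas.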
The relationship between a storage node, a disk, and a partition is shown in Fig. 2.
Fig. 2. The relationship of a storage node, disk, and a partition. Storage nodes have disks. Partitions are represented as directories on each disk.
During the initial creation of the Swift rings, every partition is replicated and each replica is placed as uniquely as possible across the cluster. Each subsequent rebuilding of the rings will calculate which, if any, of the replicated partitions need to be moved to a different drive. Part of partition replication includes designating handoff drives. When a drive fails, the replication/auditing processes notice and push the missing data to handoff locations.
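As a rough illustration of primary and handoff placement, the sketch below picks a partition's primary drives and then designates the next drives on the ring as handoff locations. The placement scheme and drive names are made up for illustration; real ring building also weighs drives and spreads replicas across failure domains:

```python
# Hedged sketch of primary vs. handoff placement for a partition.
# Real Swift ring building is weighted and failure-domain aware;
# this toy version just walks the drive list in order.
def primaries_and_handoffs(partition, drives, replica_count=3, handoffs=2):
    """Return (primary drives, handoff drives) for a partition."""
    start = partition % len(drives)
    order = [drives[(start + i) % len(drives)] for i in range(len(drives))]
    primary = order[:replica_count]                       # hold the replicas
    handoff = order[replica_count:replica_count + handoffs]  # used on failure
    return primary, handoff

drives = ["d%d" % i for i in range(6)]
primary, handoff = primaries_and_handoffs(42, drives)
print(primary)  # ['d0', 'd1', 'd2'] — the partition's replica drives
print(handoff)  # ['d3', 'd4'] — where data is pushed if a primary fails
```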
Swift ATI Support
We’ve recently added OpenStack Swift support in our latest bi-weekly Application Threat Intelligence (ATI) update, along with two new Superflows (OpenStack Swift Demo Super Flow and OpenStack Swift Replicated Object Storage and Retrieval) that demonstrate the available actions as well as how they can be composed to emulate object storage and retrieval within a cluster. The contents of the OpenStack Swift Replicated Object Storage and Retrieval Superflow are shown in Fig. 3.
Fig. 3. OpenStack Swift Replicated Object Storage and Retrieval Superflow
Are you ready to deploy your own OpenStack Swift cluster or cloud? If so, look for and download our latest ATI updates from Ixia’s StrikeCenter so you can test that your deployment and all other networking elements will function as expected.
Leverage Subscription Service to Stay Ahead of Attacks
The Ixia BreakingPoint Application and Threat Intelligence (ATI) program provides bi-weekly updates of the latest application protocols and attacks for use with Ixia platforms.