The Blockstack Network stores application data using a storage system called Gaia. Transactional metadata is stored on the Blockstack blockchain, and user application data is stored in Gaia storage. Storing data off the blockchain ensures that Blockstack applications can provide users with high performance and high availability for data reads and writes without introducing central trusted parties.
- Understand Gaia in the Blockstack architecture
- User control or how is Gaia decentralized?
- Understand data storage
- Gaia versus other storage systems
Understand Gaia in the Blockstack architecture
The following diagram depicts the Blockstack architecture and Gaia’s place in it:
Blockchains require consensus among large numbers of people, so they can be slow. Additionally, a blockchain is not designed to hold a lot of data. This means using a blockchain for every bit of data a user might write and store is expensive. For example, imagine if an application were storing every tweet in the chain.
Blockstack addresses blockchain performance problems using a layered approach. At the base of the system is a blockchain and the Blockstack Naming System (BNS). The blockchain governs ownership of names (identities) in the system, names such as domain names, usernames, and application names.
Names in Blockstack correspond to routing data in the OSI stack. The routing data is stored in the Atlas Peer Network, the second layer. Every core node that joins the Blockstack Network is able to obtain an entire copy of this routing data. Blockstack uses the routing data to associate names (usernames, domains, and application names) with a particular storage location.
The final layer is the Gaia Storage System. A Gaia system consists of a hub service and a storage resource hosted with a cloud provider such as Azure, DigitalOcean, or Amazon EC2. Typically, the compute resource and the storage resource belong to the same cloud vendor. Gaia currently has driver support for S3 and Azure Blob Storage, and the driver model allows other backends to be added.
Because Gaia stores application and user data off the blockchain, a Blockstack DApp is typically more performant than DApps created on other blockchains. Moreover, users choose where their data lives, and Gaia enables applications to access that user data via a uniform API. When the user logs in, the authentication process gives the application the URL of a Gaia hub, which then writes to storage on behalf of that user.
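The layering described above can be pictured with a small in-memory model. This is only a sketch: the dictionaries stand in for the blockchain and BNS, the Atlas Peer Network, and a Gaia hub, and every name, URL, and field in it is hypothetical. In particular, keeping just a digest of the routing data in the chain record is an assumption of the sketch, not something this page states.

```python
import hashlib


def digest(text: str) -> str:
    """Return a hex SHA-256 digest, used here to tie the layers together."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Layer 3: a Gaia hub holds the application data (the URL is made up).
HUB_URL = "https://gaia.example.org/hub/alice/"
GAIA = {HUB_URL + "notes.txt": b"hello from layer three"}

# Layer 2: the Atlas Peer Network holds routing data, keyed here by its digest.
ROUTING_DATA = HUB_URL
ATLAS = {digest(ROUTING_DATA): ROUTING_DATA}

# Layer 1: the blockchain and BNS record who owns a name. Storing only a
# digest of the routing data in this record is an assumption of the sketch.
BNS = {
    "alice.id": {
        "owner": "alice-public-key",
        "routing_digest": digest(ROUTING_DATA),
    }
}


def resolve(name: str, filename: str) -> bytes:
    """Walk name -> routing data -> storage location -> file contents."""
    record = BNS[name]
    routing = ATLAS[record["routing_digest"]]
    if digest(routing) != record["routing_digest"]:
        raise ValueError("routing data does not match the chain record")
    return GAIA[routing + filename]
```

Only the small, slow-changing name record lives in the bottom layer; the bulk data is read from the hub without touching the chain.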
User control or how is Gaia decentralized?
A Gaia hub runs as a service that writes to data storage. The hub service accepts a write only when the requestor presents a valid authentication token. Typically, the hub service runs on a compute resource and the storage itself sits on a separate, dedicated storage resource, with both resources belonging to the same cloud computing provider.
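A token-gated write can be sketched as follows. The real hub's token format is not described on this page, so an HMAC over the bucket name stands in for it; `ToyHub`, `make_token`, and the per-bucket keys are all hypothetical names chosen for illustration.

```python
import hashlib
import hmac


def make_token(key: bytes, bucket: str) -> str:
    """Build a stand-in authentication token for one bucket."""
    return hmac.new(key, bucket.encode("utf-8"), hashlib.sha256).hexdigest()


class ToyHub:
    """In-memory stand-in for a hub service in front of a storage resource."""

    def __init__(self, bucket_keys):
        self._bucket_keys = dict(bucket_keys)  # bucket name -> key
        self._storage = {}                     # (bucket, path) -> bytes

    def write(self, bucket: str, path: str, data: bytes, token: str) -> str:
        """Store data only if the token is valid for the bucket."""
        key = self._bucket_keys.get(bucket)
        if key is None or not hmac.compare_digest(make_token(key, bucket), token):
            raise PermissionError("invalid authentication token")
        self._storage[(bucket, path)] = data
        return bucket + "/" + path

    def read(self, bucket: str, path: str) -> bytes:
        """Reads are not gated in this sketch."""
        return self._storage[(bucket, path)]
```

A requestor holding the right key can call `hub.write("alice", "data.txt", b"...", make_token(key, "alice"))`; any other token is rejected before storage is touched.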
Gaia’s approach to decentralization focuses on user control of data and its storage. If a user can choose which Gaia hub provider to use, then that choice is all the decentralization required to enable user-controlled applications.
The control of user data lies in the way that user data is accessed.
When an application fetches a file data.txt for a given user alice.id, the lookup follows these steps:
- Fetch the zonefile for alice.id.
- Read her profile URL from her zonefile.
- Fetch Alice’s profile.
- Verify that the profile is signed by alice.id’s key.
- Read the Gaia hub URL (for example, https://gaia.alice.org/) out of the profile.
- Fetch the file data.txt from that Gaia hub.
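The lookup steps above can be traced in a short, self-contained sketch. Nothing here talks to a real network: `ZONEFILES`, `WEB`, and `KEYS` are hypothetical in-memory stand-ins, the profile URL and the `gaia_hub_url` field name are invented, and an HMAC replaces the real public-key signature purely to keep the example within the standard library.

```python
import hashlib
import hmac
import json

ALICE_KEY = b"alice-demo-key"  # stand-in for Alice's signing key


def sign(key: bytes, payload: bytes) -> str:
    """HMAC stand-in for a real signature scheme."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


PROFILE_BODY = json.dumps({"gaia_hub_url": "https://gaia.alice.org/"}, sort_keys=True)

ZONEFILES = {"alice.id": {"profile_url": "https://profiles.example.org/alice.json"}}
KEYS = {"alice.id": ALICE_KEY}
WEB = {
    "https://profiles.example.org/alice.json": {
        "body": PROFILE_BODY,
        "signature": sign(ALICE_KEY, PROFILE_BODY.encode("utf-8")),
    },
    "https://gaia.alice.org/data.txt": b"hello from alice",
}


def fetch_user_file(name: str, filename: str) -> bytes:
    zonefile = ZONEFILES[name]                      # 1. fetch the zonefile
    profile_url = zonefile["profile_url"]           # 2. read the profile URL
    document = WEB[profile_url]                     # 3. fetch the profile
    expected = sign(KEYS[name], document["body"].encode("utf-8"))
    if not hmac.compare_digest(expected, document["signature"]):
        raise ValueError("profile signature does not match")  # 4. verify
    hub_url = json.loads(document["body"])["gaia_hub_url"]    # 5. read hub URL
    return WEB[hub_url + filename]                  # 6. fetch the file
```

The verification step matters because the profile is fetched from a location the application does not trust; a profile that fails the check is rejected before its hub URL is ever used.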
Because alice.id has access to her zonefile, she can change where her profile is stored. For example, she may do this if the current profile’s service or storage is compromised. To change where her profile is stored, she changes her Gaia hub URL to another Gaia hub URL from another hub provider. If Alice has sufficient compute and storage resources herself, she may run her own Gaia Storage System and bypass a commercial Gaia hub provider altogether.
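A rough illustration of why changing one URL is enough: readers derive the storage location at read time, so they follow Alice to whichever hub she names. The hub URLs, the `gaia_hub_url` field, and the `migrate` helper are hypothetical, and copying the existing files to the new hub is an assumption of this sketch rather than a step described here.

```python
# Two toy hubs, keyed by their (made-up) URLs.
HUBS = {
    "https://gaia.alice.org/": {"data.txt": b"v1"},
    "https://hub.example.net/alice/": {},
}

# The piece of state Alice controls.
profile = {"gaia_hub_url": "https://gaia.alice.org/"}


def read_file(user_profile: dict, filename: str) -> bytes:
    """Readers resolve the hub from the profile on every read."""
    return HUBS[user_profile["gaia_hub_url"]][filename]


def migrate(user_profile: dict, new_hub_url: str) -> None:
    """Copy files to the new hub, then repoint the profile at it."""
    old_files = HUBS[user_profile["gaia_hub_url"]]
    HUBS[new_hub_url].update(old_files)
    user_profile["gaia_hub_url"] = new_hub_url
```

After `migrate(profile, "https://hub.example.net/alice/")`, later changes on the old hub no longer affect what readers see.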
Applications writing directly on behalf of Alice do not need to perform a lookup. Instead, the Blockstack authentication flow provides Alice’s chosen application root URL to the application. This authentication flow is also within Alice’s control because Alice’s browser must generate the authentication response.
Understand data storage
A Gaia hub stores the written data exactly as given. It offers minimal guarantees about the data. It does not ensure that data is validly formatted, contains valid signatures, or is encrypted. Rather, the design philosophy is that these concerns are client-side concerns.
Client libraries (such as blockstack.js) are capable of providing these guarantees. Blockstack used a liberal definition of the end-to-end principle to guide this design decision.
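The split of responsibilities can be sketched as a hub that stores bytes verbatim and a client that adds and checks integrity itself. This is not the blockstack.js API: `client_put`, `client_get`, and the envelope layout are hypothetical, and an HMAC stands in for the signature scheme a real client library would use.

```python
import hashlib
import hmac
import json

HUB_STORAGE = {}


def hub_write(path: str, blob: bytes) -> None:
    """The hub stores exactly what it is given, with no validation."""
    HUB_STORAGE[path] = blob


def hub_read(path: str) -> bytes:
    return HUB_STORAGE[path]


def client_put(key: bytes, path: str, obj) -> None:
    """Client side: serialize, tag, and only then hand the bytes to the hub."""
    body = json.dumps(obj, sort_keys=True)
    tag = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    hub_write(path, json.dumps({"body": body, "tag": tag}).encode("utf-8"))


def client_get(key: bytes, path: str):
    """Client side: check format and integrity before trusting the data."""
    envelope = json.loads(hub_read(path))  # raises ValueError on malformed JSON
    expected = hmac.new(key, envelope["body"].encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, envelope["tag"]):
        raise ValueError("integrity check failed")
    return json.loads(envelope["body"])
```

The hub accepts `hub_write("junk.json", b"not json")` without complaint; it is `client_get` that refuses malformed or altered data, which is the end-to-end arrangement described above.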
Gaia versus other storage systems
Here’s how Gaia stacks up against other decentralized storage systems. Features that are common to all storage systems are omitted for brevity.
|User controls where data is hosted||X|
|Data can be viewed in a normal Web browser||X||X|
|Data is read/write||X||X||X|
|Data can be deleted||X||X||X|
|Data can be listed||X||X||X||X||X|
|Deleted data space is reclaimed||X||X||X||X|
|Data lookups have predictable performance||X||X|
|Write permission can be delegated||X|
|Listing permission can be delegated||X|
|Supports multiple backends natively||X||X|
|Data is globally addressable||X||X||X||X||X|
|Needs a cryptocurrency to work||X||X|
|Data is content-addressed||X||X||X||X||X|