Luca, January 2025
This document aims to be the entry point for anyone interested in retrievability on Filecoin. Each section presents one aspect of retrievability, examined in the context of web3 and, specifically, of the Filecoin network.
Filecoin is the largest decentralized storage network to date, boasting over 4 EiB of raw byte capacity and approximately 120 PiB of user data stored (see here, keeping in mind that FIL+ deals have a 10x QAP multiplier).
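To make the 10x remark concrete, here is a minimal arithmetic sketch of how a raw-byte figure relates to quality-adjusted power (QAP). It assumes, purely for illustration, that all of the user data sits in FIL+ deals and that the multiplier applies uniformly; the network's actual weighting is more detailed. The function names are hypothetical and are not part of any Filecoin tool.

```python
PIB = 2**50  # bytes in one pebibyte
EIB = 2**60  # bytes in one exbibyte

FILPLUS_MULTIPLIER = 10  # QAP multiplier for FIL+ deals, as stated in the text


def qap_from_raw(raw_bytes: int, multiplier: int = FILPLUS_MULTIPLIER) -> int:
    """Quality-adjusted power for raw bytes that all carry the same multiplier."""
    return raw_bytes * multiplier


def raw_from_qap(qap_bytes: int, multiplier: int = FILPLUS_MULTIPLIER) -> int:
    """Inverse conversion: recover raw bytes from a QAP figure (same assumption)."""
    return qap_bytes // multiplier


# Figures quoted in the text: about 120 PiB of user data, over 4 EiB of raw capacity.
user_data_raw = 120 * PIB
raw_capacity = 4 * EIB

# Simplifying assumption: every byte of user data is in a FIL+ deal.
user_data_qap = qap_from_raw(user_data_raw)

print(f"User data as QAP: {user_data_qap / PIB:.0f} PiB")
print(f"Share of raw capacity holding user data: {user_data_raw / raw_capacity:.1%}")
```

Under these assumptions, a dashboard that reports deal power in QAP terms would show roughly ten times the raw user-data figure, which is why the two numbers should not be compared directly.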
For any storage network to achieve long-term success, it must ensure reliable data retrieval. In the case of Filecoin, retrievability refers to the ability to reliably access and retrieve files stored by Storage Providers (SPs). Unlike traditional cloud storage systems, where data retrieval is guaranteed by a centralized provider, Filecoin is a decentralized system. This introduces unique challenges related to network performance, data redundancy, SP reliability, and incentives.
This document explores the various strategies and protocols available within Filecoin to enhance data retrievability, along with their guarantees and limitations. It also gives an overview of different payment strategies and of the mechanisms SPs and clients can use to select each other.
It also presents a range of ideas for improving retrievability that could be explored at the protocol design level.
Each section ends with a summary table that gives a bird's-eye view of its content.
All the tables can be found in 📊 List of Summary Tables.
Filecoin enables decentralized storage, where clients store data with Storage Providers (SPs) and later retrieve that data on demand. However, unlike storage, which is provable (via Proof of Replication and Proof of Spacetime), retrieval is a separate process that is not always provable. It depends on protocols and strategies that each address different challenges.
Retrievability is influenced by several factors, including:
Filecoin offers various flavors of retrievability: strategies and protocols that help ensure data can be retrieved when needed. These range from redundancy models to retrieval networks and off-chain solutions.