Blockchain technology has grown remarkably, and with it the need for robust data analytics that can extract insights from blockchain networks. Various tools have emerged in recent years, but the tooling for analyzing Substrate blockchains in particular remains relatively nascent.
Current Substrate blockchain explorers, such as polkadot.js, polkastats, subscan, and others, have made significant progress in providing visibility into blockchain data. They make transactions and events on Substrate-based blockchains more accessible and easier to explore. However, these tools have limitations. One notable limitation is the lack of flexibility when it comes to performing aggregations based on specific parameters.
This restriction makes it harder to derive deeper insights and perform complex analysis on blockchain data. In this article, we will explore an approach that combines open-source technologies for indexing and database management with a versatile visualisation tool.
The goal of the article is to present a simple, user-friendly method for analyzing Substrate blockchains using a range of open-source tools. The approach excels in simplicity, but it has limitations in scalability for large-scale blockchain networks. Nevertheless, by leveraging popular open-source tools, users can navigate and explore Substrate blockchains to uncover meaningful patterns and trends.
<aside> ℹ️ For those who are eager to dive right into the practical demonstration and get a quick overview of the contents of this article, I've created a video that summarizes the key points and provides a more interactive demo. In the video, I walk through the concepts discussed in this article and showcase the examples in action:
‣
</aside>
The emphasis in this article is on simplicity and flexibility, with the aim of providing a comprehensive and customizable analytical experience that runs locally. The article is based solely on the author's experience, and the approach may not be as scalable or sophisticated as other solutions. It is intended as a starting point for those seeking an accessible entry into Substrate blockchain analysis.
Throughout this article, the examples and instructions are tailored to analyzing a Substrate blockchain based on the SORA network, with Subquery as the primary tool for indexing blockchain data. The concepts and techniques discussed can be applied to other Substrate-based blockchains, but certain details and configurations may vary depending on the specific network. The focus on the SORA blockchain and Subquery stems from the author's familiarity with this network and tool.
Metabase was chosen for its user-friendly interface and simplicity, making it an ideal choice for beginners and those seeking an easy-to-use analytics solution.
One might wonder why not use query aggregation, as described in the Subquery documentation, to achieve similar results. While query aggregation is a powerful feature, it currently has limitations when working with JSON objects, including those of the @jsonField type. This means that aggregating data stored in generic @jsonFields, as in the example mentioned in Subquery’s GitHub issue #522, is technically impossible using the GraphQL query editor. To overcome this limitation and provide a flexible and accessible analytical experience, Metabase serves as a valuable alternative.
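To make the limitation concrete, here is a minimal Python sketch of the kind of aggregation in question: grouping and summing over keys that live inside a JSON payload rather than in typed columns. The row shape (`method`, `data`, `assetId`, `amount`) and the use of token symbols as asset identifiers are hypothetical and chosen only for illustration; they are not the actual Subquery or SORA schema.

```python
import json
from collections import defaultdict
from decimal import Decimal

# Hypothetical rows as an indexer might store them: one typed column plus a
# generic JSON payload. All field names here are illustrative assumptions.
ROWS = [
    {"method": "transfer", "data": '{"assetId": "XOR", "amount": "10.5"}'},
    {"method": "transfer", "data": '{"assetId": "VAL", "amount": "3"}'},
    {"method": "transfer", "data": '{"assetId": "XOR", "amount": "1.5"}'},
    {"method": "swap", "data": '{"assetId": "XOR", "amount": "7"}'},
]


def sum_by_json_key(rows, method, group_key, value_key):
    """Sum `value_key` per `group_key`, both read from the JSON payload."""
    totals = defaultdict(Decimal)
    for row in rows:
        if row["method"] != method:
            continue
        payload = json.loads(row["data"])
        # Decimal avoids float rounding when adding token amounts.
        totals[payload[group_key]] += Decimal(payload[value_key])
    return dict(totals)


totals = sum_by_json_key(ROWS, "transfer", "assetId", "amount")
```

Both the grouping key and the summed value sit inside the JSON payload, which is the case the query aggregation feature does not cover. Once the indexed data is in a SQL database, the same grouping can be written as a query over the JSON column, which is the role Metabase plays in this article.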
Finally, it is worth noting that this article is not sponsored by any of the mentioned entities. The choice of SORA and Subquery is solely driven by the author's experience and familiarity with these technologies.
Before diving into the implementation details, it's essential to acknowledge that different options exist to analyze Substrate blockchains, each with its own strengths and weaknesses.
substrate-api-sidecar is a tool developed by Parity Technologies that provides a simplified interface for interacting with Substrate blockchains. The primary purpose of substrate-api-sidecar is to expose blockchain data through a RESTful API, enabling users to retrieve information such as blocks, transactions, events, and account details. While substrate-api-sidecar is a powerful tool for accessing and retrieving blockchain data, it primarily focuses on providing an interface to interact with the blockchain rather than performing complex data analysis or aggregations. For performing aggregations and advanced analysis on Substrate blockchains, additional tools and frameworks, such as those discussed in this article, can be used in combination with substrate-api-sidecar. These tools allow for more flexibility in customizing aggregations and deriving insights from blockchain data.
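As a rough illustration of that division of labour, the Python sketch below retrieves a block over the REST API and leaves the aggregation to the client. It assumes a sidecar instance listening on `127.0.0.1:8080` and a response in which each extrinsic carries an `events` list whose entries have a `method` object with `pallet` and `method` keys. The sample block is hand-written and trimmed, not real chain data, and the helper names are hypothetical.

```python
import json
from collections import Counter
from urllib.request import urlopen

# Assumption: a sidecar instance is reachable at this address.
SIDECAR_URL = "http://127.0.0.1:8080"


def fetch_block(block_id, base_url=SIDECAR_URL):
    """Fetch one block as JSON from the /blocks/{blockId} endpoint."""
    with urlopen(f"{base_url}/blocks/{block_id}", timeout=10) as response:
        return json.load(response)


def count_events(block):
    """Count events per "pallet.method" across the block's extrinsics."""
    counts = Counter()
    for extrinsic in block.get("extrinsics", []):
        for event in extrinsic.get("events", []):
            method = event.get("method", {})
            counts[f'{method.get("pallet")}.{method.get("method")}'] += 1
    return counts


# Hand-written, trimmed sample in the assumed response shape, so the
# counting logic can be tried without a running node.
SAMPLE_BLOCK = {
    "number": "1",
    "extrinsics": [
        {"events": [{"method": {"pallet": "system", "method": "ExtrinsicSuccess"}}]},
        {
            "events": [
                {"method": {"pallet": "balances", "method": "Transfer"}},
                {"method": {"pallet": "system", "method": "ExtrinsicSuccess"}},
            ]
        },
    ],
}

sample_counts = count_events(SAMPLE_BLOCK)
```

Against a live instance, `count_events(fetch_block("head"))` would summarise the latest block. Any aggregation across many blocks has to be assembled on the client side, one request at a time, which is why an indexer plus a database is a better fit for analysis.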
The colorfulnotion/substrate-etl project focuses on providing an Extract, Transform, Load (ETL) framework specifically tailored for Substrate-based blockchains. It aims to simplify the extraction of blockchain data, perform necessary transformations, and load the data into a database for further analysis. This approach enables users to create their own custom data pipelines and apply aggregations based on their specific requirements.
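The general extract, transform, load pattern can be sketched in a few lines of Python. This is not substrate-etl's code or schema: the event records, the table layout, and the use of an in-memory SQLite database are all assumptions made to keep the example self-contained.

```python
import sqlite3

# Hypothetical decoded events; a real pipeline would read these from a node
# or an indexer rather than from an in-memory list.
RAW_EVENTS = [
    {"block": 100, "pallet": "balances", "method": "Transfer", "amount": "1500"},
    {"block": 100, "pallet": "system", "method": "ExtrinsicSuccess", "amount": None},
    {"block": 101, "pallet": "balances", "method": "Transfer", "amount": "500"},
]


def extract():
    """Extract: yield raw event records (here, from the sample list)."""
    yield from RAW_EVENTS


def transform(event):
    """Transform: flatten the event name and convert the amount to an int."""
    amount = int(event["amount"]) if event["amount"] is not None else None
    return (event["block"], f'{event["pallet"]}.{event["method"]}', amount)


def load(connection, rows):
    """Load: write the transformed rows into a table ready for SQL queries."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS events (block INTEGER, name TEXT, amount INTEGER)"
    )
    connection.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    connection.commit()


connection = sqlite3.connect(":memory:")
load(connection, (transform(event) for event in extract()))

# Once loaded, a custom aggregation is a plain SQL query.
summary = connection.execute(
    "SELECT name, COUNT(*), SUM(amount) FROM events GROUP BY name ORDER BY name"
).fetchall()
```

Keeping the three stages as separate functions means the extraction source or the target database can be swapped without touching the aggregation queries.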