Blockchain Performance, Throughput and Scalability
When choosing a blockchain protocol for a healthcare solution, network performance, transaction throughput, and scalability must be considered.
In software, performance is generally measured by the volume of data moving from one endpoint to another in a given time. Performance depends on the latency and bandwidth of the network and, in blockchain specifically, on the number and distribution of the consensus nodes.
Typical performance for a private consortium blockchain platform ranges from hundreds to thousands of transactions per second. Real-world examples of healthcare data in blockchain technologies are limited, and the metrics will likely differ depending on the types of healthcare data involved.
Blockchain best practice holds that ‘smaller is better’. Unfortunately for blockchain performance, most EHR and clinical data sets are large, including historical information and data from multiple institutions. Data may not accumulate in chronological order (one facility may report more slowly than another). Because each node keeps a full copy of the ledger and every new transaction must be replicated to all of them, performance may degrade as this information accumulates. Limiting data sets to demographics, encounters, diagnoses, and medications, and excluding larger data (images, notes, etc.), could help keep performance stable. This would still depend on the networks associated with the blockchain having sufficient bandwidth. Other limiting factors affect network performance within nodes and are beyond the architectural design of the blockchain.
One potential use for blockchain technology is letting patients own their data and grant access to whomever they choose. This type of solution would be less affected by performance characteristics because speed is less critical. Conversely, retrieving critical information about a patient in an emergency would depend on a high-performance solution; for example, sharing medication allergies during an encounter.
In general, blockchain protocols express throughput as the rate at which transactions, grouped into blocks, are appended to the blockchain per second. This metric depends on the applicable consensus algorithm, which specifies how nodes communicate to ensure the validity of each appended transaction and the consistency of their copies of the shared ledger.
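The relationship between block rate and transaction rate can be sketched with simple arithmetic. This is a simplified illustration only: the function name and the example figures (500 transactions per block, a 2-second block interval) are assumptions, not measurements from any platform, and the estimate ignores consensus overhead and network latency.

```python
def transactions_per_second(tx_per_block: int, block_interval_seconds: float) -> float:
    """Rough throughput estimate: transactions in a block divided by the time between blocks.

    Assumes blocks are always full and arrive at a fixed interval,
    which real consensus algorithms do not guarantee.
    """
    if block_interval_seconds <= 0:
        raise ValueError("block_interval_seconds must be positive")
    return tx_per_block / block_interval_seconds


# Hypothetical figures chosen only for illustration.
estimated_tps = transactions_per_second(tx_per_block=500, block_interval_seconds=2.0)
print(f"Estimated throughput: {estimated_tps:.0f} transactions per second")
```

In practice the block interval itself is a product of the consensus algorithm and the number of participating nodes, so the same block size can yield very different throughput on different networks.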
The predominant factors affecting throughput are the design, the size of the data, and the scope of the blockchain. Designing for the smallest footprint (critical vs. non-critical information) limits what can be placed on the chain. Current data trends focus on sharing everything and using what you need. Blockchain depends on a different mindset to facilitate throughput, one focused on moving a limited set of data. If throughput is a major consideration in the design, examples of easily encapsulated data are demographics, diagnoses, dates of service, and other self-contained pieces of information.
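A minimal sketch of the "limited set of data" idea follows. The field names, the sample record, and the helper functions are all hypothetical and do not come from any EHR standard or blockchain SDK; the point is only to compare the size of a trimmed payload against the full record.

```python
import json

# Hypothetical set of small, self-contained fields kept for the ledger.
LIGHTWEIGHT_FIELDS = ("patient_id", "demographics", "diagnosis", "date_of_service")


def to_ledger_payload(record: dict) -> dict:
    """Keep only the lightweight fields; larger items stay out of the payload."""
    return {key: record[key] for key in LIGHTWEIGHT_FIELDS if key in record}


def payload_size_bytes(payload: dict) -> int:
    """Size of the payload when serialized as UTF-8 JSON."""
    return len(json.dumps(payload, sort_keys=True).encode("utf-8"))


# Illustrative record with placeholder values standing in for large content.
full_record = {
    "patient_id": "p-001",
    "demographics": {"birth_year": 1980, "sex": "F"},
    "diagnosis": "J45.909",
    "date_of_service": "2020-01-15",
    "clinical_note": "x" * 20_000,   # stands in for a long free-text note
    "image_base64": "A" * 500_000,   # stands in for an encoded image
}

slim_payload = to_ledger_payload(full_record)
print("full record bytes: ", payload_size_bytes(full_record))
print("slim payload bytes:", payload_size_bytes(slim_payload))
```

Deciding which fields count as critical is a design question for the stakeholders; the code only shows how large the difference in payload size can be once that decision is made.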
Blockchains scale best with lightweight metadata, provenance, transaction information, and audit information. In contrast, they do not scale as well with the addition of larger information types, such as images or full genomic datasets. This is because the data for every block committed to the blockchain must be replicated around every node of the blockchain, and each node of the blockchain stores a complete copy of the shared ledger, representing the sum of all data across all blocks from the start of the blockchain.
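The replication cost described above can be made concrete with a small calculation. The block sizes and node count below are assumed figures for illustration, not measurements, and the function is a hypothetical helper.

```python
def replicated_storage_bytes(block_sizes_bytes: list[int], node_count: int) -> int:
    """Total bytes stored network-wide when every node keeps the full ledger."""
    ledger_bytes = sum(block_sizes_bytes)  # one complete copy of the ledger
    return ledger_bytes * node_count       # one copy per node


# Hypothetical ledgers of 1,000 blocks each.
metadata_blocks = [2_000] * 1_000        # about 2 KB per block of metadata
imaging_blocks = [50_000_000] * 1_000    # about 50 MB per block of image data

nodes = 10
metadata_total = replicated_storage_bytes(metadata_blocks, nodes)
imaging_total = replicated_storage_bytes(imaging_blocks, nodes)

print(f"metadata ledger, network-wide: {metadata_total / 1e6:.0f} MB")
print(f"imaging ledger, network-wide:  {imaging_total / 1e9:.0f} GB")
```

Because total storage grows with both ledger size and node count, adding large data types multiplies the cost across every participant rather than adding to it once.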
Scalability is always a consideration in software implementations. It is often the largest problem faced by entrepreneurial companies that are built around a small set of clients and then need to take on a larger set. The same will be true for blockchains storing healthcare data: the field is at an entrepreneurial stage and will need to expand rapidly. Design options can be as simple as using multiple blockchains for different data sets (demographics change very little, while encounters change regularly), but options for scaling should be considered as part of the initial buildout.
With blockchain, refactoring all of the gathered information after the fact, as it grows, will be challenging. Starting with the right set of data (which is difficult) will improve the ability to scale.
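One way to picture the multiple-blockchain option is a router that sends each record type to its own ledger. This is a sketch under stated assumptions: the ledger names, the record types, and the `LedgerRouter` class are hypothetical, and plain in-memory lists stand in for real blockchains.

```python
from collections import defaultdict

# Hypothetical mapping from record type to a separate ledger.
LEDGER_FOR_TYPE = {
    "demographics": "slow_changing_ledger",  # changes very little
    "encounter": "fast_changing_ledger",     # changes on a regular basis
}


class LedgerRouter:
    """Routes records to separate ledgers (in-memory lists stand in for chains)."""

    def __init__(self, routes: dict[str, str]) -> None:
        self.routes = dict(routes)
        self.ledgers: dict[str, list[dict]] = defaultdict(list)

    def append(self, record_type: str, payload: dict) -> str:
        """Append the payload to the ledger for its type and return the ledger name."""
        ledger_name = self.routes.get(record_type)
        if ledger_name is None:
            raise ValueError(f"no ledger configured for record type: {record_type}")
        self.ledgers[ledger_name].append(payload)
        return ledger_name


router = LedgerRouter(LEDGER_FOR_TYPE)
router.append("demographics", {"patient_id": "p-001", "birth_year": 1980})
router.append("encounter", {"patient_id": "p-001", "date_of_service": "2020-01-15"})
router.append("encounter", {"patient_id": "p-001", "date_of_service": "2020-03-02"})

for name, entries in router.ledgers.items():
    print(name, len(entries))
```

Separating data by rate of change keeps the slow-changing ledger small while the fast-changing one grows, so each can be sized and scaled on its own terms.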
The upfront design is critical to the performance, throughput and scalability of a blockchain. Having a clear idea of the problem that you are working to solve, what information and which stakeholders are needed to address the problem, and the expected result will shape these elements.
Limiting the result set initially, and expanding it as you discover what is needed and used, is a way to control all three of these factors. With a distributed ledger, getting what is needed rather than getting everything will help with performance, throughput and scalability.