Reliability, availability and data integrity are often treated as synonyms, but in reality they are not. Knowing the difference between them helps organisations understand how safe their data is.
Reliability determines how often a system needs repair. It is commonly expressed as MTBF (Mean Time Between Failures), which is mathematically defined as the reciprocal of a system’s failure rate. Complicated systems are more prone to failure than simple ones because they contain more components. Thus, people generally think simple is better, since fewer components mean fewer things that can fail. The flaw in this logic is that when a system has only the basic components, every small failure can lead to downtime. The solution is to add redundancy to eliminate single points of failure. However, the downside is that the extra components increase repair activity and lower reliability.
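The trade-off between redundancy and reliability can be illustrated with a minimal sketch. It assumes constant, independent failure rates and that every component failure triggers a repair, so the rates simply add up. The function names and the example rates are hypothetical and chosen only for illustration.

```python
def mtbf_hours(failure_rate_per_hour: float) -> float:
    """MTBF is the reciprocal of the failure rate."""
    if failure_rate_per_hour <= 0:
        raise ValueError("failure rate must be positive")
    return 1.0 / failure_rate_per_hour


def total_failure_rate(component_rates: list[float]) -> float:
    """Assumption: failures are independent and each one needs a repair,
    so the system-level repair rate is the sum of the component rates."""
    return sum(component_rates)


# Hypothetical failure rates (failures per hour), for illustration only.
simple_system = [1e-5, 2e-5]
redundant_system = simple_system + [1e-5, 2e-5]  # every component duplicated

print(mtbf_hours(total_failure_rate(simple_system)))
print(mtbf_hours(total_failure_rate(redundant_system)))
```

Under these assumptions the redundant system sees repair events about twice as often, even though a single failure no longer has to cause downtime.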
Availability is the proportion of time a storage system can service I/O requests, and is generally expressed as a percentage. Its formula is MTBF/(MTBF+MTTR), where MTTR is the Mean Time To Repair. The formula reaches 100% only when MTTR is zero, which means a system can be 100% available only when there is no single point of failure, all software is up to date, and repairs are nondisruptive.
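As a rough sketch, the availability formula can be evaluated directly. The helper names, the 8,760-hour year and the example figures are assumptions made for illustration, not measurements from a real system.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: a non-leap year


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR), as a fraction between 0 and 1."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def expected_downtime_hours_per_year(mtbf_hours: float, mttr_hours: float) -> float:
    """Unavailable fraction of the year, converted to hours."""
    return (1.0 - availability(mtbf_hours, mttr_hours)) * HOURS_PER_YEAR


# Hypothetical system: fails on average every 999 hours, takes 1 hour to repair.
print(f"{availability(999.0, 1.0):.3%}")
print(f"{expected_downtime_hours_per_year(999.0, 1.0):.2f} hours/year")
```

Setting `mttr_hours` to zero returns exactly 1.0, which matches the point that 100% availability requires repairs that cause no interruption.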
Data integrity can be defined as the ability to maintain the accuracy of data as it is written to, stored in, or read from a system.
Wikipedia explains it as the maintenance of, and the assurance of, data accuracy and consistency over its entire life cycle.
Generally, data integrity is linked to media quality. Data protection algorithms, software, and monitoring play a significant role in maintaining data integrity. Strategies include:
• Increasing the degree of resilience
• Correcting single-bit errors before they become double-bit errors
• Shrinking the vulnerability windows
• Monitoring changes in media
• Regularly replacing media that show signs of failure
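One simple way to picture how software can monitor data accuracy between write and read is a checksum comparison. The sketch below uses CRC-32 from Python's standard `zlib` module purely as an illustration; the text does not name a specific algorithm, and the `store` and `verify` helpers are hypothetical. A checksum like this only detects corruption, whereas correcting single-bit errors requires an error-correcting code.

```python
import zlib


def store(payload: bytes) -> tuple[bytes, int]:
    """Return the payload together with a CRC-32 computed at write time."""
    return payload, zlib.crc32(payload)


def verify(payload: bytes, checksum: int) -> bool:
    """Recompute the CRC-32 at read time and compare it with the stored value."""
    return zlib.crc32(payload) == checksum


data, crc = store(b"customer ledger block")

# Simulate media degradation by flipping a single bit in the first byte.
corrupted = bytes([data[0] ^ 0x01]) + data[1:]

print(verify(data, crc))       # True: the data is unchanged
print(verify(corrupted, crc))  # False: CRC-32 detects any single-bit error
```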
Meeting these requirements calls for storage architects to think in terms of usable availability. Usable availability is the system’s ability to meet its service level objectives even when hardware failures occur. The strategy is to reserve enough spare performance and capacity to absorb those failures.
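A headroom check for usable availability might look like the following sketch. It assumes performance is measured in IOPS, that node performance adds up linearly, and that the workload redistributes evenly across surviving nodes. The function name and the figures are hypothetical.

```python
def meets_slo_after_failures(
    node_iops: list[float], required_iops: float, failures: int = 1
) -> bool:
    """Check whether the service level objective still holds in the worst case,
    i.e. when the `failures` fastest nodes are the ones that are lost."""
    if failures < 0 or failures > len(node_iops):
        raise ValueError("failures must be between 0 and the number of nodes")
    # Keep the slowest nodes: sorting ascending and dropping the tail
    # removes the highest-performing nodes first.
    surviving = sorted(node_iops)[: len(node_iops) - failures]
    return sum(surviving) >= required_iops


# Hypothetical cluster: four nodes at 50,000 IOPS, SLO of 150,000 IOPS.
cluster = [50_000.0] * 4
print(meets_slo_after_failures(cluster, 150_000.0, failures=1))
print(meets_slo_after_failures(cluster, 150_000.0, failures=2))
```

In this example the cluster tolerates one node failure but not two, so an architect who needs to survive two failures would have to reserve more spare performance.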