Posts

Showing posts from 2016

Backblaze hard disk drive failure data: Update to Q2 2016

Ross Lazarus, September 2016. This is a Kaplan-Meier analysis of the Backblaze hard drive reliability data, using all available data to the end of the second quarter of 2016 from https://www.backblaze.com/b2/hard-drive-test-data.html . Previous posts are at http://bioinformare.blogspot.com.au/2016/05/survival-analysis-of-hard-disk-drive.html and http://bioinformare.blogspot.com.au/2016/02/survival-analysis-of-hard-disk-drive.html . I reran my scripts and got the plots shown below. Reading all the data now takes a while, as there are a very large number of drives spinning. A total of 41740623 rows were processed in about 35 minutes on my home desktop by the Python script in the GitHub repository. The new 8TB drives are performing the best of all, even better than the HGST and Hitachi drives, and far better than any of the earlier Seagates. This is hard to miss here, but not so obvious in the report at Backblaze https://www.backblaze.com/blog/hard-drive-failure-rates-q2-2016/ Upd
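For readers who want to try something similar, here is a minimal sketch of how daily drive rows might be collapsed into one record per drive before a survival analysis. It is not the script from the repository. It assumes the published CSV columns `date`, `serial_number`, `model` and `failure`; the function name `collapse_daily_rows`, the sample serial numbers and the sample models are made up for illustration. It also assumes that "days observed in the data" is an acceptable stand-in for drive age, which ignores any time a drive ran before the data collection started.

```python
import csv
import io
from datetime import date

# Hypothetical sample in the shape of the daily CSV files (values are made up).
SAMPLE = """\
date,serial_number,model,capacity_bytes,failure
2016-04-01,SN-A,MODEL-X,1000,0
2016-04-02,SN-A,MODEL-X,1000,0
2016-04-03,SN-A,MODEL-X,1000,1
2016-04-01,SN-B,MODEL-Y,2000,0
2016-04-02,SN-B,MODEL-Y,2000,0
"""

def collapse_daily_rows(rows):
    """Reduce one-row-per-drive-per-day data to one record per drive.

    Returns {serial_number: (model, days_observed, failed)}.
    A drive with failed == False is right censored: it was still
    running when last seen.
    """
    drives = {}
    for row in rows:
        day = date.fromisoformat(row["date"])
        rec = drives.setdefault(
            row["serial_number"],
            {"model": row["model"], "first": day, "last": day, "failed": False},
        )
        rec["first"] = min(rec["first"], day)
        rec["last"] = max(rec["last"], day)
        if row["failure"] == "1":
            rec["failed"] = True
    return {
        sn: (rec["model"], (rec["last"] - rec["first"]).days + 1, rec["failed"])
        for sn, rec in drives.items()
    }

records = collapse_daily_rows(csv.DictReader(io.StringIO(SAMPLE)))
for serial, record in sorted(records.items()):
    print(serial, record)
```

Streaming the rows through a dictionary keyed by serial number keeps memory use proportional to the number of drives rather than the number of rows, which matters once the row count reaches the tens of millions.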

Survival analysis of hard disk drive failure data: Update to Q1 2016

Ross Lazarus, May 2016. This is an update to http://bioinformare.blogspot.com.au/2016/02/survival-analysis-of-hard-disk-drive.html now that additional data for Q1 2016 has been released from https://www.backblaze.com/b2/hard-drive-test-data.html . I reran my scripts and got the plots shown below. The whole process only takes a few minutes. For me, the interesting thing is that so little really changes in the KM curves and statistics with 10% more data, suggesting that this statistical approach is reliable and robust, although in general we expect that more data provides better resolution. With more data, the WD30-EFRX and WD10-EADS drives are reordered in terms of failure risk, moving down near the middle of the pack, but the updated model KM curves otherwise suggest the same pattern of risk of failure over time. Hitachi and HGST have reversed their positions at the top of the manufacturer survival curves as a result of the additional data, but the other manufacturers remain largely

Survival analysis of hard disk drive failure data

Ross Lazarus, February 2016. Executive Summary: Using a well-established, objective analysis and data presentation method designed for right-censored hard disk drive failure data provides insights which are not provided by simple descriptive statistics or charts. The Kaplan-Meier statistics and plots are recommended for routine use with hard drive failure data, and their use is illustrated with 30M data points from the Backblaze public data. Introduction: Hard disk drives are widely used for mass storage in servers, network attached storage devices, laptops and desktop computers. Familiar and convenient as they are, these complex electro-mechanical devices are prone to sudden catastrophic failure, which can lead to very unpleasant consequences such as loss of data which was not securely backed up elsewhere. Selecting drive manufacturers and models for home or for commercial applications is complicated by the problem t
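To make the Kaplan-Meier idea concrete, here is a minimal sketch of the product-limit estimator in plain Python. It is an illustration under stated assumptions, not the code used for the plots: the function name `kaplan_meier` and the toy durations are made up, and each drive is assumed to have already been reduced to an observed duration plus a flag saying whether it failed (True) or was still running when last seen (False, i.e. right censored).

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Product-limit estimate of the survival function.

    durations: observed time for each drive (e.g. days in service)
    events:    True if the drive failed at that time, False if censored
    Returns a list of (time, survival_probability) at each failure time.
    """
    failures = Counter(t for t, failed in zip(durations, events) if failed)
    leaving = Counter(durations)      # drives leaving the risk set at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = failures.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk drives surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Failed and censored drives both drop out of the risk set after t.
        at_risk -= leaving[t]
    return curve

# Toy data (made up): five drives, two of them censored.
durations = [5, 8, 8, 12, 15]
events = [True, False, True, True, False]
curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"t={t:>3}  S(t)={s:.3f}")
```

Censored drives never lower the curve directly, but they do shrink the risk set, so later failures are weighed against fewer drives. That is what separates this estimate from a simple count of failures divided by the total number of drives.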