SQL Adventure: All About RAID

I have been asked about RAID many times in interviews, and on a couple of occasions my understanding was quite blurred. So now I am asking myself: ‘What is RAID?’, ‘What are RAID levels?’ and ‘How do you configure a RAID level for SQL Server?’ In this post, I start with the basic concepts of RAID.

What is RAID?

RAID stands for Redundant Array of Independent (or Inexpensive) Disks. It combines multiple disk drives into a large, high-performance logical disk, which allows the same data to be stored redundantly in different places across the physical disks. As a result, depending on the RAID level, it can increase fault tolerance, data availability, system reliability and I/O performance.

What are RAID levels?

The common RAID levels used for SQL Server are RAID 0, 1, 5, 1+0 and 0+1.

RAID 0: Blocks Striped without Fault Tolerance

In a RAID 0 array, data is spread evenly across multiple disks with no redundancy. A minimum of two physical disks is required.

It has the lowest cost of all RAID arrays and provides superior I/O performance. However, because there is only one copy of the data, if one disk fails, all the data will be lost. Avoiding RAID 0 for SQL Server, even for tempdb, is always recommended.
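
To make the striping idea concrete, here is a minimal Python sketch of how logical blocks might be dealt out round-robin across disks. The function name stripe_blocks and the block labels are made up for illustration; a real controller works on fixed-size stripe units rather than Python lists.

```python
def stripe_blocks(blocks, num_disks):
    """Distribute logical blocks round-robin across disks (RAID 0 style)."""
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least 2 disks")
    disks = [[] for _ in range(num_disks)]
    for index, block in enumerate(blocks):
        # Block i lands on disk (i mod num_disks); no copy is kept elsewhere.
        disks[index % num_disks].append(block)
    return disks


layout = stripe_blocks(["A", "B", "C", "D", "E", "F"], 2)
print(layout)  # [['A', 'C', 'E'], ['B', 'D', 'F']]
```

Because each block lives on exactly one disk, losing either disk in this sketch removes every second block, which is why the whole data set becomes unusable.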

RAID 1: Mirroring and Duplexing

In RAID 1, data is written identically to at least two disks, so you need a minimum of two disks for this configuration. Read performance is improved, since the disks can be read independently at the same time. Write performance is roughly the same as for single-disk storage. The array continues to operate as long as at least one drive is functioning. Therefore, it offers strong fault tolerance together with good read performance in multi-user systems, such as online systems.

RAID 5: Blocks Striped with Distributed Parity

In this array, both data and parity information are spread across all the disks. The parity information can be used to serve subsequent reads and to re-create the data from the faulty disk, which allows operations to continue if only a single disk fails. The minimum number of disks required for RAID 5 is three.

RAID 5 provides good read performance and good redundancy, because data is distributed over all of the disks. However, heavy write workloads incur overhead, because additional parity data has to be calculated and written to the array.
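
The parity idea can be sketched with XOR, which is the usual way RAID 5 parity is computed. The Python below is only an illustration for a single stripe. The helper name xor_blocks and the sample byte strings are made up, and a real array also rotates the parity block across the disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus one parity block (four disks in total).
data_blocks = [b"SQL1", b"SQL2", b"SQL3"]
parity = xor_blocks(data_blocks)

# Simulate losing the disk that held the second block, then rebuild it
# from the surviving data blocks and the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_blocks[1])  # True
```

The same calculation also shows where the write overhead comes from: every change to a data block means the parity block for that stripe has to be recomputed and rewritten as well.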

RAID 1+0: Blocks Mirrored + Blocks Striped

In this array, the data is mirrored and the mirrors are striped. As a result, RAID 1+0 combines high performance with increased fault tolerance. It can continue operating after multiple disk failures, as long as no mirrored pair loses both of its disks. For example, with disks 1 and 2 forming one mirrored pair and disks 3 and 4 forming the other, the array is still operational if disk 1 and disk 3 both fail. However, the array fails when both disk 1 and disk 2, or both disk 3 and disk 4, become faulty at the same time. A minimum of four disks is required, which increases the cost.
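
The failure rule for RAID 1+0 can be written as a small check. This is a sketch that assumes a four-disk layout in which disks 1 and 2 form one mirrored pair and disks 3 and 4 form the other. The names MIRROR_PAIRS and raid10_survives are made up for illustration.

```python
# Assumed layout: two mirrored pairs, striped together.
MIRROR_PAIRS = [{1, 2}, {3, 4}]


def raid10_survives(failed_disks):
    """The array survives while every mirrored pair keeps one working disk."""
    failed = set(failed_disks)
    # A pair is lost only when all of its disks are in the failed set.
    return all(not pair <= failed for pair in MIRROR_PAIRS)


print(raid10_survives({1, 3}))  # True: one disk lost from each pair
print(raid10_survives({1, 2}))  # False: a whole mirrored pair is gone
```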

RAID 0+1: Blocks Striped + Blocks Mirrored

It is also a combination of RAID 0 and RAID 1, but the layout differs from RAID 1+0. In RAID 0+1, data is striped across multiple disks, and then the striped disk set is mirrored. A single drive failure takes its whole RAID 0 stripe set (2 drives) offline, and one drive failure in each stripe set fails the entire array. For example, if disk 1 fails, disk 2 can no longer be used, and operations continue on the remaining stripe set at RAID 0 level without fault tolerance. In this case, if either disk 3 or disk 4 fails before disk 1 is replaced and the mirror is working again, the system will no longer be functional. Therefore, RAID 0+1 is less resilient than RAID 1+0. It also requires a minimum of four disks.
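
The RAID 0+1 failure rule can be sketched in the same spirit. The Python below assumes a four-disk layout in which disks 1 and 2 form one stripe set and disks 3 and 4 form its mirror. The names STRIPE_SETS and raid01_survives are made up for illustration.

```python
# Assumed layout: two RAID 0 stripe sets that mirror each other.
STRIPE_SETS = [{1, 2}, {3, 4}]


def raid01_survives(failed_disks):
    """The array survives while at least one stripe set has no failed disk."""
    failed = set(failed_disks)
    # A stripe set is usable only if none of its disks has failed.
    return any(not (stripe & failed) for stripe in STRIPE_SETS)


print(raid01_survives({1}))     # True: running on disks 3 and 4 only
print(raid01_survives({1, 3}))  # False: both stripe sets are broken
print(raid01_survives({1, 4}))  # False: both stripe sets are broken
```

With this layout, any second failure on the surviving stripe set brings the array down, which is the reason it is considered less resilient.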

Having covered the basic concepts of the RAID levels and the advantages and disadvantages of each, the next question is how to configure the RAID level for SQL Server in order to achieve the best performance. I will talk about this in the next post.

Wednesday, 21 September 2011
