Wednesday 18 November 2015

RAID

I always forget what each RAID level looks like. Here I don't want to repeat the standard information, but rather give some comparisons and consequences of using each level. RAID 2 and 3 are no longer used, and I suppose the same holds for RAID 4. RAID 5 is a suitable solution for databases.

The main difference between RAID 4 and RAID 5 is that RAID 4 can serve many reads at a time (if you don't check the parity on every read) but not many writes, because each write has to update the single dedicated parity disk. RAID 5 distributes the parity across all disks, so it supports both many reads and many writes at the same time (provided the writes do not need to update the parity on the same disk). In RAID 2 and RAID 3 all reads and writes are sequential (only one at a time) because every disk is involved in each I/O.
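To make the write-contention argument above concrete, here is a minimal sketch in Python. It assumes a simple layout that I made up for illustration: in RAID 4 the parity always lives on the last disk, and in RAID 5 it moves one disk to the left with each stripe. Real controllers use various rotation schemes, and the function names are hypothetical.

```python
def parity_disk(level, stripe, n_disks):
    """Index of the disk holding parity for a stripe (assumed simple layout)."""
    if level == 4:
        return n_disks - 1                       # dedicated parity disk
    if level == 5:
        return (n_disks - 1 - stripe) % n_disks  # parity rotates per stripe
    raise ValueError("only RAID 4 and RAID 5 are modelled")


def disks_touched_by_write(level, stripe, data_index, n_disks):
    """A small write updates one data disk plus the stripe's parity disk."""
    p = parity_disk(level, stripe, n_disks)
    # Data blocks occupy the non-parity disks of the stripe, in order.
    data_disks = [d for d in range(n_disks) if d != p]
    return {data_disks[data_index], p}


def can_run_in_parallel(level, write_a, write_b, n_disks):
    """Two writes can overlap only if they share no disk."""
    a = disks_touched_by_write(level, *write_a, n_disks)
    b = disks_touched_by_write(level, *write_b, n_disks)
    return a.isdisjoint(b)


# Each write is (stripe, data_index) on a 5-disk array.
w1, w2 = (0, 0), (1, 1)
print(can_run_in_parallel(4, w1, w2, 5))  # both need the parity disk
print(can_run_in_parallel(5, w1, w2, 5))  # parity sits on different disks
```

With these two writes the RAID 4 case collides on the parity disk, while the RAID 5 case touches four distinct disks and could proceed in parallel.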

You can use RAID 0, but it has no redundancy: once one of your disks fails, you lose some data and it's difficult to recover.
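A tiny sketch of why a single failure hurts so much in RAID 0, assuming plain round-robin striping with one block per disk per stripe (the function names are hypothetical): a file spread over several blocks loses every block that landed on the failed disk, and nothing else in the array holds a copy.

```python
def raid0_disk_for_block(block, n_disks):
    """Round-robin striping: consecutive blocks go to consecutive disks."""
    return block % n_disks


def lost_blocks(file_blocks, failed_disk, n_disks):
    """Blocks of a file that are gone when one disk fails (no redundancy)."""
    return [b for b in file_blocks
            if raid0_disk_for_block(b, n_disks) == failed_disk]


# A file occupying blocks 0..7 on a 4-disk array, with disk 1 failing.
print(lost_blocks(range(8), failed_disk=1, n_disks=4))
```

Because the holes are scattered through the file rather than sitting at one end, almost every large file on the array ends up damaged.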

A RAID array can have a spare disk (unused for the time being) that automatically replaces a failed disk, which means we don't have to wait for a technician to replace the failed disk manually. This can shorten the recovery time. The only drawback of RAID is its price: RAID 5 for a dozen disks costs about $6000, which is still a high price.
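What the array does when the spare takes over can be sketched as follows. This assumes XOR parity as used by RAID 5, and it models a single stripe with blocks as Python bytes; the helper names are mine, not any controller's API. The missing block is the XOR of all surviving blocks in the stripe, and the result is what gets written to the spare.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Recover the one missing block of a stripe from all the others."""
    return xor_blocks(surviving_blocks)


# One stripe: three data blocks plus their parity.
data = [b"\x01\x02", b"\x0f\x00", b"\xa0\x55"]
parity = xor_blocks(data)

# The disk holding data[1] fails; rebuild its block for the spare.
rebuilt = rebuild_missing([data[0], data[2], parity])
print(rebuilt == data[1])
```

The rebuild has to read every surviving disk for every stripe, which is why starting it immediately on a hot spare, instead of waiting for someone to swap the disk, matters for recovery time.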

Many big companies follow the same philosophy as RAID. In RAID we use many commodity disks to provide fast and reliable storage, which is still keenly priced compared to one huge disk (even taking the RAID price into account). Likewise, the huge IT companies use many commodity machines instead of beefy servers.