We have an HP DL380 G5 server running CentOS 5.0 on hardware RAID 5. One of the disks has failed and the spare disk has taken over its place, so the array no longer has a spare. I want to add a new spare disk to the array and would appreciate suggestions on how to do that. The machine has 3 hot-pluggable SCSI drives.
I know the utility is hpacucli, but I don't know how to add the disk to the RAID. This is my first time doing this, and the server is live with some critical applications.
Could anyone let me know the steps to add the spare disk, including whether we need to format the disk with the ext3 file system before or after adding it?
The whole point of hardware RAID and hot-swappable disks is that you pull out the failed disk and stick the new one in and you're done.
It's not clear from your question how many drives you've got. If you've really got only 3 drives and one has failed, you should probably put the replacement disk in soon...
Dell hardware RAID solutions that include hot-pluggable backplanes will automatically rebuild when a disk is swapped. I use the RAID utilities to verify that the rebuild was successful. I cannot speak specifically for HP hardware, but I would be shocked if it differed much.
No disk formatting or filesystem creation is necessary.
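For verifying the rebuild with the RAID utility mentioned above, here is a rough Python sketch that only assembles hpacucli command lines so they can be reviewed before anyone runs them on a live box. The slot number 0, array letter A and drive address 1:3 are placeholders, not values from this server, and the exact syntax (especially the "add spares" form) should be checked against "hpacucli help" on the machine itself before use.

```python
import subprocess


def show_config_cmd():
    """List every controller with its arrays, logical drives and physical drives."""
    return ["hpacucli", "ctrl", "all", "show", "config"]


def drive_status_cmd(slot):
    """Show the status of each physical drive on the controller in the given slot."""
    return ["hpacucli", "ctrl", "slot=%d" % slot, "pd", "all", "show", "status"]


def add_spare_cmd(slot, array, drive):
    """Assign a drive as a spare to an array.

    slot, array and drive are assumptions supplied by the caller; read the
    real values from the 'show config' output first.
    """
    return ["hpacucli", "ctrl", "slot=%d" % slot, "array", array, "add", "spares=%s" % drive]


def run(cmd):
    """Run one command and return its text output (needs root and hpacucli installed)."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be checked first.
    for cmd in (show_config_cmd(), drive_status_cmd(0), add_spare_cmd(0, "A", "1:3")):
        print(" ".join(cmd))
```

The two "show" commands are read-only, so they are the safe place to start. If the controller puts the replacement disk back into service by itself, the "add spares" step may never be needed.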
Yes, this is one of those "do my job for me" questions, have some pity :)
I'm at the limit for what I can do with the number of hard drives in a server without spending a substantial amount of money. I have four drives left to configure, and I can either set them up as a RAID 5 and dedicate a hot spare, or a RAID 10 with no hot spare. The size of each will be the same, and the RAID 5 will offer enough performance.
I'm RAID 5 shy, but I also don't like the idea of running without a hot spare. I'm less concerned about degraded performance than about the amount of time the system is without adequate redundancy. The server and drives are under a 13x5 4 hour response contract (although I happen to know that the nearest service provider is at least 2-3 hours away by car in the winter).
I should note that the server also has two RAID 1 arrays which would also be protected by the hot spare. Why don't they make drive cages with 9 bays! Heh.
What is the downtime tolerance of the array? Is it physically close, or in a remote data center? Basically, if you can tolerate it, a cold spare allows you to do RAID10. The spare is sitting close by, but you have to physically do the swap. If that is not an acceptable scenario, then RAID5 with a hot spare is the only answer left.
Since you already have two RAID1 sets that have 1 drive failure tolerance, you really gain nothing by going RAID10 with no hot spare. Your entire array can still only survive a single drive failure.
These comparisons are all relative to each other, assuming the same disks and controller in the array.
Raid5: Good read speed, rotten write speed. Can survive any double disk failure if the failures are far enough apart for the raid to rebuild onto the hot spare in between (i.e. disk fails, raid rebuilds, disk fails, you're okay). If you have simultaneous double disk failures, you're SOL unless one of the failed disks is the hot spare. With a 4 disk set (3 in the array plus the hot spare), half the double disk failures will ruin your day.
Raid6: Good read speed, really rotten write speed. Can survive any double disk failure. Not as commonly implemented as the other raids.
Raid10: Good read and write speeds, can survive any single disk failure, and can survive (in the case of a 4 disk raid) four of the six possible double disk failures, i.e. any pair that is not both halves of the same mirror.
Three-way mirror + hot spare: much less space, can survive any double disk failure, and can survive the failure of up to 3 disks if the failures are far enough apart for the mirror to rebuild once. I'm not sure how many controllers / operating systems support this, but it was a feature I used in Solaris with the MD stuff before ZFS.
There are a couple of issues to worry about when looking at this:
How long does it take to rebuild an array? Sun started developing ZFS when they realized that under some situations, the time to rebuild a raid5 array is greater than the MTBF of the disks in the array, virtually guaranteeing that a disk failure results in an array failure.
Disks from the same manufacturing lot may all have the same flaw (either the pallet was dropped or they put too much glue on the platters when they were making the disks).
The more complex the raid array, the more complex the software on the controller / implementation. I've seen as many raid controllers kill arrays as failed disks kill arrays. I've seen individual disks spin for years and years and years (most do that, in fact). The most reliable system I ever had was a box with redundant nothing that just never had a component failure. I've seen plenty of UPSs and raids and redundant (insert random components) cause failures because they made the system so much more complex that the complexity itself was the source of the failure.
You pays your moneys, you takes your chances... The question is,
Do you feel lucky?
I'd have to disagree with CHopper3. Since there are only 4 drives in this situation your failure capabilities are the same (2 drives) with either scenario, except with raid 10 if you happen to lose the wrong 2 drives then you'll have a real problem. Also there is definitely an added benefit of having a global spare for your other RAIDs as well.
I think it also depends on what you want to put on this RAID. In one situation my customer and I decided to go for two RAID1 arrays. The situation was like this: it was one VMware server with 4 VMs on it. Two of those VMs were considered very important (valuable data and read/write intensive) and the other two less important. So we put one important and one unimportant VM together on one RAID1 and the other two on the other RAID1. Our argument was that in the case of a 2 disk failure there is still a chance that everything keeps working. The worst case is that one RAID array stops working entirely, and then we still have two VMs running.
So, my point is that it also depends on what you want to put on those disks; maybe something else would be the best solution.
What operating system is running on the server? If you have Linux, then you could make two RAID1 arrays and combine them with LVM.
Otherwise I would recommend sticking with the answer from Chris.
Some other things need to be considered too. How big and how fast is each of the drives? 1TB SATA drives could take forever and a day to rebuild onto the hot spare in a RAID5, leaving a large window open to a second drive failure.
You say performance isn't an issue, but I've seen some considerable performance hits going on during a RAID5 rebuild (especially on writes).
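To put rough numbers on that rebuild window, here is a back-of-the-envelope sketch in Python. Everything fed into it is an assumption for illustration: the 30 MB/s sustained rebuild rate, the 500,000 hour MTBF and the idea that disks fail independently at a constant rate are not figures from this thread, and real disks from one manufacturing lot may well fail together.

```python
import math


def rebuild_hours(capacity_gb, rebuild_mb_per_s):
    """Hours needed to rewrite one whole disk at a sustained rebuild rate."""
    seconds = capacity_gb * 1000.0 / rebuild_mb_per_s
    return seconds / 3600.0


def p_second_failure(surviving_disks, window_hours, mtbf_hours):
    """Chance that at least one surviving disk fails inside the window.

    Assumes independent failures at a constant rate of 1/MTBF per disk,
    which is an optimistic simplification.
    """
    return 1.0 - math.exp(-surviving_disks * window_hours / mtbf_hours)


if __name__ == "__main__":
    # Hypothetical inputs: 1 TB disk, 30 MB/s rebuild, 2 surviving RAID 5 members.
    window = rebuild_hours(1000, 30)
    risk = p_second_failure(2, window, 500000)
    print("rebuild window: %.1f hours" % window)
    print("chance of a second failure in that window: %.6f" % risk)
```

The window scales linearly with disk size and inversely with rebuild rate, so a busy array that throttles the rebuild stretches it accordingly. A cold spare adds the time it takes someone to get to the server and swap the disk on top of that.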
R10 every time. That way up to half of your disks can fail (as long as they aren't both halves of the same mirror), whereas with R5+HS you're dead if two die.
Plus R10 should be quicker too.
In RAID 10 with four disks there are six possible double disk failure combinations. Only two of the six will kill the whole array, so you can survive four of the six, or about 67% of double disk failures.
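The counting above is easy to check by brute force. This is a small Python sketch that lists every pair of simultaneously failed disks out of four and tests each pair against two layouts: a 4 disk RAID 10, and a 3 disk RAID 5 plus one idle hot spare. The disk names and which disks are mirrored together are made up for the example.

```python
from itertools import combinations


def raid10_survives(failed, mirror_pairs):
    """RAID 10 survives as long as no mirror pair has lost both of its members."""
    failed = set(failed)
    return all(not set(pair) <= failed for pair in mirror_pairs)


def raid5_spare_survives(failed, spare):
    """Simultaneous failures, before any rebuild has happened.

    RAID 5 tolerates one lost member; the idle spare holds no data,
    so losing it does not count against the array.
    """
    lost_members = [d for d in failed if d != spare]
    return len(lost_members) <= 1


disks = ["d0", "d1", "d2", "d3"]
mirror_pairs = [("d0", "d1"), ("d2", "d3")]  # assumed RAID 10 layout
doubles = list(combinations(disks, 2))       # every possible 2 disk failure

raid10_ok = sum(raid10_survives(f, mirror_pairs) for f in doubles)
raid5_ok = sum(raid5_spare_survives(f, "d3") for f in doubles)

print("double failures possible:", len(doubles))
print("RAID 10 survives:", raid10_ok)
print("RAID 5 + hot spare survives:", raid5_ok)
```

This only covers two disks dying at the same moment. Once the hot spare has finished rebuilding, the RAID 5 set can take another single failure, which the enumeration does not model.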