
Spares FAQ

I’ve seen a few search hits over the past weeks for spares, making spares, removing spares, disable spares and so on, so I thought I’d put down a quick FAQ on the subject. I will add to this as more searches come up, or if people ask specific questions.

How do I make new spares?

Any disk in your system that is owned but has no data on it will be marked as a spare disk. Most systems have disk auto-assign enabled, but you can run “disk assign <disk_id>” to hard-assign it yourself. Be careful to get the <disk_id> correct, as there are ways to override the protection and remove disks from a partner!
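
As a rough sketch (the disk ID 0b.23 and the prompt here are made up, so substitute your own):

    filer> disk show -n        (list disks that have no owner yet)
    filer> disk assign 0b.23   (take ownership of the disk on this controller)
    filer> aggr status -s      (check that it now shows up in the spare pool)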

You cannot remove disks from an aggregate! This is very important, so grow your aggregates with caution: it’s easy to grow, impossible to shrink. If you’ve grown your aggregate too much, you’ll need to destroy it to regain those spares (that’s a bummer!).

If a disk has been moved around, or previously had data on it, you’ll need to zero it. “disk zero spares” does the job; how long it takes depends on the size of the disk, but it’s usually no more than 4 hours even for the largest of disks (1TB at the time of writing).
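
If you want to zero ahead of time rather than letting a later “aggr add” trigger it, something like this does the job (again just a sketch, with an illustrative prompt):

    filer> disk zero spares    (kick off zeroing of any non-zeroed spares in the background)
    filer> aggr status -s      (spares that are being zeroed show their progress here)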

How do I remove spares?

This depends on what you are removing them for. Adding them to an aggregate is just a simple case of “aggr add ….”. To remove them from the system, you can pretty much just hot-pull them. Disable autosupport first though, or you’ll probably get a replacement in the post ;). To swap them across systems, you can use “disk remove_ownership <disk_id>” and then on the newer system “disk assign <disk_id>”, or if you’re lazy, “disk assign all”!
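
To put rough commands behind that swap (filerA, filerB and the disk ID 0b.23 are made-up names, and “disk remove_ownership” sits under advanced privilege, as Rajesh points out in the comments below):

    filerA> options autosupport.enable off   (only if you're about to hot-pull, so nothing gets shipped)
    filerA> priv set advanced
    filerA*> disk remove_ownership 0b.23
    filerA*> priv set admin

    filerB> disk assign 0b.23                (or “disk assign all” if you're feeling lazy)

Remember to turn autosupport back on when you’re done.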

How do I disable spares?

Very simple: you can’t :(. The system needs protection, much the same as the police will stop you for not wearing a seatbelt despite your disregard for your own life. The filer is a policeman! Regardless of your disregard for data protection, or perhaps your need for that extra little bit of space, the filer will complain constantly if you don’t have any hot spares. You will need at least 1 per disk type. That is, if you have FC and SATA, you will need at least one of each. If you have FC 10k and FC 15k, you can get away with only 1 FC 15k spare, as it will slow down to 10k, but you cannot get away with just 1 FC 10k spare, as it won’t be able to keep up with the rest.

Why does the GUI force me to have 2 hot spares?

This is the urban legend I’m told: NetApp engineering discovered that somewhere around 70% of returned disks only had soft errors and could easily be used for another year or so. Don’t quote me on the numbers, but suffice to say it was a lot. A quick low-level format of the disk, a reload of the firmware, and the disk was ready to go back into production!

So NetApp built this functionality into the storage controller. The Disk Maintenance Garage will take a failed disk offline and run some low-level diagnostics to see if it can be recovered. I’d say 50% of the time it can, and it goes back in as a hot spare! This saves everyone time and money. But just in case, the filer will still do a disk rebuild, an autosupport will still be sent, and (providing a valid support contract) a replacement disk will still be shipped. Because the process takes the failed disk offline, you still need 1 hot spare ready in case of a double disk failure. So you need 2 hot spares per disk type.


  1. Rajesh Nair | #1

    A few steps to move spares from one controller to its partner:

    1. priv set advanced
    2. disk remove_ownership
    3. go to the partner node
    4. disk assign
    5. aggr add -d

    If the disk is in a ‘failed’ or broken mode, do the following

    1. Add the disk to the spare pool (type the following command)
    2. disk assign
    3. Follow steps in the previous scenario

  2. | #2

    I don’t believe you can enter partner mode unless the system has failed over, but switching to partner login should work. Thank you very much for the tips though, very useful!

  3. tazinblack | #3

    I’m pretty sure that you can run a filer without a spare.
    Therefore you have to tell your filer that it is OK to have no spare by setting
    raid.min_spare_count to 0 with the command “options raid.min_spare_count 0”.
    Then you can add your spare to an aggregate with “aggr add 1”.
    But you should only use this in small raid groups. The more disks you have in a raid group, the more likely it is that one fails. I use this only on small filers like the 2020 in cluster mode, since there you only have 12 disks for two heads. There I prefer to have one raid_dp aggregate for each head without a spare rather than one raid4 aggregate with a spare for each head.
    So two disks can fail at the same time and you’ll still keep your data.
    With a raid4 aggregate and a spare you’ll need the rebuild to complete before the second disk can fail. If it fails while the rebuild is running, you’ll lose your data.

  4. tazinblack | #4

    …and keep in mind: you can’t remove a disk from an aggregate after you’ve added it.
    So think twice about what you want!

  5. | #5

    You technically can, but I really wouldn’t recommend it. As an absolute minimum you need 1 hot spare, and as a recommended minimum I dislike going below 2. Even for a 2020 I’d still keep at least 1 hot spare; I’d prefer RAID-4 with 1 hot spare over RAID-DP with no hot spares.

  6. abdoozy | #6

    So I accidentally, and kind of stupidly, assigned my spare disk to an aggregate; the disk had failed, and after assigning ownership I added it to the aggregate. This is a 4243 shelf and all 14 disks on the shelf are in the aggregate now. The head that’s connected doesn’t have another aggregate with this disk type in it, although its partner does. Deconstructing this entire aggregate to recover a spare is going to be VERY inconvenient. Is there really *no way* to remove a disk from an aggregate? Can it somehow use its partner’s spare drive?

  7. | #7

    Hi all, D from NetApp here.

    Just came across this today.

    Chris Kranz:
    You technically can, but I really wouldn’t recommend it. As an absolute minimum you need 1 hot spare, and as a recommended minimum I dislike going below 2. Even for a 2020 I’d still keep at least 1 hot spare; I’d prefer RAID-4 with 1 hot spare over RAID-DP with no hot spares.

    Mathematically you’re far better off running RAID-DP with no spare vs RAID4 with spare.


  8. | #8

    Do you have any of the maths to share? :-)

  9. | #9

    ooops :-(

    I’m sorry, there’s really no way to remove this. You can use the partner’s spare (re-assign ownership), but that’ll consume that spare. You need an extra disk from somewhere to have as a hot spare, and you need a minimum of one per controller. It’s a pain, but it’s fundamental to how WAFL and aggregates work, so I’m afraid there’s no clever workaround other than a) buying more disks or b) rebuilding the aggregate.


This site is not affiliated or sponsored in any way by NetApp or any other company mentioned within.