Hot Spindles

March 11th, 2010

Excuse my absence, both in person and in posts. It’s been a roller-coaster year of personal injury and flat-out work schedules, so I have had little time or motivation to blog or show my face around the communities. My apologies; I am determined to break this habit and get back into things once again! But enough of the chatter, on with the writing…

This isn’t something I see very often, but when I do, it’s interesting to see the stats speak for themselves. I’m with a customer who had a scripted deployment of their NetApp estate a few years ago, and it wasn’t designed or delivered with much care or attention (something I want to discuss another day). They have a VMware estate with SQL, Exchange and other things. It all runs across more than 100 15k FC spindles. It’s not a huge estate compared with other sites, so I’m intrigued as to why they have such performance issues.

Now when you run “sysstat -u”, you can see that the filer itself is doing very little, quite happily getting on with what it should do. But the disk utilisation figure is hitting 100% quite often. Immediately this points to a disk problem. They need more spindles, obviously?
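To make the CPU-versus-disk comparison concrete, here is a rough Python sketch (not a NetApp tool) that pulls the first and last columns out of captured “sysstat -u” rows. The names parse_sysstat and summarise are my own, and the sketch assumes every data row starts with the CPU percentage and ends with the disk utilisation percentage, as in the capture further down this post. As far as I know, the “Disk util” column reports the busiest disk rather than an average, which is worth keeping in mind when reading the result.

```python
# Rows copied from the "sysstat -u 1" capture in this post.
SAMPLE = """\
 11%    3220  6942  3270   4232      0     0     0    12   95%   0%  -  60%
 11%    2898  7385  4030   4892      0     0     0    11   94%   0%  -  69%
  9%    3547  1820  3496   3920     24     0     0    11   93%   0%  -  89%
  7%    2329  1160  3048   3892      0     0     0    11   93%   0%  -  81%
 10%    3173  2055  4851   4644      8     0     0    11   93%   0%  -  67%
  9%    2491  1860  4547   4568     24     0     0    11   91%   0%  -  98%
  9%    2523  2960  4404   5372      0     0     0    11   90%   0%  -  89%
 14%    5136  8173  4465   3352      0     0     0    11   95%   0%  -  81%
"""


def parse_sysstat(text):
    """Return a list of (cpu_pct, disk_util_pct) tuples, one per data row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: data rows begin and end with a percentage;
        # header lines and blank lines do not, so they are skipped.
        if len(fields) < 2:
            continue
        if not fields[0].endswith("%") or not fields[-1].endswith("%"):
            continue
        rows.append((int(fields[0].rstrip("%")), int(fields[-1].rstrip("%"))))
    return rows


def summarise(rows):
    """Average CPU, average disk utilisation and peak disk utilisation."""
    cpus = [cpu for cpu, _ in rows]
    disks = [disk for _, disk in rows]
    return {
        "avg_cpu": sum(cpus) / len(cpus),
        "avg_disk": sum(disks) / len(disks),
        "peak_disk": max(disks),
    }


if __name__ == "__main__":
    print(summarise(parse_sysstat(SAMPLE)))
```

On the eight rows captured in this post, that works out to 10% average CPU against roughly 79% average disk utilisation, with a 98% peak.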

Firstly, there is an imbalance of spindles. They have a second aggregate on the partner controller that only holds test volumes. I get permission to remove it, then, with the system still live, I re-allocate its disks to the other controller and expand the existing aggregate. This doubles the spindle count, but I know it’s not going to do anything for existing performance (the data won’t automatically redistribute itself!).

If I run “stats show disk:*:disk_busy”, I can see something pretty obvious. A single disk in the entire system is hitting 100%; the rest are not. About 10 other disks are running at 50-60%, and the remaining disks are ticking away at around 20-30%. So what has happened here? NetApp technology should prevent any form of hot spindle in the system.
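Sorting the disks into those three groups by eye gets tedious on a big system, so here is a small Python sketch of how it could be scripted. Treat it as an illustration only: the line format “disk:<instance>:disk_busy:<NN>%” is my assumption about what the command prints, the instance names and percentages in EXAMPLE_OUTPUT are invented rather than taken from the customer’s filer, and the 90% and 50% thresholds are arbitrary values of my own.

```python
# Hypothetical input: names and values are made up for illustration.
# The "disk:<instance>:disk_busy:<NN>%" layout is an assumption.
EXAMPLE_OUTPUT = """\
disk:EXAMPLE:00:01:disk_busy:100%
disk:EXAMPLE:00:02:disk_busy:58%
disk:EXAMPLE:00:03:disk_busy:52%
disk:EXAMPLE:00:04:disk_busy:27%
disk:EXAMPLE:00:05:disk_busy:21%
"""


def parse_disk_busy(text):
    """Map each disk instance name to its busy percentage."""
    busy = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("disk:") or not line.endswith("%"):
            continue
        # Split from the right, because instance names may contain colons.
        head, value = line.rsplit(":", 1)
        parts = head[len("disk:"):].rsplit(":", 1)
        if len(parts) != 2 or parts[1] != "disk_busy":
            continue
        try:
            busy[parts[0]] = int(value.rstrip("%"))
        except ValueError:
            continue  # not a whole-number percentage, skip the line
    return busy


def bucket(busy, hot=90, warm=50):
    """Group disks into hot / warm / cool, busiest first (thresholds are arbitrary)."""
    groups = {"hot": [], "warm": [], "cool": []}
    for name, pct in sorted(busy.items(), key=lambda kv: kv[1], reverse=True):
        if pct >= hot:
            groups["hot"].append(name)
        elif pct >= warm:
            groups["warm"].append(name)
        else:
            groups["cool"].append(name)
    return groups


if __name__ == "__main__":
    for label, names in bucket(parse_disk_busy(EXAMPLE_OUTPUT)).items():
        print(label, len(names), names)
```

A healthy aggregate should put nearly every disk in the same group; a lone entry under “hot” is the pattern described above.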

My theory is this. The filer was racked and stacked out of the box, but the aggregate wasn’t grown (3 disk aggregate, 1 data, 2 parity). Some storage was provisioned and data migrated. They ran out of space, so they grew the aggregate (a little), then copied a bunch more data onto the disks. After all this, they added the rest of the disks. Because data won’t automatically re-allocate on the fly, any data that remains unchanged (as will happen with VM system disks, old Exchange emails, and old data warehousing data) is still sitting on the original spindles, or even the single original spindle, where it was first written.
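The theory above can be played out in a deliberately crude toy model. This is not how WAFL allocates writes: it ignores parity, RAID groups, overwrites and caching, and simply assumes that each new block lands on whichever disk currently holds the least data and that existing blocks never move. The function names and the block counts are made up purely to show the shape of the problem.

```python
def simulate(stages):
    """Toy model. stages is a list of (disks_added, blocks_written) tuples.

    Returns the number of blocks held by each data disk at the end.
    Assumption: a new block goes to the least-filled disk, and blocks
    already written are never moved.
    """
    disks = []
    for added, blocks in stages:
        disks.extend([0] * added)
        if blocks and not disks:
            raise ValueError("cannot write blocks before any disk exists")
        for _ in range(blocks):
            target = min(range(len(disks)), key=disks.__getitem__)
            disks[target] += 1
    return disks


def read_share(disks):
    """Fraction of reads each disk serves if every block is read equally often."""
    total = sum(disks)
    return [blocks / total for blocks in disks]


if __name__ == "__main__":
    layout = simulate([
        (1, 100),  # single data disk, first batch of data migrated
        (3, 60),   # aggregate grown a little, more data copied on
        (12, 0),   # the rest of the disks added, nothing rewritten
    ])
    print(layout)
    print([round(share, 3) for share in read_share(layout)])
```

With these invented numbers the first disk ends up holding 100 of the 160 blocks and so serves 62.5% of evenly spread reads, three disks serve 12.5% each, and the twelve newest disks serve nothing until something is rewritten or reallocated.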

So I’m now looking forward to the weekend. We’ll be upgrading them to Data ONTAP 7.3.2 and I can then run some reallocation scans across the system without impacting snapshot space usage (huge bonus, thank you NetApp!). I’m hoping that this will remove the hot spindle issue. I have some before stats, and I’ll pull out some after stats next week. I’ll update this post accordingly.

Lesson from the story? Set up your storage system COMPLETELY and thoroughly before you start throwing data at it. Don’t get excited about using your new storage toy and throw data on it immediately. I have seen the above scenario on several occasions now, and prior to ONTAP 7.3, it was a pain to fix.

Quick snapshot of the stats output. Keep in mind that across a cluster this will show all disks, so all disk stats are entirely relevant. The busy disks here just don’t add up to the actual number of disks in the system, and you can clearly see the one busy disk.

> sysstat -u 1
 CPU   Total    Net kB/s    Disk kB/s    Tape kB/s Cache Cache  CP  CP Disk
       ops/s    in   out   read  write  read write   age   hit time ty util
 11%    3220  6942  3270   4232      0     0     0    12   95%   0%  -  60%
 11%    2898  7385  4030   4892      0     0     0    11   94%   0%  -  69%
  9%    3547  1820  3496   3920     24     0     0    11   93%   0%  -  89%
  7%    2329  1160  3048   3892      0     0     0    11   93%   0%  -  81%
 10%    3173  2055  4851   4644      8     0     0    11   93%   0%  -  67%
  9%    2491  1860  4547   4568     24     0     0    11   91%   0%  -  98%
  9%    2523  2960  4404   5372      0     0     0    11   90%   0%  -  89%
 14%    5136  8173  4465   3352      0     0     0    11   95%   0%  -  81%

> stats show disk:*:disk_busy
… snip …


… snip …


  1. Ronny
    | #1

    Another important point is that you shouldn’t add only a single disk when you resize the aggregate. If the aggregate is nearly full, most of the new data gets written to the added disk, and performance is really bad!
    My recommendation: create a few large aggregates instead of many little ones. Add disks to the aggregate when usage is over 80%. And yes, use Performance Advisor and Thresholds to monitor your performance!

  2. | #2

    Thanks Chris – some really good tips there! Glad you’re writing again :)

  3. | #3

    Cheers for the feedback, feels good to actually get the chance to write something down again!!!

    And yes, adding single disks is a terrible thing to do. I know someone who buys 1 disk a month because that’s how their budget works. I hate this, and try to get them to store the disks and add them in bulk at the very least. It doesn’t help that their account manager encourages them to do this and calls it storage on demand!!! :( Shocking!!!

  4. rick rhodes
    | #4

    You mention that “run some reallocation scans across the system without impacting snapshot space” as a new feature with 7.3.2. Maybe an idea for another blog entry would be to explain this some more, and why it’s important. I understand (previously) that reallocation would dump all the work into the snapshots, but I’m not aware of the change in 7.3.2 you mention that fixes/changes this.

  5. | #5

    Hopefully I’ll be running through this at the weekend, so I’ll be able to give some real world examples of how this works.

  6. | #6

    Of course you could always slot the new singleton drives into a shelf every month, but leave them idle as spares until you get a full new RAID group’s worth… just don’t tell them that ;-)

  7. Anton
    | #7

    @rick rhodes
    The new reallocation in 7.3.x is physical reallocation (reallocate -p, see the man page). And even if you expand an aggregate with an entire shelf or more, you might still want to do a physical reallocate of all the volumes in the aggregate, even if you don’t have hot disks. That way, you can stripe the data across even more spindles, which will yield higher (read) performance for existing data as well.

  8. | #8

    Actually the manual page says that “reallocate -p” shouldn’t be used to spread data across the disks. It recommends doing reallocate against each volume within the expanded aggregate.

    Not sure what the actual impact of this is, I haven’t had a system to try this on which would see massive improvements.

  9. Erlendur
    | #9


    This is a wonderful post.

    Just a small question: how can I find out which aggregate this disk belongs to?

    I tried disk show, storage show disk and aggr status -r, but couldn’t find it.


  10. | #10

    Unfortunately I’m not 100% sure. It’s on my “to-do list” and I’ve yet to figure out how to translate the long address space the “stats” command gives you into something usable in terms of the actual disk address or location. Sorry this doesn’t help you out much :(

  11. Joe Ropar
  12. | #12

    That’s excellent! Thank you!

  13. Vladimir
    | #13

    I’m curious: what are the signs that running “reallocate” is needed, besides having a disk that is 99% busy?


  14. Vladimir
    | #14

    Ronny :
    And yes, use Performance Advisor and Thresholds to monitor your performance!

    What exactly are you looking for in Perf. monitor? Latency, ops/sec?

  15. | #15

    Hi Vladimir,

    Running “reallocate” is now considered fairly good practice on a variety of LUNs. Anything that will gain a benefit from large sequential reads is a good candidate for a regular scheduled reallocate, but also many different common types of LUNs will benefit anyway.

    Although the NetApp disk subsystem does a very good job of placing data in large chunks and stripes across the disks, it can only do so much, either because a system is very busy or because the disks are very full. Running a reallocate afterwards is post-process, so it can take its time to ensure the data is laid out evenly.

    I would be cautious about running reallocate if the disks are already 99% busy, as reallocate will put a greater load on them while the data is being moved. I’d recommend doing this during a maintenance window, or out-of-hours.


This site is not affiliated with or sponsored in any way by NetApp or any other company mentioned within.