Posts Tagged ‘RAID’

Functional Capacity

December 12th, 2012


It’s my job to size storage properly. How do I do that? Well, with a significant amount of faith in the vendors! Clearly we need an understanding of the existing footprint: not just provisioned storage, but backup routines, actual used storage, database white-space, change rates, data types and so on. It can actually be fairly complex if you need a detailed and precise sizing done. But let’s be honest, most customers haven’t the time or inclination to pay for this. They’ll guess at a large growth figure (10% YoY or maybe 50% over 3 years), then we factor in a little contingency space, and that gives us our functional capacity requirement, which should have enough headroom to allow for any inconsistencies. Sounds like guesswork? A certain amount is, yes :-)
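To make the "growth figure plus contingency" arithmetic concrete, here is a rough sketch in Python. The function name, the 100 TB starting point and the 20% contingency are all made up for illustration; only the 10% YoY over 3 years comes from the paragraph above, and it assumes growth compounds annually.

```python
def functional_requirement_tb(used_tb, annual_growth, years, contingency):
    """Project a capacity requirement from today's used storage.

    Assumes growth compounds once a year, then adds a flat
    contingency percentage on top of the grown figure.
    """
    grown = used_tb * (1 + annual_growth) ** years
    return grown * (1 + contingency)


# Hypothetical example: 100 TB used today, 10% YoY for 3 years, 20% contingency
requirement = functional_requirement_tb(100, 0.10, 3, 0.20)
print(f"{requirement:.2f} TB")
```

Note that 10% YoY compounds to roughly 33% over 3 years, so the two guesses in the paragraph (10% YoY versus 50% over 3 years) give noticeably different answers.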

So why am I saying functional capacity when most vendors sell us on usable capacity? Because usable capacity often doesn’t factor in any technology. If I want snapshots, that needs capacity. Replication? A bit more. Clones? Obviously a bit more still. Then it gets more complicated: what RAID configuration, 10, 5 or 6? What size RAID groups? Most people will take the vendor’s recommendations here, but do you trust the vendor? If it’s a competitive solution, are you getting a good discount, or is the configuration changing to optimise the functional capacity without impacting their margin too much? Then there are the efficiencies: tiering (what’s your working data set?), deduplication (how much data commonality do you have?), thin provisioning (how much white space or over-provisioned storage?). Also the data split: how many of your required IOPS should be served from SAS, how much is near-line, how much flash do you need to accelerate the lot?
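As a rough illustration of how RAID type, RAID group size and feature reserves eat into raw capacity, here is a simplified Python sketch. It is not any vendor's sizing method: it ignores hot spares, disk right-sizing and filesystem overhead, counts only whole RAID groups, and the reserve percentages are hypothetical placeholders.

```python
# Parity disks lost per RAID group for the parity-based levels
PARITY_DISKS = {"5": 1, "6": 2}


def usable_tb(disk_count, disk_tb, raid_level, group_size):
    """Capacity left after RAID overhead (simplified model)."""
    if raid_level == "10":
        # Mirroring: half of the disks hold copies
        return (disk_count // 2) * disk_tb
    parity = PARITY_DISKS[raid_level]
    groups = disk_count // group_size  # leftover disks are ignored here
    return groups * (group_size - parity) * disk_tb


def functional_tb(usable, snapshot_reserve=0.0,
                  replication_reserve=0.0, clone_reserve=0.0):
    """Capacity left for data after setting aside feature reserves."""
    reserved = snapshot_reserve + replication_reserve + clone_reserve
    return usable * (1 - reserved)


# Hypothetical shelf: 48 x 1 TB disks
for level, group in (("10", 2), ("5", 8), ("6", 16)):
    usable = usable_tb(48, 1.0, level, group)
    print(level, group, usable, functional_tb(usable, snapshot_reserve=0.2))
```

In this model RAID 5 with groups of 8 and RAID 6 with groups of 16 give the same capacity from the same 48 disks, which is exactly why the RAID group size question matters when comparing quotes.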

[Image: NetApp Capacity Planner v2, SAS]

However, it’s incredibly difficult (sometimes impossible) to estimate functional capacity. What is your change rate? Say you have a NetApp today and the change rate is 2% weekly. How much of that is deduplicated? How will that change when you move to a different storage technology with a different stripe size and RAID configuration? You use clones today: what is the write and change rate of those clones? What is the impact of moving from one RAID size / type to another? This is all very difficult to be precise about. Sure, experience will help us become almost clairvoyant and predict what you require with relative accuracy, but it isn’t always right. This is why there is always significant headroom in any storage solution. It is also why a lot of the vendor-quoted storage efficiencies can be difficult to prove, as you’ll bake in a lot of headroom. No doubt this will allow an innovative group of sales people in a few years to come back and guarantee you x% efficiency over your existing solution.
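For what it’s worth, the naive first-pass estimate of snapshot space from a change rate looks something like the sketch below. It assumes every retained week holds a full week of unique changed blocks with no deduplication, which is exactly the sort of assumption that falls apart in practice. The function name and the 50 TB / 8 week figures are hypothetical; the 2% weekly change rate is the one from the paragraph above.

```python
def snapshot_space_tb(used_tb, weekly_change_rate, weeks_retained):
    """Naive snapshot reserve: changed data per week times weeks kept.

    Assumes no overlap between weeks and no deduplication, so treat
    the result as a rough upper-end guess rather than a prediction.
    """
    return used_tb * weekly_change_rate * weeks_retained


# Hypothetical example: 50 TB used, 2% weekly change, 8 weeks of snapshots
print(snapshot_space_tb(50, 0.02, 8), "TB")
```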
This is also a very good reason why capacity-on-demand is becoming so popular! Predictable cost (even at an uplift) and additional capacity that is always available. There’s nothing worse than explaining that the system went down because you ran out of disk space! It also means none of us have to spend too much time (and money) on a detailed capacity planning report; we can make some very educated assumptions and size against those. It’s only if you need specific guarantees or performance that you need to go into a bit more detail analysing the existing estate, and even that is tricky, especially if there’s a bottleneck today. I’m really hoping that with flash technology we will be sizing against capacity only, as performance is very difficult to size accurately against (more so than functional capacity!).


RAID Atomicity

March 27th, 2011

As you do, I was reading up on RAID levels while in the bath. The topic of atomicity came up, and it’s something I wanted to share.

Not usually the most reliable source of technical data, but I’ll quote Wikipedia to help explain atomicity to set the stage. Taken from under the section of “Problems with RAID”…

This is a little understood and rarely mentioned failure mode for redundant storage systems that do not utilize transactional features. Database researcher Jim Gray wrote “Update in Place is a Poison Apple”[28] during the early days of relational database commercialization. However, this warning largely went unheeded and fell by the wayside upon the advent of RAID, which many software engineers mistook as solving all data storage integrity and reliability problems. Many software programs update a storage object “in-place”; that is, they write a new version of the object on to the same disk addresses as the old version of the object. While the software may also log some delta information elsewhere, it expects the storage to present “atomic write semantics,” meaning that the write of the data either occurred in its entirety or did not occur at all.

This has come back to light recently, under a different guise, with SSD write failure problems. Many SSD manufacturers and enterprise storage vendors are addressing this with new firmware that writes all data sequentially: a data block is never overwritten until the whole disk has been written, at which point writing starts again from the beginning, reusing only blocks that have already been freed up.
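The "either in its entirety or not at all" behaviour from the quote can be approximated at the file level by never updating in place: write the new version somewhere else, then swap it in. The Python sketch below is one common way to do that, not what any RAID or SSD firmware actually does. The `atomic_write` helper is a hypothetical name, and it relies on `os.replace` being an atomic rename when the temporary file sits on the same filesystem as the target.

```python
import os
import tempfile


def atomic_write(path, data: bytes):
    """Replace the file at `path` with `data` without updating in place."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename
    # stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the new copy to disk first
        # Readers see either the old file or the new one, never a mix
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, discard the partial copy; the old file is untouched
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process dies part-way through the write, the old version is still intact, which is the property an in-place update cannot give you.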


This site is not affiliated with or sponsored in any way by NetApp or any other company mentioned within.