Locked / Busy Snapshots

This is one of those annoying situations that can cause things to come crashing down when you least expect it.

Snapshots can become locked for several reasons. The obvious ones are that they are the basis of a SnapMirror or SnapVault update. Although both rely on snapshots, they only actually lock a snapshot during an active transfer; outside of a transfer you can happily delete the snapshots, which will more or less destroy the replication relationship. A lock can also come from a vol copy or, for a very brief period, a snap restore.

Two of the less obvious but more common causes are LUN clones and FlexClones. In principle the two are fairly similar, but in practice they behave very differently.

Put simply, a clone is based on a snapshot: it is a sparse copy that points at the original data blocks held by that snapshot, so it initially uses no additional space. The clone can then be split, but 99 times out of 100 it is used for some sort of verification or reporting and then removed.

The LUN clone is the one that causes us the most problems. As said, it creates a clone based on a snapshot, but the clone is created within the volume and covers only a particular LUN. This is in fact very similar in concept to the new single-file FlexClone available in 7.3. Based on a given snapshot, a new LUN is created within the same volume as the parent LUN. It uses no additional storage, and is great for running verifications, or possibly for testing or even reporting.

The problem arises from the way snapshots and LUN clones interact. A snapshot takes a copy of the file pointer tables of an entire volume. A LUN clone is a thin-provisioned clone of the parent LUN within the same volume. So despite using no actual storage, the LUN clone is captured by any new snapshot taken after the clone is created. This is where the problems start. If we now delete the LUN clone, the clone still exists within those later snapshots, so the snapshot it is based on remains locked. If you aren't quick to notice this, you may end up with a lot of snapshots that all contain the LUN clone and keep the lock in place.

The only way to unlock this snapshot is to delete all the subsequent snapshots in the volume. Only then can you remove the LUN clone once and for all, along with the snapshot it is based on.
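To make the locking chain a little more concrete, here is a minimal sketch in Python. It is a toy model of the behaviour described above, not anything that talks to a filer: the `Volume` class, its method names, and the snapshot and LUN names are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    """Toy model: a volume with live LUN clones and an ordered list of snapshots."""
    # Maps each live LUN clone name to the snapshot that backs it.
    live_clones: dict = field(default_factory=dict)
    # Oldest first; each entry is (snapshot name, LUN clones captured in it).
    snapshots: list = field(default_factory=list)

    def take_snapshot(self, name):
        # A snapshot covers the whole volume, so it captures any LUN clones too.
        self.snapshots.append((name, dict(self.live_clones)))

    def create_lun_clone(self, clone, backing_snapshot):
        self.live_clones[clone] = backing_snapshot

    def destroy_lun_clone(self, clone):
        del self.live_clones[clone]

    def lockers(self, snapshot):
        """Return the live clones and snapshots that still reference `snapshot`."""
        holders = [c for c, base in self.live_clones.items() if base == snapshot]
        for name, captured in self.snapshots:
            if snapshot in captured.values():
                holders.append(name)
        return holders

    def delete_snapshot(self, snapshot):
        holders = self.lockers(snapshot)
        if holders:
            raise RuntimeError(f"{snapshot} is busy, held by {holders}")
        self.snapshots = [(n, c) for n, c in self.snapshots if n != snapshot]


vol = Volume()
vol.take_snapshot("nightly.1")
vol.create_lun_clone("verify_lun", "nightly.1")  # e.g. a verification clone
vol.take_snapshot("nightly.2")                   # captures verify_lun
vol.take_snapshot("nightly.3")                   # captures verify_lun as well
vol.destroy_lun_clone("verify_lun")
print(vol.lockers("nightly.1"))                  # ['nightly.2', 'nightly.3']
```

In this model, deleting the LUN clone is not enough: `nightly.1` only becomes deletable once `nightly.3` and `nightly.2` have been removed, which mirrors the clean-up order described above.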

Unfortunately this is quite a common occurrence with SnapManager verification. Often the verify fails, so the dismount of the cloned disk (LUN) never happens; subsequent snapshots are then taken, and you can easily end up in a locked snapshot scenario. The solution? Either keep a close eye on your verifications, or invest in FlexClone!

So how does FlexClone offer you protection? Well, it creates a clone of the entire FlexVol, and as snapshots are taken per volume, any subsequent snapshots of the parent will not contain any information about this clone. The clone is based on a single snapshot, so if a verification fails, or you forget to destroy a FlexClone, you can get into a situation where scheduled backups are unable to remove this old snapshot. But this is simple to fix: either destroy the FlexClone or split it off. The snapshot is then released, and you are back to normal. A lot smoother than LUN clones!
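The same idea can be sketched for the FlexClone case. Again this is only a toy Python model under the assumptions in the paragraph above (the clone is a separate volume, so parent snapshots never capture it); `ParentVolume` and its methods are hypothetical names, not a NetApp API.

```python
class ParentVolume:
    """Toy model of a FlexVol whose snapshots may back FlexClone volumes."""

    def __init__(self):
        self.snapshots = []    # snapshot names, oldest first
        self.flexclones = {}   # clone volume name -> backing snapshot

    def take_snapshot(self, name):
        # The clone is a separate volume, so nothing about it is captured here.
        self.snapshots.append(name)

    def create_flexclone(self, clone, backing_snapshot):
        self.flexclones[clone] = backing_snapshot

    def release_flexclone(self, clone):
        # Models either destroying the clone or splitting it off from the parent.
        del self.flexclones[clone]

    def is_busy(self, snapshot):
        # Only a clone that still exists (and is not split) holds the lock.
        return snapshot in self.flexclones.values()

    def delete_snapshot(self, snapshot):
        if self.is_busy(snapshot):
            raise RuntimeError(f"{snapshot} is busy")
        self.snapshots.remove(snapshot)


vol = ParentVolume()
vol.take_snapshot("nightly.1")
vol.create_flexclone("verify_vol", "nightly.1")
vol.take_snapshot("nightly.2")          # knows nothing about verify_vol
vol.release_flexclone("verify_vol")     # destroy or split the clone
vol.delete_snapshot("nightly.1")        # succeeds; nightly.2 is untouched
```

Here the later snapshot never enters the picture: releasing the clone is the only step needed before the old snapshot can go.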

Unfortunately I still see LUN clones affecting production environments quite often. My recommendation is always to try to include FlexClone, or at the very least to be very proactive in monitoring your backups. FlexClone offers a lot of other benefits besides this, so I feel it is a worthwhile investment.

