22 Feb 2010

Data Progression? Extents of tiering?

After reading Jon Toigo’s blog this weekend, I had to add more than a “comment”, as a follow-on to the response from a Compellent “architect”.

Jon was addressing, in “Tears of Storage”, a point that is in great controversy in the storage industry and is being blogged about constantly (@storagetexan, @storagewonk, @xiotech, to mention a few): storage controller comparison and the “features” that vendors offer. The “architect’s” response came from a giddy, brainwashed perspective, namely, “this is really cool and everyone should have it!”

The reality of Data Progression, and lower-level tiering in general, is that when they say “data”, it is NOT progressing “data” at all. Underlying storage does not understand “data”; it understands bits and blocks of bits, and as such it protects those, and should do that VERY well first. To protect “data”, or tier “data”, you should understand the full value of that underlying data and be able to “progress” or “regress” it based on its value to the user, not to the storage controllers. Consider a question like “Is it the CEO’s email?” — in some companies that is important and is the only qualifier, not “did I touch this bit?”. This is the “value” a controller cannot put on the data.

This entire story is akin to the outsourcing discussion held at a CIO event in Ohio last year (shameless plug for CIO Practicum). The outcome of that, and of virtually EVERY “should I outsource” discussion, is that you outsource something that has been made basic and can be done more easily and quickly by someone else. Data ILM is not “easy”; it requires a little effort on the part of the data owner (classification and prioritization), and simply ignoring those two points is just throwing your hands up in the air and saying you give up. Application vendors (backup applications, databases, etc.) have those types of capabilities AND they actually know how the data is being used!

In a nutshell, Data Progression is a “feature” that a company “sells”: it requires you to buy a license and more storage, and to let them handle it “seamlessly” in the background (oh, wait, it does take resources to migrate that data? Hmmm). And it only addresses the single point of “when you last touched a block”, which is the least significant part of the equation, but the only one the storage vendor can understand.
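To make the point concrete, here is a minimal sketch of that kind of access-time-only policy. This is not Compellent’s actual implementation; the two-tier model, the one-week threshold, and all names here are my own illustrative assumptions. The point it shows is that a controller which sees only “when was this block last touched” demotes the CEO’s mailbox just as readily as scratch space:

```python
# Hypothetical sketch of block-level tiering driven purely by last-access
# time (NOT any vendor's real algorithm). The "controller" knows nothing
# about business value -- only when each block was last touched.
COLD_AFTER_SECONDS = 7 * 24 * 3600  # assumption: demote after a week untouched

def plan_migrations(blocks, now):
    """blocks: dict of block_id -> (tier, last_access_epoch_seconds).
    Returns the block_ids to demote from the 'fast' tier to a slower one."""
    return [
        block_id
        for block_id, (tier, last_access) in blocks.items()
        if tier == "fast" and now - last_access > COLD_AFTER_SECONDS
    ]

# The CEO's email block is demoted like any other: the policy has no
# notion of value, only of access time.
blocks = {
    "ceo_mailbox_blk": ("fast", 0),           # important, but not touched lately
    "temp_scratch_blk": ("fast", 1_000_000),  # unimportant, but recently touched
}
print(plan_migrations(blocks, now=1_000_001))  # -> ['ceo_mailbox_blk']
```

An application- or owner-driven policy would add a value classification as a second input; the block layer alone cannot supply it.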

Now, when the day of “Intelligent Application-driven Storage” comes into play, that will rock! Think about it: letting the applications drive the operations based on “their” needs. The tide is turning, and the “features” of today will be the 8-tracks of yesterday, passé. But don’t fret, vendors will come out with new “cool” things for you to spend money on!


4 Responses to “Data Progression? Extents of tiering?”


  1. John Dias
    February 24, 2010 at 2:06 am

    Since I’m the “architect” to whom you refer I feel “compelled” to respond to this posting!

    So, let me state that while I don’t disagree with Jon (and by extension you) on the point that we need a mechanism with file system and data content awareness to provide truly seamless, autonomic (hat tip to Rob Peglar) and intelligent handling of information from cradle to grave, I’m not sure why block-level tiering seems so poisonous to you both. Block-level tiering addresses only one aspect of that very complex issue, but the technology does solve a very important and painful problem for storage administrators TODAY.

    We can wait for the perfect solution and continue to pin all data in high performance storage tiers “just in case” – but I believe most stewards of the IT budget are willing to take some steps to reclaim storage and improve efficiency until this cure is discovered.

    • February 24, 2010 at 3:10 am

      John,
      I am sorry if you felt I was being negative about you specifically; it’s just that HSM has existed for years (I remember it at Prime Computer and Silicon Graphics) and has been driven by the applications.
      Those applications have faltered a bit, but if a storage vendor could work with the application, as opposed to independent of the application, could that not be positive?
      Should the applications not have control over the data, if they can?
      Should the owner of the storage not be able to utilize all of the storage they buy?
      With that being said, as I understand Data Progression, you need to have multiple tiers of storage AND empty space on at least one of them to migrate the bytes to.
      In addition, I understand that there are variables set to notify when you exceed certain thresholds as well as hold back space so those features (such as Data Progression) can work.
      I also understand that there is a cost associated with the licensed feature of Data Progression.
      After all of that, are there any negatives to Data Progression? (Beyond actual storage utilization levels, power consumption, true cost of ownership of the total applied solution, data-in-flight resource utilization, and security.)
      I do understand the marketing positives, but am trying to be objective, especially since there are alternatives that can have greater performance implications, higher resource utilization levels, and greater cost efficiency for the end storage owner.

      As an Enterprise Architect myself, I try (operative word, sorry if I fail) to be objective about the entire implementation and how the actual storage best plays into the total solution, from an overall fiscal and architectural perspective.

