Archive

Posts Tagged ‘Classification’

Data Protection Management from ‘Nice to Have’ to ‘Need to Have’

December 15th, 2009 Steve Kenniston No comments

Data protection management has come a long way in the past decade.  More importantly, the features and functionality that are in products these days, and that customers have come to expect, are no longer ‘nice to have’ features in the data center; they are ‘need to have’ features.

Additionally, the term ‘data protection’ is morphing every day and has different meanings to different people.  Questions like ‘is replication data protection?’ or ‘is archive data protection?’ or ‘is DR / BC a function of protection?’ are now common in IT circles.  Each is, in its own right, a methodology for protecting information or has some part to play in the grand scheme of data protection.  The reality is that, as with most questions in IT, the answer is ‘it depends’.  Data protection has many different definitions, which expands the scope of what it actually is and, more importantly, of how it is managed cost effectively across the whole environment.

It is in this expanding scope of data protection that data protection management tools come into play, and the more flexible and granular the tool, the more effective it is.  It is hard to have good data protection capabilities without having insight into the environment.  First, understanding what type of data lives in the environment, where it is, how it is used, and some characteristics such as its age or its access frequency helps to determine how best to protect the information.  This is where a data protection management tool that provides some insight into the file system adds a great deal of value.
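As a rough illustration of that kind of file system insight, here is a minimal Python sketch that walks a directory tree and groups files by extension and by how recently they were used. The bucket names and day thresholds are my own assumptions for the example, not figures from any product, and access times are only as reliable as the file system makes them.

```python
import os
import time
from collections import Counter

# Hypothetical age buckets in days; the thresholds are illustrative only.
AGE_BUCKETS = [(30, "active"), (365, "aging")]


def bucket_for_age(age_days):
    """Return the label of the first bucket whose limit covers age_days."""
    for limit, label in AGE_BUCKETS:
        if age_days <= limit:
            return label
    return "stale"


def summarize_tree(root, now=None):
    """Count files and bytes per (extension, age bucket) under root."""
    now = time.time() if now is None else now
    counts = Counter()
    sizes = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Treat the later of access time and modification time as "last used".
            last_used = max(st.st_atime, st.st_mtime)
            age_days = max(0.0, (now - last_used) / 86400)
            ext = os.path.splitext(name)[1].lower() or "(none)"
            key = (ext, bucket_for_age(age_days))
            counts[key] += 1
            sizes[key] += st.st_size
    return counts, sizes
```

Calling `summarize_tree("/some/share")` (the path is a placeholder) gives two counters that can be turned into a simple report of how much data of each type is active, aging or stale, which is the raw material for deciding what to back up, what to archive and what to question.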

Next, if archive is a part of data protection (and I would argue that a functional archive, when used properly, is), then a data protection management tool that provides insight into the data in the archive can also help manage the overall protection process within the greater environment.  Knowing whether the data in the archive is actually being accessed, or whether it can be deleted (unless it is stored for compliance purposes), can help to control archive costs.

If replication is a part of the overall data protection scheme, a data protection management tool that provides insight into this process can also add a great deal of value.  Identifying whether links are up, whether data is moving between sites, and whether the data at the remote site is available, accessible and meets the recovery point objectives can ease concerns about recoverability in the event of a disaster.
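The recovery point check itself is simple arithmetic, and a small Python sketch makes it concrete. The function name, its parameters and the idea of a "last consistent copy" timestamp are assumptions made for illustration; a real tool would pull these values from the replication product.

```python
from datetime import datetime, timedelta, timezone


def check_replication(link_up, last_consistent_copy, rpo, now=None):
    """Compare replication lag at the remote site against the RPO.

    link_up: whether the replication link is currently up.
    last_consistent_copy: timezone-aware time of the newest usable remote copy.
    rpo: recovery point objective as a timedelta.
    """
    now = now or datetime.now(timezone.utc)
    lag = now - last_consistent_copy
    return {
        "link_up": link_up,
        "lag": lag,
        "within_rpo": lag <= rpo,
        # Flag for attention if the link is down or the RPO is already missed.
        "needs_attention": (not link_up) or lag > rpo,
    }
```

For example, a copy that is 30 minutes old against a one hour RPO is within objective, while the same copy with the link down is still flagged, because the lag can only grow until the link returns.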

And finally, a data protection management tool should provide as much information as possible, such as deduplication rates, tape growth and disk growth (in disk-based backup targets, including deduplication targets), as well as true analytics into the backup environment to help decide when to switch from a tape-based solution to a disk-based solution.  These analytics need to be in-depth enough to show what happens if some data that is protected with traditional backup technologies is moved to a next-generation solution such as source-based deduplication: what effect will it have on the overall backup environment, will it help to better control costs, and will it help to improve SLAs?
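Two of the simplest numbers behind such analytics can be sketched in a few lines of Python. The function names are hypothetical, and the growth estimate assumes constant linear growth, which real environments rarely show, so treat it as a starting point rather than a forecast.

```python
def dedup_ratio(logical_bytes, stored_bytes):
    """Ratio of data protected to data actually stored, e.g. 10.0 means 10:1."""
    if stored_bytes <= 0:
        raise ValueError("stored_bytes must be positive")
    return logical_bytes / stored_bytes


def space_saved_pct(logical_bytes, stored_bytes):
    """Percentage of capacity saved by deduplication."""
    if logical_bytes <= 0:
        raise ValueError("logical_bytes must be positive")
    return 100.0 * (logical_bytes - stored_bytes) / logical_bytes


def months_until_full(capacity, used, growth_per_month):
    """Months of headroom left on a backup target, assuming linear growth."""
    if growth_per_month <= 0:
        raise ValueError("growth_per_month must be positive")
    return max(0.0, (capacity - used) / growth_per_month)
```

Protecting 1000 units of data while storing 100 gives a 10:1 ratio, or 90 percent saved; a target with 100 units of capacity, 40 used and 10 added each month has six months of headroom under the linear assumption.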

At a higher level, customers are telling me that they no longer want to manage backup; they just want it to work, and they want proof that it is working.  As customers move to a more virtualized IT infrastructure, they find that they are being forced to rearchitect their data protection environment, and they are now looking to solutions that elevate the process.  IT is looking for tools to make their environment “data protection aware.”  As virtual machines are added to the environment, IT wants them to be protected automatically and wants to be notified if they are not, so that it can mitigate any risk.  And let’s face it, backup is all about risk mitigation.  Backup is insurance.  Wouldn’t it be nice if your insurance company had deeper insight into all the cars and drivers in your family, told you each month when your teenager was speeding, and warned you that your premiums would go up if they didn’t start driving the speed limit, before they got the ticket and your premiums increased?
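The “data protection aware” check for virtual machines boils down to comparing two lists. A minimal Python sketch, assuming you can export the VM inventory and the set of VMs covered by a backup policy as plain names (both inputs and the function name are hypothetical):

```python
def unprotected_vms(inventory, protected):
    """Return VMs that exist in the inventory but have no backup coverage."""
    return sorted(set(inventory) - set(protected))


if __name__ == "__main__":
    # Illustrative names only.
    missing = unprotected_vms(["vm-a", "vm-b", "vm-c"], ["vm-b"])
    for name in missing:
        print(f"WARNING: {name} is not protected by any backup policy")
```

Run on a schedule, a check like this is the notification IT is asking for: the risk shows up in a report before it shows up as a failed restore.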

Any tool that IT invests in for a common process, data protection in this case, needs to be flexible enough to allow IT to manage as much of the overall process as possible from a single pane of glass.  Good data protection management tools need to give IT as much visibility into the overall data protection environment as possible, in order to help make good decisions about which data protection technologies to invest in so that IT can meet its overall SLAs and hence its business objectives.

There is no sense spending a great deal of money on rearchitecting a backup environment if there is no insight into the success of the new architecture.  Sooner or later, management needs to have the pretty graphs that prove to someone that the right decisions are being made when it comes to protecting information, how much is spent on data protection, and whether the SLAs can be met.  Not having a good data protection management tool, and spending too much on new data protection architectures while not meeting your SLAs, could lead to an RGE (resume generating event).  Data protection management tools today are a need to have, not a nice to have.  Make the investment and put your data protection environment back on the Road to Recovery.

A Data Protection Reference Architecture – The Final Chapter

September 1st, 2009 Steve Kenniston 2 comments

The Architecture

This ‘architecture’ diagram, as you can see, is not a typical architecture diagram, but hopefully it can be used to align your business and business objectives with the technologies that are available and can best be applied to solve your issues, helping to balance cost, complexity and compliance.

This diagram can also be used to do a couple of other things.  It can help you begin to classify your data and align your data to your business objectives.  It also lets you begin to identify which data or data services in your environment may be more important to you than others and, based on this, helps you choose areas you may want to outsource or move to the cloud.

As you can tell, there really is not one solution for meeting all your data protection needs.  The challenge comes with managing multiple solutions in an effort to meet your business objectives.  While there are only a few technologies available that allow you to manage your environment across all your RPOs and RTOs, it is important that I point out that EMC’s NetWorker is able to do this, centralizing your data protection infrastructure for ease of management.  It allows you to manage traditional backup, source-based deduplicated backup with Avamar, and CDP with RecoverPoint, as well as the EMC disk libraries and tape where the data is stored.  Now, I am not saying that NetWorker solves all of your data protection challenges, nor am I suggesting that replacing one traditional backup technology with another is the right answer.  What I am saying is that if you’re looking to have all the features and functionality required to meet all your business objectives and you want easier management, NetWorker is one avenue to get you there.  Additionally, the underlying image of the triangle represents data protection management.  Putting all the new technology in place is one thing; managing it and ensuring you are now meeting your business needs is another.  EMC’s Data Protection Advisor can help here as well.

This diagram can help customers lay out a new, better data protection schema for their environment and start thinking about data protection a bit more strategically rather than tactically.  It can also help vendors speak to customers about how they should look at their environment in order to identify specific challenges and the means they need to alleviate these challenges, taking backup beyond.

A Data Protection Reference Architecture – Part 1

August 14th, 2009 Steve Kenniston No comments

This blog will have multiple parts.  I will introduce my view of a data protection reference architecture and the next few blog posts will talk to components of that architecture.

The other day I had a very interesting conversation with a colleague of mine in Australia.  He was looking for a data protection reference architecture that he could use to speak to his customer.  As you can imagine, having this conversation over the phone could prove to be difficult.  When the conversation began, my fear was that he was looking for an ‘architecture’ diagram that included data protection appliances, backup servers, disk libraries, tape libraries and backup agents.  I quickly realized that this is an impossible conversation to have with him without knowing:

A)     the customer’s environment or challenges

B)      the customer’s business objectives

I find that most vendors don’t know A or B when speaking to a customer about their data protection ‘issues’, but they really should.  Having a more thoughtful, consultative conversation with customers is more useful for understanding their challenges and for aligning those challenges to the best possible solution.

I started my conversation with the diagram shown below (Figure 1): a simple triangle divided horizontally into 4 segments, with the middle two segments divided vertically in half.  Each segment represents different business objectives within a company.  As you go around the triangle, you can see that there are different technologies and different methodologies for attacking data protection challenges, which is why there is no longer a “one size fits all” approach when it comes to protecting data today.  Let’s face it: the two most important commodities in backup are time and capacity.  One of the primary drivers behind the type of protection that is used is the Recovery Point Objective, or RPO.  Different technologies provide different RPOs, each at a different price point, and there are different processes that can be applied to meet those RPOs.

Figure 1

Having a conversation specific to this diagram can have a tremendous amount of value on a number of fronts, including aligning technology needs with business objectives, highlighting critical pain points, and beginning a roadmap that helps implement data protection technology based on business needs and budget and puts you on the Road to Recovery.

The next post will cover the foundation of the triangle – Archive.

Information Classification – IT’s Hardest Job

April 16th, 2009 Steve Kenniston No comments

I have decided that information today is like a group of friends. If you look at my LinkedIn page or my Facebook page you see that I have over 600 connections and over 180 friends respectively. What does this really mean? Obviously I don’t stay in touch with all of these people. So why do we have these connections? I think it is because we believe that in the future, each one of these connections will offer some kind of value to us. They may be a friend to us, they may share common experiences that help us through a personal issue, or they may help us find a mate or even a job. We just don’t know, so we hang on to the connection.

This is not unlike information. We are all tired of hearing that “data is growing at an exponential rate” but we never look at why. It is simple. We believe that ‘someday’ we may need that ‘valuable’ piece of content, so we had better not delete it. More importantly, the people who are accountable for managing that data (IT) are usually one step removed from the ‘value’ discussion, so rather than delete anything and be responsible for “losing data” they save and protect everything.

Recently I spent 4 hours on my Facebook page ‘categorizing’ my friends. I created a number of categories: friends from high school, friends from college, colleagues from work (current), colleagues from work (past), industry connections and relatives. As you can imagine, there are some friends that belong in more than one category, so how do I choose which one they should go in? Also, what happens if I change jobs? Where do the ‘colleagues (work)’ friends go? When do I move them? Do I remember to move them?

I have often said when presenting to customers, “EMC can help you with all aspects of your data except for one thing. EMC will never know the value of a piece of your content to you. You have to tell us, and then we can manage it properly.” Typically when customers hear that statement, they agree, but they also agree that the process of classifying data is a daunting task. You can see the challenge of just organizing friends in Facebook. There are so many permutations of how data can be classified that IT chooses the path of least resistance: store and protect everything.

While storing and protecting everything is easy, it also hits at the three biggest challenges IT is faced with: cost, complexity and compliance. These three are the toughest to balance because not only are they important in their own right, they are also interdependent. As data grows, it becomes harder to protect, which means IT either needs to spend more money or be out of compliance.

The cycle is only broken when new processes are introduced. These processes are a part of a key message when it comes to data protection: assess (classify), archive, backup, manage. Only when customers accept that keeping cost, complexity and compliance in check requires a new process can the cycle be broken. Once new processes are in place, the data center can become more efficient.

Consider this analogy: In July 1936 Henry Phillips received a patent on a new type of screw and screwdriver he had invented. This new “technology” changed the world of mass production and machine repair.

He didn’t set out to make life easier for people using hand tools; he was trying to solve an industrial problem. The new screw and screwdriver were designed for use with power tools and, more specifically, power tools on an assembly line.

The slot in the screw lets the tool seat itself automatically when contact is made, which saves a second or two per screw. If you have hundreds or thousands of screws, as in cars or airplanes, that saves a great deal of time.

In 1938 Henry was able to get the American Screw Company to spend $500,000 to develop a manufacturing process around the new screw. By 1940 nearly all of the American manufacturers had switched to the new process and the new screws. It made the assembly of military aircraft and jeeps much more efficient. Having these vehicles made faster and more efficiently contributed to a competitive advantage.

So, it’s like I say when talking to customers: “The hardest thing to change in the data center is not technology, it is process.” Once the psychological inertia of dealing with a new process is overcome, progress can be made.

Once customers start to classify their information (assign value to it), they can begin to archive their ‘old’ data.  This will still provide them access to it, just not as quickly. Once this data is removed from the backup stream, backups will run much more efficiently. Additionally, deploying new technologies such as deduplication for specific data types (identified during a proper classification effort) allows IT to back up those data types in specific areas more efficiently and at much lower cost. Now that all the work has gone into establishing a new set of processes, IT will want to continue to manage them to ensure that all the hard work has tangible business results. New processes can help IT attack cost, complexity and compliance, but it all starts with information classification.

Posted by Steve Kenniston
