Archive for August, 2009

Nick Glasgow Update…

August 28th, 2009 admin No comments

If you all remember, back in May the blogging community kicked off a worldwide search to help a colleague, Nick Glasgow, get a bone marrow transplant that could save him from leukemia.

Earlier this month Nick received a successful marrow transplant and is on the Road to Recovery.

I wanted to keep you all apprised of his progress and let you know you can learn more at http://markfredrickson.wordpress.com/

Thank you to all for your support and kindness.

Categories: EMC Tags:

A Data Protection Reference Architecture – Part 4

August 27th, 2009 Steve Kenniston No comments

Business Critical Applications

The tip of the triangle focuses on the applications (or data) that drive your business.  These are the applications that, should they go down for any length of time, cost you money.  In the event of a ‘disaster’, the recovery of this information needs to be very fast (RTO in minutes) and the data can’t be very ‘old’ when it is recovered (short RPO, less than 24 hours).  Typically, the technologies used for these types of applications are replication (synchronous or asynchronous) or continuous data protection (CDP).  These technologies ensure that recovery at the alternate location is instant (or near instant) and / or give users the ability to pick the point in time they want to recover to, in order to ensure no data loss and to bring up the applications as quickly and accurately as possible.  This category, much like the rest of them, has the same disclaimer: ‘one size (product) does not fit all’.  The value of the data in this tier, and the risk to the business if this data is unavailable, drive the technology and spend in this part of the triangle.  Keep in mind that the right technology (don’t choose CDP if you need an active remote file system) gives you the best recovery (RPO) for your business needs and can keep you on the Road to Recovery.
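To make the tiering a bit more concrete, here is a minimal sketch of how a recovery requirement might be mapped to a tier of the triangle.  The RPO threshold (24 hours) comes from the text above; the function name, the one-hour reading of ‘RTO in minutes’, and the returned labels are assumptions made purely for illustration, not a sizing tool.

```python
from datetime import timedelta

# Assumption: "RTO in minutes" is interpreted here as "under one hour".
FAST_RTO = timedelta(hours=1)
DAILY_RPO = timedelta(hours=24)


def suggest_tier(rto: timedelta, rpo: timedelta) -> str:
    """Return a rough tier label for a workload (illustrative thresholds only)."""
    if rto < FAST_RTO and rpo < DAILY_RPO:
        # Tip of the triangle: very fast recovery, very recent data.
        return "business critical: replication or CDP"
    if rpo <= DAILY_RPO:
        # A 24 hour RPO is what the backup tiers are built around.
        return "fat middle: backup to disk or tape"
    # The posts do not define a tier for looser RPOs, so flag it for review.
    return "review: RPO looser than 24 hours"


print(suggest_tier(timedelta(minutes=5), timedelta(hours=1)))
print(suggest_tier(timedelta(hours=8), timedelta(hours=24)))
```

The point of the sketch is only that the business requirement (RTO and RPO) is the input and the technology is the output, not the other way around.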

Twitter is not a Great Polling Tool – However…

August 25th, 2009 Steve Kenniston 1 comment

Last week I crossed 500 Twitter followers.  I thought I would test out my followers to see how many thoughtful responses I would get if I posed a real question about ‘Data Protection’.  So the question was:

What do you think of when you think about ‘Data Protection’?

I got 3 replies – less than 1% – wow!  I would have thought people who follow you would be more engaged in back and forth dialog versus just talking about what they are doing.  Anyway, here are the 3 thoughtful responses:

@wcpreston – @skenniston To me DP includes backup (traditional, CDP, near-CDP, dedupe), archive, DR/BC, & intrusion & spam detection/prevention

@jasoniehl – RT @wcpreston @skenniston To me DP includes backup (traditional, CDP, near-CDP, dedupe), archive, DR/BC, & intrusion & spam detection/preven

@datastorsupport – @skenniston I like knowing not only are the plans running, but data can actually be restored. Trust but verify.

First, thank you all who replied, I do appreciate it.

I am working on a project to figure out the ‘real definition’ of ‘Data Protection’.  I am curious whether people think of data protection as just backup or whether it includes replication, CDP, archive, etc.

I’ve tried looking at how the analysts define it, but there is no consistency.  Back to the drawing board.

If you want to leave a comment about what Data Protection means to you – leave it here.  And if you have any good data protection taxonomies you would like to share, send the info along.

BTW: I am doing this all in an effort to help customers take Backup Beyond!

Categories: Backup Tags:

A Data Protection Reference Architecture – Part 3

August 24th, 2009 Steve Kenniston No comments

The ‘Fat Middle’

In the ‘fat middle’ of the triangle, as I stated last week, there are a number of ways to protect information.  I have chosen to break the middle into two categories.  This is meant to be used as a tool for helping you lay out a strategy, so your boxes could be based on capacity and could end up in different areas of the triangle depending upon your business needs.  The thing to keep in mind is that your environment does not need to match these boxes exactly.  What matters is making sure that all of the critical data that requires backup with a 24 hour RPO is protected; you then align the data value in the box with the technology that 1) best solves the challenge and 2) fits best in your environment.

SMB / ROBO

First, let me clarify my terminology.  ROBO is remote office / branch office and SMB is small to medium business.  If we think about the business needs that are most important in this arena, they are:

1)      Low cost

2)      Simplicity (one tool)

3)      24 hour RPO is adequate

Small and medium businesses, as well as remote offices, need a robust data protection solution that allows them to meet their backup windows and that has the ability to recover data that is not any older than 24 hours (RPO).  The RTO drives whether the backup target is disk or tape.   Faster recoveries come from disk.  Another thing to keep in mind is that there isn’t usually a lot of technical expertise at these sites so the backup application needs to be very simple to manage.
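As a small illustration of what ‘24 hour RPO is adequate’ means in practice, the sketch below checks whether the most recent successful backup is recent enough to meet the RPO.  The function name and the idea of feeding it timestamps from your backup application are assumptions; how you obtain the last-backup time depends entirely on the product you use.

```python
from datetime import datetime, timedelta

RPO = timedelta(hours=24)  # the 24 hour RPO discussed above


def meets_rpo(last_backup: datetime, now: datetime, rpo: timedelta = RPO) -> bool:
    """True if restoring right now would lose no more than `rpo` worth of data."""
    return now - last_backup <= rpo


now = datetime(2009, 8, 24, 9, 0)
print(meets_rpo(datetime(2009, 8, 23, 22, 0), now))  # backup ran last night
print(meets_rpo(datetime(2009, 8, 22, 22, 0), now))  # last night's backup failed
```

A check this simple is the kind of thing a site with little technical expertise can still act on: either last night's backup counts or it does not.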

Backup appliances or appliance-like backup technologies tend to work very well in these environments.  A self-contained, disk-based backup appliance with the ability to replicate efficiently to another site for disaster protection is a great solution for sites like these.

In the case of SMBs, they can take advantage of a single application with integrated disk that could replicate to the cloud for very little cost and management while meeting their data protection objectives.  If cost is a driving factor, and the customer just wants better backup and recovery performance, moving to an appliance-based, capacity-optimized disk solution that could replicate is a viable option.  If the customer does not want to replace their existing backup solution because it is working fine for them, then moving to disk-based backup can help with most performance requirements.  (This is true for the data center as well.)  And when customers really want tape as their backup medium for getting data off-site, the management will be a bit more complex but still easily achievable.

For remote offices in large corporations, again, an appliance that IT can remotely manage and replicate efficiently back to a data center gives users at the remote site local recovery with RTOs measured in hours, as well as a DR strategy in the event of a site-level issue.

Along these lines, I have spoken to a number of customers lately who are utilizing virtual machines.  In a number of these cases, a virtual backup appliance is a great way to reduce the amount of complexity that is added to a customer’s environment yet still achieve the business requirements.

The Data Center

Next in the ‘fat middle’ is the data center.  There are many different backup challenges here.  One challenge follows the 80 / 20 rule.  Eighty percent of the data is usually unstructured data (file system) and 20% is structured (database and email).  As a general rule, the 80% of the data that is file system data is great for next generation data protection solutions such as source-based data deduplication.  Now, there are exceptions to this rule but a majority of the time source-based deduplication is the perfect fit.
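For readers who have not seen source-based deduplication up close, here is a minimal sketch of the idea: the client splits data into chunks, hashes each chunk, and sends only the chunks the backup store has not seen before.  This is a simplification under stated assumptions: fixed-size chunks, an in-memory dictionary standing in for the backup store, and hypothetical function names.  It is not how any particular product implements it.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size, chosen only for illustration


def backup(data: bytes, store: dict) -> tuple:
    """Chunk `data` at the source and 'send' only chunks missing from `store`.

    Returns (recipe, bytes_sent). The recipe is the ordered list of chunk
    hashes needed to rebuild the data; bytes_sent counts new chunks only.
    """
    recipe = []
    bytes_sent = 0
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            store[digest] = chunk      # only unseen chunks cross the wire
            bytes_sent += len(chunk)
        recipe.append(digest)
    return recipe, bytes_sent


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its recipe of chunk hashes."""
    return b"".join(store[digest] for digest in recipe)


store = {}
data = b"A" * CHUNK_SIZE * 3 + b"B" * CHUNK_SIZE
recipe, sent = backup(data, store)
print(len(data), sent)                 # total size vs. bytes actually sent
print(backup(data, store)[1])          # a repeat backup sends nothing new
print(restore(recipe, store) == data)
```

The repeat backup is where the benefit shows up: file system data that has not changed since yesterday produces the same hashes, so almost nothing has to be sent or stored again.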

A source-based deduplication solution could require that old backup agents be removed and new ones deployed.  It may also mean that media servers are removed or repurposed.  In return for the extra work required to implement a source-based solution, however, you get:

1)      Faster backups

2)      Less capacity stored on backup media (disk)

3)      More time freed up for the 20% of the backup environment that needs more resources

The third item in this list is very important.  As we discussed, there is no longer a ‘one size fits all’ solution for data protection.  I also mentioned that source-based deduplication is a great fit for the unstructured data in your environment.  However, for structured data, which has a high change rate, traditional backup applications typically back up the data faster than source-based deduplication does.  Keep in mind that if you are in a larger data center, you have probably architected your backup infrastructure to meet the demands of your more important applications.  These applications probably back up over the SAN, may be server-less, and have likely required a good deal of IT time to ensure that there are no issues with protecting them.  However, with the data growth in the environment, backups across the enterprise are running more slowly, so they are having an impact on the critical business applications.  By implementing a source-based backup solution for the 80% of the data in the environment that it is a good fit for, you offload the traditional backup application so that it can focus on the 20% of the data that may have a greater business need.

Another good fit for source-based deduplication is in virtual server environments.  The benefits of server virtualization are soon forgotten when it comes time to back up the virtual servers.  The reality is, virtual servers are not designed for high I/O, and backup is the application in your environment with the highest I/O.  By leveraging source-based deduplication and removing all the redundant data at the source before it needs to be sent through the virtual server’s physical resources, you can dramatically decrease your backup bottleneck and lower your TCO with virtual servers.

When it comes to source-based deduplication, one thing to consider is that some customers may not want to go through the process of changing over to a new data protection technology.  If this is the case, or if there is an area where source-based deduplication isn’t a good fit, disk targets such as VTL or target-based deduplication are a good way to increase backup performance over tape and reduce the capacity of data that you are storing on a daily basis.

Also, when I speak with customers these days, they want to reduce their reliance on tape more and more.  Deduplication solutions allow for very efficient appliance-to-appliance replication.  This enables customers to get data off site efficiently and store data on disk at the same cost as storing data on tape, while improving operational recovery and ensuring you are on the Road to Recovery.

A Data Protection Reference Architecture – Part 2

August 20th, 2009 Steve Kenniston No comments

Archive

The most fundamental part of developing a good data protection architecture starts at the base of the triangle with Archive.  Archive is often an overlooked component of data protection – it’s not just for regulated businesses anymore.  Archive essentially gives users 100% data deduplication efficiency.  What I mean by this is that you have the ability to remove ‘stale’ data (and by ‘stale’ I don’t mean unimportant data, I just mean data that is not accessed frequently) completely from your backup stream so you don’t continue to back it up.  Let’s face it; the two most important commodities in backup are time and capacity.  The two are interdependent.  The more capacity you have, the longer it takes to back up and the more money it costs to store.  The longer it takes you to back up, the less likely you are to be meeting your business objectives.  Data capacities aren’t shrinking, they are growing.  According to the latest IDC data, capacity is growing at a staggering pace of 65% year over year, and the digital pack rat in all of us is too afraid to get rid of anything, compromising backup windows and hence the business.  By archiving data that hasn’t been touched in some period of time and removing it from the backup stream, you can relieve some of the pressure on your backups and possibly not have to make any significant changes to your backup infrastructure.
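The selection step described above can be sketched in a few lines: given each file’s last-access time, anything untouched for longer than some threshold becomes an archive candidate and drops out of the backup stream.  The one-year threshold, the function name, and the sample paths are assumptions for illustration; in a real environment the access times would come from file system metadata or an archiving application, and access-time tracking can be disabled on some file systems.

```python
from datetime import datetime, timedelta

# Assumed policy: data untouched for a year is 'stale' (not unimportant, just unused).
STALE_AFTER = timedelta(days=365)


def archive_candidates(files, now, stale_after=STALE_AFTER):
    """Split (path, last_accessed) pairs into archive candidates and the active backup set."""
    to_archive, keep_in_backup = [], []
    for path, last_accessed in files:
        if now - last_accessed > stale_after:
            to_archive.append(path)
        else:
            keep_in_backup.append(path)
    return to_archive, keep_in_backup


now = datetime(2009, 8, 20)
files = [
    ("/data/q1-2007.xls", datetime(2007, 4, 1)),   # hypothetical sample entries
    ("/data/plan.doc", datetime(2009, 8, 1)),
]
to_archive, keep_in_backup = archive_candidates(files, now)
print(to_archive)
print(keep_in_backup)
```

Everything in the first list stops consuming backup time and capacity every night, which is where the ‘100% deduplication efficiency’ comes from.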

Also, you don’t need a special-purpose device or appliance for archive.  You can archive data to any file system.  Keep in mind, however, that you want to archive to a platform that can keep costs low.  Remember this data is not unimportant, just not highly used.  Take into account your RTO and store the data on the most cost-effective platform possible that also aligns with the business objectives.  This may be tape, it may be optical or it may be disk.  If it is disk, you want to store it on disk that is optimized for this type of data: optimized for capacity (deduplication, compression, single instancing), with low power and cooling costs, able to replicate for availability, and highly reliable.  You will also want to make sure that it is integrated to some extent with an application that lets you find the data pretty quickly when you need it and puts you further down the Road to Recovery.

In my next post we will talk about what I call the ‘fat middle’.  In this area almost all of the data has a 24 hour RPO, and this is where traditional and next generation backup applications play.  There are many use cases for data protection in this area, and RTOs tend to drive the medium to which data is backed up (disk or tape).  Stay tuned for Part 3.

Storage Switzerland

August 18th, 2009 Steve Kenniston No comments

One of the more thoughtful analysts in the industry, in my opinion, is George Crump from Storage Switzerland.  (I like the name and George is as independent as you can get in this business.)  Yesterday I had the pleasure of briefing George on EMC’s Data Protection Vision.  I like talking with George for a couple of reasons.  First, he gets it.  What does that mean?  Read his material.  He is genuinely trying to educate IT folks on what is really important in the data center and how to address these challenges.  Next, he keeps the ‘pay for’, ‘vendor spin’ to a minimum.  George works hard to just talk about the facts of a product or industry and talk about how products can help without selling.

The reality is, we live in a great technological time.  The problem with IT is that only 50% of the problems are technology related.  The other 50% are psychological.  IT can’t just implement new technology because it’s cool or even because it really does solve a problem.  Sometimes new technology is too expensive to implement, or the solution that is currently in place had a three year amortization and you’re only two years into your product life.  Or, more importantly, the new technology may be the greatest technology at the right price but it doesn’t fit into the current IT priorities.  These are all things IT needs to work through when considering whether or not to invest in new technology.

The other thing George and I spoke about was the fact that it gets difficult to be ‘strategic’ in IT, especially given certain economic times.  A lot of times IT just needs a Band-Aid or quick fix to move on to more important issues that really drive the business.  I talk about this a lot, especially when it comes to backup.  Let’s face it, it may not be what we all want to hear, but backup is not strategic to most environments.  The applications that drive the business are most important.  Backup is about risk mitigation and information availability if everything else fails.  Right, ‘if everything else fails’, and IT typically invests in technology in the front end in an effort to have as little failure as possible.  Meaning, IT doesn’t just buy JBOD with no RAID if they think the environment shouldn’t be put at that kind of risk.  So IT is already investing in some risk management up front, which drives the spend on the back end for data protection.

I wanted to say “Thanks” to George for taking the time to come in and understand the bigger strategy EMC is driving with its products in the data protection space and to talk about our existing successes with the current portfolio.  Hopefully George, as well as all of you, can see how we are helping to put customers on the Road to Recovery.

A Data Protection Reference Architecture – Part 1

August 14th, 2009 Steve Kenniston No comments

This blog will have multiple parts.  I will introduce my view of a data protection reference architecture and the next few blog posts will talk to components of that architecture.

The other day I had a very interesting conversation with a colleague of mine in Australia.  He was looking for a data protection reference architecture that he could use to speak to his customer.  As you can imagine, having this conversation over the phone could prove to be a challenge.  When the conversation began, my fear was he was looking for an ‘architecture’ diagram that included data protection appliances, backup servers, disk libraries, tape libraries and backup agents.  I quickly realized that this is an impossible conversation to have with him without knowing:

A)     the customer’s environment or challenges

B)      the customer’s business objectives

I find that most vendors don’t know A or B when speaking to a customer about their data protection ‘issues’, but they really should.  Having a more thoughtful, consultative conversation with customers is a better way to understand their challenges and to align those challenges with the best possible solution.

I started my conversation with the diagram shown below (Figure 1).  A simple triangle divided horizontally into 4 segments and the middle two segments divided vertically in half.  Each segment represents different business objectives within a company.  As you go around the triangle, you can see that there are different technologies and different methodologies for attacking data protection challenges, which is why there is no longer a “one size fits all” approach when it comes to protecting data today.  Let’s face it; the two most important commodities in backup are time and capacity.  One of the primary drivers behind the type of protection that is used is the Recovery Point Objective or RPO.  Different technologies provide different RPOs, each at a different price point, and there are different processes that can be applied to attain those RPOs.

Figure 1

Having a conversation specific to this diagram can have a tremendous amount of value on a number of fronts, including aligning technology needs with business objectives, highlighting critical pain points, and beginning a roadmap that helps implement data protection technology based on business needs and budget and puts you on the Road to Recovery.

The next post will cover the foundation of the triangle – Archive.
