Archive

Posts Tagged ‘Restore’

How Much Backup Capacity Does Deduplication Really Save?

November 30th, 2009 Steve Kenniston No comments

There is a lot of discussion around data deduplication for backup these days.  (I wish I could deduplicate all the turkey I ate last week.)  In fact, Gartner claims that “…by 2012, deduplication will be applied to 75% of backups.”  And when asked “Why?” the response was “…deduplication is too compelling to ignore.”  But I say “prove it”.  So I put together some backup capacity numbers for storing data on tape (non-compressed and compressed) versus storing data, deduplicated (fixed block and variable block), on disk and the numbers show a dramatic savings in backup space which translates into cost savings.

The Parameters

As with any ‘analysis’, numbers can be ‘spun’ to make them say what you want.  That said, I tried to be as straightforward as possible, so let me also show my methodology so you can see how my numbers were derived.

  • I charted the amount of capacity created using a retention policy of:
    • 14 Dailies
    • 4 Weeklies
    • 12 Monthlies
  • I selected 10TB of primary storage capacity
  • I did this for file system backups only
  • I charted the data for 30%, 40%, 50% and 60% primary storage growth rates
  • I charted traditional tape based backup (non-compressed)
  • I charted traditional tape based backup (compressed, 2:1)
  • I charted fixed block disk based deduplicated backup
  • I charted variable block disk based deduplicated backup (3 to 5 times more efficient than fixed block deduplication)

The Effect

The first thing to think about is the sheer number of full backup copies that must be maintained when utilizing the above retention schedule.  The above retention policy leads to 17.2 copies of the primary storage (12 monthlies + 4 weeklies + the equivalent of 1.2 with dailies = 17.2 copies).  Translation: one terabyte of primary storage becomes 17.2 terabytes of tape storage.  This means backup administrators need to pay for the physical tapes as well as the offsite transport and storage costs.  Now 17.2 terabytes of tape doesn’t sound like much, but keep in mind that is for 1TB of primary capacity.  Ten TB of primary capacity yields 172 TB of tape capacity.  Now add in year over year storage growth.  At 30% primary storage growth, backup storage grows 23%; at 40%, it grows 29%; at 50%, it grows 33%; and at 60%, it grows 38%.
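The copy count is simple arithmetic and can be sketched in a few lines. The 1.2 full-copy equivalent for the 14 dailies is taken from the post as given (its derivation is not stated), and the function and constant names are purely illustrative.

```python
# Full-copy equivalents kept under the retention policy listed in
# "The Parameters". The 1.2 figure for the 14 dailies is taken from
# the post as-is; how it was derived is not stated there.
MONTHLY_FULLS = 12
WEEKLY_FULLS = 4
DAILY_EQUIVALENT = 1.2


def retained_copies():
    """Total full-copy equivalents held at any one time."""
    return MONTHLY_FULLS + WEEKLY_FULLS + DAILY_EQUIVALENT


def tape_capacity_tb(primary_tb):
    """Uncompressed tape capacity needed to protect primary_tb of data."""
    return primary_tb * retained_copies()


print(round(retained_copies(), 1))      # 17.2
print(round(tape_capacity_tb(10), 1))   # 172.0
```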

Figure 1 below shows 10 TB of primary capacity growing at 30%, 40%, 50% and 60% along the x-axis, and the corresponding capacity of tape or disk consumed along the y-axis.

Figure 1

The graph shows that compressed backup to tape obviously yields a 50% capacity improvement over non-compressed tape as one would expect. It also reflects that fixed block deduplicated disk capacity is only about 48% more efficient than uncompressed tape storage yet variable block deduplication is 81% more storage efficient than uncompressed tape storage.

Interestingly, the chart also reveals that fixed block deduplication is 3% less efficient than compressed tape, whereas variable block deduplication is 62% more efficient than compressed tape. Typically, with the same data change rates and equivalent data sets, variable block deduplication is 3 to 5 times more efficient than fixed block deduplication.
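To see how the "versus compressed tape" figures follow from the "versus uncompressed tape" ones, here is a rough sketch. The relative capacities are rounded readings of the percentages quoted above (48% and 81% savings), so the results are approximate; the dictionary keys and function name are made up for illustration.

```python
# Capacity consumed relative to uncompressed tape (1.0 = same capacity).
# Rounded from the percentages quoted in the post, so results are approximate.
RELATIVE_CAPACITY = {
    "tape_uncompressed": 1.00,
    "tape_compressed": 0.50,    # 2:1 compression
    "dedupe_fixed": 0.52,       # "about 48% more efficient"
    "dedupe_variable": 0.19,    # "81% more efficient"
}


def savings_vs(baseline, candidate):
    """Fractional capacity saved by candidate relative to baseline."""
    return 1 - RELATIVE_CAPACITY[candidate] / RELATIVE_CAPACITY[baseline]


for name in ("dedupe_fixed", "dedupe_variable"):
    print(name, f"{savings_vs('tape_compressed', name):+.0%}")
```

With these rounded inputs, fixed block comes out about 4% worse than compressed tape (the chart reads 3%) and variable block about 62% better.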

The moral of the story – if you’re going to do deduplication, variable block is the way to go. From a cost perspective, there is essentially no difference in the $/TB price; however, there is much more value in the long run with variable block deduplication. Vendors typically charge a $/TB price for their deduplication solutions. The difference between fixed and variable block deduplication comes down to the capacity of data that is stored in the backups, which directly translates into costs. If you take a look at Figure 2, starting with 1TB of primary capacity growing at 25% over the course of one year, IT will need almost 2TB of backup capacity with fixed block deduplication versus less than 1TB of capacity using variable block deduplication (this assumes fixed block is 5x less efficient, based on empirical data that has been collected in the field). The most important part of this graph is the slope of the blue and red lines. The steeper the slope (red line), the more frequently IT will need to purchase capacity to protect the given data set, as well as pay for the deduplication software licensing that goes with it. IT wants the smaller slope.

Figure 2

*Note: Some companies will position their fixed block technologies as variable block by stating that you (the user) have the ability to set the block size to whatever you want; however, once set, it stays that way for all of your data.  The difference is that true variable block technologies adjust the block size on the fly, using their algorithms to ensure maximum efficiency with no management.
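A toy comparison makes the fixed versus variable distinction concrete. This is a minimal sketch and not any vendor's algorithm: the window size, divisor and CRC-based boundary test are assumptions chosen for brevity, and production systems typically use a proper rolling hash. It inserts a single byte at the front of a data set and counts how many previously stored chunks can be reused.

```python
import hashlib
import random
import zlib

WINDOW = 16      # bytes examined when looking for a boundary (assumed)
DIVISOR = 64     # roughly one boundary per 64 bytes on random data (assumed)
FIXED_SIZE = 64  # fixed block size in bytes (assumed)


def fixed_chunks(data, size=FIXED_SIZE):
    """Cut data every `size` bytes regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data):
    """Cut wherever the checksum of the preceding WINDOW bytes hits a target.

    Boundaries depend on content, not on absolute offset, so they move
    with the data when bytes are inserted or removed.
    """
    chunks, start = [], 0
    for i in range(WINDOW, len(data)):
        if i - start >= WINDOW and zlib.crc32(data[i - WINDOW:i]) % DIVISOR == 0:
            chunks.append(data[start:i])
            start = i
    chunks.append(data[start:])
    return chunks


def shared(chunker, old, new):
    """Number of distinct chunks in `new` that were already stored for `old`."""
    def digests(data):
        return {hashlib.sha256(c).digest() for c in chunker(data)}
    return len(digests(old) & digests(new))


rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(20_000))
modified = b"X" + original   # one byte inserted at the front

fixed_shared = shared(fixed_chunks, original, modified)
variable_shared = shared(variable_chunks, original, modified)
print("fixed-block chunks reused:   ", fixed_shared)
print("variable-block chunks reused:", variable_shared)
```

Because every fixed block shifts by one byte, essentially none of them match what was stored before, while the content-defined boundaries realign after the insertion and most chunks are reused.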

Bang for the Buck

The most important benefit, as with most things in IT, is overall cost savings. Deduplicated disk solutions are anywhere from 2.5X to 3X more expensive than tape; however, with the overall capacity savings, there can be significant cost savings. Figure 3 is representative of the overall costs of new deduplicating disk systems and traditional tape backup systems (including tapes and off-site storage costs). I will caveat this by saying every TCO and ROI has a ton of ‘what ifs’ that factor into overall costs, including things like FTE for backup engineers and long term retention costs, but for the most part, disk systems reduce a good deal of these costs (with the exception of power and cooling) and increase the reliability, security and performance of backups and recoveries.

Figure 3

1 The chart above is based on a rough cost of $8,000 per terabyte of tape backup system costs (including media and off-site storage) and rough cost of $20,000 per terabyte of deduplicated disk backup system costs for the period of one year.  Prices will vary depending upon your configuration and these estimates do not include space, power, cooling or human costs.
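As a rough illustration of how the per-terabyte prices interact with the capacity savings, the sketch below combines figures quoted in this post: 10 TB of primary data, 17.2 retained copies, and the rounded relative capacities from Figure 1. It is not a reproduction of Figure 3, and the scenario names and function are assumptions made for the example.

```python
# Rough per-terabyte yearly costs quoted in the footnote.
TAPE_COST_PER_TB = 8_000
DEDUPE_DISK_COST_PER_TB = 20_000

# Assumption: 10 TB primary and 17.2 retained copies, as earlier in the post.
UNCOMPRESSED_TAPE_TB = 10 * 17.2

# (capacity relative to uncompressed tape, cost per TB); the relative
# capacities are rounded readings of the percentages quoted in the post.
scenarios = {
    "tape, uncompressed": (1.00, TAPE_COST_PER_TB),
    "tape, compressed 2:1": (0.50, TAPE_COST_PER_TB),
    "disk, fixed-block dedupe": (0.52, DEDUPE_DISK_COST_PER_TB),
    "disk, variable-block dedupe": (0.19, DEDUPE_DISK_COST_PER_TB),
}


def yearly_cost(relative_capacity, cost_per_tb):
    """Cost of the backup capacity actually consumed, for one year."""
    return UNCOMPRESSED_TAPE_TB * relative_capacity * cost_per_tb


for name, (rel, price) in scenarios.items():
    print(f"{name:30s} ${yearly_cost(rel, price):>12,.0f}")
```

Under these assumptions, variable-block disk lands slightly below compressed tape despite the 2.5X per-terabyte price, while fixed-block disk does not.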

As I stated above there are only a few factors that are involved in this very raw calculation.  There are a number of other factors involved with a backup process including WAN costs (if replacing tape with disk), remote office facilities, installation (professional services), and software and hardware maintenance to name a few.  But no matter how you look at it, disk based backup with variable block deduplication wins over tape.

Backing data up to deduplicated disk not only reduces the amount of backup capacity that is used, it also has other implications for a data protection environment.  First, backing up to disk versus backing up to tape helps to reduce the reliance on tape and the inherent limitations, security concerns and reliability issues surrounding tape.  Recovery of data from disk reduces the operational costs and decreases the recovery time objective.  Additionally, the reliability of disk with RAID is much higher than the reliability of tape.

New data protection technologies are evolving backup to a degree where the entire data protection process is getting easier to manage by removing multiple points of management (backup servers, media servers, tape libraries and physical tape).  As backup continues to evolve, this can help simplify the overall process and:

  • Increase reliability of backups
  • Increase reliability of recoveries
  • Decrease backup times
  • Decrease the time to recover data

The Bottom Line

New challenges in protecting information are arising every day.  Whether it is data growth, remote office data protection or virtualization, backup is getting harder, not easier.  Data deduplication is providing backup administrators with tremendous benefits around backup processes and cost savings.  It is important to keep in mind that everybody’s environment is different and utilizes different methods and processes for managing and protecting information.  It is also important to take a look at your data protection environment today and understand the use cases where it is time to make new investments.  I encourage you to look at new technologies to help you with emerging challenges and weigh the overall solution, including costs as well as benefits of disk based recovery.  New backup technologies that leverage data deduplication can save IT a lot of money and put you back on the Road to Recovery.


The Side Effects of Backup on Server Virtualization

September 14th, 2009 Steve Kenniston 2 comments

Server virtualization has changed the IT landscape dramatically.  It has become a magic potion curing a number of ills in the physical server world such as low individual CPU utilization and excess use of space, power and cooling in the data center.  However, like all potions that cure what ails you, there can be side effects.  You need to be careful of what the Witch Doctor orders.

When I speak with customers who have aggressively implemented a virtual server infrastructure, 9 out of 10 tell me that they underestimated the effect that virtualization would have on their backups and backup process, and how backup, when not considered during the virtual server assessment and planning process, might actually make virtualization less of the magic potion they had hoped for.  So what is the issue?  Backup is a virtualization bottleneck, and without addressing it, you may not be able to obtain the server consolidation ratios you had been expecting, which can have a negative effect on your virtual server TCO and ROI.

This is a timely discussion as VMworld has just concluded.  VMware users flocked to VMworld looking for best practices when it comes to implementing virtual server technology.  Because virtualization allows IT to reduce the overall physical hardware infrastructure, users will be looking at how to maximize their server consolidation ratios (get as many virtual servers on a physical server as they can and still provide good application performance).

I often hear that companies assess their environments by looking at the production applications on their physical servers, identifying their workloads and translating that into some consolidation ratio of physical servers to virtual servers.  I also hear, from these same customers, that backup was never taken into consideration during the assessment phase when trying to identify the best possible consolidation ratios.  These customers implement their new virtual server environments, install the backup agent they had previously been using for physical server backups, attempt to back up their virtual servers, and find that they can only protect 50% to 60% of the new environment.  Why?

Let’s look at the physics.  Let’s say your virtualization ratio is 12 virtual servers to 1 physical server.  Twelve physical servers back up with 12 NICs, 12 CPUs, 12 memory ‘chunks’, etc.  When you moved these 12 physical servers into the virtual world and put them on one physical server, did you put 12 NICs in the new physical server?  Did you put 12 CPUs in the new server?  Do you have 12x the memory?  Chances are, probably not.  However, the capacity didn’t change, did it?  So how could one expect that backup, which is I/O, memory and CPU intensive, would perform well in a virtual world?

Figure 1 below shows how, when you back up 12 servers, the resource drain on each server is roughly 25% (per system during a full backup).  When you virtualize these 12 servers onto one or two physical servers, your physical system utilization shoots up to 80%+.  This utilization can be so dramatic that it actually affects the number of virtual servers you can have on these systems, which can ruin your virtual server TCO / ROI.

Figure 1


Simple math dictates that unless you have all the same resources on your new physical server as you did on all your physical servers before the consolidation, you won’t get the same backup performance.  I have spoken with customers who aimed for a 25 to 1 virtual to physical server consolidation but were only able to get 15 to 1, because their backup application couldn’t handle 25 virtual servers on one physical server without leaving some unprotected.
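The ‘simple math’ can be sketched as a back-of-the-envelope calculation. The model below is deliberately crude and every input is hypothetical: it assumes backups are limited only by one shared throughput figure for the host and run back to back inside the window.

```python
def max_vms_in_window(window_hours, gb_per_vm, host_mb_per_s):
    """How many VMs one host can fully back up inside the window.

    Assumes backups are limited purely by the host's shared throughput
    (NIC, disk or CPU, whichever is lowest) and run back to back.
    """
    capacity_gb = window_hours * 3600 * host_mb_per_s / 1000
    return int(capacity_gb // gb_per_vm)


# Hypothetical host: 8-hour window, ~110 MB/s effective throughput,
# 200 GB sent per VM per full backup.
print(max_vms_in_window(8, 200, 110))   # 15
```

With these made-up inputs, a host planned for 25 virtual servers can protect only 15 of them in the window; halving the data moved per virtual server lifts the figure to 31.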

People could argue that if you properly schedule each virtual machine to back up in a window when all the other systems are not backing up, then perhaps you could get by with traditional backup.  The flip side is, IT has been telling me they don’t want to manage the backup process any more than they have to.  So how do you ‘fix’ this problem?

The issue is that backup is a very I/O-intensive application; therefore there is only one way to fix the problem.  You need to reduce the amount of I/O generated and sent through the physical devices that house the virtual servers during backup.  Virtual servers were designed to provide a lot of benefits, but high I/O capability is not one of them.  (This is okay; every technology implementation has its tradeoffs.  When the positives outweigh the negatives, especially in a substantial way, as they do with virtual servers, you usually have a paradigm shift, and this is what we are seeing with virtual servers.)

So how do you change the I/O pattern of backup?  You do so by decreasing the amount of data that is utilizing the shared resources during backup.  There are a couple of ways to do this.  One way is to leverage the storage array and snapshot the data.  Snapshots allow you to make copies of virtualized server data, mount the snapshot on a proxy host and off-load the backups from the physical server that houses the virtual servers.  The downsides are:

1)      This becomes a new set of processes to manage unlike traditional backup processes

2)      You need extra storage capacity with this solution

3)      You will need to manage another physical server (proxy server)

4)      You will need more backup agents from your backup software provider

The most efficient way, however, is to take advantage of a new backup software application that leverages data reduction (data deduplication) on the client.  Your processes stay the same, there is no need for additional primary storage hardware and by leveraging a ‘smarter’ backup client, you will reduce the I/O tax on your physical server devices and thereby have the ability to maximize your TCO / ROI for your new virtual server environment.
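To illustrate why deduplicating on the client reduces the I/O tax, here is a minimal sketch of the general idea rather than any product's protocol. The client hashes each chunk and sends only chunks the backup target has not already stored. The `BackupServer` class, the 4 KB fixed chunk size and the in-memory store are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed; real clients choose their own chunking


class BackupServer:
    """Stand-in for a backup target that stores chunks by hash."""

    def __init__(self):
        self.chunks = {}

    def has(self, digest):
        return digest in self.chunks

    def store(self, digest, chunk):
        self.chunks[digest] = chunk


def backup(data, server):
    """Send only chunks the server lacks; return (recipe, bytes_sent)."""
    recipe, bytes_sent = [], 0
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if not server.has(digest):
            server.store(digest, chunk)
            bytes_sent += len(chunk)
        recipe.append(digest)   # ordered list of hashes rebuilds the data
    return recipe, bytes_sent


server = BackupServer()
monday = b"".join(bytes([n]) * CHUNK_SIZE for n in range(100))  # 100 distinct chunks
tuesday = monday[:-CHUNK_SIZE] + b"\xff" * CHUNK_SIZE           # last chunk changed

_, sent_first = backup(monday, server)
recipe, sent_second = backup(tuesday, server)
print(sent_first, sent_second)   # 409600 4096
```

The second backup moves one chunk instead of one hundred, which is the reduction in data sent through the shared physical resources that the paragraph above describes.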

Additionally, a number of these technologies have additional offerings that truly make them next generation.  Backup licensing is slowly moving to a capacity based license model.  One great feature of these new products is the fact that there is no charge for clients or agents.  This allows you to create a virtual server template with the backup agent embedded within it.  You no longer have to worry about proliferating backup clients and then paying for all those clients when it is time to ‘true up’ with your backup software vendor.  Data deduplication technologies also offer the ability to replicate the backup data efficiently to disk at a remote site, so you can develop a more efficient disaster recovery plan that reduces the reliance on tape and increases your overall operational efficiency.

Regardless of which path you choose, each requires IT to rethink their backup strategies when it comes to protecting virtual server environments.

I encourage you to do two things as you consider moving to a virtual server infrastructure:

1)      Make sure you are thinking about data protection when architecting your new virtual server environment

2)      Check out some of the new technologies and best practices offered by vendors for protecting virtual servers.

Hopefully this will help put your virtual server world back on the Road to Recovery!


A Data Protection Reference Architecture – The Final Chapter

September 1st, 2009 Steve Kenniston 2 comments

The Architecture

This ‘architecture’ diagram, as you can see, is not a typical architecture diagram, but hopefully it can be used to align your business and business objectives with the technologies that are available and can best be applied to solve your issues, helping to balance cost, complexity and compliance.

This diagram can also be used to do a couple of other things.  It can help you begin to classify your data and align your data to your business objectives.  It also lets you begin to identify which data or data services in your environment may be more important to you than others and, based on this, helps you choose areas you may want to outsource or move to the cloud.

As you can tell, there really is not one solution for meeting all your data protection needs.  The challenge comes with managing multiple solutions in an effort to meet your business objectives.  While there are only a few technologies available that allow you to manage your environment across all your RPOs and RTOs, it is important that I point out EMC’s NetWorker is able to do this, centralizing your data protection infrastructure  for ease of management.  It allows you to manage traditional backup, source based deduplicated backup with Avamar, CDP with RecoverPoint, as well as the EMC disk libraries and tape where the data is stored.  Now, I am not saying that NetWorker solves all of your data protection challenges, nor am I suggesting that replacing one traditional backup technology for another is the right answer, but what I am saying is that if you’re looking to have all the feature functionality required to meet all your business objectives and you want easier management, NetWorker is one avenue to get you there.  Additionally, the underlying image of the triangle represents data protection management.  Putting all the new technology in place is one thing, managing it, and ensuring you are now meeting your business needs is another.  EMC’s Data Protection Advisor can help here as well.

This diagram can help customers lay out a new, better data protection schema for their environment and start thinking about data protection a bit more strategically versus tactically.  It can also help vendors speak to customers about how they should look at their environment in order to identify specific challenges and the means they need to alleviate these challenges, taking backup, beyond.


A Data Protection Reference Architecture – Part 3

August 24th, 2009 Steve Kenniston No comments

The ‘Fat Middle’

In the ‘fat middle’ of the triangle, as I stated last week, there are a number of ways to protect information.  I have chosen to break apart the middle into two categories.  The reality is, this is meant to be used as a tool for helping you lay out a strategy, so your boxes could be based on capacity and could end up in different areas of the triangle depending upon your business needs.  The thing to keep in mind is that it’s not about your environment matching these boxes exactly, but about making sure that all of the critical data that requires backup with a 24 hour RPO is protected; you then align the data value in the box with the most appropriate technology to 1) solve the challenge and 2) fit best in your environment.

SMB / ROBO

First, let me clarify my terminology.  ROBO is remote office / branch office and SMB is small to medium business.  If we think about the business needs that are most important in this arena, they are:

1)      Low cost

2)      Simplicity (one tool)

3)      24 hour RPO is adequate

Small and medium businesses, as well as remote offices, need a robust data protection solution that allows them to meet their backup windows and that has the ability to recover data that is not any older than 24 hours (RPO).  The RTO drives whether the backup target is disk or tape.   Faster recoveries come from disk.  Another thing to keep in mind is that there isn’t usually a lot of technical expertise at these sites so the backup application needs to be very simple to manage.

Backup appliances or appliance-like backup technologies tend to work very well in these environments.  A self-contained, disk-based backup appliance with the ability to replicate efficiently to another site for disaster protection is a great solution for sites like these.

In the case of SMBs, they can take advantage of a single application with integrated disk that could replicate to the cloud for very little cost and management while meeting their data protection objectives.  If cost is a driving factor, and the customer just wants better backup and recovery performance, moving to an appliance-based, capacity-optimized disk solution that could replicate is a viable option.  If the customer does not have a desire to replace their existing backup solution because it is working fine for them, then moving to disk based backup can help with most performance requirements.  (This is true for the data center as well.)  And when customers really want tape as their backup medium for getting data off-site, then the management will be a bit more complex but still easily achievable.

For remote offices in large corporations, again, an appliance that IT can remotely manage and replicate efficiently back to a data center gives users at the remote site local recovery time objectives measured in hours, as well as a DR strategy in the event there is a site level issue.

Along these lines, I have spoken to a number of customers lately who are utilizing virtual machines.  In a number of these cases, a virtual backup appliance is a great way to reduce the amount of complexity that is added to a customer’s environment yet still achieve the business requirements.

The Data Center

Next in the ‘fat middle’ is the data center.  There are many different backup challenges here.  One challenge follows the 80 / 20 rule.  Eighty percent of the data is usually unstructured data (file system) and 20% is structured (database and email).  As a general rule, the 80% of the data that is file system data is great for next generation data protection solutions such as source-based data deduplication.  Now, there are exceptions to this rule but a majority of the time source-based deduplication is the perfect fit.

A source-based deduplication solution could require that old backup agents be removed and new ones deployed.  It may also mean that media servers are removed or repurposed.  The payoffs for the extra work required to implement a source-based solution, however, are:

1)      Faster backups

2)      Less capacity stored on backup media (disk)

3)      More time freed up for the 20% of the backup environment that needs more resources

The third item in this list is very important.  As we discussed, there is no longer a ‘one size fits all’ solution for data protection.  I also mentioned that source-based deduplication is a great fit for the unstructured data in your environment.  However, for the structured data, information that has a high change rate, traditional backup applications typically back up this data faster than source-based deduplication.  Keep in mind: if you are in a larger data center, you have probably architected your backup infrastructure to meet the demands of your more important applications.  These applications probably back up on the SAN, may be server-less, and have likely required a good deal of time spent by IT ensuring that there are no issues with protecting them.  However, with the data growth in the environment, backups across the enterprise are running more slowly, so they are having an impact on the critical business applications.  By implementing a source-based backup solution for the 80% of the data in the environment that it is a good fit for, you off-load the traditional backup application so that it can focus on the 20% of the data in the environment that may have a greater business need.

Another good fit for source-based deduplication is in virtual server environments.  The benefits of server virtualization are soon forgotten when it comes time to back the servers up.  The reality is, virtual servers are not designed for high I/O and backup is the application in your environment with the highest I/O.  By leveraging source-based deduplication and removing all the redundant data at the source before it needs to be sent through the virtual servers’ physical resources, you can dramatically decrease your backup bottleneck and improve your TCO with virtual servers.

When it comes to source-based deduplication, one thing to consider is that some customers may not want to go through the process of changing over to a new data protection technology.  If this is the case, or if there is an area where source-based deduplication isn’t a good fit, disk targets such as VTL or target based deduplication are a good way to increase backup performance over tape and reduce the capacity of data that you are storing on a daily basis.

Also, when I speak with customers these days, they want to reduce their reliance on tape more and more.  Deduplication solutions allow for very efficient appliance to appliance replication.  This enables customers to get data off site efficiently and store data on disk at the same cost as storing data on tape, while improving operational recovery and ensuring you are on the Road to Recovery.


A Data Protection Reference Architecture – Part 2

August 20th, 2009 Steve Kenniston No comments

Archive

The most fundamental part of developing a good data protection architecture starts at the base of the triangle with Archive.  Archive is often an overlooked component of data protection – it’s not just for regulated business anymore.  Archive essentially gives users 100% data deduplication efficiency.  What I mean by this is that you have the ability to remove ‘stale’ data (and by ‘stale’ I don’t mean unimportant data, I just mean data that is not accessed frequently) completely from your backup stream so you don’t continue to back it up.  Let’s face it; the two most important commodities in backup are time and capacity.  The two are interdependent.  The more capacity you have, the longer it takes to back up and the more money it costs to store.  The longer it takes you to back up, the less likely you are to be meeting your business objectives.  Data capacities aren’t shrinking, they are growing.  According to the latest IDC data, capacity is growing at a staggering pace of 65% year over year, and the digital pack rat in all of us is too afraid to get rid of anything, compromising backup windows and hence the business.  By archiving data that hasn’t been touched in some period of time and removing it from the backup stream, you can relieve some of the pressure on your backups and possibly not have to make any significant changes to your backup infrastructure.
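Identifying archive candidates can be as simple as walking a file system and listing files that have not changed in a given period. The sketch below is one hypothetical way to do that; it uses modification time as the staleness signal, which is an assumption (access times are often disabled or unreliable on real file systems), and the function name and threshold are illustrative.

```python
import time
from pathlib import Path


def find_stale_files(root, max_age_days, now=None):
    """Return files under root not modified within max_age_days.

    Uses modification time as the staleness signal (an assumption:
    access times are often disabled or unreliable on real file systems).
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86_400   # seconds per day
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )
```

Calling `find_stale_files(some_directory, 365)` would list everything untouched for a year; those files could then be moved to the archive platform and excluded from the regular backup stream.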

Also, you don’t have to back up to a special purpose device or appliance for archive.  You can archive data to any file system.  I would keep in mind, however, that you want to archive to a platform that can keep costs low.  Remember this data is not unimportant, just not highly used.  Take into account your RTO and store the data on the most cost effective platform possible that also aligns to the business objectives.  This may be tape, it may be optical or it may be disk.  If it is disk, you want to store it on disk that is optimized for this type of data: optimized for capacity (deduplication, compression, single instancing), with low power and cooling costs, able to replicate for availability, and highly reliable.  You will also want to make sure that it is integrated to some extent with an application that lets you find the data pretty quickly when you need it and puts you further down the Road to Recovery.

In my next post we will talk about what I call the ‘fat middle’.  In this area almost all of the data has a 24 hour RPO, and it is where traditional and next generation backup applications play.  There are many use cases for data protection in this area, and RTOs tend to drive the medium to which data is backed up (disk or tape).  Stay tuned for Part 3.


Process vs. Technology

May 1st, 2009 Steve Kenniston 1 comment

The hardest thing to change inside IT is not technology, it is process!  I say this because all too often there are technologies available that provide a far superior solution to a complex IT problem, however, this new technology may not fit into your existing business process.  Need proof?  Let’s take data protection as an example.  Did you know that VTLs (virtual tape libraries) and data deduplication technologies came out at the exact same point in history, 10 years ago?  Which technology had faster market adoption?  VTLs of course because implementing them didn’t cause a major disruption in processes.

Let’s take a look at a simple backup environment.  We won’t worry about archiving or compliance for the moment, just operational backup and recovery.  Today’s backup has a number of complexities.  There are some data sets that have weekly full backups and daily incremental backups.  There are some data sets that sit under applications that, for faster recovery capabilities and simplicity, require daily full backups.  Once the backups are done, in order to ensure true data protection reliability, a process of checking the backup logs to ensure every system was successfully protected begins.  Next, backup tapes are either created (if it is a disk based backup) or tapes are taken from the library and moved to a transportable box, hopefully a secure box.  Finally, a third party vendor comes to pick up the tapes and take them off site for safe-keeping.  Additionally, if the data is backed up using encryption, then the encryption keys are also kept off site for security purposes.

 Customers face these standard backup challenges:

1) Backups take too long and cannot meet backup windows as a result of too much data.

2) Backups fail due to poorly configured (networked) backup environments.

3) Backups at remote offices are ‘unreliable’. (Don’t follow best practices set in the data center.)

a. No one with the appropriate skill set is available to monitor these backups.

b. No one with the appropriate skill set is available to troubleshoot these backups.

c. No one with the appropriate skill set is available to perform data recovery.

4) New applications / processes cause additional challenges: does this application need incremental backups or full backups, and what is the RPO / RTO?

5) Managing backup tapes is too difficult and costly.

However, the reality is that in this particular IT shop, no one has ever been fired for data loss. Each time there is a recovery request, data is recovered.  It may not be the absolute most recent data, or it may take 48 hours to recover, but eventually, the data is recovered. The question is, have everyone’s business objectives been met? Chances are the answer is “no”, but when the issue of what it would cost to meet everyone’s needs comes up, there is usually no money in the budget for ‘backup’ and it’s right back to the same old way of doing things. Backup is not really strategic to a business (unless of course you’re in the business of providing backup solutions to customers); it is more of an insurance policy. There is no doubt you need it, but you want it for the lowest possible price, hope you never have to call on it, and when you do, you better get good service.

Maybe that is why EMC is now the GEICO of data protection.

That aside, when there is money in the budget, it usually comes in small doses, so backup administrators have to make the biggest impact in the ‘easiest’ way possible. This means implementing something that allows them to meet most of their challenges and doesn’t:

1) Change process because they already have run books established for data recovery and because everyone is already trained on the existing technology.

2) Change configuration because they have already invested a great deal of time and money to sort out their issues with the existing products.

3) Cost a lot of money

That usually means augmenting the existing backup software technology with something that allows them to gain some efficiencies on the backend, because they already have significant investments in their backup software. This was one of the main reasons for the success of VTL (virtual tape libraries). It is way easier to unplug the slow, serial tape library and replace it with fast, parallel disk. The backup administrator gets all the advantages of disk and doesn’t have to change a single process, except for maybe adding a step of cloning the data from the disk that looks exactly like tape to an actual tape in order to get the data offsite. Additionally, this is why companies with target deduplication devices became so popular so quickly. When VTL was having challenges solving backup data capacity issues, deduplication became the next popular thing.  The big issue was plugging into the existing infrastructure without disruption.  If I have to change too much about my process, I can’t ‘afford’ to make it work.

The trouble is backup administrators are at an inflection point. They can no longer continue to use the same old technology at the front of the backup process and meet the needs of the business. We are at a time when new technologies such as source based deduplication can really have a significant impact on a number of the backup challenges. The problem is that it runs straight into the reason IT doesn’t want to change technology: it forces a change to the process. For example, out come the traditional backup agents and new ones are put into place. Since data is no longer stored in tape format, new processes must be utilized for getting tape offsite. When backup administrators hear this, they tend to shy away from it. It costs money and it changes processes right when they had all the original processes figured out.  It is only now that source based deduplication solutions have gained significant momentum, as they are really solving a number of the key data protection challenges for more than 70% of the data in most data centers.

  • Remote offices can now experience the same set of data protection best practices that are used in the data center. (Keep in mind that IT is accountable for 100% of the data created in the corporation, local or remote.  This is good peace of mind.)
  • VMware environments tend to ruin a TCO when using traditional backup applications. Leveraging source based deduplication can improve your TCO and ROI.

This is not to say that source based deduplication is the savior of the backup world. It is not. There are places where source based deduplication technologies are not the best fit. Very large environments with very high change rates and little duplicate data don’t tend to be good fits. However, if you attack the places that are a good fit for source based deduplication, you will create relief in your backup environment at the target and that will be good for everyone.  It is time to take backup, beyond.

Posted by Steve Kenniston
