It is shaping up to be another year of IT budget pressure from data growth, especially the ongoing growth of unstructured data. Data can cost a large enterprise millions annually to store, back up and replicate for disaster recovery. Then there is the cost of replacing outdated legacy storage systems with ones that can fulfill the security requirements and processing speeds that modern workforces require, especially in the age of AI. The Komprise 2024 State of Unstructured Data Management report found that 55% of organizations are spending 30% or more of their budget on data storage.
Cybersecurity is also driving data management costs, as unused unstructured data tends to be the weakest link for ransomware attacks. This prompts organizations to keep extra copies of that data, which at petabyte scale gets very expensive.
Gartner predicts a 23% growth in IT spending on data center systems in 2025, which sounds high until you realize it is largely due to AI infrastructure investments and price increases on recurring spending. The potential for higher tariff-induced pricing is a valid concern for IT software and equipment. Last year, IDC predicted that worldwide spending on public cloud services would reach $805 billion in 2024 and double by 2028. Storage is a significant percentage of this overall spending.
While investing in data storage infrastructure is essential to running a business and serving customers, paying more to do the same thing means spending less on other initiatives that benefit the business: building websites and apps, maintaining sound cybersecurity defenses, investing in AI and data analytics, and developing high-performing productivity suites for employees. Whatever its priorities, the goal for any IT organization is to reduce waste and optimize costs.
Benjamin Henry, Field CTO, Komprise, answers a few questions about optimizing data storage costs, whether in the cloud or in the data center.
How can companies balance the need to cut data storage costs while still managing the increasing volume of data?
By focusing on optimization, not just cost-cutting, you can make decisions that right-place data for current needs and save money without adversely affecting users. Start by evaluating your current storage practices and identifying areas of potential waste. Most (80%) of the data in a typical organization is cold and has not been used in years, yet it continues to consume expensive storage and backup resources. By offloading this cold data from expensive storage and backups, organizations can save at least 70%. Implement data tiering, which retains active data on high-performance systems while moving rarely accessed cold data to lower-cost options such as object storage in the cloud. File tiering, which offloads entire files while retaining user access from the original location, also avoids user disruption and shrinks the ransomware attack surface. Clean up duplicate and orphaned data, and delete data that is many years old and not needed for compliance or other reasons. This can quickly free up storage capacity.
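As a rough illustration of that first analysis step, here is a minimal Python sketch that walks a file share and totals up capacity that has not been accessed in a given number of years. The mount point and the three-year threshold are assumptions for the example; a data management platform would do this continuously and across all silos.

```python
import os
import time

# Assumed example values; adjust for your environment.
SHARE_ROOT = "/mnt/corp_share"       # hypothetical file share mount
COLD_AFTER_YEARS = 3                 # "cold" threshold used in this sketch

cutoff = time.time() - COLD_AFTER_YEARS * 365 * 24 * 3600
cold_bytes = total_bytes = 0

for dirpath, _dirnames, filenames in os.walk(SHARE_ROOT):
    for name in filenames:
        path = os.path.join(dirpath, name)
        try:
            st = os.stat(path)
        except OSError:
            continue  # skip files we cannot read
        total_bytes += st.st_size
        # st_atime is the last access time, which many filers expose over NFS/SMB
        if st.st_atime < cutoff:
            cold_bytes += st.st_size

print(f"Total: {total_bytes / 1e12:.2f} TB, cold (> {COLD_AFTER_YEARS} yrs): "
      f"{cold_bytes / 1e12:.2f} TB ({100 * cold_bytes / max(total_bytes, 1):.0f}%)")
```

Multiplying the cold capacity by the per-TB cost of primary storage plus backups gives a first-order estimate of the potential savings from tiering.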
How do you get started?
Look to attain holistic visibility of data across hybrid cloud storage, then start collecting and tracking new metrics: how much and what types of data you have, where it is stored, what it costs to store and back up, top data owners, common file types and sizes, data growth rates, and data access trends. Armed with these insights, you can analyze and model new plans for data tiering, data migration and so on. The analytics are also vital to gaining departmental support, as most users don't really know how much data they are generating and storing, or what it is costing the business.
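For a sense of what some of those metrics look like, here is a minimal sketch that aggregates capacity by file type and by owner on a Unix-style file share. The mount point is a placeholder, and a full data management platform would layer on costs, growth rates and access trends across every silo.

```python
import os
from collections import Counter

SHARE_ROOT = "/mnt/corp_share"   # hypothetical mount point

by_ext = Counter()    # bytes per file extension
by_owner = Counter()  # bytes per owning uid

for dirpath, _dirs, files in os.walk(SHARE_ROOT):
    for name in files:
        try:
            st = os.stat(os.path.join(dirpath, name))
        except OSError:
            continue
        ext = os.path.splitext(name)[1].lower() or "<none>"
        by_ext[ext] += st.st_size
        by_owner[st.st_uid] += st.st_size

print("Top file types by capacity:")
for ext, size in by_ext.most_common(5):
    print(f"  {ext:10s} {size / 1e9:,.1f} GB")

print("Top owners by capacity (uid):")
for uid, size in by_owner.most_common(5):
    print(f"  {uid:<10d} {size / 1e9:,.1f} GB")
```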
What role does cloud storage play in reducing data storage costs, and when should it be considered?
Cloud storage can be a powerful lever for controlling costs, thanks to its scalability, flexibility and pay-by-the-drink pricing. Cloud providers offer many storage classes that can be aligned with your data's access frequency and compliance needs, further optimizing costs. Cloud storage is ideal for migrating large volumes of data in a modernization effort or for archiving rarely accessed cold data from high-performance on-premises systems. Be sure to choose a tiering approach that is not storage-specific and does not require you to bring all the files back (rehydration); that can result in cloud data egress fees and, worse, require you to buy more of a vendor's storage in order to leave them.
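As one concrete example of aligning storage class with access frequency, the sketch below uses AWS's boto3 SDK to attach a lifecycle rule that transitions objects under an archive prefix to colder classes as they age. The bucket name, prefix and day thresholds are illustrative assumptions; other clouds offer equivalent policies.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket and prefix; the day thresholds are illustrative only.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-tiered-data",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-cold-files",
                "Filter": {"Prefix": "archive/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},    # infrequent access
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},  # long-term archive
                ],
            }
        ]
    },
)
```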
Can you explain more about rehydration and when that occurs?
It's critical to tier data in ways that avoid or minimize rehydration costs. Rehydration occurs with storage vendors' own data management features: because of the proprietary block-tiering methods they employ, if you need to migrate to another storage platform or service, you must first rehydrate all of your tiered data back from the cloud to primary storage. Hardware storage refreshes typically occur every three to five years, so this exercise can be quite costly and time-consuming, and it requires IT to acquire storage capacity in the data center before moving data to the new vendor. Using a storage-agnostic data management solution with file-level tiering means you can move your data directly and avoid rehydration.
Last year, the big cloud providers (AWS, Microsoft, Google) said they were eliminating egress fees, but what’s the reality?
While cloud providers have indeed made some strides in reducing or eliminating certain egress fees, the reality is more nuanced. Many of the announced “no egress fee” policies only apply under specific conditions, such as data transfers between services within the same region or within their own ecosystem (e.g., AWS to AWS or Microsoft to Microsoft) or if you are pulling all your data out of the cloud and cancelling the subscription. In some cases, there are still costs for transferring data out of a cloud provider’s network to external data centers, between different clouds, or even between regions. It’s essential to understand the fine print of each provider’s pricing model and ensure that any cross-cloud or cross-region transfers are clearly mapped out to avoid surprises.
How can CIOs and CXOs manage the risks of unexpected egress costs?
First, IT leaders can carefully review the pricing structures of each cloud provider. Don't move data to the cloud if it may need to be recalled frequently. Keeping data and workloads within the same cloud provider and region as much as possible minimizes inter-cloud and inter-region transfers. Some cloud providers offer lower rates or even waive egress fees in exchange for a committed volume of data transfer or a longer-term contract. Deduplication ensures that you only move unique data, which reduces egress fees. For frequently accessed data, a content delivery network (CDN) can cache and serve data at the edge, reducing the need for repeated, costly egress from the cloud. Finally, data management solutions that deliver native access to data in the cloud after tiering it can eliminate egress fees, since users can see their data at the target and move it without penalty to another service in the same cloud, such as a data lake or an AI/ML service.
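To see how a couple of those levers interact, here is a simple, illustrative cost model. The per-GB rate is a placeholder rather than any provider's actual price, and a real estimate would also account for CDN delivery fees and committed-use discounts.

```python
# Illustrative egress cost estimate; the rate below is an assumed placeholder,
# not a quoted price. Check your provider's current pricing and contract terms.
EGRESS_RATE_PER_GB = 0.09          # assumed internet egress rate, USD/GB

def monthly_egress_cost(gb_recalled_per_month: float,
                        dedup_ratio: float = 1.0,
                        cdn_hit_rate: float = 0.0) -> float:
    """Estimate monthly egress spend.

    dedup_ratio  -- e.g. 2.0 means only half the logical data leaves the cloud
    cdn_hit_rate -- fraction of reads served from a CDN edge instead of origin
    """
    unique_gb = gb_recalled_per_month / dedup_ratio
    origin_gb = unique_gb * (1.0 - cdn_hit_rate)
    return origin_gb * EGRESS_RATE_PER_GB

# 50 TB recalled a month, 2:1 dedup, 60% of reads served from a CDN:
print(f"${monthly_egress_cost(50_000, dedup_ratio=2.0, cdn_hit_rate=0.6):,.0f}/month")
```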
How can data retention policies help control storage costs?
Instituting clear data retention policies can significantly reduce unnecessary storage usage. Regulations, such as those in healthcare, may dictate some of these, but don't overlook the opportunity to set other policies based on usage patterns. With retention policies in place, you can use an unstructured data management solution to systematically tier, or flag for deletion, data that is no longer needed, such as old log files, transaction records or backup copies that no longer serve a purpose. This reduces the burden on expensive primary storage and lowers the risk of keeping inactive data on old appliances that hackers can breach to break into the corporate network.
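One lightweight way to express such policies is as data that a script or data management tool can evaluate. The categories and retention periods below are hypothetical examples rather than recommendations; regulated data should follow the applicable regulation.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical retention policies, in years, by data category.
RETENTION_YEARS = {
    "app_logs": 1,
    "transaction_records": 7,
    "backup_copies": 3,
}

def is_expired(category: str, last_modified: datetime,
               now: Optional[datetime] = None) -> bool:
    """Return True if the item has outlived its retention window."""
    now = now or datetime.now()
    years = RETENTION_YEARS.get(category)
    if years is None:
        return False  # no policy defined -> keep by default
    return last_modified < now - timedelta(days=365 * years)

# Example: a log file last touched in 2021 is well past its one-year window.
print(is_expired("app_logs", datetime(2021, 6, 1)))
```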
How does data lifecycle management contribute to lowering storage costs?
Data lifecycle management (DLM) automates the process of moving data through storage tiers as it ages, from high-performance storage to cheaper archival options, so you're not wasting resources storing cold or historical data on expensive systems. You can move data to different tiers and technologies as it ages or takes on new compliance requirements. Some data sets, such as clinical images, are rarely accessed after 30-45 days, so you can archive them rather than letting them consume expensive storage. IT can use a storage-agnostic data management solution to automatically move or delete data according to predefined rules. Implementing DLM also streamlines compliance by ensuring data is retained as needed and securely disposed of when no longer required.
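A DLM schedule can be stated declaratively, as in this small sketch that maps a file's age to a target tier. The tier names and age thresholds are assumptions chosen only to illustrate the idea.

```python
# Illustrative lifecycle schedule: age thresholds (in days) mapped to tiers.
LIFECYCLE_RULES = [
    (45, "performance_nas"),     # active data stays on primary storage
    (365, "object_storage"),     # e.g., clinical images after ~45 days
    (7 * 365, "cloud_archive"),  # long-tail retention
]

def target_tier(age_days: int) -> str:
    """Pick the first tier whose age threshold the data has not yet exceeded."""
    for max_age, tier in LIFECYCLE_RULES:
        if age_days <= max_age:
            return tier
    return "delete_or_review"  # beyond all thresholds: candidate for disposal

print(target_tier(30))    # performance_nas
print(target_tier(200))   # object_storage
print(target_tier(4000))  # delete_or_review
```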
How do you maintain flexibility to adopt new, more efficient storage?
The days of using one or two core storage vendors are over. With so much choice and innovation, it's critical to have in-house experts and the right partners who can keep their eyes on new developments. This way, your organization can move data to the latest advances such as QLC flash, Storage as a Service (STaaS) and cloud storage. Aim to avoid locking data into a single vendor's proprietary storage format, so you can remain agile, optimize costs and easily shift to new technologies from any vendor. Managing data requires a different approach than managing storage. Ultimately, it's about staying in control: having the right level of visibility across all potential storage silos, and the mobility to ensure the right data workloads are in the right place at the right time.
By Randy Ferguson