Top 6 Cloud Data Warehousing Issues


Cloud data warehousing (CDW) platforms are rightly praised as out-of-the-box, enterprise-level solutions that outperform the on-premises server infrastructure most companies used to maintain. They cut infrastructure maintenance costs, simplify the scaling of data operations, and provide top-tier computing power.

That said, data warehousing ROI should not be taken for granted. A CDW-based data stack demands supervision, and it’s up to data scientists to work out how to optimize its performance while keeping expenditures low.

What Does Data Observability Have to Do with CDW Cost Optimization?

Apart from analyzing expensive queries and overprovisioned instances, strategic CDW optimization prioritizes troubleshooting data quality issues. Flawed, incorrect, or anomalous data comes at a price when it skews the statistical results that feed business analytics and inform decision-making. It can also cause data downtime and lead to revenue losses of tens or even hundreds of thousands of dollars per hour.

Data teams frequently rely on data observability tools to spot erroneous data points and intervene before they affect users. But the answer to “What is data observability?” shouldn’t be reduced to preventive troubleshooting and maintaining data quality.

These days, observability platforms extend data operations beyond anomaly and error detection. A top-tier data operations platform should broaden data teams’ view of data usage, helping them identify and minimize redundant data or block excessive read/rewrite operations. It should also enable consistent, fully automated maintenance of CDW cost efficiency.

6 Key Cloud Data Warehousing Issues to Tackle

Let’s consider which usual suspects demand your attention first if you want lean CDW resource consumption without sacrificing data accessibility or agile performance.

1. Lack of Data Health and Usage Monitoring

Cloud data spend often runs out of control simply because the data quality team didn’t set definite benchmarks and alert triggers to warn them about data usage and health issues. It is up to data leaders to define data quality standards, map use cases, and establish target resource usage.

Here is what to do first once you have settled on an assessment framework:

  1. Stay aware of query performance. Start by setting thresholds for query execution time. Analyze query durations and throttle rates to spot bottlenecks that need troubleshooting.
  2. Review logs. User activity logs expose redundant queries and instance access for in-depth analysis and optimization.
  3. Tag resources. Tagging clarifies who actually uses each data asset and who doesn’t.
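
The first two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production monitor: the QueryRecord shape, the sample log, and the 30-second threshold are all assumptions made for the example, and a real CDW exposes its query history in its own format. The normalization is deliberately crude (it lowercases the whole statement, including string literals).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryRecord:
    """One row of a hypothetical query-log export."""
    user: str
    sql: str
    seconds: float


def slow_queries(log, threshold_seconds):
    """Return the records whose execution time exceeds the threshold."""
    return [r for r in log if r.seconds > threshold_seconds]


def repeated_queries(log, min_runs=2):
    """Count statements that were run at least min_runs times.

    Statements are compared after lowercasing and collapsing whitespace.
    """
    counts = Counter(" ".join(r.sql.lower().split()) for r in log)
    return {sql: n for sql, n in counts.items() if n >= min_runs}


# Made-up sample data for illustration only.
LOG = [
    QueryRecord("ana", "SELECT * FROM orders", 42.0),
    QueryRecord("ben", "select *  from orders", 40.5),
    QueryRecord("ana", "SELECT id FROM users WHERE active", 1.2),
]

if __name__ == "__main__":
    for record in slow_queries(LOG, threshold_seconds=30):
        print(f"slow: {record.user} took {record.seconds}s: {record.sql}")
    print(repeated_queries(LOG))
```

Statements that show up in both lists, slow and frequently repeated, are usually the first candidates for tuning or caching.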

In modern enterprise data environments, you can delegate monitoring routines to AI-driven observability software. Such software maps your data infrastructure and learns normal usage patterns, flagging deviations automatically.
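
To make the idea of a learned baseline concrete, here is a deliberately simple sketch that compares a new daily usage figure against recent history using a z-score. It assumes usage is a single number per day and that a fixed threshold of three standard deviations is acceptable. Commercial observability tools use far richer models, so treat this as an illustration of the principle only.

```python
from statistics import mean, stdev


def is_anomalous(history, value, z_threshold=3.0):
    """Return True if value is far from the mean of history.

    history: past daily readings (at least two) that form the baseline.
    value:   the new reading to check against that baseline.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical readings")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any change at all counts as a deviation.
        return value != mu
    return abs(value - mu) / sigma > z_threshold


if __name__ == "__main__":
    # Made-up daily compute-credit usage for one warehouse.
    baseline = [100, 104, 98, 101, 97, 102, 99]
    for reading in (103, 180):
        print(reading, is_anomalous(baseline, reading))
```

Keeping the value under test out of the baseline matters: if the outlier is included in the history, it inflates the standard deviation and can hide itself.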

2. Overspending on ETL and BI Tools

Ingress/egress data transfer typically inflates the cost of ETL and BI tools. This is a recurring problem for enterprise storage heavily loaded with external API queries. You can work around it by implementing compute autoscaling, which keeps active processing power low during off-peak hours.

Also, try the following:

  1. Leverage a microservice architecture. Break data pipelines down into smaller, standalone services so that you can operate and scale them independently.
  2. Use preemptible instances. Review the structure of your regular workloads. In practice, running delayable or temporary processes on preemptible or spot instances can be up to 90% cheaper than consuming CDW resources on demand.
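
A rough way to see what the 90% figure quoted above could mean for a monthly bill is to price interruptible jobs at a discounted rate and everything else on demand. The workload names, hours, and hourly rate below are invented for the example, and the discount is treated as an upper bound rather than a guaranteed price.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """A hypothetical recurring job and how many compute hours it uses."""
    name: str
    hours_per_month: float
    interruptible: bool  # True if the job can be delayed or restarted


def estimate_monthly_cost(workloads, on_demand_rate, spot_discount):
    """Price interruptible jobs at the discounted rate, the rest on demand.

    spot_discount is a fraction between 0 and 1 (0.9 means 90% cheaper).
    """
    total = 0.0
    for w in workloads:
        rate = on_demand_rate
        if w.interruptible:
            rate = on_demand_rate * (1 - spot_discount)
        total += w.hours_per_month * rate
    return total


if __name__ == "__main__":
    jobs = [
        Workload("nightly_backfill", 200, interruptible=True),
        Workload("dashboards", 300, interruptible=False),
    ]
    all_on_demand = estimate_monthly_cost(jobs, on_demand_rate=2.0, spot_discount=0.0)
    with_spot = estimate_monthly_cost(jobs, on_demand_rate=2.0, spot_discount=0.9)
    print(f"all on demand: {all_on_demand:.2f}, with spot: {with_spot:.2f}")
```

The estimate ignores the cost of interruptions and retries, so real savings will be lower than the headline discount suggests.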

3. Excessive Consumption of Storage Capacity

Find out how much of your current data volume is actually used for immediate workflows, and you can significantly reduce CDW storage consumption. In practice, many companies overlook non-production and test environments that run around the clock without delivering any value. You can shut some of them down immediately, while others can be scheduled to start automatically only when needed.
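
Scheduling a test environment can be as simple as a rule that says when it is allowed to run. The sketch below assumes a weekday window from 08:00 to 19:00, which is an arbitrary choice for the example. Actually starting and stopping the environment would go through your provider's own tooling, which is not shown here.

```python
from datetime import datetime


def should_be_running(now, start_hour=8, end_hour=19):
    """True on weekdays from start_hour (inclusive) to end_hour (exclusive)."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start_hour <= now.hour < end_hour


def weekly_hours_saved(start_hour=8, end_hour=19):
    """Hours per week the environment is off compared with running 24/7."""
    return 7 * 24 - 5 * (end_hour - start_hour)


if __name__ == "__main__":
    print(should_be_running(datetime.now()))
    print(weekly_hours_saved())
```

With the default window, the environment runs 55 of the 168 hours in a week, so roughly two thirds of its always-on compute time is avoided.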

4. Excessive Processing

Caching is one of the most impactful techniques for eliminating repetitive processing and cutting monthly CDW bills. Use caching in three ways that bring the most benefit:

  1. Cache raw data. Raw assets are requested by many processes, so keep them readily available to cut computing expenses. Caching them stops the repeated retrieval that racks up bandwidth charges.
  2. Cache intermediate data. The same goes for data sets that go through incremental computation. Make those intermediate values available straight from the cache.
  3. Cache query results. Some expensive queries should be cached as well, especially those that are rerun frequently with consistent parameters.
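
The third item, caching query results, might look like the following in application code. This is a sketch under stated assumptions: QueryResultCache and its run_query callback are hypothetical names, results are held in process memory, and entries expire after a fixed time-to-live. Many CDW platforms also offer built-in result caching, which should be preferred where it fits.

```python
import time


class QueryResultCache:
    """Cache results keyed by SQL text and parameters, with a time-to-live."""

    def __init__(self, run_query, ttl_seconds=300.0, clock=time.monotonic):
        self._run_query = run_query  # callable(sql, params) -> result
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # (sql, params) -> (stored_at, result)

    def get(self, sql, params=()):
        """Return a cached result if it is still fresh, else run the query."""
        key = (sql, tuple(params))   # params must be hashable values
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        result = self._run_query(sql, params)
        self._entries[key] = (now, result)
        return result


if __name__ == "__main__":
    def expensive_query(sql, params):
        # Stand-in for a real warehouse call.
        print("running:", sql, params)
        return [("eu", 1234)]

    cache = QueryResultCache(expensive_query, ttl_seconds=60)
    cache.get("SELECT region, total FROM sales WHERE year = ?", (2024,))
    cache.get("SELECT region, total FROM sales WHERE year = ?", (2024,))
```

Because the key includes the parameters, the cache only pays off for queries that are repeated with the same arguments, which matches the condition stated in the list.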

Another go-to piece of advice for minimizing cloud computing expenditure is to tune SQL queries. In particular, data engineers parallelize long-running queries by taking advantage of massively parallel processing (MPP) architecture. Parametrizing queries and lowering query timeouts are also common practices.
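
As an illustration of parametrizing, the sketch below binds values to placeholders instead of formatting them into the SQL string. It uses Python's built-in sqlite3 module purely as a stand-in for a warehouse client; the orders table and its columns are invented for the example, and placeholder syntax differs between drivers.

```python
import sqlite3


def demo_connection():
    """Create an in-memory database with a small, made-up orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (region, amount) VALUES (?, ?)",
        [("eu", 10.0), ("us", 25.5), ("eu", 7.25)],
    )
    return conn


def orders_for_region(conn, region, limit):
    """Run one parametrized statement; values are bound, not interpolated."""
    sql = "SELECT id, amount FROM orders WHERE region = ? ORDER BY id LIMIT ?"
    return conn.execute(sql, (region, limit)).fetchall()


if __name__ == "__main__":
    connection = demo_connection()
    print(orders_for_region(connection, "eu", 10))
```

Keeping the statement text constant across calls also makes repeated queries easier to recognize in logs and easier to cache.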

5. Inefficient Data Architecture

Targeted, meaningful architecture improvements boost the accessibility of data assets and eliminate many sources of delay. Most professionals agree on the efficiency of these practices:

  1. Adoption of mesh architecture. A data mesh lets teams manage their own repositories and domains autonomously. Thus, they can concentrate on cost management for those decentralized domains without running into central bottlenecks.
  2. Data lake implementation. Data lakes are a cheaper option for storing your raw data. Use them while keeping your analysis-ready data in the CDW.
  3. Right-sizing cluster nodes. Enable auto-scaling to provision properly sized nodes and shut down unused ones.

6. Unwanted Operational Overhead

If many data flows can run smoothly without manual fine-tuning, you are better off moving them to managed CDW platforms. Transfer supplementary, non-critical workloads to managed services. Over time, you cut a big chunk of operational costs and can focus engineering effort on the areas worth developing and enhancing.
