Wednesday, 09 September 2020 14:16

How we learned (the hard way) to de-pollute our data lakes

Our analytics ambitions may have taken a temporary hit, but a cleaner environment brings us closer to realising value

VENDOR NEWS:  When the world embraced data science, it was hooked on the promise of data - or perhaps, more accurately, on the promise of unrealised insights within that data.

But if the past couple of years have taught us anything, it’s that the process of extracting intelligence is neither as simple nor as straightforward as it looks.

Unfortunately, we only have ourselves to blame for this.

Most organisations have tended to collect and dump data into large repositories. In the data science era, those repositories are most likely data lakes.

The challenge is greatest for those of us with larger historical data stores, who in theory have the most to gain from an investment in data science. Unless we’ve consistently applied a standard set of rules to the way we collected and stored data over the years, from standard fields to schemas, it’s quite likely the data isn’t ‘clean’ enough to be usable, and the data lake becomes a data swamp.
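To make "a standard set of rules" concrete, here is a minimal Python sketch of checking records against an agreed schema before they land in the lake. It is an illustration only: the field names, the schema and the `partition` helper are hypothetical and do not come from Seagate, IDC or any particular product.

```python
from datetime import datetime

# Hypothetical schema: field name -> (expected type, required?)
SCHEMA = {
    "customer_id": (int, True),
    "email": (str, True),
    "signup_date": (str, True),  # expected as YYYY-MM-DD
    "region": (str, False),
}


def validate_record(record):
    """Return a list of problems found in one record (empty means clean)."""
    problems = []

    # Check presence and type of every field the schema defines.
    for field, (expected_type, required) in SCHEMA.items():
        value = record.get(field)
        if value is None:
            if required:
                problems.append(f"missing required field: {field}")
            continue
        if not isinstance(value, expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )

    # Check the date format only when the field is present as a string.
    date = record.get("signup_date")
    if isinstance(date, str):
        try:
            datetime.strptime(date, "%Y-%m-%d")
        except ValueError:
            problems.append("signup_date: not in YYYY-MM-DD format")

    # Flag fields the schema does not know about.
    for field in record:
        if field not in SCHEMA:
            problems.append(f"unexpected field: {field}")

    return problems


def partition(records):
    """Split records into clean ones and (record, problems) rejects."""
    clean, rejected = [], []
    for record in records:
        problems = validate_record(record)
        if problems:
            rejected.append((record, problems))
        else:
            clean.append(record)
    return clean, rejected
```

Run at the point of collection, a check like this keeps non-conforming records out of the main store and leaves a list of reasons that can be used to fix the source system, rather than leaving the cleanup for a later data science project.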

And so, for many companies that have gone down this path, the early years have been less about science than about organisation and decluttering. The time we’d hoped to spend unlocking growth opportunities has gone into cleaning up instead. It has still been time well spent, but we may have had to adjust the timelines and breadth of our ambitions.

In a sense, this continues. IDC’s research in Seagate’s ‘Rethink Data’ report shows that “making collected data usable” is still the number one barrier to putting data to work. Organisations may find comfort in knowing many of their peers are in the same boat.

Structure is important not just for our historical data stores, but also for consistency of the data we generate now.

It’s also crucial to our understanding of why we collect data in the first place, and more specifically what we should collect and what we hypothesise might be useful (either in a raw form, or transformed through data analytics).

Business leaders’ first task is to define why they want to collect data and what insights they are trying to gain. Only once they are clear on that agenda should they go after that particular goal, then figure out which data they need to collect and what intelligence they can get out of it.

Collecting data must be about what enables the business objective (what business owners want to learn). Unless they figure this out, amassing data won’t provide them with the value they expect.

A key solution Seagate proposes for these data storage management challenges concerns how business owners see their stored data. The idea is to see it - all of it - as if through a single pane of glass, regardless of which system or storage medium houses it.

If business owners can achieve this unobstructed, easy view into their data, then genuine data ownership will begin.






