
Friday, 10 September 2021 12:07

Six factors to consider when undertaking data science in the cloud

By Rishu Saxena, Snowflake
Rishu Saxena, Data Platform Architect, Field CTO Office, Snowflake

GUEST OPINION by Rishu Saxena, Data Platform Architect, Field CTO Office, Snowflake: Shifting from on-premises IT infrastructure to cloud-based resources has changed the game in data science. Analytics teams now have access to a massive amount of elastic computing power and numerous sources of internal and external data.

The change has also delivered access to managed cloud services that reduce the complexity of building, training, and deploying machine-learning and deep-learning models at scale.

However, challenges remain. Data scientists, data engineers, and developers still have to learn and adapt to this new environment, and they must choose from an ever-expanding and rapidly evolving ecosystem of tools and frameworks. Many are simply learning as they go.

The very capabilities that make the cloud so exciting also create potential pitfalls. The ease of copying data across diverse systems can create governance challenges if not handled correctly, and the speed of change means data teams can bet on the wrong tool or framework and become stranded.

To reduce the chance of problems and ensure it achieves the results it expects, an organisation should consider six key factors:

1. Make data governance a top priority:
It’s very important to enable iteration and investigation without compromising governance and security. For example, many data scientists intuitively want to copy a dataset before they start working on it. But it’s too easy to make copies, move on and forget they exist, creating a nightmare in terms of compliance, security, and privacy.

2. Forget your preconceptions:
If you’re coming from an on-premises world, you’ll often bring perceptions and biases about infrastructure that no longer apply to modern platforms in the cloud. Approach the cloud from first principles and start with what you want to achieve, not what you think is possible. That’s the only way to push the boundaries and take full advantage of this new environment.

3. Be careful not to create new data silos:
Data silos are closely tied to data governance. In the cloud, it’s important not to replicate the fragmentation that’s common in the on-premises world. The proliferation of tools, platforms and vendors is great for innovation, but it can also lead to redundant, inconsistent data being stored in multiple locations.

4. Maintain an open mind:
One of the exciting things in data science is that frameworks and tools are evolving at an incredible pace. However, it’s critical not to get locked into an approach that limits future options when technologies fall in and out of favour. Choose a data platform that won’t tie you into one framework or one way of doing things, with an extensible architecture that can accommodate new tools and technologies as they come along.

5. Incorporate additional data sources:
Cloud platforms make it significantly easier to incorporate external data from partners and data-service providers into existing models. This has been particularly important during the past year, as businesses sought to understand how COVID-19, fluctuations in the economy, and the resulting changes in consumer behaviour would affect them.

Some organisations used data about local infection rates, foot traffic in stores, and signals from social media to predict buying patterns and forecast inventory needs. Consider what data you could be using.

6. Reduce your complexity:
AI technologies like machine learning and deep learning are immensely powerful and have a critical role to play for certain business needs, but they’re not right for every problem. Always start with the simplest option and increase complexity as needed.

Try a simple linear regression or look at averages and medians. Check how accurate the predictions are, and whether the return on improving that accuracy justifies a more complex approach.
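As a rough illustration of starting simple, the sketch below compares three baselines (the mean, the median, and a one-variable linear regression) on a small held-out sample. The data, the function names, and the choice of mean absolute error as the accuracy measure are all assumptions made for this example, not something prescribed by the article or tied to any particular platform.

```python
from statistics import mean, median


def fit_line(xs, ys):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def compare_baselines(train_x, train_y, test_x, test_y):
    """Score each simple baseline on the held-out data."""
    slope, intercept = fit_line(train_x, train_y)
    candidates = {
        "mean": [mean(train_y)] * len(test_y),
        "median": [median(train_y)] * len(test_y),
        "linear_regression": [slope * x + intercept for x in test_x],
    }
    return {
        name: mean_absolute_error(test_y, preds)
        for name, preds in candidates.items()
    }


# Hypothetical data: weekly store foot traffic (x) and units sold (y).
train_x = [100, 120, 140, 160, 180, 200]
train_y = [52, 61, 69, 82, 90, 101]
test_x = [110, 150, 190]
test_y = [56, 75, 96]

errors = compare_baselines(train_x, train_y, test_x, test_y)
for name, err in sorted(errors.items(), key=lambda item: item[1]):
    print(f"{name}: MAE = {err:.2f}")
```

If the simplest baseline is already accurate enough for the business decision at hand, that is a signal the extra cost of a more complex model may not pay off. If it is not, the baseline error gives a concrete number that any more sophisticated approach has to beat.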

Analytics tools are quickly becoming more powerful and able to deliver ever-greater benefits to the organisations putting them to work. By adding cloud resources to the picture, data science teams will be able to take their work even further in the future.

Data science plus the cloud is a very powerful mix.
