For example, storage vendor EMC positioned the Isilon systems it announced last month as the place to store a data lake.
But analyst group Gartner raised a warning flag at the end of July, suggesting its clients should beware of the data lake fallacy.
"The idea is simple: instead of placing data in a purpose-built data store, you move it into a data lake in its original format. This eliminates the upfront costs of data ingestion, like transformation. Once data is placed into the lake, it's available for analysis by everyone in the organisation," explained research director Nick Heudecker.
"The fundamental issue with the data lake is that it makes certain assumptions about the users of information," said Mr Heudecker. "It assumes that users recognise or understand the contextual bias of how data is captured, that they know how to merge and reconcile different data sources without 'a priori knowledge' and that they understand the incomplete nature of datasets, regardless of structure."
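As a rough illustration of the idea Mr Heudecker describes, the following minimal Python sketch stores records in their original format and applies a schema only when they are read. The file name `calls.jsonl`, the field `duration_s` and both helper functions are hypothetical, invented for this example rather than taken from Gartner or any vendor product.

```python
import json
import tempfile
from pathlib import Path


def ingest_raw(lake_dir: Path, name: str, payload: str) -> Path:
    """Store the payload exactly as received, with no transformation."""
    path = lake_dir / name
    path.write_text(payload, encoding="utf-8")
    return path


def read_call_durations(path: Path) -> list[float]:
    """Apply a schema at read time.

    The reader has to know already that each line is a JSON object and
    that 'duration_s' (a hypothetical field) holds seconds; nothing in
    the lake itself records that context.
    """
    durations = []
    for line in path.read_text(encoding="utf-8").splitlines():
        record = json.loads(line)
        durations.append(float(record["duration_s"]))
    return durations


# A temporary directory stands in for the lake in this sketch.
lake = Path(tempfile.mkdtemp())
raw = '{"caller": "A", "duration_s": 42}\n{"caller": "B", "duration_s": 7.5}\n'
stored = ingest_raw(lake, "calls.jsonl", raw)
durations = read_call_durations(stored)
```

Ingestion costs almost nothing here, but every reader must supply its own knowledge of field names and units, which is the assumption about users that Gartner questions.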
Gartner subscribers can access the company's report The Data Lake Fallacy: All Water and Little Substance.
The industry is naturally more optimistic.
EMC Isilon division CTO for APJ Charles Sevior told iTWire that organisations typically purchase storage to support a particular workload, and then find they can use the data with another application. The trick is to do that without disturbing the original application and without duplicating the data - hence the idea of a data lake that's accessible to multiple applications.
People are realising there is value in 'digital exhaust' - the massive data sets of call records, log files, surveillance video and much more - if they have the ability to extract useful and meaningful information from it.
But there's no substitute for human intelligence and curation, said Mr Sevior, adding that if he ran a business that depended on data, consulting a data scientist would be high on his agenda, if only for a preliminary investigation that would reveal how the data could be brought to bear on the business's processes.
"We are getting tremendous support" for the data lake concept, he claimed, as people wanted an easier way into this type of analysis than Hadoop offers.
Cloudera's concept of the enterprise data hub appears to be the equivalent of a data lake.
This approach makes it easy for users to access the data they need, and at the same time "our SQL is good enough for 90% of the workloads in the enterprise."
Organisations need to be agile and innovative as well as efficient and well-governed, Mr Awadallah observed.
Image: incorporates a public domain photograph via Wikimedia Commons