Top 10 Big Data Trends for 2017

Tableau published a paper on Top 10 Big Data Trends for 2017 that you can find here: http://tabsoft.co/2jXCXar

We disagree that speeding up Hadoop is the number 1 trend. The author takes an evolutionary approach rather than a revolutionary one. What is needed more and more is event-driven processing: the machinery should react to new data as it arrives, because a small increment is easier to process than a full batch. Even the 2nd trend acknowledges, to some degree, that Big Data is not just about using Hadoop-like systems. Hadoop was written for batch processing.

Utilizing reactive streams:

When we were consulting for a publisher of electronic books and academic journals, the vendor used Amazon Elastic MapReduce (EMR). The vendor paid around 1.5 million annually for a very large Amazon EMR cluster that re-processed all books and journals on a daily basis. In a few instances an error occurred and the vendor had to re-run the whole job. We re-engineered the system so that only new content is processed, as soon as it goes “live”. The new system cut the company's Amazon EMR bill by 90%, and the newest books and journals now appear online within a few seconds of the content arriving at the vendor.
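
The difference between the two approaches can be sketched in a few lines of Scala. This is only an illustration under stated assumptions, not the publisher's actual system: the names Content, process, batchRun and Processor are hypothetical, the "expensive work" is a simple word count, and a production system would sit behind a reactive streams implementation rather than plain method calls.

```scala
object IncrementalProcessing {
  // Hypothetical content item; the fields are illustrative only.
  final case class Content(id: String, text: String)

  // Stand-in for the expensive per-item work (here: a word count).
  def process(c: Content): Int =
    c.text.split("\\s+").count(_.nonEmpty)

  // Batch style: every run re-processes the entire catalogue.
  // Returns the results and the number of items that were touched.
  def batchRun(catalogue: Seq[Content]): (Map[String, Int], Int) =
    (catalogue.map(c => c.id -> process(c)).toMap, catalogue.size)

  // Event-driven style: results are kept between events and only the
  // newly arrived item is processed.
  final class Processor {
    private var results = Map.empty[String, Int]
    private var touched = 0

    // React to a single new item.
    def onNew(c: Content): Unit = {
      results = results.updated(c.id, process(c))
      touched += 1
    }

    def snapshot: Map[String, Int] = results
    def itemsTouched: Int = touched
  }
}
```

With a catalogue of N items, each call to batchRun touches all N of them, while onNew touches exactly one. Both produce the same results for the same content, so the saving comes purely from not repeating work that was already done.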

The 5th trend on the list talks about the variety of data. But that is not new: Doug Laney of META Group (later acquired by Gartner) described volume, velocity and variety as the traits of Big Data back in 2001. What the paper does not elaborate on is HOW to tackle the variety.

The name Scala stands for “Scalable Language”: it borrows the best concepts from various programming languages, then enhances and extends them. This makes it easy to create a DSL (Domain Specific Language) on top of Scala. For example, to parse JSON of any kind, a programmer can write a few lines of code that match the JSON structure in an object-oriented way.
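
As a rough sketch of what “matching the JSON structure” can look like, the example below uses case classes and a custom extractor. Everything in it is an assumption made for illustration: the tiny Json model (JStr, JNum, JArr, JObj), the Book class and the BookJson extractor are hypothetical, and a real project would take its JSON model from a JSON library instead of defining one by hand.

```scala
object JsonMatching {
  // Minimal hypothetical JSON model, defined here only to keep the
  // example self-contained.
  sealed trait Json
  final case class JStr(value: String) extends Json
  final case class JNum(value: Double) extends Json
  final case class JArr(items: List[Json]) extends Json
  final case class JObj(fields: Map[String, Json]) extends Json

  // Domain object we want to get out of the JSON.
  final case class Book(title: String, authors: List[String])

  // Custom extractor: succeeds only for objects shaped like
  // {"title": "...", "authors": ["...", ...]}.
  object BookJson {
    def unapply(j: Json): Option[Book] = j match {
      case JObj(f) =>
        (f.get("title"), f.get("authors")) match {
          case (Some(JStr(t)), Some(JArr(as))) =>
            // Non-string entries in the authors array are ignored.
            Some(Book(t, as.collect { case JStr(a) => a }))
          case _ => None
        }
      case _ => None
    }
  }

  // The extractor is then used like any other pattern.
  def describe(j: Json): String = j match {
    case BookJson(Book(title, authors)) =>
      val names = authors.mkString(", ")
      s"$title by $names"
    case JArr(items) =>
      val n = items.size
      s"array of $n items"
    case _ =>
      "unrecognised"
  }
}
```

Each new kind of document gets its own small extractor, so variety in the data is handled by adding one more case to the match rather than by rewriting the parser.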

The paper prepares the reader for the trends of 2017, but it focuses too much on Hadoop infrastructure and not at all on the newer trends that aid Big Data, such as reactive streams, functional programming, elastic systems, non-blocking I/O and asynchronous processing of “live” data. Still, we recommend reading Tableau’s research.