I’ve said that Big Data analysis needs to become mainstream and reach small(er) industries, since until now “big data” has been applied only to massive volumes of data. With Hadoop becoming more popular (and easier to use too), new opportunities for data mining, and for data analytics in general, will certainly emerge. But will it be big data analysis?
One of the defining characteristics of big data is that sometimes you need more than just labeling it BIG. Forbes has a blog post that asks a few questions about the data. The interesting aspect of those 4 questions is that they can easily be summed up in 2 points:
1st) Big Data is by nature complex, with intricate structures and relations among its components, in such a way that this entanglement needs long computations to grasp its inner aspects.
2nd) Big Data is only big data if the time those computations take is a critical aspect for the industry trying to deal with it. If that’s not the case, the author claims, it’s just a matter of “large data” analysis. In many contexts, time being critical means real-time or quasi real-time data processing.
The author goes on to state that, according to these criteria, very few industries really process Big Data. I tend to disagree a bit here. It’s my belief that the two points presented about Big Data are so correlated that you’ll indeed have “Big Data” at many scales: the space of exploration isn’t Volume or Time alone but a time-volume space, and therefore we’ll be able to find examples of big data at different scales. In any case, I totally agree that the two points are the questions one must ask to see whether one’s data fits the “big data” label.