If you are wondering whether I came back from the conference with definitive answers - sorry, but no. The BigData community is still a very amorphous structure with a lot of ideas/tools/concepts floating around. Unfortunately, some software vendors (let's not name them, please) are also trying to fish in those muddy waters, and as a result some presentations ended up being about 90% marketing and 10% content. That also complicated my attempts to get "the big picture". Still, there are some common ideas floating around:
- People should not mix up BigData and NoSQL, but they do! And that's one of the main areas of confusion:
- BigData (Hadoop+MapReduce) is an extension of good old data mining; you can just mine much more data much faster on much cheaper hardware. It makes life easier because you can first store data in whatever way/shape/form it comes and try to make use of it later - there is no need to define a structure beforehand.
- NoSQL is the environment where you can WORK with that (non)(semi)-structured data very efficiently from the very beginning. Minor problem - each existing solution is optimized for a pretty narrow set of tasks. Yes, it is well optimized, but you need to think hard to select the right tool for the right problem.
- Hadoop is huge as a storage mechanism. Minor problem - as far as I understood, anything below a 50-node cluster is considered a toy box. So you need to have a real problem to solve with it, because starting budgets are $300k+.
- NoSQL usually comes in conjunction with a regular RDBMS and rarely standalone, especially if you care about high reliability/control/audit etc. It is considered a side performance booster - but it can also mask MAJOR problems in the RDBMS. I've heard a number of anecdotes where IT staff introduced NoSQL for performance reasons, but at some point hired Oracle performance experts - and then scrapped the NoSQL solution, because the Oracle RDBMS started to crunch data as fast as was needed. So, if you expect your bad coders to write good code just because of a technology change - you may not be 100% right.
- Toolsets are in disarray - and everybody has their own preferences. There is a lo-o-o-o-ng way to go for this environment to mature.
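
To make the "store first, define the structure later" point from the BigData bullet a bit more concrete, here is a minimal sketch in plain Python. It is not Hadoop code and nothing like it was shown at the conference; the records, field names and function names are all made up for illustration. The idea is only that raw lines are kept exactly as they arrived, and a structure is imposed at read time by a map step, followed by a reduce step that sums per key.

```python
import json
from collections import defaultdict

# Hypothetical raw records, stored exactly as they arrived (no shared schema).
RAW_LINES = [
    '{"user": "ann", "action": "login"}',
    '{"user": "bob", "action": "purchase", "amount": 20.0}',
    '{"user": "ann", "action": "purchase", "amount": 5.5, "coupon": "X1"}',
    'not even json',
    '{"device": "sensor-7", "temp": 21.3}',
]


def map_purchases(line):
    """Map step: interpret one raw line and emit (user, amount) pairs."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return  # unparseable line: emit nothing
    if not isinstance(record, dict):
        return
    # The "schema" lives here, in the reader, not in the storage layer.
    if record.get("action") == "purchase" and "user" in record:
        yield record["user"], float(record.get("amount", 0.0))


def reduce_sum(pairs):
    """Reduce step: sum the values for each key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def total_purchases(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = (pair for line in lines for pair in map_purchases(line))
    return reduce_sum(pairs)


if __name__ == "__main__":
    print(total_purchases(RAW_LINES))
```

If a different question comes up later (say, average temperature per device), only a new map function is needed; the stored lines stay untouched. On a real cluster the same two steps would be spread over many nodes, which this toy version does not attempt.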