JAOO / Goto Conference 2010: NoSQL - an Overview (Emil Eifrem)
This talk gave a general introduction to NoSQL. Emil is CEO of Neo Technology, the company behind the Neo4j NoSQL database.
Emil started by talking about the name "NoSQL":
- He is unhappy about it - like almost everyone else
- It is not "No to SQL"
- It is also not "Never SQL"
- It is rather "Not only SQL"
He gave four reasons why NoSQL is important:
- The exponential growth in data. I did some research: according to IDC, the amount of data grew by 62% in 2009 to 800 billion gigabytes (0.8 zettabytes), and in 2010 we will create 1.2 zettabytes. This is clearly exponential growth - something we should be afraid of.
- We are seeing more and more connected data. While we used to have text documents, it is now about hypertext, blogs, user-generated content etc.
- The data is also more and more semi-structured. User-generated content is a good example again. And we increasingly look for information using full-text search rather than detailed queries.
- The architecture is changing from integration through a common database to individual systems, each with its own private database. This enables specific databases for specific challenges.
It is unlikely that relational stores will solve these problems. That said, NoSQL won't replace relational stores either: they solve different problems, and there is more than enough data for all kinds of databases.
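To make the growth figures from the first reason a bit more tangible, here is a small back-of-the-envelope calculation. It only uses the numbers quoted above; the assumption that growth simply continues at the 2010 rate is mine, purely for illustration, and is not something claimed in the talk.

```python
import math

# Figures as quoted above (attributed to IDC), in zettabytes.
size_2009 = 0.8       # total at the end of 2009
growth_2009 = 0.62    # 62% growth during 2009
size_2010 = 1.2       # expected for 2010

# Implied size before the 2009 growth: 0.8 / 1.62
size_2008 = size_2009 / (1 + growth_2009)

# Implied growth rate for 2010: 1.2 / 0.8 - 1
growth_2010 = size_2010 / size_2009 - 1


def years_to_double(rate):
    """Doubling time under a constant yearly growth rate (an assumption)."""
    return math.log(2) / math.log(1 + rate)


print(f"implied 2008 size: {size_2008:.2f} ZB")
print(f"implied 2010 growth: {growth_2010:.0%}")
print(f"doubling time at that rate: {years_to_double(growth_2010):.1f} years")
```

Under these assumptions the data volume would double in well under two years, which is what makes the curve exponential rather than merely steep.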
Emil then discussed the four types of NoSQL:
- Key-value stores are basically globally available Maps. Examples are Project Voldemort or Tokyo Cabinet/Tyrant. Their strengths are the simple data model and the ability to scale out horizontally. However, the simplistic data model is also their weakness: they are a poor fit for complex data.
- ColumnFamily / BigTable stores are a big table with column families, i.e. you can have a lot of columns and structure them. Examples are HBase, Hypertable or Apache Cassandra.
- Document databases store collections of documents, with a document being a key-value collection. Documents might be represented as JSON etc. Examples are CouchDB or MongoDB.
- Graph databases use nodes with properties and typed relationships with properties. Examples include Sones GraphDB, InfiniteGraph and Neo4j.
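The four data models listed above can be sketched with plain Python data structures. This is only an illustration of the shape of the data; it does not use the API of any of the products mentioned, and all keys, names and values are made up.

```python
# Key-value store: one big map from a key to an opaque value.
kv_store = {"user:42": b"serialized-profile-bytes"}

# Column-family store: row key -> column family -> column -> value.
column_store = {
    "user:42": {
        "profile": {"name": "Alice", "city": "Aarhus"},
        "activity": {"last_login": "2010-10-05"},
    }
}

# Document database: a collection of documents, each a key-value collection.
documents = [
    {"_id": 42, "name": "Alice", "tags": ["nosql", "graphs"]},
    {"_id": 43, "name": "Bob"},  # documents need not share the same fields
]

# Graph database: nodes with properties, typed relationships with properties.
nodes = {42: {"name": "Alice"}, 43: {"name": "Bob"}}
relationships = [
    {"start": 42, "type": "KNOWS", "end": 43, "props": {"since": 2008}},
]


def neighbours(node_id, rel_type):
    """Names of the nodes reachable from node_id via one rel_type relationship."""
    return [
        nodes[rel["end"]]["name"]
        for rel in relationships
        if rel["start"] == node_id and rel["type"] == rel_type
    ]


print(kv_store["user:42"])                       # lookup by key only
print(column_store["user:42"]["profile"]["city"])  # row, family, column
print([d["name"] for d in documents if "tags" in d])
print(neighbours(42, "KNOWS"))
```

The further down the list you go, the more structure the store itself understands: a key-value store can only fetch by key, while the graph model can follow typed relationships between records.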
Challenges for NoSQL in his opinion are:
- Mindshare and product usability
- Tool support for development and operations
- Middleware support in frameworks etc.
The demo that he did with Michael Hunger was very interesting. It showed a prototype of an integration of Neo4j into Spring Roo: parts of an entity object could be stored in Neo4j and other parts in a relational store with JPA. This will probably become more and more commonplace: certain parts of a customer are a good fit for a relational database, while the relations to other customers or items might be a good fit for a graph database. This model makes it possible to combine both approaches and use the better solution for the problem at hand.
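The idea of splitting one entity across two stores can be sketched roughly as follows. This is not the Spring Roo / Neo4j prototype from the demo: it is a minimal Python stand-in that assumes an in-memory SQLite table for the relational part and a plain dictionary for the graph part. The table layout, the `KNOWS` relationship type and the function names are all hypothetical.

```python
import sqlite3

# Relational side: flat customer attributes in an in-memory SQLite table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
db.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        (1, "Alice", "alice@example.com"),
        (2, "Bob", "bob@example.com"),
        (3, "Carol", "carol@example.com"),
    ],
)

# Graph side: typed relationships between customer ids.
# A dictionary stands in for a real graph database here.
graph = {
    1: [("KNOWS", 2)],
    2: [("KNOWS", 3)],
    3: [],
}


def load_customer(customer_id):
    """Fetch the relational part of a customer by id."""
    row = db.execute(
        "SELECT id, name, email FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    return {"id": row[0], "name": row[1], "email": row[2]}


def friends_of_friends(customer_id):
    """Traverse two KNOWS hops in the graph, then resolve names relationally."""
    direct = {end for rel, end in graph[customer_id] if rel == "KNOWS"}
    second = {end for d in direct for rel, end in graph[d] if rel == "KNOWS"}
    ids = second - direct - {customer_id}
    return [load_customer(i)["name"] for i in sorted(ids)]


print(load_customer(2)["email"])
print(friends_of_friends(1))
```

The traversal stays on the graph side, where following relationships is cheap, and only the final attribute lookup goes to the relational side. That division of labour is the point of the combined model described above.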