JAOO / Goto Conference 2010: NoSQL - an Overview (Emil Eifrem)

This talk gave a general introduction to NoSQL. Emil is CEO of Neo Technology, the company behind the Neo4j NoSQL database.

Emil started by talking about the name "NoSQL":

He is unhappy about it - like almost everyone else
It is not "No to SQL"
It is also not "Never SQL"
It is rather "Not only SQL"

He gave four reasons why NoSQL is important:

The exponential growth in data. I did some research: according to IDC, the amount of data grew by 62% in 2009 to 800 billion gigabytes (0.8 zettabytes), and in 2010 we will create 1.2 zettabytes. This is clearly exponential growth - something we should be afraid of.
We are seeing more and more connected data. While we used to have text documents, it is now about hypertext, blogs, user-generated content etc.
Also, the data is more and more semi-structured. User-generated content is again a good example. And we increasingly look for information using full-text search rather than detailed queries.
The architecture changes from integration through a common database to individual systems, each with its own private database. This enables specific databases for specific challenges.

It is unlikely that relational stores will solve these problems. At the same time, NoSQL won't replace relational stores. They solve different problems and there is more than enough data for all kinds of databases.

Emil then discusses the four types of NoSQL:

Key-value stores are basically globally available Maps. Examples are Project Voldemort or Tokyo Cabinet/Tyrant. Their strengths are the simple data model and the ability to scale out horizontally. However, the simplistic data model is also their weakness: they are a poor fit for complex data.
ColumnFamily / BigTable stores are a big table with column families i.e. you can have a lot of columns and structure them. Examples are HBase, HyperTable or Apache Cassandra.
Document databases store collections of documents with a document being a key-value collection. Documents might be represented as JSON etc. Examples are CouchDB or MongoDB.
Graph databases use nodes with properties and typed relationships with properties. Examples include Sones GraphDB, InfiniteGraph and Neo4j.
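
To make the four data models a bit more tangible, here is a minimal sketch that mimics each of them with plain Java collections. It is only an illustration: NoSqlModels, Node, Relationship, putColumn and addNode are made-up names and do not belong to any of the products mentioned above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: plain collections standing in for the four data models.
// None of these types is part of a real database API.
class NoSqlModels {

    // Key-value store: conceptually one big map from key to opaque value.
    final Map<String, String> keyValue = new HashMap<>();

    // Column-family store: row key -> column family -> column name -> value.
    final Map<String, Map<String, Map<String, String>>> rows = new TreeMap<>();

    // Document store: a collection of documents, each a key-value collection.
    final List<Map<String, Object>> documents = new ArrayList<>();

    // Graph store: nodes with properties, linked by typed relationships.
    final List<Node> nodes = new ArrayList<>();

    static final class Node {
        final Map<String, Object> properties = new HashMap<>();
        final List<Relationship> outgoing = new ArrayList<>();

        // Creates a typed relationship from this node to the target node.
        Relationship relateTo(Node target, String type) {
            Relationship r = new Relationship(this, target, type);
            outgoing.add(r);
            return r;
        }
    }

    static final class Relationship {
        final Node from;
        final Node to;
        final String type;
        // Relationships carry their own properties, just like nodes.
        final Map<String, Object> properties = new HashMap<>();

        Relationship(Node from, Node to, String type) {
            this.from = from;
            this.to = to;
            this.type = type;
        }
    }

    // Stores one column value, creating the row and column family on demand.
    void putColumn(String rowKey, String family, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(family, k -> new TreeMap<>())
            .put(column, value);
    }

    // Adds a node with a single "name" property to the graph.
    Node addNode(String name) {
        Node n = new Node();
        n.properties.put("name", name);
        nodes.add(n);
        return n;
    }
}
```

The nesting depth is the point here: a key-value store has one level, a column-family store three, a document is an arbitrarily nested map, and only the graph model makes the connections between records first-class.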

Challenges for NoSQL in his opinion are:

Mindshare and product usability
Tool support for development and operations
Middleware support in frameworks etc.

Very interesting was the demo that he did with Michael Hunger. It showed a prototype of an integration of Neo4j into Spring Roo. It showed how parts of an entity object could be stored in Neo4j and other parts in a relational store with JPA. This will probably become more and more commonplace: Certain parts of a customer are a good fit for a relational database while the relations to other customers or items might be a good fit for a graph database. This model makes it possible to combine both approaches and use the better solution for the problem at hand.

Labels: Emil Eifrem, Goto Conference, JAOO2010, NoSQL

¶ 14:33 0 comments

JAOO / Goto Conference 2010: Spring Framework 3.0 On The Way To 3.1 (Jürgen Höller)

In this talk Jürgen showed the current step in the evolution of Spring i.e. the step from version 3.0 to 3.1. As most of the readers will be familiar with 3.0 I will not go into much detail concerning the first part of the talk.

Version 3.0.5 will be the last version for the 3.0 family. The first milestone for Spring 3.1 is scheduled for November - so it won't be long until you get something to play with.

Environment Specific Beans

One important feature will be environment specific beans. Applications usually run in several different environments. There is at least the Java SE environment you use for JUnit tests. You might do staging on a Tomcat server and production on a Java EE server. Beans that manage access to other systems might be replaced by mocks in some of these environments. The goal of the environment specific beans is to deal with these differences in infrastructure while the deployment units are not changed. Usually operations insists that deployment units must not be changed between test and production.

There are already ways to deal with this challenge. However, if this is supported as a first class part of the configuration it will be much easier and more elegant to use. There are still discussions about how this feature will be implemented, so the details are not set in stone.

The Spring Beans will be grouped by environment. The environment itself will be determined by an API that you can extend yourself for maximum flexibility. Placeholders will be resolved depending on the environment.

A possible syntax could be:
<bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
  <property name="driverClass" value="${database.driver}"/>
  <property name="jdbcUrl" value="${database.url}"/>
  <property name="username" value="${database.username}"/>
  <property name="password" value="${database.password}"/>
</bean>

<beans profile="embedded">
  <jdbc:embedded-database id="dataSource" type="H2">
    <jdbc:script location="/WEB-INF/database/schema-member.sql"/>
    <jdbc:script location="/WEB-INF/database/schema-activity.sql"/>
    <jdbc:script location="/WEB-INF/database/schema-event.sql"/>
    <jdbc:script location="/WEB-INF/database/data.sql"/>
  </jdbc:embedded-database>
</beans>

As you can see the beans element is reused to also cover different environments. As mentioned above the actual implementation might be different - and of course the engineers are happy if you have any feedback!
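
To illustrate the idea independently of the final syntax, here is a minimal plain-Java sketch of bean definitions grouped by environment. It is not Spring code: ProfileRegistry, EnvironmentResolver, register and getBean are hypothetical names, and the fallback to a "default" group is my own assumption about how such a lookup could work.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch, not Spring API: bean definitions grouped by environment.
class ProfileRegistry {

    // Extension point: how the active environment is determined.
    interface EnvironmentResolver {
        String activeProfile();
    }

    private static final String DEFAULT = "default";

    // profile name -> bean id -> factory for that bean
    private final Map<String, Map<String, Supplier<?>>> definitions = new HashMap<>();
    private final EnvironmentResolver resolver;

    ProfileRegistry(EnvironmentResolver resolver) {
        this.resolver = resolver;
    }

    // Registers a definition that applies when no environment overrides it.
    void register(String beanId, Supplier<?> factory) {
        register(DEFAULT, beanId, factory);
    }

    // Registers a definition that is only used in the given environment.
    void register(String profile, String beanId, Supplier<?> factory) {
        definitions.computeIfAbsent(profile, p -> new HashMap<>()).put(beanId, factory);
    }

    // The definition of the active environment wins; otherwise the default is used.
    Object getBean(String beanId) {
        Supplier<?> factory = lookup(resolver.activeProfile(), beanId);
        if (factory == null) {
            factory = lookup(DEFAULT, beanId);
        }
        if (factory == null) {
            throw new IllegalArgumentException("No bean named " + beanId);
        }
        return factory.get();
    }

    private Supplier<?> lookup(String profile, String beanId) {
        Map<String, Supplier<?>> beans = definitions.get(profile);
        return beans == null ? null : beans.get(beanId);
    }
}
```

With a resolver that returns "embedded", a lookup of "dataSource" would yield the embedded definition, while any other environment would get the default one. The deployment unit stays the same; only the resolver's answer differs.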

Improvements for the Java Application Configuration

Java Application Configuration with @Configuration was already introduced in Spring 3.0. In the upcoming release it will be improved to also cover functionality that resembles the XML namespaces. These namespaces are used throughout the framework to configure features like transactions or AOP. As the following code sample shows, one possible implementation uses a fluent API - in this case to create something that resembles <tx:annotation-driven />.

@Configuration
public class AppConfig {
  @Autowired
  private DataSource dataSource;

  @Bean
  public RewardsService rewardsService() {
    return new RewardsServiceImpl(dataSource);
  }

  @Bean
  public PlatformTransactionManager txManager() {
    return new DataSourceTransactionManager(dataSource);
  }
  
  public TransactionConfiguration txConfig() {
    return annotationDrivenTx().withTransactionManager(txManager());
  }
}

Cache Abstraction

The Spring Modules project offered some integration for caching and there is also a very basic cache abstraction. In Spring 3.1 there will be support for EhCache, GemFire, Coherence, etc. Several adapters will be shipped with Spring core but it will also be possible to plug in custom adapters if necessary. The caching itself can then be configured using annotations, for example:

@Cacheable
public Owner loadOwner(int id);

@Cacheable(condition="name.length < 10")
public Owner loadOwner(String name);

@CacheInvalidate
public void deleteOwner(int id);

As you can see methods can be marked as cacheable. This can be fine tuned using SpEL (Spring Expression Language). Other methods can be marked as invalidating the cache.
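
What such annotations boil down to can be sketched in a few lines of plain Java. This is not how Spring implements it (that will most likely be done with proxies around the annotated beans); CachingLoader, load and invalidate are hypothetical names, and the predicate stands in for the SpEL condition.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch of the behaviour behind a cacheable method; not Spring code.
class CachingLoader<K, V> {

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // the expensive method being cached
    private final Predicate<K> condition;  // stands in for the SpEL condition

    CachingLoader(Function<K, V> loader, Predicate<K> condition) {
        this.loader = loader;
        this.condition = condition;
    }

    // Cacheable call: the loader only runs on a cache miss,
    // or every time if the condition excludes the key from caching.
    V load(K key) {
        if (!condition.test(key)) {
            return loader.apply(key);
        }
        return cache.computeIfAbsent(key, loader);
    }

    // Invalidating call: drops the entry so the next load hits the loader again.
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

A loader for owners by name with the condition name.length() < 10 would then cache short names and always go to the backing store for long ones, which mirrors the second annotated method above.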

Conversation Management

This feature will allow a user to handle multiple orders in a web application simultaneously - for example in several browser windows or tabs. The state of these must be isolated from one another. The HttpSession is not an option for this because windows share the same HttpSession.

Spring will manage the window id, e.g. by using MVC session form attributes. This is a much simpler problem than the one addressed by the flow support in Spring Web Flow, so it allows for a simpler solution.
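
The core of the idea is that state is keyed by a window id inside the one shared session. Here is a minimal sketch under that assumption; ConversationStore and its methods are hypothetical names, and the outer map merely stands in for the attributes of a single HttpSession.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one session holds separate state per browser window.
class ConversationStore {

    // Stand-in for the attributes of a single HttpSession: window id -> state.
    private final Map<String, Map<String, Object>> conversations = new HashMap<>();
    private int nextId = 1;

    // Starts a conversation for a new window or tab and returns its id.
    // In a web application the id would travel with every request of that window.
    String begin() {
        String windowId = "w" + nextId++;
        conversations.put(windowId, new HashMap<>());
        return windowId;
    }

    void put(String windowId, String name, Object value) {
        state(windowId).put(name, value);
    }

    Object get(String windowId, String name) {
        return state(windowId).get(name);
    }

    // Ends the conversation and discards its state.
    void end(String windowId) {
        conversations.remove(windowId);
    }

    private Map<String, Object> state(String windowId) {
        Map<String, Object> state = conversations.get(windowId);
        if (state == null) {
            throw new IllegalStateException("Unknown conversation " + windowId);
        }
        return state;
    }
}
```

Two tabs that both store an attribute called "order" no longer overwrite each other, because each tab reads and writes under its own window id.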

Keeping Up

Spring 3.1 will also support Servlet 3.0 on servers like Tomcat 7 or GlassFish 3. This will include the automatic registration of framework listeners which will make it even easier to use Spring in these settings. Also the support for JSF 2.0 will be improved e.g. for conversations.

Sum Up

Spring 3.1 optimizes Spring in several ways. As the version number indicates, polishing and improvements are the main subjects. In particular the cache abstraction is interesting as nowadays a lot of applications need this kind of feature to build scalable solutions.

Labels: Goto Conference, JAOO2010, Jürgen Höller

¶ 10:19 0 comments

Spring vs. Java EE and Why I Don't Care

Disclaimer: I work for VMware / SpringSource. That means I might be biased - but then again we did not just create Spring. We are also a member of the Java EE 6 Expert Group. The views expressed here are my own, of course.

History

Spring has established itself as a very popular solution for Enterprise Java applications. In particular the advantages of Spring compared to EJB 1.x / 2.x were very obvious and were a major reason for Spring's success.

The Current State

Due to Spring’s early success and adoption, Java EE 5 and Java EE 6 were pushed to greatly simplify the Java EE programming model, increase developer productivity and become much simpler to use than previous versions. The current Java EE 6 solutions are thus just now achieving the ability to compete against Spring's programming model. Developers are now ready to ask the question "Why would you prefer Spring?" Here is my take:

Many key elements of Java EE 6 such as JSR-330 (@Inject), Common Annotations (@Resource, @RolesAllowed etc) and even some EJB annotations (@TransactionAttribute, @Asynchronous etc) are supported by Spring. So you can choose a programming model that is very similar to the Java EE programming model.
There are differences, among others: Spring typically uses Singleton beans, EJB typically uses pooled beans - and of course the annotations for them are different. But do you really think this will impact productivity or even the success of a project?
However, Spring is much more flexible. Instead of the annotations mentioned above you can choose XML or Java based configurations. You can create custom extensions using AOP. You can use the JMS and JDBC abstractions - all of that is not included in Java EE. So the programming model of Spring can almost be considered a superset of Java EE. Usually more power and the flexibility to choose between different options are considered a superior solution.
There is a difference in the platforms that you can deploy on: Typically you will need to use a Java EE application server for Java EE solutions while Spring is perfectly happy to run on Tomcat or other simple servers. Tomcat is the predominant server in the marketplace so this is actually an important difference. (Let me add that you might use OpenEJB on Tomcat i.e. you can support more of the Java EE APIs on Tomcat if you really want to.) Also there are problems: If you want to run Java EE 6 you are limited to the very small set of certified servers. If you have a strategic commitment to an application server vendor who doesn't support Java EE 6 you can still use Spring. If operations won't install the latest release of your application server, you can still use Spring. Spring happily runs in these environments.
Of course Java EE also has an advantage: the fact that Java EE is baked into the server. Therefore you don't need to deploy it with the application which can make some things easier. It might be easier to set up your infrastructure. It might be easier to package and share your application. But I personally would prefer to be flexible in deployment over these advantages.

Before we start a flame war and get lost in technical details let me reiterate: Java EE and Spring can have a similar programming model. Spring is more flexible and I would prefer it. But: I don't think your project will fail because of the decision you made concerning this. And: I think a Java EE vs. Spring shoot out will be boring. If you do a side by side comparison you will end up with minor differences such as @Component instead of @Stateless or whether you will need to deploy some additional JARs. While that might be impressive in a demo I don't believe this will convince anyone to use one platform or the other. Certain features of Spring (e.g. the flexibility to use XML or Java based configuration) will probably not be shown as they just cannot be done using Java EE.

Why I don't care

So if a side by side comparison doesn't make a lot of sense - why would I choose Spring? There has to be a clear advantage somewhere to definitely answer this question. A personal note at this point: I frequently work at a shared office space with some JavaScript and Ruby on Rails guys - I am the only Java guy there (and sometimes I think they pity me). After talking a lot with these other developers, I believe we need a compelling Java story that can live up to their developer experience. If Java EE and Spring are both on a comparable level concerning productivity we need to come up with novel ideas to improve. Looking at Ruby on Rails helps here - the approach is to combine a dynamic language with a powerful framework and a code generator. The next level of productivity is not Java EE vs. Spring. It must be something that can counter the productivity of Ruby on Rails. I believe this is:

Groovy / Grails: it combines a dynamic language with a powerful framework and a code generator - like Ruby on Rails. The only difference is that the solution is more adapted to the JVM and has better integration with Java, the most important platform in the Enterprise space.
Spring Roo: it combines Java/AspectJ with a powerful framework (Spring) and a code generator - like Ruby on Rails. This is easier to learn for Spring developers than Groovy / Grails and therefore appeals to a different audience.

So if you look at productivity the comparison should not be "Java EE vs. Spring" but rather "Groovy / Grails vs. Spring Roo vs. Ruby on Rails". Better productivity is not gained on the level of the framework or the programming model any more.

Asking the Wrong Question

The other reason why I think "Spring vs. Java EE" is the wrong question is: Both models on their own do not solve the challenges I typically see in projects. Let me give you some examples:

Integration: Today almost any project will need to integrate with other technologies through file transfer, Web Services, messaging etc. There are well known patterns for this - the Enterprise Integration Patterns by Gregor Hohpe. Neither Java EE nor the core Spring Framework help here - you will need to use a framework like Spring Integration, Apache Camel or its competitors.
Batches: A lot of projects use batches. They might import data - or run complex business logic. In the latter case, chances are that the batches are actually mission critical. Neither Java EE nor the core Spring Framework help here - you will need to use a framework like Spring Batch. (I believe there are no real competitors but I might be wrong).
Caching: Like many of our competitors we at SpringSource believe that caching is a very important part for a performant enterprise solution - you can tell by the product strategy in the Java EE space. Again this is something that Java EE does not cover. However, for Spring a cache abstraction is planned in Spring 3.1. A standard in Java EE - if there is ever to be one - won't be there until the next Java EE release.
Again looking at my Ruby on Rails office mates: They deploy a lot of projects into the cloud. Offerings like Heroku make that extremely simple. Of course you could run Spring or Java EE on an IaaS like Amazon EC2 or a VMware vCloud Director cloud. But it might be easier to use a PaaS that will deal with scaling, fault tolerance etc. automatically and that usually offers some interesting additional services. Google App Engine as the predominant public PaaS as well as VMforce have strategic commitments to Spring. The Cloud has clear economic benefits so this technology will become more and more important. The Java platform has to have a solution here - otherwise it will eventually become irrelevant.
More and more data is stored in non relational databases (NoSQL). Actually that is the only way we will be able to cope with the exponentially growing amount of data and structures like graphs in social networks or unstructured user content. Again this is quite common in the Ruby on Rails camp - while for Java there is currently a lack of APIs. Java EE has no support - and it won't for a while. There are so many non relational stores that a standard will be hard to define in a standards body.
Also integration in social networks like Twitter, Facebook and other common internet services will be more and more important. Again this is common in the Ruby on Rails community. And at least one of my customers with a very large web site mentioned this as "a feature marketing will love" - remember Facebook is almost as important a source for internet traffic as Google nowadays. We need solutions for this in the Java space.
Messaging: In the Java EE community the JMS standard has been very stable for 10+ years. Outside the Java community there was actually innovation. Therefore a lot of the office mates use RabbitMQ - and that is also the predominant messaging solution on the EC2 cloud. It is standardized on the protocol level (AMQP) and it is much more flexible (not just topics and queues). That is why SpringSource bought Rabbit Technologies and why we are investing considerable resources in the Spring-AMQP project.
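
To give an idea of what "not just topics and queues" means: in AMQP, messages are published to exchanges, and a topic exchange routes them to queues by matching the routing key against binding patterns, where * stands for exactly one word and # for zero or more words. The following is a minimal plain-Java sketch of that matching rule only. TopicMatcher is a made-up name and this is neither RabbitMQ's implementation nor part of any client API.

```java
// Hypothetical sketch of AMQP topic exchange matching; not RabbitMQ code.
class TopicMatcher {

    // Returns true if the routing key matches the binding pattern.
    // Words are separated by dots; '*' matches exactly one word,
    // '#' matches zero or more words.
    static boolean matches(String pattern, String routingKey) {
        return matches(pattern.split("\\."), 0, routingKey.split("\\."), 0);
    }

    private static boolean matches(String[] pattern, int p, String[] key, int k) {
        if (p == pattern.length) {
            // Pattern is used up: match only if the key is used up as well.
            return k == key.length;
        }
        if (pattern[p].equals("#")) {
            // Try letting '#' consume 0, 1, 2, ... of the remaining words.
            for (int next = k; next <= key.length; next++) {
                if (matches(pattern, p + 1, key, next)) {
                    return true;
                }
            }
            return false;
        }
        if (k == key.length) {
            // Pattern still expects a word but the key has none left.
            return false;
        }
        if (pattern[p].equals("*") || pattern[p].equals(key[k])) {
            return matches(pattern, p + 1, key, k + 1);
        }
        return false;
    }
}
```

A queue bound with "stock.*.nyse" would receive a message published with the key "stock.usd.nyse" but not one with "stock.nyse", while a binding of "stock.#" would receive both.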

So there are a lot of challenges Java EE just has no solution for. A side by side comparison of Spring and Java EE might lure you into believing that JSF + Spring / Java EE + database is all you will ever need. Look at your projects and decide for yourself if that is true. It certainly contradicts my experience.

Sum Up

To cut a long story short:

You won't see a lot of change in your productivity if you use Spring or the newest Java EE releases. The next level of productivity is not about these programming models but about Groovy / Grails or Spring Roo.
Spring and Java EE will only be a part of your technology stack. You will need to look at solutions for a lot of issues outside these. That is why SpringSource and others are currently investing heavily in creating such solutions for the Java crowd. And a lot of these solutions are well integrated with Spring.

...and a last thing

So you have read this blog post to the end. That is great - thanks a lot! Please don't start a religious battle now. Instead code something or do something fun. Enjoy!

Labels: I don't care, Java EE, Spring

¶ 10:42 37 comments
