Lessons learned from porting Pet Shop to NoSQL via Couchbase 2.0
Java Blueprints were developed to show you design patterns in enterprise Java. The Java Pet Store was designed to demonstrate the quintessential Java 2 Enterprise Edition (J2EE) application. This was mainly in the heady days of EJB 1.1-2.1, which had many failed and defective technologies, including the now-dumped Container Managed Persistence.
Around the same time, Puff Daddy became P. Diddy, then dropped the "P" with the explanation that it was getting in between him and his fans. Likewise, J2EE dropped the "2" and became Java EE, possibly for the same reason. Meanwhile, Sun abandoned the Pet Store business in 2007. But Antonio Goncalves recently picked the application back up and modernized it as the Pet Shop. It uses CDI and all of Java EE's latest fixings and versions to demonstrate the "right way" to do a Java EE application. You can find the code for Goncalves's Pet Store on GitHub.
The opportunity: Port Pet Store to NoSQL
As frequently noted, I'm incredulous about the Java EE programming model's continued relevance in the modern era. CDI is mainly a codification or "standardization" of Spring, assuming Oracle's blessing means "standardization" to you. The Spring Framework more or less won the programming model. Meanwhile, the traditional "there is only the RDBMS" thought pattern is a little less of a given now with the big data and NoSQL revolution in place. Moreover, in the era of consumerization of computing and the Web, the traditional transaction manager is less relevant. It is unlikely that a device (à la database) transaction is sufficient any longer to support consistency.
With Couchbase 2.0's recent release, it seemed to me the perfect time for my colleagues and me to try to port the Pet Store as a NoSQL application. What may surprise you is how little work it required. See for yourself in our extensive guide; you can find our code for this project on GitHub as well.
The example exposes a weakness of Java EE related to NoSQL: we had to write a lot of code directly against the Couchbase API to make it work. Ideally, we'd have used something more like Spring Data because Spring supports CDI. However, Spring Data does not yet support Couchbase 2.0 for more than caching. (Full support is in beta.)
This gave us a good chance to test drive Couchbase 2.0 with the Java Pet Store.
Couchbase 101: How it differs from what you're used to
At a high level, Couchbase is a combination of a back-end data store and a built-in, document-level cache. It provides auto-failover, auto-sharding, and automatic load balancing. Couchbase does not have the concepts of databases and tables. Instead, data isolation is achieved through buckets. Buckets are like a database, where users store documents with different schemas. Thus, in our new Pet Store application, we created one bucket called petstore where all the documents for the application reside.
If you have multiple nodes in your cluster, Couchbase spreads the documents over all the nodes in the cluster. Say you have a three-node Couchbase cluster with one bucket that has three documents in it; each node might have one document. This is done via a hashing algorithm that's part of the Couchbase client. There can also be more than one replica of this data. You specify the number of replicas when creating your bucket. After the bucket is created, you cannot change the number of replicas, so be sure to choose the number you really want.
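The idea of hashing a document key to pick a node can be sketched in a few lines of Java. This is a simplified illustration, not Couchbase's actual placement algorithm. The class `KeyPlacementSketch`, its method names, the partition count, and the plain modulo step are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch: hash a document key to a partition, then map the
// partition to a node. Real clients consult a map supplied by the cluster
// instead of the plain modulo used here.
final class KeyPlacementSketch {

    // Hashes the key with CRC32 and folds it into [0, partitions).
    static int partitionFor(String key, int partitions) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % partitions);
    }

    // Assigns partitions to nodes round-robin (an assumption for illustration).
    static int nodeFor(String key, int partitions, int nodes) {
        return partitionFor(key, partitions) % nodes;
    }

    public static void main(String[] args) {
        String[] keys = {"customer_marc", "category_Birds", "product_Finch"};
        for (String key : keys) {
            System.out.println(key + " -> node " + nodeFor(key, 1024, 3));
        }
    }
}
```

Because the hash depends only on the key, every client computes the same location for a document without asking a central coordinator.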
Another consideration for replicas is that the replica data is also stored in memory. This means you use the memory on the first and second nodes. This is not necessarily a bad thing: in the event of a failure, the data is available almost instantaneously. The catch is that you have to turn on auto-failover and specify an interval for a node to be considered down; the data won't be available for reads until this auto-failover takes place. The default is 30 seconds, during which time your application has to deal with having certain documents from a bucket be unavailable. In our configuration, we had a single testing server, so we had merely one node.
Couchbase multinode setup is fairly easy and requires next to nothing to maintain. It doesn't require anything complicated like Zookeeper or extra configuration nodes. In case of a failed node, the Couchbase server can be auto-configured to initiate a failover, which means the failed node is removed from the cluster and read-write access is still available for other nodes.
For performance reasons, Couchbase manages the application's working set in memory up to the amount of memory specified for the bucket. If the amount of data exceeds the amount of memory, the oldest documents are evicted from memory, though they're still on disk. This makes for a speedy system. It also means you don't have to deal with a separate caching layer; the cache is built in.
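The working-set behavior described above can be modeled with a small in-memory sketch. It is not how Couchbase is implemented. The class `WorkingSetSketch`, the least-recently-used eviction rule, and the `HashMap` standing in for the disk are assumptions chosen to keep the example short.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: every document lives on "disk", while only the most
// recently used ones also stay in a bounded in-memory map.
final class WorkingSetSketch {
    private final Map<String, String> disk = new HashMap<>();
    private final LinkedHashMap<String, String> memory;

    WorkingSetSketch(final int maxInMemory) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        memory = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxInMemory; // evict from memory only, never from disk
            }
        };
    }

    void set(String key, String json) {
        disk.put(key, json);
        memory.put(key, json);
    }

    String get(String key) {
        String json = memory.get(key);
        if (json == null) {            // cache miss: fall back to disk
            json = disk.get(key);
            if (json != null) {
                memory.put(key, json); // bring the document back into the working set
            }
        }
        return json;
    }

    boolean inMemory(String key) {
        return memory.containsKey(key);
    }
}
```

With a capacity of two, storing three documents pushes the first one out of memory, yet a later `get` still returns it from the disk map and makes it resident again.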
The data is stored in the bucket as a key-value pair. You can store entire sets of objects this way because of the flexibility of JSON. Everything we stored for the application was stored as JSON. This requires marshaling the application's objects to JSON and unmarshaling the JSON back into objects. We used the Jackson mapper for this; it is widely used and allowed us more flexibility for circular references and the like.
The data is also schema-less. What this means is that if you want to add another field to a document, you just add it. You don't have to worry about all the documents that already exist. They are more than happy to exist without the new field, and changes to the schema are painless and quick.
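From the application's side, schema-less data means that readers must tolerate documents written before a field existed. The sketch below treats a document as a plain map. The `nickname` field and the class `SchemalessSketch` are hypothetical and do not appear in the Pet Store code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: older customer documents lack the "nickname" field,
// so the reader falls back to "firstname" instead of failing.
final class SchemalessSketch {

    static String displayName(Map<String, Object> customer) {
        Object nickname = customer.get("nickname");
        if (nickname != null) {
            return nickname.toString();
        }
        return String.valueOf(customer.get("firstname"));
    }

    public static void main(String[] args) {
        Map<String, Object> oldDoc = new HashMap<>();
        oldDoc.put("firstname", "Marc");

        Map<String, Object> newDoc = new HashMap<>(oldDoc);
        newDoc.put("nickname", "M"); // new field added without touching old documents

        System.out.println(displayName(oldDoc));
        System.out.println(displayName(newDoc));
    }
}
```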
Couchbase has additional features that set it apart from other document databases, and from other NoSQL databases, for that matter. It is a distributed key-value store, the data manager is written in C/C++, and the cluster manager is written in Erlang. Because it ships with a large number of built-in mapreduce functions, many simple operations become very easy to implement. The built-in functions also provide a great reference for writing your own mapreduce functions.
Couchbase has B-tree-based indexes. You can index anything from entire views to embedded documents. However, it lacks geospatial indexes (currently available in experimental mode only), although this becomes an issue only if you are working with location data. Couchbase also does not have in-place updates. This is not a huge sticking point because the working set remains in memory all the time, so the updates are superfast.
Couchbase does not have any concept of capped collections. This is only an issue if you are working mainly with log data analysis. Couchbase maintains the working set in memory, but you can have much more data than the amount of memory. Although Couchbase 2.0 has a relatively high cache miss rate, its developers are working to optimize this in the next release.
The data models for the Pet Store in NoSQL
The Java Pet Store application was originally deployed in Apache Derby using Hibernate and JPA. Because Derby is an embedded implementation, we switched the configuration to use MySQL. This enabled us to have an in-depth look at the relational schema design.
The application is being driven primarily by two events:
- When a new customer registers
- When a new order is created by a customer
We built the Couchbase documents around these two events: Customer and Order. These documents were designed to contain related entities as embedded documents. We also created a third document type, Category, to store inventory information: categories, products, and items. This design decision enabled the creation of separate indexes (or views) so that they can be fetched quickly. This also provides examples of both linked and embedded documents.
JSON example for Customer

    {
      "id": "customer_marc",
      "type": "customer",
      "login": "marc",
      "password": "marc",
      "firstname": "Marc",
      "lastname": "Fleury",
      "telephone": null,
      "email": "marc@jboss.org",
      "homeAddress": {
        "street1": "65 Ritherdon Road",
        "street2": null,
        "city": "Los Angeles",
        "state": null,
        "zipcode": "56421",
        "country": "USA"
      },
      "dateOfBirth": 1363794557891,
      "age": null
    }

JSON example for Order

    {
      "id": "Marc",
      "type": "order",
      "orderDate": null,
      "customer": {
        "id": 1,
        "login": "marc",
        "password": "marc",
        "firstname": "Marc",
        "lastname": "Fleury",
        "telephone": null,
        "email": "marc@jboss.org",
        "homeAddress": {
          "street1": "65 Ritherdon Road",
          "street2": "",
          "city": "Los Angeles",
          "state": "",
          "zipcode": "56421",
          "country": "USA"
        },
        "dateOfBirth": 1363722361660,
        "age": 0
      },
      "orderLines": [
        {
          "id": null,
          "quantity": 1,
          "item": {
            "id": "item_Goldfish_Male Puppy",
            "type": "item",
            "name": "Male Puppy",
            "description": "Lorem …",
            "unitCost": 12,
            "imagePath": "fish2.jpg"
          }
        },
        {
          "id": null,
          "quantity": 1,
          "item": {
            "id": "item_Angelfish_Large",
            "type": "item",
            "name": "Large",
            "description": "Lorem …",
            "unitCost": 10,
            "imagePath": "fish1.jpg"
          }
        }
      ],
      "deliveryAddress": {
        "street1": "65 Ritherdon Road",
        "street2": "",
        "city": "Los Angeles",
        "state": "",
        "zipcode": "56421",
        "country": "USA"
      },
      "creditCard": {
        "creditCardNumber": "1234",
        "creditCardType": "VISA",
        "creditCardExpDate": "03/15"
      }
    }

JSON example for Category

    {
      "id": "category_Birds",
      "type": "category",
      "name": "Birds",
      "description": "Any of …",
      "products": [
        {
          "id": "product_Amazon Parrot",
          "type": "product",
          "name": "Amazon Parrot",
          "description": "Great companion for up to 75 years",
          "items": [
            {
              "id": "item_Male Adult",
              "type": "item",
              "name": "Male Adult",
              "description": "Lorem …",
              "unitCost": 120,
              "imagePath": "bird2.jpg"
            },
            {
              "id": "item_Female Adult",
              "type": "item",
              "name": "Female Adult",
              "description": "Lorem …",
              "unitCost": 120,
              "imagePath": "bird2.jpg"
            }
          ]
        },
        {
          "id": "product_Finch",
          "type": "product",
          "name": "Finch",
          "description": "Great stress reliever",
          "items": [
            {
              "id": "item_Male Adult",
              "type": "item",
              "name": "Male Adult",
              "description": "Lorem…",
              "unitCost": 75,
              "imagePath": "bird1.jpg"
            },
            {
              "id": "item_Female Adult",
              "type": "item",
              "name": "Female Adult",
              "description": "Lorem …",
              "unitCost": 80,
              "imagePath": "bird1.jpg"
            }
          ]
        }
      ]
    }
As the application is deployed, the Categories, Products, and Items are generated in the database by the database populator class. When a new customer goes to the home page, he or she has the option to sign in or register. At that point, a new customer document is created. The customer can then browse the existing categories, create an order, and save it. When the order is saved, a new order document is created with order details. The associated items are added as embedded documents in the order.
Using Couchbase views for data access
Couchbase uses views to simplify access to the data. They are generally a cross between a view and a stored procedure in the relational database world. In our application, we added a type field to the entity objects and set it to the type of object. For example, in the Order entity, type="order". Then we created a view to pull back any document where the type was whatever we were looking for.
You can also drill into a document to have the view return anything in the document. An example of this is an item, which is embedded in the order line in the order document. We set up a view that queries for the item by ID. That is how the application pulls back an item relating to a specific product, which is in a specific category. Views are written in JavaScript, which makes them very flexible.
One important note on views: When you first put a document into the database, it does not immediately become available to the view. It must first be saved to disk.
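The two ideas above (selecting documents by their type field, and a view seeing a document only after it has been saved to disk) can be imitated with an in-memory model. This is not the Couchbase view API, and real views are JavaScript functions that run on the server. The class `TypeViewSketch` and its `flushToDisk` method are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a "by type" view over a bucket.
final class TypeViewSketch {

    static final class Doc {
        final String type;
        boolean persisted; // false until the document has been written to disk

        Doc(String type) {
            this.type = type;
        }
    }

    private final Map<String, Doc> bucket = new LinkedHashMap<>();

    // Key-value write: the document is readable by key immediately.
    void set(String key, String type) {
        bucket.put(key, new Doc(type));
    }

    Doc get(String key) {
        return bucket.get(key);
    }

    // Stands in for the server persisting pending writes.
    void flushToDisk() {
        for (Doc doc : bucket.values()) {
            doc.persisted = true;
        }
    }

    // The "view": keys of persisted documents whose type matches.
    List<String> queryByType(String type) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, Doc> entry : bucket.entrySet()) {
            Doc doc = entry.getValue();
            if (doc.persisted && type.equals(doc.type)) {
                keys.add(entry.getKey());
            }
        }
        return keys;
    }
}
```

In this model a freshly written order can be fetched by key right away, but it shows up in `queryByType("order")` only after `flushToDisk()` has run, which mirrors the caveat above.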
What we changed in the code
To change this application to work in Couchbase, a few files needed to change, primarily around the services. This is where, in the RDBMS version, there was a reference to the EntityManager. We decided to use the DbPopulator class to initialize the Couchbase connection because it is a singleton that starts at application start. We then use this connection similarly to the EntityManager. The following is an example of some of the code changes that were necessary.
Derby repository calls

    @Inject
    EntityManager em;

    // find
    TypedQuery<Category> typedQuery = em.createNamedQuery(Category.FIND_BY_NAME, Category.class);
    typedQuery.setParameter("pname", categoryName);

    // save
    em.persist(category);

    // update
    em.merge(category);

    // delete
    em.remove(em.merge(category));

Couchbase repository calls

    // find
    client.get(categoryName);

    // save
    client.set(category.getName(), EXP_TIME, mapper.writeValueAsString(category));

    // update
    client.replace(category.getName(), EXP_TIME, mapper.writeValueAsString(category));

    // delete
    client.delete(category.getName());

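The difference between the save and update calls is easier to see in a small stand-in for the key-value store. The sketch below is not the Couchbase client. `KeyValueRepositorySketch` is a hypothetical in-memory class, and it assumes the usual key-value convention that a set always stores the value while a replace succeeds only when the key already exists.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a key-value repository holding JSON strings.
final class KeyValueRepositorySketch {
    private final Map<String, String> store = new ConcurrentHashMap<>();

    // find: returns the stored JSON, or null when the key is unknown.
    String get(String key) {
        return store.get(key);
    }

    // save: stores the value whether or not the key already exists.
    void set(String key, String json) {
        store.put(key, json);
    }

    // update: succeeds only if the key is already present.
    boolean replace(String key, String json) {
        return store.replace(key, json) != null;
    }

    // delete: reports whether anything was removed.
    boolean delete(String key) {
        return store.remove(key) != null;
    }

    public static void main(String[] args) {
        KeyValueRepositorySketch repo = new KeyValueRepositorySketch();
        System.out.println(repo.replace("Birds", "{\"name\":\"Birds\"}")); // key not stored yet
        repo.set("Birds", "{\"name\":\"Birds\"}");
        System.out.println(repo.replace("Birds", "{\"name\":\"Birds\",\"description\":\"updated\"}"));
        System.out.println(repo.get("Birds"));
    }
}
```

One consequence of keying documents by name is that renaming a category amounts to a delete plus a set under the new key rather than a replace.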
Code changes were required for each of the CRUD operations for Customers, Orders, and Categories. Below is an example of the changes made to create a new customer.
The original code for customer creation

    public Customer createCustomer(final Customer customer) {
        if (customer == null) throw new ValidationException("Customer object is null");
        em.persist(customer);
        return customer;
    }

The Couchbase code for customer creation

    public Customer createCustomer(final Customer customer) {
        if (customer == null) throw new ValidationException("Customer object is null");
        // Sets the customerId to the current time in millis
        customer.setId(String.valueOf(new Date().getTime()));
        try {
            // Puts the document into the database.
            // Uses customer login as the key since this should be unique
            // EXP_TIME, the expiry time in seconds, is set to 0, which means store forever.
            // Uses the Jackson mapper to convert the customer to JSON
            client.set(customer.getLogin(), EXP_TIME, mapper.writeValueAsString(customer));
        } catch (Exception ex) {
            ex.printStackTrace();
        }
        return customer;
    }

Our experience in switching this J2EE application from Apache Derby, JPA, and Hibernate to Couchbase demonstrated that the switch was fairly simple. The most difficult part was designing the schema, which is an issue when switching any application from an RDBMS to a NoSQL database. Still, designing a document database schema is generally easier than designing a relational database schema.
With the design behind us, it was simply a matter of switching the service classes to use the Couchbase client. Although this is a small application, implementing the Couchbase version was relatively easy and quick, which bodes well for other existing Java EE applications' ability to take advantage of NoSQL.
Michael Brush, Elizabeth Edmiston, and Deep Mistry, who are all developers at Open Software Integrators, contributed to this article.
This article, "How to teach a Java EE app new NoSQL tricks," was originally published at InfoWorld.com. Keep up on the latest developments in application development and read more of Andrew Oliver's Strategic Developer blog at InfoWorld.com. For the latest business technology news, follow InfoWorld.com on Twitter.


