NoSQL Search with Cloudant

In the seemingly endless argument between relational database proponents and those on the NoSQL side of the divide, the discussion usually centers on speed, availability, operational constraints and budget. One area that sometimes gets forgotten is the loss of a structured query language, which followers of the NoSQL school have to overcome. The traditional substitute has been systems built on top of MapReduce, an effective if somewhat complex solution. Into this fray comes Cloudant, with its new NoSQL query solution, Cloudant Search.

The concept behind Cloudant Search is that users should be able to interact with their data instantaneously, without needing MapReduce jobs or complex languages. The creators of Cloudant Search also wanted a solution that didn’t require setting up third-party or expensive systems. The result is a pretty elegant solution that lets users search using syntax they can grasp quickly. As Cloudant said in their blog post announcing the public beta of Cloudant Search:

Want to easily find all the documents that contain the word “bieber”? This is the Cloudant Search query you have to write:

bieber

Want to find all the records that have “my world” in their title? Just write:

title:"my world"

How about only finding all the artists who were born in Canada in February & March 1994?

type:artist country:canada dob<date>:[1994-02-01 TO 1994-03-31]

How about people who are fan of either Justin Bieber or Justin Timberlake, that live in JBiebz’ hometown in Ontario — wait, what’s it called, Stratwood? Stratburg? Strat- something. Oh, I’ll just search it:

type:person fan-of:(justin (bieber OR timberlake)) city:strat* state:ontario

Queries are served instantly, and new records are indexed immediately upon entering the database. Cloudant Search is baked directly into the Cloudant codebase and leverages existing components such as the Lucene indexers, CouchDB’s view indexing logic and Cloudant’s own Dynamo-inspired distribution algorithms. Cloudant Search has also been built to be extremely extensible: users can write their own indexing algorithms, leverage Lucene tokenizers or customize the Cloudant Search indexer itself with a bit of Java code.
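Because the queries quoted above are plain strings in a Lucene-style syntax, a client could assemble them from a handful of field and value pairs and pass them along as a URL parameter. The Python sketch below illustrates that idea only. The helper names, the `q` parameter and the `example.invalid` endpoint are assumptions made for illustration, not Cloudant's documented API.

```python
from urllib.parse import urlencode


def term(field, value):
    """Return a field:value clause, quoting values that contain whitespace."""
    text = str(value)
    if any(ch.isspace() for ch in text):
        # Phrase values such as "my world" need quotes to stay one term.
        text = '"' + text.replace('"', '\\"') + '"'
    return f"{field}:{text}"


def build_query(*clauses):
    """Join clauses with spaces, as in the queries quoted from the blog post."""
    return " ".join(clauses)


def search_url(base_url, query):
    """Attach the query as a URL-encoded parameter.

    The parameter name "q" and the endpoint layout are hypothetical.
    """
    return f"{base_url}?{urlencode({'q': query})}"


# Rebuild the last query from the quoted blog post.
query = build_query(
    term("type", "person"),
    "fan-of:(justin (bieber OR timberlake))",  # grouped clause passed through as-is
    term("city", "strat*"),
    term("state", "ontario"),
)

# Placeholder host and path, not a real Cloudant endpoint.
url = search_url("https://example.invalid/mydb/_search", query)

print(query)
print(url)
```

Keeping the query a plain string means the same text a person would type into a search box can be sent unchanged by an application, which is the ease of use the article describes.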

All this ease of use comes with some costs, however. Cloudant Search requires an initial, one-time indexing pass and, as would be expected from a search product baked into the database itself, the index takes up disk space. Cloudant have also restricted the availability of some search parameters to limit resource consumption.

If NoSQL truly is the way of the future, then it will rapidly see applications that face non-technical users. Plain English searches will be imperative once that happens, and Cloudant have done well to preempt this requirement with their search product.