December 3, 2012
Longtime technology commentators understand the tension between platform companies, intent on both growing their business and creating a healthy ecosystem, and ecosystem partners, who leverage what the platform brings but remain apprehensive about the long-term intentions of the platform vendor. Case in point: the growing number of services offered by Amazon Web Services that encroach upon what AWS partners provide. The launch of Redshift at AWS’ Re:Invent conference in Las Vegas threw this tension into sharp relief yet again.
Redshift, according to AWS, is a service that offers:
…a fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud. Amazon Redshift offers you fast query performance when analyzing virtually any size data set using the same SQL-based tools and business intelligence applications you use today. With a few clicks in the AWS Management Console, you can launch a Redshift cluster, starting with a few hundred gigabytes of data and scaling to a petabyte or more, for under $1,000 per terabyte per year.
The obvious main targets AWS has in its sights are the large data warehouse and analysis vendors like Oracle, IBM and HP. But there are also companies using AWS infrastructure to build products that now compete directly with Redshift, companies like BitYota and Kognitio. Over on GigaOm there was a short piece covering BitYota’s reaction, which essentially went along expected lines:
Now, BitYota CEO Dev Patel, a former Yahoo exec, says there are data warehouses and then there are data warehouses. Bityota built its Software-as-a-Service offering from the ground up with its own technology and crafted it so users won’t have to sweat how to configure compute instances or storage. And doesn’t include Hadoop, so it will sport a performance advantage there. And they won’t have to hire Hadoop eggheads, who are expensive and hard to find.
Now of course there are differences between Redshift and other analysis/warehouse tools: Redshift itself works with existing analytics products from the likes of Jaspersoft and Cognos, whereas BitYota and Kognitio bundle warehousing and analytics. But that’s shaky ground on which to build a defensible product play. I wanted to dive into what all this means for the ecosystem, and spent some time with Kognitio to get their take on it all. Kognitio was putting on a brave face, talking more about how it’s ready to be used with Amazon’s upgraded instance types, but the real meat comes when the discussion turns to Redshift.
Kognitio told me that they expected something from Amazon at the event, but that in their view Redshift is clearly targeted at pulling Oracle data warehouse users from on-premises into the cloud on AWS, rather than into Oracle’s own cloud environment. They pointed out that AWS keynote speakers even “poked fun” at vendors like Oracle as being “cloud washed”. As such, their view is that the Redshift introduction supports Kognitio’s position that there is a need for specialist public-cloud-based options for information management.
In their view, Redshift is just another data source from which they can read, load data into RAM and execute queries. As such, Kognitio is a complementary offering to Redshift: data can be stored efficiently in an AWS cloud environment and then loaded rapidly into Kognitio’s in-memory structures for high-speed analytical processing. They pointed out that Kognitio runs in both public and private environments, and hence delivers on the real enterprise need for a hybrid offering.
Kognitio was quick to sing the complementary praises of Redshift:
While we don’t see head-to-head comparisons happening often, we would clarify that RedShift is an excellent replacement for existing data warehouses – those systems that store data – while Kognitio is specialized to accelerate the overall BI infrastructure as a “middle tier” between that persistence layer and the visualization/user interface layer in which it is presented
Clearly there is some substance to this perspective. Kognitio pointed out the differences between its in-memory analysis and column-based approaches:
- Data naturally occurs in rows, so it can be loaded much faster this way
- In-memory RAM processing is thousands of times faster than conventional spinning hard disk
- Kognitio has superior load and complex query performance as a result of considerably lower data:core ratio with no need to build columnar structures which require extensive administration and tuning
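
To make the row-versus-column distinction in that list a little more concrete, here is a minimal sketch in plain Python. It is a toy illustration only, not how Kognitio or Redshift actually store data, and the records, field names and variable names are all invented for the example. It simply shows that loading a record is one append in a row layout but touches every column in a columnar one, while an aggregate over a single field needs only one array in the columnar layout.

```python
# Toy illustration only: row-oriented vs column-oriented layouts in plain Python.
# The records and field names below are made up for this sketch.

records = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 75.5},
    {"order_id": 3, "region": "EU", "amount": 30.0},
]

# Row layout: each record is kept together, so loading one record is a single append.
row_store = []
for rec in records:
    row_store.append((rec["order_id"], rec["region"], rec["amount"]))

# Column layout: each field lives in its own list, so loading one record
# means appending to every column.
column_store = {"order_id": [], "region": [], "amount": []}
for rec in records:
    for field, value in rec.items():
        column_store[field].append(value)

# Summing one field walks over whole rows in the row layout...
total_from_rows = sum(row[2] for row in row_store)

# ...but reads only the single list it needs in the column layout.
total_from_columns = sum(column_store["amount"])

print(total_from_rows, total_from_columns)
```

Both layouts give the same answer, of course; the trade-off the vendors are arguing about is which operations each layout makes cheap.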
Kognitio’s points are all valid, and they speak to a real difference between what Redshift is today and what the analytics vendors provide. But AWS moves fast, and I’d be surprised if these vendors weren’t spending lots of time strategizing about what they’re going to do if, and when, AWS decides to eat their lunch. The joys of a rapidly developing ecosystem, huh?