Databricks announced Monday at the sold-out Spark Summit 2014 that it had received $33 million in Series B funding from New Enterprise Associates. This follows a $14 million Series A round in September 2013.
The company also announced that its Databricks Cloud service will launch on Amazon Web Services this fall. The service is currently in limited beta and may eventually be available on other clouds such as Google or Azure. Databricks Cloud is a managed service based on Spark and will be supported by Databricks.
Apache Spark is an open source project that makes big data analysis faster. It works through in-memory computation. According to its website, it allows developers to “Run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.”
“Spark unifies what until now have been disparate components of Hadoop, according to Berkeley Computer Science professor and Databricks Chief Executive Ion Stoica, providing a single framework for data streaming, machine learning, graph processing, SQL (the most common language for querying databases) and other functions so that users don’t have to manually set up and integrate all these tools themselves,” said The Wall Street Journal.
Although Apache Spark is already deployable on AWS, Databricks Cloud will help users manage the service. “Getting the full value out of their Big Data investments is still very difficult for organizations. Clusters are difficult to set up and manage, and extracting value from your data requires you to integrate a hodgepodge of disjointed tools, which are themselves hard to use. Our vision when founding Databricks was to free users to focus on turning data into value, instead of struggling with existing tools and systems,” said Databricks CEO Ion Stoica.
“Databricks Cloud delivers on this vision by combining the power of Spark with a zero-management hosted platform and an initial set of applications built around common workflows.”