Databricks announced its intention to acquire Arcion, an enterprise data replication specialist and a part of the Databricks Ventures portfolio. The acquisition, valued at over $100 million, is set to bolster Databricks’ capability to natively ingest data from a wide range of databases and SaaS applications into its Lakehouse Platform.
The announcement follows Databricks’ recent $500M funding round, which closed in September and saw participation from prominent investors like Nvidia and Capital One. The acquisition of Arcion also quickly follows Databricks’ acquisition of MosaicML in July.
What Arcion Brings to Databricks
Arcion, founded in 2016, specializes in ingesting and replicating data in real time from the databases it supports. Its change data capture engine processes changes as they occur, and it connects with more than 20 enterprise databases and data warehouses. The acquisition lets Databricks handle streaming data efficiently, apply its existing governance and security measures, and use this data for real-time analytics.
As global data volumes surge, data ingestion and replication are becoming more critical, and organizations require effective mechanisms to manage the escalating complexity and volume of data. Such capabilities prevent the data silos that may arise when data is stored across various databases and platforms.
Arcion’s ingest technology is built on a change data capture (CDC) engine and offers connectors for more than 20 enterprise databases and data warehouses. This means that with Arcion’s integration, Databricks will be able to facilitate either continuous or on-demand data ingestion into the lakehouse. This will be fully integrated with the enterprise security, governance, and compliance features of the Databricks platform, simplifying the overall process for businesses.
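To make the change data capture idea more concrete, here is a minimal Python sketch of how a stream of captured changes might be applied to a replica. It is an illustration only and does not reflect Arcion's or Databricks' actual implementation or APIs: the ChangeEvent type, the apply_changes function, and the in-memory dictionary standing in for a lakehouse table are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change from a source database (hypothetical shape)."""
    op: str                               # "insert", "update", or "delete"
    key: int                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new row image; None for deletes


def apply_changes(
    replica: Dict[int, Dict[str, Any]],
    events: Iterable[ChangeEvent],
) -> Dict[int, Dict[str, Any]]:
    """Apply change events, in order, to an in-memory replica keyed by primary key."""
    for event in events:
        if event.op in ("insert", "update"):
            if event.row is None:
                raise ValueError(f"{event.op} event for key {event.key} has no row")
            # Upsert: the latest row image replaces whatever the replica held.
            replica[event.key] = dict(event.row)
        elif event.op == "delete":
            # Deleting a key that is already absent is treated as a no-op.
            replica.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")
    return replica


# Example: the replica starts with one row and receives three changes.
replica = {1: {"name": "Ada"}}
events = [
    ChangeEvent("insert", 2, {"name": "Grace"}),
    ChangeEvent("update", 1, {"name": "Ada L."}),
    ChangeEvent("delete", 2),
]
print(apply_changes(replica, events))  # prints {1: {'name': 'Ada L.'}}
```

In this sketch, continuous ingestion would correspond to calling apply_changes on each new batch of events as it arrives, while on-demand ingestion would apply an accumulated batch only when requested. Only the changes move, not the full table, which is what keeps latency low.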
Replication ensures data consistency across diverse locations. Although Databricks’ existing platform can connect to various data ingestion tools and cloud data warehouses, it lacks native data ingestion and replication features for customer-generated data. Arcion’s capabilities will eliminate the need for third-party tools for these processes.
With Arcion’s technology, Databricks can offer real-time replication of operational data, ensuring businesses have up-to-date information at their fingertips, which is vital for real-time analytics and decision-making. This should provide a nice extension of Databricks’ existing ETL solution.
Real-time data is becoming increasingly valuable as organizations harness generative AI models that necessitate ongoing training. Arcion’s technology, which captures real-time data changes, addresses the growing demand for low latency use cases, enriching Databricks’ existing integration technologies and its expanding streaming data capacities.
Without such tools, Databricks users would have had to create their own integrations, often relying heavily on integration platform as a service (iPaaS) vendors to feed their AI and machine learning workloads. This acquisition gives Databricks an integrated, platform-native option alongside third-party iPaaS solutions.
With the acquisition of Arcion, Databricks not only strengthens its technical offerings but also potentially expands its customer base and reach. It represents a strategic move to address critical challenges faced by enterprises and to solidify its position as a market leader, underscoring the growing importance of seamless data integration in the era of big data and AI. The acquisition of Arcion by Databricks coincides with IBM’s purchase of Manta, showing that data management is fast becoming a critical focus in enterprise AI.
The move also furthers Databricks’ strategic push into AI. By integrating Arcion’s technology with MosaicML (an AI infrastructure startup Databricks acquired earlier), Databricks can deliver a robust system where Arcion acts as the primary data source feeding MosaicML. This synergy could streamline the process for clients wanting to build their AI models, making the entire workflow more efficient.
This is a strong acquisition that shows Databricks putting the pieces together to be able to offer a more comprehensive solution to its customers, making it an attractive choice for enterprises looking for a one-stop solution for their data and AI needs. This nicely sets the stage for further growth.
Disclosure: Steve McDowell is an industry analyst, and NAND Research is an industry analyst firm, that engages in, or has engaged in, research, analysis, and advisory services with many technology companies, which may include those mentioned in this article. Mr. McDowell does not hold any equity positions with any company mentioned in this article.