Why Predictive Applications Require ACID

2017-10-12T00:44:13+00:00

Many predictive applications require atomic updates of data (the “A” in ACID). For example, suppose you are extracting a day's or an hour's worth of orders from an ERP system or transactions from a POS system, and something goes wrong halfway through the ingestion. You simply cannot leave the database in an inconsistent state with only half the orders updated. You have to roll the database back to a consistent state.

In another example, suppose we are implementing a supply chain “crystal ball” application that predicts some orders will be 5 days late. To provide a report of the stock-outs caused by these anticipated late orders, the application needs to take each order and its line items and “atomically” reschedule them.

For example, suppose an order with 5 line items is changed to a new delivery date, but while the system is processing this change, an exception is thrown and the operation aborts with only a subset of the line items updated. In this case, the database would be in an inconsistent state. The inventory would have incorrect counts, and any commitments or allocations could be flawed. This is why what-if planning systems require transactional ACID properties. If the entire order cannot be successfully updated, the database must roll back to a consistent state. The requirement becomes even more pressing in systems that allow many planners to plan simultaneously, or that update the state of planning systems concurrently, because each planner or process must be protected from seeing the others' changes in the database before those changes are complete.
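The all-or-nothing rescheduling described above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not Splice Machine code: the `line_items` table, the `reschedule_order` function, and the `fail_after` test hook are all hypothetical names introduced for the example.

```python
import sqlite3


def reschedule_order(conn, order_id, new_date, fail_after=None):
    """Move every line item of an order to new_date, all or nothing.

    fail_after is a hypothetical test hook: it raises an exception after
    that many line items have been updated, to simulate a mid-update failure.
    Returns True if the whole order was rescheduled, False if it was rolled back.
    """
    item_ids = [row[0] for row in conn.execute(
        "SELECT item_id FROM line_items WHERE order_id = ? ORDER BY item_id",
        (order_id,))]
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception escapes it.
        with conn:
            for count, item_id in enumerate(item_ids):
                if fail_after is not None and count == fail_after:
                    raise RuntimeError("simulated failure mid-update")
                conn.execute(
                    "UPDATE line_items SET delivery_date = ? WHERE item_id = ?",
                    (new_date, item_id))
        return True
    except RuntimeError:
        return False


def delivery_dates(conn, order_id):
    """Return the set of distinct delivery dates across an order's line items."""
    return {row[0] for row in conn.execute(
        "SELECT delivery_date FROM line_items WHERE order_id = ?",
        (order_id,))}


# Set up one order with 5 line items, all due on the same date.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE line_items ("
    "item_id INTEGER PRIMARY KEY, order_id INTEGER, delivery_date TEXT)")
with conn:
    conn.executemany(
        "INSERT INTO line_items (order_id, delivery_date) VALUES (?, ?)",
        [(1, "2017-10-12")] * 5)

# The update fails after 3 of the 5 line items; the rollback undoes those 3.
succeeded = reschedule_order(conn, 1, "2017-10-17", fail_after=3)
print(succeeded, delivery_dates(conn, 1))
```

After the failed attempt, all five line items still carry the original date, so inventory counts derived from them remain consistent. Without the transaction, three items would show the new date and two the old one.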

Because of the lack of ACID compliance, developers have been unable to easily build predictive applications that are always on and able to plan into the future. Instead, they keep the application loosely coupled from the analytics. The misconception in the marketplace is that you only need ACID compliance for systems of record such as ERP, CRM, HR, and POS systems. This is untrue. AI systems that act as advisory systems, providing insight into the possible futures of an enterprise, need ACID just as much.

How Splice Machine Implements ACID

Splice Machine is an ACID-compliant system that implements a form of multiversion concurrency control (MVCC) called Snapshot Isolation. An update to a Splice Machine record never modifies the record “in place”; instead, it adds a new version of the data marked with a timestamp. Readers can only see data with timestamps that were “committed” before the read begins. When a transaction completes, the system marks the transaction as committed and advances the global notion of the current timestamp to that transaction's commit timestamp.
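The visibility rule described above can be illustrated with a toy in-memory store. This is a sketch under simplifying assumptions, not Splice Machine's implementation: the `SnapshotStore` and `Transaction` classes are hypothetical, writes are buffered and stamped only at commit, everything runs in a single process, and write-write conflict detection is omitted.

```python
import itertools


class SnapshotStore:
    """Toy multiversion store: commits append versions, reads use a snapshot."""

    def __init__(self):
        self._clock = itertools.count(1)  # hypothetical global timestamp source
        self.versions = {}                # key -> list of (commit_ts, value)
        self.last_commit_ts = 0           # newest timestamp visible to new readers

    def begin(self):
        # A transaction's snapshot is fixed at the moment it begins.
        return Transaction(self, self.last_commit_ts)


class Transaction:
    def __init__(self, store, snapshot_ts):
        self.store = store
        self.snapshot_ts = snapshot_ts
        self.writes = {}  # buffered writes, invisible to others until commit

    def read(self, key):
        if key in self.writes:  # a transaction sees its own writes
            return self.writes[key]
        # Versions are appended in commit order, so scan newest-first and
        # return the first one committed at or before this snapshot.
        for commit_ts, value in reversed(self.store.versions.get(key, [])):
            if commit_ts <= self.snapshot_ts:
                return value
        return None

    def write(self, key, value):
        self.writes[key] = value  # nothing is changed in place

    def commit(self):
        commit_ts = next(self.store._clock)
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((commit_ts, value))
        # Advance the global timestamp so later transactions see these versions.
        self.store.last_commit_ts = commit_ts
        return commit_ts


store = SnapshotStore()
setup = store.begin()
setup.write("order-1", "2017-10-12")
setup.commit()

reader = store.begin()   # snapshot taken before the writer commits
writer = store.begin()
writer.write("order-1", "2017-10-17")
writer.commit()

print(reader.read("order-1"))         # the reader's snapshot is unchanged
print(store.begin().read("order-1"))  # a new transaction sees the new version
```

The reader that began before the writer committed keeps seeing the original delivery date for its whole lifetime, and neither transaction waits on the other. A production system would also need to detect conflicting concurrent writes to the same key, which this sketch leaves out.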

With Snapshot Isolation, readers of data do not block writers and writers do not block readers. This enables fine-grained concurrency, and multiversion concurrency of this kind is standard in traditional databases such as Oracle, Microsoft SQL Server, MySQL, and Postgres. Splice Machine's distinctive contribution is making Snapshot Isolation work on a distributed scale-out architecture such as HBase and Hadoop.
