ACID properties and SQL unit testing

As I mentioned in other articles, we live in an era in which data are the main asset of many companies and not infrequently a large part of their value comes from the amount of data they handle.

Have you ever wondered how the hell big companies like Google, Airbnb or Apple manage the immense volume of information they store without apparent incidents (at least serious)?  Database maintenance is a delicate job that can cause serious errors that compromise the integrity of users and customers information. What happens if for example several processes try to access or modify the same data at once? When it comes to small businesses with modest databases, mistakes are often solved with a few hours of re-inserting lost data. But in big data companies, data loss is not something that can they can afford. What standards are then followed in database administration to prevent serious damage?

The ACID properties

In reality, it is not that database administrators in Big Data companies don’t make mistakes. Rather they can’t commit error provoking changes. This is achieved by having SQL databases systems ACID compliant.

The “ACID” acronym stands for atomicity, consistency, isolation and durability. These features ensure that changes made in a database will be reliable and will not cause cascading errors if you make the wrong change.

Atomicity : It is the idea of “either all or nothing at all”. If a fault exists in a set of operations, any changes made are automatically rolled back.

Consistency : This property implies that any (successful) change will make the database transition from a valid state to another valid state. The operations carried out will not violate system restrictions.

Isolation : There will be times at which several operations are executed on the same data. Isolation guarantees that one operation does not affect another, that is, each change’s result will not affect other processes.

Durability : It is the guarantee that once a change takes place in the database, it will persist even if the system crashes after making such a change.

Transactions: code used in unit testing and to enforce ACID properties

Transactions are blocks of SQL code that modify the database. A transaction can consist of insert, update, and delete commands.

The best thing about transactions is that if a single order it contains and error, no change is made. This is achieved because the transactions do not affect the actual data but they rather temporarily clone the targeted tables and execute the orders on these “ghost” data. If the operation fails, the actual data remains the same. If, on the contrary, the code block runs smoothly the transaction implements the specified changes in the actual tables.

In essence, transactions are a very attractive way of performing unit testing in our SQL queries. If we are not sure that our SQL code will execute correctly, we can run it as a transaction to test it.

Syntax of a transaction

The syntax of a transaction is very simple. They all start with the START TRANSACTION command and end with the COMMIT command. Suppose I want to insert a new user in the users table and at the same time to change the password of an existing user called demouser:

START TRANSACTION ;
INSERT INTO USERS (NAME, PASSWORD) values ('usertest', 'P4ssw0rdt3st');
UPDATE USERS SET PASSWORD = 'N3wp4ss25' WHERE NAME = 'demouser';
COMMIT ;

With this small code block we ensure that either both operations run, or none of them run (on the actual data) in case of error. The COMMIT command ensures that changes are permanently implemented in case of running correctly.

Savepoints and Rollbacks

When we write a SQL transaction we can implement savepoints.  Savepoints allow us to return to the state in which the database was before further changes were made. 

Savepoints are a safe way to undo part of a transaction. Following with the example above, let’s say I want to insert a new user but this time I am just interested on testing the password change to the user demouser. After changing the password, if the operation runs successfully, I want to undo the change because I am only interested in seeing if this operation would run correctly. In this case I would create a savepoint between the two queries and would write the ROLLBACK TO command to go back to my savepoint:

START TRANSACTION;
INSERT INTO USERS (NAME, PASSWORD) VALUES ('usertest', 'P4ssw0rdt3st');
SAVEPOINT Point_1
UPDATE USERS SET PASSWORD = 'N3wp4ss25' WHERE NAME = 'demouser';
ROLLBACK TO Point_1

This transaction, if successful, would insert the user usertest but would not change demouser’s password since the transaction would leave the database in the state it was before creating my savepoint Point_1.

Concurrency 

The concept of concurrency is key to guarantee ACID properties.  Usually multiple users work in a a database simultaneously. In case several transactions try to access the same information at the same time, concurrency makes it possible for all those processes to run without errors. 

In this context, when the first user runs an operation on some data, the database engine will block this data from being accessed by any other processes until the first operation has finished. If the first transaction changes the data it was targeting, subsequent transactions will be made on these new data.

Conclusion

Assuring ACID compliance is key for companies that manage high volumes of data and for companies for which data are an important asset in their business model. In essence, ACID properties allow us to ensure to a certain degree the integrity of the information with which we work, and these can be enforced with code in the form of transactions.

Leave a Reply

Your email address will not be published.