SQL: transactions and data manipulation

Databases provide a great tool for storing large masses of data. But how do we write the data correctly into it? Are there mechanisms to ensure that we get the data right? Let's find out what transactions and their properties are, as well as basic commands for manipulating data.

Share

Reading time: 5 minutes

As we saw in the article SQL: creating a database, creating a database requires both conceptual context modeling and the definition of SQL scripts for defining table structures. Once our database has been created, it is necessary to populate it with data available to us and/or information from the interactions of our application users. To do this we need to use commands that fall under the category Data Manipulation Language or more simply DML.

Specifically, some DML instructions change the state of the database, that is, alter its contents. The commands we will analyze in this article are:

  • INSERT
  • DELETE
  • UPDATE

When these instructions are executed, the DataBase Management System (DBMS) must verify that all constraints defined on individual fields and referential integrity constraints between tables are respected. Otherwise, the data change must not be executed and the database must be returned to the state prior to the execution of the offending instruction. This is the “Consistency” property, one of the ACID properties of transactions, which also applies to individual instructions.

Before proceeding with an analysis of the syntax of a database modification instruction, let us introduce the concept of a transaction and its properties so that the context in which we operate is clearer.

Transactions

Transactions are a fundamental concept in databases. Introduced from the earliest development of relational databases, they have played a key role in the success of relational databases in the marketplace. Even today, they are one of the main reasons why relational databases are more widely used in many contexts. In fact, many NoSQL databases did not offer this feature until a few years ago. For example, MongoDB introduced transactions only since version 3 and with some limitations. Other NoSQL databases still do not implement this feature because of the complex management of transactions in a distributed environment.

But what are transactions? To better understand their mechanism, let us first take an example (also quite classic!).

Suppose a customer of a bank wants to transfer money from his bank account to that of another customer of the same bank. Note well that we have simplified the context to make sure that the write operations are within the same database and better understand the concept of transaction. In this example, the application will have to perform at least two write operations to the database: a subtraction of the amount specified by the user from his account balance and a credit of the same amount to the recipient’s bank account. Both of these write operations must be performed smoothly. Otherwise you can imagine that someone will lose money!

The problems that can generate an interruption in one of the operations are many. It could be that the client arranging the money transfer does not have enough funds in his account, or that the IBAN of the recipient’s account does not exist. Technical problems may also arise, such as a service interruption while the transaction is taking place.

The DBMS must therefore provide a robust mechanism so that it cancels all executed transactions if something abnormal happens during the execution of the procedure. This mechanism is called a transaction. Transactions provide not only the mechanism to restore the database data following a failure, but also the management of concurrent access to the data.

Let us now look at the properties of transactions to better understand them.

Transaction properties

Transactions are defined as a logical unit of work, not further decomposable composed of SQL read and modify data statements that takes the database from one consistent state to another consistent state. The data modified by the transaction will be visible and available to other transactions and users only when the transaction successfully terminates. The command that indicates successful termination is COMMIT. Otherwise when an error happens, the command that is executed is ROLLBACK.

The main properties of transactions, summarized by the acronym ACID are:

  • Atomicity
  • Consistency
  • Isolation
  • Durability

Atomicity

A transaction is defined as an indivisible unit (atom) of work, that is, all operations contained within the transaction must be executed without error. If an error or failure occurs during the execution of the operations contained within the transaction, everything that has been executed up to that point must be undone. The database cannot remain in an intermediate state assumed during the execution of a transaction.

Consistency

The execution of a transaction must bring the database from a consistent (correct) initial state to a consistent final state. Correctness is verified by the integrity constraints defined on the database. When a violation of an integrity constraint occurs, the DBMS takes action to cancel the transaction or, to change the state of the database by eliminating the violation of the constraint.

Isolation

The execution of a transaction is independent of the concurrent execution of other transactions. Changes on data made by one transaction are not visible by other transactions until the transaction is successfully terminated. For this reason, the DBMS does not save intermediate transaction states because they are not stable to be shared with other transactions. In fact, an intermediate state could be undone by a subsequent rollback, which would result in a domino effect on all transactions that have read the intermediate state.

Durability

Changes made within a successfully terminated transaction (COMMIT has been executed) are stored permanently. This means that, even if the data has not yet been saved to disk, the DBMS provides mechanisms for restoring the correct state of the database after a failure has occurred.

Data manipulation operations

Now that we have seen the properties of transactions that also apply to individual transactions if they are not included in a transaction, let us look at data manipulation statements. Each statement we will look at can update the contents of only one table at a time.

INSERT

This command is used to insert one or more rows of data into a database. Before inserting data, you must make sure that the table has already been created. The syntax of the INSERT query is as follows:

				
					INSERT INTO TableName [(ColumnsList)]
VALUES (ValueList);

				
			

It is also possible to insert multiple tuples that are returned by the execution of a query. The syntax in this case is as follows:

				
					INSERT INTO TableName [(ColumnsList)]
Query;

				
			

When an INSERT is performed, the DBMS checks that the data type provided for each field is correct, that the domain, tuple, and referential integrity constraints (if they are defined) are satisfied.

UPDATE

This command is used to update existing data in one or more rows or columns using UPDATE and WHERE clauses. The syntax of the UPDATE query is as follows:

				
					UPDATE TableName
SET column = expression
       {, column = expression } 
[ WHERE predicate];
				
			

If the UPDATE changes fields on which a referential integrity constraint is defined, the DBMS must verify that the data remains consistent. It relies on the table definitions to do this. It may, therefore, cancel the operation or cascade the change onto the referential tables.

DELETE

This command is used to remove one or more existing records from database tables. The syntax of the DELETE query is as follows:

				
					DELETE FROM TableName
[ WHERE predicate];
				
			

As with UPDATE, the DBMS must check for referential integrity constraints. Based on the definition in the CREATE TABLE it can either undo the operation or propagate the deletion to the referential tables.

More To Explore

Python language

Pandas: data analysis with Python [part 1].

Data scientists continually need to read, manipulate, and analyze data. In many cases they use specific tools, but sometimes they need to develop their own code. To do this, the Pandas library comes to our aid. Let’s learn about its data structures, how we can read data from different sources and manipulate it for our purposes.

Artificial intelligence

Gradio: web applications in Python for AI [Part 3]

With Gradio, it is possible to create web applications for our machine learning and AI models in just a few lines of code. Through some examples, we will see the advanced features available, such as authentication, caching, and input file processing. We will also build a chatbot and an image classifier from pre-trained models. Finally we will discuss how to deploy our project in a few simple steps.

Leave a Reply

Your email address will not be published. Required fields are marked *

Progetta con MongoDB!!!

Acquista il nuovo libro che ti aiuterà a usare correttamente MongoDB per le tue applicazioni. Disponibile ora su Amazon!

Design with MongoDB

Design with MongoDB!!!

Buy the new book that will help you to use MongoDB correctly for your applications. Available now on Amazon!