Basics of SQL

Python Mastery: From Beginner to Expert - Sykalo Eugene 2023

Basics of SQL
Working with databases

Introduction to SQL

SQL (Structured Query Language) is a programming language used to manage and manipulate relational databases. It was first developed in the 1970s and has since become the standard language for managing databases.

SQL is used to create, modify, and query databases. It allows users to interact with the data stored in a database and retrieve specific information based on a set of criteria. SQL is an essential tool for anyone who works with databases, including data analysts, data scientists, and software developers.

Some common tasks that can be performed using SQL include:

  • Creating new databases and tables
  • Adding, updating, and deleting data from a database
  • Retrieving data from a database based on specific criteria
  • Modifying the structure of a database (e.g., adding or removing columns)

SQL is a powerful tool that can be used to manage and manipulate large amounts of data. It is also relatively easy to learn, with a straightforward syntax that can be used to perform complex tasks with just a few lines of code. In the next section, we will cover the basics of creating and manipulating databases using SQL in Python.

Creating and manipulating databases

In SQL, a database is a collection of related data that is organized in a specific way. Databases can be used to store information about customers, products, orders, and any other type of data that an organization may need to manage.

Creating a database

To create a new database in SQL, we use the CREATE DATABASE statement followed by the name of the database. Here's an example:

CREATE DATABASE mydatabase;

This will create a new database called mydatabase.

Creating a table

Once we have created a database, we can create tables to store our data. Tables are used to organize data into rows and columns, similar to a spreadsheet. To create a new table, we use the CREATE TABLE statement followed by the name of the table and a list of columns. Here's an example:

CREATE TABLE customers (
 id INT PRIMARY KEY,
 name VARCHAR(255),
 email VARCHAR(255),
 age INT
);

This will create a new table called customers with four columns: id, name, email, and age. The id column is designated as the primary key, which means that it will contain a unique identifier for each row in the table.

Inserting data

Once we have created a table, we can insert data into it using the INSERT INTO statement. Here's an example:

INSERT INTO customers (id, name, email, age)
VALUES (1, 'John Smith', 'john@example.com', 30),
 (2, 'Jane Doe', 'jane@example.com', 25),
 (3, 'Bob Johnson', 'bob@example.com', 40);

This will insert three rows of data into the customers table.

Updating data

We can update data in a table using the UPDATE statement. Here's an example:

UPDATE customers
SET age = 35
WHERE name = 'John Smith';

This will update the age column for the row where the name is 'John Smith'.

Deleting data

We can delete data from a table using the DELETE FROM statement. Here's an example:

DELETE FROM customers
WHERE age > 40;

This will delete all rows from the customers table where the age is greater than 40.

Data types and operations

In SQL, each column in a table has a data type that specifies the kind of data it can store. Some common data types in SQL include:

  • INT (integer): used to store whole numbers
  • VARCHAR (variable-length character): used to store text strings of varying lengths
  • DATE: used to store dates
  • FLOAT (floating-point number): used to store decimal numbers
  • BOOLEAN: used to store true/false values

When creating a table, we must specify the data type for each column. Here's an example:

CREATE TABLE employees (
 id INT PRIMARY KEY,
 name VARCHAR(255),
 age INT,
 salary FLOAT
);

This creates a new table called employees with four columns: id, name, age, and salary, each with a specified data type.

In addition to data types, SQL supports a variety of operations for working with data in tables. Some common operations include:

  • SELECT: used to retrieve data from a table based on specified criteria
  • WHERE: used to filter data based on specific conditions
  • GROUP BY: used to group rows together based on a specific column
  • ORDER BY: used to sort rows based on a specific column
  • JOIN: used to combine data from two or more tables based on a related column

Here's an example of a SELECT statement that retrieves data from the employees table:

SELECT name, age, salary
FROM employees
WHERE salary > 50000
ORDER BY age DESC;

This statement will retrieve the name, age, and salary columns from the employees table for all employees with a salary greater than 50000, sorted by age in descending order.

Querying databases

In SQL, querying a database means retrieving data from one or more tables based on specified criteria. The most commonly used statement for querying a database is the SELECT statement.

The SELECT statement

The SELECT statement is used to retrieve data from one or more tables in a database. Here's the basic syntax of a SELECT statement:

SELECT column1, column2, ...
FROM table1
WHERE condition;

In this syntax, column1, column2, etc. are the names of the columns that you want to retrieve data from. table1 is the name of the table that you want to retrieve data from, and condition is an optional expression that specifies the criteria for selecting rows.

Here's an example of a SELECT statement:

SELECT *
FROM customers
WHERE age > 30;

This statement will retrieve all columns (*) from the customers table for all customers with an age greater than 30.

The WHERE clause

The WHERE clause is used to filter the results of a SELECT statement based on specific criteria. Here's the basic syntax of a WHERE clause:

SELECT column1, column2, ...
FROM table1
WHERE condition;

In this syntax, condition is an expression that specifies the criteria for selecting rows.

Here's an example of a SELECT statement with a WHERE clause:

SELECT *
FROM customers
WHERE age > 30 AND email LIKE '%@example.com';

This statement will retrieve all columns (*) from the customers table for all customers with an age greater than 30 and an email that contains @example.com.

The ORDER BY clause

The ORDER BY clause is used to sort the results of a SELECT statement based on a specific column. Here's the basic syntax of an ORDER BY clause:

SELECT column1, column2, ...
FROM table1
WHERE condition
ORDER BY column_name ASC|DESC;

In this syntax, column_name is the name of the column that you want to sort by, and ASC or DESC specifies the order of the sort (ascending or descending).

Here's an example of a SELECT statement with an ORDER BY clause:

SELECT *
FROM customers
WHERE age > 30
ORDER BY name ASC;

This statement will retrieve all columns (*) from the customers table for all customers with an age greater than 30, sorted by name in ascending order.

The GROUP BY clause

The GROUP BY clause is used to group the results of a SELECT statement based on the values in one or more columns. Here's the basic syntax of a GROUP BY clause:

SELECT column1, column2, ...
FROM table1
WHERE condition
GROUP BY column_name;

In this syntax, column_name is the name of the column that you want to group by.

Here's an example of a SELECT statement with a GROUP BY clause:

SELECT age, COUNT(*)
FROM customers
GROUP BY age;

This statement will retrieve the age column and the count of rows for each age group from the customers table, grouped by age.

Updating and deleting data

In SQL, we can update data in a table using the UPDATE statement. Here's the basic syntax of an UPDATE statement:

UPDATE table_name
SET column1 = new_value1, column2 = new_value2, ...
WHERE condition;

In this syntax, table_name is the name of the table that you want to update, and column1, column2, etc. are the names of the columns that you want to update. new_value1, new_value2, etc. are the new values that you want to set for the columns. condition is an expression that specifies the criteria for selecting the rows to update.

Here's an example of an UPDATE statement:

UPDATE customers
SET age = 35
WHERE name = 'John Smith';

This statement will update the age column for the row where the name is 'John Smith'.

We can also delete data from a table using the DELETE statement. Here's the basic syntax of a DELETE statement:

DELETE FROM table_name
WHERE condition;

In this syntax, table_name is the name of the table that you want to delete from, and condition is an expression that specifies the criteria for selecting the rows to delete.

Here's an example of a DELETE statement:

DELETE FROM customers
WHERE age > 40;

This statement will delete all rows from the customers table where the age is greater than 40.

Best practices for working with SQL in Python

When working with SQL in Python, there are several best practices that can help you write better code and avoid common pitfalls.

Use parameterized queries

One of the most important best practices when working with SQL in Python is to use parameterized queries. Parameterized queries are a way to write SQL queries that include placeholders for data that will be provided later. Here's an example:

import sqlite3

conn = sqlite3.connect('mydatabase.db')
c = conn.cursor()

name = 'John'
age = 30

c.execute('SELECT * FROM customers WHERE name=? AND age=?', (name, age))

In this example, we're using parameterized queries to retrieve data from a table called customers. The ? placeholders in the query will be replaced with the values of the name and age variables when the query is executed. Using parameterized queries is important because it helps protect against SQL injection attacks, where an attacker could use malicious data to modify or delete data in your database.

Avoid using string concatenation

Another best practice when working with SQL in Python is to avoid using string concatenation to build SQL queries. String concatenation can be error-prone and can make your code vulnerable to SQL injection attacks. Instead, use parameterized queries or a library that provides a safe way to build SQL queries, such as SQLAlchemy or Django's ORM.

Use transactions

When making changes to a database, it's a good idea to use transactions. Transactions are a way to group multiple SQL statements into a single unit of work that can be either completed or rolled back if an error occurs. Here's an example:

import sqlite3

conn = sqlite3.connect('mydatabase.db')
c = conn.cursor()

try:
 conn.begin()
 c.execute('INSERT INTO customers (name, age) VALUES (?, ?)', ('John Smith', 30))
 c.execute('UPDATE customers SET age=? WHERE name=?', (35, 'John Smith'))
 conn.commit()
except:
 conn.rollback()

In this example, we're using transactions to insert a new row into a table called customers, update the age of that row, and commit the changes. If an error occurs, the rollback() method is called to undo the changes.

Close connections when done

When you're finished working with a database in Python, it's important to close the connection to the database. Failing to do so can lead to resource leaks and other issues. Here's an example:

import sqlite3

conn = sqlite3.connect('mydatabase.db')
c = conn.cursor()

# do some work...

conn.close()

In this example, we're using the close() method to close the connection to the database. It's important to always close the connection when you're done working with the database.