Python Mastery: From Beginner to Expert - Sykalo Eugene 2023
Basics of SQL
Working with databases
Introduction to SQL
SQL (Structured Query Language) is a programming language used to manage and manipulate relational databases. It was first developed in the 1970s and has since become the standard language for managing databases.
SQL is used to create, modify, and query databases. It allows users to interact with the data stored in a database and retrieve specific information based on a set of criteria. SQL is an essential tool for anyone who works with databases, including data analysts, data scientists, and software developers.
Some common tasks that can be performed using SQL include:
- Creating new databases and tables
- Adding, updating, and deleting data from a database
- Retrieving data from a database based on specific criteria
- Modifying the structure of a database (e.g., adding or removing columns)
SQL is a powerful tool that can be used to manage and manipulate large amounts of data. It is also relatively easy to learn, with a straightforward syntax that can be used to perform complex tasks with just a few lines of code. In the next section, we will cover the basics of creating and manipulating databases using SQL in Python.
Creating and manipulating databases
In SQL, a database is a collection of related data that is organized in a specific way. Databases can be used to store information about customers, products, orders, and any other type of data that an organization may need to manage.
Creating a database
To create a new database in SQL, we use the CREATE DATABASE
statement followed by the name of the database. Here's an example:
CREATE DATABASE mydatabase;
This will create a new database called mydatabase
.
Creating a table
Once we have created a database, we can create tables to store our data. Tables are used to organize data into rows and columns, similar to a spreadsheet. To create a new table, we use the CREATE TABLE
statement followed by the name of the table and a list of columns. Here's an example:
CREATE TABLE customers (
id INT PRIMARY KEY,
name VARCHAR(255),
email VARCHAR(255),
age INT
);
This will create a new table called customers
with four columns: id
, name
, email
, and age
. The id
column is designated as the primary key, which means that it will contain a unique identifier for each row in the table.
Inserting data
Once we have created a table, we can insert data into it using the INSERT INTO
statement. Here's an example:
INSERT INTO customers (id, name, email, age)
VALUES (1, 'John Smith', 'john@example.com', 30),
(2, 'Jane Doe', 'jane@example.com', 25),
(3, 'Bob Johnson', 'bob@example.com', 40);
This will insert three rows of data into the customers
table.
Updating data
We can update data in a table using the UPDATE
statement. Here's an example:
UPDATE customers
SET age = 35
WHERE name = 'John Smith';
This will update the age
column for the row where the name
is 'John Smith'
.
Deleting data
We can delete data from a table using the DELETE FROM
statement. Here's an example:
DELETE FROM customers
WHERE age > 40;
This will delete all rows from the customers
table where the age
is greater than 40.
Data types and operations
In SQL, each column in a table has a data type that specifies the kind of data it can store. Some common data types in SQL include:
INT
(integer): used to store whole numbersVARCHAR
(variable-length character): used to store text strings of varying lengthsDATE
: used to store datesFLOAT
(floating-point number): used to store decimal numbersBOOLEAN
: used to store true/false values
When creating a table, we must specify the data type for each column. Here's an example:
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(255),
age INT,
salary FLOAT
);
This creates a new table called employees
with four columns: id
, name
, age
, and salary
, each with a specified data type.
In addition to data types, SQL supports a variety of operations for working with data in tables. Some common operations include:
SELECT
: used to retrieve data from a table based on specified criteriaWHERE
: used to filter data based on specific conditionsGROUP BY
: used to group rows together based on a specific columnORDER BY
: used to sort rows based on a specific columnJOIN
: used to combine data from two or more tables based on a related column
Here's an example of a SELECT
statement that retrieves data from the employees
table:
SELECT name, age, salary
FROM employees
WHERE salary > 50000
ORDER BY age DESC;
This statement will retrieve the name
, age
, and salary
columns from the employees
table for all employees with a salary greater than 50000, sorted by age in descending order.
Querying databases
In SQL, querying a database means retrieving data from one or more tables based on specified criteria. The most commonly used statement for querying a database is the SELECT
statement.
The SELECT statement
The SELECT
statement is used to retrieve data from one or more tables in a database. Here's the basic syntax of a SELECT
statement:
SELECT column1, column2, ...
FROM table1
WHERE condition;
In this syntax, column1
, column2
, etc. are the names of the columns that you want to retrieve data from. table1
is the name of the table that you want to retrieve data from, and condition
is an optional expression that specifies the criteria for selecting rows.
Here's an example of a SELECT
statement:
SELECT *
FROM customers
WHERE age > 30;
This statement will retrieve all columns (*
) from the customers
table for all customers with an age greater than 30.
The WHERE clause
The WHERE
clause is used to filter the results of a SELECT
statement based on specific criteria. Here's the basic syntax of a WHERE
clause:
SELECT column1, column2, ...
FROM table1
WHERE condition;
In this syntax, condition
is an expression that specifies the criteria for selecting rows.
Here's an example of a SELECT
statement with a WHERE
clause:
SELECT *
FROM customers
WHERE age > 30 AND email LIKE '%@example.com';
This statement will retrieve all columns (*
) from the customers
table for all customers with an age greater than 30 and an email that contains @example.com
.
The ORDER BY clause
The ORDER BY
clause is used to sort the results of a SELECT
statement based on a specific column. Here's the basic syntax of an ORDER BY
clause:
SELECT column1, column2, ...
FROM table1
WHERE condition
ORDER BY column_name ASC|DESC;
In this syntax, column_name
is the name of the column that you want to sort by, and ASC
or DESC
specifies the order of the sort (ascending or descending).
Here's an example of a SELECT
statement with an ORDER BY
clause:
SELECT *
FROM customers
WHERE age > 30
ORDER BY name ASC;
This statement will retrieve all columns (*
) from the customers
table for all customers with an age greater than 30, sorted by name in ascending order.
The GROUP BY clause
The GROUP BY
clause is used to group the results of a SELECT
statement based on the values in one or more columns. Here's the basic syntax of a GROUP BY
clause:
SELECT column1, column2, ...
FROM table1
WHERE condition
GROUP BY column_name;
In this syntax, column_name
is the name of the column that you want to group by.
Here's an example of a SELECT
statement with a GROUP BY
clause:
SELECT age, COUNT(*)
FROM customers
GROUP BY age;
This statement will retrieve the age
column and the count of rows for each age group from the customers
table, grouped by age.
Updating and deleting data
In SQL, we can update data in a table using the UPDATE
statement. Here's the basic syntax of an UPDATE
statement:
UPDATE table_name
SET column1 = new_value1, column2 = new_value2, ...
WHERE condition;
In this syntax, table_name
is the name of the table that you want to update, and column1
, column2
, etc. are the names of the columns that you want to update. new_value1
, new_value2
, etc. are the new values that you want to set for the columns. condition
is an expression that specifies the criteria for selecting the rows to update.
Here's an example of an UPDATE
statement:
UPDATE customers
SET age = 35
WHERE name = 'John Smith';
This statement will update the age
column for the row where the name
is 'John Smith'
.
We can also delete data from a table using the DELETE
statement. Here's the basic syntax of a DELETE
statement:
DELETE FROM table_name
WHERE condition;
In this syntax, table_name
is the name of the table that you want to delete from, and condition
is an expression that specifies the criteria for selecting the rows to delete.
Here's an example of a DELETE
statement:
DELETE FROM customers
WHERE age > 40;
This statement will delete all rows from the customers
table where the age
is greater than 40.
Best practices for working with SQL in Python
When working with SQL in Python, there are several best practices that can help you write better code and avoid common pitfalls.
Use parameterized queries
One of the most important best practices when working with SQL in Python is to use parameterized queries. Parameterized queries are a way to write SQL queries that include placeholders for data that will be provided later. Here's an example:
import sqlite3
conn = sqlite3.connect('mydatabase.db')
c = conn.cursor()
name = 'John'
age = 30
c.execute('SELECT * FROM customers WHERE name=? AND age=?', (name, age))
In this example, we're using parameterized queries to retrieve data from a table called customers
. The ?
placeholders in the query will be replaced with the values of the name
and age
variables when the query is executed. Using parameterized queries is important because it helps protect against SQL injection attacks, where an attacker could use malicious data to modify or delete data in your database.
Avoid using string concatenation
Another best practice when working with SQL in Python is to avoid using string concatenation to build SQL queries. String concatenation can be error-prone and can make your code vulnerable to SQL injection attacks. Instead, use parameterized queries or a library that provides a safe way to build SQL queries, such as SQLAlchemy or Django's ORM.
Use transactions
When making changes to a database, it's a good idea to use transactions. Transactions are a way to group multiple SQL statements into a single unit of work that can be either completed or rolled back if an error occurs. Here's an example:
import sqlite3
conn = sqlite3.connect('mydatabase.db')
c = conn.cursor()
try:
conn.begin()
c.execute('INSERT INTO customers (name, age) VALUES (?, ?)', ('John Smith', 30))
c.execute('UPDATE customers SET age=? WHERE name=?', (35, 'John Smith'))
conn.commit()
except:
conn.rollback()
In this example, we're using transactions to insert a new row into a table called customers
, update the age of that row, and commit the changes. If an error occurs, the rollback()
method is called to undo the changes.
Close connections when done
When you're finished working with a database in Python, it's important to close the connection to the database. Failing to do so can lead to resource leaks and other issues. Here's an example:
import sqlite3
conn = sqlite3.connect('mydatabase.db')
c = conn.cursor()
# do some work...
conn.close()
In this example, we're using the close()
method to close the connection to the database. It's important to always close the connection when you're done working with the database.