The PHP Project Guide (2014)

9. Databases

Your database will almost always be the fundamental part of your application. Careful planning of the structure, data types, relation between tables and table engine used is extremely important. It’s often a great way to also start thinking about the overall problem of planning your application. Visualise your database tables and data, and you should be able to gauge how your code should start to work, and how you may interact with your data.

9.1 phpMyAdmin

This is an extremely popular tool for easily creating and managing databases, tables and data. phpMyAdmin allows you to interact with your database with a GUI (Graphical User Interface) and do almost everything you’d be able to do within a console. If you haven’t already guessed from the name, it’s built on PHP, so go and have a look at the source code. The great thing about phpMyAdmin is it’s free, open source and very powerful. phpMyAdmin is also easy to use in terms of setting it up on your server. As long as you know your MySQL server’s default username and password you can download, extract and point your browser to its location. From here you can log in and start managing your database, permissions and take a look at other information regarding your MySQL server.

Please be aware that by doing this on a publically available environment you’re giving another point of entry to your database. This means you should be extremely careful about where you place it. It might make sense to only use it during development and remove it temporarily when your website is live, and only use it when you need it. Of course, this isn’t always possible so you may need to keep it permanently. That’s fine, but:

1. Choose an extremely strong password. The longer, more complex and varied it is, the better.

2. Don’t keep it in an obvious location like /phpmyadmin.

3. Keep phpMyAdmin up to date whenever a new release is available. Unless you’ve changed your configuration file, you can just copy over the new downloaded version and get going again. This will help patch up any security vulnerabilities.

phpMyAdmin will be available if you’re using a server with cPanel or Parallels installed, which will be protected behind this environment, so may offer a better level of security and ease of access.

9.2 Planning your database

Regardless of whether you prefer to start working on paper or start creating your database tables directly, you should start planning out the tables and fields you need straight away. This will start to get you thinking about what you can eliminate and what you may need to add along the way. For example, do you need a field for a user’s first name and last name, or could you have one field for their full name and use PHP to split the name on the server side? Of course, you could also split this within your query, but in terms of performance this wouldn’t be ideal. Some decisions can be rectified easily later on by running queries to chop and change data, however other decisions may not be able to be rectified so easily. On the other hand, don’t get too tied down with perfecting your database tables, you’re more than likely going to be able to change them later. However, do give future thought to your structure when planning.

Data types are equally important, and there are so many to choose from, which can sometimes be overwhelming. For the basics, something like a unique user ID could be stored as an int. Likewise, if you were only storing a number 0 through to 9, you would choose a tinyint (a one bit integer). These small decisions will ultimately lead to a faster, better designed database that has been appropriately designed around the data it holds. Remember, data types can easily be changed. If you’re unsure, it’s best to choose what you think would be suitable and then research what you’ve chosen later if you’re worried about performance or your overall database design.

Including an index within your table will ultimately lead to faster performance, as long as other considerations have been taken into account to address speed. Keeping a field as an index, for example, an auto incrementing user ID will mean that querying this table will be faster, as a unique pointer will have been set up. This is common practice when creating a table, as you’ll need a unique identifier for each user anyway. However, don’t think you’re just tied down to integers for unique, indexed columns, you can also choose other fields to index. We’ll discuss this in a bit more detail when we look at speed.

On the more advanced side of database table setup, you should choose an appropriate storage engine for the tables you create. These have a massive impact later on down the line depending on how you’re interacting with your database. For example, if you ended up with hundreds of thousands of records within a table and needed to search the table, you may have chosen an inappropriate storage engine for searching the table efficiently, e.g. you may have chosen InnoDB instead of MyISAM which allows for full-text searches.

Lastly, it’s a good idea to populate your database with some dummy data. On the other hand, you may find it easier to build the functionality to insert data first, and this also largely depends on the kind of website you’re building, and whether this data will ever need to be output to a user. Personally, I’ve always found it easier to insert the dummy data, output this within the application and then build the functionality to insert. This gives the joy and satisfaction of almost seeing your application work. Again, if your application isn’t input/output driven, you may not find the same satisfaction. It’s up to you which way suits you personally and what makes sense.

9.3 PDO

PDO (PHP Data Objects) is a flexible way to communicate with your database. It provides a way to connect using database drivers (one being for MySQL) and is essentially a unified way of database connection and interaction. This means that you can switch between databases and your website will still be able to communicate with your database providing the table names remain the same. PDO also offers powerful functionality in OOP, which makes it great to build into your code. It also offers transactional support which means that operations can be rolled back providing your database supports this, making operations within your database more efficient.

9.4 Choosing how to communicate with your database

I’ve assumed that you’re using a MySQL database, as it’s an extremely popular database solution. Best of all, it’s free and very powerful. PHP has much built in support for MySQL and makes it extremely easy to connect, query and retrieve results in a variety of ways. You may have seen or used the MySQL extension which includes functions like mysql_connect to connect to your MySQL server, mysql_query to send a query to the server and mysql_fetch_assoc to retrieve a result set in an associative array. These groups of functions are called extensions simply because they’re not part of the PHP core set of functions and are instead part of the PHP extension framework. The first thing to mention is that PHP advise against the use of this extension and instead recommend the newer MySQLi extension. This provides exactly the same functionality as you’ll have previously seen albeit improved a great deal, hence the ‘i’, which stands for ‘improved’. The MySQLi extension set is available both procedurally and object oriented, so they can be used in either scenario. Let’s take a look at the difference between these two when querying:

1 // procedural

2 $db = mysqli_connect(server, username, password, database);

3 $result = mysqli_query(query);

4 $assoc = mysqli_fetch_assoc($result);

6 // object oriented

7 $db = new mysqli(server, username, password, database);

8 $result = $db->query(query);

9 $object = $result->fetch_object();

The procedural example shows connecting, querying and returning an associative array of our results (which would, in practice, need to be done in a loop if there were more than one row returned) exactly the same way we would with the mysql extension. This is fine, because you’re doing exactly the same thing, but with the improved extension.

In the second example, we instantiate a new MySQLi object which gives us access to some methods and properties defined within this class. We use the query method to query the database with an SQL string and then we return the result set as an object. This is interesting, because with the previous example we return the results as an associative array, which you may be used to. So, why do we need to store the return value of the query method within the $result variable? The answer is that it returns the mysqli_result object, which then contains the fetch_object method for us to obtain a result set. To examine the returned object, you can make use of the get_class_methods function or get_class_vars similarly for properties. For example, we know that the $result variable contains the returned mysqli_result object, so we print_r on the get_class_methods function, which returns an array of methods from whatever class you pass to it.

1 print_r(get_class_methods($result));

These are extremely simply, meaningless and useless examples, however they demonstrate the effective nature of using MySQLi with OOP. It allows you to be a lot more expressive with the way you work with your database and it’s returned data after queries. If you’re working procedurally, it makes sense to use MySQLi procedurally, and of course the same for OOP.

9.5 Interacting with your database

Once you’ve set up your database, tables and fields and have populated your tables with some dummy data, the next step is to start thinking about how you’ll interact with this data. This vastly changes depending on what type of website you’re building, but it’s a good idea to start writing code to do what you need. Let’s say you’re building a forum - it’s obvious that you need to read and write posts, so start with building up the functionality that will do this. You’ll need to think about how to output certain areas of the site, like the forum categories, topics within categories and posts within each topic. This may seems daunting at first, but actually it’s very easy. Let’s look at this as a more step by step process:

1. To list your categories, you’ll need to output them within a list. You’d query the database for all categories, output them wrapped in anchor (a) tags and have the link something like category.php?id=x, where x is the category ID from the database (this would more than likely be your primary key, set up as an auto incrementing integer). Bear in mind that SEO isn’t used here, so we’re using the raw format of the file and GET variable we’re passing.

2. When the user clicks this link, they’ll be directed to category.php, where the ID will need to be read. There are important considerations here, like security, as this data can easily be changed by the user. You may also consider categories that don’t exist, but this could be incorporated into the main query that pulls topics from this category. Now, to pull in topics, you’d have a separate topics table. Each topic would have the category ID field set to the category it belongs to, and therefore you could query this table based on the category ID that’s been defined within the URL. Now it’s just a case of outputting the topics as you have the categories, and wrap them in anchor tags again, linking to a page - perhaps topic.php?id=x. When you think of linking everything up like this it becomes much easier to think about how to transfer your user through to different pages, using GET data.

3. From here, it would just be a case of listing posts from the posts table that relate to the particular topic (again, we’ve specified this in the URL). This is the basis of most applications. It means you can build one template and use it for everything.

We’ll be touching on security later, but as a quick note, and as we’ve already briefly discussed, it’s extremely important that you take security into consideration. Absolutely anywhere that a user can modify data being sent to your server needs to be considered as it can be manipulated in a way that allows an attacker to take advantage of this. It also helps to include commonplace checks, like whether a particular topic exists, in case a normal user either purposely or accidentally changes the URL, and specifies something incorrectly, or a topic is deleted and a message needs to be presented to the user.

Of course, there is a lot more to creating a fully functional forum, but we’ll be discussing various other example applications along the way so we can start thinking like a programmer and address various other problems.

The main thing to think about is that the fundamental part of many dynamic websites is just grabbing data, showing it to the user and allowing an administrator or user to post data for later retrieval. It’s the cycle that once mastered properly, will make building dynamic websites a lot easier.

9.6 A Guestbook example

Although guestbooks aren’t really used anymore, they represent the fundamental functionality for the majority of dynamic websites.

So, let’s take a look at the thought process that would go into building a simple guestbook, where users could visit a page on our website to leave a comment. This functionality could be easily extended to blog posts, forum topics and much more. The key here is learning how to think about setting up tables so they can handle what you need to do. Once you’ve grasped this, you’re free to explore greater things.

Let’s start with thinking about what fields we need for storing information. A user will enter their name, email address (optional) and a message. But, that’s not all we need to store. We want to know what time they made the post and we also need to keep a unique field for indexing purposes to help speed up retrieval of data, and identify each post uniquely. So, the fields we need are:

· post_id

· name

· email

· message

· timestamp

But what about data types, lengths and default values and required values? Our post_id will obviously be an integer, our name, email and message a string and we’ll be storing the timestamp as an integer. So:

· post_id (int)

· name (varchar)

· email (varchar)

· message (text)

· timestamp (int)

Notice we have varchar and text. Varchar will require a fixed length whereas text doesn’t, although it’s advisable to give one. The int fields also don’t require a fixed length, although for the timestamp we could give one and assuming you’re not going to receive millions of posts, we can also limit post_id. Below is the amended fields with a brief explanation as to why we’ve given these the lengths we have.

· post_id (int 5) - This limit would still allow for 9999 posts. More than enough for now, but can be changed in future if this limit is approached.

· name (varchar 25) - This can vary dramatically, but 25 is more than enough characters to store the average first and last name. We’ve included a little more in case a large name or middle names(s) are included.

· email (varchar 85) - An email address is unlikely to be more than this, but emails can be up to 1024 characters.

· message (text 1000) - On the front end, we’ll be limiting characters count to 1000 characters, we’ll also be checking this within PHP. However, it’s good practice to also define this within your table.

· timestamp (int 10) - Currently, and for years and years to come, a timestamp will be 10 digits.

So, we’ve got the basis of our table and have everything fairly precise. But what about our field contents? We want them all to be mandatory except email, so we should define this. You’d need to include the NOT NULL property when defining these fields, and leave the email field as normal, so this isn’t defined as required. This isn’t required, but is advised.

All that’s left now is to write backend code to start interaction. Remember that it’s important to include validation and security when querying a database where user data is submitted or properties are defined when retrieving data (such as how many posts to display per page, etc.).

Why have we bothered to define all this when we can just lump it all into our table and not declare specific types, sizes and whether a field will be null? Well, of course you could store a username in a field with a text data type, but you’ll be ruining the design of your tables. You could also choose 30 characters for a varchar field for a username, but why would you when you might be validating within PHP to only accept 20 character maximum? Your database ties into your website and most likely cannot function without it. If you’re storing information and querying for data, design your database carefully and the transaction will be more harmonious and save you time in the long run.