Database Normalization

Database normalisation.

Yes. I know. You use a normal database already. However, you may be wasting time & space by storing information multiple times. To store data more efficiently in your (mysql) database you should weed out duplicate entries of the same information, and place it in a separate table, for only that kind of information.

I know; this is all very abstract, and I will get to an example soon. Useful to know is that there are 6 levels of normalisation. Number one where only some tables are linked, through to 6 where even same type of information (e.g., phone numbers) are in separate tables if they types have a different master type (e.g., personal phone numbers vs. company phone numbers). I will explain this by means of an example, and work this example up to a properly normalized database. I am assuming you have basic knowledge of PhP and have a mysql database running.

Example: A market place website

Imagine you are building a website where users can sell items. As you want to have a different layout for each new type of item, you decide to create a database with a table for each type of item that people can sell. For instance for selling plants you may have a table like this:

<PLANTS_FOR_SALE>
- Pl_id
- Pl_Name
- Pl_Price
- Pl_Location
- Pl_Image
- Pl_Owner
- Pl_Owner_name
- Pl_Owner_email
- Pl_Owner_pass

This might work well for a hand full of item types (e.g., plants, pots, soil and tools). However, when the website takes off, and you handle hundreds of requests per hour, you decide to expand into a full market place. Then the trouble starts to show: Do you create a new table for each item for sale? And what if a user decides to change their email or password? Do you want to go to each table and update the information? Of course not! So you need to normalize the database. What you do is: You decide on what information is specific for the item you are selling. In this case that would be:

<Plants>
- Pl_id
- Pl_Name
- Pl_Price
- Pl_Location
- Pl_Image

The other information is information which would be valid for more than one item. There is information on the user. So we create a separate table ‘Users’:

<USERS>
- U_id (user identifier)
- U_name (users’ name)
- U_username (username)
- U-email (users’ email)
- U_pass (users’ password)
- U_status (e.g., banned, waiting for activation, moderator etc)

Hang on! I told you I would make it easier. But now we still have a table for each item type we are selling AND we have a table for the users. And how do I now know which item belongs to which user? So.. We are not finished yet. What we need is create a separate table for item types, and add fields for the user & item type to the items table. So you would end up with something like this:

Table USERS
- U_id (user identifier)
- U_name (users’ name)
- U_username (username)
- U-email (users’ email)
- U_pass (users’ password)
- U_status (e.g., banned, waiting for activation, moderator etc)

Table SALES
- S_id (Items for sale identifier)
- S_I_nr (ITEM_TYPES.I_id)
- S_U_nr (USERS.U_id)
- S_P_nr (PICTURES.P_id)
- S_title (Title of product)
- S_price (Price of product)
- S_location (Location of product)

Table ITEM_TYPES
- I_id
- I_Label

Table PICTURES
- P_id
- P_url

Create the tables with this MYSQL code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
CREATE TABLE `users` (
`U_id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`U_name` VARCHAR( 50 ) NOT NULL ,
`U_username` VARCHAR( 50 ) NOT NULL ,
`U_email` VARCHAR( 50 ) NOT NULL ,
`U_pass` VARCHAR( 12 ) NOT NULL ,
`U_status` TINYINT NOT NULL
) ENGINE = MYISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci;
 
CREATE TABLE `item_types` (
`I_id` SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`I_label` VARCHAR( 50 ) NOT NULL
) ENGINE = MYISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci;
 
CREATE TABLE `sales` (
`S_id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`S_I_nr` SMALLINT NOT NULL ,
`S_U_nr` INT UNSIGNED NOT NULL ,
`S_p_nr` INT UNSIGNED NOT NULL ,
`S_title` VARCHAR( 100 ) NOT NULL ,
`S_price` DECIMAL NOT NULL ,
`S_location` VARCHAR( 50 ) NOT NULL
) ENGINE = MYISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci;
 
CREATE TABLE `pictures` (
`P_id` SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`P_url` VARCHAR( 50 ) NOT NULL
) ENGINE = MYISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci;
CREATE TABLE `users` (
`U_id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`U_name` VARCHAR( 50 ) NOT NULL ,
`U_username` VARCHAR( 50 ) NOT NULL ,
`U_email` VARCHAR( 50 ) NOT NULL ,
`U_pass` VARCHAR( 12 ) NOT NULL ,
`U_status` TINYINT NOT NULL
) ENGINE = MYISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci;

CREATE TABLE `item_types` (
`I_id` SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`I_label` VARCHAR( 50 ) NOT NULL
) ENGINE = MYISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci;

CREATE TABLE `sales` (
`S_id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`S_I_nr` SMALLINT NOT NULL ,
`S_U_nr` INT UNSIGNED NOT NULL ,
`S_p_nr` INT UNSIGNED NOT NULL ,
`S_title` VARCHAR( 100 ) NOT NULL ,
`S_price` DECIMAL NOT NULL ,
`S_location` VARCHAR( 50 ) NOT NULL
) ENGINE = MYISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci;

CREATE TABLE `pictures` (
`P_id` SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`P_url` VARCHAR( 50 ) NOT NULL
) ENGINE = MYISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci;

So.. now a user can register with your website. You store the information in table users. When the data is inserted into the table, you send the user an email, asking him to confirm the account. After the user has confirmed you switch status from 1 (not confirmed) to 2 (confirmed) and the user is allowed to sell items. Which he does. So the user loads the form for new items for sale. You create a dropdown element from the table item_types:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
 
<?php
 // initiate the dropdown
 $drop = "<select name=\"itemtype\">";<br /><br />// define the query
 
$q = "select * from item_types";
 
// Run the query
$res = mysql_query($q) or die('An error occurred. Please try back later');
 
// Create the dropdown
 
if($res) {
  while($row = mysql_fetch_array($res))
    {
    $drop .= "<option value=\"".$row['I_id']."\">".$row['I_label']."</option>";
    }
  }
 
 
// Place the drop somewhere
echo $drop;
?>
 
<?php
 // initiate the dropdown
 $drop = "<select name=\"itemtype\">";<br /><br />// define the query

$q = "select * from item_types";

// Run the query
$res = mysql_query($q) or die('An error occurred. Please try back later');

// Create the dropdown

if($res) {
  while($row = mysql_fetch_array($res))
    {
    $drop .= "<option value=\"".$row['I_id']."\">".$row['I_label']."</option>";
    }
  }


// Place the drop somewhere
echo $drop;
?>

After selecting the item type and filling out the form, the users posts’ the information, which you can now store in the items table. If the user has decided to also upload a picture, you will need to handle the upload. That, however, is outside the scope of the current tutorial.

Retrieving the information using joins.

OK, so the user has uploaded all the information. You have stored it. Now a new user would like to see the item. You need to retrieve the username selling the item, and the full items details. You could of course perform a series of queries. However that is not very efficient. More efficient would be to run one query, and join the tables for this query. This is how to do this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
<?php
 // The user has clicked a link with an item id (showitem.php?iid=1). 
 
 // Get the id, and encode the variable to avoid sql injections
 $request_item = mysql_real_escape_string($_GET['iid']);
 
 // Formulate the query
 
 $q = "
   select * from USERS 
     left join SALES      
       on U_id = S_U_nr 
     left join ITEM_TYPES 
       on S_I_nr = I_id 
     left join PICTURES
       on S_p_nr = P_id
   where I_id = ".$request_item."";
 ?>
<?php
 // The user has clicked a link with an item id (showitem.php?iid=1). 
 
 // Get the id, and encode the variable to avoid sql injections
 $request_item = mysql_real_escape_string($_GET['iid']);
 
 // Formulate the query
 
 $q = "
   select * from USERS 
     left join SALES      
       on U_id = S_U_nr 
     left join ITEM_TYPES 
       on S_I_nr = I_id 
     left join PICTURES
       on S_p_nr = P_id
   where I_id = ".$request_item."";
 ?>

This last query will retrieve all information from all tables for the selected item. Naturally, when you do not need ll information, it is more efficient to name only those columns that you want to show. Note furthermore that when you do not give all fields their own unique name across all tables the database engine might get confuse. It will tell you that the query cannot be performed. In those cases you should refer to the individual fields with their table prefix. So for picture identifier that would be: PICTURES.P_id.

Leave a Reply