MySQL Category Tree
tutorialdatabasemysqlphparchivedDatabase and Table structure
Looking at different tutorials and open source projects on the net there's a number of different ways in which category trees are managed. Some I like, some... not so much. In this article I'll cover the method I'm currently using. Firstly, the Pros and Cons:
Pros
- Fast, simple queries to build the path of a category
- A few simple statements to add categories or products
- Fast and easy to build complete category trees
Cons
- Care needed when moving or removing categories
- Possibly more, which I'll add when I think of them...
The Tables
We'll be using InnoDB for the tables (this should work with other engines, but check the INDEXs). Firstly, we'll start with the categories table, which stores the information for each category.
id | id_parent | node_depth | node_path | title |
7 | 6 | 3 | 0,1,6,7 | Snakes |
CREATE TABLE `categories` (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
id_parent INT UNSIGNED DEFAULT 0 NOT NULL,
node_depth TINYINT UNSIGNED DEFAULT 1 NOT NULL,
node_path VARCHAR(255) DEFAULT '0' NOT NULL,
title VARCHAR(32) DEFAULT '' NOT NULL,
INDEX `idx_parent`(id_parent)
) ENGINE='InnoDB';
This table describes the categories; if you want to add icon image URLs, descriptions, etc, you'd do so here. Lets look at the table columns:
- id - An automatically generated unique ID for each entry
- id_parent - The ID of the parent category (or 0 if this is a top-level category)
- node_depth - The distance from the root node (0) of this category
- node_path - The IDs (including 0, root) of each ancestor category in ascending order. Order by this column to sort by ancestor path
- title - The title text for this category
- INDEX idx_parent - An index to speed up lookups for categories that belong to a specific parent
id_category | id_parent |
7 | 6 |
7 | 1 |
7 | 0 |
CREATE TABLE `category_parent` (
id_category INT UNSIGNED DEFAULT 0 NOT NULL,
id_parent INT UNSIGNED DEFAULT 0 NOT NULL,
INDEX `idx_category`(id_category)
) ENGINE='InnoDB';
This table connects each category by its id, id_category, to the IDs of its ancestors (id_parent). There is one entry for each ancestor category, including root (0).
id | title | id_parent |
5 | Duke | 5 |
CREATE TABLE `products` (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(32) DEFAULT 'Example' NOT NULL,
id_parent INT UNSIGNED DEFAULT 0 NOT NULL,
INDEX `idx_parent`(id_parent)
) ENGINE='InnoDB';
We're using this table as an example of how we can attach items to our categories. The products table has two columns we're particularly interested in and which are required; id, the unique ID for each row, and id_parent, the ID of the parent category.
id_product | id_parent |
5 | 5 |
5 | 2 |
5 | 1 |
5 | 0 |
CREATE TABLE `product_parent` (
id_product INT UNSIGNED DEFAULT 0 NOT NULL,
id_parent INT UNSIGNED DEFAULT 0 NOT NULL,
INDEX `idx_product`(id_product)
) ENGINE='InnoDB';
The product_parent table connects a product by its ID, id_product, to its parent category and each ancestor of the parent by their ID, id_parent.
We've also added some indexes to each table to speed up queries. Next, we'll look at some common queries for modifying the tables.
Inserting and Selecting data
Inserting new categories is pretty simple, and is achieved in 4 queries. Note the use of {{title}} and {{id_parent}} as placeholders for the new category title and the ID of the parent category (which can be 0, root).
INSERT INTO `categories` (title, id_parent, node_path, node_depth)
(SELECT '{{title}}', {{id_parent}}, p.node_path, p.node_depth+1
FROM `categories` AS p WHERE p.id={{id_parent}}) UNION
(SELECT '{{title}}', {{id_parent}}, '0', 1) LIMIT 1;
UPDATE `categories` SET node_path=CONCAT(node_path, ',', id) WHERE id=LAST_INSERT_ID();
INSERT INTO `category_parent` (id_category, id_parent) VALUES
(LAST_INSERT_ID(), {{id_parent}});
INSERT INTO `category_parent` (id_category, id_parent)
SELECT LAST_INSERT_ID(), cp.id_parent FROM `category_parent` AS cp
WHERE cp.id_category={{id_parent}};
These statements automatically create the node_path, calculate the node_depth, and create all of the category_parent attachments with very little required information!
Adding products
Adding products is even simpler, and requires only 3 queries. Obviously, in this example our products only have a title, but in your code would need any additional columns populated in the first query:
INSERT INTO `products` (id_parent, title) VALUES ({{id_parent}}, '{{title}}');
INSERT INTO `product_parent` (id_product, id_parent) VALUES (LAST_INSERT_ID(), {{id_parent}});
INSERT INTO `product_parent` (id_product, id_parent)
SELECT LAST_INSERT_ID(), cp.id_parent FROM `category_parent` AS cp
WHERE cp.id_category={{id_parent}};
Really, that's all that's needed for adding data.
Selecting information
Fetching a category from the categories table with its path (ancestor category titles) can now be done very simply. The code in bold shows how the path of parent categories is fetched (except {{id}}, which is the unique ID of the category we're looking up):
SELECT c.*, (SELECT
GROUP_CONCAT(c2.title ORDER BY c2.node_depth ASC
SEPARATOR '/') FROM `categories` AS c2 JOIN `category_parent` AS c2p
ON c2.id=c2p.id_parent WHERE c2p.id_category=c.id) AS path FROM
`categories` AS c WHERE c.id={{id}};
Furthermore, if we want to fetch the parent path as hyperlinks ready to be put straight into the page, we can do so. In this example, we'll wrap each ancestor category title in an anchor element that points to our imaginary page "category.php" which takes a GET name/value ?id={{category_id}}:
SELECT c.*, (SELECT
GROUP_CONCAT(CONCAT('<a href="category.php?id=', c2.id, '">', c2.title, '</a>') ORDER BY c2.node_depth ASC
SEPARATOR '/')
FROM `categories` AS c2 JOIN `category_parent` AS c2p
ON c2.id=c2p.id_parent WHERE c2p.id_category=c.id) AS path FROM
`categories` AS c WHERE c.id={{id}};
If we want to select the ancestor categories of a specific category in more detail, we can also do so:
SELECT c2.id, c2.title FROM `categories` AS c1
LEFT JOIN `category_parent` AS cp JOIN `categories` AS c2
ON cp.id_parent=c2.id AND cp.id_category=c1
WHERE c1.id={{id}}
ORDER BY c2.node_depth ASC;
Fetching the Category Tree
If we want to fetch the whole category tree, we can do so easily:
SELECT * FROM `categories` ORDER BY node_path ASC;
This example, in PHP, fetches and indents depending on node depth all categories in nested unordered lists:
<?php
$sql = new mysqli("localhost", "username", "password", "database");
$res = $sql->query("SELECT * FROM `categories` ORDER BY node_path ASC");
$lastID = 0;
$nodeDepth = 1;
echo "<ul>\n";
while($row = $res->fetch_assoc())
{
if($row['id_parent'] != $lastID)
{
echo "</li>\n";
}
if($row['node_depth'] > $nodeDepth)
{
echo /*"\n".str_repeat("\t", $nodeDepth).*/"<ul>\n";
$nodeDepth++;
}
else if($row['node_depth'] &
lt; $nodeDepth)
{
do {
echo "\n".str_repeat("\t", $nodeDepth-1)."</ul></li>\n";
$nodeDepth--;
} while($row['node_depth'] < $nodeDepth);
}
echo str_repeat("\t", $nodeDepth)."<li>{$row['title']}";
$lastID = $row['id'];
}
if($lastID != 0) { echo "</li>\n"; }
do {
echo str_repeat("\t", $nodeDepth-1)."</ul>";
if($nodeDepth > 1) { echo "</li>"; }
echo "\n";
$nodeDepth--;
} while($nodeDepth >= 1);
$res->free();
Example data fetched like this:
<ul>
<li>Animals<ul>
<li>Mammals<ul>
<li>Dogs</li>
<li>Cats</li>
<li>Horses</li>
</ul></li>
<li>Reptiles<ul>
<li>Snakes</li>
<li>Lizards</li>
</ul></li>
</ul></li>
<li>Canned</li>
<li>Materials<ul>
<li>Feed<ul>
<li>Fruit<ul>
<li>Fresh</li>
<li>Dried</li>
<li>Canned</li>
</ul></li>
</ul></li>
</ul></li>
</ul>
Inserting example data
If you want to try this out with some sample data, here's a PHP script to import the data from some formatted text. I won't explain in great detail, but the code and comments should be reasonably self explanatory.
<?php
// Connect to MySQL. Replace the username, password, and database
// (and possibly localhost) with your information
$sql = new mysqli("localhost", "username", "password", "exampledb");
// Our example categories; formatting should be self explanatory
$categories = <<<ENDLIST
Animals
Animals/Mammals
Animals/Mammals/Dogs
Animals/Mammals/Cats
Animals/Mammals/Horses
Animals/Reptiles
Animals/Reptiles/Snakes
Animals/Reptiles/Lizards
Materials
Materials/Feed
Materials/Feed/Fruit
Materials/Feed/Fruit/Fresh
Materials/Feed/Fruit/Dried
ENDLIST;
// Example products; {{product title}};{{category path}}
$products = <<<ENDLIST
Fluffy;Animals/Mammals/Cats
Mittens;Animals/Mammals/Cats
Spot;Animals/Mammals/Dogs
Rover;Animals/Mammals/Dogs
Duke;Animals/Mammals/Horses
MrSlithers;Animals/Reptiles/Snakes
Snappy;Animals/Reptiles/Lizards
Strange Iguana;Animals/Reptiles/Lizards
Apples;Materials/Feed/Fruit/Fresh
Bananas;Materials/Feed/Fruit/Fresh
Prunes;Materials/Feed/Fruit/Dried
Raisins;Materials/Feed/Fruit/Dried
Mixed feed;Materials/Feed
Flavourings and Ash;Materials/Feed
ENDLIST;
// Clear the tables of any extant data
$sql->query("TRUNCATE TABLE `categories`");
$sql->query("TRUNCATE TABLE `category_parent`");
$sql->query("TRUNCATE TABLE `products`");
$sql->query("TRUNCATE TABLE `product_parent`");
$id = 0;
// For storing our parsed category tree and associating category paths
// to inserted IDs for quick lookup when adding products
$categoryPaths = array();
$categoryData = array(
'id' => 0,
'node_path' => array(0=>0),
'node_depth' => 0,
'id_parent' => 0
);
$data = explode("\n", $categories);
// Transaction to improve speed
$sql->begin_transaction();
foreach($data as $v)
{
// Skip empty lines
$v = trim($v);
if(!strlen($v)) { continue; }
// Reference to current active node
$node = &$categoryData;
$path = explode("/", $v);
$fullPath = array();
// Looping through parts of each line
foreach($path as $p)
{
if(!strlen($p)) { continue; }
array_push($fullPath, $p);
// If this category does not yet exist...
if(!array_key_exists($p, $node))
{
// Build node data that we currently can;
// $p is this part of the path (category title)
$node[$p] = array(
'node_depth' => $node['node_depth']+1,
'node_path' => $node['node_path'],
'id' => 0,
'id_parent' => $node['id']
);
// Insert the category & get the ID
$sql->query("INSERT INTO `categories` ".
"(id_parent, node_path, node_depth, title) VALUES ".
"({$node['id']}, '".implode(",", $node['node_path']).
"', ".($node[$p]['node_depth']).", '{$p}')");
$id = $sql->insert_id;
// Attach the ID to the end of the node_path
$sql->query("UPDATE `categories` SET node_path=CONCAT(node_path, ',', id) WHERE id={$id}");
// Attach this category to all its ancestors
$sql->query("INSERT INTO `category_parent` ".
"(id_category, id_parent) VALUES ".
"({$id}, {$node['id']})");
$sql->query("INSERT INTO `category_parent` ".
"(id_category, id_parent) ".
"SELECT {$id}, id_parent FROM `category_parent` ".
"WHERE id_category={$node['id']}");
// Update the $node data and add the path to ID
// reference
$node[$p]['id'] = $id;
array_push($node[$p]['node_path'], $id);
$categoryPaths[implode("/", $fullPath)] = $id;
}
// Update the $node reference to this node
$node = &$node[$p];
}
}
$sql->commit();
// We just dump here for debugging - you don't have to.
var_dump($categoryData);
// Insert products
$data = explode("\n", $products);
$sql->begin_transaction();
foreach($data as $v)
{
// Skip empty lines
$v = trim($v);
if(!strlen($v)) { continue; }
// Parse product data. If the category path doesn't exist, skip
$prod = explode(";", $v);
if(count($prod)!=2 || !array_key_exists($prod[1], $categoryPaths))
{
continue;
}
// Insert new product and get ID
$sql->query("INSERT INTO `products` (title, id_parent) VALUES ".
"('{$prod[0]}', ".$categoryPaths[$prod[1]].")");
$id = $sql->insert_id;
// Attach product to its parent & all parent ancestor categories
$sql->query("INSERT INTO `product_parent` (id_product, id_parent) ".
"VALUES ({$id}, ".$categoryPaths[$prod[1]].")");
$sql->query("INSERT INTO `product_parent` ".
"(id_product, id_parent) ".
"SELECT {$id}, cp.id_parent FROM `category_parent` AS cp ".
"WHERE cp.id_category=".$categoryPaths[$prod[1]]);
}
$sql->commit();
Adding to sample data
If you installed the sample data from the previous page, let's look at adding some data.
We'll insert a dog named "Bitey" to the "Dogs" category (we're assuming the ID of the Dogs category is 3):
INSERT INTO `products` (id_parent, title) VALUES (3, 'Bitey');
INSERT INTO `product_parent` (id_product, id_parent) VALUES (LAST_INSERT_ID(), 3);
INSERT INTO `product_parent` (id_product, id_parent)
SELECT LAST_INSERT_ID(), cp.id_parent FROM `category_parent` AS cp
WHERE cp.id_category=3;
We'll also add a sub category of the Food, Fruit; "Canned". This assumes the Fruit category has the ID 11:
INSERT INTO `categories` (title, id_parent, node_path, node_depth)
(SELECT 'Canned', 11, p.node_path, p.node_depth+1
FROM `categories` AS p WHERE p.id=11) UNION
(SELECT 'Canned', 11, '0', 1) LIMIT 1;
UPDATE `categories` SET node_path=CONCAT(node_path, ',', id) WHERE id=LAST_INSERT_ID();
INSERT INTO `category_parent` (id_category, id_parent) VALUES
(LAST_INSERT_ID(), 11);
INSERT INTO `category_parent` (id_category, id_parent)
SELECT LAST_INSERT_ID(), cp.id_parent FROM `category_parent` AS cp
WHERE cp.id_category=11;
Finally, we'll add a new top-level (ID 0) category, "Dressage":
INSERT INTO `categories` (title, id_parent, node_path, node_depth)
(SELECT 'Dressage', 0, p.node_path, p.node_depth+1
FROM `categories` AS p WHERE p.id=0) UNION
(SELECT 'Dressage', 0, '0', 1) LIMIT 1;
UPDATE `categories` SET node_path=CONCAT(node_path, ',', id) WHERE id=LAST_INSERT_ID();
INSERT INTO `category_parent` (id_category, id_parent) VALUES
(LAST_INSERT_ID(), 0);
INSERT INTO `category_parent` (id_category, id_parent)
SELECT LAST_INSERT_ID(), cp.id_parent FROM `category_parent` AS cp
WHERE cp.id_category=0;