What is Transitive Dependency in a Database

Avoid Transitive Dependencies to Help Ensure Normalization

A transitive dependency in a database is an indirect relationship between values in the same table that causes a functional dependency. To achieve the normalization standard of Third Normal Form (3NF), you must eliminate any transitive dependency.

By its nature, a transitive dependency requires three or more attributes (or database columns) that have a functional dependency between them, meaning that Column A in a table relies on Column B through an intermediate Column C.

Let's see how this might work.

Transitive Dependency Example

Authors
Author_IDAuthorBookAuthor_Nationality
Auth_001Orson Scott CardEnder's Game

United States

Auth_001Orson Scott CardChildren of the MindUnited States
Auth_002Margaret Atwood

The Handmaid's Tale

Canada

 

 

 

 

 

Book -> Author: Here, the Book attribute determines the Author attribute. If you know the book name, you can learn the author's name. However, Author does not determine Book, because an author can write multiple books. For instance, just because we know the author's name Orson Scott Card, we still don't know the book name.

Author -> Author_Nationality: Likewise, the Author attribute determines the Author_Nationality, but not the other way around: just because we know the nationality does not mean we can determine the author.

But this table introduces a transitive dependency:

Book -> Author_Nationality: If we know the book name, we can determine the nationality via the Author column.

Avoiding Transitive Dependencies

To ensure Third Normal Form, let's remove the transitive dependency.

We can start by removing the Book column from the Authors table and creating a separate Books table:

Books
Book_IDBookAuthor_ID
Book_001Ender's Game

Auth_001

Book_001Children of the MindAuth_001
Book_002

The Handmaid's Tale

Auth_002

 

 

 

 

Authors

Author_IDAuthorAuthor_NationalityAuth_001Orson Scott Card

United States

Auth_002Margaret AtwoodCanada

 

 

 

 


 

Did this fix it? Let's examine our dependencies now:

Books table:

  • Book_ID -> Book, i.e. the Book depends on the Book_ID.
  • No other dependencies in this table exist, so we are okay. Note that the foreign key Author_ID links this table to the Authors table through its primary key Author_ID — we have created a relationship to avoid a transitive dependency, a key design of relational databases.

Authors table:

  • Author_ID -> Author, i.e. the Author depends on the Author_ID.
  • Author -> Author_Nationality, or the nationality can be determined by the author.
  • Author_ID -> Author_Nationality, or the nationality can be determined from the Author_ID through the Author attribute.  Whoops: we still have a transitive dependency. 

We need to add a third table to normalize this data:

Countries

Country_IDCountryCoun_001United StatesCoun_002Canada

 

 

 

 

And then let's add a foreign key to the Authors table and remove the Author_Nationality column:

Authors

Author_IDAuthorCountry_IDAuth_001Orson Scott Card

Coun_001

Auth_002Margaret AtwoodCoun_002

 

 

 

 

Now we have three tables, making use of foreign keys to link between the tables:

  • The Books table's foreign key Author_ID links a book to an author in the Authors table.
  • The Authors table's foreign key Country_ID links an author to a country in the Countries table.
  • The Countries table has no foreign key because it has no need to link to another table in this design.

Why Transitive Dependencies are Bad Database Design

What is the value of avoiding transitive dependencies to help ensure 3NF?  Let's consider our first table again and see the issues it creates:

Authors
Author_IDAuthorBookAuthor_Nationality
Auth_001Orson Scott CardEnder's Game

United States

Auth_001Orson Scott CardChildren of the MindUnited States
Auth_002Margaret Atwood

The Handmaid's Tale

Canada

 

 

 

 

 

This kind of design can contribute to data anomalies and inconsistencies, for example:

  • If you deleted the two books Children of the Mind and Ender's Game, you would delete the author Orson Scott Card and his nationality completely from the database.
  • You cannot add a new author to the database unless you also add a book; what if the author is yet unpublished or you don't know the name of a book she has authored?
  • If Scott Card changed his citizenship, you would have to change it in all records in which he appears. Having multiple records with the same author can result in inaccurate data: what if the data entry person doesn't realize there are multiple records for him and changes the data in only one record?
  • You cannot delete a book — for instance, the Handmaid's Tale — without also deleting the author completely.

These are just some reasons why normalization, and avoiding transitive dependencies, protect data and ensure consistency.