Choosing a Primary Key

A business man working at office
Kohei Hara/Digital Vision/Getty Images

Databases depend upon keys to store, sort and compare or create relationships between records. If you’ve been around databases for a while, you’ve probably heard about many different types of keys — primary keys, candidate keys, and foreign keys. When you create a new database table, you’re asked to select one primary key that will uniquely identify each record stored in that table.

Why is a Primary Key Important?

The selection of a primary key is one of the most critical decisions you’ll make in the design of a new database.

The most important constraint is that you must ensure that the selected key is unique. If it’s possible that two records (past, present, or future) may share the same value for an attribute, it’s a poor choice for a primary key.

Another important aspect of a primary key is its use by other tables that link to it in a relational database. In this aspect, a primary key acts like the target of a pointer. Because of these inter-dependencies, a primary key must exist when a record is created, and can never change. 

Poor Choices for Primary Keys

Let’s consider what not to use as a primary key:

ZIP Codes do not make good primary keys for a table of towns. If you’re making a a simple lookup table of cities, ZIP code seems to be a logical primary key. However, upon further investigation, you may realize that more than one town may share a ZIP code. For example, four cities in New Jersey (Neptune, Neptune City, Tinton Falls and Wall Township) all share the ZIP code 07753.

Social Security Numbers do not make good primary keys, for many reasons. First, most people consider their SSN private and would not want it clearly visible to database users.Second, some people don’t have SSNs — in particular, foreigners or immigrants who have never gotten a social security card. Third, SSNs can be reused after an individual’s death.

Finally, an individual may have more than one SSN over a lifetime — the Social Security Administration will issue a new number in cases of fraud or identity theft.

Email addresses are also a poor choice for a primary key. Although they are unique, they can change over time. Further, not everyone has an email address.

Choosing an Effective Primary Key

So, what makes a good primary key?

In most cases, turn to your database system for support. A best practice in database design is to use an internally generated primary key. Your database management system can normally generate a unique identifier that has no meaning outside of the database system. For example, you might use the Microsoft Access AutoNumber data type to create a field called RecordID. The AutoNumber data type automatically increments the field each time you create a new record. While the number itself is meaningless, it provides a great way to reference an individual record in queries.