SQL Fundamentals

Introduction

The Structured Query Language (SQL) comprises one of the fundamental building blocks of modern database architecture. SQL defines the methods used to create and manipulate relational databases on all major platforms. At first glance, the language may seem intimidating and complex but it's really not all that bad. In a series of articles over the next few weeks we'll explore the inner workings of SQL together.
By the time we're through, you'll have the fundamental knowledge you need to go out there and start working with databases!

This week, our first article in the SQL series provides an introduction to the basic concepts behind SQL and we'll take a brief look at some of the main commands used to create and modify databases. Throughout this article, please keep our goal in mind: we're trying to get the "big picture" of SQL -- not a mastery of the individual commands. We'll provide a few examples for illustrative purposes and explain the theory behind them, but don't feel frustrated if you can't write your own SQL commands after reading this article. We'll cover each of the major commands in detail in future weekly installments. If you'd like a reminder in your e-mail inbox each week when the next article is posted, please take a moment and subscribe to our newsletter.

By the way, the correct pronunciation of SQL is a contentious issue within the database community.
In their SQL standard, the American National Standards Institute declared that the official pronunciation is "es queue el." However, many database professionals have taken to the slang pronunciation "sequel." The choice is yours.

SQL comes in many flavors. Oracle databases utilize their proprietary PL/SQL.
Microsoft SQL Server makes use of Transact-SQL. However, all of these variations are based upon the industry standard ANSI SQL. In our tutorial series, we'll stick to ANSI-compliant SQL commands that will work on any modern relational database system.

SQL commands can be divided into two main sublanguages. The Data Definition Language (DDL) contains the commands used to create and destroy databases and database objects. After the database structure is defined with DDL, database administrators and users can utilize the Data Manipulation Language to insert, retrieve and modify the data contained within it. In the next two sections of this article, we'll explore DDL and DML in further detail. In future articles we'll take an in-depth look at specific SQL commands.

Now, let's take a look at the Data Definition Language. Read on!

The Data Definition Language (DDL) is used to create and destroy databases and database objects. These commands will primarily be used by database administrators during the setup and removal phases of a database project. Let's take a look at the structure and usage of four basic DDL commands:

The Data Definition Language (DDL) is used to create and destroy databases and database objects. These commands will primarily be used by database administrators during the setup and removal phases of a database project.

Let's take a look at the structure and usage of four basic DDL commands:

CREATE

Installing a database management system (DBMS) on a computer allows you to create and manage many independent databases. For example, you may want to maintain a database of customer contacts for your sales department and a personnel database for your HR department. The CREATE command can be used to establish each of these databases on your platform. For example, the command:

CREATE DATABASE employees

creates an empty database named "employees" on your DBMS. After creating the database, your next step is to create tables that will contain data. (If this doesn't make sense, you might want to read the article Microsoft Access Fundamentals for an overview of tables and databases.) Another variant of the CREATE command can be used for this purpose. The command:

CREATE TABLE personal_info (first_name char(20) not null, last_name char(20) not null, employee_id int not null)

establishes a table titled "personal_info" in the current database.

In our example, the table contains three attributes: first_name, last_name and employee_id. Don't worry about the other information included in the command -- we'll cover that in a future article.

USE

The USE command allows you to specify the database you wish to work with within your DBMS. For example, if we're currently working in the sales database and want to issue some commands that will affect the employees database, we would preface them with the following SQL command:

USE employees

It's important to always be conscious of the database you are working in before issuing SQL commands that manipulate data.



ALTER

Once you've created a table within a database, you may wish to modify the definition of it. The ALTER command allows you to make changes to the structure of a table without deleting and recreating it. Take a look at the following command:

ALTER TABLE personal_info
ADD salary money null

This example adds a new attribute to the personal_info table -- an employee's salary. The "money" argument specifies that an employee's salary will be stored using a dollars and cents format. Finally, the "null" keyword tells the database that it's OK for this field to contain no value for any given employee.

DROP

The final command of the Data Definition Language, DROP, allows us to remove entire database objects from our DBMS. For example, if we want to permanently remove the personal_info table that we created, we'd use the following command:

DROP TABLE personal_info

Similarly, the command below would be used to remove the entire employees database:

DROP DATABASE employees

Use this command with care! Remember that the DROP command removes entire data structures from your database. If you want to remove individual records, use the DELETE command of the Data Manipulation Language.

That's the Data Definition Language in a nutshell.

In the next section of this article, we'll take a look at how the Data Manipulation Language is used to manipulate the information contained within a database. Read on!

The Data Manipulation Language (DML) is used to retrieve, insert and modify database information. These commands will be used by all database users during the routine operation of the database. Let's take a brief look at the basic DML commands:

The Data Manipulation Language (DML) is used to retrieve, insert and modify database information. These commands will be used by all database users during the routine operation of the database. Let's take a brief look at the basic DML commands:

INSERT

The INSERT command in SQL is used to add records to an existing table. Returning to the personal_info example from the previous section, let's imagine that our HR department needs to add a new employee to their database. They could use a command similar to the one shown below:

INSERT INTO personal_info
values('bart','simpson',12345,$45000)

Note that there are four values specified for the record. These correspond to the table attributes in the order they were defined: first_name, last_name, employee_id, and salary.

SELECT

The SELECT command is the most commonly used command in SQL. It allows database users to retrieve the specific information they desire from an operational database. Let's take a look at a few examples, again using the personal_info table from our employees database.

The command shown below retrieves all of the information contained within the personal_info table. Note that the asterisk is used as a wildcard in SQL. This literally means "Select everything from the personal_info table."

SELECT *
FROM personal_info

Alternatively, users may want to limit the attributes that are retrieved from the database. For example, the Human Resources department may require a list of the last names of all employees in the company. The following SQL command would retrieve only that information:

SELECT last_name
FROM personal_info

Finally, the WHERE clause can be used to limit the records that are retrieved to those that meet specified criteria. The CEO might be interested in reviewing the personnel records of all highly paid employees. The following command retrieves all of the data contained within personal_info for records that have a salary value greater than $50,000:

SELECT *
FROM personal_info
WHERE salary > $50000

UPDATE

The UPDATE command can be used to modify information contained within a table, either in bulk or individually. Each year, our company gives all employees a 3% cost-of-living increase in their salary. The following SQL command could be used to quickly apply this to all of the employees stored in the database:

UPDATE personal_info
SET salary = salary * 1.03

On the other hand, our new employee Bart Simpson has demonstrated performance above and beyond the call of duty. Management wishes to recognize his stellar accomplishments with a $5,000 raise. The WHERE clause could be used to single out Bart for this raise:

UPDATE personal_info
SET salary = salary + $5000
WHERE employee_id = 12345

DELETE

Finally, let's take a look at the DELETE command. You'll find that the syntax of this command is similar to that of the other DML commands. Unfortunately, our latest corporate earnings report didn't quite meet expectations and poor Bart has been laid off. The DELETE command with a WHERE clause can be used to remove his record from the personal_info table:

DELETE FROM personal_info
WHERE employee_id = 12345

On the next page of this article, we'll begin exploring the power of JOINs. Read on! Now that you’ve learned the basics of SQL, it’s time to move on to one of the most powerful concepts the language has to offer – the JOIN statement. Quite simply, these statements allow you to combine data in multiple tables to quickly and efficiently process large quantities of data. These statements are where the true power of a database resides.

We’ll first explore the use of a basic JOIN operation to combine data from two tables. In future installments, we’ll explore the use of outer and inner joins to achieve added power.

We’ll continue with our example using the PERSONAL_INFO table, but first we’ll need to add an additional table to the mix. Let’s assume we have a table called DISCIPLINARY_ACTION that was created with the following statement:

CREATE TABLE disciplinary_action (action_id int not null, employee_id int not null, comments char(500))

This table contains the results of disciplinary actions on company employees. You’ll notice that it doesn’t contain any information about the employee other than the employee number. It’s then easy to imagine many scenarios where we might want to combine information from the DISCIPLINARY_ACTION and PERSONAL_INFO tables.

Assume we’ve been tasked with creating a report that lists the disciplinary actions taken against all employees with a salary greater than $40,000. The use of a JOIN operation in this case is quite straightforward. We can retrieve this information using the following command:

SELECT personal_info.first_name, personal_info.last_name, disciplinary_action.comments
FROM personal_info, disciplinary_action
WHERE personal_info.employee_id = disciplinary_action.employee_id
AND personal_info.salary > 40000

As you can see, we simply specified the two tables that we wished to join in the FROM clause and then included a statement in the WHERE clause to limit the results to records that had matching employee IDs and met our criteria of a salary greater than $40,000.

Congratulations! You've learned the basic concepts behind the Structured Query Language. Stay tuned for future articles that explore these commands in-depth.