Excel Clean Function

01
of 01

Remove Non-Printable ASCII and Unicode Characters with the CLEAN Function

Remove Non-Printable Characters in Excel with the CLEAN Function
Remove Non-Printable Characters in Excel with the CLEAN Function. © Ted French

Use the CLEAN function to remove a number of non-printable computer characters that have been copied or imported into a worksheet along with good data.

This low-level code is frequently found at the beginning and/or end of data files.

Some common examples of these non-printable characters are the characters mixed in with the text in the examples in cells A2 and A6 in the image above.

These characters can interfere with using the data in worksheet operations such as printing, sorting, and filtering data.

Unicode and ASCII Character Codes

Each character on a computer—printable and non-printable—has a number known as its Unicode character code or value.

Another, older, and better known character set is ASCII, which stands for the American Standard Code for Information Interchange, has been incorporated into the Unicode set.

As a result, the first 32 characters (0 to 31) of the Unicode and ASCII sets are identical and they are referred to as control characters used by programs to control peripheral devices such as printers.

As such, they are not intended for use in a worksheet and can cause the sorts of errors mentioned above when present.

The CLEAN function, which predates the Unicode character set, was designed to remove the first 32 non-printing ASCII characters and removes the same characters from the Unicode set.

The CLEAN Function's Syntax and Arguments

A function's syntax refers to the layout of the function and includes the function's name, brackets, and arguments.

The syntax for the CLEAN function is:

= CLEAN ( Text )

Text - (required) the data to be cleaned of non-printable characters. A cell reference to the location of this data in the worksheet.

For example, to clean the data in cell A2 in the image above, enter the formula:

=CLEAN(A2)

into another worksheet cell.

Cleaning Numbers

If used to clean number data, the CLEAN function, in addition to removing any non-printing characters, will convert all numbers to text - which may result in errors if that data is then used in calculations.

Examples: Removing Non-Printable Characters

In column A in the image, the CHAR function has been used to add non-printing characters to the word text as shown in the formula bar above the worksheet for cell A3 that are then removed with the CLEAN function.

In columns B and C of the image above, the LEN function, which counts the number of characters in a cell, is used to show the effect of using the CLEAN function on the data in column A.

The character count for cell B2 is 7—four characters for the word text and three for the non-printing characters surrounding it.

The character count in cell C2 is 4 because the CLEAN function has be added to the formula and strips away the three non-printing characters before the LEN function counts the characters.

Removing Characters #129, #141, #143, #144, and #157

The Unicode character set contains additional non-printing characters not found in the ASCII character set—numbers 129, 141, 143, 144, and 157.

Even though Excel’s support website says it cannot, the CLEAN function can remove these Unicode characters from data as shown in row three above.

In this example, the CLEAN function in column C is used to strip away these five non-visible control characters leaving again a character count of just four for the word text in C3.

Removing Character #127

There is one non-printing character in the Unicode set that the CLEAN function can't remove—the box-shaped character #127 shown in cell A4, where four of these characters surround the word text.

The character count of eight in cell C4  is the same as that in cell B4 and because the CLEAN function in C4 is trying unsuccessfully to remove #127 on its own.

However, as shown in rows five and six above, there are alternative formulas using the CHAR and SUBSTITUTE functions that can be used to remove this character:

  1. The formula in row five uses the SUBSTITUTE and CHAR to replace character #127 with a character that the CLEAN function can remove—in this case, character # 7 (the black dot seen in cell A2);
  2. The formula in row six uses the SUBSTITUTE and CHAR functions to replace character #127 with nothing as shown by the empty quotation marks ( "" ) at the end of the formula in cell D6. As a result, the CLEAN function is not needed in the formula, since there is no character to remove.

Removing Non-Breaking Spaces from a Worksheet

Similar to non-printable characters is the non-breaking space which can also cause problems with calculations and formatting in a worksheet. The Unicode value for the non-breaking space is #160.

Non-breaking spaces are used extensively in web pages—the html code for it is  —so if data is copied into Excel from a web page, non-breaking spaces may be included.

One way to remove non-breaking spaces from a worksheet is with this formula that combines the SUBSTITUTE, CHAR, and TRIM functions.