How to Extract Text With Excel's MID and MIDB Functions

Excel MID and MIDB Functions

Extract Good Text From Bad with the MID function. © Ted French

When text is copied or imported into Excel, unwanted garbage characters are sometimes included with the good data.

Or, there are times when only part of the text string in the cell is needed - such as a person's first name but not the last name.

For instances like these, Excel has a number of functions that can be used to separate the wanted data from the rest.

Which function you use depends upon where the good data is located relative to the unwanted characters in the cell.

  • If the good data, or substring to be kept, is on the right side of the data, use the RIGHT function to extract it.
  • If the substring is on the left side of the data, use the LEFT function to extract it.
  • If the substring has unwanted characters on both sides of it, use the MID or MIDB functions to extract it.
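The choice among the three functions can be sketched outside Excel as well. The Python below is only an illustration of the idea, not Excel itself: the helper names left, right, and mid and the sample strings are assumptions made for this example, and Excel's own functions count positions from 1 while Python slices count from 0.

```python
# Illustrative Python stand-ins for Excel's LEFT, RIGHT, and MID.
# The names and sample strings are hypothetical, not from the worksheet.

def left(text: str, num_chars: int) -> str:
    """Keep the first num_chars characters (good data on the left)."""
    return text[:num_chars]


def right(text: str, num_chars: int) -> str:
    """Keep the last num_chars characters (good data on the right)."""
    return text[max(len(text) - num_chars, 0):]


def mid(text: str, start_num: int, num_chars: int) -> str:
    """Keep num_chars characters starting at the 1-based position start_num."""
    begin = start_num - 1  # convert Excel's 1-based position to a 0-based index
    return text[begin:begin + num_chars]


print(right("***Report", 6))      # unwanted characters on the left
print(left("Report***", 6))       # unwanted characters on the right
print(mid("***Report***", 4, 6))  # unwanted characters on both sides
```

In each call the good data is the six-character word Report; only the position of the unwanted asterisks changes, which is what decides the function to use.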

MID vs. MIDB

The MID and MIDB functions differ only in the languages they support.

MID is for languages that use a single-byte character set (SBCS). This group includes English and the other European languages. MID always counts each character as one.

MIDB is for languages that use a double-byte character set (DBCS): Japanese, Chinese (Simplified), Chinese (Traditional), and Korean. When one of these is set as the default language, MIDB counts each double-byte character as two bytes.
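The difference between counting characters and counting bytes can be seen in a short Python sketch. It assumes the cp932 (Shift-JIS) code page as a stand-in for a Japanese double-byte character set; the helper name and the sample words are chosen only for this illustration and are not part of Excel.

```python
# A sketch of character count versus byte count.
# Assumption: cp932 (Shift-JIS) stands in for a double-byte character set.

def char_and_byte_length(text: str, encoding: str = "cp932") -> tuple[int, int]:
    """Return (number of characters, number of bytes in the given encoding)."""
    return len(text), len(text.encode(encoding))


# Single-byte text: characters and bytes agree.
print(char_and_byte_length("Excel"))

# Double-byte text: each character occupies two bytes.
print(char_and_byte_length("東京"))
```

This is why MIDB takes a Num_bytes argument rather than Num_chars: in a double-byte language, a two-character substring is four bytes long.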

The MID and MIDB Function Syntax and Arguments

In Excel, a function's syntax refers to the layout of the function and includes the function's name, brackets, and arguments.

The syntax for the MID function is:

= MID ( Text , Start_num , Num_chars )

The syntax for the MIDB function is:

= MIDB ( Text , Start_num , Num_bytes )

These arguments tell Excel

  • what data is to be used in the function;
  • the starting position of the good data or substring that is to be extracted;
  • the length of the substring.

Text - (required for MID and MIDB) the text string containing the desired data. This argument can be the actual string or a cell reference to the location of the data in the worksheet, as in rows 2 and 3 of the image above.

Start_num - (required for MID and MIDB) specifies the position, counted from the left of the text string, of the first character of the substring to be kept.

Num_chars - (required for MID) specifies the number of characters to be retained, beginning with the character at Start_num.

Num_bytes - (required for MIDB) specifies the length of the substring to be retained, in bytes, beginning at Start_num.

Notes:

  • If Start_num is greater than the length of the text string, MID/MIDB returns a blank cell - row 4 of the image, where Start_num is equal to 14, and the text string is only 13 characters long.

  • If Start_num is less than 1 or Num_chars/Num_bytes is negative, the MID/MIDB function returns the #VALUE! error value - row 6 of the image, where Start_num is equal to -1.

  • If Num_chars/Num_bytes references an empty cell or is set to zero, MID/MIDB returns a blank cell - row 7 of the image, where Num_chars references the empty cell B13.
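The arguments and notes above can be summarized in a small Python sketch. It is an approximation written for this article, not Excel's implementation: the function name mid, the use of ValueError to stand in for the #VALUE! error, and the 13-character sample string are all assumptions.

```python
# A rough Python model of MID's argument rules (illustrative only).

def mid(text: str, start_num: int, num_chars: int) -> str:
    """Return num_chars characters of text, starting at 1-based start_num."""
    # Excel shows #VALUE! when Start_num < 1 or Num_chars is negative;
    # a ValueError is used here as a stand-in for that error value.
    if start_num < 1 or num_chars < 0:
        raise ValueError("#VALUE!")
    begin = start_num - 1  # 1-based position to 0-based index
    # Slicing past the end of the string yields "", like Excel's blank cell.
    return text[begin:begin + num_chars]


sample = "Hello, World!"      # hypothetical 13-character string
print(mid(sample, 8, 5))      # a substring from the middle
print(repr(mid(sample, 14, 3)))  # Start_num beyond the text: empty result
print(repr(mid(sample, 8, 0)))   # Num_chars of zero: empty result
```

An empty cell used for Num_chars behaves like the zero case in this model, since Excel treats the empty reference as 0.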

MID Function Example - Extract Good Data from Bad

The example in the image above shows a number of ways to use the MID function to extract a specific number of characters from a text string, including entering the data directly as arguments for the function - row 2 - and entering cell references for all three arguments - row 5.

Since it is usually best to enter cell references for arguments rather than the actual data, the information below lists the steps used to enter the MID function and its arguments into cell C5.

The MID Function Dialog Box

Options for entering the function and its arguments into cell C5 include:

  1. Typing the complete function: = MID ( A3, B11, B12 ) into cell C5;
  2. Selecting the function and arguments using the function's dialog box.

Using the dialog box to enter the function often simplifies the task, as the dialog box takes care of the function's syntax - entering the function's name, the comma separators, and brackets in the correct locations and quantity.

Pointing at Cell References

No matter which option you choose for entering the function into a worksheet cell, it is probably best to use point and click to enter any and all cell references used as arguments to minimize the chance of errors caused by typing in the wrong cell reference.

Using the MID Function Dialog Box

  1. Click on cell C5 to make it the active cell - this is where the results of the function will be displayed;
  2. Click on the Formulas tab of the ribbon menu;
  3. Choose Text from the ribbon to open the function drop down list;
  4. Click on MID in the list to bring up the function's dialog box;
  5. In the dialog box, click on the Text line;
  6. Click on cell A3 in the worksheet to enter this cell reference as the Text argument;
  7. Click on the Start_num line;
  8. Click on cell B11 in the worksheet to enter this cell reference;
  9. Click on the Num_chars line;
  10. Click on cell B12 in the worksheet to enter this cell reference;
  11. Click OK to complete the function and close the dialog box;
  12. The extracted substring file #6 should appear in cell C5;
  13. When you click on cell C5, the complete function =MID (A3,B11,B12) appears in the formula bar above the worksheet.

Extracting Numbers with the MID Function

As shown in row eight of the example above, the MID function can be used to extract a subset of numeric data from a longer number using the steps listed above.

The only problem is that the extracted data is converted to text and cannot be used in calculations involving certain functions - such as the SUM and AVERAGE functions.

One way around this problem is to use the VALUE function to convert the text into a number as shown in row 9 above:

= VALUE (MID(A8,5,3))

A second option is to use Paste Special to convert the text to numbers.
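The text-versus-number issue has a close analogue in Python, sketched below. The number 1234567 is a hypothetical value chosen for this example (the contents of cell A8 are not reproduced here), mid is the same illustrative stand-in for Excel's MID, and float plays the role of the VALUE function.

```python
# A sketch of why the extracted digits need converting before calculation.
# Assumptions: 1234567 is a made-up source number; float() stands in for VALUE.

def mid(text: str, start_num: int, num_chars: int) -> str:
    """Illustrative stand-in for Excel's MID (1-based start position)."""
    begin = start_num - 1
    return text[begin:begin + num_chars]


source_number = 1234567

# Like MID, the extraction works on the text form and returns text.
extracted = mid(str(source_number), 5, 3)
print(repr(extracted))

# Converting the text back to a number, as VALUE(MID(...)) does in Excel.
as_number = float(extracted)
print(as_number + 1)
```

Until the conversion is made, the extracted digits are a string and cannot take part in arithmetic, which mirrors the SUM and AVERAGE limitation described above.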