Splitting Strings in Ruby Using the String#split Method

Unless user input is a single word or number, that input will need to be split or turned into a list of strings or numbers.

For instance, if a program asks for your full name, including middle initial, it will first need to split that input into three separate strings before it can work with your individual first, middle and last name. This is achieved using the String#split method.

How String#split Works

In its most basic form, String#split takes a single argument: the field delimiter as a string. This delimiter will be removed from the output and an array of strings split on the delimiter will be returned.

So, in the following example, assuming the user input their name correctly, you should receive a three-element Array from the split.

#!/usr/bin/env ruby
print "What is your full name? "
full_name = gets.chomp
name = full_name.split(' ')
puts "Your first name is #{name.first}"
puts "Your last name is #{name.last}"

If we run this program and enter a name, we'll get the expected results. Also, note that name.first and name.last are conveniences. The name variable will be an Array, and those two method calls are equivalent to name[0] and name[-1] respectively.

$ ruby split.rb
What is your full name? Michael C. Morin
Your first name is Michael
Your last name is Morin

However, String#split is a bit smarter than you'd think. If the argument to String#split is a string, it does indeed use that as the delimiter, but if the argument is a string with a single space (as we used), then it infers that you want to split on any amount of whitespace and that you also want to remove any leading whitespace.
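
To make the special case concrete, here is a small sketch comparing a single-space string delimiter with a regular expression that matches exactly one space. The messy sample string and the variable names are made up for illustration.

```ruby
# Hypothetical input with stray leading, trailing, and repeated spaces.
messy = "  Michael   C.  Morin "

# A single-space string triggers the whitespace special case: leading
# whitespace is skipped and each run of whitespace counts as one delimiter.
awk_style = messy.split(' ')

# A regular expression matching one space gets no special treatment, so
# empty strings show up wherever two delimiters sit next to each other.
literal = messy.split(/ /)

p awk_style  # => ["Michael", "C.", "Morin"]
p literal    # contains empty strings alongside the name parts
```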

So, if we were to give it some slightly malformed input such as

Michael    C.    Morin

(with extra spaces), then String#split would still do what is expected. However, that's the only special case when you pass a String as the first argument.

Regular Expression Delimiters

You can also pass a regular expression as the first argument. Here, String#split becomes a bit more flexible. We can also make our little name splitting code a bit smarter.

We don't want the period at the end of the middle initial. We know it's a middle initial, and the database won't want a period there, so we can remove it while we split. When String#split matches a regular expression, it does exactly the same thing as when it matches a string delimiter: it drops the matched text from the output and splits the string at that point.

So, we can evolve our example a little bit:

$ cat split.rb
#!/usr/bin/env ruby
print "What is your full name? "
full_name = gets.chomp
name = full_name.split(/\.?\s+/)
puts "Your first name is #{name.first}"
puts "Your middle initial is #{name[1]}"
puts "Your last name is #{name.last}"
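
As a quick check that does not need interactive input, the same regular expression can be applied to a fixed string. The sample name below stands in for whatever gets.chomp would return.

```ruby
# Hypothetical sample input in place of interactive input.
full_name = "Michael C. Morin"

# /\.?\s+/ matches an optional period followed by one or more whitespace
# characters, so the period after the initial is consumed with the delimiter.
parts = full_name.split(/\.?\s+/)

p parts  # => ["Michael", "C", "Morin"]
```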

Default Field Separator

Ruby is not really big on the "special variables" that you might find in languages like Perl, but String#split does use one you need to be aware of. This is the default field separator variable, also known as $;.

It's a global, something you don't often see in Ruby, so if you change it, it might affect other parts of the code. Just be sure to change it back when finished. Ruby 2.7 and later also flag a non-nil $; as deprecated, so passing the delimiter explicitly is usually the better choice.

However, all this variable does is act as the default value for the first argument to String#split. By default, this variable is nil. If String#split's first argument is also nil, it behaves as if a single-space string had been passed.
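
The fallback can be seen without touching the global at all. This sketch assumes $; still has its default value of nil (nothing such as the -F command-line flag has set it); the sample string is made up.

```ruby
# Hypothetical input with irregular spacing.
line = "  alpha  beta gamma"

no_arg  = line.split       # no argument: falls back to $;
nil_arg = line.split(nil)  # explicit nil: same fallback
space   = line.split(' ')  # the single-space special case

p no_arg                                   # => ["alpha", "beta", "gamma"]
p no_arg == nil_arg && nil_arg == space    # => true
```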

Zero-Length Delimiters

If the delimiter passed to String#split is a zero-length string or regular expression, then String#split will act a bit differently. It will remove nothing at all from the original string and split on every character. This essentially turns the string into an array of equal length containing only one-character strings, one for each character in the string.

This can be useful for iterating over the string, and was used in Ruby versions before 1.8.7 (which backported a number of features from 1.9.x) to iterate over the characters in a string without breaking up multi-byte Unicode characters. However, if what you really want to do is iterate over a string, and you're using 1.8.7 or later, you should probably use String#each_char instead.

#!/usr/bin/env ruby
str = "She turned me into a newt!"
# Split on the empty string to get one element per character.
str.split('').each do |c|
  puts c
end
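
For comparison, this short sketch shows that an empty string, an empty regular expression, and String#each_char all yield the same list of one-character strings. The sample word is arbitrary.

```ruby
word = "newt"

by_split = word.split('')       # empty-string delimiter
by_regex = word.split(//)       # empty regular expression
by_chars = word.each_char.to_a  # the iterator suggested above

p by_split                                       # => ["n", "e", "w", "t"]
p by_split == by_regex && by_regex == by_chars   # => true
```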

Limiting The Length of the Returned Array

So back to our name parsing example, what if someone has a space in their last name? For instance, Dutch surnames can often begin with "van" (meaning "of" or "from").

We only really want a 3-element array, so we can use the second argument to String#split that we have so far ignored. The second argument is expected to be an Integer (a Fixnum in Ruby versions before 2.4). If this argument is positive, the array will contain at most that many elements, with the last element holding the unsplit remainder of the string. So in our case, we would want to pass 3 for this argument.

#!/usr/bin/env ruby
print "What is your full name? "
full_name = gets.chomp
name = full_name.split(/\.?\s+/, 3)
puts "Your first name is #{name.first}"
puts "Your middle initial is #{name[1]}"
puts "Your last name is #{name.last}"

If we run this again and give it a Dutch name, it will act as expected.

$ ruby split.rb
What is your full name? Vincent Willem van Gogh
Your first name is Vincent
Your middle initial is Willem
Your last name is van Gogh
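
The effect of the limit is easiest to see side by side. This sketch uses a fixed string in place of interactive input; the variable names are illustrative.

```ruby
# Hypothetical sample input in place of interactive input.
full_name = "Vincent Willem van Gogh"

unlimited = full_name.split(/\.?\s+/)     # splits on every delimiter
limited   = full_name.split(/\.?\s+/, 3)  # stops after three elements

p unlimited  # => ["Vincent", "Willem", "van", "Gogh"]
p limited    # => ["Vincent", "Willem", "van Gogh"]
```

Note that the limit is only an upper bound: a two-word name would still produce a two-element array, so name[1] would then hold the last name.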

However, if this argument is negative (any negative number), then there will be no limit on the number of elements in the output array and any trailing delimiters will appear as zero-length strings at the end of the array.

This is demonstrated in this IRB snippet:

:001 > "this,is,a,test,,,,".split(',', -1)
=> ["this", "is", "a", "test", "", "", "", ""]
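
For contrast, the following sketch runs the same string with and without the negative limit. Without it, String#split drops the trailing empty strings.

```ruby
csv_line = "this,is,a,test,,,,"

default_split = csv_line.split(',')      # trailing empty fields are dropped
keep_trailing = csv_line.split(',', -1)  # trailing empty fields are kept

p default_split         # => ["this", "is", "a", "test"]
p keep_trailing.length  # => 8
```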

Morin, Michael. "Splitting Strings in Ruby Using the String#split Method." ThoughtCo, Apr. 5, 2023, thoughtco.com/splitting-strings-2908301.