Strings - Java 8 Recipes, 2nd Edition (2014)

CHAPTER 3. Strings

Strings are one of the most commonly used data types in any programming language. They can be used for obtaining text from a keyboard, printing messages to a command line, and much more. Because strings are used so often, many features have been added to the String object over time to make them easier to work with. After all, a string is an object in Java, so it contains methods that can be used to manipulate the contents of the string. Strings are also immutable in Java, which means that their state cannot be changed or altered once they are created. This makes them a bit different to work with than some of the mutable, or changeable, data types. It is important to understand how to properly make use of immutable objects, especially when attempting to change or assign different values to them.

This chapter focuses on some of the most commonly used String methods and techniques for working with String objects. We also cover some useful techniques that are not inherent to String objects.

3-1. Obtaining a Subsection of a String

Problem

You would like to retrieve a portion of a string.

Solution

Use the substring() method to obtain a portion of the string between two different positions. In the solution that follows, a string is created and then various portions of the string are printed out using the substring() method.

public static void substringExample(){
String originalString = "This is the original String";
System.out.println(originalString.substring(0, originalString.length()));
System.out.println(originalString.substring(5, 20));
System.out.println(originalString.substring(12));
}

Running this method would yield the following results:

This is the original String
is the original
original String

How It Works

The String object contains many helper methods. One such method is substring(), which can be used to obtain portions of the string. There are two variations of the substring() method. One of them accepts a single argument, the starting index; the other accepts two arguments: the starting index and the ending index. Having two variations of the substring() method makes it seem as though the second argument is optional; if it is not specified, the length of the calling string is used in its place. The starting index is inclusive and the ending index is exclusive, so the character at the ending index is not part of the result. It should be noted that indices begin with zero, so the first position in a string has the index of 0, and so on.

As you can see from the solution to this recipe, the first use of substring() prints out the entire contents of the string. This is because the first argument passed to the substring() method is 0, and the second argument passed is the length of the original string. In the second example of substring(), an index of 5 is used as the first argument, and an index of 20 is used as the second argument. This effectively causes only a portion of the string to be printed, beginning with the character in the string that is located in the sixth position, or index 5 because the first position has an index of 0; and ending with the character in the string that is located in the 20th position, the index of 19. The third example specifies only one argument; therefore, the result will be the original string beginning with the position specified by that argument.

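Note The substring() method only accepts non-negative integer values that fall within the bounds of the string. If you attempt to pass a negative value, or an index greater than the length of the string, a StringIndexOutOfBoundsException will be thrown.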
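The index rules described above can be checked with a small sketch. The class name SubstringSketch and the safeSubstring() helper are hypothetical names introduced only for illustration; the substring() calls themselves are the standard String API. Because the ending index is exclusive, the length of the result is the ending index minus the starting index.

```java
class SubstringSketch {

    // Hypothetical helper: returns the substring, or null when the indices are invalid.
    static String safeSubstring(String s, int begin, int end) {
        try {
            return s.substring(begin, end);
        } catch (StringIndexOutOfBoundsException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String originalString = "This is the original String";

        // The ending index is exclusive: 20 - 5 = 15 characters are returned.
        System.out.println(originalString.substring(5, 20).length());

        // A negative starting index is rejected.
        System.out.println(safeSubstring(originalString, -1, 4));

        // An ending index past the end of the string is rejected as well.
        System.out.println(safeSubstring(originalString, 0, 100));
    }
}
```

Returning null is only one way to report the failure; in real code you might prefer to validate the indices up front or let the exception propagate.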

3-2. Comparing Strings

Problem

An application that you are writing needs to have the ability to compare two or more string values.

Solution

Use the built-in equals(), equalsIgnoreCase(), compareTo(), and compareToIgnoreCase() methods to compare the values contained within the strings. The following is a series of tests using different string-comparison operations.

As you can see, various if statements are used to print out messages if the comparisons are equal:

String one = "one";
String two = "two";

String var1 = "one";
String var2 = "Two";

String pieceone = "o";
String piecetwo = "ne";

// Comparison is equal
if (one.equals(var1)){
System.out.println ("String one equals var1 using equals");
}

// Comparison is NOT equal
if (one.equals(two)){
System.out.println ("String one equals two using equals");
}

// Comparison is NOT equal
if (two.equals(var2)){
System.out.println ("String two equals var2 using equals");
}

// Comparison is equal, but is not directly comparing string values using ==
if (one == var1){
System.out.println ("String one equals var1 using ==");
}

// Comparison is equal
if (two.equalsIgnoreCase(var2)){
System.out.println ("String two equals var2 using equalsIgnoreCase");
}

System.out.println("Trying to use == on Strings that are pieced together");

String piecedTogether = pieceone + piecetwo;

// Comparison is equal
if (one.equals(piecedTogether)){
System.out.println("The strings contain the same value using equals");
}

// Comparison is NOT equal using ==
if (one == piecedTogether) {
System.out.println("The string contain the same value using == ");
}

// Comparison is equal
if (one.compareTo(var1) == 0){
System.out.println("One is equal to var1 using compareTo()");
}

Results in the following output:

String one equals var1 using equals
String one equals var1 using ==
String two equals var2 using equalsIgnoreCase
Trying to use == on Strings that are pieced together
The strings contain the same value using equals
One is equal to var1 using compareTo()

How It Works

One of the trickier parts of using a programming language can come when attempting to compare two or more values. In the Java language, comparing strings can be fairly straightforward, keeping in mind that you should not use the == operator for string comparison. This is because the comparison operator (==) compares object references, not the values of strings. One of the most tempting things to do when programming with strings in Java is to use the comparison operator, but you must not because the results can vary depending on how each string was created.

Note Java uses interning of strings to speed up performance. This means that the JVM contains a table of interned strings, and each time the intern() method is called on a string, a lookup is performed on that table to find a match. If no matching string resides within the table, the string is added to the table and a reference is returned. If the string already resides within the table, the reference is returned. Java will automatically intern string literals, and this can cause variation when using the == comparison operator.
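The effect of interning on the == operator can be seen in a short sketch. The class name InternSketch and the buildAtRuntime() helper are hypothetical; the helper exists only to produce a string at run time, so that it is a different object from the literal.

```java
class InternSketch {

    // Builds a String at run time so that it is a distinct object from any literal.
    static String buildAtRuntime(String first, String second) {
        return new StringBuilder(first).append(second).toString();
    }

    public static void main(String[] args) {
        String literal = "one";
        String pieced = buildAtRuntime("o", "ne");

        // Different objects, so the reference comparison fails.
        System.out.println(literal == pieced);          // false

        // Same characters, so the value comparison succeeds.
        System.out.println(literal.equals(pieced));     // true

        // intern() returns the reference held in the table of interned strings.
        System.out.println(literal == pieced.intern()); // true
    }
}
```

This is the variation the note warns about: two strings with identical contents may or may not be the same object, which is why equals() is the safe choice.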

In the solution to this recipe, you can see various different techniques for comparing string values. The equals() method is a part of every Java object. The Java string equals() method has been overridden so that it will compare the values contained within the string rather than the object itself. As you can see from the following examples that have been extracted from the solution to this recipe, the equals() method is a safe way to compare strings.

// Comparison is equal
if (one.equals(var1)){
System.out.println ("String one equals var1 using equals");
}
// Comparison is NOT equal
if (one.equals(two)){
System.out.println ("String one equals two using equals");
}

The equals() method will first check to see whether the strings reference the same object using the == operator; it will return true if they do. If they do not reference the same object, equals() will compare each string character-by-character to determine whether the strings being compared to each other contain exactly the same values. What if one of the strings has a different case setting than another? Do they still compare equal to each other using equals()? The answer is no, and that is why the equalsIgnoreCase() method was created. Comparing two values using equalsIgnoreCase() will cause each of the characters to be compared without paying attention to the case. The following examples have been extracted from the solution to this recipe:

// Comparison is NOT equal
if (two.equals(var2)){
System.out.println ("String two equals var2 using equals");
}
// Comparison is equal
if (two.equalsIgnoreCase(var2)){
System.out.println ("String two equals var2 using equalsIgnoreCase");
}

The compareTo() and compareToIgnoreCase() methods perform a lexicographical comparison of the strings; compareToIgnoreCase() does so without regard to case. This comparison is based upon the Unicode value of each character contained within the strings. The result will be a negative integer if the string lexicographically precedes the argument string. The result will be a positive integer if the string lexicographically follows the argument string. The result will be zero if both strings are lexicographically equal to each other. The following excerpt from the solution to this recipe demonstrates the compareTo() method:

// Comparison is equal
if (one.compareTo(var1) == 0){
System.out.println("One is equal to var1 using compareTo()");
}
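To make the sign of the compareTo() result more concrete, here is a minimal sketch. The class name CompareSketch and the describe() helper are hypothetical and are not part of the recipe's solution; only compareTo() and compareToIgnoreCase() come from the String API.

```java
class CompareSketch {

    // Maps the int returned by compareTo() to a readable description.
    static String describe(String left, String right) {
        int result = left.compareTo(right);
        if (result < 0) {
            return "precedes";
        } else if (result > 0) {
            return "follows";
        }
        return "equals";
    }

    public static void main(String[] args) {
        System.out.println("one " + describe("one", "two") + " two");
        System.out.println("two " + describe("two", "one") + " one");

        // Uppercase letters have lower Unicode values than lowercase letters,
        // so "Two" sorts before "two".
        System.out.println("Two " + describe("Two", "two") + " two");

        // compareToIgnoreCase() treats the two spellings as equal.
        System.out.println("Two".compareToIgnoreCase("two") == 0);
    }
}
```

Only the sign of the result should be relied on; the magnitude is an implementation detail of the character comparison.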

Inevitably, many applications contain code that must compare strings at some level. The next time you have an application that requires string comparison, consider the information discussed in this recipe before you write the code.

3-3. Trimming Whitespace

Problem

One of the strings you are working with contains some whitespace on either end. You would like to get rid of that whitespace.

Solution

Use the string trim() method to eliminate the whitespace. In the following example, a sentence is printed including whitespace on either side. The same sentence is then printed again using the trim() method to remove the whitespace so that the changes can be seen.

String myString = " This is a String that contains whitespace. ";
System.out.println(myString);
System.out.println(myString.trim());

The output will print as follows; the first line begins and ends with a space character, which trim() removes from the second line:

This is a String that contains whitespace.
This is a String that contains whitespace.

How It Works

Regardless of how careful we are, whitespace is always an issue when working with strings of text. This is especially the case when comparing strings against matching values. If a string contains an unexpected whitespace character, the result could be disastrous for a pattern-searching program. Luckily, the Java String object contains the trim() method that can be used to automatically remove whitespace from each end of any given string.

The trim() method is very easy to use. In fact, as you can see from the solution to this recipe, all that is required to use the trim() method is a call against any given string. Because strings are objects, they contain many helper methods, which can make them very easy to work with. After all, strings are one of the most commonly used data types in any programming language . . . so they’d better be easy to use! The trim() method returns a copy of the original string with all leading and trailing whitespace removed. If, however, there is no whitespace to be removed, the trim() method returns the original string instance. It does not get much easier than that!
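The behavior described above can be verified with a brief sketch. The class name TrimSketch and the returnsSameInstance() helper are hypothetical; square brackets are printed around each value so that the removed whitespace is visible.

```java
class TrimSketch {

    // True when trim() had nothing to remove and handed back the same object.
    static boolean returnsSameInstance(String s) {
        return s.trim() == s;
    }

    public static void main(String[] args) {
        String padded = "\t This is a String that contains whitespace. \n";
        String clean = "No padding here";

        // Tabs and newlines at either end are removed along with spaces.
        System.out.println("[" + padded.trim() + "]");

        // A new string is returned when something was trimmed...
        System.out.println(returnsSameInstance(padded));

        // ...and the original instance is returned when nothing was.
        System.out.println(returnsSameInstance(clean));

        // Whitespace inside the string is left alone.
        System.out.println("[" + "  a  b  ".trim() + "]");
    }
}
```

The == check is used here only to observe which object trim() returns; it is not a substitute for equals() when comparing values.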

3-4. Changing the Case of a String

Problem

A portion of your application contains case-sensitive string values. You want to change all the strings to uppercase before they are processed in order to avoid any case-sensitivity issues down the road.

Solution

Make use of the toUpperCase() and toLowerCase() methods. The String object provides these two helper methods to assist in performing a case change for all of the characters in a given string.

For example, given the string in the following code, each of the two methods will be called:

String str = "This String will change case.";
System.out.println(str.toUpperCase());
System.out.println(str.toLowerCase());

The following output will be produced:

THIS STRING WILL CHANGE CASE.
this string will change case.

How It Works

To ensure that the case of every character within a given string is either upper- or lowercase, use the toUpperCase() and toLowerCase() methods, respectively. There are a couple of items to note when using these methods. First, if a given string contains an uppercase letter, and the toUpperCase() method is called against it, the uppercase letter is left unchanged. The same concept holds true for lowercase letters when calling the toLowerCase() method. Second, any punctuation or numbers contained within the given string are also left unchanged.

There are two variations for each of these methods. One of the variations does not accept any arguments, while the other accepts an argument pertaining to the locale you want to use. Calling these methods without any arguments will result in a case conversion using the default locale. If you want to use a different locale, you can pass the desired locale as an argument, using the variation of the method that accepts an argument. For instance, if you want to use an Italian or French locale, you would use the following code:

System.out.println(str.toUpperCase(Locale.ITALIAN));
System.out.println(str.toUpperCase(new Locale("it","US")));
System.out.println(str.toLowerCase(new Locale("fr", "CA")));

Converting strings to upper- or lowercase using these methods can make life easy. They are also very useful for comparing strings that are taken as input from an application. Consider the case in which a user is prompted to enter a username, and the result is saved into a string. Now consider that later in the program that string is compared against all the usernames stored within a database to ensure that the username is valid. What happens if the person who entered the username types it with an uppercase first character? What happens if the username is stored within the database in all uppercase? The comparison will never be equal. In such a case, a developer can use the toUpperCase() method to alleviate the problem. Calling this method against the strings that are being compared will result in a comparison in which the case is the same in both strings.
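The username scenario just described might look like the following sketch. The class name UsernameSketch and the sameUsername() helper are hypothetical, and the use of Locale.ROOT is an assumption made here so that the result does not depend on the default locale of the machine running the code.

```java
import java.util.Locale;

class UsernameSketch {

    // Hypothetical helper: compares two usernames after converting both to uppercase.
    // Locale.ROOT is an assumption chosen to keep the conversion locale-independent.
    static boolean sameUsername(String entered, String stored) {
        return entered.toUpperCase(Locale.ROOT).equals(stored.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // The user typed a leading capital; the database holds all uppercase.
        System.out.println(sameUsername("Jsmith", "JSMITH"));

        // Different characters still compare as different.
        System.out.println(sameUsername("jsmith", "jsmyth"));
    }
}
```

The equalsIgnoreCase() method covered in recipe 3-2 is another way to get a similar result without creating the intermediate uppercase strings.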

3-5. Concatenating Strings

Problem

There are various strings that you want to combine into one.

Solution #1

If you want to concatenate strings onto the end of each other, use the concat() method. The following example demonstrates the use of the concat() method:

String one = "Hello";
String two = "Java8";
String result = one.concat(" ".concat(two));

The result is this:

Hello Java8

Solution #2

Use the concatenation operator to combine the strings in a shorthand manner. In the following example, a space character has been placed in between the two strings:

String one = "Hello";
String two = "Java8";
String result = one + " " + two;

The result is this:

Hello Java8

Solution #3

Use StringBuilder or StringBuffer to combine the strings. The following example demonstrates the use of StringBuffer to concatenate two strings:

String one = "Hello";
String two = "Java8";
StringBuffer buffer = new StringBuffer();
buffer.append(one).append(" ").append(two);
String result = buffer.toString();
System.out.println(result);

The result is this:

Hello Java8

How It Works

The Java language provides a few different options for concatenating strings of text. Although none is better than the others in every case, you may find one or the other to work better in different situations. The concat() method is a built-in string helper method. It has the ability to append one string onto the end of another, as demonstrated by solution #1 to this recipe. The concat() method will accept any string value; therefore, you can explicitly type a string value to pass as an argument if you want. As demonstrated in solution #1, simply passing one string as an argument to this method will append it to the end of the string on which the method is called. However, if you wanted to add a space character in between the two strings, you could do so by passing a space character as well as the string you want to append as follows:

String result = one.concat(" ".concat(two));

As you can see, having the ability to pass any string or combination of strings to the concat() method makes it very useful. Because all of the string helper methods actually return copies of the original string with the helper method functionality applied, you can pass strings calling other helper methods to concat() (or any other string helper method) as well. Consider that you want to display the text "Hello Java" rather than "Hello Java8". The following combination of string helper methods would allow you to do just that:

String one = "Hello";
String two = "Java8";
String result = one.concat(" ".concat(two.substring(0, two.length()-1)));

The concatenation operator (+) can be used to combine any two strings. It can be thought of as a shorthand form of the concat() method. The last technique, demonstrated in solution #3 to this recipe, is the use of StringBuffer, which is a mutable sequence of characters, much like a string, except that it can be modified through method calls. The StringBuffer class contains a number of helper methods for building and manipulating character sequences. In the solution, the append() method is used to append two string values. The append() method places the string that is passed as an argument at the end of the StringBuffer. For more information regarding the use of StringBuffer, refer to the online documentation at http://docs.oracle.com/javase/8/docs/api/java/lang/StringBuffer.html.
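Solution #3 mentions StringBuilder as well as StringBuffer. StringBuilder offers the same append() style of API but is not synchronized, so it is a common choice when only one thread builds the string. The following is a minimal sketch under that assumption; the class name ConcatSketch and the join() helper are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class ConcatSketch {

    // Hypothetical helper: joins the parts with a separator using a single StringBuilder.
    static String join(List<String> parts, String separator) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                // Place the separator between elements, not before the first one.
                builder.append(separator);
            }
            builder.append(parts.get(i));
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Hello", "Java8");
        System.out.println(join(words, " ")); // Hello Java8
    }
}
```

Appending inside a loop this way avoids creating a new intermediate String on every pass, which is what repeated use of concat() or the + operator in a loop would do.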

3-6. Converting Strings to Numeric Values

Problem

You want to have the ability to convert any numeric values that are stored as strings into integers.

Solution #1

Use the Integer.valueOf() helper method to convert strings to int data types. For example:

String one = "1";
String two = "2";
int result = Integer.valueOf(one) + Integer.valueOf(two);

As you can see, both of the string variables are converted into integer values. After that, they are used to perform an addition calculation and then stored into an int.

Note A technique known as unboxing, the counterpart of autoboxing, is used in this example. Autoboxing is a feature of the Java language that automates the process of converting primitive values to their appropriate wrapper classes. For instance, this occurs when you assign an int value to an Integer. Similarly, unboxing automatically occurs when you try to convert in the opposite direction, from a wrapper class to a primitive, which is what happens here when the Integer objects returned by valueOf() are added together and stored in an int. For more information on autoboxing, refer to the online documentation at http://docs.oracle.com/javase/tutorial/java/data/autoboxing.html.

Solution #2

Use the Integer.parseInt() helper method to convert strings to int data types. For example:

String one = "1";
String two = "2";
int result = Integer.parseInt(one) + Integer.parseInt(two);
System.out.println(result);

How It Works

The Integer class contains the valueOf() and parseInt() methods, which are used to convert strings or int types into integers. There are two different forms of the Integer class’s valueOf() method that can be used to convert strings into integer values. They differ in the number of arguments that they accept. The first valueOf() method accepts only a string argument. This string is parsed as a signed decimal integer if possible, and an Integer object holding the value of that string is returned. If the string cannot be parsed as an integer, the method will throw a NumberFormatException.
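Because a NumberFormatException is thrown for input that cannot be parsed, code that converts user-supplied text usually guards the call. One possible approach is sketched here; the class name ParseSketch and the parseOrDefault() helper are hypothetical, and returning a fallback value is only one of several reasonable ways to handle the failure.

```java
class ParseSketch {

    // Hypothetical helper: returns the parsed value, or the fallback when the
    // text is null or is not a valid int.
    static int parseOrDefault(String text, int fallback) {
        if (text == null) {
            return fallback;
        }
        try {
            // valueOf() does not ignore surrounding whitespace, so it is trimmed first.
            return Integer.valueOf(text.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("42", 0));    // 42
        System.out.println(parseOrDefault(" 7 ", 0));   // 7
        System.out.println(parseOrDefault("12a", -1));  // -1
        System.out.println(parseOrDefault(null, -1));   // -1
    }
}
```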

The second version of Integer’s valueOf() method accepts two arguments: a string argument that will be parsed as an integer and an int that represents the radix that is to be used for the conversion.

Note Many of the Java type classes contain valueOf() methods that can be used for converting different types into that class’s type. Such is the case with the String class because it contains many different valueOf() methods that can be used for conversion. For more information on the different valueOf() methods that the String class or any other type class contains, see the online Java documentation at http://docs.oracle.com/javase/8/docs.

There are also two different forms of the Integer class’s parseInt() method. One of them accepts one argument: the string you want to convert into an integer. The other form accepts two arguments: the string that you want to convert to an integer and the radix. The first format is the most widely used, and it parses the string argument as a signed decimal integer. A NumberFormatException will be thrown if a parsable integer is not contained within the string. The second format, which is less widely used, returns the int value that is represented by the string argument in the given radix, provided that a parsable integer is contained within that string. Unlike valueOf(), both forms of parseInt() return a primitive int rather than an Integer object.
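The radix argument is easiest to understand through a few concrete values. The sketch below uses a hypothetical class name, RadixSketch, and a hypothetical fromRadix() helper that simply wraps the two-argument parseInt() call.

```java
class RadixSketch {

    // Hypothetical helper: thin wrapper around the two-argument parseInt().
    static int fromRadix(String digits, int radix) {
        return Integer.parseInt(digits, radix);
    }

    public static void main(String[] args) {
        // Radix 16 reads the digits as hexadecimal.
        System.out.println(fromRadix("ff", 16));        // 255

        // Radix 2 reads the digits as binary.
        System.out.println(fromRadix("101", 2));        // 5

        // A leading minus sign is accepted because the value is signed.
        System.out.println(fromRadix("-42", 10));       // -42

        // valueOf() with a radix returns an Integer object instead of an int.
        Integer boxed = Integer.valueOf("777", 8);
        System.out.println(boxed);                      // 511
    }
}
```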

3-7. Iterating Over the Characters of a String

Problem

You want to iterate over the characters in a string of text so that you can manipulate them at the character level.

Solution

Use a combination of string helper methods to gain access to the string at a character level. If you use a String helper method within the context of a loop, you can easily traverse a string by character. In the following example, the string named str is broken down using the toCharArray() method.

String str = "Break down into chars";
System.out.println(str);
for (char chr:str.toCharArray()){
System.out.println(chr);
}

The same strategy could be used with the traditional version of the for loop. An index could be created that would allow access to each character of the string using the charAt() method.

for (int x = 0; x <= str.length()-1; x++){
System.out.println(str.charAt(x));
}

Both loops print each character of the string on its own line (the first example also prints the complete string before its loop begins):

B
r
e
a
k

d
o
w
n

i
n
t
o

c
h
a
r
s

Note The first example using toCharArray() generates a new character array. Therefore, the second example, using the traditional for loop, might perform faster.

How It Works

String objects contain methods that can be used for performing various tasks. The solution to this recipe demonstrates a number of different String methods. The toCharArray() method can be called against a string in order to break the string into characters and then store those characters in an array. This method is very powerful and it can save a bit of time when performing this task is required. The result of calling the toCharArray() method is a char[], which can then be traversed. Such is the case in the solution to this recipe, where an enhanced for loop is used to iterate through the contents of the char[] and print out each of its elements.

The string length() method is used to find the number of characters contained within a string. The result is an int value that can be very useful in the context of a for loop, as demonstrated in the solution to this recipe. In the second example, the length() method is used to find the number of characters in the string so that they can be iterated over using the charAt() method. The charAt() method accepts an int index value as an argument and returns the character that resides at the given index in the string.

Often the combination of two or more string methods can be used to obtain various results. In this case, using the length() and charAt() methods within the same code block provided the ability to break down a string into characters.
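As an illustration of doing something with each character rather than just printing it, the following sketch counts how often a character occurs. The class name CharCountSketch and the countOccurrences() helper are hypothetical; the loop itself follows the length() and charAt() approach from the solution.

```java
class CharCountSketch {

    // Counts how many times the target character appears, using length() and charAt().
    static int countOccurrences(String text, char target) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String str = "Break down into chars";
        System.out.println(countOccurrences(str, 'n')); // 2

        // toCharArray() returns a copy, so changing the array
        // leaves the immutable string untouched.
        char[] chars = str.toCharArray();
        chars[0] = 'b';
        System.out.println(str); // Break down into chars
    }
}
```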

3-8. Finding Text Matches

Problem

You want to search a body of text for a particular sequence of characters.

Solution #1

Make use of regular expressions and the string matches() helper method to determine whether a string matches a pattern. To do this, simply pass a string representing a regular expression to the matches() method of the string you are trying to match. The regular expression is then compared against the entire string that matches() is being called upon. Once evaluated, matches() will yield a boolean result, indicating whether it is a match. The following code excerpt contains a series of examples using this technique. The comments contained within the code explain each of the matching tests.

String str = "Here is a long String...let's find a match!";
// This will result in a "true" since it is an exact match
boolean result = str.matches("Here is a long String...let's find a match!");
System.out.println(result);
// This will result in "false" since the entire String does not match
result = str.matches("Here is a long String...");

System.out.println(result);

str = "true";

// This will test against both upper & lower case "T"...this will be TRUE
result = str.matches("[Tt]rue");
System.out.println(result);

// This will test for one or the other
result = str.matches("[Tt]rue|[Ff]alse");
System.out.println(result);

// This will test to see if any numbers are present, in this case the
// person writing this String would be able to like any Java release!
str = "I love Java 8!";
result = str.matches("I love Java [0-9]!");
System.out.println(result);

// This will test TRUE as well...
str = "I love Java 7!";
result = str.matches("I love Java [0-9]!");
System.out.println(result);

// The following will test TRUE for any text after "I love " that ends
// with a space or a digit followed by "!". This is because .* matches
// any sequence of characters. Notice the space character inside the
// character class, before the numeric range...
result = str.matches("I love .*[ 0-9]!");
System.out.println(result);

// The following String also matches.
str = "I love Jython 2.5.4!";
result = str.matches("I love .*[ 0-9]!");

System.out.println(result);

Each of the results printed out in the example will be true, with the exception of the second example because it does not match.

Solution #2

Use the regular expression Pattern and Matcher classes for a better-performing and more versatile matching solution than the string matches() method. Although the matches() method will get the job done most of the time, there are some occasions in which you will require a more flexible way of matching. Using this solution is a three-step process:

1. Compile a pattern into a Pattern object.

2. Construct a Matcher object using the matcher() method on the Pattern.

3. Call the matches() method on the Matcher.

In the following example code, the Pattern and Matcher technique is demonstrated:

String str = "I love Java 8!";
boolean result = false;

Pattern pattern = Pattern.compile("I love .*[ 0-9]!");
Matcher matcher = pattern.matcher(str);
result = matcher.matches();

System.out.println(result);

The previous example will yield a TRUE value just like its variant that was demonstrated in solution #1.

How It Works

Regular expressions are a great way to find matches because they allow patterns to be defined so that an application does not have to explicitly find an exact string match. They can be very useful when you want to find matches against some text that a user may be typing into your program. However, they could be overkill if you are trying to match strings against a string constant you have defined in your program because the String class provides many methods that could be used for such tasks. Nevertheless, there will certainly come a time in almost every developer’s life when regular expressions can come in handy. They can be found in just about every programming language used today. Java makes them easy to use and understand.

Note Although regular expressions are used in many different languages today, the expression syntax for each language varies. For complete information regarding regular expression syntax, see the documentation online at http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html.

The easiest way to make use of regular expressions is to call the matches() method on the String object. Passing a regular expression to the matches() method will yield a boolean result that indicates whether the String matches the given regular expression pattern or not. At this point, it is useful to know what a regular expression is and how it works.

A regular expression is a string pattern that is used to match against other strings in order to determine its contents. Regular expressions can contain a number of different patterns that enable them to be dynamic in that they can have the ability to match many different strings that contain the same format. For instance, in the solution to this recipe, the following code can match several different strings:

result = str.matches("I love Java [0-9]!");

The regular expression string in this example is "I love Java [0-9]!", and it contains the pattern [0-9], which represents any single digit from 0 through 9. Therefore, any string that reads "I love Java " followed by a single digit from 0 through 9 and then an exclamation point will match the regular expression string. To see a listing of all the different patterns that can be used in a regular expression, see the online documentation available at the URL in the previous note.
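Because [0-9] stands for exactly one digit, a two-digit release number would not match the pattern above. The sketch below contrasts it with a variant that uses the + quantifier to allow one or more digits. The class name DigitPatternSketch and both helper methods are hypothetical, and the second pattern is an illustration rather than part of the recipe's solution.

```java
class DigitPatternSketch {

    // Matches only when exactly one digit follows "I love Java ".
    static boolean matchesSingleDigit(String text) {
        return text.matches("I love Java [0-9]!");
    }

    // Matches when one or more digits follow "I love Java ".
    static boolean matchesAnyDigits(String text) {
        return text.matches("I love Java [0-9]+!");
    }

    public static void main(String[] args) {
        System.out.println(matchesSingleDigit("I love Java 8!"));  // true
        System.out.println(matchesSingleDigit("I love Java 11!")); // false
        System.out.println(matchesAnyDigits("I love Java 11!"));   // true
        System.out.println(matchesAnyDigits("I love Java !"));     // false
    }
}
```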

A combination of Pattern and Matcher objects can also be used to achieve results similar to those of the string matches() method. The Pattern object can be used to compile a string into a regular expression pattern. A compiled pattern can provide performance gains to an application if the pattern is used multiple times. You can pass the same string-based regular expressions to the Pattern.compile() method as you would pass to the string matches() method. The result is a compiled Pattern object that can be matched against a string for comparison. A Matcher object can be obtained by calling the Pattern object’s matcher() method against a given string. Once a Matcher object is obtained, it can be used to match a given string against a Pattern using any of the three methods listed after the following code, each of which returns a boolean value indicating a match. The static Pattern.matches() method could be used as an alternative to the following three lines of solution #2, minus the reusability of the compiled pattern:

Pattern pattern = Pattern.compile("I love .*[ 0-9]!");
Matcher matcher = pattern.matcher(str);
result = matcher.matches();

· The Matcher matches() method attempts to match the entire input string with the pattern.

· The Matcher lookingAt() method attempts to match the input string to the pattern starting at the beginning.

· The Matcher find() method scans the input sequence looking for the next matching sequence in the string.
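The difference between the three Matcher methods listed above can be seen by running them against the same input. In this sketch the class name MatcherSketch, the JAVA_RELEASE constant, and the countMatches() helper are hypothetical. A fresh Matcher is created for each call so that the state left behind by one method does not affect the next.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherSketch {

    // Compiled once and reused, which is the main benefit of the Pattern class.
    private static final Pattern JAVA_RELEASE = Pattern.compile("Java [0-9]");

    // Counts the matches by calling find() until it returns false.
    static int countMatches(String text) {
        Matcher matcher = JAVA_RELEASE.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "Java 8 is great. I love Java 8!";

        // The pattern does not cover the entire input.
        System.out.println(JAVA_RELEASE.matcher(text).matches());   // false

        // The input does begin with a match.
        System.out.println(JAVA_RELEASE.matcher(text).lookingAt()); // true

        // A match exists somewhere in the input.
        System.out.println(JAVA_RELEASE.matcher(text).find());      // true

        System.out.println(countMatches(text));                     // 2
    }
}
```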

In the solution to this recipe, the matches() method is called against the Matcher object in order to attempt to match the entire string. In any event, regular expressions can be very useful for matching strings against patterns. The technique used for working with the regular expressions can vary in different situations, using whichever method works best for the situation.

3-9. Replacing All Text Matches

Problem

You have searched a body of text for a particular sequence of characters, and you are interested in replacing all matches with another string value.

Solution

Use a regular expression pattern to obtain a Matcher object; then use the Matcher object’s replaceAll() method to replace all matches with another string value. The example that follows demonstrates this technique:

String str = "I love Java 8! It is my favorite language. Java 8 is the "
+ "8th version of this great programming language.";
Pattern pattern = Pattern.compile("[0-9]");
Matcher matcher = pattern.matcher(str);
System.out.println("Original: " + str);
System.out.println(matcher.matches());
System.out.println("Replacement: " + matcher.replaceAll("7"));

This example will yield the following results:

Original: I love Java 8! It is my favorite language. Java 8 is the 8th version of this great programming language.
false
Replacement: I love Java 7! It is my favorite language. Java 7 is the 7th version of this great programming language.

How It Works

The replaceAll() method of the Matcher object makes it easy to find and replace a string or a portion of string that is contained within a body of text. In order to use the replaceAll() method of the Matcher object, you must first compile a Pattern object by passing a regular expression string pattern to the Pattern.compile() method. Use the resulting Pattern object to obtain a Matcher object by calling its matcher() method. The following lines of code show how this is done:

Pattern pattern = Pattern.compile("[0-9]");
Matcher matcher = pattern.matcher(str);

Once you have obtained a Matcher object, call its replaceAll() method by passing a string that you want to use to replace all the text that is matched by the compiled pattern. In the solution to this recipe, the string "7" is passed to the replaceAll() method, so it will replace all the areas in the string that match the "[0-9]" pattern.
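The replacement step can be wrapped in a small reusable method, as in the following sketch. The class name ReplaceSketch, the DIGIT constant, and the replaceDigits() helper are hypothetical. The sketch also shows the replaceAll() method on String, which accepts the regular expression directly, and the Matcher replaceFirst() method for comparison.

```java
import java.util.regex.Pattern;

class ReplaceSketch {

    // Compiled once so that repeated calls do not recompile the expression.
    private static final Pattern DIGIT = Pattern.compile("[0-9]");

    // Hypothetical helper: replaces every digit in the text with the given replacement.
    static String replaceDigits(String text, String replacement) {
        return DIGIT.matcher(text).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String str = "Java 8 is the 8th version";

        // Every match is replaced.
        System.out.println(replaceDigits(str, "7"));              // Java 7 is the 7th version

        // The String class offers the same operation without an explicit Pattern.
        System.out.println(str.replaceAll("[0-9]", "7"));         // Java 7 is the 7th version

        // replaceFirst() stops after the first match.
        System.out.println(DIGIT.matcher(str).replaceFirst("7")); // Java 7 is the 8th version
    }
}
```

Note that the replacement string gives special meaning to the $ and \ characters; Matcher.quoteReplacement() can be used when the replacement text should be taken literally.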

3-10. Determining Whether a File Name Ends with a Given String

Problem

You are reading a file from the server and you need to determine what type of file it is in order to read it properly.

Solution

Determine the suffix of the file by using the endsWith() method on a given file name. In the following example, assume that the variable filename contains the name of a given file, and the code is using the endsWith() method to determine whether filename ends with a particular string:

if(filename.endsWith(".txt")){
System.out.println("Text file");
} else if (filename.endsWith(".doc")){
System.out.println("Document file");
} else if (filename.endsWith(".xls")){
System.out.println("Excel file");
} else if (filename.endsWith(".java")){
System.out.println("Java source file");
} else {
System.out.println("Other type of file");
}

Given that a file name and its suffix are included in the filename variable, this block of code will read its suffix and determine what type of file the given variable represents.

How It Works

As mentioned previously, the String object contains many helper methods that can be used to perform tasks. The String object’s endsWith() method accepts a character sequence and then returns a boolean value representing whether the original string ends with the given sequence. In the case of the solution to this recipe, the endsWith() method is used in an if block. A series of file suffixes is passed to the endsWith() method to determine what type of file is represented by the filename variable. If any of the file name suffixes matches, a line is printed, stating what type of file it is.
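The endsWith() method is case-sensitive, so a file named Report.TXT would fall through to the final else branch in the solution. One way to handle that, sketched here under the assumption that suffixes should be compared without regard to case, is to convert the file name to lowercase first. The class name FileTypeSketch and the describe() helper are hypothetical.

```java
import java.util.Locale;

class FileTypeSketch {

    // Hypothetical helper: returns a description of the file type based on its suffix,
    // ignoring the case of the file name.
    static String describe(String filename) {
        String lower = filename.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".txt")) {
            return "Text file";
        } else if (lower.endsWith(".doc")) {
            return "Document file";
        } else if (lower.endsWith(".xls")) {
            return "Excel file";
        } else if (lower.endsWith(".java")) {
            return "Java source file";
        }
        return "Other type of file";
    }

    public static void main(String[] args) {
        System.out.println(describe("Report.TXT"));   // Text file
        System.out.println(describe("Main.java"));    // Java source file
        System.out.println(describe("archive.zip"));  // Other type of file
    }
}
```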

Summary

This chapter covered the basics of working with strings. Although a string may look like a simple sequence of characters, it is an object that contains many methods that can be useful for obtaining the required results. Although strings are immutable objects, many methods within the String class return a copy of the string, modified to suit the request. This chapter covered a handful of these methods, demonstrating features such as concatenation, how to obtain portions of strings, trimming whitespace, and replacing portions of a string.