The String Data Type - Expert C# 5.0: with .NET 4.5 Framework (2013)

CHAPTER 10

The String Data Type

This chapter discusses the string data type in the Microsoft .NET Framework using the C# language. First I will show how the CLR instantiates a string in .NET. I will then discuss string immutability, through which the CLR ensures that a string cannot be changed once it has been created, and then examine chaining operations on strings and the various concatenation techniques that the .NET Framework provides.

Throughout the chapter I reference StringBuilder, a class that can be used to build strings efficiently and to manipulate them by appending, inserting, or removing text. You will see this class used in several examples, but it is not until later in the chapter that I detail its internal workings. There we will examine the constructors of StringBuilder and its append, insert, and remove operations to see how the CLR handles strings that are generated with StringBuilder.

String in .NET

In C#, you can represent numbers such as 1, 2, and 3 using the Int32 data type, and characters such as 'A', 'B', or 'C' using the char data type. To represent a word, a sentence, and so on, you can use the String data type. In .NET, the C# string is a sealed class defined in the System namespace of the mscorlib.dll assembly (located in C:\Windows\Microsoft.NET\Framework\v4.0.30319\mscorlib.dll), as shown in Figure 10-1.

Figure 10-1. String class in the System namespace

The class definition of the String is extracted using the ildasm.exe, as shown in Listing 10-1.

Listing 10-1. Definition of the String Class in .NET

.class public auto ansi serializable sealed beforefieldinit String
    extends
        System.Object
    implements
        System.IComparable, System.ICloneable, System.IConvertible,
        System.IComparable`1<string>, System.Collections.Generic.IEnumerable`1<char>,
        System.Collections.IEnumerable, System.IEquatable`1<string>

Based on the class definition, you can see that the String class is derived from System.Object. Because the class is sealed, it is not possible to derive another type from it. Because String implements the IEnumerable<char> interface, you can use LINQ (discussed in Chapter 12) over a string. Listing 10-2 gives an example of the String in .NET using C#.

Listing 10-2. An Example of String

using System;
using System.Text;

namespace Ch10
{
    class Program
    {
        static void Main(string[] args)
        {
            string bookName = "Expert C# 5.0: with the .NET 4.5 Framework";
            /* The CLR creates this string at run time by repeating '-'
             * bookName.Length times. */
            string dashedLine = new string('-', bookName.Length);
            StringBuilder sb = new StringBuilder("by Mohammad Rahman");

            Console.WriteLine("{0}\n{1}\n{2}",
                bookName,       /* The C# compiler includes the string literal
                                 * assigned to bookName in the metadata. */
                dashedLine,     /* dashedLine is built from a char, so the
                                 * compiler includes no string literal
                                 * for it in the metadata. */
                sb.ToString()); /* The C# compiler includes the string literal
                                 * passed to the constructor in the metadata;
                                 * the string is constructed at run time
                                 * using StringBuilder. */
        }
    }
}

In Listing 10-2, the bookName is declared as a String type and assigned the string literal Expert C# 5.0: with the .NET 4.5 Framework to it as a value, dashedLine has been constructed using a char, and the StringBuilder is used to construct the string. When this program executes, it will produce the following output:

Expert C# 5.0: with the .NET 4.5 Framework
------------------------------------------
by Mohammad Rahman

Let’s open the executable of the program in Listing 10-2 using ildasm.exe to see the metadata information. When the C# compiler compiles the code in Listing 10-2, it embeds the string literals used in the Program class into the User Strings section of the executable file, as you can see in Figure 10-2.

Figure 10-2. String literals in the User Strings section of the MetaInfo

The C# compiler embeds the string literals Expert C# 5.0: with the .NET 4.5 Framework, by Mohammad Rahman, and {0}\n{1}\n{2} into the metadata of the executable, and the CLR uses them when they are required.
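
The class definition shown earlier lists IEnumerable<char> among the interfaces of String, which is what makes LINQ operators usable on a string. The following is a minimal sketch of that idea rather than anything taken from the framework itself; the class name StringLinqSketch and the helper CountWhere are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Linq;

namespace Ch10
{
    public static class StringLinqSketch
    {
        /* Hypothetical helper: counts the characters of text that pass the test. */
        public static int CountWhere(string text, Func<char, bool> test)
        {
            /* String implements IEnumerable<char>, so the LINQ Count
             * extension method can be applied to it directly. */
            return text.Count(test);
        }

        public static void Main()
        {
            string bookName = "Expert C# 5.0: with the .NET 4.5 Framework";
            Console.WriteLine(CountWhere(bookName, c => char.IsDigit(c))); /* 4 */
            Console.WriteLine(CountWhere(bookName, c => char.IsUpper(c))); /* 6 */
        }
    }
}
```

The string is enumerated character by character, exactly as a List<char> would be.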

Instantiation of a String Object

As you saw earlier, the String class is derived from System.Object, so String is a reference type and its instances live in the Heap while a program that uses them executes. However, compared with other reference types, the CLR handles the instantiation of a String a bit differently. Let’s explore this further using the example in Listing 10-3.

Listing 10-3. Demonstration of the C# String Creation

using System;

namespace Ch10
{
    class Program
    {
        static void Main(string[] args)
        {
            string book = LoadStringLiteral();
        }

        static string LoadStringLiteral()
        {
            return "Expert C# 5.0: with the .NET 4.5 Framework";
        }
    }
}

The IL code of the Listing 10-3 program, decompiled using ildasm.exe or .NET Reflector, is given in Listing 10-4.

Listing 10-4. IL Code of the Program in Listing 10-3

.class private auto ansi beforefieldinit Program
    extends [mscorlib]System.Object
{
    /* Code removed */
    .method private hidebysig static string LoadStringLiteral() cil managed
    {
        .maxstack 1
        .locals init (
            [0] string CS$1$0000)
        L_0000: nop
        /* String literal embedded by the C# compiler */
        L_0001: ldstr "Expert C# 5.0: with the .NET 4.5 Framework"
        L_0006: stloc.0
        L_0007: br.s L_0009
        L_0009: ldloc.0
        L_000a: ret
    }

    .method private hidebysig static void Main(string[] args) cil managed
    {
        .entrypoint
        .maxstack 1
        .locals init (
            [0] string book)
        L_0000: nop
        L_0001: call string Ch10.Program::LoadStringLiteral()
        L_0006: stloc.0
        L_0007: ret
    }
}

How the CLR Handles String Instantiation

Let’s analyze the code in Listing 10-4 to better understand the concept of the String creation in the C#.

First, at runtime, when the CLR executes the Main method, it will Jit the IL code for the Main method, but the LoadStringLiteral method will not be Jitted at that point, as demonstrated below.

Let’s see the MethodDesc Table for the Program class in Listing 10-3 while debugging using the windbg.exe.

MethodDesc Table
   Entry  MethodDesc  JIT     Name
55c8a7e0  55a64934    PreJIT  System.Object.ToString()
55c8e2e0  55a6493c    PreJIT  System.Object.Equals(System.Object)
55c8e1f0  55a6495c    PreJIT  System.Object.GetHashCode()
55d11600  55a64970    PreJIT  System.Object.Finalize()
001dc019  001d3808    NONE    Ch10.Program..ctor()
003b0070  001d37f0    JIT     Ch10.Program.Main(System.String[])
003b00c0  001d37fc    NONE    Ch10.Program.LoadStringLiteral()

As the MethodDesc Table of the Program class shows, the LoadStringLiteral method has not yet been Jitted, so the Heap does not yet contain a string object for the literal Expert C# 5.0: with the .NET 4.5 Framework.

Second, as soon as the CLR starts executing the Jitted LoadStringLiteral method, it instantiates a String from the literal Expert C# 5.0: with the .NET 4.5 Framework, stores it in the Heap, and passes the address of that String back to the Stack of the LoadStringLiteral method as a reference, as shown in Figure 10-3.

Figure 10-3. Instantiation of the String

From Figure 10-3, you can see that in the pre-Jit state of the LoadStringLiteral method the Heap holds no string object for the literal Expert C# 5.0: with the .NET 4.5 Framework. When the CLR Jits LoadStringLiteral and starts executing it, it instantiates a string from the literal in the Heap and passes the reference (address 0x01eeb91c) back to the LoadStringLiteral method, where it is stored in the local variable CS$1$0000, as shown in Listing 10-4.

Note ldstr: The ldstr instruction pushes a reference to a string object that represents the string literal stored in the metadata.
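
One observable consequence of loading literals from the metadata can be sketched in C#. The C# language guarantees that equal string literals within one program refer to the same string instance, whereas strings constructed at run time are separate objects. The example below is a small illustration under that assumption; the class LiteralIdentitySketch and its three methods are hypothetical names, not part of the chapter's listings.

```csharp
using System;

namespace Ch10
{
    public static class LiteralIdentitySketch
    {
        /* Two methods that return the same literal text. */
        public static string FirstLiteral() { return "Expert C#"; }
        public static string SecondLiteral() { return "Expert C#"; }

        /* A string constructed at run time instead of loaded from a literal. */
        public static string BuiltAtRuntime() { return new string('-', 3); }

        public static void Main()
        {
            /* Equal literals in one program share a single string object. */
            Console.WriteLine(object.ReferenceEquals(FirstLiteral(), SecondLiteral())); /* True */

            /* Each constructor call allocates a new object... */
            Console.WriteLine(object.ReferenceEquals(BuiltAtRuntime(), BuiltAtRuntime())); /* False */

            /* ...even though the contents compare as equal. */
            Console.WriteLine(BuiltAtRuntime() == BuiltAtRuntime()); /* True */
        }
    }
}
```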

Analyzing the Stack information while executing Listing 10-3 will provide further understanding of the string creation in the .NET. Figure 10-4 shows the Stack information while executing the Main and LoadStringLiteral methods of the Program class in Listing 10-3.

Examining the Memory While the CLR Loads String into the Heap

The locals section of the Main method contains the variable for the book string (variable location 0x001ef09c) that is used to store the data it gets from the LoadStringLiteral method. When the CLR starts executing the LoadStringLiteral method, it will store the address (0x01eeb91c) of the book string from the Heap to the local variable (0x001ef08c refers to the CS$1$0000 of Listing 10-4), as demonstrated in Figure 10-4.

Figure 10-4. Stack information while executing Listing 10-3

From Figure 10-4, we can see that the address 0x01eeb91c refers to an object in the Heap. Executing the dumpobj command in windbg.exe confirms that this object is the string Expert C# 5.0: with the .NET 4.5 Framework.

0:000> !dumpobj 0x01eeb91c
Name: System.String
MethodTable: 565af9ac
EEClass: 562e8bb0
Size: 84(0x54) bytes
File: C:\Windows\Microsoft.NET\assembly\GAC_32\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String: Expert C# 5.0: with the .NET 4.5 Framework

In .NET, you can instantiate a string object using a char array, as demonstrated in Listing 10-5.

Listing 10-5. Construct String Using Char Array

using System;

namespace Ch10
{
    class Program
    {
        static void Main(string[] args)
        {
            string book = new String(new char[]
            {
                'E', 'x', 'p', 'e', 'r', 't', ' ', 'C', '#',
                ' ', '5', '.', '0', ':', ' ', 'w', 'i', 't',
                'h', ' ', 't', 'h', 'e', ' ', '.', 'N', 'E',
                'T', ' ', '4', '.', '5', ' ', 'F', 'r', 'a',
                'm', 'e', 'w', 'o', 'r', 'k'
            });
            Console.WriteLine(book);
        }
    }
}

This program will produce the following output:

Expert C# 5.0: with the .NET 4.5 Framework

To understand the string construction using a char array, you need to look into the IL code the C# compiler produced for Listing 10-5.

String Instantiation Using Char Array

The decompiled IL code in Listing 10-6 shows how the CLR instantiates the string object in runtime using a char array.

Listing 10-6. IL Code for the Source Code in Listing 10-5

.class private auto ansi beforefieldinit Program
    extends [mscorlib]System.Object
{
    .method public hidebysig specialname rtspecialname instance void
        .ctor() cil managed
    {
        /* Code removed */
    }

    .method private hidebysig static void Main(string[] args) cil managed
    {
        .entrypoint
        .maxstack 3
        .locals init (
            [0] string book)
        L_0000: nop
        L_0001: ldc.i4.s 0x2a

        L_0003: newarr char
        /* Code removed */

        /* The CLR creates a new instance of the string object
         * and pushes its address onto the evaluation stack. */
        L_0013: newobj instance void [mscorlib]System.String::.ctor(char[])

        /* Pop the address from the top of the evaluation stack and store
         * it in the book variable at position 0 of the Locals section
         * of the Main method Stack. */
        L_0018: stloc.0

        /* Load the value of the local variable book (the string reference)
         * from the method stack onto the evaluation stack. */
        L_0019: ldloc.0

        L_001a: call void [mscorlib]System.Console::WriteLine(string)
        L_001f: nop
        L_0020: ret
    }
}

Let’s analyze the code in Listing 10-6 to understand the String instantiation using char array.

How the CLR Handles String Instantiation from Char Array

The instruction at L_0003 in the Main method creates a char array of size 0x2a (42), and the instructions that follow it (removed from the listing) store the characters 'E', 'x', and so on in it. At L_0013 the CLR uses this array to instantiate a string object with the newobj instruction.

Note newobj: The newobj instruction allocates a new instance of the class in the Heap and pushes the initialized object reference onto the stack.

Figure 10-5 shows the string instantiation using char array as input to the string class.

Figure 10-5. String instantiation using char array
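
Besides String(char[]), the String class has constructors that take a slice of a char array or a repeated character. The sketch below exercises three of these public constructors side by side; the class StringConstructorSketch and the helper FromSlice are hypothetical names used only for this illustration.

```csharp
using System;

namespace Ch10
{
    public static class StringConstructorSketch
    {
        /* Hypothetical helper: builds a string from count characters
         * of source, starting at startIndex. */
        public static string FromSlice(char[] source, int startIndex, int count)
        {
            return new string(source, startIndex, count);
        }

        public static void Main()
        {
            char[] chars = "Expert C# 5.0".ToCharArray();

            Console.WriteLine(new string(chars));      /* Expert C# 5.0 */
            Console.WriteLine(FromSlice(chars, 7, 2)); /* C# */
            Console.WriteLine(new string('-', 5));     /* ----- */
        }
    }
}
```

In each case the characters are copied into the new string, so later changes to the chars array do not affect the strings already created.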

HOW MANY CHARACTERS CAN STRING HOLD?

In theory, a string can hold almost 2 billion characters, as the following constructor of the string class suggests:

/* count - refers to the number of times to repeat char c to
 * construct new string object */
public extern String (char c, int count);

Here count is of type int. The maximum value of an int is 0x7fffffff (2147483647), which is approximately 2 billion, so this is the upper bound on the number of characters a string object can hold. In practice the limit is lower, because the string must also fit in the available memory and within the CLR's limit on the size of a single object.

String and Chaining

Method chaining is a mechanism through which you can call a series of methods in one statement, where each method returns the type on which it is defined. For example, if a class C defines methods Ma to Mz and each of them returns type C, then method chaining lets you call the methods as:

Ma().Mb().Mc()......Mz()

Listing 10-7 shows how method chaining has been implemented on the Book class.

Listing 10-7. Example of the Method Chaining for the String Type

using System;

namespace Ch10
{
    class Program
    {
        static void Main(string[] args)
        {
            Book book = new Book();
            Console.WriteLine(
                book.
                    SetBookName("Expert C# 5.0: with the .NET 4.5 Framework").
                    SetPublishedYear(2012).ToString());
        }
    }

    public class Book
    {
        private string bookName = default(string);
        private Int32 publishedYear = default(int);

        public Book SetBookName(string nameOfTheBook)
        {
            bookName = nameOfTheBook;
            return this;
        }

        public Book SetPublishedYear(int yearOfThePublication)
        {
            publishedYear = yearOfThePublication;
            return this;
        }

        public override string ToString()
        {
            return string.Format("{0}:{1}", bookName, publishedYear);
        }
    }
}

The Book class has SetBookName and SetPublishedYear methods, and these methods return Book as the return type, as demonstrated in Figure 10-6.

Figure 10-6. Method chaining

As a result, you can call the SetBookName and SetPublishedYear methods as a series of method calls, for example:

book.SetBookName("Jupiter").SetPublishedYear(9999)

This program will produce the following output:

Expert C# 5.0: with the .NET 4.5 Framework:2012
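
The same pattern appears in the framework itself: the Append methods of StringBuilder return the StringBuilder they were called on, so calls can be chained in the way the Book class does. The following sketch shows this; the class BuilderChainingSketch and the method Describe are hypothetical names chosen for the illustration.

```csharp
using System;
using System.Text;

namespace Ch10
{
    public static class BuilderChainingSketch
    {
        /* Hypothetical helper: formats a name and a year as "name:year". */
        public static string Describe(string bookName, int publishedYear)
        {
            /* Each Append call returns the same StringBuilder instance,
             * so the next call in the chain operates on it. */
            return new StringBuilder()
                .Append(bookName)
                .Append(':')
                .Append(publishedYear)
                .ToString();
        }

        public static void Main()
        {
            Console.WriteLine(Describe("Expert C# 5.0", 2012)); /* Expert C# 5.0:2012 */
        }
    }
}
```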

Strings Are Immutable

An immutable object is an object whose state cannot be altered after it is instantiated. The string is immutable: once you create a string object, it is not possible to modify its contents, and any change produces a new instance instead. This section explores the immutable behavior of the string in .NET in more detail by analyzing the runtime behavior of the string object.

Listing 10-8 shows that when you try to modify the contents of the string object, the C# compiler raises a compilation error.

Listing 10-8. Modify the String Content

using System;

namespace Ch10
{
    class Program
    {
        static void Main(string[] args)
        {
            string bookName = "A book name.";

            /* replace the whole string */
            bookName = "Expert C# 5.0: with the .NET 4.5 Framework";

            /* The compiler will generate an error here. */
            bookName[2] = 'A';
        }
    }
}

In Listing 10-8, the bookName variable is declared as a string with the literal value A book name. It is possible to replace the whole contents of the bookName variable, but it is not possible to modify an individual character or any other part of the bookName string. The C# compiler will produce the following error when you try to compile the code in Listing 10-8:

Error 107 Property or indexer 'string.this[int]' cannot be assigned to -- it is read only
J:\Book\ExpertC#2012\SourceCode\BookExamples\Ch10\Program.cs 11 13 Ch10

Based on the compiler-generated error message, you can see that the string class has an indexer, which takes an int and returns the character at that position. If you explore the string class using ildasm.exe or .NET Reflector, you will see the code for this indexer. Listing 10-9 shows the indexer of the string class as C# code converted from the IL code of the string class.

Listing 10-9. Index of the String Class

public char this[int index]
{
    get; /* There is no set accessor, so the indexer is read-only */
}

The indexer of the string class is read-only: only the get accessor is defined. So in .NET, whenever an operation needs to modify a string, it copies the contents into a new string, applies the change to the new string, and returns that as the result of the operation. Listing 10-10 presents an example of this immutable behavior of the string in C#.

Listing 10-10. String Immutable Example

using System;

namespace Ch10
{
    class Program
    {
        static void Main(string[] args)
        {
            string myString =
                " Expert C# 5.0: with the .NET 4.5 Framework by Mohammad A Rahman ";
            myString = myString.ToUpper().ToLower().Trim();
            Console.WriteLine(myString);
        }
    }
}

Listing 10-10 will produce the following output:

expert c# 5.0: with the .net 4.5 framework by mohammad a rahman

Let’s explore how the CLR handles the string modification task. Listing 10-10 applies ToUpper, ToLower, and Trim to myString. Internally, each operation (such as ToUpper or ToLower) instantiates a new string from the result of the previous operation, and the chain continues until it produces the final result, as demonstrated in Figure 10-7.

Figure 10-7. String immutable

The CLR passes the address of myString (0x01f6b91c) to the ToUpper method, which instantiates a new string object (0x01f6bd08) that holds the uppercase version of myString. This new string is passed to the ToLower method, which does the same as ToUpper except that it produces lowercase, and the resulting string (0x01f6bd60) is passed to the Trim method, and so on. Let’s see the object information of the strings created by Listing 10-10, using windbg.exe.

!dumpobj 0x01f6b91c - Expert C# 5.0: with the .NET 4.5 Framework by Mohammad A Rahman
!dumpobj 0x01f6bd08 - EXPERT C# 5.0: WITH THE .NET 4.5 FRAMEWORK BY MOHAMMAD A RAHMAN
!dumpobj 0x01f6bd60 - expert c# 5.0: with the .net 4.5 framework by mohammad a rahman
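
The copy-then-change behavior described in this section can also be reproduced by hand. The sketch below "changes" one character by copying the contents into a char array, editing the copy, and constructing a new string, which leaves the original untouched. The class ImmutabilitySketch and the helper ReplaceCharAt are hypothetical names for this illustration and are not framework methods.

```csharp
using System;

namespace Ch10
{
    public static class ImmutabilitySketch
    {
        /* Hypothetical helper: returns a copy of text in which the
         * character at index has been replaced. */
        public static string ReplaceCharAt(string text, int index, char replacement)
        {
            char[] buffer = text.ToCharArray(); /* copy the characters out */
            buffer[index] = replacement;        /* edit the copy, not the string */
            return new string(buffer);          /* build a brand new string */
        }

        public static void Main()
        {
            string original = "A book name.";
            string changed = ReplaceCharAt(original, 2, 'B');

            Console.WriteLine(original); /* A book name. */
            Console.WriteLine(changed);  /* A Book name. */
        }
    }
}
```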

String Concatenation

String concatenation refers to joining two strings; for example, concatenating two string objects Sa and Sb produces SaSb. Concatenation is a routine part of everyday programming, and .NET offers many ways to perform it, such as the string concatenation operator +, the Concat methods provided by the .NET Framework, or the StringBuilder class. Which technique to use depends on the situation and on the number of items to concatenate. Table 10-1 shows the different concatenation techniques used in C#.

[Table 10-1. The different concatenation techniques used in C#]

Let’s see how each of these operations is performed in the .NET.

+ Operator

The concatenation operator + can be used to concatenate multiple strings into one. Listing 10-11 provides an example that will concatenate a few strings into one using the concatenation operator +. The ConcatUsingOperator method will add three string literals into one and return the results as shown in Listing 10-11.

Listing 10-11. String Concatenation Using Concatenation Operator +

using System;

namespace Ch10
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine(ConcatUsingOperator());
            Console.WriteLine(ConcatUsingOperator("One,", "Two,", "Three"));
        }

        static string ConcatUsingOperator()
        {
            return "One," + "Two," + "Three.";
        }

        static string ConcatUsingOperator(string one, string two, string three)
        {
            return one + two + three;
        }
    }
}

This program will produce the following output:

One,Two,Three.
One,Two,Three

When the C# compiler finds any string literals in the source code while compiling, it will embed those string literals into the User Strings section of the meta info of that executable. If you build the program in Listing 10-11 and open the produced executable using the ildasm.exe to see the metadata info, you will find the string literals shown in Listing 10-12 embedded in the metadata.

Listing 10-12. User String Section of the Metadata

User Strings
-------------------------------------------------------
70000001 : ( 4) L"One,"
7000000b : ( 4) L"Two,"
70000015 : ( 5) L"Three"
70000021 : (14) L"One,Two,Three."

The CLR will use these string literals to execute the ConcatUsingOperator() and ConcatUsingOperator(string one, string two, string three) methods. The decompiled IL code for Listing 10-11, as shown in Listing 10-13, demonstrates how the C# compiler handles the concatenation operator internally.

Listing 10-13. IL code for the ConcatUsingOperator Method Shown in Listing 10-11

.class private auto ansi beforefieldinit Program
    extends [mscorlib]System.Object
{
    /* Code removed */
    .method private hidebysig static string
        ConcatUsingOperator() cil managed
    {
        .maxstack 1
        .locals init (
            [0] string CS$1$0000)
        L_0000: nop

        /* The C# compiler concatenates the string literals at compile time. */
        L_0001: ldstr "One,Two,Three."
        L_0006: stloc.0
        L_0007: br.s L_0009
        L_0009: ldloc.0
        L_000a: ret
    }

    .method private hidebysig static string
        ConcatUsingOperator(
            string one, string two, string three) cil managed
    {
        .maxstack 3
        .locals init (
            [0] string CS$1$0000)
        L_0000: nop
        L_0001: ldarg.0
        L_0002: ldarg.1
        L_0003: ldarg.2

        /* The concatenation operator is replaced by a call to the Concat method. */
        L_0004: call string [mscorlib]System.String::Concat(string, string, string)
        L_0009: stloc.0
        L_000a: br.s L_000c
        L_000c: ldloc.0
        L_000d: ret
    }

    .method private hidebysig static void Main(string[] args) cil managed
    {
        /* Code removed */
        L_0001: call string Ch10.Program::ConcatUsingOperator()
        L_0006: call void [mscorlib]System.Console::WriteLine(string)
        L_000b: nop

        L_000c: ldstr "One,"
        L_0011: ldstr "Two,"
        L_0016: ldstr "Three"
        L_001b: call string Ch10.Program::ConcatUsingOperator(string, string, string)
        L_0020: call void [mscorlib]System.Console::WriteLine(string)
        L_0025: nop
        L_0026: ret
    }
}

The Main method calls the parameterless ConcatUsingOperator method when it executes the instruction at L_0001. Inside that method, the C# compiler declares a local variable CS$1$0000 in the stack, which stores the string One,Two,Three. as shown in the L_0006 instruction.

The instructions at L_000c, L_0011, and L_0016 of the Main method use ldstr to load the string literals from the metadata, and L_001b passes them as arguments to the ConcatUsingOperator method. When the CLR executes the L_001b instruction, it calls ConcatUsingOperator(string, string, string) with the argument values "One,", "Two,", and "Three". This overload calls the Concat method internally to concatenate the strings (you will find the details of the Concat method later in this chapter), stores the result in the local variable CS$1$0000 at position 0 of the method stack, as shown in L_0009, loads it again at L_000c, and returns it as output at L_000d.
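
The difference between the two overloads can also be observed from C# without reading the IL. A const declaration compiles only when its value can be computed at compile time, so the sketch below uses one to show that a + expression over literals is a compile-time constant, while a + expression over parameters is evaluated at run time. The class ConstantFoldingSketch and its members are hypothetical names used for illustration.

```csharp
using System;

namespace Ch10
{
    public static class ConstantFoldingSketch
    {
        /* Accepted as a constant only because the compiler can evaluate
         * the whole + expression over literals at compile time. */
        public const string Folded = "One," + "Two," + "Three.";

        public static string JoinAtRuntime(string one, string two, string three)
        {
            /* The operands are not constants, so this concatenation
             * is performed at run time. */
            return one + two + three;
        }

        public static void Main()
        {
            Console.WriteLine(Folded);                                 /* One,Two,Three. */
            Console.WriteLine(JoinAtRuntime("One,", "Two,", "Three")); /* One,Two,Three */
        }
    }
}
```

Trying to declare a const from one + two + three inside JoinAtRuntime would be rejected by the compiler, because the parameters are not constant expressions.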

Concat IEnumerable<T>

To concatenate the items of an IEnumerable<T>, you can use the following methods:

public static string Concat<T>(IEnumerable<T> values)
public static string Concat(IEnumerable<string> values)

The program in Listing 10-14 shows a list of string object listOfNumbers used to concatenate the items from that list.

Listing 10-14. Concat List of String Object

using System;
using System.Collections;
using System.Collections.Generic;

namespace Ch10
{
    class Program
    {
        static void Main(string[] args)
        {
            IList<string> listOfNumbers = new List<string>()
            {
                "One,", "Two,", "Three."
            };
            Console.WriteLine("{0}", ConcateUsingConcate(listOfNumbers));
        }

        static string ConcateUsingConcate(IEnumerable<string> enumerable)
        {
            return string.Concat<string>(enumerable);
        }
    }
}

Listing 10-14 will produce the following output:

One,Two,Three.

The code in Listing 10-14 uses the Concat<T>(IEnumerable<T> values) method to concatenate a list of strings. When the CLR executes this method, it checks whether the given list is null. If it is null, the method throws an ArgumentNullException; otherwise it processes the list to concatenate the items into one string object. The CLR takes the following steps to process the concatenation operation:

· Create an instance of the StringBuilder type in which it will store all the strings to concatenate.

· Retrieve the Enumerator object from the IEnumerable<T>.

· Loop through the Enumerator object to get each of the items from the list and append them to the StringBuilder object it instantiated earlier.

· Call the ToString method of the StringBuilder object to get the final concatenated string and return it as output.

The implementation of the Concat(IEnumerable<string> values) method is shown in Listing 10-15.

Listing 10-15. The Implementation of Concat(IEnumerable<string> values)

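public static string Concat(IEnumerable<string> values)
{
    /* Null check on the input sequence, as described above */
    if (values == null)
    {
        throw new ArgumentNullException("values");
    }
    StringBuilder builder = new StringBuilder();
    using (IEnumerator<string> enumerator = values.GetEnumerator())
    {
        while (enumerator.MoveNext())
        {
            /* Null items are skipped */
            if (enumerator.Current != null)
            {
                builder.Append(enumerator.Current);
            }
        }
    }
    return builder.ToString();
}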
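
A short usage sketch may help to show two consequences of this implementation: null items in the sequence contribute nothing to the result, and the generic overload works for any item type by using each item's string representation. The class ConcatEnumerableSketch and the helper JoinAll are hypothetical names; only string.Concat and Enumerable.Range are framework members.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch10
{
    public static class ConcatEnumerableSketch
    {
        /* Hypothetical helper: forwards any sequence to string.Concat<T>. */
        public static string JoinAll<T>(IEnumerable<T> values)
        {
            return string.Concat(values);
        }

        public static void Main()
        {
            /* The null item is skipped rather than causing an exception. */
            var words = new List<string> { "One,", null, "Two,", "Three." };
            Console.WriteLine(JoinAll(words)); /* One,Two,Three. */

            /* Non-string items are converted with ToString. */
            Console.WriteLine(JoinAll(Enumerable.Range(1, 3))); /* 123 */
        }
    }
}
```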

Concat Array of Objects

To concatenate the string representations of the elements in a given Object array, you can use one of the following methods:

public static string Concat(params object[] args)
public static string Concat(params string[] values)

Listing 10-16 shows the usage of the first method, and it will be used to explain the Concat method.

Listing 10-16. Usage of the Concat(params object[] args)

using System;

namespace Ch10
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("{0}", ConcatUsingConcat(new[]
            {
                "One,", "Two,", "Three."
            }));
        }

        static string ConcatUsingConcat(params object[] args)
        {
            return string.Concat(args);
        }
    }
}

This program will produce the following output:

One,Two,Three.

To execute the Concat operation of the object array, the CLR will perform the following operations:

1. Step 1: Check whether the args(object array) is null or not. If null, it will throw an ArgumentNullException, otherwise it continues to process the operation.

2. Step 2: Call the ConcatArray method to process the further concatenation operation. The code for the Concat method shows how the CLR internally calls the ConcatArray method, as shown in Listing 10-17.

Listing 10-17. The Implementation of the Concat Method

public static string Concat(params object[] args)
{
    /* Initial null check, as described in Step 1. */
    if (args == null)
    {
        throw new ArgumentNullException("args");
    }
    /* Code removed - it builds the string array values from args and
     * computes totalLength, which are then handed to ConcatArray
     * as described in Step 2. */
    return ConcatArray(values, totalLength);
}

3. Step 3: The ConcatArray method takes an array of string objects and the total length of all the strings in it. It allocates a string of that total length using the FastAllocateString method and fills it with the values from the input array using the FillStringChecked method. The implementation of the ConcatArray method of the System.String class from the mscorlib.dll assembly is shown in Listing 10-18.

Listing 10-18. The Implementation of the ConcatArray

private static string ConcatArray(string[] values, int totalLength)
{
    string dest = FastAllocateString(totalLength);
    int destPos = 0;
    for (int i = 0; i < values.Length; i++)
    {
        FillStringChecked(dest, destPos, values[i]);
        destPos += values[i].Length;
    }
    return dest;
}

4. Step 4: Finally, the concatenated string is returned as output.
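
FastAllocateString and FillStringChecked are internal to mscorlib and cannot be called from user code, but the allocate-once-then-fill idea behind the steps above can be sketched with public APIs. In the sketch below a char array stands in for the preallocated string, and String.CopyTo stands in for FillStringChecked; this is an approximation of the technique, not the framework's implementation. The class ConcatArraySketch and the method Join are hypothetical names, and the sketch assumes that the input array contains no null elements.

```csharp
using System;

namespace Ch10
{
    public static class ConcatArraySketch
    {
        /* Hypothetical stand-in for ConcatArray; assumes no null elements. */
        public static string Join(string[] values)
        {
            /* First pass: compute the total length so that the
             * destination buffer is allocated exactly once. */
            int totalLength = 0;
            foreach (string value in values)
            {
                totalLength += value.Length;
            }

            /* The char array plays the role of the preallocated string. */
            char[] dest = new char[totalLength];
            int destPos = 0;

            /* Second pass: copy each value to its position in the buffer. */
            foreach (string value in values)
            {
                value.CopyTo(0, dest, destPos, value.Length);
                destPos += value.Length;
            }
            return new string(dest);
        }

        public static void Main()
        {
            Console.WriteLine(Join(new[] { "One,", "Two,", "Three." })); /* One,Two,Three. */
        }
    }
}
```

Computing the length first avoids the repeated reallocation that would occur if the strings were joined pairwise.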

Concat Objects

To concatenate the string representations of one to three objects, you can use one of the following Concat methods in C#:

public static string Concat(object arg0)
public static string Concat(object arg0, object arg1)
public static string Concat(object arg0, object arg1, object arg2)

Listing 10-19 shows the usage of the second method, and this will be used to explain the Concat method.

Listing 10-19. Using Concat(object arg0, object arg1)

using System;

namespace Ch10
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("{0}",
                ConcatUsingConcat("Expert C# 5.0: with the .NET 4.5 Framework ",
                    " by Mohammad Rahman"));
        }

        static string ConcatUsingConcat(object args0, object args1)
        {
            return string.Concat(args0, args1);
        }
    }
}

This program will produce the following output:

Expert C# 5.0: with the .NET 4.5 Framework  by Mohammad Rahman

The Concat operation works as follows:

1. Step 1: The CLR will load both of the object arguments and call their ToString method.

2. Step 2: It will then call the Concat( string, string ) method to do the Concat operation.

The implementation of the Concat method is shown in Listing 10-20.

Listing 10-20. The Implementation of the Concat Method

.method public hidebysig static string Concat(
    object arg0, object arg1) cil managed
{
    // Code size 38 (0x26)
    .maxstack 8
    IL_0000: ldarg.0
    IL_0001: brtrue.s IL_000a

    /* IL_0003 to IL_0008 execute only when the argument at position 0
     * is null: the argument is replaced with String.Empty */
    IL_0003: ldsfld string System.String::Empty
    IL_0008: starg.s arg0 /* Store value at argument position 0 */

    IL_000a: ldarg.1
    IL_000b: brtrue.s IL_0014

    /* IL_000d to IL_0012 execute only when the argument at position 1
     * is null: the argument is replaced with String.Empty */
    IL_000d: ldsfld string System.String::Empty
    IL_0012: starg.s arg1 /* Store value at argument position 1 */

    IL_0014: ldarg.0
    IL_0015: callvirt instance string System.Object::ToString()
    IL_001a: ldarg.1
    IL_001b: callvirt instance string System.Object::ToString()

    /* Concat method will be called to do the concat operation */
    IL_0020: call string System.String::Concat(string,string)
    IL_0025: ret
}

3. Step 3: This method will return the concatenated string object as a result.
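
The null handling and the ToString calls in the steps above can be observed from C#. The following sketch passes a boxed integer and null references to Concat(object, object); the class ConcatObjectsSketch and the helper Describe are hypothetical names introduced for the illustration.

```csharp
using System;

namespace Ch10
{
    public static class ConcatObjectsSketch
    {
        /* Hypothetical helper: forwards to string.Concat(object, object). */
        public static string Describe(object left, object right)
        {
            return string.Concat(left, right);
        }

        public static void Main()
        {
            /* The int is boxed and its ToString result is used. */
            Console.WriteLine(Describe("Published: ", 2012)); /* Published: 2012 */

            /* A null argument is treated as an empty string. */
            Console.WriteLine(Describe(null, "Framework"));   /* Framework */

            /* Two nulls give an empty string, not a null reference. */
            Console.WriteLine(Describe(null, null).Length);   /* 0 */
        }
    }
}
```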

Concat Strings

To concatenate a specified number of string instances, you can use one of the following methods:

public static string Concat(string str0, string str1)
public static string Concat(string str0, string str1, string str2)
public static string Concat(string str0, string str1, string str2, string str3)

Listing 10-21 shows the usage of the first method, and this will be used to explain the Concat method.

Listing 10-21. An Example of Concat(string str0, string str1)

using System;

namespace Ch10
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("{0}",
                ConcatUsingConcat("Expert C# 5.0: with the .NET 4.5 Framework ",
                    " by Mohammad Rahman"));
        }

        static string ConcatUsingConcat(string str0, string str1)
        {
            return string.Concat(str0, str1);
        }
    }
}

This will produce the following output:

Expert C# 5.0: with the .NET 4.5 Framework  by Mohammad Rahman

The Concat operation works as follows:

1. Step 1: Internally, the Concat method checks whether the arguments are null or empty. If str0 is null or empty, it checks str1: if str1 is also null or empty, it returns string.Empty; otherwise it returns str1. If str0 is not null or empty but str1 is, it returns str0.

if (IsNullOrEmpty(str0))
{
    if (IsNullOrEmpty(str1))
    {
        return Empty;
    }
    return str1;
}
if (IsNullOrEmpty(str1))
{
    return str0;
}

2. Step 2: The CLR will determine the length of the str0 and str1. It will then call the FastAllocateString method with the sum of the str0 and str1’s length as the total length of the new string instance. It will subsequently call the FillStringChecked method to do the concatenation operation. The implementation of these operations might be as demonstrated below:

int length = str0.Length;
string dest = FastAllocateString(length + str1.Length);
FillStringChecked(dest, 0, str0);
FillStringChecked(dest, length, str1);
return dest;

3. Step 3: Finally, the concatenated string is returned as the result.
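The three steps above can be sketched in ordinary managed code. This is only an approximation under stated assumptions: FastAllocateString and FillStringChecked are internal to the string class, so the hypothetical ConcatSketch helper below stands in for them with a char array and the public CopyTo method.

```csharp
using System;

// Hypothetical helper (not part of the .NET Framework) that mirrors the steps above.
static class ConcatSketch
{
    public static string Concat(string str0, string str1)
    {
        // Step 1: handle null or empty arguments without allocating anything.
        if (string.IsNullOrEmpty(str0))
        {
            return string.IsNullOrEmpty(str1) ? string.Empty : str1;
        }
        if (string.IsNullOrEmpty(str1))
        {
            return str0;
        }

        // Step 2: allocate room for both strings, then copy each one into place.
        // A char array stands in for the internal FastAllocateString call.
        char[] dest = new char[str0.Length + str1.Length];
        str0.CopyTo(0, dest, 0, str0.Length);
        str1.CopyTo(0, dest, str0.Length, str1.Length);

        // Step 3: return the concatenated result.
        return new string(dest);
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(ConcatSketch.Concat("Expert C# 5.0 ", "by Mohammad Rahman"));
        Console.WriteLine(ConcatSketch.Concat(null, null).Length); // 0
    }
}
```

Note that when one argument is null or empty, the sketch returns the other argument itself rather than a copy, which matches the early returns shown in step 1.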

StringBuilder

The StringBuilder class can be used to represent an editable, or mutable, string in .NET. You have already seen how you can use the StringBuilder class to construct a string in .NET. In this section, you will learn more details about the StringBuilder class and also explore how the CLR handles the StringBuilder class to do the Append and Insert operations.

The StringBuilder class is defined in the System.Text namespace of the mscorlib.dll assembly, as shown in Figure 10-8.

images

Figure 10-8. StringBuilder class in System.Text namespace

The StringBuilder is a sealed class defined with the following definition:

public sealed class StringBuilder : ISerializable

The StringBuilder class can do append, insert, remove, replace, and clear operations on the string it holds. In .NET, the StringBuilder class has the methods shown in Listing 10-22.

Listing 10-22. The StringBuilder Class Definition

public sealed class StringBuilder : ISerializable
{
    /* 6 overloaded constructors */
    public StringBuilder() {}

    /* 19 overloaded Append methods */
    public StringBuilder Append(bool value) {}

    /* 5 overloaded AppendFormat methods */
    public StringBuilder AppendFormat(string format, object arg0) {}

    /* 2 overloaded AppendLine methods */
    public StringBuilder AppendLine() {}

    public StringBuilder Clear() {}
    public void CopyTo(int sourceIndex, char[] destination, int destinationIndex, int count) {}
    public int EnsureCapacity(int capacity) {}
    public bool Equals(StringBuilder sb) {}

    /* 18 overloaded Insert methods */
    public StringBuilder Insert(int index, char[] value) {}

    public StringBuilder Remove(int startIndex, int length) {}

    /* 4 overloaded Replace methods */
    public StringBuilder Replace(char oldChar, char newChar) {}

    /* 2 overloaded ToString methods */
    public override unsafe string ToString() {}

    /* Properties */
    public int Capacity {}
    public char this[int index] {}
    public int Length {}
    public int MaxCapacity {}
}

Listing 10-22 shows the definition of the StringBuilder class in .NET. In the following section, you will explore the internal workings of the StringBuilder class to learn how the CLR instantiates an instance of the StringBuilder class and how it performs the Append and Insert operations.
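Because most of the methods in the class definition return the StringBuilder instance itself, calls can be chained. The following is a minimal sketch of that usage; the BuildTitle helper and the sample text are made up for illustration.

```csharp
using System;
using System.Text;

class Program
{
    // Hypothetical helper: chains several StringBuilder calls on one instance.
    public static string BuildTitle()
    {
        StringBuilder sb = new StringBuilder();
        sb.Append("Expert ")                      // "Expert "
          .AppendFormat("{0} {1}", "C#", "5.0")   // "Expert C# 5.0"
          .Replace("C#", "CSharp")                // "Expert CSharp 5.0"
          .Remove(0, 7);                          // "CSharp 5.0"
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(BuildTitle()); // CSharp 5.0

        // Clear resets the length to zero and also returns the same instance.
        StringBuilder sb = new StringBuilder("temporary");
        Console.WriteLine(sb.Clear().Length); // 0
    }
}
```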

Internals of StringBuilder

The StringBuilder class internally maintains a few private fields to do its job. One of the most important is m_ChunkChars, which is a char array. Unless a capacity is specified, the CLR sets the initial size of this array to 0x10 (16), which is defined internally as an int constant called DefaultCapacity. Listing 10-23 provides an example of the use of the Append and Insert methods of the StringBuilder class.

Listing 10-23. An Example of StringBuilder

using System;
using System.Text;

namespace Ch10
{
    class Program
    {
        static void Main(string[] args)
        {
            StringBuilder sb = new StringBuilder();
            sb.Append("Expert C# 5.0: with the .NET 4.5 Framework ");
            sb.Insert(sb.Length, "by Mohammad A Rahman");
            Console.WriteLine(sb.ToString());
        }
    }
}

Listing 10-23 will produce the following output:

Expert C# 5.0: with the .NET 4.5 Framework by Mohammad A Rahman

Listing 10-23 shows how to instantiate an instance of the StringBuilder class and how to use the Append and Insert methods of the StringBuilder class to append and insert string. Figure 10-9 demonstrates the instantiation of the StringBuilder class and the Append, Insert, and ToString operations of the StringBuilder class.

images

Figure 10-9. StringBuilder overall working details

In the following sections, you will explore in detail the StringBuilder instantiation and the Append and Insert operations in the StringBuilder class that is shown in Figure 10-9.

Instantiation of the StringBuilder

When the CLR executes the constructor of the StringBuilder, it initializes the m_ChunkChars array and calls the ThreadSafeCopy method, which copies the input string (if one is provided) into the m_ChunkChars array, as shown in Listing 10-24.

Listing 10-24. The Implementation of the StringBuilder Constructor

public unsafe StringBuilder(string value, int startIndex, int length, int capacity)
{
    /* Argument validation removed */
    this.m_ChunkChars = new char[capacity];
    this.m_ChunkLength = length;
    /* Pin the source string so that a raw pointer to its characters can be taken */
    fixed (char* str = value)
    {
        char* chPtr = str;
        ThreadSafeCopy(
            chPtr + startIndex,  /* Source pointer */
            this.m_ChunkChars,   /* Destination char array */
            0,                   /* Destination index */
            length);             /* Total length to copy. */
    }
}

Listing 10-24 demonstrates the internal workings of the StringBuilder instantiation that uses the ThreadSafeCopy method. The implementation of the ThreadSafeCopy method will be discussed in the next section. You can append data to the instance of the StringBuilder class using the Append method. In the following section, you will also explore the internal workings of the Append method of StringBuilder.
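The effect of the constructor parameters can be observed through the public API. The short sketch below assumes the default capacity of 16 described earlier; the sample text and the capacity of 32 are arbitrary choices for illustration.

```csharp
using System;
using System.Text;

class Program
{
    static void Main()
    {
        // No arguments: the builder starts empty with the default capacity.
        StringBuilder empty = new StringBuilder();
        Console.WriteLine(empty.Length);   // 0
        Console.WriteLine(empty.Capacity); // 16, the DefaultCapacity described above

        // value, startIndex, length, capacity: copies 2 characters starting at
        // index 7 of the source string into a buffer with room for 32 characters.
        StringBuilder sb = new StringBuilder("Expert C# 5.0", 7, 2, 32);
        Console.WriteLine(sb.ToString()); // C#
        Console.WriteLine(sb.Length);     // 2
        Console.WriteLine(sb.Capacity);   // 32
    }
}
```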

Append Operation in the StringBuilder

While executing the Append method, the CLR will use the ThreadSafeCopy method to copy the contents of the input to the m_ChunkChars array. The implementation of the ThreadSafeCopy method is shown in Listing 10-25. Interestingly, the ThreadSafeCopy method uses the wstrcpy method from the string class to handle copy operations of the input string to the m_ChunkChars array.

Listing 10-25. The Implementation of the ThreadSafeCopy

private static unsafe void ThreadSafeCopy(
    char* sourcePtr,
    char[] destination,
    int destinationIndex,
    int count)
{
    fixed (char* chRef = &(destination[destinationIndex]))
    {
        string.wstrcpy(chRef, sourcePtr, count);
    }
}

Listing 10-25 demonstrates the ThreadSafeCopy method, which does the copying work for the Append method of the StringBuilder class. You can insert data into the instance of the StringBuilder class using the Insert method of the StringBuilder class, which is examined in the next section.
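The idea behind Append, copying the input characters to the end of an internal char array, can be illustrated with a small managed sketch. The CharBufferSketch class below is hypothetical and deliberately simplified: it keeps a single array and doubles it when it runs out of room, which is an assumption made for illustration and not a description of how the real StringBuilder manages its m_ChunkChars storage.

```csharp
using System;

// Hypothetical, simplified stand-in for the builder's character buffer.
sealed class CharBufferSketch
{
    private char[] chars = new char[16]; // 16 mirrors the default capacity
    private int length;

    public int Length { get { return length; } }

    public CharBufferSketch Append(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return this;
        }

        int required = length + value.Length;
        if (required > chars.Length)
        {
            // Growth policy is an assumption: at least double, or as much as needed.
            char[] bigger = new char[Math.Max(required, chars.Length * 2)];
            Array.Copy(chars, bigger, length);
            chars = bigger;
        }

        // Plays the role of ThreadSafeCopy: copy the input after the current content.
        value.CopyTo(0, chars, length, value.Length);
        length = required;
        return this;
    }

    public override string ToString()
    {
        return new string(chars, 0, length);
    }
}

class Program
{
    static void Main()
    {
        CharBufferSketch buffer = new CharBufferSketch();
        buffer.Append("Expert C# 5.0: ").Append("with the .NET 4.5 Framework");
        Console.WriteLine(buffer.ToString());
        Console.WriteLine(buffer.Length); // 42
    }
}
```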

Insert Operation in the StringBuilder

At insertion time, the CLR checks whether the m_ChunkChars array has enough empty cells and, if necessary, resizes it. It then places the new value at the specified index using the ReplaceInPlaceAtChunk method. The implementation of the Insert method is shown in Listing 10-26.

Listing 10-26. The Code for the Insert Method

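private unsafe void Insert(int index, char* value, int valueCount)
{
    /* Argument validation removed */
    StringBuilder builder;  /* Chunk that contains the insertion point */
    int num;                /* Index of the insertion point within that chunk */
    this.MakeRoom(index, valueCount, out builder, out num, false);
    this.ReplaceInPlaceAtChunk(ref builder, ref num, value, valueCount);
}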

Internally, the ReplaceInPlaceAtChunk method uses the ThreadSafeCopy method to insert the value into the m_ChunkChars array. The implementation of the ReplaceInPlaceAtChunk method is shown in Listing 10-27.

Listing 10-27. Implementation of the ReplaceInPlaceAtChunk Method

private unsafe void ReplaceInPlaceAtChunk(
    ref StringBuilder chunk,
    ref int indexInChunk,
    char* value, int count)
{
    /* Code removed (num2, the number of characters to copy, is computed here) */
    ThreadSafeCopy(value, chunk.m_ChunkChars, indexInChunk, num2);
    /* Code removed */
}
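The two phases of an insertion, making room and then copying the new characters into the gap, can be sketched in managed code. The InsertSketch helper below is hypothetical and works on a single char array with an assumed doubling growth policy; it is meant to illustrate the idea, not to reproduce MakeRoom or ReplaceInPlaceAtChunk. Its result is compared with the public StringBuilder.Insert method.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of the .NET Framework).
static class InsertSketch
{
    // Inserts value at index into the first 'length' characters of buffer.
    // Returns the buffer to keep using, which may be a new, larger array.
    public static char[] Insert(char[] buffer, ref int length, int index, string value)
    {
        if (index < 0 || index > length)
        {
            throw new ArgumentOutOfRangeException("index");
        }
        if (string.IsNullOrEmpty(value))
        {
            return buffer;
        }

        int newLength = length + value.Length;
        if (newLength > buffer.Length)
        {
            // Resize when there are not enough empty cells (growth policy assumed).
            char[] bigger = new char[Math.Max(newLength, buffer.Length * 2)];
            Array.Copy(buffer, bigger, length);
            buffer = bigger;
        }

        // Make room: shift the tail to the right. Array.Copy handles the overlap.
        Array.Copy(buffer, index, buffer, index + value.Length, length - index);

        // Copy the new characters into the gap.
        value.CopyTo(0, buffer, index, value.Length);
        length = newLength;
        return buffer;
    }
}

class Program
{
    static void Main()
    {
        char[] buffer = new char[16];
        int length = 0;
        buffer = InsertSketch.Insert(buffer, ref length, 0, "Expert 5.0");
        buffer = InsertSketch.Insert(buffer, ref length, 7, "C# ");
        string viaSketch = new string(buffer, 0, length);

        StringBuilder sb = new StringBuilder("Expert 5.0");
        sb.Insert(7, "C# ");

        Console.WriteLine(viaSketch);                  // Expert C# 5.0
        Console.WriteLine(viaSketch == sb.ToString()); // True
    }
}
```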

You’ve seen how to instantiate an instance of the StringBuilder class and how to append and insert data into the StringBuilder. Next you’ll see how to get the string out of StringBuilder class using the ToString method.

Getting String from the StringBuilder

While the CLR executes the ToString method of the StringBuilder class, it will do the following:

1. Allocate a new string object with the current length of the m_ChunkChars array of the StringBuilder class.

2. Pass this newly allocated string and the current m_ChunkChars array of the StringBuilder class as pointers to the wstrcpy method of the string class. The wstrcpy method copies the characters from the m_ChunkChars array into the string object. This string will be returned as a result of the ToString method of the StringBuilder class.
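One observable consequence of these steps is that the string returned by ToString is a copy of the builder's contents at that moment, so later changes to the builder do not affect it. A minimal sketch with made-up sample text:

```csharp
using System;
using System.Text;

class Program
{
    static void Main()
    {
        StringBuilder sb = new StringBuilder("Expert C#");

        // The characters are copied into a new string object here.
        string first = sb.ToString();

        // Changing the builder afterward does not touch the string taken earlier.
        sb.Append(" 5.0");
        string second = sb.ToString();

        Console.WriteLine(first);  // Expert C#
        Console.WriteLine(second); // Expert C# 5.0
    }
}
```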

Listing 10-28 presents an example that will show the usage of Append and ToString methods of the StringBuilder class.

Listing 10-28. Concat Strings Using StringBuilder

using System;
using System.Text;

namespace Ch10
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("{0}", ConcatUsingStringBuilder(
                "Expert C# 5.0: with the .NET 4.5 Framework ",
                "by Mohammad Rahman"));
            Console.WriteLine("{0}", ConcatUsingStringBuilder());
        }

        static string ConcatUsingStringBuilder(string str0, string str1)
        {
            StringBuilder builder = new StringBuilder();
            builder.Append(str0).Append("\t");
            builder.Append(str1).Append("\t");
            return builder.ToString();
        }

        static string ConcatUsingStringBuilder()
        {
            StringBuilder builder = new StringBuilder();

            bool boolValue = true;
            byte byteValue = 1;
            char charValue = 'A';
            decimal decimalValue = 10;
            double doubleValue = 100;
            short shortValue = 1000;
            char[] charArrayValue = new char[] { 'A', 'B', 'C' };
            int intValue = 10000;
            long longValue = 100000;
            object objectValue = new object();
            sbyte sByteValue = 2;
            float floatValue = 200;
            string stringValue = "Expert C# 5.0: with the .NET 4.5 Framework";
            ushort ushortValue = 10;
            uint uintValue = 4;
            ulong ulongValue = 400;

            builder
                .Append(boolValue).Append("\t")
                .Append(byteValue).Append("\t")
                .Append(charValue).Append("\t")
                .Append(decimalValue).Append("\t")
                .Append(doubleValue).Append("\t")
                .Append(shortValue).Append("\t")
                .Append(charArrayValue).Append("\t")
                .Append(intValue).Append("\t")
                .Append(longValue).Append("\t")
                .Append(objectValue).Append("\t")
                .Append(sByteValue).Append("\t")
                .Append(floatValue).Append("\t")
                .Append(stringValue).Append("\t")
                .Append(ushortValue).Append("\t")
                .Append(uintValue).Append("\t")
                .Append(ulongValue).Append("\t")
                .Append(charValue, 10).Append("\t")
                .Append(stringValue, 1, 2).Append("\t")
                .Append(charArrayValue, 1, 2);

            return builder.ToString();
        }
    }
}

This program will produce the following output (the values are separated by tab characters, and the console wraps the long second line):

Expert C# 5.0: with the .NET 4.5 Framework by Mohammad Rahman
True 1 A 10 100 1000 ABC 10000 100000 System.O
bject 2 200 Expert C# 5.0: with the .NET 4.5 Framework 10
4 400 AAAAAAAAAA xp BC

Listing 10-28 shows the Append operation using the StringBuilder class, which appends values of different types (value types as well as reference types such as string, object, and char arrays) to an instance of the StringBuilder class.

Summary

In this chapter we have examined the internal workings of string in .NET, such as how the CLR instantiates an instance of the string object and how the string object relates to heap storage. You have also learned about the different string concatenation techniques that can be used and the internal implementation of the different concatenation methods of the string class, which will help you understand string concatenation. This chapter also explored the StringBuilder class and examined its internal behavior. Finally, we explored details about the Append, Insert, and ToString methods in the StringBuilder class. The next chapter will examine collections in .NET.