All about String and its immutability in .NET C#
Some developers get confused about String in .NET. The confusion comes from the fact that String is an Object, which is a Reference type, yet in many situations it acts like a Value type.
In addition, most developers know that String is an immutable type, but not everyone understands how and why.
Therefore, in this article we are going to discuss how String works in .NET C#, covering the basic knowledge we need to have about this topic.
Memory Allocation
To understand how String works in .NET C#, we need to understand how memory is allocated while creating and manipulating a String.
Therefore, let’s work through some simple examples step by step to see what actually happens in the background in terms of memory allocation.
Changing Value After Initialization
At line 1, we defined a variable called s0 and set its value to “Ahmed”.
At line 2, we changed the value of s0 to “Tarek”.
Since String is immutable, the existing “Ahmed” instance is not modified. What actually happens is that s0 is pointed at a different memory location in the Heap memory that holds “Tarek”. In other words, the reference stored in s0 is updated to the new memory location.
Therefore, at line 3, the result would be as follows:
Tarek
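The steps above can be sketched as the following three statements. This is a minimal reconstruction from the description, assuming C# top-level statements; the statements appear in the same order as the lines referenced in the text.

```csharp
var s0 = "Ahmed";              // line 1: s0 refers to the "Ahmed" instance (reconstructed listing, an assumption)
s0 = "Tarek";                  // line 2: s0 now refers to a different instance; "Ahmed" itself is untouched
System.Console.WriteLine(s0);  // line 3: prints Tarek
```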
Initializing Multiple Variables With Same Value
At line 1, we defined a variable called s1 and set its value to “Ahmed”.
At line 2, we defined a variable called s2 and set its value to “Ahmed”.
Since String is immutable, the runtime can safely share one instance between several variables. For string literals, it keeps a pool in the Heap that acts as if it is a dictionary where the value of the String is the key.
So, whenever a string literal is loaded, if the same String value already exists in the pool, no new allocation happens and the same existing memory allocation is used. Note that this applies to literals; Strings built at run time are not pooled automatically.
Therefore, in our case, s2 holds the same reference as s1.
Therefore, at lines 3 and 4, the result would be as follows:
True
True
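A minimal sketch of a listing that matches this description, assuming C# top-level statements. The two comparisons on lines 3 and 4 are assumptions: a value comparison followed by a reference comparison is one plausible pair that produces the two True results.

```csharp
var s1 = "Ahmed";                                         // line 1
var s2 = "Ahmed";                                         // line 2: same literal, so the same pooled instance is reused
System.Console.WriteLine(s1 == s2);                       // line 3 (assumed): value comparison, prints True
System.Console.WriteLine(object.ReferenceEquals(s1, s2)); // line 4 (assumed): reference comparison, prints True
```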
Initializing Variable Using “new String()”
At line 1, we defined a variable called s3 and set its value to “Ahmed”.
From line 3 to 12, we defined a variable called s4, initialized it with new String() passing in an array of characters corresponding to “Ahmed”.
Since we used new String() to create a new instance of String, something different happens now. Let me explain.
When we use a string literal directly to initialize a variable (like var s3 = “Ahmed”), the allocation happens in a special pool in the Heap memory, known as the intern pool. Any instance allocated in this special pool is shared between all String variables that are initialized directly with the same literal.
However, whenever new String() is used, a new memory location is allocated outside this special pool.
Similarly, from line 14 to 23, we defined a variable called s5, initialized it with new String() passing in an array of characters corresponding to “Ahmed”.
Again, a separate memory location is allocated for the variable s5.
Therefore, at lines 25, 26, and 27, the result would be as follows:
True
False
False
For line 25, the result is True because they all have the same value.
For line 26, the result is False because they are not sharing the same memory allocation and they are actually two different references.
The same for line 27, the result is False because they are not sharing the same memory allocation and they are actually two different references.
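The behaviour described above can be sketched in a condensed form. This is an assumption-laden reconstruction: the character arrays are written on a single line, so the line numbers in the text do not map onto it, and the three comparisons are one plausible choice that yields True, False, False.

```csharp
using System;

var s3 = "Ahmed";                                        // literal: taken from the intern pool
var s4 = new string(new[] { 'A', 'h', 'm', 'e', 'd' });  // new instance outside the pool
var s5 = new string(new[] { 'A', 'h', 'm', 'e', 'd' });  // another new instance

Console.WriteLine(s3 == s4 && s4 == s5);            // True: all three hold the same value
Console.WriteLine(object.ReferenceEquals(s3, s4));  // False: different instances
Console.WriteLine(object.ReferenceEquals(s4, s5));  // False: different instances
```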
Implicit Setting Variable By Reference
At line 1, we defined a variable called s6 and set its value to “Ahmed”.
At line 2, we defined a variable called s7 and set its reference implicitly to s6.
What happened here is that the reference stored in s6 is copied to s7. Therefore, they are both referring to the same Heap memory allocation.
At line 3, we set the value of the variable s6 to “Tarek”.
As we explained before, the existing “Ahmed” instance is not modified here. Instead, s6 is pointed at a different memory location in the Heap memory that holds “Tarek”, so the reference stored in s6 is updated to the new memory location.
However, s7 would not be updated. This means that it would still refer to the memory location where “Ahmed” is saved.
Therefore, at lines 4 and 5, the result would be as follows:
Tarek
Ahmed
For line 4, the result is Tarek because now s6 is referring to the memory location where “Tarek” is saved.
For line 5, the result is Ahmed because s7 is still referring to the memory location where “Ahmed” is saved.
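A minimal sketch of the five statements described above, assuming C# top-level statements; it is a reconstruction from the description, with the statements in the same order as the referenced lines.

```csharp
var s6 = "Ahmed";              // line 1
var s7 = s6;                   // line 2: copies the reference, so both refer to the same instance
s6 = "Tarek";                  // line 3: only s6 is redirected; s7 is not affected
System.Console.WriteLine(s6);  // line 4: prints Tarek
System.Console.WriteLine(s7);  // line 5: prints Ahmed
```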
Explicit Setting Variable By Reference
What happens here is similar to what happened in the Implicit Setting Variable By Reference case.
And at lines 6 and 7, the result would be as follows:
Ahmed_
Ahmed
Advantages of String Being Immutable
Now, you might ask:
What is the advantage of the String being immutable?
The answer is quite logical. Let me explain.
Thread Safety
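Since String is immutable, once an instance is created in memory, we are sure that its content will never be changed.
Therefore, we don’t expect race conditions on the content of a String: even if more than one thread is reading the same String instance, every thread always sees the same value. Note that this applies to the String instance itself; reassigning a shared variable from several threads still needs the usual synchronization.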
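As a rough illustration, the sketch below lets several tasks read the same String instance at once. The names shared, tasks, and results are hypothetical, and the example assumes C# top-level statements with top-level await.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

var shared = "Ahmed";

// Each task derives a new string from `shared`; none of them can alter the shared instance.
var tasks = Enumerable.Range(0, 4)
    .Select(i => Task.Run(() => shared.ToUpperInvariant() + i))
    .ToArray();

// Task.WhenAll returns the results in the same order as the tasks.
var results = await Task.WhenAll(tasks);

Console.WriteLine(string.Join(", ", results)); // AHMED0, AHMED1, AHMED2, AHMED3
Console.WriteLine(shared);                     // Ahmed
```

No locking is needed around the reads because no operation on a String can change its content.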
Consistency
If we have the following:
var s1 = "Ahmed";
var s2 = "Ahmed";
var s3 = "Ahmed";
Then we decide to change the value of s1 as follows:
s1 = "Tarek";
What would actually happen is that only the value of s1 would be changed to “Tarek”. However, the values of s2 and s3 would still be “Ahmed”.
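The snippet below is a small sketch of this behaviour. For contrast, it also shows a mutable type, StringBuilder, which is not part of the example above and is used here only as a hypothetical counter-case.

```csharp
using System;
using System.Text;

var s1 = "Ahmed";
var s2 = "Ahmed";
var s3 = "Ahmed";
s1 = "Tarek";                          // only the variable s1 is redirected
Console.WriteLine($"{s1} {s2} {s3}");  // Tarek Ahmed Ahmed

// Contrast (hypothetical, not from the example above): a mutable object shared by two variables.
var b1 = new StringBuilder("Ahmed");
var b2 = b1;                           // both variables refer to the same mutable object
b1.Clear().Append("Tarek");            // mutates the shared object in place
Console.WriteLine(b2.ToString());      // Tarek
```

With a mutable type, a change made through one variable is visible through every other variable that shares the object; with String this cannot happen.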
Memory Optimization
Again, if we have the following:
var s1 = "Ahmed";
var s2 = "Ahmed";
var s3 = "Ahmed";
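In this case, only one memory location would be allocated in the Heap memory and its value would be set to “Ahmed”; all three variables refer to it.
So, following the same pattern, imagine that we have hundreds or even thousands of variables initialized with the same literal. They all share that single instance, so memory usage does not grow with the number of variables. Strings created at run time are not pooled automatically, but they can be added to the pool explicitly using String.Intern().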
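A short sketch of the sharing described above. The variables a and b are hypothetical additions that show Strings built at run time, which only share an instance after being passed through String.Intern().

```csharp
using System;

var s1 = "Ahmed";
var s2 = "Ahmed";
var s3 = "Ahmed";

// The three literals refer to one shared instance.
Console.WriteLine(object.ReferenceEquals(s1, s2) && object.ReferenceEquals(s2, s3)); // True

// Hypothetical run-time strings: each constructor call allocates its own instance.
var a = new string(new[] { 'A', 'h', 'm', 'e', 'd' });
var b = new string(new[] { 'A', 'h', 'm', 'e', 'd' });
Console.WriteLine(object.ReferenceEquals(a, b)); // False

// String.Intern returns the pooled instance for a given value, so both calls return the same reference.
Console.WriteLine(object.ReferenceEquals(string.Intern(a), string.Intern(b))); // True
```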
Easy Copying
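Since String is immutable, a method that needs to hand out a copy can simply return this, which is what String.Clone() does. Why? Because the content of the instance can never change, sharing it is as safe as copying it, and this is why String can act like a value type.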
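This can be observed with String.Clone(), which returns a reference to the same instance rather than a new one. The variable names below are hypothetical.

```csharp
using System;

var original = "Ahmed";
var copy = (string)original.Clone();  // Clone() returns object, hence the cast

Console.WriteLine(copy);                                   // Ahmed
Console.WriteLine(object.ReferenceEquals(original, copy)); // True: no new allocation was made
```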