by Zoran Horvat
The System.ICloneable interface was introduced with relatively vague accompanying documentation, and ever since it has remained an inexhaustible source of disputes. Should a given class implement the ICloneable interface or not? And, once it does, should the clone be deep or shallow? And if deep, how deep? And if moderately deep, how much of the implementation details should be documented and communicated to the consumer?
As often happens among programmers, two camps have emerged: one urging not to implement ICloneable, the other providing implementation guidelines. Microsoft itself stirred the waters back in 2003 and has not made an effort since to clarify the situation. A decade later, the two camps are still entrenched and new developers are still asking who is right and who is wrong.
In this article we are going to investigate the System.ICloneable interface as it is documented. Then, we will list pros and cons to the idea of implementing this interface.
The original documentation for the ICloneable interface can be found on the documentation page titled ICloneable Interface. Developers should stick to the definitions and explanations from the documentation whenever in doubt. Internet resources, including this article, are often biased and should be taken with a grain of salt.
So, what does the documentation say about the ICloneable interface? Here is the first hint about what this interface is meant to do: "Supports cloning, which creates a new instance of a class with the same value as an existing instance."
The following sentences give a little bit more insight. "The ICloneable interface enables you to provide a customized implementation that creates a copy of an existing object. The ICloneable interface contains one member, the Clone method, which is intended to provide cloning support beyond that supplied by Object.MemberwiseClone."
And that is, more or less, it.
So where does all the confusion surrounding this interface come from? In many cases it comes from an unpleasant feeling that ICloneable is somewhat incomplete and that it should be more specific about which part of the object's content the implementation should really copy. Here is the statement which seems to provide most of the fuel for disputes. "The ICloneable interface simply requires that the method return a copy of the current object instance. It does not specify whether the cloning operation performs a deep copy, a shallow copy, or something in between. Nor does it require all property values of the original instance to be copied to the new instance."
Many developers find this definition too permissive. If cloning of a class can be implemented in one way or in the opposite way, then what should the consumer expect from that class, and how can mistakes that arise from wrong expectations be prevented? The answers will follow.
Whether Clone should produce a deep or a shallow copy is a question that bothers many developers when they come to implementing the ICloneable interface. The documentation is explicit on the matter: it is not specified. Many programmers have a hard time with this sentence, probably because they tend to read it as if the documentation merely failed to say whether the Clone method produces a deep or a shallow copy. Well, that's not what it says. It says, rather, that the depth is deliberately left unspecified. This formulation removes the pressure from the implementation. If a consumer insists on knowing the answer, then the problem lies in the consumer itself - either ICloneable is not the right interface for its needs or, which is quite often the case, the consumer asks for more than it needs (in formal terms, it is overspecified).
Let's take a look at one refreshing example. One of the core classes in .NET Framework, System.String, implements ICloneable. Strings in .NET are immutable, which means that there are no operations on the class that can change the content of an object once it is constructed. Should one try to modify a string, e.g. to trim it, substitute some characters, change case, or anything else, the corresponding method will actually construct a new instance of System.String with all modifications applied and return that instance as the result. Now that we know this, can we tell what the string's Clone method should do? Many developers will be surprised to find that the implementation of the String.Clone method looks something like this:
public class String : ICloneable, ...
{
...
public System.Object Clone()
{
return this;
}
...
}
Does this implementation satisfy the definition ("creates a new instance of a class with the same value as an existing instance")? Let's check:
string s1 = "Something";
string s2 = (string)s1.Clone();
if (s2 != "Something")
throw new Exception("Not the same!");
There will be no exception thrown, as you already know. Did the Clone method create a new instance? Yes - that is s2. Is the new value the same as the first one? Yes - that is what the if statement verifies. Is the new instance a deep or a shallow copy of the first instance? Well, the answer that follows might offend some readers: it's not your business to know. Let's see why this is so:
string s1 = "Something";
string s2 = (string)s1.Clone();
if (object.ReferenceEquals(s1, s2))
throw new Exception("But I need to know!");
When this code is run on .NET Framework version 4, for example, it really throws the exception. But that is only because the consumer has asked the wrong question. The ICloneable interface does not guarantee that the clone will actually be a different object. Such an expectation may sound reasonable in general, but it fails when applied to a particular class such as System.String. Lesson learned - do not assume anything about the internals of the Clone implementation. Stick with the contract, which says you are going to obtain "a copy".
In terms that might sound more reassuring, a cloned object should be safe to use without affecting the original object. The implementation should make sure that the object is copied as deep as it takes to ensure this property.
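The idea of copying "as deep as it takes" can be illustrated with a minimal sketch. The Playlist class below is hypothetical and not part of .NET; it is assumed to hold one immutable member (a string) and one mutable member (a list). Only the mutable member needs a copy of its own for the clone to be safely usable.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class, used only to illustrate the point.
public class Playlist : ICloneable
{
    public string Name { get; set; }
    public List<string> Tracks { get; private set; }

    public Playlist(string name)
    {
        Name = name;
        Tracks = new List<string>();
    }

    public object Clone()
    {
        // MemberwiseClone copies all fields; the Name string can be shared
        // between the two objects because strings are immutable.
        Playlist copy = (Playlist)MemberwiseClone();

        // The list is mutable, so the clone gets a list of its own.
        copy.Tracks = new List<string>(Tracks);
        return copy;
    }
}

public static class PlaylistDemo
{
    public static void Main()
    {
        Playlist original = new Playlist("Morning");
        original.Tracks.Add("Track 1");

        Playlist copy = (Playlist)original.Clone();
        copy.Tracks.Add("Track 2");

        Console.WriteLine(original.Tracks.Count); // 1
        Console.WriteLine(copy.Tracks.Count);     // 2
    }
}
```

The consumer never needs to know that the string is shared and the list is not; it only relies on the clone not affecting the original.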
Another question that is frequently raised on the Internet is why there is no strongly typed ICloneable interface. To see why there is no such interface, let's try to produce one. Here is a very simple implementation:
public interface ICloneable<T>
{
T Clone();
}
Now let's try to use this interface:
public class A : ICloneable<A>
{
public A() { }
public A(A a) { }
public virtual A Clone() { return new A(this); }
}
This method will certainly work fine. But now, let's try to introduce another class, derived from A:
public class B : A
{
public B() { }
public B(B b) : base(b) { }
public override A Clone()
{
B b = new B(this);
return b;
}
}
The first thing to notice is that there is no Clone method that returns an instance of B - the only method available returns a reference to A. Now look at this situation from the viewpoint of a consumer. What is the difference between having a reference to A and having a reference to System.Object when a reference to B is required? Basically - none. In either case, the consumer must cast the result to the desired type B at run time, risking a runtime exception if the object at hand is not an instance of B. Hence the first conclusion regarding a strongly typed ICloneable interface: it is strongly typed as long as the type is not derived from; once it is, the interface instantly ceases to be strongly typed!
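The consumer's side of this problem can be sketched as follows. The ConsumerDemo class is an assumption added for illustration; the interface and the classes A and B repeat the declarations from above so that the snippet is self-contained.

```csharp
using System;

public interface ICloneable<T>
{
    T Clone();
}

public class A : ICloneable<A>
{
    public A() { }
    public A(A a) { }
    public virtual A Clone() { return new A(this); }
}

public class B : A
{
    public B() { }
    public B(B b) : base(b) { }
    public override A Clone() { return new B(this); }
}

// Hypothetical consumer code.
public static class ConsumerDemo
{
    public static void Main()
    {
        B original = new B();

        // The declared return type is A, so a cast is required,
        // exactly as it would be with System.Object.
        B copy = (B)original.Clone();
        Console.WriteLine(copy.GetType().Name); // B

        A plain = new A();
        try
        {
            // Compiles just as well, but fails at run time.
            B wrong = (B)plain.Clone();
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("Clone of A cannot be cast to B");
        }
    }
}
```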
Some authors have tried to cope with the problem by stating the type in the declaration of the Clone method:
public interface ICloneable<T>
{
T1 Clone<T1>() where T1 : T;
}
This solution attempts to evade the problem of derived classes by carving "this and any derived type" into the Clone method's signature. There are two problems with this approach. First, the Clone method introduces another generic type parameter not mentioned in the interface declaration, which adds a whole lot of confusion. Second, the way in which this interface is used is, to say the least, awful:
public class A : ICloneable<A>
{
public A() { }
public A(A a) { }
public virtual T Clone<T>() where T: A
{
return (T)(new A(this));
}
}
public class B : A
{
public B() { }
public B(B b) : base(b) { }
public override T Clone<T>()
{
return (T)(A)(new B(this));
}
}
And it is still only semi-strongly typed once we step into the derived class B, whose Clone method is nonetheless still declared in terms of A.
So if we dismiss the solution with generic Clone method in generic ICloneable interface, we end up with a simple solution which proposes the Clone method with basically unknown return type:
public interface ICloneable<T>
{
T Clone();
}
Just to get an idea how bad this is, try to compile something like this:
public class A : ICloneable<A>
{
public A() { }
public A(A a) { }
public virtual A Clone()
{
return new A(this);
}
}
public class B : A, ICloneable<B>
{
public B() { }
public B(B b) : base(b) { }
public override A Clone()
{
return new B(this);
}
public virtual B Clone()
{
return new B(this);
}
}
In this example, we have tried to work around the fact that B's cloning facility returns A and to provide a side-by-side cloning facility which returns a strongly typed B result. But this attempt fails with a compile error:
Type 'CloningDemo.B' already defines a member
called 'Clone' with the same parameter types.
What lies behind this error is the fact that the two Clone methods in the implementation of B have the same name and arguments, which causes a name clash when it comes to compiling the class. We can emphasize the problem even further by dropping the override of A's Clone method in the declaration of B:
public class B : A, ICloneable<B>
{
public B() { }
public B(B b) : base(b) { }
public virtual B Clone()
{
return new B(this);
}
}
This time we receive a compile-time warning:
'CloningDemo.B.Clone()' hides inherited member 'CloningDemo.A.Clone()'.
To make the current member override that implementation,
add the override keyword. Otherwise add the new keyword.
The compiler seems to have no clue what this code should do. One possible way out of this situation is explicit interface implementation:
public class A : ICloneable<A>
{
public A() { }
public A(A a) { }
A ICloneable<A>.Clone()
{
return new A(this);
}
}
public class B : A, ICloneable<B>
{
public B() { }
public B(B b) : base(b) { }
B ICloneable<B>.Clone()
{
return new B(this);
}
}
In this way both base and derived class are cloneable, each in its own strong type. But this solution is not polymorphic. Each class is implementing a separate Clone method of its own, which is often not desired behavior.
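The lack of polymorphism can be seen in a short sketch. The ExplicitDemo class is an assumption added for illustration; the rest repeats the declarations above so that the snippet is self-contained. Which Clone method runs depends on the interface through which the object is viewed, not on the object's run-time type.

```csharp
using System;

public interface ICloneable<T>
{
    T Clone();
}

public class A : ICloneable<A>
{
    public A() { }
    public A(A a) { }
    A ICloneable<A>.Clone() { return new A(this); }
}

public class B : A, ICloneable<B>
{
    public B() { }
    public B(B b) : base(b) { }
    B ICloneable<B>.Clone() { return new B(this); }
}

// Hypothetical demonstration code.
public static class ExplicitDemo
{
    public static void Main()
    {
        B b = new B();

        // The same object, cloned through two different interfaces.
        A viaA = ((ICloneable<A>)b).Clone();
        B viaB = ((ICloneable<B>)b).Clone();

        Console.WriteLine(viaA.GetType().Name); // A
        Console.WriteLine(viaB.GetType().Name); // B
    }
}
```

Code that holds the object as ICloneable<A> silently receives a plain A, even though the original object is a B.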
The bottom line is that a strongly typed ICloneable<T> interface was never introduced because derived class implementations would still be weakly typed, and there is no proper way to distinguish base type implementations from derived type implementations.
There are many references on the Internet pointing to a 2003 blog post by Brad Abrams - employed at Microsoft at the time - in which some thoughts about ICloneable are discussed. The blog entry is titled Implementing ICloneable. Despite the misleading title, this blog entry advises against implementing ICloneable, mainly because of the shallow/deep confusion. The article ends with a straightforward suggestion: if you need a cloning mechanism, define your own Clone or Copy methodology, and ensure that you document clearly whether it is a deep or shallow copy. An appropriate pattern is:
public <type> Copy();
If we take a look at .NET Framework 2, which was released in 2005, we will see that there is only a handful of new types that still implement ICloneable. This small number of misbehaving classes may be an indicator that Microsoft had actually made an effort to avoid the Clone method. So we might conclude that Microsoft still stands by the same suggestion, which is to avoid using ICloneable in public interfaces.
This suggestion raises several interesting questions. For example, the Copy method is not part of any interface - otherwise, that interface would suffer from the same problems ICloneable did. Consequently, the Copy method can be implemented in any class, even overridden in derived classes, but it would be quite different from the Copy method in any other class. It looks like Microsoft has taken the course of splitting the cloning facilities on a per-class basis, completely avoiding generalizing cloning into an interface. In most cases this approach provides quite useful results. If that is the course your design has taken, rest assured that there are no pitfalls, or at least no obvious pitfalls. But watch out for name clashes should your custom Copy method end up in any interface!
On the other hand, remember that there are still hundreds of classes in .NET Framework that implement ICloneable, including notable examples such as System.Drawing.Image, System.Xml.XmlNode, and many others. If your code expects from an ICloneable implementation just what the contract specifies and nothing beyond that, you are largely safe from trouble.
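A per-class Copy method of this kind might look like the sketch below. The Team and Member classes are hypothetical; the point is only that the method is strongly typed and that its documentation states exactly how deep the copy goes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical classes, used only for illustration.
public class Member
{
    public string Name;
}

public class Team
{
    private readonly List<Member> members = new List<Member>();

    public IList<Member> Members { get { return members; } }

    /// <summary>
    /// Returns a new Team with a member list of its own.
    /// The Member objects themselves are NOT copied; they are shared
    /// between the original and the copy.
    /// </summary>
    public Team Copy()
    {
        Team copy = new Team();
        copy.members.AddRange(members);
        return copy;
    }
}

public static class TeamDemo
{
    public static void Main()
    {
        Team original = new Team();
        original.Members.Add(new Member { Name = "Ann" });

        Team copy = original.Copy();          // no cast needed
        copy.Members.Add(new Member { Name = "Bob" });

        Console.WriteLine(original.Members.Count); // 1
        Console.WriteLine(copy.Members.Count);     // 2
    }
}
```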
On a zero-to-infinity numeric scale it is quite simple to say which is the smallest number: it is zero. But what is the largest number? That question cannot be answered, because for any given number there is a number larger by one. There is no largest number on an open scale.
The same is true in the cloning business. It's easy to say what a shallow copy is - it is already provided for us by the Object.MemberwiseClone method. There is nothing special to do to obtain a shallow copy of an object. But deciding what a true deep copy is may be very tricky. Let's take a look at several examples.
What is the deep copy of a collection? Does it require recursively copying the elements of the collection? And what if some of the contained objects do not implement a deep cloneable interface? Should the operation fail, or silently step over those elements?
What is the deep copy of a circular list? How can we discover that some contained object has already been cloned, so that the same reference is used in all places where the original object was used? Without this precaution, we run the risk of infinite recursion, which typically ends in an immediate stack overflow exception.
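Before turning to those examples, the shallow baseline can be sketched in a few lines. The Basket class is hypothetical; the sketch only shows what Object.MemberwiseClone does: value-type fields are copied, while reference-type fields end up pointing to the same objects.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class, used only for illustration.
public class Basket
{
    public int Id;
    public List<string> Items = new List<string>();

    public Basket ShallowCopy()
    {
        // MemberwiseClone is protected, so it is exposed through a method.
        return (Basket)MemberwiseClone();
    }
}

public static class BasketDemo
{
    public static void Main()
    {
        Basket original = new Basket { Id = 1 };
        original.Items.Add("apple");

        Basket copy = original.ShallowCopy();
        copy.Id = 2;              // affects the copy only
        copy.Items.Add("pear");   // affects both: the list is shared

        Console.WriteLine(original.Id);          // 1
        Console.WriteLine(original.Items.Count); // 2
    }
}
```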
What is the deep copy of an object with external dependencies? Should HttpWebRequest clone the accompanying Web server, just to make sure that it is another Web server this time? Should Stream class copy all file contents into a spare location just to make sure that it's not the same data as in original stream?
Questions about really-deep-cloning are endless. There seems to be no proper way to define deep cloning without getting into the trouble of drawing a boundary between cloning the object at hand and cloning other objects which would have to remain invariant with respect to the cloned object. Consequently, a deep cloneable interface is not part of the Base Class Library.
Previous sections have shown several conflicting ideas surrounding the ICloneable interface. But the ICloneable interface is not alone when such questions are raised. Take a look at the Equals method, provided by System.Object. All custom types are free to provide a custom implementation of the Equals method in order to test value equality between objects. However, the notion of "value of the object" is not quite clear. These are the guidelines for inheritors: "You can also override the default implementation of Equals to test for value equality instead of reference equality and to define the precise meaning of value equality. Such implementations of Equals return true if the two objects have the same value, even if they are not the same instance. The type's implementer decides what constitutes an object's value, but it is typically some or all the data stored in the instance variables of the object."
Does this passage look familiar? If it is so bad that ICloneable leaves it to inheritors to decide whether to make a deep or a shallow copy, why does it cause no discomfort that the Equals method does not specify whether to compare all or just some members? And, for each member, whether to test it by reference or by value, recursively calling its Equals method?
The .NET Framework is full of cases in which confusion can quickly erupt if the caller makes decisions based on unreasonable assumptions. Custom classes should not put constraints on the types that they call. Instead, called types should define their interfaces and callers should operate within the defined bounds. In terms of ICloneable - the caller should not assume a deep copy, or a shallow copy, or any copy in between. If it is not possible for a caller to operate without knowing the type of copy, then it cannot call the Clone method. If there is no alternative method, like a strongly defined Copy method, then there is a gap in the cloneable class implementation that should be filled before proceeding. Cutting the problem short by presuming how the Clone method is implemented is bad design on many accounts. The only real problem regarding the ICloneable interface is the mass of code around the globe that relies on the impression that it knows what the Clone method implementation really is. Make sure that your designs do not fall into the same trap.
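For one specific type, the circular-list question can still be answered locally. The sketch below uses a hypothetical Node class and keeps a map from originals to their clones, so that an object reached a second time is not cloned again. It does not settle the general question of where deep cloning should stop, and since it recurses once per node, a very long list could still exhaust the stack.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical singly linked node; the list may be circular.
public class Node
{
    public int Value;
    public Node Next;

    public Node DeepCopy()
    {
        return DeepCopy(new Dictionary<Node, Node>());
    }

    private Node DeepCopy(Dictionary<Node, Node> clones)
    {
        // If this node has been cloned already, reuse that clone.
        Node existing;
        if (clones.TryGetValue(this, out existing))
            return existing;

        Node copy = new Node { Value = Value };

        // Register the clone before following Next,
        // so that a cycle leads back to it instead of recursing forever.
        clones[this] = copy;

        if (Next != null)
            copy.Next = Next.DeepCopy(clones);

        return copy;
    }
}

public static class NodeDemo
{
    public static void Main()
    {
        Node a = new Node { Value = 1 };
        Node b = new Node { Value = 2 };
        Node c = new Node { Value = 3 };
        a.Next = b; b.Next = c; c.Next = a;   // circular list

        Node copy = a.DeepCopy();

        Console.WriteLine(object.ReferenceEquals(copy, a));                   // False
        Console.WriteLine(object.ReferenceEquals(copy.Next.Next.Next, copy)); // True
    }
}
```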
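To make the parallel concrete, here is a minimal sketch of an Equals override which compares only some of the members. The Person class and the choice of members are assumptions made for illustration; a caller cannot tell from the Equals signature which members take part in the comparison.

```csharp
using System;

// Hypothetical class, used only for illustration.
public class Person
{
    public string Name;
    public DateTime LastSeen;   // deliberately not part of the object's value

    public override bool Equals(object obj)
    {
        Person other = obj as Person;
        return other != null && Name == other.Name;
    }

    public override int GetHashCode()
    {
        // Must be consistent with Equals, so it uses Name only.
        return Name == null ? 0 : Name.GetHashCode();
    }
}

public static class PersonDemo
{
    public static void Main()
    {
        Person first = new Person { Name = "Ann", LastSeen = new DateTime(2013, 1, 1) };
        Person second = new Person { Name = "Ann", LastSeen = new DateTime(2013, 6, 1) };

        Console.WriteLine(first.Equals(second));                  // True
        Console.WriteLine(object.ReferenceEquals(first, second)); // False
    }
}
```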
For more information about how to implement the ICloneable interface in custom classes, please refer to the article How to Implement ICloneable Interface in Derivable Classes in .NET.
For more information about how to override the Equals and GetHashCode methods in custom classes, please refer to the article How to Override Equals and GetHashCode Methods in Base and Derived Classes.
Zoran Horvat is the Principal Consultant at Coding Helmet, speaker and author of 100+ articles, and independent trainer on .NET technology stack. He can often be found speaking at conferences and user groups, promoting object-oriented and functional development style and clean coding practices and techniques that improve longevity of complex business applications.