Implementing NSCopying (or NSCopyObject() considered harmful)
NSCopying is not always as simple to implement as you would think. Apple has a good write-up discussing the complexities, but let me elaborate. Forgive some ranting digressions. It’s important to know how to work around the problems I’m going to discuss, but it’s also important to understand how insane it is to have to work around this issue.
First, there’s the fairly obvious problem of deep versus shallow copies. If object foo has an instance variable *bar, should a copy of foo have a second pointer to bar, or should it have its own copy of whatever bar points to? There is no way to answer this question generally; it depends on the nature of the objects.
Most of the time, this can be dealt with fairly easily by implementing the accessors with the correct behavior (retain versus copy), and you wind up with a copyWithZone: like this:
- (id)copyWithZone:(NSZone *)zone
{
Product *copy = [[[self class] allocWithZone: zone] init];
[copy setProductName:[self productName]];
return copy;
}
That works really well as long as your superclass doesn’t implement NSCopying, but if it does, you may not have enough information or access to cleanly copy it. Now you would think this would be easy:
- (id)copyWithZone:(NSZone *)zone
{
Product *copy = [super copy];
[copy setProductName:[self productName]];
return copy;
}
But that may or may not work. If super implements -copyWithZone: as described above, then all is fine. But what if your superclass uses NSCopyObject()? Things go badly, and in ways very difficult to understand and debug. NSCopyObject(), in my opinion, is evil. Yes it’s quick and easy to use, but it’s incredibly dangerous and there are better ways (I’ll discuss one approach later). Never use this function in your own code. But Apple used it in NSCell, and so we have to deal with it. NSCopyObject() breaks object orientation (it forces subclasses to know implementation details of their superclass) and memory management (it breaks retain counts).
NSCopyObject() makes a perfect copy of an object’s ivars, optionally expanding the size of the resulting copy. By “perfect copy” I mean “copies the pointers in the ivars to the new object.” So this is what happens in our example above:
- (id)copyWithZone:(NSZone *)zone
{
// Assume -productName = someProduct
// We'll call this point [1].
Product *copy = [super copy]; // [2]
[copy setProductName:[self productName]]; // [3]
return copy;
}
At point [1], productName is pointing to someProduct, which has some positive retain count. At point [2], copy‘s productName also points to the same object, but the retain count has not changed. It does not matter how -setProductName: is implemented because this isn’t called. NSCopyObject() has just copied the raw pointer.
At [3], we call [copy setProductName:...], which will likely [productName release]. Remember that copy->productName points to someProduct at this point, so we just reduced this retain count by 1. That might deallocate someProduct immediately, or it might not. Then we’ll [newValue retain] (newValue also points to someProduct), either crashing immediately, or setting the retain count back up by one, equal to its original value. But we now have an extra object pointing to it!
So what do we do? Well, the problem is that NSCopyObject() copied pointers without changing the retain counts. So before we mess with the retain counts, we need to clear out those pointers. Here is the canonical way to solve this problem. It’s terrible ObjC, but it is the “correct” way to solve Apple’s bug:
- (id)copyWithZone:(NSZone *)zone
{
Product *copy = [super copy];
copy->productName = nil;
[copy setProductName:[self productName]];
return copy;
}
That -> is dereferencing the struct* that underlies id. This bypasses all accessors and modifies the underlying ivar. You should never do this, but here it’s the best way. The other solution would be to randomly call [[copy productName] retain], but that’s even more confusing and error-prone.
If you don’t do this, when your two objects deallocate they’ll over-release someProduct, and you’ll crash. And you will be dumbfounded by the crash because all of your memory management will be correct. You’ll spend a few hours before hopefully Google brings you to a page like this one.
Since it is a private implementation detail whether a class uses NSCopyObject(), you must assume that any class you don’t control might, and so you should implement -copyWithZone: as above in all cases that your superclass implements NSCopying. Even if all the relevant superclasses are yours, you want to implement as above in case you ever change the superclass of your top-level class (which is itself an implementation detail and your subclasses shouldn’t rely on). This problem exists for all decedents of the class that uses NSCopyObject(), not just the first subclass. Your parent class can’t fix it for you.
I am horrified that NSCopyObject() has been carried over to iPhone. They’ve been so good about cleaning up these kinds of things. I understand the desire for a fast and easy copy, but there are better ways. objc_object is a struct of ivars. We could have a fast memory copy of all the ivars for the current class. The rest of the struct could have been initialized to NULL as usual. If this sounds complex it should be as simple as calling class_getInstanceSize() for your parent, adding that to self to get your first ivar’s offset, and memcpy() the number of bytes for your class_getInstanceSize() minus your superclass’s. Then you could clean up your own retain counts without screwing up your subclasses. Even better, multiple sub-classes could use this fast-copy without impacting each other, versus NSCopyObject() which can only be use by the top-level class.
Hi! Great info.
Could you post detailed example about the alternative way, this would help iPhone newbies.
You missed initializing the allocated memory:
should be Product *copy = [[[self class] allocWithZone: zone]init];
Thanks for the enlightening article, I was having hard time understanding the drawback of NSCopyObject on the reference manual.
@ozgur Wow… I even have the extra bracket in there. I should always cut-and-paste code into the blog. Thanks. Fixed.
Nice work!
You wrote “[S]hould a copy of foo have a second pointer to bar, or should it have its own copy of whatever bar points to? There is no way to answer this question generally; it depends on the nature of the objects.”
Indeed, I suspect I’m wrestling with this issue right now. (I could also be suffering from a bit of developer myopia and need to step back a bit.)
As I’ve been taught, for attributes whose type is an immutable value class that conforms to the NSCopying protocol, it’s wise to specify copy vs retain in the @property declaration.
In particular, the recommendation is to always use copy on NSString setters, which I continue to do to this day:
However, wouldn’t that go against the advisement – when NSCopying is honored – to “[implement] the accessors with the correct behavior (retain versus copy)”?
In the case of NSString, is the use of copy in setters a bad idea? Is it a toss-up (“it doesn’t matter”)?
The urging to use copy on NSString setters is very strong out there in the wild, so I want to make sure I “measure twice, cut once” here!
The answer you link to has is mostly correct but has made the point too general in my opinion. It is true that public NSString properties should be copied rather than retained, but this is because NSString has a common mutable subclass (NSMutableString). For immutable classes with no mutable subclass, it is generally best to retain rather than copy for memory and performance reasons.
Yes, it is conceivable that a custom immutable class will later have an immutable class added, but you have to balance this hypothetical against the cost. If you’re very concerned about it, you could add an
isMemberOfClass:assertion, but in my opinion this goes beyond defensive programming to unhelpful paranoia.When choosing whether to retain or copy, you must also consider the size of the object. If the object is quite small, then perhaps the cost of copy is worth it. If it is large or set very often, then copying may simply be too inefficient. As an example from the other direction, when returning an NSArray that you store internally as an NSMutableArray, you should make a copy so that the array doesn’t change behind your caller’s back. Unfortunately this can sometimes be a massive performance bottleneck, and I’ve been forced to just return the NSMutableArray cast as an NSArray. Read the docs for NSView -subviews for an example where Apple has had to do this, too.
Always remember to keep the actual goal in mind, and then apply the rules of thumb appropriately.
An immutable value class with no mutable subclass can implement NSCopying by returning [self retain]; client code should not be able to tell the difference. That’s what it means to be a value class: equivalent instances are interchangeable, whether or not they’re pointer-equal (so you may as well make them pointer-equal).
When choosing whether to specify retain or copy for a property, you should not be considering size, time, etc. – you should be considering semantics first and foremost.