How to: Copy data in Python
Understanding Shallow vs Deep Copy in Python
When working with data structures in Python, particularly lists and dictionaries, you may find yourself needing to copy an object. However, it's crucial to understand the difference between shallow and deep copying first to avoid unintended side effects in your code.
What is a Shallow Copy
A shallow copy creates a new object, but it inserts references into it to the objects found in the original. This means that while the new object is a separate instance, the elements within it are still references to the same objects as those in the original. In Python, you can create a shallow copy using the copy
module:
import copy
original_list = [1, 2, [3, 4]]
shallow_copied_list = copy.copy(original_list)
In this example, shallow_copied_list
is a new list, but it contains references to the same nested list [3, 4]
found in original_list
. Modifying the nested list in either original_list
or shallow_copied_list
will affect both lists.
What is a Deep Copy
In contrast, a deep copy creates a new object and recursively copies all objects found in the original, creating fully independent copies. This ensures that modifications to the new object do not affect the original object. You can create a deep copy using the copy
module as well:
import copy
original_list = [1, 2, [3, 4]]
deep_copied_list = copy.deepcopy(original_list)
Here, deep_copied_list
is a completely independent copy of original_list
, including the nested list. Changes to the nested list in deep_copied_list
will not affect original_list
, and vice versa.
How to Make Shallow Copies in Python
In addition to using the copy
module from the example above, there are several other ways to create shallow copies of objects in Python. Here are a few common methods:
Using the copy()
Method
Python lists and dictionaries have a built-in copy()
method that returns a shallow copy of the original object:
original_dict = {'a': 1, 'b': {'c': 2}}
shallow_copied_dict = original_dict.copy()
print(shallow_copied_dict) # Output: {'a': 1, 'b': {'c': 2}}
Using List Slicing
You can create a shallow copy of a list using slicing. This method is concise and commonly used for copying lists:
original_list = [1, 2, [3, 4]]
shallow_copied_list = original_list[:]
print(shallow_copied_list) # Output: [1, 2, [3, 4]]
Using the list()
Function
Another way to create a shallow copy of a list is by using the list()
function:
original_list = [1, 2, [3, 4]]
shallow_copied_list = list(original_list)
print(shallow_copied_list) # Output: [1, 2, [3, 4]]
Using the dict()
Function
To create a shallow copy of a dictionary, you can use the dict()
function:
original_dict = {'a': 1, 'b': {'c': 2}}
shallow_copied_dict = dict(original_dict)
print(shallow_copied_dict) # Output: {'a': 1, 'b': {'c': 2}}
How to Make Deep Copies in Python
Other than using copy
module, like in the example above, here are some other options:
Using Custom Deep Copy Functions
For specific use cases, you might want to create a custom deep copy function. This can be useful if you need to handle certain types of objects in a particular way or if you want to avoid importing the copy
module.
Here's an example of a simple custom deep copy function for lists:
def custom_deepcopy(obj):
if isinstance(obj, list):
return [custom_deepcopy(item) for item in obj]
else:
return obj
original_list = [1, 2, [3, 4]]
deep_copied_list = custom_deepcopy(original_list)
print(deep_copied_list) # Output: [1, 2, [3, 4]]
This function checks if the object is a list and recursively copies each element. For non-list objects, it returns the object itself.
Using JSON Serialization
Another method to create deep copies is by using JSON serialization and deserialization. This method involves converting the object to a JSON string and then back to a Python object. It's particularly useful for data structures that are easily representable in JSON format.
Here's how you can use JSON serialization for deep copying:
import json
original_list = [1, 2, [3, 4]]
deep_copied_list = json.loads(json.dumps(original_list))
print(deep_copied_list) # Output: [1, 2, [3, 4]]
This approach works well for lists and dictionaries but may only be suitable for some complex objects or objects that can be serialized to JSON.
I am not suggesting it is a good Pythonic way of doing things, but it is still an option!
Copying Custom Objects
When working with custom objects, you can define your own deep copy method within the class. This method allows you to control exactly how the object and its attributes are copied.
Here's an example:
class CustomObject:
def __init__(self, value, nested):
self.value = value
self.nested = nested
def __deepcopy__(self, memo):
# Create a new instance of CustomObject
copied_object = CustomObject(self.value, copy.deepcopy(self.nested, memo))
return copied_object
original_object = CustomObject(1, [2, 3])
deep_copied_object = copy.deepcopy(original_object)
print(deep_copied_object.nested) # Output: [2, 3]
In this example, the __deepcopy__
method is defined within the CustomObject
class to handle deep copying of its attributes.
Conclusion
Understanding the difference between shallow and deep copying in Python is essential for managing data structures and avoiding unintended side effects in your code. Shallow copies create new objects with references to the original elements, while deep copies create fully independent copies of the original objects. By using the appropriate copying method, such as list slicing, the copy()
method, or custom deep copy functions, you can ensure your data is managed correctly according to your specific use case.