Last time, we had an epic journey through the world of Python Tuples, and I hope you’re all tupled out! Today, let’s switch gears and dive into the marvelous world of Python Sets. Imagine a world where duplicates don’t exist, and order is a myth. Welcome to Sets!
What Are Python Sets?
Think of a set like a magical bag where you can toss in a bunch of items, and it will automatically toss out any duplicates and jumble the order. Sets are unordered, unindexed collections that do not allow duplicate elements. A set itself can be modified, but its elements must be hashable, which in practice means immutable types such as strings, numbers, and tuples.
Python’s built-in set type has the following characteristics:
- Sets are unordered.
- Set elements are unique. Duplicate elements are not allowed.
- A set itself may be modified, but the elements contained in the set must be of an immutable type.
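To see the last point in practice, here is a small sketch (the variable names are just for illustration). Strictly speaking, Python requires set elements to be hashable; the built-in immutable types qualify, while a list does not.

```python
# Hashable (immutable) values such as strings, numbers, and tuples work as elements.
valid_set = {"apple", 42, (1, 2)}

# A list is mutable and therefore unhashable, so adding one raises TypeError.
try:
    valid_set.add(["banana", "cherry"])
    error_message = None
except TypeError as error:
    error_message = str(error)

print(error_message)   # the message mentions an unhashable type
print(len(valid_set))  # 3 (the failed add() left the set unchanged)
```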
Here’s a quick comparison with lists and tuples:
- Lists: Ordered and indexed. Allows duplicates. Mutable.
- Tuples: Ordered and indexed. Allows duplicates. Immutable.
- Sets: Unordered and unindexed. Does not allow duplicates. Mutable.
Feature | List | Tuple | Set | Dictionary |
---|---|---|---|---|
Definition | Ordered collection of elements | Ordered, immutable collection | Unordered collection of unique elements | Collection of key-value pairs with unique keys |
Syntax | [1, 2, 3] | (1, 2, 3) | {1, 2, 3} | {'key1': 'value1', 'key2': 'value2'} |
Mutability | Mutable | Immutable | Mutable | Mutable |
Order | Ordered | Ordered | Unordered | Insertion-ordered (guaranteed since Python 3.7) |
Duplicates | Allows duplicates | Allows duplicates | No duplicates | Keys must be unique; values can be duplicated |
Indexing | Supports indexing and slicing | Supports indexing and slicing | Does not support indexing | Keys are used for indexing |
Methods | Many methods (e.g., append, remove) | Limited methods (e.g., count, index) | Basic methods (e.g., add, remove) | Rich methods (e.g., get, keys, values, items) |
Performance | Membership tests scan the list, O(n) | Membership tests scan the tuple, O(n); slightly cheaper to create than a list | Fast membership tests, O(1) on average | Fast lookups by key, O(1) on average |
Use Cases | Dynamic arrays, frequent updates | Fixed collections, read-only data | Membership testing, unique elements | Mapping, fast access to values by keys |
Memory Consumption | Over-allocates spare capacity to allow growth | Fixed size, slightly more compact than a list | Hash table overhead, more than a list of the same elements | Hash table overhead for both keys and values |
Creating Sets
You can create a set using curly braces {} or the set() function. Let’s see both in action:
# Using curly braces
fruits = {"apple", "banana", "cherry"}
print(fruits) #output (order may vary): {'apple', 'banana', 'cherry'}
# Using the set() function
vegetables = set(["carrot", "lettuce", "spinach"])
print(vegetables) #output (order may vary): {'spinach', 'carrot', 'lettuce'}
Empty Set
Creating an empty set is a bit tricky because {} creates an empty dictionary. Instead, use set().
# Correct way to create an empty set
empty_set = set()
print(empty_set)
#output: set()
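If you want to convince yourself of the difference, a quick check like the one below should do it. This is only a sketch; the variable names are made up for the example.

```python
empty_braces = {}    # this is a dict, not a set
empty_set = set()    # this is the empty set

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
print(len(empty_set))               # 0
```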
Basic Set Operations
Method | Description | Behavior if Element Not Present | Return Value | Usage |
---|---|---|---|---|
remove(x) | Removes the specified element x from the set. | Raises a KeyError. | None. | my_set.remove(x) |
discard(x) | Removes the specified element x from the set. | Does nothing (no error). | None. | my_set.discard(x) |
pop() | Removes and returns an arbitrary element from the set. | Raises a KeyError if the set is empty. | The removed element. | my_set.pop() |
clear() | Removes all elements from the set, making it empty. | No effect (no error). | None. | my_set.clear() |
Explanation:
- remove(x): Use this when you want to remove a specific element from the set and you expect it to be present. If the element isn’t there, it will raise an error.
- discard(x): Similar to remove(x), but it’s safer if you’re unsure whether the element is in the set, as it won’t raise an error if the element is missing.
- pop(): Handy when you want to remove and retrieve an element from the set, though you can’t control which element will be removed.
- clear(): The simplest way to empty a set entirely, removing all elements in one go.
Adding Elements
You can add elements to a set using the add() method for single items or update() for multiple items.
fruits = {"apple", "banana", "cherry"}
# Adding a single element
fruits.add("orange")
print(fruits)
#output:{'cherry', 'banana', 'apple', 'orange'}
# Adding multiple elements
fruits.update(["mango", "grape"])
print(fruits)
#output: {'grape', 'apple', 'banana', 'cherry', 'mango', 'orange'}
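One thing worth knowing: update() iterates over whatever you pass it, so handing it a bare string adds the individual characters rather than the word. The sketch below illustrates this with made-up variable names; sorted() is used only to make the printed order predictable.

```python
basket = {"apple"}

# add() treats the string as a single element.
basket.add("kiwi")

# update() iterates over its argument, so a bare string is split into characters.
letters = set()
letters.update("kiwi")

# update() also accepts several iterables at once.
basket.update(["mango"], ("grape",))

print(sorted(basket))   # ['apple', 'grape', 'kiwi', 'mango']
print(sorted(letters))  # ['i', 'k', 'w']
```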
Removing Elements
You have several options to remove elements from a set: remove(), discard(), pop(), and clear().
fruits = {"apple", "banana", "cherry"}
# Using remove() - raises KeyError if the element is not found
fruits.remove("banana")
print(fruits) #output (order may vary): {'apple', 'cherry'}
# Removing a missing element raises KeyError, so catch it here to keep the script running
try:
    fruits.remove("orange")
except KeyError as error:
    print("KeyError:", error) #output: KeyError: 'orange'
# Using discard() - does not raise an error if the element is not found
fruits.discard("mango")
print(fruits) #output (order may vary): {'apple', 'cherry'}
# Using pop() - removes and returns an arbitrary element
removed_item = fruits.pop()
print("Removed:", removed_item) #output (may vary): Removed: apple
print(fruits) #output (may vary): {'cherry'}
# Using clear() - empties the set
fruits.clear()
print(fruits) #output: set()
Checking Membership
To check if an element is in a set, use the in keyword.
# Check membership
fruits = {"apple", "banana", "cherry"}
print("apple" in fruits) # Output: True
print("grape" in fruits) # Output: False
Set Operations
Union
Union combines elements from two sets, excluding duplicates.
# Using union() method
set1 = {"apple", "banana", "cherry"}
set2 = {"banana", "kiwi", "orange"}
union_set = set1.union(set2)
print(union_set)
#output: {'kiwi', 'orange', 'apple', 'cherry', 'banana'}
# Using | operator
union_set = set1 | set2
print(union_set) #output: {'kiwi', 'orange', 'apple', 'cherry', 'banana'}
More than two sets may be specified with either the operator or the method:
a = {1, 2, 3, 4}
b = {2, 3, 4, 5}
c = {3, 4, 5, 6}
d = {4, 5, 6, 7}
print(a.union(b, c, d)) #output: {1, 2, 3, 4, 5, 6, 7}
print(a | b | c | d) #output: {1, 2, 3, 4, 5, 6, 7}
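There is one subtle difference between the two spellings: the union() method accepts any iterable, while the | operator expects both sides to be sets (or frozensets). Here is a minimal sketch of that behavior; the variable names are illustrative.

```python
a = {1, 2, 3}

# The method accepts any iterable, here a list.
from_method = a.union([3, 4, 5])

# The operator requires both operands to be sets (or frozensets).
try:
    from_operator = a | [3, 4, 5]
except TypeError:
    from_operator = None

print(sorted(from_method))  # [1, 2, 3, 4, 5]
print(from_operator)        # None, because the operator raised TypeError
```

The same rule applies to the other set operators and their method counterparts.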
Intersection
Intersection returns elements that are common to both sets.
# Using intersection() method
set1 = {"apple", "banana", "cherry"}
set2 = {"banana", "kiwi", "orange"}
intersection_set = set1.intersection(set2)
print(intersection_set) #output: {'banana'}
# Using & operator
intersection_set = set1 & set2
print(intersection_set) #output: {'banana'}
Difference
Difference returns elements present in the first set but not in the second.
# Using difference() method
set1 = {"apple", "banana", "cherry"}
set2 = {"banana", "kiwi", "orange"}
difference_set = set1.difference(set2)
print(difference_set) #output: {'cherry', 'apple'}
# Using - operator
difference_set = set1 - set2
print(difference_set) #output: {'cherry', 'apple'}
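Unlike union and intersection, difference depends on which set comes first. A short sketch using the same two sets (sorted() is only there to make the printed order predictable):

```python
set1 = {"apple", "banana", "cherry"}
set2 = {"banana", "kiwi", "orange"}

only_in_set1 = set1 - set2  # elements of set1 that are not in set2
only_in_set2 = set2 - set1  # elements of set2 that are not in set1

print(sorted(only_in_set1))  # ['apple', 'cherry']
print(sorted(only_in_set2))  # ['kiwi', 'orange']
```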
Symmetric Difference
Symmetric Difference returns elements that are in either of the sets, but not in both.
# Using symmetric_difference() method
set1 = {"apple", "banana", "cherry"}
set2 = {"banana", "kiwi", "orange"}
sym_diff_set = set1.symmetric_difference(set2)
print(sym_diff_set) #output: {'cherry', 'kiwi', 'orange', 'apple'}
# Using ^ operator
sym_diff_set = set1 ^ set2
print(sym_diff_set) #output: {'cherry', 'kiwi', 'orange', 'apple'}
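Another way to think about it: the symmetric difference is the union minus the intersection. The sketch below checks that the two formulations agree for the example sets.

```python
set1 = {"apple", "banana", "cherry"}
set2 = {"banana", "kiwi", "orange"}

by_operator = set1 ^ set2
by_definition = (set1 | set2) - (set1 & set2)

print(by_operator == by_definition)  # True
```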
Advanced Set Operations
Subset and Superset
A subset is a set whose elements are all contained in another set, while a superset contains all elements of another set.
# Using issubset() method
set1 = {"apple", "banana", "cherry"}
small_set = {"apple", "banana"}
print(small_set.issubset(set1)) # Output: True
# Using issuperset() method
print(set1.issuperset(small_set)) # Output: True
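These checks also have operator forms: <= and >= for subset and superset, and < and > for the "proper" versions, which additionally require the two sets to differ. A small sketch:

```python
set1 = {"apple", "banana", "cherry"}
small_set = {"apple", "banana"}

print(small_set <= set1)  # True: subset
print(small_set < set1)   # True: proper subset (a subset that is not equal)
print(set1 <= set1)       # True: every set is a subset of itself
print(set1 < set1)        # False: but never a proper subset of itself
print(set1 >= small_set)  # True: superset
```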
Disjoint Sets
Disjoint sets have no elements in common.
# Using isdisjoint() method
set1 = {"apple", "banana", "cherry"}
set3 = {"kiwi", "orange"}
print(set1.isdisjoint(set3)) # Output: True
Frozen Sets
Frozen sets are immutable sets: they cannot be modified after creation, but they support all the non-mutating set operations, such as union, intersection, and membership tests.
frozen_set1 = frozenset([1, 2, 3])
frozen_set2 = frozenset([3, 4, 5])
union_frozen = frozen_set1 | frozen_set2 # frozenset({1, 2, 3, 4, 5})
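Because a frozenset is immutable, it is also hashable, which means it can be used as a dictionary key or as an element of another set. The following sketch shows this; the names and the "editor" label are invented for the example.

```python
frozen = frozenset([1, 2, 3])

# frozenset has no mutating methods such as add() or remove().
has_add = hasattr(frozen, "add")

# A frozenset can be a dict key; element order does not matter for equality.
labels = {frozenset(["read", "write"]): "editor"}

# Equal frozensets collapse to a single element inside another set.
nested = {frozenset([1, 2]), frozenset([2, 1])}

print(has_add)                               # False
print(labels[frozenset(["write", "read"])])  # editor
print(len(nested))                           # 1
```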
Set Comprehensions
Set comprehensions are a concise way to create sets. They follow a similar syntax to list comprehensions.
# Set comprehension
squared_set = {x ** 2 for x in range(10)}
print(squared_set) #output: {0, 1, 64, 4, 36, 9, 16, 49, 81, 25}
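Just like list comprehensions, set comprehensions can include a condition, and duplicates in the result are dropped automatically. Here is a small sketch with a made-up word list:

```python
words = ["apple", "kiwi", "plum", "banana", "cherry"]

# Collect the distinct even word lengths; repeated lengths appear only once.
even_lengths = {len(word) for word in words if len(word) % 2 == 0}

print(sorted(even_lengths))  # [4, 6]
```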
Performance Considerations
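Sets are implemented as hash tables, so a membership test takes roughly constant time on average, while a list has to be scanned element by element. This makes sets much faster than lists for membership tests, especially on large collections.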
# Membership test comparison
import time
list_test = list(range(1000000))
set_test = set(range(1000000))
start = time.time()
print(999999 in list_test) # Output: True
print("List time:", time.time() - start) #output: List time: 0.008748292922973633
start = time.time()
print(999999 in set_test) # Output: True
print("Set time:", time.time() - start) #output: Set time: 5.7220458984375e-06
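A single measurement taken with time.time() can be noisy, so for a steadier comparison you might repeat the lookup many times with the standard timeit module. This is only a sketch; the collection size and repeat count are arbitrary choices, and the actual numbers will differ from machine to machine.

```python
import timeit

size = 100_000
list_test = list(range(size))
set_test = set(range(size))
target = size - 1  # worst case for the list: the last element

# timeit.timeit calls the function `number` times and returns the total seconds.
list_seconds = timeit.timeit(lambda: target in list_test, number=100)
set_seconds = timeit.timeit(lambda: target in set_test, number=100)

print(f"List: {list_seconds:.6f} s for 100 lookups")
print(f"Set:  {set_seconds:.6f} s for 100 lookups")
```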
Common Use Cases
Removing Duplicates from a List
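Convert a list to a set and back to a list to remove duplicates. Keep in mind that sets are unordered, so the result may not keep the original order of the list.
# Removing duplicates
numbers = [1, 2, 2, 3, 4, 4, 5]
unique_numbers = list(set(numbers))
print(unique_numbers) #output (order not guaranteed): [1, 2, 3, 4, 5]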
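When the original order does matter, a common alternative is dict.fromkeys(), which relies on dictionaries keeping insertion order (Python 3.7+). A minimal sketch with an example list of my own:

```python
numbers = [3, 1, 3, 2, 1, 5]

# dict keys are unique and keep the order in which they were first seen.
unique_in_order = list(dict.fromkeys(numbers))

print(unique_in_order)  # [3, 1, 2, 5]
```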
Set Operations in Data Analysis
Sets can be handy for comparing datasets.
# Example: Finding common elements between two datasets
data1 = {"apple", "banana", "cherry"}
data2 = {"banana", "kiwi", "cherry"}
common_data = data1.intersection(data2)
print(common_data) #output: {'cherry', 'banana'}
Text Processing
Find unique words in a text using sets.
# Finding unique words
text = "Python sets are great for finding unique elements in a collection"
words = text.split()
unique_words = set(words)
print(unique_words)
#output: {'collection', 'are', 'Python', 'sets', 'for', 'unique', 'elements', 'great', 'in', 'finding', 'a'}
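Note that a plain split() treats "Sets" and "sets" (or "great." and "great") as different words. If that is not what you want, one possible approach is to lowercase each word and strip surrounding punctuation first. The sample sentence below is made up for this sketch.

```python
import string

text = "Sets are great. Python sets are fast, and sets are simple!"

# Lowercase each word and strip leading/trailing punctuation before collecting.
normalized_words = {
    word.strip(string.punctuation).lower()
    for word in text.split()
}

print(sorted(normalized_words))
# ['and', 'are', 'fast', 'great', 'python', 'sets', 'simple']
```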
Real-World Examples
Example 1: Unique Visitors on a Website
Track unique visitors using sets.
# Tracking unique visitors
visitors = {"user1", "user2", "user3"}
new_visitors = {"user2", "user4"}
unique_visitors = visitors.union(new_visitors)
print(unique_visitors) #output: {'user1', 'user3', 'user4', 'user2'}
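Building on this, set operations can also separate first-time visitors from returning ones before the running total is updated. This is a sketch under simple assumptions: visitor IDs are plain strings, and todays_visitors is a hypothetical list of raw visits that may contain repeats.

```python
visitors = {"user1", "user2", "user3"}
todays_visitors = ["user2", "user4", "user4"]  # raw visits, may repeat

today = set(todays_visitors)
first_time = today - visitors  # seen today but never before
returning = today & visitors   # seen today and before

# Merge today's visitors into the running set of unique visitors.
visitors.update(today)

print(sorted(first_time))  # ['user4']
print(sorted(returning))   # ['user2']
print(len(visitors))       # 4
```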
Example 2: Detecting Common Elements Between Lists
Use sets for finding common elements.
# Finding common elements
list1 = ["apple", "banana", "cherry"]
list2 = ["banana", "kiwi", "cherry"]
common_elements = set(list1).intersection(set(list2))
print(common_elements) #output: {'banana', 'cherry'}
Pitfalls and Best Practices
Common Mistakes
Avoid using {} to create an empty set, as it creates an empty dictionary.
# Correct way to create an empty set
empty_set = set()
Best Practices
Use sets when you need to ensure uniqueness and perform membership tests efficiently.
# Efficient membership test
emails = {"test@example.com", "hello@example.com", "user@example.com"}
print("test@example.com" in emails) # Output: True
Conclusion
We’ve covered the essentials of Python Sets, from creation and basic operations to advanced usage and real-world examples. Sets are powerful tools that can help you write more efficient and readable code, especially when dealing with collections of unique items.
For further reading, check out the official Python documentation on sets. Happy coding!