Day 1: Big-O Analysis and Essential Python Collections

Developer based in India. Passionate learner and blogger. All blogs are basically Notes of Tech Learning Journey.
Welcome to this blog collection where we cover the DSA with Python. This is a scheduled 45 day plan, keeping in mind that we might lose the hands-on skill of programming as we may tend to over-rely on AI for our code related tasks.
As an AI Engineer myself, I decided to restart my daily learnings around the topic of DSA keeping python in the center stage. So lets begin.
Before solving any DSA problem, it is essential to have a basic understanding of two things:
How efficient your code is (Big-O Analysis)
Which Python data structure to use (Collections)
Most interview failures occur not because the candidate cannot solve the problem, but because they choose the wrong data structure.
For example
# Slow
if x in my_list:
# Fast
if x in my_set:
Both produce the same result, but one may take thousands of times longer on large datasets.
Why should we even care?
In AI systems, you constantly work with:
Vocabulary lookups
Feature mappings
Cache systems
Retrieval pipelines
Token IDs
Graph relationships
These all rely heavily on:
Hash tables (
dict)Sets (
set)Queues (
deque)Priority queues (
heapq)
Understanding them will improve both interview performance and real-world engineering.
What is Big O?
Big-O measures how an algorithm grows as input size increases.
Suppose if you declare a variable as n = 10, and some time later, you declare as n = 100000. In this case, will your program become:
10x slower?
100x slower?
1,000,000x slower?
Big-O helps answer that.
| Complexity | Name | Good? |
|---|---|---|
| O(1) | Constant | Excellent |
| O(log n) | Logarithmic | Excellent |
| O(n) | Linear | Good |
| O(n log n) | Linearithmic | Good |
| O(n²) | Quadratic | Usually Bad |
| O(2ⁿ) | Exponential | Very Bad |
Lets see a few examples of various complexities.
Example 1: O(1)
arr = [10, 20, 30]
print(arr[1])
Python immediately knows where index 1 is. No searching needed. So as a result. the time complexity stands O(1).
Example 2: O(n)
arr = [3,5,2,9,1]
for x in arr:
print(x)
In this case, it is required to visit each and every element in the array for printing purpose. So it depends on number of elements in the array. So here, time complexity is O(n).
Example 3: O(n²)
for i in range(n):
for j in range(n):
pass
For every element, we visit every other element. It is a very time consuming process though.
Essential Python Collections
Python interviews heavily rely on four structures:
dict
set
deque
heapq
We will see all these Python collections one by one.
Dictionary : dict
It usually stores key value pairs.
student = {
"name": "Alice",
"age": 22
}
# To access
student["name"]
# Output
# Alice
# Time Complexity : O(1)
But what if dictionaries aren't even there?
students = [
("Alice", 22),
("Bob", 25)
]
# You will need to search the student manually
# Time complexity : O(n)
# Using Dictionary
students["Alice"]
# Then time complexity becomes : O(1)
Example:
student_scores = {
"Alice": 95,
"Bob": 88,
"Charlie": 91
}
print(student_scores["Bob"])
# Output: 88
Set
Set stores unique values.
numbers = {1,2,3}
Suppose, 1,000,000 numbers exist and we need to know about 999999.
We will consider 2 options here.
# Using list
999999 in numbers_list # O(n)
# Using set
999999 in numbers_set # O(1)
It drastically reduces the time complexity.
visited = {1, 2, 3}
print(2 in visited) # True
print(10 in visited) # False
Deque
It basically means, double ended queue, in which we can insert / remove from both ends efficiently.
Before using deque, you will need to import it.
from collections import deque
How is it different then?
Lets say we have a list and we have to pop an element, lets say arr.pop(0), this actually shifts every thing after popping. This is not the case with deque.
from collections import deque
queue = deque()
queue.append(10) # this pops out
queue.append(20)
queue.append(30)
print(queue.popleft()) # 10
Heap / Priority Queue
Heap automatically keeps smallest element on top. These are typically useful for:
Top K problems
Scheduling
AI search algorithms
Beam Search
Before running, you might need to run this:
import heapq
An example:
import heapq
nums = [7, 1, 5, 3]
heapq.heapify(nums)
print(heapq.heappop(nums))
# Output : 1
This is just a refresher, now lets jump to the real problems. For the first day, let us start with a simple problem.
Question : Given a list of integers, determine whether any number appears more than once.
Example : [1 , 2 , 3 , 4 , 5 , 6 , 7 ] => False
This means we need to deal with unique numbers here, so set seems to be the good option.
If number already seen → duplicate found.
Else add it to set.
# Function checks whether duplicates exist
def contains_duplicate(nums):
# Create an empty set
# This will store numbers we have already seen
seen = set()
# Visit every number in the list one by one
for num in nums:
# If num already exists in the set,
# we have found a duplicate
if num in seen:
return True
# Otherwise remember this number
# for future comparisons
seen.add(num)
# If loop finishes,
# no duplicate was found
return False
# Test Case 1
numbers1 = [1, 2, 3, 4]
# Test Case 2
numbers2 = [1, 2, 3, 1]
# Print results
print("Test Case 1:", contains_duplicate(numbers1))
print("Test Case 2:", contains_duplicate(numbers2))
# Output
# Test Case 1: False
# Test Case 2: True
A simple function which utilizes sets to determine if the number is repeated or not. Loop runs "n" times, while for each set lookup, it takes O(1). So the total time complexity is O(n).
Space Complexity is also O(n); as set may hold all items.
Few questions to be ready for:
Q1) Why sets in place of lists?
Set lookup : O(1), list lookup : O(n)
Q2) Difference between dict and set?
dict contains key value pairs, set contains only value.
Q3) Why is dictionary lookup O(1)?
Because Python uses a hash table internally.
Q4) When would you use deque?
When frequent insertions/removals occur at the beginning of a sequence.
Q5) When would you use heap?
When needed these kinds of elements : maximum, minimum, top-k, priority
Cheatsheet of the day
Big-O
O(1) Best
O(log n) Great
O(n) Good
O(n log n) Good
O(n²) Expensive
Dictionary
d = {}
d["a"] = 1
d["a"]
"a" in d
Average:
O(1)
Set
s = set()
s.add(10)
10 in s
Average:
O(1)
Deque
from collections import deque
q = deque()
q.append(x)
q.popleft()
Both:
O(1)
Heap
import heapq
heapq.heappush(heap, x)
heapq.heappop(heap)
Complexity:
O(log n)
Conclusion:
Today you learned:
What Big-O means
Why efficiency matters
Dictionaries (
dict)Sets (
set)Deques (
deque)Heaps (
heapq)How to solve a duplicate-detection problem in O(n) using a set
These four Python structures will appear throughout the next 44 days. Mastering them now will make every later topic easier.
Homework
Solve:
Contains Duplicate
Valid Anagram
Two Sum (attempt, don't worry if you can't finish)
Memorize:
dictsetdequeheapq
Be able to answer:
Why is
setlookup O(1)?When should I use a heap?
Why is
dequebetter thanlist.pop(0)?
Tomorrow (Day 2) we'll begin Arrays & Hash Maps, including the famous Two Sum problem and the hash-map pattern that powers a large fraction of coding interviews.

