LeetCode #295: Find Median from Data Stream
https://leetcode.com/problems/find-median-from-data-stream/description/
The problem asks us to find the median of a stream of integers as they are being added. We can solve this problem using heaps. Specifically, we can maintain two heaps, a max heap for the smaller half of the data and a min heap for the larger half of the data. The median will either be the largest element in the small heap or the smallest element in the large heap (if the heaps are of equal size, the median will be the average of the two). We will add elements to the heaps in a balanced way, such that the difference in size between the two heaps is at most one.
Here are the steps to solve the problem:
- Initialize two heaps, one max heap for the smaller half and one min heap for the larger half.
- When a new integer is added to the stream, add it to the max heap first. Then, if the max heap is not empty and the new integer is greater than the largest integer in the max heap, remove the largest integer from the max heap and add it to the min heap.
- If the size of the max heap is more than one greater than the size of the min heap, remove the largest integer from the max heap and add it to the min heap.
- If the size of the min heap is more than one greater than the size of the max heap, remove the smallest integer from the min heap and add it to the max heap.
- To find the median, if the size of the max heap is greater than the size of the min heap, the median is the largest integer in the max heap. If the size of the min heap is greater than the size of the max heap, the median is the smallest integer in the min heap. If the heaps are of equal size, the median is the average of the largest integer in the max heap and the smallest integer in the min heap.
Here is the implementation of the solution in Python:
import heapq
class MedianFinder:
def __init__(self):
self.small = []
self.large = []
def addNum(self, num: int) -> None:
heapq.heappush(self.small, -1*num)
if self.small and self.large and (-1 * self.small[0]) > self.large[0]:
val = -1 * heapq.heappop(self.small)
heapq.heappush(self.large, val)
if len(self.small) > len(self.large) + 1:
val = -1 * heapq.heappop(self.small)
heapq.heappush(self.large, val)
if len(self.large) > len(self.small) + 1:
val = heapq.heappop(self.large)
heapq.heappush(self.small, -1 * val)
def findMedian(self) -> float:
if len(self.small) > len(self.large):
return -1 * self.small[0]
elif len(self.large) > len(self.small):
return self.large[0]
return (-1 * self.small[0] + self.large[0]) / 2
Time and Space Complexity
The time complexity of adding a new integer to the heap is O(log n), since we are using the heapq module. The space complexity is O(n), where n is the number of integers added to the stream.