Stability, in the context of sorting, refers to preserving the relative order of equal elements.

- In an **unstable** sorting algorithm, their order might be reversed in the sorted output.
- In a **stable** sorting algorithm, their relative order remains unchanged.

#### Example

Picture each element as a **programming-language name** followed by the 0-based position it held in the original (unsorted) list:

```
[C#0] [Python1] [C#2] [JavaScript3] [Python4]
```

#### Stable Sort

A **stable** sort keeps items that compare as “equal” in the same left-to-right order they started with.

I. Bring every `"C#"` to the front **without** changing their internal order (`0` still precedes `2`):

```
[C#0] [C#2] [Python1] [JavaScript3] [Python4]
```

II. Next, place `"JavaScript"` ahead of the two `"Python"` entries, again preserving `1` before `4` among the Pythons:

```
[C#0] [C#2] [JavaScript3] [Python1] [Python4]
```

So the stable-sorted sequence is:

```
[C#0] [C#2] [JavaScript3] [Python1] [Python4]
```
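
Because Python's built-in `sorted` is guaranteed stable, this exact outcome is easy to demonstrate (a small sketch; the tuples simply encode a name and its original index):

```python
# Each tuple pairs a language name with its original 0-based index.
data = [("C#", 0), ("Python", 1), ("C#", 2), ("JavaScript", 3), ("Python", 4)]

# Sort by name only. sorted() is stable, so tuples with equal
# names keep their original left-to-right order.
print(sorted(data, key=lambda pair: pair[0]))
# [('C#', 0), ('C#', 2), ('JavaScript', 3), ('Python', 1), ('Python', 4)]
```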

#### Unstable Sort

An **unstable** sort does *not* guarantee that equal items keep their original relative order.

I. While collecting `"C#"` items, the algorithm might emit index `2` *before* index `0`:

```
[C#2] [C#0] [Python1] [JavaScript3] [Python4]
```

II. Later, the two `"Python"` entries can also swap positions (`4` before `1`):

```
[C#2] [C#0] [JavaScript3] [Python4] [Python1]
```

So one possible unstable-sorted sequence is:

```
[C#2] [C#0] [JavaScript3] [Python4] [Python1]
```

This stability property matters when you chain sorts on multiple keys: for instance, first sorting bug reports by **timestamp** and then stable-sorting by **severity**, so that reports of equal severity stay in timestamp order after the second pass.
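
A minimal Python sketch of this two-pass pattern (the bug-report records and their fields are invented for illustration):

```python
# Hypothetical bug reports; "severity" and "timestamp" are made-up fields.
bugs = [
    {"id": "B1", "severity": 2, "timestamp": 14},
    {"id": "B2", "severity": 1, "timestamp": 10},
    {"id": "B3", "severity": 2, "timestamp": 5},
]

# Pass 1: order by the secondary key (timestamp).
by_time = sorted(bugs, key=lambda b: b["timestamp"])

# Pass 2: stable-sort by the primary key (severity); equal severities
# keep the timestamp order established by the first pass.
by_severity = sorted(by_time, key=lambda b: b["severity"])

print([b["id"] for b in by_severity])  # ['B2', 'B3', 'B1']
```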

7573
### Bubble Sort
7674

@@ -88,6 +86,26 @@ Imagine a sequence of numbers. Starting from the beginning of the sequence, we c
8886
4. After the first pass, the largest item will be at the last position. On the next pass, you can ignore the last item and consider the rest of the array.
5. Continue this process for `n-1` passes to ensure the array is completely sorted.

```
Start: [ 5 ][ 1 ][ 4 ][ 2 ][ 8 ]

Pass 1:
[ 5 ][ 1 ][ 4 ][ 2 ][ 8 ] → swap(5,1) → [ 1 ][ 5 ][ 4 ][ 2 ][ 8 ]
[ 1 ][ 5 ][ 4 ][ 2 ][ 8 ] → swap(5,4) → [ 1 ][ 4 ][ 5 ][ 2 ][ 8 ]
[ 1 ][ 4 ][ 5 ][ 2 ][ 8 ] → swap(5,2) → [ 1 ][ 4 ][ 2 ][ 5 ][ 8 ]
[ 1 ][ 4 ][ 2 ][ 5 ][ 8 ] → no swap   → [ 1 ][ 4 ][ 2 ][ 5 ][ 8 ]

Pass 2:
[ 1 ][ 4 ][ 2 ][ 5 ] [8] → no swap   → [ 1 ][ 4 ][ 2 ][ 5 ] [8]
[ 1 ][ 4 ][ 2 ][ 5 ] [8] → swap(4,2) → [ 1 ][ 2 ][ 4 ][ 5 ] [8]
[ 1 ][ 2 ][ 4 ][ 5 ] [8] → no swap   → [ 1 ][ 2 ][ 4 ][ 5 ] [8]

Pass 3:
[ 1 ][ 2 ][ 4 ] [5,8] → all comparisons OK

Result: [ 1 ][ 2 ][ 4 ][ 5 ][ 8 ]
```

#### Optimizations

An important optimization for bubble sort is to keep track of whether any swaps were made during a pass. If a pass completes without any swaps, it means the array is already sorted, and there's no need to continue further iterations.
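
A compact Python sketch of bubble sort with this early-exit flag (illustrative, not tuned):

```python
def bubble_sort(arr):
    """In-place bubble sort with the early-exit optimization."""
    n = len(arr)
    for i in range(n - 1):                # at most n-1 passes
        swapped = False
        # After pass i, the last i elements are already in place.
        for j in range(n - 1 - i):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:                   # no swaps: array already sorted
            break

data = [5, 1, 4, 2, 8]
bubble_sort(data)
print(data)  # [1, 2, 4, 5, 8]
```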

#### Stability

Bubble sort is stable. This means that two objects with equal keys will retain their relative order after sorting.

#### Time Complexity

- In the **worst-case** scenario, the time complexity of bubble sort is $O(n^2)$, which occurs when the array is in reverse order.
- The **average-case** time complexity is also $O(n^2)$, as bubble sort generally requires quadratic time for typical unsorted arrays.
- In the **best-case** scenario, the time complexity is $O(n)$, which happens when the array is already sorted, especially if an optimization like early exit is implemented.

#### Space Complexity

$O(1)$ – The sorting is done in-place, requiring only a constant amount of additional space.

### Selection Sort

Consider an array of numbers. The algorithm divides the array into two parts: a sorted subarray that grows from the left and an unsorted subarray containing the remaining elements.

#### Steps

1. Start with the entire array considered unsorted.
2. Find the smallest element in the unsorted subarray.
3. Swap it with the leftmost element of the unsorted subarray.
4. Move the boundary of the sorted and unsorted subarrays one element to the right.
5. Repeat steps 1-4 until the entire array is sorted.

```
Start:
[ 64 ][ 25 ][ 12 ][ 22 ][ 11 ]

Pass 1: find min(64,25,12,22,11) = 11, swap with first element
[ 11 ][ 25 ][ 12 ][ 22 ][ 64 ]

Pass 2: find min(25,12,22,64) = 12, swap with second element
[ 11 ][ 12 ][ 25 ][ 22 ][ 64 ]

Pass 3: find min(25,22,64) = 22, swap with third element
[ 11 ][ 12 ][ 22 ][ 25 ][ 64 ]

Pass 4: find min(25,64) = 25, swap with fourth element (self-swap)
[ 11 ][ 12 ][ 22 ][ 25 ][ 64 ]

Pass 5: only one element remains, already in place
[ 11 ][ 12 ][ 22 ][ 25 ][ 64 ]

Result:
[ 11 ][ 12 ][ 22 ][ 25 ][ 64 ]
```
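
The same passes in code, as a minimal in-place Python sketch:

```python
def selection_sort(arr):
    """In-place selection sort mirroring the passes shown above."""
    n = len(arr)
    for i in range(n - 1):
        # Find the index of the smallest element in the unsorted part.
        min_idx = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_idx]:
                min_idx = j
        # Swap it to the boundary of the sorted part (may be a self-swap).
        arr[i], arr[min_idx] = arr[min_idx], arr[i]

data = [64, 25, 12, 22, 11]
selection_sort(data)
print(data)  # [11, 12, 22, 25, 64]
```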

#### Stability

Selection sort is inherently unstable. When two elements have equal keys, their relative order might change post-sorting. This can be problematic in scenarios where stability is crucial.

#### Time Complexity

- In the **worst-case**, the time complexity is $O(n^2)$, as even if the array is already sorted, the algorithm still iterates through every element to find the smallest.
- The **average-case** time complexity is also $O(n^2)$, since the algorithm's performance generally remains quadratic regardless of input arrangement.
- In the **best-case**, the time complexity is still $O(n^2)$, unlike other algorithms, because selection sort always performs the same number of comparisons, regardless of the input's initial order.

#### Space Complexity

$O(1)$ – The sorting is done in-place, requiring only a constant amount of additional space.

### Insertion Sort

Imagine you have a series of numbers. The algorithm begins with the second element, takes it as the current "key", and inserts it into its correct position among the already-sorted elements to its left, repeating this for every subsequent element.

#### Steps

1. Start with the second element of the array, treating the first element as a sorted subarray of length one.
2. Compare the current element (the key) with the elements of the sorted subarray, moving from right to left.
3. Shift each element that is larger than the key one position to the right.
4. Insert the current element into the correct position so that the elements before are all smaller.
5. Repeat steps 2-4 for each element in the array.

```
Start:
[ 12 ][ 11 ][ 13 ][ 5 ][ 6 ]

Pass 1: key = 11, insert into [12]
[ 11 ][ 12 ][ 13 ][ 5 ][ 6 ]

Pass 2: key = 13, stays in place
[ 11 ][ 12 ][ 13 ][ 5 ][ 6 ]

Pass 3: key = 5, insert into [11,12,13]
[ 5 ][ 11 ][ 12 ][ 13 ][ 6 ]

Pass 4: key = 6, insert into [5,11,12,13]
[ 5 ][ 6 ][ 11 ][ 12 ][ 13 ]

Result:
[ 5 ][ 6 ][ 11 ][ 12 ][ 13 ]
```
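
A minimal in-place Python sketch of these passes:

```python
def insertion_sort(arr):
    """In-place insertion sort; each pass inserts arr[i] into the sorted prefix."""
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        # Shift larger elements one slot right to open a gap for the key.
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key

data = [12, 11, 13, 5, 6]
insertion_sort(data)
print(data)  # [5, 6, 11, 12, 13]
```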

#### Stability

Insertion sort is stable. When two elements have equal keys, their relative order remains unchanged post-sorting. This stability is preserved since the algorithm only swaps elements if they are out of order, ensuring that equal elements never overtake each other.

#### Time Complexity

- In the **worst-case**, the time complexity is $O(n^2)$, which happens when the array is in reverse order, requiring every element to be compared with every other element.
- The **average-case** time complexity is $O(n^2)$, as elements generally need to be compared with others, leading to quadratic performance.
- In the **best-case**, the time complexity is $O(n)$, occurring when the array is already sorted, allowing the algorithm to simply pass through the array once without making any swaps.

#### Space Complexity

$O(1)$ – Insertion sort works in-place, using only a constant amount of additional space for the current key.

### Quick Sort

Quick Sort, often simply referred to as "quicksort", is a divide-and-conquer algorithm. It picks an element as a pivot, partitions the array around that pivot, and then sorts the two partitions recursively.

#### Steps

1. Choose a pivot element from the array.
2. Partition the array so that elements less than the pivot come before it and elements greater than the pivot come after it.
3. Recursively apply steps 1 and 2 to the left and right partitions.
4. Repeat until base case: the partition has only one or zero elements.

```
Start:
[ 10 ][ 7 ][ 8 ][ 9 ][ 1 ][ 5 ]

Partition around pivot = 5:
  • Compare and swap ↓
[ 1 ][ 7 ][ 8 ][ 9 ][ 10 ][ 5 ]
  • Place pivot in correct spot ↓
[ 1 ][ 5 ][ 8 ][ 9 ][ 10 ][ 7 ]

Recurse on left [1] → already sorted
Recurse on right [8, 9, 10, 7]:

  Partition around pivot = 7:
  [ 7 ][ 9 ][ 10 ][ 8 ]
  Recurse left [] → []
  Recurse right [9, 10, 8]:

    Partition around pivot = 8:
    [ 8 ][ 10 ][ 9 ]
    Recurse left [] → []
    Recurse right [10, 9]:
      Partition pivot = 9:
      [ 9 ][ 10 ]
      → both sides sorted

    → no merge step needed: [8] and [9, 10] already sit in order → [ 8 ][ 9 ][ 10 ]

  → likewise, [7] and [8, 9, 10] fall into place → [ 7 ][ 8 ][ 9 ][ 10 ]

→ likewise, [1, 5] and [7, 8, 9, 10] fall into place → [ 1 ][ 5 ][ 7 ][ 8 ][ 9 ][ 10 ]

Result:
[ 1 ][ 5 ][ 7 ][ 8 ][ 9 ][ 10 ]
```
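
A minimal in-place Python sketch using the same last-element (Lomuto) pivot rule the trace above assumes:

```python
def quick_sort(arr, low=0, high=None):
    """In-place quicksort with Lomuto partitioning (last element as pivot)."""
    if high is None:
        high = len(arr) - 1
    if low < high:
        p = partition(arr, low, high)
        quick_sort(arr, low, p - 1)   # sort the left partition
        quick_sort(arr, p + 1, high)  # sort the right partition

def partition(arr, low, high):
    pivot = arr[high]
    i = low - 1  # boundary of the "smaller than pivot" region
    for j in range(low, high):
        if arr[j] < pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]  # place pivot in its spot
    return i + 1

data = [10, 7, 8, 9, 1, 5]
quick_sort(data)
print(data)  # [1, 5, 7, 8, 9, 10]
```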

#### Stability

Quick sort is inherently unstable due to the long-distance exchanges of values. However, with specific modifications, it can be made stable, although this is not commonly done.

#### Time Complexity

- In the **worst-case**, the time complexity is $O(n^2)$, which can occur when the pivot is the smallest or largest element, resulting in highly unbalanced partitions. However, with effective pivot selection strategies, this scenario is rare in practice.
- The **average-case** time complexity is $O(n \log n)$, which is expected when using a good pivot selection method that balances the partitions reasonably well.
- In the **best-case**, the time complexity is also $O(n \log n)$, occurring when each pivot divides the array into two roughly equal-sized parts, leading to optimal partitioning.

#### Space Complexity

$O(\log n)$ – The partitioning itself is in-place, but the recursion stack uses logarithmic space on average (and up to $O(n)$ in the worst, highly unbalanced case).

### Heap Sort

Heap Sort is a comparison-based sorting technique performed on a binary heap data structure.

#### Conceptual Overview

1. The first step is to **build a max heap**, which involves transforming the list into a max heap (a complete binary tree where each node is greater than or equal to its children). This is typically achieved using a bottom-up approach to ensure the heap property is satisfied. *(Building the heap with Floyd's bottom-up procedure costs $\Theta(n)$ time, lower than $\Theta(n \log n)$, so it never dominates the overall running time.)*
2. During **sorting**, the maximum element (the root of the heap) is swapped with the last element of the unsorted portion of the array, placing the largest element in its final position. **After each swap, the newly “fixed” maximum stays at the end of the *same* array; the active heap is simply the prefix that remains unsorted.** The heap size is then reduced by one, and the unsorted portion is restructured into a max heap. This process continues until the heap size is reduced to one, completing the sort.

#### Steps

1. Build a max heap from the input array.
2. Swap the root of the heap (the current maximum) with the last element of the heap.
3. Reduce the heap size by one, leaving the swapped-out maximum in its final position.
4. "Heapify" the root of the tree, i.e., ensure the heap property is maintained.
5. Repeat steps 2-4 until the size of the heap is one.

```
Initial array (size n = 5)   index: 0 1 2 3 4

        4         [4,10,3,5,1]
       / \
     10   3
     / \
    5   1

↓ BUILD MAX-HEAP (Θ(n)) → heapSize = 5

       10         [10,5,3,4,1]
       / \
      5   3
     / \
    4   1
```

**Pass 1 extract-max**

```
swap 10 ↔ 1    [1,5,3,4 | 10]   heapSize = 4
                ↑ live heap ↑    ↑fixed↑
heapify (1↔5, 1↔4) → [5,4,3,1 | 10]

        5
       / \
      4   3
     /
    1
```

**Pass 2 extract-max**

```
swap 5 ↔ 1     [1,4,3 | 5,10]   heapSize = 3
heapify (1↔4) → [4,1,3 | 5,10]

        4
       / \
      1   3
```

**Pass 3 extract-max**

```
swap 4 ↔ 3     [3,1 | 4,5,10]   heapSize = 2
(no heapify needed – root already ≥ child)

        3
       /
      1
```

**Pass 4 extract-max**

```
swap 3 ↔ 1     [1 | 3,4,5,10]   heapSize = 1
(heap of size 1 is trivially a heap)
```

**Pass 5 extract-max**

```
Done – heapSize = 0
Sorted array: [1,3,4,5,10]
```
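
A minimal Python sketch of these passes with an *iterative* sift-down (the variant the space-complexity note below assumes):

```python
def heap_sort(arr):
    """In-place heap sort; the live heap is the unsorted prefix of arr."""
    n = len(arr)
    # Build a max heap bottom-up (Floyd's method, Θ(n)).
    for i in range(n // 2 - 1, -1, -1):
        sift_down(arr, i, n)
    # Repeatedly move the max to the end of the shrinking heap prefix.
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]  # extract-max
        sift_down(arr, 0, end)               # restore heap on the prefix

def sift_down(arr, root, heap_size):
    """Iterative sift-down: O(1) auxiliary space, no recursion stack."""
    while True:
        largest = root
        left, right = 2 * root + 1, 2 * root + 2
        if left < heap_size and arr[left] > arr[largest]:
            largest = left
        if right < heap_size and arr[right] > arr[largest]:
            largest = right
        if largest == root:
            return
        arr[root], arr[largest] = arr[largest], arr[root]
        root = largest

data = [4, 10, 3, 5, 1]
heap_sort(data)
print(data)  # [1, 3, 4, 5, 10]
```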

#### Stability

Heap sort is inherently unstable. Similar to quicksort, the relative order of equal items is not preserved because of the long-distance exchanges.

#### Time Complexity

- In the **worst-case**, the time complexity is $O(n \log n)$, regardless of the arrangement of the input data.
- The **average-case** time complexity is also $O(n \log n)$, as the algorithm's structure ensures consistent performance.
- In the **best-case**, the time complexity remains $O(n \log n)$, since building and deconstructing the heap is still necessary, even if the input is already partially sorted.

#### Space Complexity

$O(1)$ – The sorting is done in-place, requiring only a constant amount of auxiliary space. **This assumes an *iterative* `siftDown`/`heapify`; a recursive version would add an $O(\log n)$ call stack.**

#### Implementation