Bug Report
Description
The Python implementation in the stable marriage chapter is worst-case Θ(n³), but it should be O(n²).
I have attached an alternative implementation that runs in worst-case quadratic time. I can make a PR if you want to replace the current algorithm.
Steps to Reproduce
Run the code with N = 3000.
Expected behavior
Terminate within seconds.
Screenshots
Benchmark timings (m:ss) for the current implementation vs. the attached one:

N | current (m:ss) | new (m:ss) |
---|---|---|
0 | 0:00.11 | 0:00.08 |
100 | 0:00.10 | 0:00.07 |
200 | 0:00.14 | 0:00.09 |
300 | 0:00.18 | 0:00.13 |
400 | 0:00.28 | 0:00.16 |
500 | 0:00.24 | 0:00.20 |
600 | 0:00.90 | 0:00.27 |
700 | 0:00.71 | 0:00.35 |
800 | 0:01.38 | 0:00.43 |
900 | 0:01.95 | 0:00.55 |
1000 | 0:01.37 | 0:00.65 |
1100 | 0:04.69 | 0:00.80 |
1200 | 0:02.42 | 0:00.95 |
1300 | 0:04.20 | 0:01.15 |
1400 | 0:05.49 | 0:01.29 |
1500 | 0:09.23 | 0:01.55 |
1600 | 0:15.27 | 0:01.76 |
1700 | 0:09.13 | 0:01.93 |
1800 | 0:10.11 | 0:02.19 |
1900 | 0:11.64 | 0:02.27 |
2000 | 0:18.57 | 0:02.49 |
2100 | 0:22.63 | 0:02.89 |
2200 | 0:19.52 | 0:03.13 |
2300 | 0:26.70 | 0:03.43 |
2400 | 0:54.32 | 0:03.72 |
2500 | 1:07.84 | 0:04.16 |
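Note the scaling in the table: doubling N from 1200 to 2400 takes the current implementation from ~2.4 s to ~54 s, far beyond the ~4× slowdown a quadratic algorithm would show, while the new implementation goes from ~0.95 s to ~3.7 s, almost exactly 4×.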
Additional context
from random import sample as shuffle  # sample(xs, len(xs)) returns a shuffled copy of xs


def stable_matching(hospital_preferences, student_preferences):
    students = list(student_preferences)
    hospitals = list(hospital_preferences)
    proposals = {h: 0 for h in hospitals}  # index into h's list of the next student to propose to
    unmatched_hospitals = list(hospitals)
    student = {h: None for h in hospitals}  # hospital -> currently matched student
    hospital = {s: None for s in students}  # student -> currently matched hospital
    # inrank[s][h] is h's position in s's preference list (lower is better);
    # precomputing it makes each "does s prefer h?" check O(1)
    inrank = {s: {} for s in students}
    for s in students:
        for idx, h in enumerate(student_preferences[s]):
            inrank[s][h] = idx
    while unmatched_hospitals:
        h = unmatched_hospitals.pop()
        # h proposes to its best student not yet proposed to
        nxt = proposals[h]
        s = hospital_preferences[h][nxt]
        proposals[h] += 1
        if hospital[s] is None:
            # s is available
            hospital[s] = h
            student[h] = s
        else:
            sh = hospital[s]
            if inrank[s][h] < inrank[s][sh]:
                # s dumps sh for h
                hospital[s] = h
                student[sh] = None
                student[h] = s
                unmatched_hospitals.append(sh)
            else:
                # s prefers its current hospital; h stays unmatched
                unmatched_hospitals.append(h)
    return student


def _generate_instance(N):
    HOSPITALS = [f"h{i}" for i in range(N)]
    STUDENTS = [f"s{i}" for i in range(N)]
    # every hospital ranks all N students uniformly at random, and vice versa
    hospital_preferences = {h: shuffle(STUDENTS, N) for h in HOSPITALS}
    student_preferences = {s: shuffle(HOSPITALS, N) for s in STUDENTS}
    return hospital_preferences, student_preferences


if __name__ == "__main__":
    import sys

    hospital_preferences, student_preferences = _generate_instance(int(sys.argv[1]))
    M = stable_matching(hospital_preferences, student_preferences)
    for h in M:
        print(f"Hospital {h} + Student {M[h]}")
For Algorithm Archive Developers
- The bug can be reproduced
- The bug can be fixed (if not, please explain why not in a comment below)
- There is a timeline to fix the bug
- The bug has been fixed (please link the PR)
Activity
Amaras commented on Sep 25, 2022
Hi!
Thank you for your interest in the Algorithm Archive, as I see this is your first contribution to our repo.
While I recognize our stable marriage code is not the most efficient, I don't think it is a big problem just yet.
Could you please provide a fair comparison between the two algorithms?
This would mean:
- checking that both versions produce the same results;
- using `hyperfine` to compare the two versions easily (see an illustrative invocation below).
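For instance (an illustrative invocation, assuming both scripts take N as a command-line argument): hyperfine -L n 1000,2000 'python stable_marriage.py {n}' 'python sm.py {n}' benchmarks each command at each size and prints a relative-speed summary.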
If you have done all that, I think you can open a PR, in which we'll do a code review. As the code you provide in the issue is confusing (in my opinion), we'll definitely need to improve it before we can merge it in.
pgdr commented on Sep 25, 2022
The results are identical. They both implement the Gale–Shapley algorithm and, interestingly, any implementation of Gale–Shapley produces the same (unique) output regardless of the order in which proposals are made. I have confirmed that the produced matching is the same for both implementations.
`sm.py` is the quadratic-time algorithm attached above, and 1000 means that N = 1000 in every run. They both used the same seed in all ten runs. If necessary, I can try to get `hyperfine` to set a different seed in each run.

Amaras commented on Sep 25, 2022
OK, good. We won't have to check too much, then.
The thing I don't really understand is where you got a `stable_marriage.py` that accepts an argument, which we don't have in the AAA. If that's our code that you modified, you need to provide the modifications you made because, right now, the code just can't accept more than 26 people due to the data representation we chose.

While in general I'd prefer fast code, in this case I'm not sure that's necessary, as the AAA has a mainly pedagogical purpose, although it should definitely talk about performance in a dedicated section.
EDIT: What I failed to mention is that we have to assume people don't have a background in programming or algorithms, so clarity is key, sometimes even at the cost of performance.
leios commented on Sep 26, 2022
This looks good to me. To be honest, the stable marriage chapter is one of the few that need a major overhaul. There does not seem to be any standard implementation that spans the different languages, so the Python version is radically different from the Julia one (for example).
Right now, I find the current Python implementation a bit harder to read than the one you provided, but we might need to work on it to standardize it across the other languages as well.
pgdr commented on Sep 26, 2022
Here is the modification that allowed me to run it for an arbitrary number of pairs
I realize that, and I think that's a good thing. I just don't know if the current SM implementation is optimal in that regard either. I think the OOP-based approach to a relatively simple algorithm makes it more difficult to see what's going on.
The stable marriage algorithm works as follows:
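While some hospital is unmatched, it proposes to the most-preferred student it has not yet proposed to; a free student tentatively accepts, whereas a matched student trades up only if they prefer the new proposer, leaving the dumped hospital unmatched again. The loop ends when every hospital is matched.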
I think that an algorithm in the AAA should be as close in spirit as possible to that one.
A different issue is that I think more modern expositions of the stable matching problem use "hospitals" and "students" rather than "men" and "women". But that's a different discussion.
I could also take a shot at that while I'm at it (if I'm creating a PR anyway).
PPS: I don't think it belongs under decision problems (which typically ask yes/no questions); the stable marriage problem always has a solution. Perhaps it would be better suited under a flows & matchings category?
leios commented on Sep 26, 2022
Yeah, I agree with the points here. Also happy to change the terminology to stable matching instead of stable marriage (and also take down my YouTube video on the topic). At the time, I had not seen the stable matching terminology. I wish I had, because it kind of caused a rift in the community and I didn't know how to fix it.
I think I would ideally like to revamp the entire chapter to be more in line with current standards.
Amaras commented on Sep 26, 2022
@pgdr Yep, I'm okay with those modifications for benchmarking, although we don't typically read from stdin (because of an ongoing-ish effort for standardization of inputs and outputs).
I also agree that OOP might not be the best approach, although it was the approach I went for when I wanted to improve it (might have been misguided).
It is also possible that output to stdout takes a long time, since you're writing 3000 lines (for N=1000).
A quick test without output gave me around 30 seconds with the current version (N=1000), so about a third of what you got, but still significant.
If both of you want a revamp of the chapter, we'll need to rewrite all the implementations anyway, so we might as well go with a compromise solution (your code, but a bit less confusing).
Amaras commented on Sep 26, 2022
After a "quick" profile, these two are the longest running functions:
So the method call overhead seems to be part of the problem, since the time per call of those two functions is really low.
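(Such a profile can be obtained with Python's built-in profiler, for example python -m cProfile -s tottime stable_marriage.py, which sorts functions by the time spent in the function body itself, excluding sub-calls.)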
pgdr commented on Sep 26, 2022
My point is that the worst-case complexity of the current implementation is Θ(n³), so it's not really the method call overhead that's doing it; it's just the sheer number of operations. When N = 3000, we have something like 3000³ ≈ 2.7 × 10^10 "operations", and if each "operation" takes at least a couple of CPU cycles, time starts to fly rather quickly. So I don't think there's much point in profiling the implementation.
pgdr commented on Sep 26, 2022
Indeed, just looking at the number of function calls (for N = 3000):
Current implementation:
Alternative implementation:
That's 2833M vs 62M function calls.
Amaras commented on Sep 26, 2022
Yeah, that's what I meant but didn't articulate properly. The time cost is dominated by method-call (and profiling) overhead due to the sheer number of operations.
Seems like the average-case time complexity is close to Θ(n³) as well.
It's been a while since I computed complexity, so I'll have to take some time to do that on my own.

Okay, I see where that comes from: a linear search (`pick_prefered`) in a double loop... yep, obviously O(n³). With the current architecture, it will be basically impossible to go below O(n² log n), so we should not use it unless it's vastly superior in terms of clarity compared to any improvement.

We'll see when we get around to rewriting the chapter. If @leios is comfortable enough letting you do this chapter rewrite, then please do take a few days to do it, and we'll review both the new code and the new chapter fairly, although that has historically been a very long process.
Otherwise, @leios should (probably) get some AAA work done before Oct 15, so feel free to be part of the chapter review process, and contribute Python code when the chapter is merged.