Linux flock does not provide fair locking
The problem I’m trying to solve: one machine with X amount of RAM. I have to run N tasks on it, each of which takes more than X/N memory, so they can’t all run at the same time. Luckily, exactly when they run is not particularly important, as long as they run regularly.
Via a suggestion from the whenever gem for Ruby, I discovered Linux flock. At first it seemed like a perfect, simple solution. But then I realized that flock probably can’t do anything to ensure fairness; the processes aren’t guaranteed to run in the order in which they requested the lock.
So I threw together a little test to validate this. I'm sleeping 0.2 seconds between process invocations just in case there is some factor I'm not aware of, like maybe the OS is able to return a pid to spawn before the process "really" starts running, which would mean the processes might not request the lock in the order I anticipate. All nine are queued up before the first one releases the lock (0.2*9 = 1.8 < 2).
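# Spawn 9 shell processes, 0.2 s apart, that all contend for the same lock file.
9.times do |i|
  # Each one logs when it is queued, then logs again after holding the lock for 2 s.
  Process.spawn <<~SHELL
    echo #{i}-queued >> flock-test.log;
    flock /var/lock/john.lock -c 'sleep 2 && echo #{i}-executed >> flock-test.log'
  SHELL
  sleep 0.2
  print '.' # progress indicator
end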
Here's the resulting flock-test.log:
0-queued
1-queued
2-queued
3-queued
4-queued
5-queued
6-queued
7-queued
8-queued
0-executed
2-executed
4-executed
5-executed
6-executed
7-executed
1-executed
8-executed
3-executed
::sad trombone::
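To check a longer run without eyeballing the log, a small script along these lines could compare the two orders. This is only a sketch: `lock_order` is a hypothetical helper name, and it assumes the log contains nothing but lines of the form `N-queued` and `N-executed`, as written by the test.

```ruby
# Hypothetical helper (not part of the original test): collects the order in
# which tasks were queued and the order in which they executed.
def lock_order(log_text)
  queued = []
  executed = []
  log_text.each_line do |line|
    # Match lines such as "3-queued" or "3-executed"; ignore anything else.
    if (m = line.match(/\A(\d+)-(queued|executed)\s*\z/))
      (m[2] == "queued" ? queued : executed) << m[1].to_i
    end
  end
  # The lock behaved as a FIFO queue only if both orders are identical.
  { queued: queued, executed: executed, fifo: queued == executed }
end

# Shortened sample in the same format as flock-test.log.
sample = "0-queued\n1-queued\n2-queued\n0-executed\n2-executed\n1-executed\n"
result = lock_order(sample)
puts "queued:   #{result[:queued].inspect}"
puts "executed: #{result[:executed].inspect}"
puts "fifo:     #{result[:fifo]}"
```

For a real run, the sample string would be replaced with `File.read("flock-test.log")`.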
For many use cases, this doesn’t matter. Assuming that each waiting process has the same likelihood of winning, it will be very unlikely that my processes go hours without being able to run. But they might! In fact, I bet if I did the math I’d find that we could expect an unacceptable delay at a regular rate, say 4 times a year.
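A rough way to start on that math, as a sketch only: assume a fixed number of processes is always waiting and that every lock release hands the lock to one of them uniformly at random. Both are assumptions, not something the test above establishes, and `prob_passed_over` is a hypothetical name. Under that model, the chance that one particular task is skipped on k releases in a row is ((n-1)/n)^k.

```ruby
# Hypothetical model, not measured behavior: assumes `waiters` processes are
# always queued and each lock release picks one of them uniformly at random.
def prob_passed_over(waiters, handoffs)
  raise ArgumentError, "waiters must be >= 1" if waiters < 1
  raise ArgumentError, "handoffs must be >= 0" if handoffs < 0

  # Probability of losing a single handoff, raised to the number of handoffs.
  ((waiters - 1).to_f / waiters)**handoffs
end

# Chance that one particular task is skipped on 20 releases in a row
# when 9 tasks are always waiting.
puts prob_passed_over(9, 20).round(4)
```

Multiplying that probability by the number of lock releases per year gives a rough idea of how often such a streak might begin. The real figure depends on how the kernel actually picks the next waiter, which this sketch doesn't model.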
I looked for other solutions and explored at/batch, but that system doesn’t have the ability to run jobs serially as far as I can tell.
This discussion on Stack Overflow referencing this source code seems to suggest that flock is fair, but then recommends another approach, so maybe I'm misunderstanding. I think the comments in the code apply to a single process using the locks.c library internally, not to independent processes using the same lock file.
If you think my test isn't correct or have other ideas for how to achieve fair serial execution in a simple way, let me know!