2025-09-18 20:29:37 -04:00
|
|
|
I just realized that my spinqueueing mechanism is highly power inefficient if a
|
|
|
|
|
lock needs to be held across a "true async wait"—where the async sequence
|
|
|
|
|
actually waits on a hardware bottleneck. In this case, the thread acquires the
|
|
|
|
|
spinlock, then goes to sleep in the kernel schedq until some hardware event
|
|
|
|
|
occurs, and is then awakened—all while still holding the spinlock.
|
|
|
|
|
|
|
|
|
|
Meanwhile, other sequences running on other threads and contending for that lock
|
|
|
|
|
will be Qspinning. This is acceptable if all I care about is maximum throughput:
|
|
|
|
|
the Qspinning just re-posts the sequences back into the Q, and eventually
|
|
|
|
|
they'll acquire the LockSet and proceed.
|
|
|
|
|
|
|
|
|
|
Importantly, since the thread itself isn't slept in the kschedQ, it will be
|
|
|
|
|
deQing and processing other sequences that aren't bottlenecked on the lock held
|
|
|
|
|
by the sequence waiting for the hardware response. Throughput is indeed
|
|
|
|
|
maximized.
|
|
|
|
|
|
|
|
|
|
However, I just realized that if kernel mutexes expose FD events, I can apply
|
|
|
|
|
this same logic to sleeplocks: I can wait on the sleeplocks asynchronously
|
|
|
|
|
instead of synchronously. If I can make my asio::io_service wait on all the
|
|
|
|
|
mutex FDs requested by sequences on the current thread, then in theory I can put
|
|
|
|
|
the thread to sleep and know that when the mutex becomes available, I'll be
|
|
|
|
|
awakened again.
|
|
|
|
|
|
|
|
|
|
Hence, I can get the best of both worlds: maximum throughput and power saving.
|
|
|
|
|
Instead of spinqueueing, we just add the lock FDs to an FD set to be waited on
|
|
|
|
|
by asio. If any of those locks become available, the kernel scheduler will
|
|
|
|
|
awaken our asioQ thread, and we can then awaken and retry the lock.
|
|
|
|
|
|
|
|
|
|
## Boost asio queue-based sleep locking:
|
|
|
|
|
|
|
|
|
|
Instead of using FDs, we can also try to use a fifo Q based mechanism: each lock
|
|
|
|
|
is a spinlock and a fifo queue.
|
|
|
|
|
|
|
|
|
|
Acquire:
|
|
|
|
|
```
|
|
|
|
|
lock(spinlock);
|
|
|
|
|
q.push_back(self);
|
|
|
|
|
head = q.peek_front();
|
|
|
|
|
if (head == self) {
|
|
|
|
|
// We acquired the lock.
|
|
|
|
|
unlock(spinlock);
|
|
|
|
|
return;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
unlock(spinlock);
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
release:
|
|
|
|
|
```
|
|
|
|
|
lock(spinlock);
|
|
|
|
|
// Should get back ourself.
|
|
|
|
|
q.pop_front();
|
|
|
|
|
// Wake up the next request in the q.
|
|
|
|
|
head = q.peek_front();
|
|
|
|
|
if (head == NULL) {
|
|
|
|
|
// Nobody was waiting.
|
|
|
|
|
unlock(spinlock);
|
|
|
|
|
return;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
head.thread_to_wake.getIoService().post([]{
|
|
|
|
|
// This lambda causes thread_to_wake to check this lock's
|
|
|
|
|
// Q and then proceed to execute since it now owns the lock.
|
|
|
|
|
});
|
|
|
|
|
unlock(spinlock);
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Something like this: it causes the entire thing to be, at least ostensibly,
|
|
|
|
|
in userspace -- though idk how Boost handles its queues internally.
|
|
|
|
|
|
|
|
|
|
## Priortizing LockSets:
|
|
|
|
|
|
|
|
|
|
One problem we have with a FIFO-based sleeping system is that it makes it very
|
|
|
|
|
unlikely that LockSets will ever acquire all of their locks, if there are
|
|
|
|
|
contenders for those same locks who only need to acquire one of the locks in
|
|
|
|
|
that LockSet.
|
|
|
|
|
|
|
|
|
|
We could theoretically give locksets an advantage by not making them backout
|
|
|
|
|
if they fail to acquire all locks in their set. I.e: if they get 2/3, then they
|
|
|
|
|
hold those 2 and then wait for the 3rd. This is problematic because it leaves
|
|
|
|
|
room open for deadlocks in the form of T1 and T2 needing both LockA and LockB,
|
|
|
|
|
but they acquire them in reverse order. I.e: T1 takes LockA and now waits for
|
|
|
|
|
LockB; and T2 takes LockB and now waits for LockA. This will now happen among
|
|
|
|
|
the LockSets if we don't impose backing out. It may be possible to avoid this
|
|
|
|
|
using very careful lock ordering and dependency analysis but this project is
|
|
|
|
|
asynchronous the locking is done in the async sequences and not in the sync
|
|
|
|
|
accessor functions. So this kind of analysis is almost impossible to do.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
We need to think of a way to make the FIFOs biased toward LockSets so that they
|
|
|
|
|
have an advantage over single-lock acquirers. Or else LockSet sequences will be
|
|
|
|
|
starved.
|
|
|
|
|
|
|
|
|
|
### Timed backoff:
|
|
|
|
|
|
|
|
|
|
We could have Locksets be greedy and try to hold on to the locks they've
|
|
|
|
|
acquired (say, 2/3 and then wait for the 3rd) but then be forced to backoff
|
|
|
|
|
after a timeout.
|
|
|
|
|
|
|
|
|
|
This introduces async event complexity and also the timeout we choose is almost
|
|
|
|
|
guaranteed to be arbitrary.
|
|
|
|
|
|
|
|
|
|
### Fractionally inserted FIFOs:
|
|
|
|
|
|
|
|
|
|
We insert sequences with a LockSet.size() of 1, at the back.
|
|
|
|
|
We insert all other sequences (>1) into first 1/LockSet.size()th position in the
|
|
|
|
|
Queue.
|
|
|
|
|
So a Lockset of size 2 will be inserted at the end of the first half of the
|
|
|
|
|
items in the queue.
|
|
|
|
|
A Lockset of size 3 will be inserted at the end of the first 33% of items.
|
|
|
|
|
A lockset of size 4 will be inserted at the end of the first 25% of items.
|
|
|
|
|
And so on.
|
|
|
|
|
|
|
|
|
|
This ensures that higher LockSet.size()s will be prioritized ever higher, and
|
|
|
|
|
at the same time they don't completely hog everything. Those single-lock
|
|
|
|
|
sequences that have already naturally progressed past the fraction-mark of a
|
|
|
|
|
given LockSet size will continue making progress toward the front.
|
|
|
|
|
|
|
|
|
|
For queueing sequences with Locksets>1, we can enQ them on the FIFO of the first
|
|
|
|
|
lock in their set. They'll back off each time anyway, so they'll always be
|
|
|
|
|
re-trying from the first lock in their set each time.
|
|
|
|
|
|
|
|
|
|
#### Impl details:
|
|
|
|
|
|
|
|
|
|
We'd like to use std::unordered_set because insertion will require lots of
|
|
|
|
|
moving items around, but we'll have to use std::vector because we need direct
|
|
|
|
|
access to insert at arbitrary fractional indexes. It's unlikely the number of
|
|
|
|
|
items in any lock's Q will ever be large enough to require lots of displacement,
|
|
|
|
|
but welp there's no reason not to plan for scaling. Although if we end up
|
|
|
|
|
needing scaling that's a symptom of a bigger problem...with scaling itself lol.
|
|
|
|
|
There shouldn't be enough items blocked on a lock that we have to design the
|
|
|
|
|
lock's queue to be scalable.
|
|
|
|
|
|
|
|
|
|
### Inverted Fractionally acquired locksets:
|
|
|
|
|
|
|
|
|
|
The previous ideas of fractionally inserted lockQs was okay, but the acquisition
|
|
|
|
|
algo required that the async seq be at the front of a locks queue to
|
|
|
|
|
successfully acquire that lock. That makes it almost impossible for Locksets>1
|
|
|
|
|
to ever acquire all of their locks. If we add backoff to that, it basically
|
|
|
|
|
means no lockset will ever acquire all of its locks.
|
|
|
|
|
|
|
|
|
|
Instead what we now do is always insert at the rear (push_back()) and then when
|
|
|
|
|
acquiring, we check to see if the sequence is in the first
|
|
|
|
|
1/(1/(LockSet.size())), and if so, it successfully acquires the lock. I.e: if
|
|
|
|
|
the sequence item isn't in the LAST 1/(LockSet.size()) items, then it succeeds.
|
|
|
|
|
* For a lockset of size=1: It must be at the front of the queue.
|
|
|
|
|
* Lockset.size=2: it must be in the first 50% of items.
|
|
|
|
|
* Lockset.size=3: it must be in the first 66% of items.
|
|
|
|
|
* Lockset.size=4: It must be in the first 75% of items.
|
|
|
|
|
|
|
|
|
|
So this way larger LockSets are favoured, but 1-size locksets make progress.
|
|
|
|
|
|
|
|
|
|
For performance:
|
|
|
|
|
* We obv can just scan the smaller tail percentage for the item instead of
|
|
|
|
|
scanning the larger front percentage.
|
|
|
|
|
* If we use a doubly-linked list, we can prolly keep the insertion iterator
|
|
|
|
|
and this way we won't have to actually find the item in the lockQ when we wish
|
|
|
|
|
to eventually remove it from the lockQ when releasing the lock.
|
|
|
|
|
|
|
|
|
|
## Total overall design:
|
|
|
|
|
|
|
|
|
|
### Asio queues and Lockvokers:
|
|
|
|
|
|
|
|
|
|
Lockvokers are initially enqueued on a CompThread's queue. When the lockvoker
|
|
|
|
|
first runs, it checks a flag to see if it has been "registered" into the queues
|
|
|
|
|
for all locks in its set. If not, then it "registers" itself in each lock's
|
|
|
|
|
ticketQ and then attempts to acquire each lock. Registration and acquisition
|
|
|
|
|
are logically separate operations; and locks will often attempt acquisition
|
|
|
|
|
many times after first registering, without needing to register again. Ideally
|
|
|
|
|
we can implement a LockSet::registerAndTryAcquireAll() method, but that's for
|
|
|
|
|
us to think about later.
|
|
|
|
|
|
|
|
|
|
```
|
2025-09-19 01:18:19 -04:00
|
|
|
/* We'll need to rename current class LockSpec to LockSet. */
|
2025-09-18 20:29:37 -04:00
|
|
|
class LockSet
|
|
|
|
|
{
|
|
|
|
|
/* Add this either inside of LockSet or outside of it -- depends on whether
|
|
|
|
|
* it's we can get it to compile because I'm seeing some potential circular
|
|
|
|
|
* definition dependencies.
|
|
|
|
|
*/
|
2025-09-19 01:18:19 -04:00
|
|
|
typedef std::pair<Qutex, LockerAndInvokerList::iterator>
|
|
|
|
|
LockUsageDesc;
|
2025-09-18 20:29:37 -04:00
|
|
|
|
2025-09-19 01:18:19 -04:00
|
|
|
/* Find a LockUsageDesc -- useful below */
|
|
|
|
|
LockUsageDesc &getLockUsageDesc(Qutex &criterionLock)
|
2025-09-18 20:29:37 -04:00
|
|
|
{
|
|
|
|
|
for (auto &reqLock: requiredLocks) {
|
|
|
|
|
if (reqLock.first == &criterionLock) { return reqLock; }
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Should never happen.
|
|
|
|
|
throw;
|
|
|
|
|
}
|
|
|
|
|
};
|
|
|
|
|
|
|
|
|
|
LockSet::register(LockerAndInvoker &lockvoker)
|
|
|
|
|
{
|
|
|
|
|
for (auto &lock: lockset.locks) {
|
|
|
|
|
// Register the Lockvoker object in each lock's ticketQ.
|
|
|
|
|
lock.second = lock.first.register(lockvoker);
|
|
|
|
|
}
|
|
|
|
|
registered = true;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
bool LockSet::tryAcquire(LockerAndInvoker &lockvoker)
|
|
|
|
|
{
|
|
|
|
|
if (!registered) {
|
|
|
|
|
// Should never happen.
|
|
|
|
|
throw ...;
|
|
|
|
|
}
|
|
|
|
|
int nLocksAcquired=0,
|
|
|
|
|
nLocksInSet = lockset.size();
|
|
|
|
|
for (auto &lock: lockset.locks) {
|
|
|
|
|
if (!lock.first.tryAcquire(nLocksInSet)) {
|
|
|
|
|
break;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
nLocksAcquired++;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
if (nLocksAcquired == nLocksInSet) {
|
|
|
|
|
// Success
|
|
|
|
|
return true;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
for (int i=0; i<nLocksAcquired; i++) {
|
|
|
|
|
// Backoff does different stuff from release();
|
|
|
|
|
locks[i].first.backoff(lockvoker);
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
LockSet::release()
|
|
|
|
|
{
|
|
|
|
|
for (auto &lock: requiredLocks) {
|
|
|
|
|
lock.release();
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
2025-09-19 01:18:19 -04:00
|
|
|
Now, the Qutex class is what we'll use for synchronization. It's just a
|
2025-09-18 20:29:37 -04:00
|
|
|
combination of 2 atomic<bool> and a std::list.
|
|
|
|
|
|
|
|
|
|
```
|
|
|
|
|
class SpinLock
|
|
|
|
|
{
|
|
|
|
|
/* Modify to add methods acquire() and release() which busy-wait.
|
|
|
|
|
*/
|
|
|
|
|
void acquire();
|
|
|
|
|
void release();
|
|
|
|
|
};
|
|
|
|
|
|
|
|
|
|
class LockSet
|
|
|
|
|
{
|
|
|
|
|
/* Modify the std::vector of SpinLock to instead be:
|
2025-09-19 01:18:19 -04:00
|
|
|
* std::vector<LockUsageDesc> locks;
|
2025-09-18 20:29:37 -04:00
|
|
|
*/
|
2025-09-19 01:18:19 -04:00
|
|
|
std::vector<LockUsageDesc> locks;
|
2025-09-18 20:29:37 -04:00
|
|
|
}
|
|
|
|
|
|
2025-09-19 18:52:27 -04:00
|
|
|
bool LockerAndInvoker::operator==(const LockerAndInvoker &other)
|
|
|
|
|
{
|
|
|
|
|
/* Compare by the address of the continuation objects. Why?
|
|
|
|
|
* Because there's no guarantee that the lockvoker object that was
|
|
|
|
|
* passed in by the io_service invocation is the same object as that
|
|
|
|
|
* which is in the qutexQs. Especially because we make_shared() a
|
|
|
|
|
* copy when registerInQutexQueues()ing.
|
|
|
|
|
*
|
|
|
|
|
* Generally when we "wake" a lockvoker by enqueuing it, boost's
|
|
|
|
|
* io_service::post will copy the lockvoker object.
|
|
|
|
|
*/
|
|
|
|
|
return &this->serializedContinuation == &other.serializedContinuation;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
bool LockerAndInvoker::operator !=(const LockerAndInvoker &other)
|
|
|
|
|
{
|
|
|
|
|
return &this->serializedContinuation != &other.serializedContinuation;
|
|
|
|
|
}
|
|
|
|
|
|
2025-09-19 01:18:19 -04:00
|
|
|
class Qutex
|
2025-09-18 20:29:37 -04:00
|
|
|
{
|
|
|
|
|
public:
|
2025-09-19 01:18:19 -04:00
|
|
|
typedef std::list<LockerAndInvoker> LockerAndInvokerList;
|
|
|
|
|
|
|
|
|
|
LockerAndInvokerList::iterator register(const LockerAndInvoker &lockvoker)
|
2025-09-18 20:29:37 -04:00
|
|
|
{
|
|
|
|
|
/** EXPLANATION:
|
|
|
|
|
* Just insert the lockvoker into the rear of the list.
|
|
|
|
|
*
|
|
|
|
|
* Then, since we want to store the
|
|
|
|
|
*/
|
2025-09-19 01:18:19 -04:00
|
|
|
LockerAndInvokerList::iterator it;
|
2025-09-18 20:29:37 -04:00
|
|
|
|
|
|
|
|
lock.acquire();
|
|
|
|
|
queue.push_back(lockvoker);
|
|
|
|
|
it = queue.end();
|
|
|
|
|
--it;
|
|
|
|
|
lock.release();
|
|
|
|
|
|
|
|
|
|
return it;
|
|
|
|
|
}
|
|
|
|
|
|
2025-09-19 18:52:27 -04:00
|
|
|
void unregister(LockerAndInvokerList::iterator it, bool shouldLock=1)
|
2025-09-18 20:29:37 -04:00
|
|
|
{
|
2025-09-19 18:52:27 -04:00
|
|
|
if (shouldLock)
|
|
|
|
|
{
|
|
|
|
|
lock.acquire();
|
|
|
|
|
queue.erase(it);
|
|
|
|
|
lock.release();
|
|
|
|
|
}
|
|
|
|
|
else{
|
|
|
|
|
queue.erase(it);
|
|
|
|
|
}
|
2025-09-18 20:29:37 -04:00
|
|
|
}
|
|
|
|
|
|
2025-09-19 18:52:27 -04:00
|
|
|
bool tryAcquire(LockerAndInvoker &tryingLockvoker)
|
2025-09-18 20:29:37 -04:00
|
|
|
{
|
2025-09-19 18:52:27 -04:00
|
|
|
const nRequiredLocks = tryingLockvoker.serializedContinuation
|
|
|
|
|
.requiredLocks.size();
|
|
|
|
|
|
2025-09-18 20:29:37 -04:00
|
|
|
lock.acquire();
|
2025-09-19 18:52:27 -04:00
|
|
|
|
|
|
|
|
const qNItems = queue.size();
|
2025-09-18 20:29:37 -04:00
|
|
|
|
|
|
|
|
if (qNItems < 1) {
|
|
|
|
|
lock.release();
|
|
|
|
|
|
|
|
|
|
/** EXPLANATION:
|
|
|
|
|
* requiredLocks before ever trying to tryAcquire() them, so if
|
|
|
|
|
* tryAcquire is being called, that must mean that queue.size() > 0.
|
|
|
|
|
*
|
|
|
|
|
* Ergo this should never happen.
|
|
|
|
|
*/
|
|
|
|
|
throw;
|
|
|
|
|
}
|
|
|
|
|
|
2025-09-19 18:52:27 -04:00
|
|
|
if (!!currentOwner) {
|
2025-09-18 20:29:37 -04:00
|
|
|
lock.release();
|
|
|
|
|
return false;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/** EXPLANATION:
|
|
|
|
|
* From here:
|
|
|
|
|
* if qNItems == 1 the we are the only one in the ticketQ and we have
|
|
|
|
|
* successfully acquired the lock.
|
|
|
|
|
* If qNitems / nRequiredLocks == 0, then we acquire by default since
|
|
|
|
|
* the number of items in the ticketQ guarantees that we are in the top
|
|
|
|
|
* X% for that nRequiredLocks.
|
|
|
|
|
* If qNItems / nRequiredLocks >= 1, then we must do the normal algo:
|
|
|
|
|
* Check the last (qNItems/nRequiredLocks) items, and if the item isn't
|
|
|
|
|
* in those items, then it must be in the earlier ones (obviously).
|
|
|
|
|
* Hence this Lockvoker acquisition should be considered successful.
|
|
|
|
|
*
|
|
|
|
|
* EXPLANATION 2:
|
|
|
|
|
* You'll notice that we don't do actual percentages but rather we just
|
|
|
|
|
* do discrete fractions -- this makes the algo more deterministic
|
|
|
|
|
* and much easier to reason about. I.e:
|
|
|
|
|
* If nRequiredLocks is 6 and qNItems==3:
|
|
|
|
|
* we don't actually calculate that the Lockvoker item must be in
|
|
|
|
|
* the top (100-17%), and then try to calculate whether we ought to
|
|
|
|
|
* consider the 3rd item to be in the last 17-percentile. We just
|
|
|
|
|
* do a fractional count and assume complete discreteness.
|
|
|
|
|
*/
|
|
|
|
|
const int nRearItemsToScan = qNItems / nRequiredLocks;
|
|
|
|
|
|
|
|
|
|
if (qNItems == 1 || nRearItemsToScan < 1) {
|
2025-09-19 18:52:27 -04:00
|
|
|
currOwner = tryingLockvoker;
|
2025-09-18 20:29:37 -04:00
|
|
|
lock.release();
|
|
|
|
|
return true;
|
|
|
|
|
}
|
|
|
|
|
|
2025-09-19 18:52:27 -04:00
|
|
|
/** EXPLANATION:
|
|
|
|
|
* For lockvokers that only have 1 requiredLock, they must be at the
|
|
|
|
|
* front of the queue to successfully acquire.
|
|
|
|
|
*/
|
|
|
|
|
if (nRequiredLocks == 1)
|
|
|
|
|
{
|
|
|
|
|
if (tryingLockvoker == &queue.front())
|
|
|
|
|
{
|
|
|
|
|
currOwner = tryingLockvoker;
|
|
|
|
|
lock.release();
|
|
|
|
|
return true;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2025-09-18 20:29:37 -04:00
|
|
|
auto rIt = queue.rbegin();
|
|
|
|
|
auto rEndIt = queue.rend();
|
|
|
|
|
bool foundInRear = false;
|
|
|
|
|
for (int i=0; i<nRearItemsToScan && rIt != rEndIt; rIt++, i++)
|
|
|
|
|
{
|
2025-09-19 18:52:27 -04:00
|
|
|
if (*rIt != tryingLockvoker) { continue; }
|
2025-09-18 20:29:37 -04:00
|
|
|
|
|
|
|
|
foundInRear == true;
|
|
|
|
|
break;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
if (foundInRear) {
|
|
|
|
|
lock.release();
|
|
|
|
|
return false;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/* Not found in rear: this means the item is in the top X%. That means
|
|
|
|
|
* it should be allowed to claim the lock.
|
|
|
|
|
*/
|
2025-09-19 18:52:27 -04:00
|
|
|
currOwner = tryingLockvoker;
|
2025-09-18 20:29:37 -04:00
|
|
|
lock.release();
|
|
|
|
|
return true;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
backoff(LockerAndInvoker &failedAcquirer)
|
|
|
|
|
{
|
|
|
|
|
lock.acquire();
|
|
|
|
|
|
|
|
|
|
// Rotate queue members if failedAcquirer is at front of queue.
|
|
|
|
|
LockerAndInvoker &currFront = queue.front();
|
|
|
|
|
if (currFront == failedAcquirer)
|
|
|
|
|
{
|
|
|
|
|
/** EXPLANATION:
|
|
|
|
|
* Rotate the top LockSet.size() items in the queue by moving
|
|
|
|
|
* the failedAcquirer to the last position in the top
|
|
|
|
|
* LockSet.size() items within the queue.
|
|
|
|
|
*
|
|
|
|
|
* I.e: if queue.size()==20, and lockSet.size()==5, then move
|
|
|
|
|
* failedAcquirer from the front the 5th position in the queue,
|
|
|
|
|
* which should push the other 4 items forward.
|
|
|
|
|
* If queue.size==3 and LockSet.size()==5, then just
|
|
|
|
|
* push_back(failedAcquirer).
|
|
|
|
|
*
|
2025-09-19 01:18:19 -04:00
|
|
|
* It is impossible for a Qutex queue to have only one
|
2025-09-18 20:29:37 -04:00
|
|
|
* item in it, yet for that Lockvoker item to have failed to
|
2025-09-19 01:18:19 -04:00
|
|
|
* acquire the Qutex. Being the only item in the ticketQ
|
|
|
|
|
* means that you must succeed at acquiring the Qutex.
|
2025-09-18 20:29:37 -04:00
|
|
|
*/
|
|
|
|
|
int swapPosition = min(
|
|
|
|
|
queue.size(),
|
|
|
|
|
failedAcquirer.serializedContinuation.requiredLocks.size());
|
|
|
|
|
|
|
|
|
|
/* EXPLANATION:
|
|
|
|
|
* Swap them here.
|
|
|
|
|
*
|
|
|
|
|
* The reason why we do this swap is to avoid a particular kind of
|
|
|
|
|
* deadlock wherein a grid of async requests is perfectly configured
|
|
|
|
|
* so as to guarantee that none of them can make any forward
|
|
|
|
|
* progress unless they get reordered.
|
|
|
|
|
*
|
|
|
|
|
* Consider 2 different locks with 2 different items in them
|
|
|
|
|
* each, both of which come from 2 particular requests:
|
2025-09-19 01:18:19 -04:00
|
|
|
* Qutex1: Lockvoker1, Lv2
|
|
|
|
|
* Qutex2: Lv2, Lv1
|
2025-09-18 20:29:37 -04:00
|
|
|
*
|
|
|
|
|
* Moreover, both of these lockvokers have requiredLocks.size()==2,
|
|
|
|
|
* and the particular 2 locks that each one requires are indeed
|
2025-09-19 01:18:19 -04:00
|
|
|
* Qutex1 and Qutex2.
|
2025-09-18 20:29:37 -04:00
|
|
|
*
|
|
|
|
|
* This particular setup basically means that in TL1's queue, Lv1
|
|
|
|
|
* will wakeup since it's at the front of TL1. It'll successfully
|
|
|
|
|
* acquire TL1 (since it's at the front), and then it'll try to
|
|
|
|
|
* acquire TL2. But since Lv1 isn't in the top 50% of items in TL2's
|
|
|
|
|
* queue, Lv1 will fail to acquire TL2.
|
|
|
|
|
*
|
|
|
|
|
* Then similarly, in TL2's queue, Lv2 will wakeup since it's at
|
|
|
|
|
* the front. Again, it'll successfully acquire TL2 since it's at
|
|
|
|
|
* the front of TL2's queue. But then it'll try to acquire TL1.
|
|
|
|
|
* Since it's not in the top 50% of TL1's enqueued items, it'll fail
|
|
|
|
|
* to acquire TL1.
|
|
|
|
|
*
|
|
|
|
|
* N.B: This type of perfectly ordered deadlock can occur in any
|
|
|
|
|
* kind of NxN situation where ticketQ.size()==requiredLocks.size().
|
|
|
|
|
* That could be 4x4, 5x5, 6x6, etc. It doesn't happen in 1x1
|
|
|
|
|
* because a Lockvoker that only requires one lock will always just
|
|
|
|
|
* succeed if it's at the front of its queue.
|
|
|
|
|
*
|
|
|
|
|
* This state of affairs is stable and will persist unless these
|
|
|
|
|
* queues are reordered in some way. Hence: that's why we rotate the
|
2025-09-19 01:18:19 -04:00
|
|
|
* items in a QutexQ after backing off of it. Backing off means
|
2025-09-18 20:29:37 -04:00
|
|
|
* Not necessarily that the calling LockVoker failed to acquire
|
2025-09-19 01:18:19 -04:00
|
|
|
* THIS PARTICULAR Qutex, but rather than it failed to acquire
|
2025-09-18 20:29:37 -04:00
|
|
|
* ALL of its required locks.
|
|
|
|
|
*
|
|
|
|
|
* Hence, if we are backing out, we should also rotate the items
|
|
|
|
|
* in the queue if the current front item is the failed acquirer.
|
|
|
|
|
* So that's why we do this rotation here.
|
|
|
|
|
*/
|
|
|
|
|
}
|
|
|
|
|
|
2025-09-19 18:52:27 -04:00
|
|
|
currOwner.release();
|
|
|
|
|
|
2025-09-18 20:29:37 -04:00
|
|
|
LockerAndInvoker &newFront = queue.front();
|
|
|
|
|
|
|
|
|
|
lock.release();
|
|
|
|
|
|
|
|
|
|
wakeUp(newFront);
|
|
|
|
|
}
|
|
|
|
|
|
2025-09-19 18:52:27 -04:00
|
|
|
void release()
|
2025-09-18 20:29:37 -04:00
|
|
|
{
|
|
|
|
|
lock.acquire();
|
|
|
|
|
|
|
|
|
|
#ifndef CONFIG_LOCKVOKER_AGGRESSIVE_WAKEUPS
|
|
|
|
|
LockerAndInvoker &oldFront = queue.front();
|
|
|
|
|
#endif
|
|
|
|
|
|
2025-09-19 18:52:27 -04:00
|
|
|
/* Get the saved iterator and use it to unregister.
|
|
|
|
|
* Don't acquire lock because we already acquired it in this function.
|
|
|
|
|
*/
|
|
|
|
|
unregister(currOwner->serializedContinuation.requiredLocks
|
|
|
|
|
.getLockUsageDesc(*this).second, false);
|
|
|
|
|
|
|
|
|
|
currOwner.release();
|
2025-09-18 20:29:37 -04:00
|
|
|
|
|
|
|
|
/** NOTE:
|
|
|
|
|
* I am not sure whether we should only wake up the front item if
|
|
|
|
|
* the prev owner was the previous front item; or whether we should
|
|
|
|
|
* always wake up the new front item.
|
|
|
|
|
*
|
2025-09-19 01:18:19 -04:00
|
|
|
* Recall that because a sequence can acquire a Qutex without being
|
2025-09-18 20:29:37 -04:00
|
|
|
* at the front of the queue (because it could merely be in the top X%
|
|
|
|
|
* of items instead), this means that during a call to release(), the
|
|
|
|
|
* owning async sequence may not be at the front -- it needs only be in
|
|
|
|
|
* the top X% of items.
|
|
|
|
|
*
|
|
|
|
|
* When the owning sequence is the front, then we should definitely wake
|
|
|
|
|
* the new front item after removing the previous owner.
|
|
|
|
|
*
|
|
|
|
|
* But when the front item is not the prev owner, then should we wake it
|
|
|
|
|
* up? It should have been awakened previously, so if it's still at the
|
|
|
|
|
* front, that implies that it failed to make forward progress last time
|
|
|
|
|
* it awoke. You could argue that during the interim while this owner
|
|
|
|
|
* did its thing, it's possible for the current front item to have
|
|
|
|
|
* become capable of acquiring all of its requiredLocks, due to changes
|
|
|
|
|
* in the other locks in its requiredLocks set. This is a fair argument.
|
|
|
|
|
*
|
|
|
|
|
* You could also argue that we can just wait until its registered
|
|
|
|
|
* items in its other locks' ticketQs reach the front of those other
|
|
|
|
|
* ticketQs, and then when that happens, those locks will wake it up
|
|
|
|
|
* when release() is called.
|
|
|
|
|
*
|
|
|
|
|
* The latter eases contention since one could argue that we have a
|
|
|
|
|
* surer chance of it successfully acquiring all of its requiredLocks if
|
|
|
|
|
* we wait until it bubbles to the front of another ticketQ. Whereas,
|
|
|
|
|
* if we aggressively/greedily awaken it to try just because an item in
|
|
|
|
|
* another ticketQ has been removed, we're just introducing contention.
|
|
|
|
|
* Both cases have good arguments. The aggressive approach enables us to
|
|
|
|
|
* potentially retire more requests, and thus increase throughput.
|
|
|
|
|
*
|
|
|
|
|
* This current pseudocode assumes the latter and waits for the other
|
|
|
|
|
* locks' ticketQs to wake it up when it reaches the top in their
|
|
|
|
|
* Qs.
|
|
|
|
|
*/
|
|
|
|
|
LockerAndInvoker &newFront = queue.front();
|
|
|
|
|
|
|
|
|
|
lock.release();
|
|
|
|
|
|
|
|
|
|
#ifndef CONFIG_LOCKVOKER_AGGRESSIVE_WAKEUPS
|
|
|
|
|
if (&newFront != &oldFront)
|
|
|
|
|
#endif
|
|
|
|
|
{ wakeUp(newFront); }
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
public:
|
|
|
|
|
SpinLock lock;
|
2025-09-19 18:52:27 -04:00
|
|
|
std::shared_ptr<LockerAndInvoker> currOwner;
|
2025-09-19 01:18:19 -04:00
|
|
|
LockerAndInvokerList queue;
|
2025-09-18 20:29:37 -04:00
|
|
|
};
|
|
|
|
|
```
|