Switch lock design
coconutnut

Lock structure

CAS, qspinlock and komb can share the same lock structure:

struct qspinlock {
    union {
        atomic_t val;
        struct {
            u8 locked;
            u8 pending;
        };
        struct {
            u16 locked_pending;
            u16 tail;
        };
    };
};

Since pending and locked each require only 1 bit, there is space for more information (if needed).
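As a rough illustration, the three overlapping views of the same 32-bit word can be modeled in a few lines (toy Python model, assuming the little-endian layout implied by the struct; not kernel code):

```python
# Toy model of the 32-bit lock word above (little-endian layout assumed):
# bits 0-7 locked, bits 8-15 pending, bits 16-31 tail.

def encode(locked: int, pending: int, tail: int) -> int:
    """Pack the three fields into one 32-bit value (the atomic_t val view)."""
    return (tail << 16) | (pending << 8) | locked

def locked_pending(val: int) -> int:
    """The u16 view covering both locked and pending."""
    return val & 0xFFFF

def tail(val: int) -> int:
    """The u16 tail view."""
    return val >> 16
```

For example, an uncontended held lock is just `encode(1, 0, 0) == 1`, and a waiter checking `locked_pending` sees both bytes at once.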

The threads could be spinning/waiting at:

  1. CAS: spins on the lock (32 bits)
  2. qspinlock:
    1. uncontended: lock (32 bits)
    2. pending: locked (8 bits)
    3. uncontended queue: locked_pending (16 bits)
    4. contended queue: node.locked (and later locked_pending)
  3. komb:
    1. fastpath: lock (32 bits)
    2. midpath: locked_pending (16 bits)
    3. slowpath: node.wait (and later lock)

For the nodes spinning on the lock word itself, no further notification is needed.

For those waiting on their own node, the unlocker will need to notify them according to the queue type.
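A minimal sketch of that per-type notification (hypothetical field names; this follows the kernel's MCS convention of spinning until node.locked becomes nonzero, and assumes komb waiters spin while node.wait is set):

```python
# Hypothetical sketch: wake the next waiter according to its queue type.
# MCS waiters spin until node.locked becomes nonzero (kernel convention);
# komb waiters are assumed here to spin while node.wait is True.

MCS, KOMB = "mcs", "komb"

class Node:
    def __init__(self, kind):
        self.kind = kind
        self.locked = 0     # MCS: spin until nonzero
        self.wait = True    # komb: spin while True
        self.next = None

def notify_waiter(node):
    if node.kind == MCS:
        node.locked = 1
    else:
        node.wait = False
```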

Node structure

TODO: How to store node?

  1. Copy all fields into a single sqnode structure. Drawback: will need to modify all underlying lock & unlock functions
struct sqnode {
    enum LockType type;
    struct sqnode *next;
    u8 locked; /* originally, struct mcs_spinlock has uintptr_t locked */
    /* other fields for QSPINLOCK */
    /* other fields for KOMB */
};
  2. Put the MCS node and the KOMB node inside sqnode. Drawback: additional handling to get the next node
struct sqnode {
    enum LockType type;
    struct sqnode *next;
    struct mcs_spinlock mcs_node;
    struct komb_node komb_node;
};

Pass an sqnode instead of the mcs/komb nodes, and modify select_next_waiter(sqnode) accordingly.

Plan 2 seems better, since it avoids modifying all the underlying lock & unlock functions (at the cost of extra handling to get the next node).
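With plan 2, the dispatch could look roughly like this (hypothetical sketch; the embedded-node field names are assumptions, and dicts stand in for the embedded structs):

```python
# Hypothetical sketch of plan 2: the shared sqnode embeds both node types,
# and select_next_waiter() picks the embedded node the successor spins on,
# based on the lock type recorded in the sqnode.

QSPINLOCK, KOMB = "qspinlock", "komb"

class SqNode:
    def __init__(self, lock_type):
        self.type = lock_type
        self.next = None                  # next sqnode in the queue
        self.mcs_node = {"locked": 0}     # stand-in for struct mcs_spinlock
        self.komb_node = {"wait": True}   # stand-in for struct komb_node

def select_next_waiter(sqnode):
    """Return the embedded node the successor actually waits on."""
    nxt = sqnode.next
    if nxt is None:
        return None
    return nxt.mcs_node if nxt.type == QSPINLOCK else nxt.komb_node
```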

Lock & Unlock

# Indicate the current lock type.
# Currently a helper thread changes the lock type every 5 s.
enum LockType cur_lock_type;

def slock_lock(lock):
    # Fastpath: try to acquire the lock
    val = CAS(&lock.val, 0, _Q_LOCKED_VAL)
    if val == 0:
        return

    # Midpath: same for qspinlock and komb, (0,1,0) -> (0,0,1)
    # No node needed
    ...

    # Slowpath: choose according to the lock type
    if cur_lock_type == CAS:
        # Spin
        ...
    else:
        # Get the per-CPU sqnode
        sqnode = cst.node
        init_sqnode(sqnode, cur_lock_type)
        # Jump to the chosen slowpath
        if cur_lock_type == QSPINLOCK:
            # Copy of the original slowpath, with the lines involving nodes modified
            queued_spin_lock_slowpath(sqnode)
        else:
            komb_spin_lock_slowpath(sqnode)

For KOMB phase 4:

Check whether the next node is KOMB and next.next is not None:

# line 58 in Listing 1
if qnext.next is None or qnext.type != KOMB:
    notify_next_queue_head(qnext)
    return

# line 73 in Listing 1
if sqnext is None or sqnext.type != KOMB or sqnext.next is None or \
        counter >= WAITERS_TO_COMBINE:
    break

For qspinlock, a waiter spinning on QLOCKED_PENDING_MASK also implicitly waits on QLOCKED_COMBINER_VAL.

Analysis

Switching from QSPINLOCK to KOMB

node A[SPIN] -> node B[SPIN] -> node C[KOMB] -> node D[KOMB] -> node E[KOMB]

  1. A holds the lock

    • B spins on locked_pending
    • C spins on lock.glock, since qprev is None
    • D spins on node.locked
  2. A unlocks, setting locked to 0

  3. B gets the lock, (0,1,1) -> (0,1,0) -> (0,0,1)

  4. B unlocks, setting val to (0,0,0)

  5. C gets the lock; switched
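The word transitions in steps 2-4 can be replayed mechanically (toy Python model; the tail bits are ignored, as in the walkthrough, and the helper names are made up):

```python
# Toy replay of steps 2-4: A unlocks, B takes the lock via pending, B unlocks.
# States are (tail, pending, locked) triples packed as in the lock word.

def pack(tail, pending, locked):
    return (tail << 16) | (pending << 8) | locked

def unlock(val):
    """Clear the locked byte."""
    return val & ~0xFF

def pending_acquire(val):
    """(0,1,0) -> (0,0,1): clear pending, set locked."""
    return (val & ~(1 << 8)) | 1
```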

Switching from KOMB to QSPINLOCK

A[KOMB] -> B[KOMB] -> C[KOMB] -> D[KOMB] -> E[SPIN]

  1. A holds the lock and is combining

    1. execute B’s CS
    2. execute C’s CS
    3. notify D
    4. execute A’s CS
    5. unlock
  2. D spins on D.sqnode.locked

    1. D gets the lock
    2. notifies E
    3. unlocks
  3. E spins on E.sqnode.locked

    1. E gets the lock; switched