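Linux Kernel Synchronization Mechanism: mutex
Overview of the mutex lock
In the Linux kernel, a mutex is a sleeping lock that serializes access to a critical section. Like a spinlock, it lets only one thread into the critical section at a time. The difference shows up when the lock cannot be acquired: a spinlock spins in place, whereas a mutex suspends the current thread and puts it into a blocked state. Because it may sleep, a mutex cannot be used in interrupt context.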
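Notes on using a mutex
- Only one process or thread can hold a mutex at a time.
- Only the owner of a mutex may release it.
- The same lock must not be released more than once.
- The same lock must not be acquired again by its holder; doing so causes a deadlock.
- A mutex must be initialized with the dedicated initialization functions that the mutex API provides.
- The same lock must not be initialized more than once.
- Memory functions such as memset or memcpy must not be used to initialize a mutex.
- A thread must release every mutex it holds before it exits.
- A mutex cannot be used in hardware interrupt or softirq context.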
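The ownership rules above can be illustrated with a small userspace model. This is only a sketch and not kernel code: own_mutex, own_lock() and own_unlock() are hypothetical names, task identities are plain integers rather than task_struct pointers, and the model is single-threaded. Where the real mutex would sleep, deadlock, or trigger a debug warning, the model simply returns an error code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model: owner == 0 means unlocked, otherwise it is a task id. */
struct own_mutex {
    int owner;
};

/* Try to take the lock on behalf of `task` (must be non-zero). */
static int own_lock(struct own_mutex *m, int task)
{
    assert(task != 0);
    if (m->owner == task)
        return -EDEADLK;   /* re-acquiring your own mutex would self-deadlock */
    if (m->owner != 0)
        return -EBUSY;     /* a real mutex would put the caller to sleep here */
    m->owner = task;
    return 0;
}

/* Release the lock on behalf of `task`. */
static int own_unlock(struct own_mutex *m, int task)
{
    if (m->owner == 0)
        return -EINVAL;    /* releasing an unlocked mutex (double unlock) */
    if (m->owner != task)
        return -EPERM;     /* only the owner may release the mutex */
    m->owner = 0;
    return 0;
}
```

Calling own_lock() twice with the same task id, or own_unlock() from a different task id, returns an error instead of silently succeeding, which mirrors the rules in the list.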
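mutex struct definition
- owner: records the holder of the mutex
- wait_lock: a spinlock that protects wait_list
- osq: MCS lock queue, used to support optimistic spinning on the mutex
- wait_list: tasks that cannot acquire the lock are queued here
- magic: used for debugging
- dep_map: used for debugging (lockdep)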
struct mutex {
    atomic_long_t owner;
    spinlock_t wait_lock;
#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
    struct optimistic_spin_queue osq; /* Spinner MCS lock */
#endif
    struct list_head wait_list;
#ifdef CONFIG_DEBUG_MUTEXES
    void *magic;
#endif
#ifdef CONFIG_DEBUG_LOCK_ALLOC
    struct lockdep_map dep_map;
#endif
};
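The owner field is an atomic_long_t rather than a plain pointer because, in kernels of this generation, the task_struct pointer and a few flag bits share the same word: task_struct is aligned, so the low bits of its address are always zero and can carry flags. The sketch below illustrates that packing in userspace. The flag values are the ones I recall from kernel/locking/mutex.c of that era and should be checked against your tree; owner_task_model, owner_pack(), owner_task() and owner_flags() are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Flag bits kept in the low bits of the owner word (assumed values). */
#define MUTEX_FLAG_WAITERS ((uintptr_t)0x01)
#define MUTEX_FLAG_HANDOFF ((uintptr_t)0x02)
#define MUTEX_FLAG_PICKUP  ((uintptr_t)0x04)
#define MUTEX_FLAGS        ((uintptr_t)0x07)

/* Stand-in for task_struct; 8-byte alignment keeps the low 3 address bits zero. */
struct owner_task_model {
    alignas(8) int pid;
};

/* Combine a task pointer and flag bits into one word. */
static uintptr_t owner_pack(const struct owner_task_model *t, uintptr_t flags)
{
    assert(((uintptr_t)t & MUTEX_FLAGS) == 0);  /* pointer must be aligned */
    return (uintptr_t)t | (flags & MUTEX_FLAGS);
}

/* Recover the task pointer by masking off the flag bits. */
static struct owner_task_model *owner_task(uintptr_t owner)
{
    return (struct owner_task_model *)(owner & ~MUTEX_FLAGS);
}

/* Recover the flag bits. */
static uintptr_t owner_flags(uintptr_t owner)
{
    return owner & MUTEX_FLAGS;
}
```

One consequence of this layout is that an owner value of 0 means "no owner and no flags", which is exactly the state the fast path tests for.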
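Main mutex interfaces
Interface | Description |
---|---|
mutex_init | Initializes a mutex object |
__mutex_init | Called by mutex_init |
DEFINE_MUTEX | Statically defines and initializes a mutex object |
__MUTEX_INITIALIZER | Initializer macro used by DEFINE_MUTEX |
mutex_lock | Acquires the mutex; on contention the process sleeps in the D (uninterruptible) state |
mutex_lock_interruptible | Acquires the mutex; on contention the process sleeps in the S (interruptible) state |
mutex_trylock | Tries to acquire the mutex; returns immediately on failure |
mutex_unlock | Releases the mutex |
mutex_is_locked | Reports whether the mutex is currently locked |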
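Analysis of the lock acquisition flow
mutex_lock() first calls might_sleep() to annotate that the caller may sleep, which lets debug kernels flag calls made from a context where sleeping is not allowed. It then calls __mutex_trylock_fast() to try to take the mutex on the fast path; if that fails, it calls __mutex_lock_slowpath() to acquire the mutex.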
void __sched mutex_lock(struct mutex *lock)
{
    might_sleep();
    if (!__mutex_trylock_fast(lock))
        __mutex_lock_slowpath(lock);
}
If the CONFIG_DEBUG_ATOMIC_SLEEP macro is not defined, might_sleep() degenerates into might_resched().
#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
# define might_sleep() \
    do { __might_sleep(__FILE__, __LINE__, 0); might_resched(); } while (0)
# define sched_annotate_sleep() (current->task_state_change = 0)
#else
static inline void ___might_sleep(const char *file, int line,
                                  int preempt_offset) { }
static inline void __might_sleep(const char *file, int line,
                                 int preempt_offset) { }
# define might_sleep() do { might_resched(); } while (0)
# define sched_annotate_sleep() do { } while (0)
#endif
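The same "debug check that compiles away" pattern can be tried out in userspace. The following is a rough sketch under stated assumptions, not the kernel's implementation: TOY_DEBUG_ATOMIC_SLEEP, toy_atomic_depth, toy_warnings and toy_might_sleep() are all hypothetical names, and "atomic context" is modeled by a simple counter.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical switch standing in for CONFIG_DEBUG_ATOMIC_SLEEP;
 * remove this line to compile the check away. */
#define TOY_DEBUG_ATOMIC_SLEEP 1

static int toy_atomic_depth;  /* > 0 models "we are in atomic context" */
static int toy_warnings;      /* number of violations reported so far */

#ifdef TOY_DEBUG_ATOMIC_SLEEP
/* Report a call site that may sleep while in atomic context. */
static void toy_might_sleep_check(const char *file, int line)
{
    assert(toy_atomic_depth >= 0);
    if (toy_atomic_depth > 0) {
        toy_warnings++;
        fprintf(stderr, "toy: may sleep in atomic context at %s:%d\n",
                file, line);
    }
}
# define toy_might_sleep() toy_might_sleep_check(__FILE__, __LINE__)
#else
/* Without the debug option the annotation costs nothing. */
# define toy_might_sleep() do { } while (0)
#endif
```

Placing toy_might_sleep() at the top of any function that can block gives the same kind of early warning that might_sleep() provides at the top of mutex_lock().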
On a fully preemptible kernel (CONFIG_PREEMPT) or a non-preemptible kernel (CONFIG_PREEMPT_NONE), might_resched() ends up as an empty macro. On a voluntarily preemptible kernel (CONFIG_PREEMPT_VOLUNTARY), might_resched() calls _cond_resched() to offer a voluntary rescheduling point.
#ifdef CONFIG_PREEMPT_VOLUNTARY
extern int _cond_resched(void);
# define might_resched() _cond_resched()
#else
# define might_resched() do { } while (0)
#endif
#ifndef CONFIG_PREEMPT
extern int _cond_resched(void);
#else
static inline int _cond_resched(void) { return 0; }
#endif
_cond_resched() calls should_resched() to check whether the preemption counter is 0. If the counter is 0 and the reschedule flag is set, it calls preempt_schedule_common() to reschedule.
#ifndef CONFIG_PREEMPT
int __sched _cond_resched(void)
{
    if (should_resched(0)) {
        preempt_schedule_common();
        return 1;
    }
    return 0;
}
EXPORT_SYMBOL(_cond_resched);
#endif
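The decision made by _cond_resched() can be modeled in a few lines. This is a minimal sketch, assuming the generic form of the check (preemption count equal to the given offset and the need-resched flag set); struct resched_cpu and the resched_* functions are hypothetical, and the actual context switch is replaced by a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state for the model. */
struct resched_cpu {
    int preempt_count;   /* > 0 means preemption is disabled */
    bool need_resched;   /* set when the scheduler wants to run */
    int schedules;       /* how many times we "rescheduled" */
};

/* Reschedule only if preemption is allowed and a reschedule is pending. */
static bool resched_should(const struct resched_cpu *c, int preempt_offset)
{
    assert(c->preempt_count >= 0);
    return c->preempt_count == preempt_offset && c->need_resched;
}

/* Returns 1 if a reschedule happened, 0 otherwise. */
static int resched_cond(struct resched_cpu *c)
{
    if (resched_should(c, 0)) {
        c->need_resched = false;  /* stand-in for preempt_schedule_common() */
        c->schedules++;
        return 1;
    }
    return 0;
}
```

With preemption disabled (preempt_count greater than 0) the model never reschedules, even when need_resched is set, which matches the condition described above.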
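__mutex_trylock_fast() calls atomic_long_cmpxchg_acquire() to check whether lock->owner equals 0. If it does, the pointer to the current thread's task_struct is stored into lock->owner, which marks the mutex as held by the current thread. If lock->owner is not 0, the mutex is already held by another thread or is being handed off to the top waiter, and the current thread has to block and wait. The compare and the store happen as a single atomic operation, so nothing can interleave between them.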
static __always_inline bool __mutex_trylock_fast(struct mutex *lock)
{
    unsigned long curr = (unsigned long)current;
    if (!atomic_long_cmpxchg_acquire(&lock->owner, 0UL, curr))
        return true;
    return false;
}
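The compare-and-swap at the heart of the fast path can be reproduced in userspace with C11 atomics. The sketch below is an approximation, not the kernel code: struct fast_mutex, fast_trylock() and fast_unlock() are hypothetical names, and the caller passes an arbitrary non-zero word where the kernel would use the current task pointer. Note one difference in convention: the kernel's cmpxchg returns the old value, while the C11 call returns a boolean directly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: owner == 0 means unlocked. */
struct fast_mutex {
    atomic_uintptr_t owner;
};

/* Fast-path acquire: atomically swap 0 -> curr with acquire ordering. */
static bool fast_trylock(struct fast_mutex *m, uintptr_t curr)
{
    uintptr_t expected = 0;

    assert(curr != 0);
    return atomic_compare_exchange_strong_explicit(
        &m->owner, &expected, curr,
        memory_order_acquire, memory_order_relaxed);
}

/* Fast-path release: atomically swap curr -> 0 with release ordering.
 * Fails if the caller is not the owner. */
static bool fast_unlock(struct fast_mutex *m, uintptr_t curr)
{
    uintptr_t expected = curr;

    return atomic_compare_exchange_strong_explicit(
        &m->owner, &expected, 0,
        memory_order_release, memory_order_relaxed);
}
```

Acquire ordering on the lock side and release ordering on the unlock side ensure that writes made inside the critical section are visible to the next thread that takes the lock.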
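The slow path for acquiring the mutex is __mutex_lock_common(). It first retries __mutex_trylock() and, where supported, spins optimistically via mutex_optimistic_spin(). If that fails, "slow" really means blocking the current thread: the current task is appended to the tail of the mutex's wait queue, so all tasks waiting for the mutex are ordered by arrival time. When the mutex is released, the task at the head of the queue is woken first, so the task that has waited longest is woken first. In addition, when the first task is inserted into an empty queue, the MUTEX_FLAG_WAITERS flag is set on the mutex to indicate that tasks are already waiting for this lock.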
static noinline void __sched
__mutex_lock_slowpath(struct mutex *lock)
{
    __mutex_lock(lock, TASK_UNINTERRUPTIBLE, 0, NULL, _RET_IP_);
}
static int __sched
__mutex_lock(struct mutex *lock, long state, unsigned int subclass,
             struct lockdep_map *nest_lock, unsigned long ip)
{
    return __mutex_lock_common(lock, state, subclass, nest_lock, ip, NULL, false);
}
static __always_inline int __sched
__mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
                    struct lockdep_map *nest_lock, unsigned long ip,
                    struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
{
    struct mutex_waiter waiter;
    bool first = false;
    struct ww_mutex *ww;
    int ret;
    might_sleep();
    ww = container_of(lock, struct ww_mutex, base);
    if (use_ww_ctx && ww_ctx) {
        if (unlikely(ww_ctx == READ_ONCE(ww->ctx)))
            return -EALREADY;
    }
    preempt_disable();
    mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);
    if (__mutex_trylock(lock) ||
        mutex_optimistic_spin(lock, ww_ctx, use_ww_ctx, NULL)) {
        /* got the lock, yay! */
        lock_acquired(&lock->dep_map, ip);
        if (use_ww_ctx && ww_ctx)
            ww_mutex_set_context_fastpath(ww, ww_ctx);
        preempt_enable();
        return 0;
    }
    spin_lock(&lock->wait_lock);
    /*
     * After waiting to acquire the wait_lock, try again.
     */
    if (__mutex_trylock(lock)) {
        if (use_ww_ctx && ww_ctx)
            __ww_mutex_wakeup_for_backoff(lock, ww_ctx);
        goto skip_wait;
    }
    debug_mutex_lock_common(lock, &waiter);
    debug_mutex_add_waiter(lock, &waiter, current);
    lock_contended(&lock->dep_map, ip);
    if (!use_ww_ctx) {
        /* add waiting tasks to the end of the waitqueue (FIFO): */
        list_add_tail(&waiter.list, &lock->wait_list);
#ifdef CONFIG_DEBUG_MUTEXES
        waiter.ww_ctx = MUTEX_POISON_WW_CTX;
#endif
    } else {
        /* Add in stamp order, waking up waiters that must back off. */
        ret = __ww_mutex_add_waiter(&waiter, lock, ww_ctx);
        if (ret)
            goto err_early_backoff;
        waiter.ww_ctx = ww_ctx;
    }
    waiter.task = current;
    if (__mutex_waiter_is_first(lock, &waiter))
        __mutex_set_flag(lock, MUTEX_FLAG_WAITERS);
    set_current_state(state);
    for (;;) {
        /*
         * Once we hold wait_lock, we're serialized against
         * mutex_unlock() handing the lock off to us, do a trylock
         * before testing the error conditions to make sure we pick up
         * the handoff.
         */
        if (__mutex_trylock(lock))
            goto acquired;
        /*
         * Check for signals and wound conditions while holding
         * wait_lock. This ensures the lock cancellation is ordered
         * against mutex_unlock() and wake-ups do not go missing.
         */
        if (unlikely(signal_pending_state(state, current))) {
            ret = -EINTR;
            goto err;
        }
        if (use_ww_ctx && ww_ctx && ww_ctx->acquired > 0) {
            ret = __ww_mutex_lock_check_stamp(lock, &waiter, ww_ctx);
            if (ret)
                goto err;
        }
        spin_unlock(&lock->wait_lock);
        schedule_preempt_disabled();
        /*
         * ww_mutex needs to always recheck its position since its waiter
         * list is not FIFO ordered.
         */
        if ((use_ww_ctx && ww_ctx) || !first) {
            first = __mutex_waiter_is_first(lock, &waiter);
            if (first)
                __mutex_set_flag(lock, MUTEX_FLAG_HANDOFF);
        }
        set_current_state(state);
        /*
         * Here we order against unlock; we must either see it change
         * state back to RUNNING and fall through the next schedule(),
         * or we must see its unlock and acquire.
         */
        if (__mutex_trylock(lock) ||
            (first && mutex_optimistic_spin(lock, ww_ctx, use_ww_ctx, &waiter)))
            break;
        spin_lock(&lock->wait_lock);
    }
    spin_lock(&lock->wait_lock);
acquired:
    __set_current_state(TASK_RUNNING);
    mutex_remove_waiter(lock, &waiter, current);
    if (likely(list_empty(&lock->wait_list)))
        __mutex_clear_flag(lock, MUTEX_FLAGS);
    debug_mutex_free_waiter(&waiter);
skip_wait:
    /* got the lock - cleanup and rejoice! */
    lock_acquired(&lock->dep_map, ip);
    if (use_ww_ctx && ww_ctx)
        ww_mutex_set_context_slowpath(ww, ww_ctx);
    spin_unlock(&lock->wait_lock);
    preempt_enable();
    return 0;
err:
    __set_current_state(TASK_RUNNING);
    mutex_remove_waiter(lock, &waiter, current);
err_early_backoff:
    spin_unlock(&lock->wait_lock);
    debug_mutex_free_waiter(&waiter);
    mutex_release(&lock->dep_map, 1, ip);
    preempt_enable();
    return ret;
}
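The wait-queue bookkeeping in the slow path (append at the tail, set MUTEX_FLAG_WAITERS for the first waiter, clear the flags once the list is empty) can be isolated in a small single-threaded model. This is a sketch under assumptions rather than the kernel's data structure: it uses a hand-rolled singly linked list instead of list_head, keeps the flags in a separate field instead of the low bits of owner, omits wait_lock entirely, and the wq_* names are hypothetical. The flag values are assumed to match kernel/locking/mutex.c of that era.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MUTEX_FLAG_WAITERS ((uintptr_t)0x01)
#define MUTEX_FLAGS        ((uintptr_t)0x07)

/* One entry per blocked task; lives on the waiter's stack in the kernel. */
struct wq_waiter {
    int task;
    struct wq_waiter *next;
};

struct wq_mutex {
    uintptr_t flags;               /* the kernel stores these in owner */
    struct wq_waiter *head, *tail; /* FIFO: wake head, append at tail */
};

/* Append a waiter at the tail; the first waiter sets MUTEX_FLAG_WAITERS. */
static void wq_add_waiter(struct wq_mutex *m, struct wq_waiter *w, int task)
{
    w->task = task;
    w->next = NULL;
    if (m->tail)
        m->tail->next = w;
    else
        m->head = w;
    m->tail = w;
    if (m->head == w)
        m->flags |= MUTEX_FLAG_WAITERS;
}

/* Task that would be woken first on unlock, or 0 if nobody waits. */
static int wq_first_waiter(const struct wq_mutex *m)
{
    return m->head ? m->head->task : 0;
}

/* Unlink a waiter; clear all flags once the list becomes empty. */
static void wq_remove_waiter(struct wq_mutex *m, struct wq_waiter *w)
{
    struct wq_waiter **pp = &m->head, *prev = NULL;

    while (*pp && *pp != w) {
        prev = *pp;
        pp = &(*pp)->next;
    }
    assert(*pp == w);              /* the waiter must be queued */
    *pp = w->next;
    if (m->tail == w)
        m->tail = prev;
    if (!m->head)
        m->flags &= ~MUTEX_FLAGS;
}
```

Because waiters are only ever appended at the tail and woken from the head, the task that has waited longest is always the next candidate, which is the FIFO behaviour the non-ww branch of __mutex_lock_common() implements with list_add_tail().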