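Description of problem:
On a Hygon 4 (海光4号) machine with 2 sockets, 128 cores and 256 SMT threads, running MySQL with the oltp_read_write test, performance drops when the concurrency is between 80 and 160 threads. We suspect that when try_to_wake_up (ttwu) performs wakeups at 80 to 160 threads, CPU selection for short-running woken tasks is contended, which causes frequent task migration in that range.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info: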
devel-5.10 data:

80 threads:
Performance: transactions: 39146.56 per sec
Path: set_task_cpu → try_to_wake_up → pollwake → __wake_up_common → __wake_up_common_lock  Count: 696
Path: set_task_cpu → try_to_wake_up → wake_up_q → futex_wake → do_futex  Count: 49
Path: set_task_cpu → try_to_wake_up → hrtimer_wakeup → __hrtimer_run_queues → hrtimer_interrupt  Count: 13
Path: set_task_cpu → try_to_wake_up → __queue_work → queue_work_on → xlog_cil_force_seq  Count: 8
Path: set_task_cpu → load_balance → newidle_balance → pick_next_task_fair → pick_next_task  Count: 4
Path: set_task_cpu → try_to_wake_up → call_timer_fn → run_timer_softirq → __do_softirq  Count: 3
Path: set_task_cpu → try_to_wake_up → wake_up_q → rwsem_wake.isra.11 → xfs_iunlock  Count: 1

100 threads:
Performance: transactions: 27547.49 per sec
Path: set_task_cpu → try_to_wake_up → pollwake → __wake_up_common → __wake_up_common_lock  Count: 4090
Path: set_task_cpu → try_to_wake_up → wake_up_q → futex_wake → do_futex  Count: 286
Path: set_task_cpu → try_to_wake_up → hrtimer_wakeup → __hrtimer_run_queues → hrtimer_interrupt  Count: 56
Path: set_task_cpu → load_balance → newidle_balance → pick_next_task_fair → pick_next_task  Count: 48
Path: set_task_cpu → try_to_wake_up → __queue_work → queue_work_on → xlog_cil_force_seq  Count: 31
Path: set_task_cpu → try_to_wake_up → call_timer_fn → run_timer_softirq → __do_softirq  Count: 8
Path: set_task_cpu → try_to_wake_up → wake_up_q → rwsem_wake.isra.11 → xfs_iunlock  Count: 5
Path: set_task_cpu → try_to_wake_up → wake_page_function → __wake_up_common → wake_up_page_bit  Count: 3
Path: set_task_cpu → try_to_wake_up → __flush_work → xlog_cil_force_seq → xfs_log_force_seq  Count: 1

160 threads:
Performance: transactions: 47733.65 per sec
Path: set_task_cpu → try_to_wake_up → pollwake → __wake_up_common → __wake_up_common_lock  Count: 421666
Path: set_task_cpu → try_to_wake_up → wake_up_q → futex_wake → do_futex  Count: 4533
Path: set_task_cpu → load_balance → newidle_balance → pick_next_task_fair → pick_next_task  Count: 1280
Path: set_task_cpu → try_to_wake_up → hrtimer_wakeup → __hrtimer_run_queues → hrtimer_interrupt  Count: 830
Path: set_task_cpu → try_to_wake_up → __queue_work → queue_work_on → xlog_cil_force_seq  Count: 56
Path: set_task_cpu → try_to_wake_up → call_timer_fn → run_timer_softirq → __do_softirq  Count: 27
Path: set_task_cpu → try_to_wake_up → wake_up_q → rwsem_wake.isra.11 → xfs_iunlock  Count: 13
Path: set_task_cpu → try_to_wake_up → wake_page_function → __wake_up_common → wake_up_page_bit  Count: 11
Path: set_task_cpu → try_to_wake_up → __flush_work → xlog_cil_force_seq → xfs_log_force_seq  Count: 2
Path: set_task_cpu → try_to_wake_up → autoremove_wake_function → __wake_up_common → __wake_up_common_lock  Count: 1

180 threads:
Performance: transactions: 51087.11 per sec
Path: set_task_cpu → try_to_wake_up → pollwake → __wake_up_common → __wake_up_common_lock  Count: 503757
Path: set_task_cpu → try_to_wake_up → wake_up_q → futex_wake → do_futex  Count: 4712
Path: set_task_cpu → load_balance → newidle_balance → pick_next_task_fair → pick_next_task  Count: 1022
Path: set_task_cpu → try_to_wake_up → hrtimer_wakeup → __hrtimer_run_queues → hrtimer_interrupt  Count: 917
Path: set_task_cpu → try_to_wake_up → __queue_work → queue_work_on → xlog_cil_force_seq  Count: 40
Path: set_task_cpu → try_to_wake_up → call_timer_fn → run_timer_softirq → __do_softirq  Count: 18
Path: set_task_cpu → try_to_wake_up → wake_page_function → __wake_up_common → wake_up_page_bit  Count: 11
Path: set_task_cpu → try_to_wake_up → wake_up_q → rwsem_wake.isra.11 → xfs_iunlock  Count: 7
Path: set_task_cpu → try_to_wake_up → iomap_dio_bio_end_io → dec_pending → clone_endio  Count: 1

devel-6.6 branch data:

80 threads:
Performance: transactions: 37931.06 per sec
Path: set_task_cpu → try_to_wake_up → pollwake → __wake_up_common → __wake_up_common_lock  Count: 400
Path: set_task_cpu → try_to_wake_up → wake_up_q → futex_wake → do_futex  Count: 38
Path: set_task_cpu → load_balance → newidle_balance.isra.47 → pick_next_task_fair → __schedule  Count: 24
Path: set_task_cpu → try_to_wake_up → hrtimer_wakeup → __hrtimer_run_queues → hrtimer_interrupt  Count: 8
Path: set_task_cpu → try_to_wake_up → kick_pool → __queue_work → queue_work_on  Count: 6
Path: set_task_cpu → try_to_wake_up → call_timer_fn → expire_timers → run_timer_softirq  Count: 2
Path: set_task_cpu → try_to_wake_up → kick_pool → __queue_work → call_timer_fn  Count: 1

100 threads:
Performance: transactions: 43531.23 per sec
Path: set_task_cpu → try_to_wake_up → pollwake → __wake_up_common → __wake_up_common_lock  Count: 1085
Path: set_task_cpu → load_balance → newidle_balance.isra.47 → pick_next_task_fair → __schedule  Count: 88
Path: set_task_cpu → try_to_wake_up → wake_up_q → futex_wake → do_futex  Count: 27
Path: set_task_cpu → try_to_wake_up → hrtimer_wakeup → __hrtimer_run_queues → hrtimer_interrupt  Count: 26
Path: set_task_cpu → try_to_wake_up → kick_pool → __queue_work → queue_work_on  Count: 8
Path: set_task_cpu → try_to_wake_up → call_timer_fn → expire_timers → run_timer_softirq  Count: 7

160 threads:
Performance: transactions: 51947.05 per sec
Path: set_task_cpu → try_to_wake_up → pollwake → __wake_up_common → __wake_up_common_lock  Count: 447852
Path: set_task_cpu → load_balance → newidle_balance.isra.47 → pick_next_task_fair → __schedule  Count: 57422
Path: set_task_cpu → try_to_wake_up → wake_up_q → futex_wake → do_futex  Count: 2758
Path: set_task_cpu → try_to_wake_up → hrtimer_wakeup → __hrtimer_run_queues → hrtimer_interrupt  Count: 617
Path: set_task_cpu → try_to_wake_up → kick_pool → __queue_work → queue_work_on  Count: 45
Path: set_task_cpu → try_to_wake_up → call_timer_fn → expire_timers → run_timer_softirq  Count: 19
Path: set_task_cpu → try_to_wake_up → wake_up_q → rwsem_wake → up_read  Count: 16
Path: set_task_cpu → try_to_wake_up → wake_page_function → __wake_up_common → folio_wake_bit  Count: 9
Path: set_task_cpu → try_to_wake_up → autoremove_wake_function → __wake_up_common → __wake_up_common_lock  Count: 1
Path: set_task_cpu → try_to_wake_up → wake_up_q → rwsem_wake → up_write  Count: 1
Path: set_task_cpu → try_to_wake_up → kick_pool → __queue_work → call_timer_fn  Count: 1
Path: set_task_cpu → load_balance → rebalance_domains → _nohz_idle_balance.isra.49 → handle_softirqs  Count: 1

180 threads:
Performance: transactions: 56031.12 per sec
Path: set_task_cpu → try_to_wake_up → pollwake → __wake_up_common → __wake_up_common_lock  Count: 408323
Path: set_task_cpu → load_balance → newidle_balance.isra.47 → pick_next_task_fair → __schedule  Count: 56015
Path: set_task_cpu → try_to_wake_up → wake_up_q → futex_wake → do_futex  Count: 3764
Path: set_task_cpu → try_to_wake_up → hrtimer_wakeup → __hrtimer_run_queues → hrtimer_interrupt  Count: 405
Path: set_task_cpu → try_to_wake_up → kick_pool → __queue_work → queue_work_on  Count: 118
Path: set_task_cpu → try_to_wake_up → wake_up_q → rwsem_wake → up_read  Count: 38
Path: set_task_cpu → try_to_wake_up → wake_page_function → __wake_up_common → folio_wake_bit  Count: 16
Path: set_task_cpu → try_to_wake_up → call_timer_fn → expire_timers → run_timer_softirq  Count: 16
Path: set_task_cpu → load_balance → rebalance_domains → _nohz_idle_balance.isra.49 → handle_softirqs  Count: 2
Path: set_task_cpu → try_to_wake_up → swake_up_locked.part.24 → complete_with_flags → flush_workqueue_prep_pwqs  Count: 1
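
Summary of switch (migration) counts on the pollwake path:

                     80 threads   100 threads   160 threads   180 threads
5.10 switch count           696          4090        421666        503757
6.6 switch count            400          1085        447852        408323

Judging by these counts, 5.10 migrates more than 6.6 at 80, 100 and 180 threads (about 3.8x at 100 threads, where 5.10 throughput also dips); at 160 threads 6.6 is slightly higher. We suspect the frequent migrations are what degrades performance.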
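
To make the comparison easier to check, here is a minimal Python sketch that computes the 5.10 / 6.6 ratio per thread count for both the pollwake-path counts and the reported transactions per second. The numbers are copied from the data above; the dictionary layout and the names (`pollwake_migrations`, `transactions_per_sec`, `ratios`) are illustrative assumptions, not part of any tool used in this report.

```python
# Figures copied from the report; names and layout are illustrative only.
THREADS = [80, 100, 160, 180]

pollwake_migrations = {
    "5.10": [696, 4090, 421666, 503757],
    "6.6": [400, 1085, 447852, 408323],
}

transactions_per_sec = {
    "5.10": [39146.56, 27547.49, 47733.65, 51087.11],
    "6.6": [37931.06, 43531.23, 51947.05, 56031.12],
}


def ratios(table):
    """Return the 5.10 / 6.6 ratio for each thread count."""
    return [old / new for old, new in zip(table["5.10"], table["6.6"])]


migration_ratio = ratios(pollwake_migrations)
tps_ratio = ratios(transactions_per_sec)

for n, m, t in zip(THREADS, migration_ratio, tps_ratio):
    print(f"{n:>3} threads: migrations 5.10/6.6 = {m:.2f}, tps 5.10/6.6 = {t:.2f}")
```

A ratio above 1 in the first column means 5.10 migrated more on the pollwake path; a ratio below 1 in the second column means 5.10 delivered lower throughput. This only restates the reported numbers and does not by itself show that the migrations cause the slowdown.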