NUMA

CPU affinity: binding a process to a CPU (avoids interleaved access to remote memory, improving performance)

 

NUMA tools:

numactl, numad, numademo, numastat

numastat: if numa_miss shows consistently large values, consider binding (CPU/memory affinity)

[root@monitor ~]# numastat
                           node0
numa_hit                10937748
numa_miss                      0
numa_foreign                   0
interleave_hit             14133
local_node              10937748
other_node        

[root@monitor ~]# numastat -s

Per-node numastat info (in MBs):
                          Node 0           Total
                 --------------- ---------------
Numa_Hit                42739.40        42739.40
Local_Node              42739.40        42739.40
Interleave_Hit             55.21           55.21
Numa_Foreign                0.00            0.00
Numa_Miss                   0.00            0.00
Other_Node                  0.00            0.00


[root@monitor ~]# numastat -s node0
Found no processes containing pattern: "node0"

Per-node numastat info (in MBs):
                          Node 0           Total
                 --------------- ---------------
Numa_Hit                42741.29        42741.29
Local_Node              42741.29        42741.29
Interleave_Hit             55.21           55.21
Numa_Foreign                0.00            0.00
Numa_Miss                   0.00            0.00
Other_Node                  0.00            0.00

[root@monitor ~]# numastat -p 935

Per-node process memory usage (in MBs) for PID 935 (runuser)
                           Node 0           Total
                  --------------- ---------------
Huge                         0.00            0.00
Heap                         0.04            0.04
Stack                        0.05            0.05
Private                      1.28            1.28
----------------  --------------- ---------------
Total                        1.37            1.37
man numastat:

numa_hit is memory successfully allocated on this node as intended. numa_miss is memory allocated on this node despite the process preferring some different node. Each numa_miss has a numa_foreign on another node. numa_foreign is memory intended for this node, but actually allocated on some different node.

That is: the memory was allocated on the local node, but is being used by a non-local CPU.
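As a quick way to watch this counter, numa_miss can be pulled out of numastat output with awk. The sketch below runs against a captured sample (like the transcript above) so it is self-contained; on a live system, pipe numastat into the same awk filter.

```shell
# Extract the numa_miss counter from numastat-style output.
# On a live system:  numastat | awk '/^numa_miss/ {print $2}'
awk '/^numa_miss/ {print $2}' <<'EOF'
                           node0
numa_hit                10937748
numa_miss                      0
numa_foreign                   0
EOF
# prints: 0
```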
numad: a daemon that automatically manages NUMA affinity; for some workloads it can improve performance by up to 50%

numactl:
[root@monitor proc]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
stepping        : 4
cpu MHz         : 2600.214
cache size      : 20480 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush mmx fxsr sse sse2 ht syscall nx lm up rep_good unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm
bogomips        : 5200.42
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

 

CPU binding can also be used on non-NUMA systems:

Binding a process to a CPU:

[root@monitor proc]# taskset
taskset (util-linux-ng 2.17.2)
usage: taskset [options] [mask | cpu-list] [pid | cmd [args...]]

set or get the affinity of a process

  -p, --pid                  operate on existing given pid
  -c, --cpu-list             display and specify cpus in list format
  -h, --help                 display this help
  -V, --version              output version information

The default behavior is to run a new command:
    taskset 03 sshd -b 1024
You can retrieve the mask of an existing task:
    taskset -p 700
Or set it:
    taskset -p 03 700
List format uses a comma-separated list instead of a mask:
    taskset -pc 0,3,7-11 700
Ranges in list format can take a stride argument:
    e.g. 0-31:2 is equivalent to mask 0x55555555




 

taskset
  mask:
    0x00000001 is processor #0
    0x00000003 is processors #0 and #1
    0xFFFFFFFF is all processors (#0 through #31)


Usage:
taskset -p mask pid
taskset -p 0x00000003 999
taskset -p -c 3 999
taskset -p -c 0-2,7 999
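The cpu-list and mask forms are interchangeable. As a sketch of how they relate, the mask for `-c 0-2,7` can be computed by OR-ing one bit per CPU:

```shell
# Build the affinity mask for the cpu-list "0-2,7":
# each CPU n contributes bit 1<<n to the mask.
mask=0
for cpu in 0 1 2 7; do
  mask=$(( mask | (1 << cpu) ))
done
printf '0x%x\n' "$mask"   # prints: 0x87
```

So `taskset -p -c 0-2,7 999` and `taskset -p 0x87 999` are equivalent.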


nginx: worker processes can be bound to CPUs in the configuration file (the worker_cpu_affinity directive)

 

Dedicating a CPU to specific processes:
1. Isolate the CPU from automatic scheduling in /etc/grub.conf: isolcpus=cpu_number,...,cpu_number — at boot, no other processes are scheduled onto the isolated CPUs
2. Pin tasks to that CPU with taskset:
taskset -p
3. Consider moving IRQs off the CPU // bind IRQs to the non-isolated CPUs so the isolated CPUs do not handle interrupts
echo CPU_MASK > /proc/irq/<irq_number>/smp_affinity
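Before rewriting smp_affinity it helps to see which CPUs currently service each IRQ. A minimal read-only sketch (writing a new mask back requires root):

```shell
# Read-only: show the hex CPU mask for each IRQ.
# Echo-ing a new mask into smp_affinity (as root) moves the IRQ,
# e.g. away from an isolated CPU.
for d in /proc/irq/[0-9]*; do
  [ -r "$d/smp_affinity" ] || continue
  printf 'IRQ %s -> mask %s\n' "${d##*/}" "$(cat "$d/smp_affinity")"
done
```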


On NUMA systems: bind when numa_miss is high.
On non-NUMA systems: bind when a very busy process keeps migrating between CPUs.

 

[root@monitor 0]# rpm -ql sysstat-9.0.4-22.el6.x86_64
/etc/cron.d/sysstat
/etc/rc.d/init.d/sysstat
/etc/sysconfig/sysstat
/etc/sysconfig/sysstat.ioconf
/usr/bin/cifsiostat
/usr/bin/iostat
/usr/bin/mpstat
/usr/bin/pidstat
/usr/bin/sadf
/usr/bin/sar


ab -n 100000 -c 300 http://127.0.0.1/index.html   (load test)
Viewing CPU performance data

Load average: average length of run queues
Considers only tasks in TASK_RUNNABLE and TASK_UNINTERRUPTIBLE 
sar  -q
top
w
uptime
vmstat  1  5
CPU utilization
mpstat  1  2
sar  -P  ALL   1  2
iostat  -c  1  2
/proc/stat
dstat -c
[root@monitor 0]# w -l root
 21:14:58 up 4 days,  6:33,  2 users,  load average: 0.01, 0.05, 0.02
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    118.239.203.123  20:03    0.00s  0.07s  0.00s w -l root
root     pts/1    118.239.203.123  21:07    3:40   0.01s  0.01s -bash

 

CPU load:

[root@monitor 0]# sar -q 1 10
Linux 2.6.32-431.23.3.el6.x86_64 (monitor)      06/11/2016      _x86_64_        (1 CPU)

09:01:10 PM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
09:01:11 PM         0       113      0.00      0.00      0.00
09:01:12 PM         0       113      0.00      0.00      0.00
09:01:13 PM         0       113      0.00      0.00      0.00
09:01:14 PM         0       113      0.00      0.00      0.00
09:01:15 PM         0       113      0.00      0.00      0.00
09:01:16 PM         0       113      0.00      0.00      0.00
09:01:17 PM         0       113      0.00      0.00      0.00
09:01:18 PM         0       113      0.00      0.00      0.00
09:01:19 PM         0       113      0.00      0.00      0.00
09:01:20 PM         0       113      0.00      0.00      0.00

runq-sz:
On a single CPU, it should not exceed 3.
With multiple cores/CPUs, average it across the CPUs.
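A rough way to apply this rule on a multi-core box is to divide the 1-minute load average by the CPU count (a Linux-only sketch; assumes nproc and /proc/loadavg are available):

```shell
# Per-CPU load: 1-minute load average divided by the number of CPUs.
# A sustained per-CPU value above ~3 indicates run-queue pressure.
cpus=$(nproc)
load1=$(cut -d' ' -f1 /proc/loadavg)
awk -v l="$load1" -v c="$cpus" \
    'BEGIN { printf "load1=%s cpus=%d per-cpu=%.2f\n", l, c, l/c }'
```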

CPU utilization:

[root@monitor ~]# mpstat -P 0 1        (CPU #0)
Linux 2.6.32-431.23.3.el6.x86_64 (monitor)      06/11/2016      _x86_64_        (1 CPU)

09:11:18 PM  CPU   %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
09:11:19 PM    0   0.00   0.00   0.00    0.00   0.00   0.00   0.00   0.00 100.00
09:11:20 PM    0   0.00   0.00   0.00    0.00   0.00   0.00   0.00   0.00 100.00
09:11:21 PM    0   0.00   0.00   1.00    0.00   0.00   0.00   0.00   0.00  99.00
09:11:22 PM    0   1.00   0.00   0.00    0.00   0.00   0.00   0.00   0.00  99.00
09:11:23 PM    0   0.00   0.00   0.00    0.00   0.00   0.00   0.00   0.00 100.00
09:11:24 PM    0   1.00   0.00   0.00    0.00   0.00   0.00   0.00   0.00  99.00
09:11:25 PM    0   0.00   0.00   1.00    0.00   0.00   0.00   0.00   0.00  99.00
09:11:26 PM    0   0.00   0.00   0.00    0.00   0.00   0.00   0.00   0.00 100.00



[root@monitor 0]# uptime
 21:17:46 up 4 days,  6:36,  2 users,  load average: 0.00, 0.02, 0.00


mpstat -I CPU 1

 

[root@monitor 0]# iostat -c 1 6
Linux 2.6.32-431.23.3.el6.x86_64 (monitor)      06/11/2016      _x86_64_        (1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.29    0.00    0.49    0.21    0.00   98.01

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.00    0.00    0.00  100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.00    0.00    0.00  100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    1.01    0.00    0.00   98.99

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.00    0.00    0.00  100.00


cat /proc/stat
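The counters on the first line of /proc/stat are cumulative jiffies (user, nice, system, idle, ...). A single-sample idle percentage can be sketched from them; this is a rough calculation that ignores the iowait/irq fields, and tools like mpstat sample twice to get rates:

```shell
# Rough single-sample idle %: idle jiffies over (user+nice+system+idle).
read -r _ user nice system idle _ < /proc/stat
total=$(( user + nice + system + idle ))
echo "idle: $(( idle * 100 / total ))%"
```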

 

[root@monitor 0]# dstat --top-io
----most-expensive----
     i/o process      
init        116k   91k
sshd: root@ 146B  196B
AliYunDun    18k    0 
sshd: root@  71B  116B
sshd: root@  78B  116B
AliYunDunUp 256B    0 
AliYunDunUp 256B    0 
sshd: root@  71B  116B^C

 

[root@monitor 0]# dstat --top-cpu
-most-expensive-
  cpu process   
AliYunDun    0.2
AliYunDun    1.0
                
AliHids      1.0
                
                
AliHids      1.0
                
AliYunDun    1.0
[root@monitor 0]# dstat --top-mem --top-cpu --top-io
--most-expensive- -most-expensive- ----most-expensive----
  memory process |  cpu process   |     i/o process      
AliHids     10.0M|AliYunDun    0.2|init        116k   91k
AliHids     10.0M|AliYunDun    1.0|sshd: root@ 388B  436B
AliHids     10.0M|                |sshd: root@ 162B  212B
AliHids     10.0M|                |sshd: root@ 155B  196B
AliHids     10.0M|                |sshd: root@ 155B  196B
AliHids     10.0M|AliYunDun    1.0|sshd: root@ 155B  196B
AliHids     10.0M|                |sshd: root@ 162B  212B
AliHids     10.0M|                |sshd: root@ 155B  196B
AliHids     10.0M|                |sshd: root@ 155B  196B
AliHids     10.0M|AliYunDunUpda1.0|sshd: root@ 155B  196B^C
[root@monitor 0]# dstat -c
----total-cpu-usage----
usr sys idl wai hiq siq
  1   0  98   0   0   0
  0   0 100   0   0   0
  0   0 100   0   0   0
  0   0 100   0   0   0
  1   0  99   0   0   0^C

 

System context switches:

[root@monitor 0]# sar -w 1
Linux 2.6.32-431.23.3.el6.x86_64 (monitor)      06/11/2016      _x86_64_        (1 CPU)

09:30:53 PM    proc/s   cswch/s
09:30:54 PM      0.00    341.58
09:30:55 PM      0.00    343.00
09:30:56 PM      0.00    334.34
09:30:57 PM      0.00    333.00
09:30:58 PM      0.00    337.00
09:30:59 PM      0.00    341.41
09:31:00 PM      0.00    326.00
09:31:01 PM      0.00    334.00
09:31:02 PM      0.00    333.00

proc/s: average number of tasks created per second; cswch/s: context switches per second.


[root@monitor 0]# vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 253808  82668 523696    0    0     4    86    3   48  1  0 98  0  0
 0  0      0 253800  82668 523696    0    0     0     0   95  270  1  0 99  0  0
 0  0      0 253800  82668 523696    0    0     0     0   89  269  0  0 100  0  0

 

watch -n 0.5 'ps -e -o psr,pid,cmd'

The CPU scheduler-domain concept: processes can be confined to groups of CPUs.
Group processors into cpusets
    Each cpuset represents a scheduler domain
    Supports both multi-core and NUMA architectures
    Simple management interface through the cpuset virtual file system
 
The root cpuset contains all system resources
 
Child cpusets
    Each cpuset must contain at least one CPU and one memory zone
    Child cpusets can be nested
    Dynamically attach tasks to a cpuset
Consequences
    Control latency due to queue length, cache, and NUMA zones
    Assign processes with different CPU characteristics to different cpusets
    Scalable for complex performance scenarios

How to create:

1. Create a mount point at /cpusets
2. Add an entry to /etc/fstab:
   cpuset /cpusets cpuset defaults 0 0
3. Mount it:
   mount -a
   mount

Mounting the filesystem automatically creates the root cpuset files:
   /cpusets/cpus  /cpusets/mems  /cpusets/tasks
All CPUs and memory zones belong to the root cpuset, and all existing PIDs are assigned to the root cpuset.
[root@monitor cpusets]# pwd
/cpusets
[root@monitor cpusets]# ll
total 0
--w--w--w- 1 root root 0 Jun 11 21:56 cgroup.event_control
-rw-r--r-- 1 root root 0 Jun 11 21:56 cgroup.procs
-rw-r--r-- 1 root root 0 Jun 11 21:56 cpu_exclusive
-rw-r--r-- 1 root root 0 Jun 11 21:56 cpus
-rw-r--r-- 1 root root 0 Jun 11 21:56 mem_exclusive
-rw-r--r-- 1 root root 0 Jun 11 21:56 mem_hardwall
-rw-r--r-- 1 root root 0 Jun 11 21:56 memory_migrate
-r--r--r-- 1 root root 0 Jun 11 21:56 memory_pressure
-rw-r--r-- 1 root root 0 Jun 11 21:56 memory_pressure_enabled
-rw-r--r-- 1 root root 0 Jun 11 21:56 memory_spread_page
-rw-r--r-- 1 root root 0 Jun 11 21:56 memory_spread_slab
-rw-r--r-- 1 root root 0 Jun 11 21:56 mems
-rw-r--r-- 1 root root 0 Jun 11 21:56 notify_on_release
-rw-r--r-- 1 root root 0 Jun 11 21:56 release_agent
-rw-r--r-- 1 root root 0 Jun 11 21:56 sched_load_balance
-rw-r--r-- 1 root root 0 Jun 11 21:56 sched_relax_domain_level
-rw-r--r-- 1 root root 0 Jun 11 21:56 tasks
[root@monitor cpusets]# cat cpus   // CPU 0
0
[root@monitor cpusets]# cat mems   // memory of node 0
0
[root@monitor cpusets]# cat tasks   // PIDs in the root cpuset
1
2
3
4
5
...
Creating a child cpuset:

[root@monitor cpusets]# pwd
/cpusets
[root@monitor cpusets]# mkdir xx

Configure it (inside the child cpuset directory; set cpus and mems before attaching tasks, since a task cannot be attached to a cpuset with empty cpus/mems):
echo cpus > cpus
echo memnum > mems
echo [pid] > tasks
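The steps above can be combined into a script. The sketch below defaults to a scratch directory so the sequence can be dry-run without root; on a real system, run it as root with CPUSET_ROOT=/cpusets. The child name "xx" and the use of the current shell's PID are examples only.

```shell
# Dry-runnable sketch: confine a task to CPU 0 / memory node 0.
# On a real system: CPUSET_ROOT=/cpusets (requires root and a mounted cpuset fs).
CPUSET_ROOT=${CPUSET_ROOT:-$(mktemp -d)}
mkdir -p "$CPUSET_ROOT/xx"           # create the child cpuset
echo 0  > "$CPUSET_ROOT/xx/cpus"     # allow CPU 0 only
echo 0  > "$CPUSET_ROOT/xx/mems"     # allow memory node 0 only
echo $$ > "$CPUSET_ROOT/xx/tasks"    # move this shell into the cpuset
cat "$CPUSET_ROOT/xx/cpus"           # prints: 0
```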

Check which CPU each process is running on:
watch -n 0.5 'ps -e -o psr,pid,cmd'

taskset -p -c 0 26774 can also bind a process to a CPU; the -p and -c options must appear in this order.