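Using qemu together with a gdb graphical front end such as Eclipse or DDD makes it convenient to trace the network protocol stack, file systems, memory management, and so on. Tracing code that depends on hardware drivers is less well served.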



Building the kernel

Download the Linux kernel source, then build the compressed kernel image (/bak/linux/linux-2.6/arch/x86_64/boot/bzImage) and the uncompressed kernel ELF file used by gdb (/bak/linux/linux-2.6/vmlinux, ELF object file, symbols included, including debug info).

 cd /bak/linux/ && git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git   


For how to build the kernel, see: "Building the Linux kernel and making an initrd" (by quqi99)


sudo apt-get install libncurses5-dev  


make menuconfig

make -j 8 bzImage
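Before starting a long build it can help to confirm that the .config has the options this walkthrough relies on. The sketch below is only an illustration: has_config and check_debug_config are hypothetical helper names, and CONFIG_DEBUG_INFO is my assumption for how the debug info in vmlinux gets produced (CONFIG_BLK_DEV_INITRD is the option the initrd step requires).

```shell
#!/bin/sh
# Sketch: report whether a kernel .config enables a given option (=y).
# has_config / check_debug_config are hypothetical helper names.
has_config() {
    # $1 = path to .config, $2 = option name such as CONFIG_DEBUG_INFO
    grep -q "^$2=y\$" "$1"
}

# Check the two options assumed by this walkthrough.
check_debug_config() {
    cfg="$1"
    for opt in CONFIG_DEBUG_INFO CONFIG_BLK_DEV_INITRD; do
        if has_config "$cfg" "$opt"; then
            echo "$opt: enabled"
        else
            echo "$opt: MISSING"
        fi
    done
}
```

A typical call would be check_debug_config /bak/linux/linux-2.6/.config; any option reported as MISSING can be switched on in make menuconfig before rebuilding.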



Making the initrd

Build the initrd. To boot with an initrd, the kernel must be built with CONFIG_BLK_DEV_INITRD=y.


sudo apt-get install build-essential initramfs-tools   


sudo make modules_install            # generates /lib/modules/4.5.0-rc2+


mkinitramfs -o initrd.img -v 4.5.0-rc2+



Making an initrd with busybox

mkdir -p /bak/linux/initramfs/{bin,sbin,etc,proc,sys,newroot}
cd /bak/linux
touch initramfs/etc/mdev.conf
wget http://jootamam.net/initramfs-files/busybox-1.10.1-static.bz2 -O - | bunzip2 > initramfs/bin/busybox
chmod +x initramfs/bin/busybox
touch initramfs/init
chmod +x initramfs/init

The initramfs/init file is as follows:



#!/bin/sh
#Mount things needed by this script
mount -t proc proc /proc
mount -t sysfs sysfs /sys
#Disable kernel messages from popping onto the screen
echo 0 > /proc/sys/kernel/printk
#Clear the screen
clear
#Create all the symlinks to /bin/busybox
busybox --install -s
#Create device nodes
mknod /dev/null c 1 3
mknod /dev/tty c 5 0
mdev -s
#Function for parsing command line options with "=" in them
# get_opt("init=/sbin/init") will return "/sbin/init"
get_opt() {
    echo "$@" | cut -d "=" -f 2
}
#Defaults
init="/sbin/init"
root="/dev/hda1"
#Process command line options
for i in $(cat /proc/cmdline); do
    case $i in
        root\=*)
            root=$(get_opt $i)
            ;;
        init\=*)
            init=$(get_opt $i)
            ;;
    esac
done
#Mount the root device
mount "${root}" /newroot
#Check if $init exists and is executable
#(plain [ ] is used instead of [[ ]] because the script runs under /bin/sh)
if [ -x "/newroot/${init}" ] ; then
    #Unmount all other mounts so that the ram used by
    #the initramfs can be cleared after switch_root
    umount /sys /proc

    #Switch to the new root and execute init
    exec switch_root /newroot "${init}"
fi
#This will only be run if the exec above failed
echo "Failed to switch_root, dropping to a shell"
exec sh
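The get_opt helper in the init script keeps only the second "="-separated field, which is fine for values such as /dev/sda but truncates any value that itself contains "=". The sketch below shows the difference; get_opt_full is a hypothetical variant, and the root=UUID=... value is only an illustrative input, not something the original setup uses.

```shell
#!/bin/sh
# Original helper from the init script: second "="-separated field only.
get_opt() {
    echo "$@" | cut -d "=" -f 2
}

# Hypothetical variant: strip everything up to the first "=", keep the rest.
get_opt_full() {
    printf '%s\n' "${1#*=}"
}

get_opt "init=/sbin/init"             # prints /sbin/init
get_opt "root=UUID=1234-abcd"         # prints UUID (value truncated)
get_opt_full "root=UUID=1234-abcd"    # prints UUID=1234-abcd
```

For the root=/dev/sda and init=sbin/init arguments used later with qemu, both versions return the same result.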


cd initramfs

find . | cpio -H newc -o > ../initramfs.cpio

cd ..

cat initramfs.cpio | gzip > initramfs.igz


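However, the busybox-1.10.1-static.bz2 binary above appears to lack ext2 support and could not recognize the ext2-formatted disk passed in through qemu's -hda option, so in the end I built busybox-1.24.0 from source instead.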

wget https://busybox.net/downloads/busybox-1.24.0.tar.bz2

tar xjf busybox-1.24.0.tar.bz2 && cd busybox-1.24.0


make menuconfig 

   CONFIG_MKFS_EXT2=y

          Busybox Settings  --->  

           Build Options  ---> 

               [*] Build BusyBox as a static binary (no shared libs) //build statically

   make && make install

   cp -avR /bak/linux/busybox-1.24.0/_install/* /bak/linux/initramfs/


Loading the kernel with qemu

wget http://www.nongnu.org/qemu/linux-0.2.img.bz2

sudo qemu-system-x86_64 -hda /bak/images/linux-0.2.img -hdb /bak/linux/disk.img -kernel /bak/linux/linux-2.6/arch/x86_64/boot/bzImage -initrd /bak/linux/initramfs.igz -append "root=/dev/sda init=sbin/init console=ttyS0" -nographic -smp 1,cores=1 -S -s 

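The parameters are explained below:

  1.    -s opens the GDB debug port 1234 (shorthand for -gdb tcp::1234), while -S freezes QEMU at startup until GDB issues a (c)ontinue.
  2.    console=ttyS0 together with -nographic means no new graphical window is opened; the serial console is attached to the current terminal.
  3.    -append "root=/dev/sda init=sbin/init" should be consistent with the init script inside the initrd.
  4.   A QEMU built with --enable-debug automatically includes a symbol table.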
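Since -S -s are only wanted when a debugger will attach, one option is to assemble the command line in a small script and make those two flags optional. The following is a rough sketch under that assumption: build_qemu_cmd is a hypothetical helper, the paths are the ones used in this article, and the function only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: assemble the qemu command line from this article, with the
# gdb-related flags (-S -s) made optional. build_qemu_cmd is hypothetical.
KERNEL=/bak/linux/linux-2.6/arch/x86_64/boot/bzImage
INITRD=/bak/linux/initramfs.igz
HDA=/bak/images/linux-0.2.img
HDB=/bak/linux/disk.img

build_qemu_cmd() {
    # $1 = "debug": freeze at startup and open the gdb stub on port 1234
    cmd="qemu-system-x86_64 -hda $HDA -hdb $HDB -kernel $KERNEL -initrd $INITRD"
    cmd="$cmd -append \"root=/dev/sda init=sbin/init console=ttyS0\""
    cmd="$cmd -nographic -smp 1,cores=1"
    if [ "${1:-}" = "debug" ]; then
        cmd="$cmd -S -s"
    fi
    printf '%s\n' "$cmd"
}

# Print the debug variant; it could be run with: eval "$(build_qemu_cmd debug)"
build_qemu_cmd debug
```

Printing instead of executing keeps the sketch safe to try on a machine without the images in place.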




Debugging the kernel with gdb


   qemu's -s option starts a gdbserver on port 1234 by default.

   hua@node1:~$ sudo netstat -anp |grep 1234

   tcp    0     0 0.0.0.0:1234        0.0.0.0:*          LISTEN      24309/qemu-system-x

   hua@node1:~$ /bak/java/gdb/bin/gdb /bak/linux/linux-2.6/vmlinux 

    ...

   (gdb) target remote localhost:1234

   Remote debugging using localhost:1234

   0x0000000000000000 in irq_stack_union ()

   (gdb) b start_kernel

   Breakpoint 1 at 0xffffffff81d66b09: file init/main.c, line 498.

   (gdb) info registers

   (gdb) bt

   (gdb) c

   (gdb) list

   (gdb) set architecture

   Requires an argument. Valid arguments are i386, i386:x86-64, i386:x64-32, i8086, i386:intel,i386:x86-64:intel, i386:x64-32:intel, auto.


   # Inside VM, echo 'c' | sudo tee /proc/sysrq-trigger


   (gdb) add-symbol-file vmlinux 0xffffffff81000000 #echo 0x$(sudo cat /proc/kallsyms | egrep -e "T _text$" | awk '{print $1}')

   (gdb) b sysrq_handle_crash
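The first few commands of the session above are the same on every run, so they can be kept in a gdb command file and replayed with gdb -x. This is a minimal sketch: the file name kernel-debug.gdb is arbitrary and the set of commands is just the opening of the session shown here.

```shell
#!/bin/sh
# Sketch: save the opening gdb commands of the session to a command file.
# The file name is an arbitrary choice for this example.
GDB_SCRIPT=${GDB_SCRIPT:-./kernel-debug.gdb}

cat > "$GDB_SCRIPT" <<'EOF'
# Connect to the gdb stub that qemu -s opens on port 1234
target remote localhost:1234
# Stop at the kernel's C entry point
break start_kernel
continue
# Show where execution stopped
bt
info registers
EOF

echo "run: gdb -x $GDB_SCRIPT /bak/linux/linux-2.6/vmlinux"
```

The quoted heredoc delimiter ('EOF') keeps the shell from expanding anything inside the gdb commands.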


   


Debugging the kernel with Eclipse


1, The Linux source tree is very large, so disable automatic builds for the whole workspace. Indexing can still be left to Eclipse, which keeps search and code navigation convenient.

   - Preferences -> General -> Workspace -> Build automatically (Disable)


2, Import the kernel source as an Eclipse project, with the toolchain set to Linux GCC.

      Import -> C/C++ -> Existing Code as Makefile Project

3, Create a debug launcher (Debug configurations -> C/C++ Remote Application)

   Select GDB(DSF) Manual Remote Debugging Launcher

   Main TAB -> -C/C++ Application points to the actual uncompressed kernel: /bak/linux/linux-2.6/vmlinux

   Main TAB -> -Disable auto build

   Debugger TAB -> Stop on startup at 'start_kernel'

   Debugger TAB -> connection -> Host Name or IP Address -> = localhost

   Debugger TAB -> connection -> Port number = 1234




Building gdb to fix the error "Remote 'g' packet reply is too long"

   cd /bak/java && wget http://ftp.gnu.org/gnu/gdb/gdb-7.7.tar.gz

   Edit gdb/remote.c. In the process_g_packet function, find the following code:

   if (buf_len > 2 * rsa->sizeof_g_packet)

     error (_("Remote 'g' packet reply is too long: %s"), rs->buf);

   Replace those two lines with the code below, or simply comment them out without adding anything:

   if (buf_len > 2 * rsa->sizeof_g_packet) {
     rsa->sizeof_g_packet = buf_len;

     for (i = 0; i < gdbarch_num_regs (gdbarch); i++) {
       if (rsa->regs[i].pnum == -1)
         continue;

       if (rsa->regs[i].offset >= rsa->sizeof_g_packet)
         rsa->regs[i].in_g_packet = 0;
       else
         rsa->regs[i].in_g_packet = 1;
     }
   }


  ./configure --prefix=/bak/java/gdb && make && make install

   Next, reconfigure Eclipse: open the menu "Run" -> "Debug Configurations...", switch to the "Main" page under "Debugger" in the dialog, and change "GDB debugger:" to the newly built GDB (/bak/java/gdb/bin/gdb) instead of the default gdb.

 




Appendix 1: Building an index with cscope

1, Create cscope.files

LNX=/bak/linux/linux-2.6

mkdir -p $LNX/cscope

cd /

find  $LNX                                                                \
-path "$LNX/arch/*" ! -path "$LNX/arch/x86*" -prune -o                \
-path "$LNX/include/asm-*" ! -path "$LNX/include/asm-generic*" -prune -o \
-path "$LNX/tmp*" -prune -o                                           \
-path "$LNX/Documentation*" -prune -o                                 \
-path "$LNX/scripts*" -prune -o                                       \
-path "$LNX/drivers*" -prune -o                                       \
-name "*.[chxsS]" -print >/bak/linux/linux-2.6/cscope/cscope.files

2, Build the index database

cd /bak/linux/linux-2.6/cscope

cscope -b -q -k

3, Use the index database

cscope -d


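Appendix 2: The ELF format

ELF (Executable and Linking Format) is a container format used to store executables and their related data. Logically it is divided into many sections (viewable with objdump -h or readelf -S), including:

  •     executable code & data (.text, .data, .bss, etc)  # .data holds initialized global data, .bss holds uninitialized data, .text holds executable code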
  •     symbol tables (.symtab)
  •     ELF string tables (.strtab, .shstrtab)
  •     debug information (.debug_info, .debug_line, .eh_frame, etc)
  •     metadata (.notes, .comment)
  •     dynamic linking information (.plt, .got, etc)
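Every ELF file starts with the four magic bytes 7f 45 4c 46 (0x7f followed by "ELF"), the same bytes readelf reports in its Magic field. The sketch below checks for them before a file is handed to gdb; is_elf is a hypothetical helper name, and it assumes head -c and od are available, as they are on typical Linux systems.

```shell
#!/bin/sh
# Sketch: test whether a file begins with the ELF magic 7f 45 4c 46.
# is_elf is a hypothetical helper name.
is_elf() {
    # Dump the first four bytes as hex and strip spaces/newlines.
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}
```

For example, is_elf /bak/linux/linux-2.6/vmlinux && echo "ELF image" should succeed for the uncompressed vmlinux, whereas the compressed bzImage is a boot image rather than an ELF file and would not pass the check.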

hua@node1:/bak/linux/linux-2.6$ readelf -n vmlinux


Displaying notes found at file offset 0x0094e5f8 with length 0x00000024:

  Owner                 Data size    Description

  GNU                  0x00000014    NT_GNU_BUILD_ID (unique build ID bitstring)

    Build ID: 8930dc42387f290d882a43eafffb3e6105dd4df0


hua@node1:/bak/linux/linux-2.6$ readelf -p .comment vmlinux


String dump of section '.comment':

  [     0]  GCC: (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4


# readelf -S lists all section headers of the ELF image

hua@node1:/bak/linux/linux-2.6$ readelf -S vmlinux

There are 44 section headers, starting at offset 0xa081ae0:


Section Headers:

  [Nr] Name              Type             Address           Offset

       Size              EntSize          Flags  Link  Info  Align

  [ 0]                   NULL             0000000000000000  00000000

       0000000000000000  0000000000000000           0     0     0

  [ 1] .text             PROGBITS         ffffffff81000000  00200000

       000000000074e5f8  0000000000000000  AX       0     0     4096

  [ 2] .notes            NOTE             ffffffff8174e5f8  0094e5f8

       0000000000000024  0000000000000000  AX       0     0     4

  [ 3] __ex_table        PROGBITS         ffffffff8174e620  0094e620

       0000000000002158  0000000000000000   A       0     0     8

  [ 4] .rodata           PROGBITS         ffffffff81800000  00a00000

       000000000033f9fe  0000000000000000   A       0     0     64

  [ 5] __bug_table       PROGBITS         ffffffff81b3fa00  00d3fa00

       00000000000072fc  0000000000000000   A       0     0     1

  [ 6] .pci_fixup        PROGBITS         ffffffff81b46d00  00d46d00

       0000000000003270  0000000000000000   A       0     0     8

  [ 7] .builtin_fw       PROGBITS         ffffffff81b49f70  00d49f70

       0000000000000120  0000000000000000   A       0     0     8

  [ 8] .tracedata        PROGBITS         ffffffff81b4a090  00d4a090

       0000000000000078  0000000000000000   A       0     0     1

  [ 9] __ksymtab         PROGBITS         ffffffff81b4a110  00d4a110

       00000000000118a0  0000000000000000   A       0     0     16

  [10] __ksymtab_gpl     PROGBITS         ffffffff81b5b9b0  00d5b9b0

       000000000000ecc0  0000000000000000   A       0     0     16

  [11] __kcrctab         PROGBITS         ffffffff81b6a670  00d6a670

       0000000000008c50  0000000000000000   A       0     0     8

  [12] __kcrctab_gpl     PROGBITS         ffffffff81b732c0  00d732c0

       0000000000007660  0000000000000000   A       0     0     8

  [13] __ksymtab_strings PROGBITS         ffffffff81b7a920  00d7a920

       00000000000268c3  0000000000000000   A       0     0     1

  [14] __init_rodata     PROGBITS         ffffffff81ba1200  00da1200

       0000000000000240  0000000000000000   A       0     0     32

  [15] __param           PROGBITS         ffffffff81ba1440  00da1440

       00000000000025d0  0000000000000000   A       0     0     8

  [16] __modver          PROGBITS         ffffffff81ba3a10  00da3a10

       00000000000005f0  0000000000000000   A       0     0     8

  [17] .data             PROGBITS         ffffffff81c00000  00e00000

       0000000000144140  0000000000000000  WA       0     0     4096

  [18] .vvar             PROGBITS         ffffffff81d45000  00f45000

       0000000000001000  0000000000000000  WA       0     0     16

  [19] .data..percpu     PROGBITS         0000000000000000  01000000

       000000000001f918  0000000000000000  WA       0     0     4096

  [20] .init.text        PROGBITS         ffffffff81d66000  01166000

       0000000000060879  0000000000000000  AX       0     0     16

  [21] .init.data        PROGBITS         ffffffff81dc7000  011c7000

       00000000000c2e90  0000000000000000  WA       0     0     4096

  [22] .x86_cpu_dev.init PROGBITS         ffffffff81e89e90  01289e90

       0000000000000018  0000000000000000   A       0     0     8

  [23] .altinstructions  PROGBITS         ffffffff81e89ea8  01289ea8

       0000000000005f44  0000000000000000   A       0     0     1

  [24] .altinstr_replace PROGBITS         ffffffff81e8fdec  0128fdec

       00000000000017db  0000000000000000  AX       0     0     1

  [25] .iommu_table      PROGBITS         ffffffff81e915c8  012915c8

       00000000000000f0  0000000000000000   A       0     0     8

  [26] .apicdrivers      PROGBITS         ffffffff81e916b8  012916b8

       0000000000000030  0000000000000000  WA       0     0     8

  [27] .exit.text        PROGBITS         ffffffff81e916e8  012916e8

       0000000000001e26  0000000000000000  AX       0     0     1

  [28] .smp_locks        PROGBITS         ffffffff81e94000  01294000

       0000000000007000  0000000000000000   A       0     0     4

  [29] .data_nosave      PROGBITS         ffffffff81e9b000  0129b000

       0000000000001000  0000000000000000  WA       0     0     4

  [30] .bss              NOBITS           ffffffff81e9c000  0129c000

       0000000000142000  0000000000000000  WA       0     0     4096

  [31] .brk              NOBITS           ffffffff81fde000  0129c000

       0000000000026000  0000000000000000  WA       0     0     1

  [32] .comment          PROGBITS         0000000000000000  0129c000

       0000000000000029  0000000000000001  MS       0     0     1

  [33] .debug_aranges    PROGBITS         0000000000000000  0129c030

       0000000000023880  0000000000000000           0     0     16

  [34] .debug_info       PROGBITS         0000000000000000  012bf8b0

       000000000724bc4f  0000000000000000           0     0     1

  [35] .debug_abbrev     PROGBITS         0000000000000000  0850b4ff

       00000000002d7be9  0000000000000000           0     0     1

  [36] .debug_line       PROGBITS         0000000000000000  087e30e8

       000000000072232c  0000000000000000           0     0     1

  [37] .debug_frame      PROGBITS         0000000000000000  08f05418

       00000000001f5cd0  0000000000000000           0     0     8

  [38] .debug_str        PROGBITS         0000000000000000  090fb0e8

       00000000002b5264  0000000000000001  MS       0     0     1

  [39] .debug_loc        PROGBITS         0000000000000000  093b034c

       0000000000925080  0000000000000000           0     0     1

  [40] .debug_ranges     PROGBITS         0000000000000000  09cd53d0

       00000000003ac530  0000000000000000           0     0     16

  [41] .shstrtab         STRTAB           0000000000000000  0a081900

       00000000000001dd  0000000000000000           0     0     1

  [42] .symtab           SYMTAB           0000000000000000  0a0825e0

       00000000002490c0  0000000000000018          43   63525     8

  [43] .strtab           STRTAB           0000000000000000  0a2cb6a0

       0000000000219339  0000000000000000           0     0     1

Key to Flags:

  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)

  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)

  O (extra OS processing required) o (OS specific), p (processor specific)



# View the debug sections

hua@node1:/bak/linux/linux-2.6$ readelf -S vmlinux |grep debug

  [33] .debug_aranges    PROGBITS         0000000000000000  0129c030

  [34] .debug_info       PROGBITS         0000000000000000  012bf8b0

  [35] .debug_abbrev     PROGBITS         0000000000000000  0850b4ff

  [36] .debug_line       PROGBITS         0000000000000000  087e30e8

  [37] .debug_frame      PROGBITS         0000000000000000  08f05418

  [38] .debug_str        PROGBITS         0000000000000000  090fb0e8

  [39] .debug_loc        PROGBITS         0000000000000000  093b034c

  [40] .debug_ranges     PROGBITS         0000000000000000  09cd53d0



# readelf -e prints all headers of the ELF image: the ELF header, the section headers, and the program headers (segments)

hua@node1:/bak/linux/linux-2.6$ readelf -e vmlinux

ELF Header:

  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00

  Class:                             ELF64

  Data:                              2's complement, little endian

  Version:                           1 (current)

  OS/ABI:                            UNIX - System V

  ABI Version:                       0

  Type:                              EXEC (Executable file)

  Machine:                           Advanced Micro Devices X86-64

  Version:                           0x1

  Entry point address:               0x1000000

  Start of program headers:          64 (bytes into file)

  Start of section headers:          168303328 (bytes into file)

  Flags:                             0x0

  Size of this header:               64 (bytes)

  Size of program headers:           56 (bytes)

  Number of program headers:         5

  Size of section headers:           64 (bytes)

  Number of section headers:         44

  Section header string table index: 41


Section Headers:

  [Nr] Name              Type             Address           Offset

       Size              EntSize          Flags  Link  Info  Align

  [ 0]                   NULL             0000000000000000  00000000

       0000000000000000  0000000000000000           0     0     0

  [ 1] .text             PROGBITS         ffffffff81000000  00200000

       000000000074e5f8  0000000000000000  AX       0     0     4096

  [ 2] .notes            NOTE             ffffffff8174e5f8  0094e5f8

       0000000000000024  0000000000000000  AX       0     0     4

  [ 3] __ex_table        PROGBITS         ffffffff8174e620  0094e620

       0000000000002158  0000000000000000   A       0     0     8

  [ 4] .rodata           PROGBITS         ffffffff81800000  00a00000

       000000000033f9fe  0000000000000000   A       0     0     64

  [ 5] __bug_table       PROGBITS         ffffffff81b3fa00  00d3fa00

       00000000000072fc  0000000000000000   A       0     0     1

  [ 6] .pci_fixup        PROGBITS         ffffffff81b46d00  00d46d00

       0000000000003270  0000000000000000   A       0     0     8

  [ 7] .builtin_fw       PROGBITS         ffffffff81b49f70  00d49f70

       0000000000000120  0000000000000000   A       0     0     8

  [ 8] .tracedata        PROGBITS         ffffffff81b4a090  00d4a090

       0000000000000078  0000000000000000   A       0     0     1

  [ 9] __ksymtab         PROGBITS         ffffffff81b4a110  00d4a110

       00000000000118a0  0000000000000000   A       0     0     16

  [10] __ksymtab_gpl     PROGBITS         ffffffff81b5b9b0  00d5b9b0

       000000000000ecc0  0000000000000000   A       0     0     16

  [11] __kcrctab         PROGBITS         ffffffff81b6a670  00d6a670

       0000000000008c50  0000000000000000   A       0     0     8

  [12] __kcrctab_gpl     PROGBITS         ffffffff81b732c0  00d732c0

       0000000000007660  0000000000000000   A       0     0     8

  [13] __ksymtab_strings PROGBITS         ffffffff81b7a920  00d7a920

       00000000000268c3  0000000000000000   A       0     0     1

  [14] __init_rodata     PROGBITS         ffffffff81ba1200  00da1200

       0000000000000240  0000000000000000   A       0     0     32

  [15] __param           PROGBITS         ffffffff81ba1440  00da1440

       00000000000025d0  0000000000000000   A       0     0     8

  [16] __modver          PROGBITS         ffffffff81ba3a10  00da3a10

       00000000000005f0  0000000000000000   A       0     0     8

  [17] .data             PROGBITS         ffffffff81c00000  00e00000

       0000000000144140  0000000000000000  WA       0     0     4096

  [18] .vvar             PROGBITS         ffffffff81d45000  00f45000

       0000000000001000  0000000000000000  WA       0     0     16

  [19] .data..percpu     PROGBITS         0000000000000000  01000000

       000000000001f918  0000000000000000  WA       0     0     4096

  [20] .init.text        PROGBITS         ffffffff81d66000  01166000

       0000000000060879  0000000000000000  AX       0     0     16

  [21] .init.data        PROGBITS         ffffffff81dc7000  011c7000

       00000000000c2e90  0000000000000000  WA       0     0     4096

  [22] .x86_cpu_dev.init PROGBITS         ffffffff81e89e90  01289e90

       0000000000000018  0000000000000000   A       0     0     8

  [23] .altinstructions  PROGBITS         ffffffff81e89ea8  01289ea8

       0000000000005f44  0000000000000000   A       0     0     1

  [24] .altinstr_replace PROGBITS         ffffffff81e8fdec  0128fdec

       00000000000017db  0000000000000000  AX       0     0     1

  [25] .iommu_table      PROGBITS         ffffffff81e915c8  012915c8

       00000000000000f0  0000000000000000   A       0     0     8

  [26] .apicdrivers      PROGBITS         ffffffff81e916b8  012916b8

       0000000000000030  0000000000000000  WA       0     0     8

  [27] .exit.text        PROGBITS         ffffffff81e916e8  012916e8

       0000000000001e26  0000000000000000  AX       0     0     1

  [28] .smp_locks        PROGBITS         ffffffff81e94000  01294000

       0000000000007000  0000000000000000   A       0     0     4

  [29] .data_nosave      PROGBITS         ffffffff81e9b000  0129b000

       0000000000001000  0000000000000000  WA       0     0     4

  [30] .bss              NOBITS           ffffffff81e9c000  0129c000

       0000000000142000  0000000000000000  WA       0     0     4096

  [31] .brk              NOBITS           ffffffff81fde000  0129c000

       0000000000026000  0000000000000000  WA       0     0     1

  [32] .comment          PROGBITS         0000000000000000  0129c000

       0000000000000029  0000000000000001  MS       0     0     1

  [33] .debug_aranges    PROGBITS         0000000000000000  0129c030

       0000000000023880  0000000000000000           0     0     16

  [34] .debug_info       PROGBITS         0000000000000000  012bf8b0

       000000000724bc4f  0000000000000000           0     0     1

  [35] .debug_abbrev     PROGBITS         0000000000000000  0850b4ff

       00000000002d7be9  0000000000000000           0     0     1

  [36] .debug_line       PROGBITS         0000000000000000  087e30e8

       000000000072232c  0000000000000000           0     0     1

  [37] .debug_frame      PROGBITS         0000000000000000  08f05418

       00000000001f5cd0  0000000000000000           0     0     8

  [38] .debug_str        PROGBITS         0000000000000000  090fb0e8

       00000000002b5264  0000000000000001  MS       0     0     1

  [39] .debug_loc        PROGBITS         0000000000000000  093b034c

       0000000000925080  0000000000000000           0     0     1

  [40] .debug_ranges     PROGBITS         0000000000000000  09cd53d0

       00000000003ac530  0000000000000000           0     0     16

  [41] .shstrtab         STRTAB           0000000000000000  0a081900

       00000000000001dd  0000000000000000           0     0     1

  [42] .symtab           SYMTAB           0000000000000000  0a0825e0

       00000000002490c0  0000000000000018          43   63525     8

  [43] .strtab           STRTAB           0000000000000000  0a2cb6a0

       0000000000219339  0000000000000000           0     0     1

Key to Flags:

  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)

  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)

  O (extra OS processing required) o (OS specific), p (processor specific)


Program Headers:

  Type           Offset             VirtAddr           PhysAddr

                 FileSiz            MemSiz              Flags  Align

  LOAD           0x0000000000200000 0xffffffff81000000 0x0000000001000000

                 0x0000000000ba4000 0x0000000000ba4000  R E    200000

  LOAD           0x0000000000e00000 0xffffffff81c00000 0x0000000001c00000

                 0x0000000000146000 0x0000000000146000  RW     200000

  LOAD           0x0000000001000000 0x0000000000000000 0x0000000001d46000

                 0x000000000001f918 0x000000000001f918  RW     200000

  LOAD           0x0000000001166000 0xffffffff81d66000 0x0000000001d66000

                 0x0000000000136000 0x000000000029e000  RWE    200000

  NOTE           0x000000000094e5f8 0xffffffff8174e5f8 0x000000000174e5f8

                 0x0000000000000024 0x0000000000000024         4


 Section to Segment mapping:

  Segment Sections...

   00     .text .notes __ex_table .rodata __bug_table .pci_fixup .builtin_fw .tracedata __ksymtab __ksymtab_gpl __kcrctab __kcrctab_gpl __ksymtab_strings __init_rodata __param __modver

   01     .data .vvar

   02     .data..percpu

   03     .init.text .init.data .x86_cpu_dev.init .altinstructions .altinstr_replacement .iommu_table .apicdrivers .exit.text .smp_locks .data_nosave .bss .brk

   04     .notes


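Appendix 3: The DWARF format

DWARF (Debugging With Attributed Record Formats) is the debugging data format that accompanies ELF; the name is a play on "ELF", not a synonym for it. The debug information is stored in the .debug_* sections of the ELF file. Starting with gcc 4.8, DWARF version 4 is the default format (the Linux kernel option is DEBUG_INFO_DWARF4).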


Appendix 4: Kernel debugging example 1


# An invalid memory access. Similar messages include "unable to handle kernel paging request at xxx" and "unable to handle kernel NULL pointer dereference at"

[158108.522856] general protection fault: 0000 [#1] SMP  

# Module information

[158108.531877] Modules linked in: dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag veth xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_crypt gpio_ich xfs x86_pkg_temp_thermal intel_powerclamp coretemp bridge kvm_intel stp kvm joydev llc mei_me mei shpchp lpc_ich ipmi_si acpi_power_meter acpi_pad mac_hid btrfs xor raid6_pq libcrc32c ses enclosure crct10dif_pclmul crc32_pclmul ixgbe igb aesni_intel aes_x86_64 hid_generic dca lrw gf128mul ptp glue_helper usbhid ablk_helper cryptd hid pps_core i2c_algo_bit megaraid_sas mdio wmi

# CPU is 20, PID is 0, command is swapper/20, followed by the kernel version and hardware information

[158108.654066] CPU: 20 PID: 0 Comm: swapper/20 Not tainted 3.13.0-74-generic #118-Ubuntu

[158108.675000] Hardware name: Cisco Systems Inc UCSC-C240-M4SX/UCSC-C240-M4SX, BIOS C240M4.2.0.8b.0.080620151546 08/06/2015

# task is the kernel address of the task_struct (per-cpu variable current_task); ti is the kernel address of current_thread_info

[158108.699921] task: ffff883f2653b000 ti: ffff883f26536000 task.ti: ffff883f26536000

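# Register information. On x86, %cr2 holds the most recent page fault address. RAX holds an invalid value: dead000000200200 is LIST_POISON2 on x86_64, the marker that list_del() writes into the prev pointer of an entry that has already been removed from a list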

[158108.720992] RIP: 0010:[<ffffffff810756a4>]  [<ffffffff810756a4>] detach_if_pending+0x34/0xb0

[158108.744725] RSP: 0018:ffff887f7f083d10  EFLAGS: 00010002

[158108.757586] RAX: dead000000200200 RBX: ffffffffa012f040 RCX: 0000000000001896

[158108.779778] RDX: ffff887f25d00938 RSI: ffff887f25eb8000 RDI: ffffffffa012f040

[158108.802864] RBP: ffff887f7f083d30 R08: 0000000000000086 R09: ffff887f25d74000

[158108.826882] R10: 0000000000000002 R11: 0000000000000005 R12: ffffffffa012f040

[158108.851851] R13: ffff887f25eb8000 R14: 0000000000000001 R15: 0000000000000001

[158108.877347] FS:  0000000000000000(0000) GS:ffff887f7f080000(0000) knlGS:0000000000000000

[158108.903997] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033

[158108.918882] CR2: 00000000006f0e58 CR3: 0000000001c0e000 CR4: 00000000001407e0

# Raw hexadecimal dump of the stack

[158108.943906] Stack:

[158108.954323]  ffffffffa012f040 0000000000000000 ffff887f25eb8000 ffff883f22d7ea00

[158108.978987]  ffff887f7f083d60 ffffffff81075766 0000000000000086 ffffffffa012f020

[158109.003697]  ffff887f7f083d98 0000000000000100 ffff887f7f083d88 ffffffff81082369

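# Symbolic call stack backtrace. Combined with %rip this is very useful information. Each entry also gives the offset into the function and the function's size (offset/size). Entries prefixed with a question mark are addresses found on the stack that the unwinder could not confirm, so they may be stale.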

[158109.028467] Call Trace:

[158109.039181]  <IRQ> 

[158109.041507]  [<ffffffff81075766>] del_timer+0x46/0x70

[158109.062562]  [<ffffffff81082369>] try_to_grab_pending+0xa9/0x160

[158109.076953]  [<ffffffff81082453>] mod_delayed_work_on+0x33/0x70

[158109.091233]  [<ffffffffa012c3ba>] set_timeout+0x3a/0x40 [ib_addr]

[158109.105194]  [<ffffffffa012c559>] netevent_callback+0x29/0x30 [ib_addr]

[158109.120083]  [<ffffffff8173125c>] notifier_call_chain+0x4c/0x70

[158109.134153]  [<ffffffff81634a60>] ? neigh_table_clear+0x120/0x120

[158109.148004]  [<ffffffff817312ba>] atomic_notifier_call_chain+0x1a/0x20

[158109.162487]  [<ffffffff8163100b>] call_netevent_notifiers+0x1b/0x20

[158109.176677]  [<ffffffff81634b21>] neigh_timer_handler+0xc1/0x2c0

[158109.189976]  [<ffffffff810745d6>] call_timer_fn+0x36/0x100

[158109.202723]  [<ffffffff81634a60>] ? neigh_table_clear+0x120/0x120

[158109.216443]  [<ffffffff8107556f>] run_timer_softirq+0x1ef/0x2f0

[158109.229444]  [<ffffffff8106cd2c>] __do_softirq+0xec/0x2c0

[158109.241890]  [<ffffffff8106d275>] irq_exit+0x105/0x110

[158109.253555]  [<ffffffff81737b15>] smp_apic_timer_interrupt+0x45/0x60

[158109.266647]  [<ffffffff8173649d>] apic_timer_interrupt+0x6d/0x80

[158109.279320]  <EOI> 

[158109.281647]  [<ffffffff815d65b2>] ? cpuidle_enter_state+0x52/0xc0

[158109.300117]  [<ffffffff815d66d9>] cpuidle_idle_call+0xb9/0x1f0

[158109.312100]  [<ffffffff8101d3ee>] arch_cpu_idle+0xe/0x30

[158109.323777]  [<ffffffff810bf475>] cpu_startup_entry+0xc5/0x290

[158109.335775]  [<ffffffff810415ed>] start_secondary+0x21d/0x2d0

# Raw bytes (instruction stream), only useful when disassembling; the byte in <> is the one at the faulting RIP

[158109.347654] Code: 89 e5 41 56 41 89 d6 41 55 41 54 49 89 fc 53 48 8b 17 48 85 d2 74 55 49 89 f5 0f 1f 44 00 00 49 8b 44 24 08 45 84 f6 48 89 42 08 <48> 89 10 74 08 49 c7 04 24 00 00 00 00 41 f6 44 24 18 01 48 b8 

#Reprint of instruction pointer, current function, and stack pointer

[158109.386072] RIP  [<ffffffff810756a4>] detach_if_pending+0x34/0xb0

[158109.398404]  RSP <ffff887f7f083d10>


Use the kernel image and the RIP register value above to locate the relevant code:

addr2line -e ddeb/vmlinux-3.13.0-74-generic 0xffffffff810756a4

linux-3.13.0/include/linux/list.h:89

static inline void __list_del(struct list_head * prev, struct list_head * next)

{

        next->prev = prev;

        prev->next = next;    <<=== HERE

}
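The address passed to addr2line comes from the RIP line of the oops, which also names the location as detach_if_pending+0x34/0xb0, i.e. offset 0x34 into a function that is 0xb0 bytes long. As a small illustration of how that notation breaks down, here is a sketch; parse_oops_loc is a hypothetical helper name, not a kernel tool.

```shell
#!/bin/sh
# Sketch: split an oops location such as "detach_if_pending+0x34/0xb0"
# into symbol, offset and function size (printed in decimal).
# parse_oops_loc is a hypothetical helper name.
parse_oops_loc() {
    loc=$1
    sym=${loc%%+*}      # text before "+":  detach_if_pending
    rest=${loc#*+}      # text after "+":   0x34/0xb0
    off=${rest%%/*}     # offset into the function: 0x34
    size=${rest#*/}     # total size of the function: 0xb0
    printf '%s offset=%d size=%d\n' "$sym" "$(($off))" "$(($size))"
}

parse_oops_loc "detach_if_pending+0x34/0xb0"
# prints: detach_if_pending offset=52 size=176
```

With a vmlinux that carries debug info loaded in gdb, the same symbol+offset form can usually be given directly, for example list *(detach_if_pending+0x34), as an alternative to addr2line.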


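The high-level C code above does not reveal much on its own. Next, look up the corresponding assembly by symbol in the ELF file vmlinux (System.map can be used to find a symbol's address):

% objdump -d -l --start-address=0xffffffff81075670 --stop-address=0xffffffff81075720 ddeb/vmlinux-3.13.0-74-generic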

[...]

ffffffff81075670 <detach_if_pending>:

[...]

detach_timer():

/build/linux-_xRakU/linux-3.13.0/kernel/timer.c:662

ffffffff81075698:       49 8b 44 24 08          mov    0x8(%r12),%rax

/build/linux-_xRakU/linux-3.13.0/kernel/timer.c:663

ffffffff8107569d:       45 84 f6                test   %r14b,%r14b

__list_del():

/build/linux-_xRakU/linux-3.13.0/include/linux/list.h:88

ffffffff810756a0:       48 89 42 08             mov    %rax,0x8(%rdx)

/build/linux-_xRakU/linux-3.13.0/include/linux/list.h:89

ffffffff810756a4:       48 89 10                mov    %rdx,(%rax)


The calls detach_if_pending -> detach_timer -> __list_del are nested (the latter two are inlined into detach_if_pending), and the panic originates in this chain.

static int detach_if_pending(struct timer_list *timer, struct tvec_base *base,

                             bool clear_pending)

{

        if (!timer_pending(timer))

                return 0;

        detach_timer(timer, clear_pending);        <== HERE

...


static inline int timer_pending(const struct timer_list * timer)

{

        return timer->entry.next != NULL;

}


Read the code together with the stack trace, then search git log using keywords drawn from it to see whether the bug has already been fixed.


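Appendix 5: Kernel debugging example 2: memory


# mode:0x4020 contains GFP_ATOMIC (0x20 in this 3.2 kernel) plus __GFP_COMP (0x4000). GFP_ATOMIC means: the caller cannot sleep and wait for memory to be made available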

[3387282.901263] ceph-osd: page allocation failure: order:2, mode:0x4020

[3387282.901271] Pid: 10125, comm: ceph-osd Tainted: G         C   3.2.0-51-generic #77-Ubuntu

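# The stack shows the failure was not caused by ceph-osd as first assumed; a network device was allocating receive buffers

# order:2 above means 2^2 = 4 contiguous pages (16 KB in total) were requested, for MTU 9000 jumbo frames, but no contiguous 16 KB block of memory could be found.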
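The order-to-size arithmetic in the comment above can be written out as a short sketch. It assumes 4 KiB pages, the usual x86_64 page size; order_to_bytes is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: size in bytes of a page allocation of a given order,
# assuming 4 KiB pages. order_to_bytes is a hypothetical helper name.
PAGE_SIZE=4096

order_to_bytes() {
    # An order-n allocation is 2^n physically contiguous pages.
    echo $(( (1 << $1) * PAGE_SIZE ))
}

order_to_bytes 2    # prints 16384 (16 KiB)
```

A 9000-byte frame plus its headers no longer fits into 8 KiB (order 1), which is why the receive buffer is rounded up to an order-2 allocation.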

[3387282.901274] Call Trace:

[3387282.901277]  <IRQ>  [<ffffffff8111e9a6>] warn_alloc_failed+0xf6/0x150

[3387282.901294]  [<ffffffff815349ac>] ? sk_reset_timer+0x1c/0x30

[3387282.901301]  [<ffffffff81599773>] ? tcp_send_delayed_ack+0xe3/0xf0

[3387282.901308]  [<ffffffff8158d3c0>] ? __tcp_ack_snd_check+0x70/0xa0

[3387282.901314]  [<ffffffff81122737>] __alloc_pages_nodemask+0x6d7/0x8f0

[3387282.901320]  [<ffffffff8159d7bf>] ? tcp_v4_do_rcv+0xff/0x1d0

[3387282.901330]  [<ffffffff8164bf15>] kmalloc_large_node+0x57/0x85

[3387282.901338]  [<ffffffff81167bb5>] __kmalloc_node_track_caller+0x195/0x1e0

[3387282.901344]  [<ffffffff81538a4b>] ? __alloc_skb+0x4b/0x240

[3387282.901349]  [<ffffffff815390c4>] ? __netdev_alloc_skb+0x24/0x50

[3387282.901354]  [<ffffffff81538a78>] __alloc_skb+0x78/0x240

[3387282.901359]  [<ffffffff815390c4>] __netdev_alloc_skb+0x24/0x50

[3387282.901373]  [<ffffffffa00a8909>] ixgbe_alloc_rx_buffers+0x289/0x350 [ixgbe]

[3387282.901380]  [<ffffffff81546fc0>] ? napi_skb_finish+0x50/0x70

[3387282.901385]  [<ffffffff815475f5>] ? napi_gro_receive+0xf5/0x140

[3387282.901393]  [<ffffffffa00a91bb>] ixgbe_clean_rx_irq+0x7eb/0x8a0 [ixgbe]

[3387282.901401]  [<ffffffffa00a99ee>] ixgbe_poll+0xae/0x1a0 [ixgbe]

[3387282.901406]  [<ffffffff81547844>] net_rx_action+0x134/0x290

[3387282.901412]  [<ffffffff8115d753>] ? isolate_migratepages+0x333/0x660

[3387282.901418]  [<ffffffff8106f9e8>] __do_softirq+0xa8/0x210

[3387282.901425]  [<ffffffff816606be>] ? _raw_spin_lock+0xe/0x20

[3387282.901432]  [<ffffffff8166af6c>] call_softirq+0x1c/0x30

[3387282.901439]  [<ffffffff810162f5>] do_softirq+0x65/0xa0

[3387282.901444]  [<ffffffff8106fdce>] irq_exit+0x8e/0xb0

[3387282.901450]  [<ffffffff8166b833>] do_IRQ+0x63/0xe0

[3387282.901455]  [<ffffffff81660b6e>] common_interrupt+0x6e/0x6e

[3387282.901458]  <EOI>  [<ffffffff8115d753>] ? isolate_migratepages+0x333/0x660

[3387282.901467]  [<ffffffff8115d74d>] ? isolate_migratepages+0x32d/0x660

[3387282.901472]  [<ffffffff8115dadf>] compact_zone.part.14+0x5f/0x270

[3387282.901478]  [<ffffffff8115ddd7>] compact_zone+0x37/0x50

[3387282.901482]  [<ffffffff8115df63>] compact_zone_order+0x83/0xb0

[3387282.901488]  [<ffffffff8115e05d>] try_to_compact_pages+0xcd/0x100

[3387282.901494]  [<ffffffff8164b17e>] __alloc_pages_direct_compact+0xb2/0x178

[3387282.901500]  [<ffffffff81122595>] __alloc_pages_nodemask+0x535/0x8f0

[3387282.901508]  [<ffffffff8164bf15>] kmalloc_large_node+0x57/0x85

[3387282.901514]  [<ffffffff81167bb5>] __kmalloc_node_track_caller+0x195/0x1e0

[3387282.901520]  [<ffffffff81538a4b>] ? __alloc_skb+0x4b/0x240

[3387282.901526]  [<ffffffff81589034>] ? sk_stream_alloc_skb+0x44/0x120

[3387282.901531]  [<ffffffff81538a78>] __alloc_skb+0x78/0x240

[3387282.901536]  [<ffffffff81589034>] sk_stream_alloc_skb+0x44/0x120

[3387282.901541]  [<ffffffff81589518>] tcp_sendmsg+0x408/0xd90

[3387282.901548]  [<ffffffff815af564>] inet_sendmsg+0x64/0xb0

[3387282.901554]  [<ffffffff81057d15>] ? reweight_entity+0x165/0x180

[3387282.901562]  [<ffffffff812d9837>] ? apparmor_socket_sendmsg+0x17/0x20

[3387282.901569]  [<ffffffff8152e49e>] sock_sendmsg+0x10e/0x130

[3387282.901574]  [<ffffffff8105725d>] ? set_next_entity+0xad/0xd0

[3387282.901580]  [<ffffffff810573fa>] ? finish_task_switch+0x4a/0xf0

[3387282.901586]  [<ffffffff8165e14c>] ? __schedule+0x3cc/0x6f0

[3387282.901591]  [<ffffffff8165e79f>] ? schedule+0x3f/0x60

[3387282.901596]  [<ffffffff8153c766>] ? verify_iovec+0x56/0xd0

[3387282.901602]  [<ffffffff81530076>] ___sys_sendmsg+0x396/0x3b0

[3387282.901609]  [<ffffffff8109fd16>] ? get_futex_key+0x166/0x2d0

[3387282.901614]  [<ffffffff816606be>] ? _raw_spin_lock+0xe/0x20

[3387282.901619]  [<ffffffff810a02f3>] ? futex_wake+0x113/0x130

[3387282.901624]  [<ffffffff8109ff81>] ? futex_wait+0x1/0x210

[3387282.901630]  [<ffffffff81532029>] __sys_sendmsg+0x49/0x90

[3387282.901636]  [<ffffffff81532089>] sys_sendmsg+0x19/0x20

[3387282.901642]  [<ffffffff81668d02>] system_call_fastpath+0x16/0x1b

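#The per-NUMA-node information below is of limited use; kernels >= 4.1 no longer print this part.

#DMA: reserved for ISA devices, physical addresses below 16MB

#DMA32: reserved for 32-bit PCI devices, physical addresses below 4GB

#Normal: on x86_64, all remaining memory; on i686 it covers 16MB -> 896MB

#HighMem: on i686, memory above 896MB, which has to be mapped through the MMU before the kernel can access it

#Fields other than the four zones above, such as active_anon, are not explained here.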

[3387282.901645] Mem-Info:

[3387282.901647] Node 0 DMA per-cpu:

[3387282.901651] CPU    0: hi:    0, btch:   1 usd:   0

[3387282.901654] CPU    1: hi:    0, btch:   1 usd:   0

[3387282.901657] CPU    2: hi:    0, btch:   1 usd:   0

[3387282.901660] CPU    3: hi:    0, btch:   1 usd:   0

[3387282.901663] CPU    4: hi:    0, btch:   1 usd:   0

[3387282.901666] CPU    5: hi:    0, btch:   1 usd:   0

[3387282.901669] CPU    6: hi:    0, btch:   1 usd:   0

[3387282.901672] CPU    7: hi:    0, btch:   1 usd:   0

[3387282.901675] CPU    8: hi:    0, btch:   1 usd:   0

[3387282.901677] CPU    9: hi:    0, btch:   1 usd:   0

[3387282.901680] CPU   10: hi:    0, btch:   1 usd:   0

[3387282.901683] CPU   11: hi:    0, btch:   1 usd:   0

[3387282.901686] CPU   12: hi:    0, btch:   1 usd:   0

[3387282.901689] CPU   13: hi:    0, btch:   1 usd:   0

[3387282.901692] CPU   14: hi:    0, btch:   1 usd:   0

[3387282.901695] CPU   15: hi:    0, btch:   1 usd:   0

[3387282.901697] Node 0 DMA32 per-cpu:

[3387282.901701] CPU    0: hi:  186, btch:  31 usd:  86

[3387282.901704] CPU    1: hi:  186, btch:  31 usd:   0

[3387282.901707] CPU    2: hi:  186, btch:  31 usd:   0

[3387282.901710] CPU    3: hi:  186, btch:  31 usd:   0

[3387282.901712] CPU    4: hi:  186, btch:  31 usd:   0

[3387282.901715] CPU    5: hi:  186, btch:  31 usd:  96

[3387282.901718] CPU    6: hi:  186, btch:  31 usd:   0

[3387282.901721] CPU    7: hi:  186, btch:  31 usd:  16

[3387282.901724] CPU    8: hi:  186, btch:  31 usd:   0

[3387282.901727] CPU    9: hi:  186, btch:  31 usd:   0

[3387282.901730] CPU   10: hi:  186, btch:  31 usd:   0

[3387282.901732] CPU   11: hi:  186, btch:  31 usd:   0

[3387282.901735] CPU   12: hi:  186, btch:  31 usd:   0

[3387282.901738] CPU   13: hi:  186, btch:  31 usd:  78

[3387282.901741] CPU   14: hi:  186, btch:  31 usd:   0

[3387282.901744] CPU   15: hi:  186, btch:  31 usd:   0

[3387282.901746] Node 0 Normal per-cpu:

[3387282.901750] CPU    0: hi:  186, btch:  31 usd: 162

[3387282.901753] CPU    1: hi:  186, btch:  31 usd:  29

[3387282.901756] CPU    2: hi:  186, btch:  31 usd:  40

[3387282.901759] CPU    3: hi:  186, btch:  31 usd:  42

[3387282.901762] CPU    4: hi:  186, btch:  31 usd:  42

[3387282.901765] CPU    5: hi:  186, btch:  31 usd: 221

[3387282.901768] CPU    6: hi:  186, btch:  31 usd:  37

[3387282.901771] CPU    7: hi:  186, btch:  31 usd: 182

[3387282.901774] CPU    8: hi:  186, btch:  31 usd:   0

[3387282.901777] CPU    9: hi:  186, btch:  31 usd:   0

[3387282.901780] CPU   10: hi:  186, btch:  31 usd:  29

[3387282.901783] CPU   11: hi:  186, btch:  31 usd:  22

[3387282.901786] CPU   12: hi:  186, btch:  31 usd:   0

[3387282.901789] CPU   13: hi:  186, btch:  31 usd: 156

[3387282.901792] CPU   14: hi:  186, btch:  31 usd:   6

[3387282.901795] CPU   15: hi:  186, btch:  31 usd:   0

[3387282.901802] active_anon:277242 inactive_anon:22700 isolated_anon:0

[3387282.901804]  active_file:5468942 inactive_file:9468439 isolated_file:0

[3387282.901805]  unevictable:0 dirty:95 writeback:0 unstable:0

[3387282.901807]  free:103654 slab_reclaimable:700786 slab_unreclaimable:89064

[3387282.901808]  mapped:3932 shmem:22 pagetables:3338 bounce:0

#System-wide memory statistics (compare /proc/vmstat and /proc/zoneinfo). If free and slab_reclaimable are both very low, physical memory is running out.

[3387282.901811] Node 0 DMA free:15896kB min:16kB low:20kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15640kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes

[3387282.901825] lowmem_reserve[]: 0 1936 64432 64432

#The free value is large, so the failure is not caused by a shortage of physical memory.

[3387282.901830] Node 0 DMA32 free:250560kB min:2028kB low:2532kB high:3040kB active_anon:12kB inactive_anon:116kB active_file:31272kB inactive_file:276136kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1982592kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:846404kB slab_unreclaimable:144884kB kernel_stack:3696kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no

[3387282.901845] lowmem_reserve[]: 0 0 62496 62496

[3387282.901850] Node 0 Normal free:148160kB min:65536kB low:81920kB high:98304kB active_anon:1108956kB inactive_anon:90684kB active_file:21844496kB inactive_file:37597620kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:63995904kB mlocked:0kB dirty:380kB writeback:0kB mapped:15724kB shmem:88kB slab_reclaimable:1956740kB slab_unreclaimable:211372kB kernel_stack:18960kB pagetables:13352kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no

[3387282.901864] lowmem_reserve[]: 0 0 0 0

[3387282.901869] Node 0 DMA: 0*4kB 1*8kB 1*16kB 0*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15896kB

[3387282.901883] Node 0 DMA32: 2010*4kB 2207*8kB 4168*16kB 2405*32kB 949*64kB 124*128kB 2*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 250560kB

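#0*16kB in the Normal line below means the Normal zone has no free 16 KB blocks left (above 8 kB only a single 4096 kB block remains), and this is where the order-2 allocation runs into trouble.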

[3387282.901897] Node 0 Normal: 36611*4kB 16*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 150668kB
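
The per-zone free-list lines are easier to judge once they are summed up. The sketch below parses one such line and reports the total free memory plus how many free blocks are at least a given size. The function name buddy_summary and its output format are made up for this example; it relies only on standard sh and awk.

```shell
#!/bin/sh
# Summarize one "Node N ZONE: count*sizekB ..." line from an allocation
# failure report.
#   $1 = minimum block size of interest, in kB
#   $2 = the log line (quote it, since it contains '*')
buddy_summary() {
    printf '%s\n' "$2" | awk -v min="$1" '
    {
        total = 0; usable = 0
        for (i = 1; i <= NF; i++) {
            # Only fields shaped like 36611*4kB are free-list entries.
            if ($i ~ /^[0-9]+[*][0-9]+kB$/) {
                split($i, part, "[*]")
                count = part[1] + 0
                size = part[2] + 0      # numeric prefix of e.g. "4kB"
                total += count * size
                if (size >= min) usable += count
            }
        }
        printf "total=%dkB blocks_ge_%dkB=%d\n", total, min, usable
    }'
}

buddy_summary 16 'Node 0 Normal: 36611*4kB 16*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 150668kB'
```

For the Normal line this shows about 150 MB free, yet only one block of 16 KB or larger, which is the fragmentation pattern behind the failure. On a live system, /proc/buddyinfo exposes the same per-order free counts in a different layout.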

[3387282.901917] 14937512 total pagecache pages

[3387282.901920] 8 pages in swap cache

[3387282.901923] Swap cache stats: add 250, delete 242, find 465/466

[3387282.901925] Free swap  = 3902516kB

[3387282.901927] Total swap = 3903484kB