Author: Ajay Vijayvargiya | 2011-01-24

Preamble

All of us have used some kind of debugger while programming. The debugger you used may have been for C++, C#, Java or another language. It might be standalone, like WinDbg, or built into an IDE, like Visual Studio. But have you ever been curious about how debuggers work?

Well, this article lifts the lid on how debuggers work. It covers only writing a debugger on Windows. Please note that I am concerned here only with the debugger itself, not with compilers, linkers, or debugging extensions. Thus, we'll only debug executables (as WinDbg does). This article assumes that the reader has a basic understanding of multithreading (see my article on multithreading).

1. How to Debug a Program?

Two steps:

  1. Starting the process with DEBUG_ONLY_THIS_PROCESS or DEBUG_PROCESS flags.
  2. Setting up the debugger's loop that will handle debugging events.

Before we move further, please remember:

  1. The debugger is the process/program that debugs another process (the target process).
  2. The debuggee is the process being debugged by the debugger.
  3. Only one debugger can be attached to a debuggee. However, a debugger can debug multiple processes (in separate threads).
  4. Only the thread that created/spawned the debuggee can debug the target process. Thus, CreateProcess and the debugger-loop must be in the same thread.
  5. When the debugger thread terminates, the debuggee terminates as well. The debugger process may keep running, however.
  6. While the debugger's debugging thread is busy processing a debug event, all threads in the debuggee (target process) stay suspended. More on this later.

A. Starting the process with the debugging flag

Use CreateProcess to start the process, specifying DEBUG_ONLY_THIS_PROCESS as the sixth parameter (dwCreationFlags). With this flag, we ask Windows to report all debugging events to this thread, including process creation/termination, thread creation/termination, runtime exceptions, and so on. A detailed explanation is given below. Please note that we'll be using DEBUG_ONLY_THIS_PROCESS in this article. It essentially means that we want to debug only the process we are creating, and not any child processes that it may create.

STARTUPINFO si;
PROCESS_INFORMATION pi;

ZeroMemory( &si, sizeof(si) );
si.cb = sizeof(si);
ZeroMemory( &pi, sizeof(pi) );

CreateProcess ( ProcessNameToDebug, NULL, NULL, NULL, FALSE,
                DEBUG_ONLY_THIS_PROCESS, NULL, NULL, &si, &pi );


After this call, you will see the process in Task Manager, but it has not started running yet: the newly created process is suspended. We don't have to call ResumeThread; instead, we write a debugger-loop.

B. The debugger loop

The debugger-loop is the heart of a debugger! The loop runs around the WaitForDebugEvent API. This API takes two parameters: a pointer to a DEBUG_EVENT structure and a DWORD timeout. For the timeout, we simply specify INFINITE. The API is exported by kernel32.dll, so no extra library needs to be linked.

BOOL WaitForDebugEvent(DEBUG_EVENT* lpDebugEvent, DWORD dwMilliseconds);


The DEBUG_EVENT structure contains the debugging event information. It has four members: Debug event code, process-ID, thread-ID, and the event information. As soon as WaitForDebugEvent returns, we process the received debugging event, and then eventually call ContinueDebugEvent. Here is a minimal debugger-loop:

DEBUG_EVENT debug_event = {0};

for(;;)
{
    if (!WaitForDebugEvent(&debug_event, INFINITE))
        return;

    ProcessDebugEvent(&debug_event);  // User-defined function, not API

    ContinueDebugEvent(debug_event.dwProcessId,
                       debug_event.dwThreadId,
                       DBG_CONTINUE);
}


Using the ContinueDebugEvent API, we ask the OS to continue executing the debuggee. dwProcessId and dwThreadId specify the process and thread; they are the same values we received from WaitForDebugEvent. The last parameter specifies how execution should continue, and it is relevant only when an exception event has been received. We will cover this later. Until then, we'll use only DBG_CONTINUE (the other possible value is DBG_EXCEPTION_NOT_HANDLED).

2. Handling debugging events

There are nine different major debugging events, and 20 different sub-events under the exception-event category. I will discuss them, starting from the simplest. Here is the DEBUG_EVENT structure:

struct DEBUG_EVENT
{
    DWORD dwDebugEventCode;
    DWORD dwProcessId;
    DWORD dwThreadId;
    union {
        EXCEPTION_DEBUG_INFO Exception;
        CREATE_THREAD_DEBUG_INFO CreateThread;
        CREATE_PROCESS_DEBUG_INFO CreateProcessInfo;
        EXIT_THREAD_DEBUG_INFO ExitThread;
        EXIT_PROCESS_DEBUG_INFO ExitProcess;
        LOAD_DLL_DEBUG_INFO LoadDll;
        UNLOAD_DLL_DEBUG_INFO UnloadDll;
        OUTPUT_DEBUG_STRING_INFO DebugString;
        RIP_INFO RipInfo;
    } u;
};



On successful return, WaitForDebugEvent fills in this structure. dwDebugEventCode specifies which debugging event has occurred. Depending on the event code received, one member of the union u contains the event information, and we should use only that member. For example, if the debug event code is OUTPUT_DEBUG_STRING_EVENT, the DebugString member (of type OUTPUT_DEBUG_STRING_INFO) is the valid one.
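The pairing between event codes and union members can be sketched as a small lookup. This is only an illustration: the enum below redefines the event codes under hypothetical names (with the same numeric values as the Windows SDK headers) so that it compiles without windows.h, and UnionMemberFor is a made-up helper, not part of any API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical names; the numeric values mirror the debug event codes
// defined by the Windows SDK (EXCEPTION_DEBUG_EVENT = 1, and so on).
enum DebugEventCode : std::uint32_t {
    kExceptionDebugEvent     = 1,
    kCreateThreadDebugEvent  = 2,
    kCreateProcessDebugEvent = 3,
    kExitThreadDebugEvent    = 4,
    kExitProcessDebugEvent   = 5,
    kLoadDllDebugEvent       = 6,
    kUnloadDllDebugEvent     = 7,
    kOutputDebugStringEvent  = 8,
    kRipEvent                = 9
};

// Returns the name of the DEBUG_EVENT union member that is valid for a
// given dwDebugEventCode value.
std::string UnionMemberFor(std::uint32_t code)
{
    switch (code) {
        case kExceptionDebugEvent:     return "Exception";
        case kCreateThreadDebugEvent:  return "CreateThread";
        case kCreateProcessDebugEvent: return "CreateProcessInfo";
        case kExitThreadDebugEvent:    return "ExitThread";
        case kExitProcessDebugEvent:   return "ExitProcess";
        case kLoadDllDebugEvent:       return "LoadDll";
        case kUnloadDllDebugEvent:     return "UnloadDll";
        case kOutputDebugStringEvent:  return "DebugString";
        case kRipEvent:                return "RipInfo";
        default:                       return "(unknown event code)";
    }
}
```

In a real debugger the same switch would sit inside the user-defined event handler, with each case reading the corresponding member of debug_event.u.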

A. Processing OUTPUT_DEBUG_STRING_EVENT

Programmers generally use OutputDebugString to generate debugging text that is displayed in the debugger's 'Output' window. Depending on the language/framework you use, you might be familiar with the TRACE and ATLTRACE macros. .NET programmers may use the System.Diagnostics.Debug.Print/System.Diagnostics.Trace.WriteLine methods (or other methods). With all of these, the OutputDebugString API is ultimately called, and the debugger receives this event (unless the call is compiled out because the DEBUG symbol is undefined!).

When this event is received, we work on the DebugString member variable. The structure OUTPUT_DEBUG_STRING_INFO is defined as:

struct OUTPUT_DEBUG_STRING_INFO
{
   LPSTR lpDebugStringData;  // char*
   WORD fUnicode;
   WORD nDebugStringLength;
};
 
The member variable 'nDebugStringLength' specifies the length of the string, including the terminating null, in characters (not bytes). The variable 'fUnicode' specifies whether the string is Unicode (non-zero) or ANSI (zero). That means we read 'nDebugStringLength' bytes from 'lpDebugStringData' if the string is ANSI; otherwise, we read (nDebugStringLength x 2) bytes. But remember, the address held in 'lpDebugStringData' does not belong to the debugger's address space. It is valid only in the debuggee's address space. Thus, we need to read the contents from the debuggee's process memory.
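The size arithmetic above, and the decoding of the bytes once they have been copied out of the debuggee, can be sketched in portable C++. Both function names are hypothetical, the sketch takes the character-count reading of nDebugStringLength described above at face value, and it assumes that ANSI text is plain ASCII/Latin-1 so that each byte widens directly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Number of bytes to copy from the debuggee: one byte per character for
// ANSI text, two bytes per character for Unicode (UTF-16) text.
std::size_t DebugStringByteCount(bool isUnicode, unsigned short lengthInChars)
{
    return static_cast<std::size_t>(lengthInChars) * (isUnicode ? 2u : 1u);
}

// Converts the raw bytes copied from the debuggee into UTF-16 text,
// stopping at the terminating null if one is present.
// Assumption: ANSI input is ASCII/Latin-1, so bytes widen one-to-one.
std::u16string DecodeDebugString(const std::vector<unsigned char>& raw,
                                 bool isUnicode)
{
    std::u16string text;
    const std::size_t step = isUnicode ? 2 : 1;
    for (std::size_t i = 0; i + step <= raw.size(); i += step) {
        char16_t ch = raw[i];
        if (isUnicode)  // UTF-16 code units are stored little-endian
            ch = static_cast<char16_t>(raw[i] | (raw[i + 1] << 8));
        if (ch == u'\0')
            break;
        text.push_back(ch);
    }
    return text;
}
```

Keeping the decoding separate from the memory read makes it easy to test without a live debuggee; a real debugger would more likely convert ANSI text with the system code page.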

To read data from another process's memory, we use the ReadProcessMemory function. It requires the calling process to have the appropriate access rights. Since the debugger itself created the process, we do have those rights. Here is the code to process this debugging event:

case OUTPUT_DEBUG_STRING_EVENT:
{
   CStringW strEventMessage;  // Force Unicode
   OUTPUT_DEBUG_STRING_INFO & DebugString = debug_event.u.DebugString;

   // Allocate for the wide case; an ANSI string uses only half of the buffer.
   // The extra element and the () zero-initialization keep the text terminated.
   WCHAR *msg = new WCHAR[DebugString.nDebugStringLength + 1]();

   // Bytes to copy: one per character for ANSI, sizeof(WCHAR) for Unicode.
   SIZE_T cbToRead = DebugString.nDebugStringLength *
                     (DebugString.fUnicode ? sizeof(WCHAR) : sizeof(char));

   ReadProcessMemory(pi.hProcess,       // HANDLE to Debuggee
         DebugString.lpDebugStringData, // Target process' valid pointer
         msg,                           // Copy to this address space
         cbToRead, NULL);

   if ( DebugString.fUnicode )
      strEventMessage = msg;
   else
      strEventMessage = (char*)msg; // char* to CStringW (Unicode) conversion.

   delete []msg;
   // Utilize strEventMessage
}



What if the debuggee terminates before the debugger copies the memory contents?

Well... In that case, I would like to remind you: while the debugger is processing a debugging event, all threads in the debuggee are suspended. The process cannot kill itself in any way at this moment. Nor can anything else terminate the process right then (Task Manager, Process Explorer, the Kill utility...). Attempting to kill the process from these utilities will, however, schedule its termination. Thus, the debugger would receive EXIT_PROCESS_DEBUG_EVENT as the next event!

B. Processing CREATE_PROCESS_DEBUG_EVENT

This event is raised when the process (debuggee) is being spawned. It is the first event the debugger receives. For this event, the relevant member of DEBUG_EVENT is CreateProcessInfo. Here is the structure definition of CREATE_PROCESS_DEBUG_INFO:

struct CREATE_PROCESS_DEBUG_INFO
{
    HANDLE hFile;   // The handle to the physical file (.EXE)
    HANDLE hProcess; // Handle to the process
    HANDLE hThread;  // Handle to the main/initial thread of process
    LPVOID lpBaseOfImage; // base address of the executable image
    DWORD dwDebugInfoFileOffset;
    DWORD nDebugInfoSize;
    LPVOID lpThreadLocalBase;
    LPTHREAD_START_ROUTINE lpStartAddress;
    LPVOID lpImageName;  // Pointer to first byte of image name (in Debuggee)
    WORD fUnicode; // If image name is Unicode.
};



Please note that hProcess and hThread may not have the same handle values that we received in pi (PROCESS_INFORMATION). The process ID and the thread ID would, however, be the same. Each handle that Windows gives you (even for the same resource) is distinct from the other handles and serves a different purpose. So, the debugger may choose to display either the handles or the IDs.

Both hFile and lpImageName can be used to get the file name of the process being debugged. Of course, we already know the name of the process, since we created the debuggee ourselves. But locating the module name of an EXE or DLL is still important, because we will need to find the name of the DLL anyway while processing the LOAD_DLL_DEBUG_EVENT message.

As you can read in MSDN, lpImageName never gives the file name directly: it is a pointer into the target process, so the name has to be fetched from there (i.e., via ReadProcessMemory). Furthermore, the target process may not contain a file name at that address at all. Also, the file name may not be fully qualified (as I found in my tests). Thus, we will not use this method. We'll retrieve the file name from the hFile member.

How to get the name of the file by HANDLE

Unfortunately, we need to use the method described in MSDN, which takes around 10 API calls to get the file name from the handle. I have slightly modified the function GetFileNameFromHandle. The code is not shown here for brevity; it is available in the source file attached to this article. Anyway, here is the basic code to process this event:

case CREATE_PROCESS_DEBUG_EVENT:
{
   CString strEventMessage =
     GetFileNameFromHandle(debug_event.u.CreateProcessInfo.hFile);

   // Use strEventMessage, and other members
   // of CreateProcessInfo to notify the user of this event.
}



You may have noticed that I did not cover a few members of this structure. I will probably cover them in the next part of this article.

C. Processing LOAD_DLL_DEBUG_EVENT

This event is similar to CREATE_PROCESS_DEBUG_EVENT, and as you might have guessed, it is raised when a DLL is loaded by the OS. This event is raised whenever a DLL is loaded, either implicitly or explicitly (when the debuggee calls LoadLibrary). This debugging event only occurs the first time the system attaches a DLL to the virtual address space of a process. For this event processing, we use the 'LoadDll' member of the union. It is of type LOAD_DLL_DEBUG_INFO:

struct LOAD_DLL_DEBUG_INFO
{
   HANDLE hFile;         // Handle to the DLL physical file.
   LPVOID lpBaseOfDll;   // The DLL Actual load address in process.
   DWORD dwDebugInfoFileOffset;
   DWORD nDebugInfoSize;
   LPVOID lpImageName;   // These two members are the same as in CREATE_PROCESS_DEBUG_INFO
   WORD fUnicode;
};



To retrieve the file name, we use the same function, GetFileNameFromHandle, that we used for CREATE_PROCESS_DEBUG_EVENT. I will list the code for processing this event when I describe UNLOAD_DLL_DEBUG_EVENT, since UNLOAD_DLL_DEBUG_EVENT carries no direct information from which to find the name of the DLL file.

D. Processing CREATE_THREAD_DEBUG_EVENT

This debug event is generated whenever a new thread is created in the debuggee. Like CREATE_PROCESS_DEBUG_EVENT, this event is raised before the thread actually gets to run. To get information about this event, we use the 'CreateThread' union member. This variable is of type CREATE_THREAD_DEBUG_INFO:

struct CREATE_THREAD_DEBUG_INFO
{
  // Handle to the newly created thread in debuggee
  HANDLE hThread;
  LPVOID lpThreadLocalBase;
  // pointer to the starting address of the thread
  LPTHREAD_START_ROUTINE lpStartAddress;
};



The thread ID of the newly created thread is available in DEBUG_EVENT::dwThreadId. Using this member to notify the user is straightforward:

case CREATE_THREAD_DEBUG_EVENT:
{
   CString strEventMessage;

   // Note: %x assumes a 32-bit build; in a 64-bit build, use %p for the
   // handle and the start address.
   strEventMessage.Format(L"Thread 0x%x (Id: %d) created at: 0x%x",
            debug_event.u.CreateThread.hThread,
            debug_event.dwThreadId,
            debug_event.u.CreateThread.lpStartAddress);
            // Thread 0xc (Id: 7920) created at: 0x77b15e58
}



The 'lpStartAddress' is an address in the debuggee, not in the debugger; we are just displaying it for completeness. Remember that this event is not received for the primary/initial thread of the process. It is received only for subsequent thread creations in the debuggee.

E. Processing EXIT_THREAD_DEBUG_EVENT

This event is raised as soon as the thread returns, and the return code is available to the system. The 'dwThreadId' member of DEBUG_EVENT specifies which thread exited. To get the thread handle and other information that we received in CREATE_THREAD_DEBUG_EVENT, we need to store the information in some map. This event has a relevant member named 'ExitThread', which is of type EXIT_THREAD_DEBUG_INFO:

struct EXIT_THREAD_DEBUG_INFO
{
   DWORD dwExitCode; // The thread exit code of DEBUG_EVENT::dwThreadId
};



Here is the event handler code:


case EXIT_THREAD_DEBUG_EVENT:
{
   CString strEventMessage;

   strEventMessage.Format( _T("The thread %d exited with code: %d"),
      debug_event.dwThreadId,
      debug_event.u.ExitThread.dwExitCode);    // The thread 2760 exited with code: 0
}



F. Processing UNLOAD_DLL_DEBUG_EVENT

Of course, this event occurs when a DLL is unloaded from the debuggee's memory. But wait! It is generated only in response to FreeLibrary calls, and not when the system unloads DLLs. The debuggee may call LoadLibrary multiple times for the same DLL, and then only the last matching call to FreeLibrary raises this event. This means that implicitly loaded DLLs do not generate this event when they are unloaded as the process exits. (You can verify this assertion in your favorite debugger!)

For this event, you use the 'UnloadDll' member of the union, which is of type UNLOAD_DLL_DEBUG_INFO:

struct UNLOAD_DLL_DEBUG_INFO
{
    LPVOID lpBaseOfDll;
};



As you can see, only the base address of the DLL (a simple pointer) is available to us for processing this event. This is why I delayed giving the code for LOAD_DLL_DEBUG_EVENT. In the DLL loading event, we also get 'lpBaseOfDll'. We can use a map (or another data structure you like) to store the name of the DLL against its base address. The same base address arrives while processing UNLOAD_DLL_DEBUG_EVENT.

It should be noted that not every DLL-load event is followed by a DLL-unload event; still, we have to store all DLL names in the map, since LOAD_DLL_DEBUG_EVENT gives us no information on how the DLL was loaded.

Here is the code to process these two events:

std::map < LPVOID, CString > DllNameMap;
...
case LOAD_DLL_DEBUG_EVENT:
{
   strEventMessage = GetFileNameFromHandle(debug_event.u.LoadDll.hFile);

   // Storing the DLL name into map. Map's key is the Base-address
   DllNameMap.insert(
      std::make_pair( debug_event.u.LoadDll.lpBaseOfDll, strEventMessage) );

   strEventMessage.AppendFormat(L" - Loaded at %x", debug_event.u.LoadDll.lpBaseOfDll);
}
break;
...
case UNLOAD_DLL_DEBUG_EVENT:
{
   strEventMessage.Format(L"DLL '%s' unloaded.",
      DllNameMap[debug_event.u.UnloadDll.lpBaseOfDll] ); // Get DLL name from map.
}
break;
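The map-based bookkeeping above can also be sketched without any Windows types. ModuleTable and its member names are hypothetical; this is just one way to arrange it, using find() so that an unknown base address does not silently insert an empty entry (as operator[] would), and erasing the entry on unload.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical helper: remembers which module was loaded at which address.
class ModuleTable {
public:
    // Call while handling LOAD_DLL_DEBUG_EVENT.
    void OnLoad(std::uintptr_t base, const std::string& name)
    {
        names_[base] = name;
    }

    // Call while handling UNLOAD_DLL_DEBUG_EVENT. Returns the stored name
    // and forgets the entry, since another DLL may reuse the same address.
    std::string OnUnload(std::uintptr_t base)
    {
        auto it = names_.find(base);
        if (it == names_.end())
            return "<unknown module>";
        std::string name = it->second;
        names_.erase(it);
        return name;
    }

    std::size_t Count() const { return names_.size(); }

private:
    std::map<std::uintptr_t, std::string> names_;  // base address -> DLL name
};
```

A similar table keyed by thread ID would serve for pairing thread-creation and thread-exit events.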


G. Processing EXIT_PROCESS_DEBUG_EVENT

This is one of the simplest debugging events and, as you can guess, it arrives when the process exits. It arrives irrespective of how the process exits: normally, terminated externally (Task Manager etc.), or crashed because of a fault in the application (debuggee).

We use the 'ExitProcess' member, which is of type EXIT_PROCESS_DEBUG_INFO:

struct EXIT_PROCESS_DEBUG_INFO
{
    DWORD dwExitCode;
};



As soon as this event occurs, we also end the debugger-loop and terminate the debugging thread. For this, we can use a variable to control the loop (the 'for' loop shown earlier), and set its value to indicate loop termination. Please download the attached files to see the entire code.

bool bContinueDebugging=true;
...
case EXIT_PROCESS_DEBUG_EVENT:
{
   strEventMessage.Format(L"Process exited with code:  0x%x",
                          debug_event.u.ExitProcess.dwExitCode);
   bContinueDebugging=false;
}
break;


H. Processing EXCEPTION_DEBUG_EVENT

This is the biggest of all the debugging events! From MSDN:

This is generated whenever an exception occurs in the process being debugged. Possible exceptions include attempting to access inaccessible memory, executing breakpoint instructions, attempting to divide by zero, or any other exception noted in Structured Exception Handling. The DEBUG_EVENT structure contains an EXCEPTION_DEBUG_INFO structure. This structure describes the exception that caused the debugging event.

This debugging event needs a separate article to cover it fully (or even partially!). Thus, I will discuss only one type of exception event, along with an introduction to the event itself.

The member variable 'Exception' holds the information regarding the exception just occurred. It is of type EXCEPTION_DEBUG_INFO:

struct EXCEPTION_DEBUG_INFO
{
    EXCEPTION_RECORD ExceptionRecord;
    DWORD dwFirstChance;
};



The 'ExceptionRecord' member of this structure contains detailed information regarding the exception. It is of type EXCEPTION_RECORD:

struct EXCEPTION_RECORD
{
    DWORD     ExceptionCode;
    DWORD     ExceptionFlags;
    struct _EXCEPTION_RECORD *ExceptionRecord;
    PVOID     ExceptionAddress;
    DWORD     NumberParameters;
    ULONG_PTR ExceptionInformation[EXCEPTION_MAXIMUM_PARAMETERS];  // 15
};



The detailed information is put into this sub-structure because exceptions may be nested, in which case the records are chained to each other as a linked list. Nested exceptions are beyond the scope of this article.

Before we delve into EXCEPTION_RECORD, it is important to discuss EXCEPTION_DEBUG_INFO::dwFirstChance.

Do exceptions get chances?

Not exactly! When a process is being debugged, the debugger always receives the exception before the debuggee gets it. You must have seen "First-chance exception at 0x00412882 in SomeModule:..." while debugging your Visual C++ module. This is referred to as a first-chance exception. The same exception may or may not be followed by a second-chance exception.

If the debugger does not handle the first-chance exception, it is passed to the debuggee, which may handle it with its own exception handlers. If the debuggee does not handle it either, the debugger is notified once more, this time with dwFirstChance set to zero; this is termed a second-chance exception. If that is left unhandled too, the debuggee simply crashes. Note that these exceptions are not C++ language exceptions as such, but part of Windows' SEH (structured exception handling) mechanism. I will cover more about it in the next part of this article.

The debugger gets the exception first (first-chance exception) so that it can handle it before it is given to the debuggee. The breakpoint exception is one kind of exception that is relevant to the debugger, not the debuggee. Some libraries also raise first-chance exceptions to aid the debugger and the debugging process.

A word for ContinueDebugEvent

The third parameter (dwContinueStatus) of this function is relevant only after an exception event has been received. For the non-exception events that we discussed, the system ignores the value passed in this parameter.

After processing an exception event, ContinueDebugEvent should be called with:

  • DBG_CONTINUE if the exception event was successfully handled by the debugger. No action is required from the debuggee, and it can run normally.
  • DBG_EXCEPTION_NOT_HANDLED if the event was not handled/resolved by the debugger. The debugger might just record the event, notify the debugger-user, or do something else.

Please note that returning DBG_CONTINUE for the improper debugging event would raise the same event in the debugger, and the same event would arrive indefinitely. Since we are in the early stage of writing debuggers, we should play safe, and return EXCEPTION_NOT_HANDLED (give up flag!). The exclusion, for this article, is the Break-point event, which I am discussing next.

请注意,对于调试器并未真正解决的异常返回DBG_CONTINUE,通常会使被调试者从出错的指令处继续执行,因此同样的事件将会无限地到达。因为我们处在编写调试器的早期,所以应该稳妥行事,返回DBG_EXCEPTION_NOT_HANDLED(放弃标识!)。本文中的例外是断点事件,我接下来就讨论它。
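The rule above can be sketched as a tiny decision function. ChooseContinueStatus and the k-prefixed constants are hypothetical names introduced for illustration; the numeric values repeat the definitions of DBG_CONTINUE, DBG_EXCEPTION_NOT_HANDLED, and EXCEPTION_BREAKPOINT from the Windows SDK headers so that the sketch compiles without windows.h.

```cpp
#include <cstdint>

// Values as defined in the Windows SDK headers, repeated here for portability.
constexpr std::uint32_t kDbgContinue            = 0x00010002u; // DBG_CONTINUE
constexpr std::uint32_t kDbgExceptionNotHandled = 0x80010001u; // DBG_EXCEPTION_NOT_HANDLED
constexpr std::uint32_t kExceptionBreakpoint    = 0x80000003u; // EXCEPTION_BREAKPOINT

// Hypothetical helper: picks the dwContinueStatus for an exception event,
// following the "play safe" policy of this article.
std::uint32_t ChooseContinueStatus(std::uint32_t exceptionCode) {
    // The breakpoint is the one exception this simple debugger claims to handle.
    if (exceptionCode == kExceptionBreakpoint) {
        return kDbgContinue;
    }
    // Everything else is handed back to the debuggee (or the OS).
    return kDbgExceptionNotHandled;
}
```

The result would be passed as the third argument of ContinueDebugEvent. A fuller debugger would extend the first branch as it learns to resolve more exception types.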

Exception codes
异常代码

The EXCEPTION_RECORD::ExceptionCode variable holds the arrived exception code, and can have one of these codes (ignore nested exceptions!):

  1. EXCEPTION_ACCESS_VIOLATION
  2. EXCEPTION_ARRAY_BOUNDS_EXCEEDED
  3. EXCEPTION_BREAKPOINT
  4. EXCEPTION_DATATYPE_MISALIGNMENT
  5. EXCEPTION_FLT_DENORMAL_OPERAND
  6. EXCEPTION_FLT_DIVIDE_BY_ZERO
  7. EXCEPTION_FLT_INEXACT_RESULT
  8. EXCEPTION_FLT_INVALID_OPERATION
  9. EXCEPTION_FLT_OVERFLOW
  10. EXCEPTION_FLT_STACK_CHECK
  11. EXCEPTION_FLT_UNDERFLOW
  12. EXCEPTION_ILLEGAL_INSTRUCTION
  13. EXCEPTION_IN_PAGE_ERROR
  14. EXCEPTION_INT_DIVIDE_BY_ZERO
  15. EXCEPTION_INT_OVERFLOW
  16. EXCEPTION_INVALID_DISPOSITION
  17. EXCEPTION_NONCONTINUABLE_EXCEPTION
  18. EXCEPTION_PRIV_INSTRUCTION
  19. EXCEPTION_SINGLE_STEP
  20. EXCEPTION_STACK_OVERFLOW

EXCEPTION_RECORD::ExceptionCode变量保存了到达的异常的代码,并为下列值之一(忽略嵌套异常!):

  1. EXCEPTION_ACCESS_VIOLATION
  2. EXCEPTION_ARRAY_BOUNDS_EXCEEDED
  3. EXCEPTION_BREAKPOINT
  4. EXCEPTION_DATATYPE_MISALIGNMENT
  5. EXCEPTION_FLT_DENORMAL_OPERAND
  6. EXCEPTION_FLT_DIVIDE_BY_ZERO
  7. EXCEPTION_FLT_INEXACT_RESULT
  8. EXCEPTION_FLT_INVALID_OPERATION
  9. EXCEPTION_FLT_OVERFLOW
  10. EXCEPTION_FLT_STACK_CHECK
  11. EXCEPTION_FLT_UNDERFLOW
  12. EXCEPTION_ILLEGAL_INSTRUCTION
  13. EXCEPTION_IN_PAGE_ERROR
  14. EXCEPTION_INT_DIVIDE_BY_ZERO
  15. EXCEPTION_INT_OVERFLOW
  16. EXCEPTION_INVALID_DISPOSITION
  17. EXCEPTION_NONCONTINUABLE_EXCEPTION
  18. EXCEPTION_PRIV_INSTRUCTION
  19. EXCEPTION_SINGLE_STEP
  20. EXCEPTION_STACK_OVERFLOW
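When showing these events to the user, a readable name is often more helpful than a raw number. The following is a minimal sketch, not the article's code: ExceptionCodeName is a hypothetical helper, it covers only a handful of the codes listed above, and the numeric values repeat the Windows SDK definitions so the example compiles without windows.h.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: maps a few well-known exception codes to their names.
std::string ExceptionCodeName(std::uint32_t code) {
    switch (code) {
        case 0x80000003u: return "EXCEPTION_BREAKPOINT";
        case 0x80000004u: return "EXCEPTION_SINGLE_STEP";
        case 0xC0000005u: return "EXCEPTION_ACCESS_VIOLATION";
        case 0xC000001Du: return "EXCEPTION_ILLEGAL_INSTRUCTION";
        case 0xC0000094u: return "EXCEPTION_INT_DIVIDE_BY_ZERO";
        case 0xC00000FDu: return "EXCEPTION_STACK_OVERFLOW";
        default:          return "UNKNOWN_EXCEPTION";  // not in this short table
    }
}
```

A code missing from the table is not necessarily invalid; applications and runtimes can raise their own codes, so the fallback only says that this sketch does not know the name.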

Relax! I am not discussing all of them, only one: EXCEPTION_BREAKPOINT. Okay, here is the code:

case EXCEPTION_DEBUG_EVENT:
{
   EXCEPTION_DEBUG_INFO& exception = debug_event.u.Exception;

   switch (exception.ExceptionRecord.ExceptionCode)
   {
      case STATUS_BREAKPOINT:  // Same value as EXCEPTION_BREAKPOINT
         // dwContinueStatus is left as it was set before this switch.
         strEventMessage = L"Break point";
         break;

      default:
         if (exception.dwFirstChance != 0)
         {
            // %p prints the pointer-sized address correctly on 32- and 64-bit builds.
            strEventMessage.Format(L"First chance exception at %p, exception-code: 0x%08x",
                        exception.ExceptionRecord.ExceptionAddress,
                        exception.ExceptionRecord.ExceptionCode);
         }
         // else
         // { Let the OS handle }

         // There are cases where the OS ignores dwContinueStatus
         // and resumes the process in its own way.
         // Here we say that we have NOT handled this event,
         // for first-chance exceptions as well.

         // Changing this to DBG_CONTINUE (for first-chance exceptions also)
         // may cause the same debugging event to occur continuously.
         // In short, this debugger does not handle debug exception events
         // efficiently, and let's keep it simple for a while!

         dwContinueStatus = DBG_EXCEPTION_NOT_HANDLED;
         break;
   }

   break;
}


放松点!我不会全部讨论它们,只讨论一个:EXCEPTION_BREAKPOINT。好了,这是代码:

case EXCEPTION_DEBUG_EVENT:
{
   EXCEPTION_DEBUG_INFO& exception = debug_event.u.Exception;

   switch (exception.ExceptionRecord.ExceptionCode)
   {
      case STATUS_BREAKPOINT:  // 和EXCEPTION_BREAKPOINT的值相同
         // dwContinueStatus保持进入switch之前设置的值.
         strEventMessage = L"Break point";
         break;

      default:
         if (exception.dwFirstChance != 0)
         {
            // %p可以在32位和64位程序中正确打印指针大小的地址.
            strEventMessage.Format(L"First chance exception at %p, exception-code: 0x%08x",
                        exception.ExceptionRecord.ExceptionAddress,
                        exception.ExceptionRecord.ExceptionCode);
         }
         // 否则
         // { 让操作系统处理 }

         // 有些情况下操作系统会忽略dwContinueStatus,
         // 并以它自己的方式继续执行进程.
         // 这里我们表明还没有处理这个事件,
         // 对于首次机会异常也是如此.

         // 将这个改为DBG_CONTINUE (对于首次机会异常也是),
         // 可能引起同样的调试事件连续出现.
         // 总之,这个调试器没有高效地处理调试异常事件,
         // 让我们暂时保持简单!

         dwContinueStatus = DBG_EXCEPTION_NOT_HANDLED;
         break;
   }

   break;
}


You are probably aware of what a breakpoint is. Outside of the breakpoints a debugger sets itself, a breakpoint can be raised with the DebugBreak API, the {int 3} assembly instruction, or System.Diagnostics.Debugger.Break in the .NET Framework. The debugger receives the debug-exception code STATUS_BREAKPOINT (same as EXCEPTION_BREAKPOINT) when any of these occurs in the running process. Debuggers generally use this event to break into the running process, and may display the source code where the event occurred. But in our basic debugger, we just display this event to the user. Neither the source code nor the instruction location is shown. We'll cover displaying the source code in the next part of this article.

你可能已经知道什么是断点。除了调试器自己设置的断点之外,断点还可以由DebugBreak API、{int 3}汇编指令或.NET框架的System.Diagnostics.Debugger.Break引发。在正在运行的进程中,当这些情况中的任何一个发生时,调试器会收到调试异常代码STATUS_BREAKPOINT(和EXCEPTION_BREAKPOINT相同)。调试器通常利用这个事件中断正在运行的进程,并可能显示事件发生处的源代码。但是在我们的基本的调试器中,我们仅将这个事件展示给用户。不展示任何源代码或指令位置。我们将会在这篇文章的下一部分介绍显示源代码。

Raising a breakpoint from a process that is not being debugged would crash the application, or may display the JIT debugging dialog box. This is the reason I used:

if ( !IsDebuggerPresent() )
   AfxMessageBox(L"No debugger is attached currently.");
else
   DebugBreak();


在一个未被调试的进程中触发断点会使程序崩溃,或者可能显示JIT调试对话框。这就是我使用以下代码的原因:

if ( !IsDebuggerPresent() )
   AfxMessageBox(L"当前没有关联调试器.");
else
   DebugBreak();


As a final note on this simplest debug-exception event: a breakpoint exception (delivered as EXCEPTION_DEBUG_EVENT) is raised the first time by the system itself when the debuggee starts, and it always arrives. Debuggers like Visual Studio ignore this very first breakpoint exception, but debuggers like WinDbg always show you this event too.

关于这个最简单的调试异常事件,最后再说明一点:在被调试者启动时,系统本身会首先引发一次断点异常(以EXCEPTION_DEBUG_EVENT的形式送达),并且它总会到达。像Visual Studio这样的调试器会忽略这第一个断点异常,但是像WinDbg这样的调试器则总会向你显示这个事件。
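A debugger that wants to hide the system-raised breakpoint, as described above, needs only a little state. The following sketch is one possible way to do it and is not taken from Visual Studio or WinDbg; InitialBreakpointFilter is a hypothetical class, and 0x80000003 repeats the SDK value of EXCEPTION_BREAKPOINT so the example compiles without windows.h.

```cpp
#include <cstdint>

// Hypothetical helper: swallows only the first breakpoint exception it sees.
// One instance would be kept per debuggee process.
class InitialBreakpointFilter {
public:
    // Returns true when the exception should be shown to the user.
    bool ShouldReport(std::uint32_t exceptionCode) {
        constexpr std::uint32_t kBreakpoint = 0x80000003u; // EXCEPTION_BREAKPOINT
        if (exceptionCode == kBreakpoint && !seenInitialBreakpoint_) {
            seenInitialBreakpoint_ = true;  // remember that the first one is consumed
            return false;                   // hide the system-raised breakpoint
        }
        return true;  // later breakpoints and all other exceptions are reported
    }

private:
    bool seenInitialBreakpoint_ = false;
};
```

The sketch assumes the first breakpoint observed is the system-raised one, which matches the statement above that this event always arrives first among breakpoints.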

Winding up...
结束语……

Use any process to debug, or use the attached debuggee named DebugMe.

你可以调试任意进程,或者使用随附的名为DebugMe的被调试程序。

The binaries (EXEs) attached here are compiled with Visual Studio 2005 Service Pack 1. You may not have the VC++ runtime libraries for the same version. You can download them from Microsoft.com or rebuild the projects from your IDE.

附加的二进制文件(EXE)使用Visual Studio 2005 Service Pack 1编译。你可能没有相同版本的VC++运行时链接库。你可以从Microsoft.com下载或从你的IDE中重建工程。

Follow up:
追加:
License
许可

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

这篇文章和相关的源代码和文件,由The Code Project Open License (CPOL)授权。

About the Author
关于作者

Ajay Vijayvargiya