Downloading a resource over HTTP (WinINet implementation)

The workflow for downloading a resource over HTTP with WinINet

[Figure: downloading a resource over HTTP (WinINet implementation)]

Related functions

InternetCrackUrl: parses a URL into its components

BOOL InternetCrackUrl(
_In_ LPCTSTR lpszUrl, // (1)
_In_ DWORD dwUrlLength, // (2)
_In_ DWORD dwFlags, // (3)
_Inout_ LPURL_COMPONENTS lpUrlComponents // (4)
);

(1) Pointer to a string that contains the canonical URL to be cracked.
(2) Size of the lpszUrl string, in TCHARs, or zero if lpszUrl is an ASCIIZ string.
(3) Controls the operation: ICU_DECODE (converts encoded characters back to their normal form) or ICU_ESCAPE (converts all escape sequences (%xx) to their corresponding characters).
(4) Pointer to a URL_COMPONENTS structure that receives the URL components.
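As a hedged illustration of the parameters above: besides copying components into caller-supplied buffers, the URL_COMPONENTS contract also allows leaving a component pointer NULL while setting its length member to a nonzero value, in which case InternetCrackUrl stores a pointer into the original URL string. The sketch below relies on that mode. CrackedUrl, Piece, and CrackHttpUrl are hypothetical names introduced only for this example, and the WinINet part is compiled only on Windows.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch (not part of WinINet).
struct CrackedUrl {
    std::wstring host;
    std::wstring path;       // for example L"/a/b"
    std::wstring extraInfo;  // for example L"?x=1"; may be empty
    unsigned short port = 0;
    bool isHttp = false;
};

// Copies `length` characters starting at `start`. The pieces handed back by
// InternetCrackUrl in pointer mode are not null-terminated, so the explicit
// length is required. A null pointer or zero length yields an empty string.
std::wstring Piece(const wchar_t* start, std::size_t length)
{
    return (start != nullptr && length != 0) ? std::wstring(start, length)
                                             : std::wstring();
}

#ifdef _WIN32
#include <windows.h>
#include <wininet.h>
#pragma comment(lib, "wininet.lib")

// Cracks `url` and copies the host, path, extra info, port, and scheme check
// into `out`. Returns false if InternetCrackUrlW fails.
bool CrackHttpUrl(const std::wstring& url, CrackedUrl& out)
{
    URL_COMPONENTSW uc = {};
    uc.dwStructSize = sizeof(uc);
    // NULL pointer + nonzero length: request a pointer into `url` itself.
    uc.dwHostNameLength = 1;
    uc.dwUrlPathLength = 1;
    uc.dwExtraInfoLength = 1;

    // The URL length is given in characters. dwFlags stays 0 because
    // ICU_DECODE and ICU_ESCAPE only work with caller-supplied buffers.
    if (!InternetCrackUrlW(url.c_str(), static_cast<DWORD>(url.size()), 0, &uc))
        return false;

    out.host = Piece(uc.lpszHostName, uc.dwHostNameLength);
    out.path = Piece(uc.lpszUrlPath, uc.dwUrlPathLength);
    out.extraInfo = Piece(uc.lpszExtraInfo, uc.dwExtraInfoLength);
    out.port = uc.nPort;
    out.isHttp = (uc.nScheme == INTERNET_SCHEME_HTTP);
    return true;
}
#endif
```

Because the returned pointers refer to the caller's URL string, the helper copies each piece into its own std::wstring before returning, so the result stays valid after the URL string is gone.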

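InternetOpen: initializes an application's use of WinINet

InternetOpen is the first WinINet function an application calls. It tells the Internet DLL to initialize its internal data structures and prepare for future calls from the application. When the application has finished using the Internet functions, it should call InternetCloseHandle to free the handle and any associated resources.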

HINTERNET InternetOpen(
_In_ LPCTSTR lpszAgent, // (1)
_In_ DWORD dwAccessType, // (2)
_In_ LPCTSTR lpszProxyName, // (3)
_In_ LPCTSTR lpszProxyBypass, // (4)
_In_ DWORD dwFlags // (5)
);

(1) Pointer to a null-terminated string that specifies the name of the application or entity calling the WinINet functions. This name is used as the user agent in the HTTP protocol.
(2) Type of access required:

  • INTERNET_OPEN_TYPE_DIRECT: Resolves all host names locally;
  • INTERNET_OPEN_TYPE_PRECONFIG: Retrieves the proxy or direct configuration from the registry;
  • INTERNET_OPEN_TYPE_PRECONFIG_WITH_NO_AUTOPROXY: Retrieves the proxy or direct configuration from the registry and prevents the use of a startup Microsoft JScript or Internet Setup (INS) file;
  • INTERNET_OPEN_TYPE_PROXY: Passes requests to the proxy unless a proxy bypass list is supplied and the name to be resolved bypasses the proxy. In this case, the function uses INTERNET_OPEN_TYPE_DIRECT.

(3) Pointer to a null-terminated string that specifies the name of the proxy server(s) to use when proxy access is specified by setting dwAccessType to INTERNET_OPEN_TYPE_PROXY.
(4) Pointer to a null-terminated string that specifies an optional list of host names or IP addresses, or both, that should not be routed through the proxy when dwAccessType is set to INTERNET_OPEN_TYPE_PROXY.
(5) Options:

  • INTERNET_FLAG_ASYNC: Makes only asynchronous requests on handles descended from the handle returned from this function;
  • INTERNET_FLAG_FROM_CACHE: Does not make network requests. All entities are returned from the cache;
  • INTERNET_FLAG_OFFLINE: Identical to INTERNET_FLAG_FROM_CACHE.
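Since every handle returned by InternetOpen must eventually be released with InternetCloseHandle, one way to make that hard to forget is to tie the handle to an object's lifetime. The following is a minimal sketch, assuming std::unique_ptr with a custom deleter suits your code base. InternetHandleCloser, UniqueInternetHandle, and OpenInternet are hypothetical names, and the block compiles to nothing outside Windows.

```cpp
#include <memory>

#ifdef _WIN32
#include <windows.h>
#include <wininet.h>
#pragma comment(lib, "wininet.lib")

// Deleter that releases a WinINet handle with InternetCloseHandle.
struct InternetHandleCloser {
    void operator()(HINTERNET handle) const noexcept
    {
        if (handle != nullptr)
            InternetCloseHandle(handle);
    }
};

// HINTERNET is a typedef for LPVOID, so the managed element type is void.
using UniqueInternetHandle = std::unique_ptr<void, InternetHandleCloser>;

// Opens WinINet with the proxy/direct configuration taken from the registry.
// The returned pointer is null if InternetOpenW fails; call GetLastError()
// right after the call in that case.
UniqueInternetHandle OpenInternet(const wchar_t* userAgent)
{
    return UniqueInternetHandle(
        InternetOpenW(userAgent,
                      INTERNET_OPEN_TYPE_PRECONFIG,
                      nullptr,   // no explicit proxy name
                      nullptr,   // no proxy bypass list
                      0));       // synchronous, no special options
}
#endif
```

With this wrapper, `UniqueInternetHandle session = OpenInternet(L"example-agent/1.0");` closes the handle automatically when session goes out of scope, and `session.get()` yields the raw HINTERNET for later WinINet calls. The user agent string here is a placeholder.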

InternetConnect: opens a File Transfer Protocol (FTP) or HTTP session for a given site

HINTERNET InternetConnect(
_In_ HINTERNET hInternet, // (1)
_In_ LPCTSTR lpszServerName, // (2)
_In_ INTERNET_PORT nServerPort, // (3)
_In_ LPCTSTR lpszUsername, // (4)
_In_ LPCTSTR lpszPassword, // (5)
_In_ DWORD dwService, // (6)
_In_ DWORD dwFlags, // (7)
_In_ DWORD_PTR dwContext // (8)
);

(1) Handle returned by a previous call to InternetOpen.
(2) Pointer to a null-terminated string that specifies the host name of an Internet server. Alternately, the string can contain the IP number of the site, in ASCII dotted-decimal format (for example, 11.0.1.45).
(3) Transmission Control Protocol/Internet Protocol (TCP/IP) port on the server.
(4) Pointer to a null-terminated string that specifies the name of the user to log on. If this parameter is NULL, the function uses an appropriate default.
(5) Pointer to a null-terminated string that contains the password to use to log on. If both lpszPassword and lpszUsername are NULL, the function uses the default "anonymous" password.
(6) Type of service to access:

  • INTERNET_SERVICE_FTP: FTP service;
  • INTERNET_SERVICE_GOPHER: Gopher service;
  • INTERNET_SERVICE_HTTP: HTTP service.

(7) Options specific to the service used.
(8) Pointer to a variable that contains an application-defined value that is used to identify the application context for the returned handle in callbacks.

HttpOpenRequest: creates an HTTP request handle

If a request verb other than "GET" or "POST" is specified, HttpOpenRequest automatically sets INTERNET_FLAG_NO_CACHE_WRITE and INTERNET_FLAG_RELOAD for the request.

HINTERNET HttpOpenRequest(
_In_ HINTERNET hConnect, // (1)
_In_ LPCTSTR lpszVerb, // (2)
_In_ LPCTSTR lpszObjectName, // (3)
_In_ LPCTSTR lpszVersion, // (4)
_In_ LPCTSTR lpszReferer, // (5)
_In_ LPCTSTR *lplpszAcceptTypes, // (6)
_In_ DWORD dwFlags, // (7)
_In_ DWORD_PTR dwContext // (8)
);

(1) A handle to an HTTP session returned by InternetConnect.
(2) A pointer to a null-terminated string that contains the HTTP verb to use in the request. If this parameter is NULL, the function uses GET as the HTTP verb.
(3) A pointer to a null-terminated string that contains the name of the target object of the specified HTTP verb. This is generally a file name, an executable module, or a search specifier (that is, the URI of the requested resource).
(4) A pointer to a null-terminated string that contains the HTTP version to use in the request. If this parameter is NULL, the function uses an HTTP version of 1.1 or 1.0, depending on the value of the Internet Explorer settings. (It is usually set to "HTTP/1.0" or "HTTP/1.1".)
(5) A pointer to a null-terminated string that specifies the URL of the document from which the URL in the request (lpszObjectName) was obtained. If this parameter is NULL, no referrer is specified.
(6) A pointer to a null-terminated array of strings that indicates media types accepted by the client. Here is an example.

PCTSTR rgpszAcceptTypes[] = {_T("text/*"), NULL};

(7) Internet options: INTERNET_FLAG_RELOAD (forces a download of the requested file, object, or directory listing from the origin server, not from the cache), INTERNET_FLAG_NO_CACHE_WRITE (does not add the returned entity to the cache), among others.
(8) A pointer to a variable that contains the application-defined value that associates this operation with any application data.
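The rgpszAcceptTypes example above uses a fixed array. When the media types are only known at run time, the same NULL-terminated layout can be built from a container. This is a sketch under that assumption: MakeAcceptTypes and OpenGetRequest are hypothetical helper names, hConnect is assumed to come from a successful InternetConnect call, and the WinINet part is compiled only on Windows.

```cpp
#include <string>
#include <vector>

// Builds the NULL-terminated pointer array expected by lplpszAcceptTypes.
// The pointers refer to the strings in `types`, which must outlive the array.
std::vector<const wchar_t*> MakeAcceptTypes(const std::vector<std::wstring>& types)
{
    std::vector<const wchar_t*> out;
    out.reserve(types.size() + 1);
    for (const std::wstring& type : types)
        out.push_back(type.c_str());
    out.push_back(nullptr);  // terminator that marks the end of the array
    return out;
}

#ifdef _WIN32
#include <windows.h>
#include <wininet.h>
#pragma comment(lib, "wininet.lib")

// Opens a GET request that bypasses the cache in both directions.
// Returns NULL on failure, exactly as HttpOpenRequestW does.
HINTERNET OpenGetRequest(HINTERNET hConnect,
                         const std::wstring& objectName,
                         const std::vector<std::wstring>& acceptTypes)
{
    std::vector<const wchar_t*> accept = MakeAcceptTypes(acceptTypes);
    return HttpOpenRequestW(hConnect,
                            L"GET",
                            objectName.c_str(),
                            nullptr,        // default HTTP version
                            nullptr,        // no referrer
                            accept.data(),  // NULL-terminated LPCWSTR array
                            INTERNET_FLAG_RELOAD | INTERNET_FLAG_NO_CACHE_WRITE,
                            0);             // no application context value
}
#endif
```

Keeping the array construction in its own function makes the terminator hard to forget, which is the usual mistake with this parameter.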

HttpAddRequestHeaders: adds header fields to an HTTP request handle

BOOL HttpAddRequestHeaders(
_In_ HINTERNET hRequest, // (1)
_In_ LPCTSTR lpszHeaders, // (2)
_In_ DWORD dwHeadersLength, // (3)
_In_ DWORD dwModifiers // (4)
);

(1) A handle returned by a call to the HttpOpenRequest function.
(2) A pointer to a string variable containing the headers to append to the request. Each header must be terminated by a CR/LF (carriage return/line feed) pair.
(3) The size of lpszHeaders, in TCHARs. If this parameter is -1L, the function assumes that lpszHeaders is zero-terminated (ASCIIZ), and the length is computed.
(4) A set of modifiers that control the semantics of this function:

  • HTTP_ADDREQ_FLAG_ADD: Adds the header if it does not exist. Used with HTTP_ADDREQ_FLAG_REPLACE;
  • HTTP_ADDREQ_FLAG_ADD_IF_NEW: Adds the header only if it does not already exist; otherwise, an error is returned;
  • HTTP_ADDREQ_FLAG_COALESCE: Coalesces headers of the same name;
  • HTTP_ADDREQ_FLAG_COALESCE_WITH_COMMA: Coalesces headers of the same name with a comma. For example, adding "Accept: text/*" followed by "Accept: audio/*" with this flag results in the formation of the single header "Accept: text/*, audio/*";
  • HTTP_ADDREQ_FLAG_COALESCE_WITH_SEMICOLON: Coalesces headers of the same name using a semicolon;
  • HTTP_ADDREQ_FLAG_REPLACE: Replaces or removes a header. If the header value is empty and the header is found, it is removed. If not empty, the header value is replaced.
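Two details in the notes above are easy to get wrong: every header line needs its own CR/LF terminator, and the length is counted in characters rather than bytes. The sketch below isolates both. JoinHeaders and AddOrReplaceHeaders are hypothetical helper names, hRequest is assumed to come from HttpOpenRequest, and the WinINet part is compiled only on Windows.

```cpp
#include <string>
#include <vector>

// Joins header lines into one block, terminating each line with CR/LF as
// HttpAddRequestHeaders requires. Input lines must not contain CR/LF.
std::wstring JoinHeaders(const std::vector<std::wstring>& headers)
{
    std::wstring block;
    for (const std::wstring& header : headers) {
        block += header;
        block += L"\r\n";
    }
    return block;
}

#ifdef _WIN32
#include <windows.h>
#include <wininet.h>
#pragma comment(lib, "wininet.lib")

// Adds each header, replacing any existing header of the same name.
bool AddOrReplaceHeaders(HINTERNET hRequest, const std::vector<std::wstring>& headers)
{
    const std::wstring block = JoinHeaders(headers);
    // The length is passed in characters (TCHARs), not in bytes.
    return HttpAddRequestHeadersW(hRequest,
                                  block.c_str(),
                                  static_cast<DWORD>(block.size()),
                                  HTTP_ADDREQ_FLAG_ADD | HTTP_ADDREQ_FLAG_REPLACE) != FALSE;
}
#endif
```

For instance, joining "Accept: */*" and "Pragma: no-cache" produces a single block in which each of the two lines ends with "\r\n".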

HttpSendRequest: sends the HTTP request

BOOL HttpSendRequest(
_In_ HINTERNET hRequest, // (1)
_In_ LPCTSTR lpszHeaders, // (2)
_In_ DWORD dwHeadersLength, // (3)
_In_ LPVOID lpOptional, // (4)
_In_ DWORD dwOptionalLength // (5)
);

(1) A handle returned by a call to the HttpOpenRequest function.
(2) A pointer to a null-terminated string that contains the additional headers to be appended to the request. This parameter can be NULL if there are no additional headers to be appended.
(3) The size of the additional headers, in TCHARs. If this parameter is -1L and lpszHeaders is not NULL, the function assumes that lpszHeaders is zero-terminated (ASCIIZ), and the length is calculated.
(4) A pointer to a buffer containing any optional data to be sent immediately after the request headers. This parameter is generally used for "POST" and "PUT" operations.
(5) The size of the optional data, in bytes.

HttpQueryInfo: retrieves information about the response to an HTTP request

Example: Retrieving HTTP Headers

BOOL HttpQueryInfo(
_In_ HINTERNET hRequest, // (1)
_In_ DWORD dwInfoLevel, // (2)
_Inout_ LPVOID lpvBuffer, // (3)
_Inout_ LPDWORD lpdwBufferLength, // (4)
_Inout_ LPDWORD lpdwIndex // (5)
);

(1) A handle returned by a call to the HttpOpenRequest or InternetOpenUrl function.
(2) A combination of an attribute to be retrieved and flags that modify the request. For a list of possible attribute and modifier values, see Query Info Flags.

Commonly used values include HTTP_QUERY_CONTENT_LENGTH (retrieves the size of the resource, in bytes), HTTP_QUERY_ACCEPT_RANGES (retrieves the types of range requests that are accepted for a resource), HTTP_QUERY_CONTENT_RANGE (retrieves the position of a partial entity-body within the full entity-body, together with the total size of the full entity-body), HTTP_QUERY_FLAG_NUMBER (returns the data as a 32-bit number for headers whose value is a number, such as the status code), and HTTP_QUERY_STATUS_CODE (receives the status code returned by the server), among others.

(3) A pointer to a buffer to receive the requested information.
(4) A pointer to a variable that contains, on entry, the size in bytes of the buffer pointed to by lpvBuffer. When the function returns successfully, this variable contains the number of bytes of information written to the buffer. In the case of a string, the byte count does not include the string's terminating null character.
(5) A pointer to a zero-based header index used to enumerate multiple headers with the same name.
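For text headers the required buffer size is not known in advance, and the sizes in parameter (4) are in bytes even when the data is wide-character text. A common pattern, sketched here under those assumptions, is to call the function once without a buffer to learn the size and then again with a buffer of that size. BytesToWideChars and QueryHeaderText are hypothetical helper names, hRequest is assumed to be a request that has already been sent, and the WinINet part is compiled only on Windows.

```cpp
#include <cstddef>
#include <string>

// HttpQueryInfo reports sizes in bytes, so convert to a character count
// (rounding up) before sizing a wide-character buffer.
std::size_t BytesToWideChars(std::size_t bytes)
{
    return (bytes + sizeof(wchar_t) - 1) / sizeof(wchar_t);
}

#ifdef _WIN32
#include <windows.h>
#include <wininet.h>
#pragma comment(lib, "wininet.lib")

// Reads one response header as text, for example HTTP_QUERY_CONTENT_TYPE.
// Returns false if the header is absent or the query fails.
bool QueryHeaderText(HINTERNET hRequest, DWORD infoLevel, std::wstring& value)
{
    DWORD bytes = 0;
    // First call without a buffer: expected to fail with
    // ERROR_INSUFFICIENT_BUFFER and to store the required size in `bytes`.
    if (HttpQueryInfoW(hRequest, infoLevel, nullptr, &bytes, nullptr)) {
        value.clear();  // nothing to copy
        return true;
    }
    if (GetLastError() != ERROR_INSUFFICIENT_BUFFER)
        return false;   // for example, the header was not found

    std::wstring buffer(BytesToWideChars(bytes), L'\0');
    if (!HttpQueryInfoW(hRequest, infoLevel, &buffer[0], &bytes, nullptr))
        return false;

    // On success `bytes` excludes the terminating null character.
    buffer.resize(bytes / sizeof(wchar_t));
    value.swap(buffer);
    return true;
}
#endif
```

Numeric values such as the status code do not need this two-step approach: combining the attribute with HTTP_QUERY_FLAG_NUMBER and passing a DWORD as the buffer is enough.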

InternetReadFile: reads data from a handle opened by InternetOpenUrl, FtpOpenFile, or HttpOpenRequest

To make sure that all of the data is read, call InternetReadFile in a loop until it returns TRUE and the value written through lpdwNumberOfBytesRead is 0.

BOOL InternetReadFile(
_In_ HINTERNET hFile, // (1)
_Out_ LPVOID lpBuffer, // (2)
_In_ DWORD dwNumberOfBytesToRead, // (3)
_Out_ LPDWORD lpdwNumberOfBytesRead // (4)
);

(1) Handle returned from a previous call to InternetOpenUrl, FtpOpenFile, or HttpOpenRequest.
(2) Pointer to a buffer that receives the data.
(3) Number of bytes to be read.
(4) Pointer to a variable that receives the number of bytes read. InternetReadFile sets this value to zero before doing any work or error checking.
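The termination rule (keep reading until a call succeeds with zero bytes) can be written once and reused. The sketch below separates the loop from WinINet so the loop itself can be exercised with any reader. ReadFn, ReadAll, and ReadResponseBody are hypothetical names, the 4096-byte default chunk size is an arbitrary choice, and the WinINet adapter is compiled only on Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Modeled on InternetReadFile: fill `buffer` with up to `capacity` bytes,
// store the count in *bytesRead, and return false on error.
using ReadFn = std::function<bool(void* buffer, std::uint32_t capacity,
                                  std::uint32_t* bytesRead)>;

// Calls `read` until it succeeds with zero bytes (end of data), appending
// every chunk to `out`. Returns false as soon as a read fails.
bool ReadAll(const ReadFn& read, std::string& out, std::size_t chunkSize = 4096)
{
    assert(chunkSize > 0);
    std::vector<char> chunk(chunkSize);
    for (;;) {
        std::uint32_t got = 0;
        if (!read(chunk.data(), static_cast<std::uint32_t>(chunk.size()), &got))
            return false;
        if (got == 0)
            return true;  // success with zero bytes: everything was read
        out.append(chunk.data(), got);
    }
}

#ifdef _WIN32
#include <windows.h>
#include <wininet.h>
#pragma comment(lib, "wininet.lib")

// Reads the whole response body of a request that has already been sent.
bool ReadResponseBody(HINTERNET hRequest, std::string& body)
{
    return ReadAll(
        [hRequest](void* buffer, std::uint32_t capacity, std::uint32_t* bytesRead) {
            DWORD got = 0;
            const BOOL ok = InternetReadFile(hRequest, buffer, capacity, &got);
            *bytesRead = got;
            return ok != FALSE;
        },
        body);
}
#endif
```

The body is collected in a std::string used as a byte container, because the response is raw bytes whose encoding depends on the server, not wide-character text.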

Sample code

#include <string>
#include <iostream>
#include <windows.h>
#include <WinINet.h>

using namespace std;

#pragma comment(lib, "WinINet.lib")

// Note: this sample passes wide strings to the generic (TCHAR) API names,
// so it must be built with UNICODE defined.
int main(int argc, char* argv[])
{
    wstring strURL = L"http://blog.csdn.net/yanglingwell/article/details/78258081";

    // Parse the URL into caller-supplied buffers.
    URL_COMPONENTS urlComponents;
    ZeroMemory(&urlComponents, sizeof(urlComponents));
    WCHAR lpszHostName[INTERNET_MAX_HOST_NAME_LENGTH] = {0};
    WCHAR lpszUserName[INTERNET_MAX_USER_NAME_LENGTH] = {0};
    WCHAR lpszPassword[INTERNET_MAX_PASSWORD_LENGTH] = {0};
    WCHAR lpszURLPath[INTERNET_MAX_URL_LENGTH] = {0};
    WCHAR lpszScheme[INTERNET_MAX_SCHEME_LENGTH] = {0};

    urlComponents.dwStructSize = sizeof(urlComponents);
    urlComponents.lpszScheme = lpszScheme;
    urlComponents.dwSchemeLength = INTERNET_MAX_SCHEME_LENGTH;
    urlComponents.lpszHostName = lpszHostName;
    urlComponents.dwHostNameLength = INTERNET_MAX_HOST_NAME_LENGTH;
    urlComponents.lpszUserName = lpszUserName;
    urlComponents.dwUserNameLength = INTERNET_MAX_USER_NAME_LENGTH;
    urlComponents.lpszPassword = lpszPassword;
    urlComponents.dwPasswordLength = INTERNET_MAX_PASSWORD_LENGTH;
    urlComponents.lpszUrlPath = lpszURLPath;
    urlComponents.dwUrlPathLength = INTERNET_MAX_URL_LENGTH;

    // dwUrlLength = 0: the URL is null-terminated. dwFlags = 0: no decoding.
    BOOL bSuccess = InternetCrackUrl(strURL.c_str(), 0, 0, &urlComponents);
    if (bSuccess == FALSE)
    {
        wcout << strURL << L" could not be parsed!" << endl;
        return 1;
    }
    else if (urlComponents.nScheme != INTERNET_SCHEME_HTTP)
    {
        wcout << strURL << L" is not an HTTP URL!" << endl;
        return 1;
    }

    HINTERNET hSession = NULL;
    HINTERNET hInternet = NULL;
    HINTERNET hRequest = NULL;

    do
    {
        // Initializes an application's use of the WinINet functions.
        // Returns a valid handle that the application passes to subsequent WinINet functions.
        // If InternetOpen fails, it returns NULL.
        hInternet = InternetOpen(L"yanglingwell", INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0);
        if (hInternet == NULL)
        {
            cout << "InternetOpen failed. errCode: " << GetLastError() << endl;
            break;
        }

        // Opens an HTTP session for a given site.
        // Returns a valid handle to the session if the connection is successful, or NULL otherwise.
        // Assign to the outer hSession (no redeclaration), so the cleanup code below can close it.
        hSession = InternetConnect(hInternet, urlComponents.lpszHostName, urlComponents.nPort,
                                   urlComponents.lpszUserName, urlComponents.lpszPassword,
                                   INTERNET_SERVICE_HTTP, 0, 0);
        if (hSession == NULL)
        {
            cout << "InternetConnect failed. errCode: " << GetLastError() << endl;
            break;
        }

        // Creates an HTTP request handle.
        // Returns an HTTP request handle if successful, or NULL otherwise.
        // NULL version = default HTTP version; NULL referrer = no referrer.
        hRequest = HttpOpenRequest(hSession, L"GET", urlComponents.lpszUrlPath, NULL, NULL, NULL, 0, 0);
        if (hRequest == NULL)
        {
            cout << "HttpOpenRequest failed. errCode: " << GetLastError() << endl;
            break;
        }

        // Build the header fields; each one ends with CR/LF.
        wstring strHeader;
        // Accepted media types
        strHeader += L"Accept: */*\r\n";
        // Disable caching
        strHeader += L"Pragma: no-cache\r\n";
        strHeader += L"Cache-Control: no-cache\r\n";
        // Add other header fields here...

        // Adds one or more HTTP request headers to the HTTP request handle.
        // The length is in characters (TCHARs).
        if (!HttpAddRequestHeaders(hRequest, strHeader.c_str(), static_cast<DWORD>(strHeader.length()),
                                   HTTP_ADDREQ_FLAG_ADD | HTTP_ADDREQ_FLAG_REPLACE))
        {
            cout << "HttpAddRequestHeaders failed. errCode: " << GetLastError() << endl;
            break;
        }

        // Sends the request; no extra headers and no request body.
        if (!HttpSendRequest(hRequest, NULL, 0, NULL, 0))
        {
            cout << "HttpSendRequest failed. errCode: " << GetLastError() << endl;
            break;
        }

        DWORD dwStatusCode = 0;
        DWORD dwSizeDW = sizeof(DWORD);
        if (!HttpQueryInfo(hRequest, HTTP_QUERY_FLAG_NUMBER | HTTP_QUERY_STATUS_CODE, &dwStatusCode, &dwSizeDW, NULL))
        {
            cout << "HttpQueryInfo failed. errCode: " << GetLastError() << endl;
            break;
        }
        cout << "StatusCode: " << dwStatusCode << endl;

        // The body is raw bytes, so read it into a byte buffer.
        char buf[2048];
        DWORD bufRead = 0;
        for (;;)
        {
            if (!InternetReadFile(hRequest, buf, sizeof(buf), &bufRead))
            {
                cout << "InternetReadFile failed. errCode: " << GetLastError() << endl;
                break;
            }
            if (bufRead == 0)
            {
                break; // TRUE with zero bytes read: the whole body was received
            }
            cout << "read " << bufRead << " bytes" << endl;
        }

    } while (FALSE);

    // Close the handles in the reverse order of their creation.
    if (hRequest != NULL)
    {
        InternetCloseHandle(hRequest);
    }
    if (hSession != NULL)
    {
        InternetCloseHandle(hSession);
    }
    if (hInternet != NULL)
    {
        InternetCloseHandle(hInternet);
    }

    return 0;
}