方案二:主备源IP用于AG通信、主备新增网卡绑定IP用于业务访问

AlwaysOn业务IP和高可用IP分开使用(三)_SQL Server


测试环境描述:

AG角色

主机名

网络适配器

IP地址

主库

TEST-GS-ZHXT1

Ethernet0

10.198.197.167

Ethernet1

10.198.197.173



备库

TEST-GS-ZHXT2

Ethernet0

10.198.197.168

Ethernet1

10.198.197.174



灾备

TEST-GS-ZHXT3

Ethernet0

10.198.194.183


总体思路:

将业务使用的网卡和高可用使用的网卡分开,高可用优先使用到Ethernet 0,配置网卡优先级。在故障切换时WSFC和AG优先使用Ethernet 0通信。业务访问使用的是SQL账号,无需Kerberos验证,是SQL验证。


调整网卡优先级:

开始->运行->输入“ncpa.cpl”->快捷键“Alt+N”->高级设置

AlwaysOn业务IP和高可用IP分开使用(三)_SQL Server_02

将源IP绑定的网络适配器如Ethernet0优先级调整到最一个。


对于访问的业务应用IP添加静态路由:

由于主机只能有一个默认网关,已设置到源IP绑定的网络适配器如Ehernet0上。对于需要访问新增IP的业务应用,需要使用route add -p添加静态路由,并使用if参数指定具体的网络适配器接口ID。


远程Windows验证登录SQL Server实例使用Kerberos验证:

使用域账号远程登录SQL Server实例,查看验证方式:

select * from sys.dm_exec_connections where session_id=@@spid;


无法使用Kerberos验证,使用的是NTLM。

参考:https://technet.microsoft.com/en-us/library/bb463166.aspx


打开Kerberos日志调试:

On an Active Directory server, Kerberos error messages are found in the Event Log. It is necessary to enable extended Kerberos logging before all message types will appear. To enable extended Kerberos logging, add a DWORD registry entry of LogLevel in the following location, and set it to 1:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters

The server must be started after this change before the logging will be implemented.

Error

ErrorName

Description

0x7

KDC_ERR_S_PRINCIPAL_UNKNOWN

Server not found in Kerberos database


根据错误判断,应该是SPN未注册或未正确注册。


参考:https://technet.microsoft.com/en-us/library/bb463167.aspx

Common DNS Issues

DNS problems are often encountered only during a service ticket request after a successful TGT request. If a client can successfully authenticate initially but is then unable to acquire a service ticket or access services, then DNS problems are the likely cause.

The error “Server not found in Kerberos database” is common and can be misleading because it often appears when the service principal is not missing. The error can be caused by domain/realm mapping problems or it can be the result of a DNS problem where the service principal name is not being built correctly. Server logs and network traces can be used to determine what service principal is actually being requested.

Kerberos recognizes short host names as different from long host names. For example, problems may occur if a client computer knows an application server as appserver1.example.com, but the Kerberos server knows the same computer as appserver1. Check that each host in the environment knows the others by using a consistent naming pattern.

Kerberos is case sensitive. Problems can occur in an environment using host names with mixed case. In the world of Kerberos, appserver1.EXAMPLE.COM and appserver1.example.com are not the same. Check that DNS resolves host names with consistent case.

Kerberos relies on the presence of both forward and reverse lookup entries in DNS. Check that the host name of each computer can be resolved to its IP address and that its IP address can be resolved to its host name.

DNS domain name ambiguities in a multidomain environment can result in subtle DNS issues. Check that each computer knows the others using the same domain name. Avoiding the use of short host names is particularly important in a multidomain environment.

Look carefully at the configuration of any multihomed hosts. You might need to perform network traces to determine which interfaces and what names are being used in requests to or from computers with multiple network cards.


根据上文中 “Kerberos relies on the presence of both forward and reverse lookup entries in DNS.”对于绑定了新的网卡的IP,需要到DNS去做反向解析。如下图:

AlwaysOn业务IP和高可用IP分开使用(三)_数据库_03

再去验证连接,就是Kerberos验证了。


参考:https://blogs.msdn.microsoft.com/apgcdsd/2011/09/26/kerberosntlm-sql-server/

SQL Server 2008/2008 R2

1) 当SPN被映射到正确的域或者内建机器账号时 (Local System, Network Service),本地连接会使用NTLM,而远程连接会使用Kerberos。

2) 当没有找到注册在正确的域或内建机器账号下的SPN时,连接会使用NTLM。

3) 当域中存在错误的SPN时,认证失败。

具体Kerberos验证的过程,可以参考:https://blogs.technet.microsoft.com/askds/2008/03/06/kerberos-for-the-busy-admin/


测试:

1. 从AG移除备节点,切换10.198.197.173和10.198.197.174,通过167和168登录服务器,修改Ethernet 1的IP地址后,禁用启用网卡。DNS反向查找区域中删除原来的解析,添加新对应关系的地址解析。

AlwaysOn业务IP和高可用IP分开使用(三)_SQL Server_04


查看WSFC状态:

AlwaysOn业务IP和高可用IP分开使用(三)_数据库_05

AlwaysOn业务IP和高可用IP分开使用(三)_SQL Server_06

通过Windows验证远程访问10.198.197.173和174,查看是否使用Kerberos验证:

AlwaysOn业务IP和高可用IP分开使用(三)_数据库_07

AlwaysOn业务IP和高可用IP分开使用(三)_SQL Server_08

AlwaysOn业务IP和高可用IP分开使用(三)_SQL Server_09

都能远程访问数据库服务。


2. 业务切换到备库测试完毕后,IP切换回来。

都能远程访问数据库服务。


总结:

方案二能满足业务IP和高可用IP分开使用的需求。能保证WSFC和AG对Ethernet 0的优先稳定使用,保证集群的安全可靠。