[Nginx] 04. nginx reverse proxy and load balancing

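Original article

xiexiaojun, 2015-11-13 11:41:09. Blog category: [Ops | Web]

© Copyright belongs to the author. This is an original work by xiexiaojun on the 51CTO blog; contact the author for permission before reproducing it, otherwise legal liability will be pursued.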

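I. nginx reverse proxy

1. Proxies

Proxies come in two kinds: forward proxies and reverse proxies.

1) Forward proxy

A forward proxy is a server that sits between the client and the origin server. To get content from the origin server, the client sends the proxy a request that names the target (the origin server); the proxy forwards the request to the origin server and returns the content it obtains to the client. A forward proxy acts on behalf of the client, and the client must be specially configured to use it.

2) Reverse proxy

A reverse proxy is the opposite: it acts on behalf of the server, and the client needs no special configuration. The client sends ordinary requests for content in the reverse proxy's name-space; the reverse proxy decides where (which origin server) to forward each request and returns the content to the client as if the content were its own.

From a security standpoint:

A forward proxy lets clients reach arbitrary sites through it while hiding the clients themselves, so you must take security measures to ensure that only authorized clients are served. A reverse proxy is transparent to the outside world: visitors do not know that they are talking to a proxy.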

2. Configuring nginx as a reverse proxy

Reverse proxying is implemented by ngx_http_proxy_module.

Example configuration:

location / {
    proxy_pass       http://localhost:8000;
    proxy_set_header Host      $host;
    proxy_set_header X-Real-IP $remote_addr;
}

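How nginx behaves as a reverse proxy:

nginx buffers the client request locally and forwards it to the backend server only after the whole request has been received. The backend's response, by contrast, is relayed to the client while it is still being received. Squid, in comparison, streams in both phases.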

Syntax:	`proxy_pass URL;`
Default:	—
Context:	`location`, `if in location`, `limit_except`

Sets the protocol and address of a proxied server and an optional URI to which a location should be mapped. As a protocol, “http” or “https” can be specified. The address can be specified as a domain name or IP address, and an optional port:

proxy_pass http://localhost:8000/uri/;

or as a UNIX-domain socket path specified after the word “unix” and enclosed in colons:

proxy_pass http://unix:/tmp/backend.socket:/uri/;

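nginx implements reverse proxying through the proxy module. As a web reverse proxy, nginx accepts client requests and dispatches them to upstream servers based on the URI, client parameters, or other processing logic. The most important directive here is proxy_pass, which proxies the URI defined by a location to the specified upstream server (or server group).

proxy_pass URL;

It defines the protocol, the address, and the URI to which the location is mapped.

Example:

1) Environment

Node7: 192.168.10.7, nginx as the reverse proxy server

Node5: 192.168.10.5, httpd as the upstream web server

2) Add the configuration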

server {
    root /www/c.net/;
    server_name www.c.net;
    location / {
        proxy_pass http://192.168.10.5/;
    }
}

Test (the original screenshot is not preserved).

The access logs at this point:

Node7:

192.168.10.10 - - [10/Mar/2017:22:43:00 +0800] "GET / HTTP/1.1" 200 13 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36" "-"

Node5:

192.168.10.7 - - [10/Mar/2017:22:43:00 +0800] "GET / HTTP/1.0" 200 13 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36"

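Note:

In the example above the location matches "/", so the trailing "/" on proxy_pass http://192.168.10.5/ is optional. If the location matches something other than "/" and that prefix should be mapped to "/" on the upstream server, proxy_pass http://192.168.10.5/ must end with "/".

Example: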

server {
    root /www/c.net/;
    server_name www.c.net;
    location /bbs/ {
        proxy_pass http://192.168.10.5/;
    }
}

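The trailing "/" in proxy_pass http://192.168.10.5/ is required here. Without it the request goes to http://192.168.10.5/bbs/ instead of "/bbs/" being mapped to "/". Because the matched location prefix is replaced by the URI given in proxy_pass, writing location /bbs without the trailing "/" also works, but /bbs/x is then sent upstream as //x; ending both the location and the proxy_pass URI with "/" avoids that.

In the logs below the upstream server (Node5) receives /bbs unchanged and answers 404, which is the behavior without the trailing "/":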

Node7:

192.168.10.10 - - [11/Mar/2017:09:41:25 +0800] "GET /bbs HTTP/1.1" 404 244 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36" "-"

Node5:

192.168.10.7 - - [11/Mar/2017:09:41:25 +0800] "GET /bbs HTTP/1.0" 404 279 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36"
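The prefix mapping described above can be modelled in a few lines of Python. This is only a sketch of the documented rule (a proxy_pass with a URI part replaces the matched location prefix, a proxy_pass without one passes the request URI through unchanged); the function name `upstream_uri` and its parameters are made up for illustration and are not part of nginx.

```python
from typing import Optional


def upstream_uri(location_prefix: str, proxy_pass_uri: Optional[str], request_uri: str) -> str:
    """Return the URI a prefix location would send to the upstream server.

    proxy_pass_uri is the URI part of proxy_pass ("/" in http://192.168.10.5/),
    or None when proxy_pass has no URI part (http://192.168.10.5).
    """
    if not request_uri.startswith(location_prefix):
        raise ValueError("request does not match this location")
    if proxy_pass_uri is None:
        # No URI part: the request URI is passed through unchanged.
        return request_uri
    # URI part present: the matched prefix is replaced by it.
    return proxy_pass_uri + request_uri[len(location_prefix):]


print(upstream_uri("/bbs/", "/", "/bbs/index.html"))   # /index.html
print(upstream_uri("/bbs/", None, "/bbs/index.html"))  # /bbs/index.html
print(upstream_uri("/bbs", "/", "/bbs/index.html"))    # //index.html
```

The third call shows why it is safer to end both the location and the proxy_pass URI with "/".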

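This mapping has two exceptions, however.

First, if the location is defined by a regular expression, the request URI is passed to the upstream server as is, and no replacement URI can be specified. In the example below, /bbs is proxied to http://192.168.10.5/bbs.

location ~ ^/bbs {
    proxy_pass http://192.168.10.5;  # no trailing "/" allowed here, otherwise nginx reports a configuration error
}

Second, if the URI is rewritten inside the location, nginx processes the request with the rewritten URI and ignores the URI given in proxy_pass. In the example below the URI sent to the upstream server is /index.php?page=<match>, not /index.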

location / {
     rewrite /(.*)$ /index.php?page=$1 break;
     proxy_pass http://localhost:8080/index;
}
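To see what `<match>` becomes for a concrete request, the rewrite rule above can be imitated with Python's `re` module. This is a rough sketch, not nginx's implementation: it ignores query-string handling, and the helper name `rewritten_uri` is hypothetical. nginx writes the capture as $1, Python writes it as \1.

```python
import re

REWRITE_PATTERN = r"/(.*)$"
REWRITE_REPLACEMENT = r"/index.php?page=\1"  # $1 in the nginx rule


def rewritten_uri(request_uri: str) -> str:
    """Return the URI after the rewrite; this is what goes upstream, not /index."""
    match = re.search(REWRITE_PATTERN, request_uri)
    if match is None:
        return request_uri
    return match.expand(REWRITE_REPLACEMENT)


print(rewritten_uri("/about"))     # /index.php?page=about
print(rewritten_uri("/bbs/list"))  # /index.php?page=bbs/list
```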

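The proxy module has a great many directives that control how it works, such as connection timeouts and the HTTP protocol version used for proxying. The commonly used ones are described briefly below.

proxy_connect_timeout

The maximum time nginx waits while establishing a connection to the upstream server.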

Syntax:	`proxy_connect_timeout time;`
Default:	proxy_connect_timeout 60s;
Context:	`http`, `server`, `location`

Defines a timeout for establishing a connection with a proxied server. It should be noted that this timeout cannot usually exceed 75 seconds.


proxy_cookie_domain

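Changes the domain attribute that the upstream server set in the Set-Cookie header to the specified value, which can be a string, a regular expression pattern, or a variable reference.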

Syntax:	`proxy_cookie_domain off;` `proxy_cookie_domain domain replacement;`
Default:	proxy_cookie_domain off;
Context:	`http`, `server`, `location`

Example 1:
    proxy_cookie_domain localhost example.org;
Example 2:
    proxy_cookie_domain www.$host $host;
Example 3:
    proxy_cookie_domain off;
    proxy_cookie_domain localhost example.org;
    proxy_cookie_domain www.example.org example.org;

proxy_cookie_path

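Changes the path attribute that the upstream server set in the Set-Cookie header to the specified value, which can be a string, a regular expression pattern, or a variable reference.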

Syntax:	`proxy_cookie_path off;` `proxy_cookie_path path replacement;`
Default:	proxy_cookie_path off;
Context:	`http`, `server`, `location`

Examples:
   proxy_cookie_path off;
   proxy_cookie_path /two/ /;
   proxy_cookie_path ~*^/user/([^/]+) /u/$1;
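The effect of the plain-string form, proxy_cookie_path /two/ /;, can be illustrated with a small Python model. It is a sketch under simplifying assumptions: only the prefix form is handled (not the regular-expression form), and the helper name `rewrite_cookie_path` and the sample cookie are invented for the example.

```python
def rewrite_cookie_path(set_cookie: str, path: str, replacement: str) -> str:
    """Rewrite the path attribute of a Set-Cookie value when it starts with `path`."""
    parts = [part.strip() for part in set_cookie.split(";")]
    rewritten = []
    for index, part in enumerate(parts):
        name, sep, value = part.partition("=")
        # parts[0] is the cookie itself (name=value); attributes follow it.
        if index > 0 and sep and name.lower() == "path" and value.startswith(path):
            part = f"{name}={replacement}{value[len(path):]}"
        rewritten.append(part)
    return "; ".join(rewritten)


print(rewrite_cookie_path("sid=abc; path=/two/some/uri/", "/two/", "/"))
# sid=abc; path=/some/uri/
```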

proxy_hide_header

Specifies header fields to hide from the response sent to the client.

Syntax:	`proxy_hide_header field;`
Default:	—
Context:	`http`, `server`, `location`

By default, nginx does not pass the header fields “Date”, “Server”, “X-Pad”, and “X-Accel-...” from the response of a proxied server to a client. The proxy_hide_header directive sets additional fields that will not be passed. If, on the contrary, the passing of fields needs to be permitted, the proxy_pass_header directive can be used.

proxy_set_header

Rewrites a header field of the request sent to the upstream server.

Syntax:	`proxy_set_header field value;`
Default:	proxy_set_header Host $proxy_host; proxy_set_header Connection close;
Context:	`http`, `server`, `location`

Example:
      proxy_set_header   Host        $proxy_host;
      proxy_set_header   Connection   close;
                          # when reverse proxying, keep-alive connections to the upstream server are disabled by default

proxy_redirect [default|off|redirect|replacement]

When the upstream server's response is a redirect or a refresh, proxy_redirect rewrites the Location or Refresh header of that response.

Syntax:	`proxy_redirect default;` `proxy_redirect off;` `proxy_redirect redirect replacement;`
Default:	proxy_redirect default;
Context:	`http`, `server`, `location`

proxy_redirect http://localhost:8000/two/ http://frontend/one/;  # replacement value taken from the nginx documentation example

proxy_send_timeout

The maximum interval between two successive write operations to the upstream server before the connection is closed.

Syntax:	`proxy_send_timeout time;`
Default:	proxy_send_timeout 60s;
Context:	`http`, `server`, `location`

proxy_read_timeout

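The maximum interval between two successive read operations from the upstream server before the connection is closed; default 60s.

proxy_pass_header

Allows header fields that are otherwise hidden to be passed to the client.

proxy_pass_request_body

Whether the body of the HTTP request is sent to the upstream server.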

Syntax:	`proxy_pass_request_body on \| off;`
Default:	proxy_pass_request_body on;
Context:	`http`, `server`, `location`

proxy_pass_request_headers

Whether the HTTP request headers are sent to the upstream server; default on.

Example:

proxy_redirect off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;  
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;  # used when the request passes through several proxies
client_max_body_size 10m;
client_body_buffer_size 128k;
proxy_connect_timeout 30;
proxy_send_timeout 15;
proxy_read_timeout 15;

3. Variables introduced by ngx_http_proxy_module

Embedded Variables

The ngx_http_proxy_module module supports embedded variables that can be used to compose headers using the proxy_set_header directive:

$proxy_host
name and port of a proxied server as specified in the proxy_pass directive;
$proxy_port
port of a proxied server as specified in the proxy_pass directive, or the protocol’s default port;
$proxy_add_x_forwarded_for
the “X-Forwarded-For” client request header field with the $remote_addr variable appended to it, separated by a comma. If the “X-Forwarded-For” field is not present in the client request header, the $proxy_add_x_forwarded_for variable is equal to the $remote_addr variable.

II. nginx load balancing

Load balancing is implemented by ngx_http_upstream_module.

1. Related directives

Syntax:	`upstream name { ... }`
Default:	—
Context:	`http`

Syntax:	`server address [parameters];`
Default:	—
Context:	`upstream`

Example:

upstream backend {
    server backend1.example.com weight=5;
    server 127.0.0.1:8080       max_fails=3 fail_timeout=30s;
    server unix:/tmp/backend3;

    server backup1.example.com  backup;
}

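weight=N: sets the weight; default 1

max_fails=N: passive health check; the number of failed attempts after which the server is marked unavailable; default 1

fail_timeout=N: how long an unavailable server is skipped before requests are sent to it again

down: manually takes the backend server out of service

backup: marks the server as a backup that is used only when all other nodes have failed. With ip_hash this node is not brought in automatically when every server is down

Usage notes:

1) upstream can only be used in the http context

2) each server is given directly as an IP address or host name, without a protocol

3) the default method is weighted round-robin

Usage:

1) first define the upstream in the http section of the main configuration file;

2) then reference it from a location inside a server section: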

upstream webservers {
    server 192.168.10.5;
    server 192.168.10.8;
}

server {
    root /www/c.net/;
    server_name www.c.net;
    location ~ /bbs {
        #proxy_pass http://192.168.10.5:8080;
        proxy_pass http://webservers;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

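2. Backend server health checks

nginx has two kinds of health check: passive and active. They are described in turn below.

Passive checks

Passive checks are implemented with max_fails and fail_timeout.

If forwarding a request to a server fails, or no response is received, nginx considers the server unavailable and stops forwarding to it for a while. By default, one failure stops forwarding for 10 seconds.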
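The bookkeeping behind a passive check can be sketched as follows. This is a simplified model, not nginx's actual accounting: the class `PassivePeer` and its method names are hypothetical, the defaults mirror the ones mentioned above (one failure, 10 seconds), and time is passed in explicitly so the example is deterministic.

```python
class PassivePeer:
    """Tracks failures for one upstream server, in the spirit of max_fails / fail_timeout."""

    def __init__(self, address, max_fails=1, fail_timeout=10.0):
        self.address = address
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self.fails = 0
        self.window_start = None  # time of the first failure in the current window
        self.down_until = 0.0     # requests are skipped until this time

    def available(self, now):
        return now >= self.down_until

    def report_failure(self, now):
        if self.max_fails == 0:
            return  # failure accounting disabled
        # Failures older than fail_timeout no longer count.
        if self.window_start is None or now - self.window_start > self.fail_timeout:
            self.window_start = now
            self.fails = 0
        self.fails += 1
        if self.fails >= self.max_fails:
            self.down_until = now + self.fail_timeout
            self.window_start = None
            self.fails = 0


peer = PassivePeer("192.168.10.5")
peer.report_failure(now=0.0)
print(peer.available(5.0))   # False
print(peer.available(10.0))  # True
```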

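Active checks

nginx periodically sends a special request to each application server to check whether it can be reached normally. This is called active checking.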

Syntax:	`health_check [parameters];`
Default:	—
Context:	`location`

The following optional parameters are supported:

interval=time
sets the interval between two consecutive health checks, by default, 5 seconds.
jitter=time
sets the time within which each health check will be randomly delayed, by default, there is no delay.
fails=number
sets the number of consecutive failed health checks of a particular server after which this server will be considered unhealthy, by default, 1.
passes=number
sets the number of consecutive passed health checks of a particular server after which the server will be considered healthy, by default, 1.
uri=uri
defines the URI used in health check requests, by default, “/”.
mandatory
sets the initial “checking” state for a server until the first health check is completed (1.11.7). If the parameter is not specified, the server will be initially considered healthy.
match=name
specifies the match block configuring the tests that a response should pass in order for a health check to pass. By default, the response should have status code 2xx or 3xx.
port=number
defines the port used when connecting to a server to perform a health check (1.9.7). By default, equals the server port.

health_check [interval=time] [fails=number] [passes=number] [uri=uri] [match=name];

Unfortunately, the health_check directive is available only in the commercial version (Nginx Plus).

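3. Scheduling algorithms

Weighted round-robin (default):

Requests are assigned to the backend servers one by one in order of arrival. If a backend server goes down it is removed automatically, so user access is not affected. weight sets the round-robin weight: the larger the weight, the larger the share of requests, which is useful mainly when the backend servers differ in capacity.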
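How weights translate into a request order can be shown with the "smooth" weighted round-robin scheme. Treat this as an assumption-laden sketch: it is a common formulation of the algorithm, the function name is made up, and the weights 2 and 1 are hypothetical values that do not appear in the configuration above.

```python
def smooth_weighted_round_robin(weights, count):
    """Return the order in which `count` requests are assigned to servers."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(count):
        # Every server earns its weight on each round ...
        for name, weight in weights.items():
            current[name] += weight
        # ... the server with the highest running score is chosen
        # and pays back the total weight.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


print(smooth_weighted_round_robin({"192.168.10.5": 2, "192.168.10.8": 1}, 6))
# ['192.168.10.5', '192.168.10.8', '192.168.10.5', '192.168.10.5', '192.168.10.8', '192.168.10.5']
```

Over every three requests the server with weight 2 receives two and the other receives one, and the heavier server's turns are interleaved rather than bunched together.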

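ip_hash:

Each request is assigned according to a hash of the client IP, so visitors from the same IP always reach the same backend server. This addresses the session-sharing problem of dynamic pages.

It is comparable to the sh algorithm in LVS and is used for session binding.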
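The idea of hashing on the client address can be sketched like this. The nginx documentation states that ip_hash keys on the first three octets of an IPv4 address; the use of `zlib.crc32` below is purely an assumption for illustration (nginx uses its own hash function), and `pick_by_ip_hash` is a hypothetical name.

```python
import zlib


def pick_by_ip_hash(client_ip, servers):
    """Map an IPv4 client address to one server, keyed on its first three octets."""
    key = ".".join(client_ip.split(".")[:3])
    index = zlib.crc32(key.encode("ascii")) % len(servers)
    return servers[index]


servers = ["192.168.10.5", "192.168.10.8"]
# Clients from the same /24 network land on the same backend.
print(pick_by_ip_hash("192.168.10.10", servers) == pick_by_ip_hash("192.168.10.99", servers))  # True
```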

least_conn:

Least connections; comparable to weighted least connections.
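A weighted least-connections choice can be sketched in one expression: pick the server whose active connection count, divided by its weight, is smallest. The function name, the connection counts, and the weights below are all hypothetical.

```python
def pick_least_conn(active, weights=None):
    """Pick the server with the fewest active connections relative to its weight."""
    weights = weights or {}
    return min(active, key=lambda name: active[name] / weights.get(name, 1))


active = {"192.168.10.5": 4, "192.168.10.8": 3}
print(pick_least_conn(active))                                          # 192.168.10.8
print(pick_least_conn(active, {"192.168.10.5": 2, "192.168.10.8": 1}))  # 192.168.10.5
```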

4. Other directives

Syntax:	`keepalive connections;`
Default:	—
Context:	`upstream`

The upper limit on the number of idle connections that nginx keeps open to the backend servers.

Syntax:	`sticky cookie name [expires=time] [domain=domain] [httponly] [secure] [path=path];` `sticky route $variable ...;` `sticky learn create=$variable lookup=$variable zone=name:size [timeout=time];`
Default:	—
Context:	`upstream`

upstream backend {
    server backend1.example.com;
    server backend2.example.com;

    sticky cookie srv_id expires=1h domain=.example.com path=/;
}

The sticky directive keeps sessions sticky based on a cookie. However, sticky is available only in the commercial version.

Syntax:	`sticky_cookie_insert name [expires=time] [domain=domain] [path=path];`
Default:	—
Context:	`upstream`

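III. Recording the real client IP on the backend web server

1. With a single proxy server

1) Edit the nginx configuration on node7 and set the SB-IP header to the value of $remote_addr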

server {
    root /www/c.net/;
    server_name www.c.net;
    location ~ /bbs/ {
        proxy_pass http://192.168.10.5;
        proxy_set_header SB-IP $remote_addr;
    }
}

2) Edit the log format in the httpd configuration on node5 so that it shows the value of SB-IP

#LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

LogFormat "%{SB-IP}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

Now compare the logs on node7 and node5:

### node7: unchanged

192.168.10.10 - - [11/Mar/2017:10:52:07 +0800] "GET /bbs/ HTTP/1.1" 200 12 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36" "-"

### node5: now shows the real client IP

192.168.10.10 - - [11/Mar/2017:10:52:07 +0800] "GET /bbs/ HTTP/1.0" 200 12 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36"

proxy_set_header lets you define any header name and value you like. Add one more arbitrary header and value:

server {
    root /www/c.net/;
    server_name www.c.net;
    location ~ /bbs/ {
        proxy_pass http://192.168.10.5;
        proxy_set_header SB-IP $remote_addr;
        proxy_set_header SB-HOST sbsb;
    }
}

Show that header's value in the log on the backend server node5:

LogFormat "%{SB-IP}i %{SB-HOST}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

Check the log:

192.168.10.10 sbsb - - [11/Mar/2017:11:03:21 +0800] "GET /bbs/ HTTP/1.0" 200 12 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36"

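Summary:

With a single proxy level (only one reverse proxy), we can use proxy_set_header to set a custom header whose value is the variable $remote_addr, so that the backend web server's log records the real client IP (the log format must be changed accordingly). For generality, the header that carries the real client IP is conventionally named X-Real-IP.

When customizing the httpd access log format, use %{X-Real-IP}i to read the value; the header name is case-insensitive.

If the backend web server is nginx, use $http_X_Real_IP to read the value; only "_" can be used, not "-", and the variable name is case-insensitive.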
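The naming rule for these nginx variables is mechanical: take the header name, lowercase it, replace each "-" with "_", and prefix it with $http_. A tiny helper makes the rule explicit; the function name `nginx_header_variable` is invented for this example.

```python
def nginx_header_variable(header_name):
    """Return the nginx variable that holds the given request header."""
    return "$http_" + header_name.lower().replace("-", "_")


print(nginx_header_variable("X-Real-IP"))        # $http_x_real_ip
print(nginx_header_variable("X-Forwarded-For"))  # $http_x_forwarded_for
```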

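With several proxy servers, how does the backend web server record the real client IP?

Set X-Real-IP on every proxy server?

That clearly does not work. Suppose there are two nginx proxies, nginx 1 and nginx 2.

On nginx 1, $remote_addr is the real client IP, but on nginx 2 $remote_addr is nginx 1's IP, so the real client IP never reaches the backend web server. Besides the client IP, the backend may also need to record the IP of each proxy the request passed through, which X-Real-IP cannot do. This is where the XFF header comes in.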

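2. X-Forwarded-For

First, the definition of X-Forwarded-For:
X-Forwarded-For, XFF for short, identifies the client, that is, the real IP of the originator of the HTTP request. It is added only when the request passes through an HTTP proxy (the XFF header has to be set explicitly on the proxy). It is not a standard request header defined in an RFC.
The usual format is:
X-Forwarded-For: client1, proxy1, proxy2
As the format shows, the X-Forwarded-For header can carry several addresses separated by commas.

The first item is the real client IP; the rest are the proxies or load balancers the request has passed through, one entry per hop.

After passing through several nginx proxies, the X-Forwarded-For value should be: client IP, Nginx1, Nginx2, ...

By default the request carries no X-Forwarded-For header; it has to be set explicitly on the proxy server with proxy_set_header:
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

$proxy_add_x_forwarded_for is a variable introduced by ngx_http_proxy_module. It holds the "X-Forwarded-For" header of the client request with $remote_addr appended, separated by a comma; if the request has no "X-Forwarded-For" header, $proxy_add_x_forwarded_for equals $remote_addr.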
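The definition of the variable fits in a few lines of Python. This is a sketch of the documented rule only; the function simply borrows the variable's name and is not an nginx API.

```python
from typing import Optional


def proxy_add_x_forwarded_for(incoming_xff: Optional[str], remote_addr: str) -> str:
    """Value of $proxy_add_x_forwarded_for for one request on one proxy."""
    if not incoming_xff:
        # No X-Forwarded-For in the request: the variable is just $remote_addr.
        return remote_addr
    # Otherwise $remote_addr is appended after a comma.
    return f"{incoming_xff}, {remote_addr}"


print(proxy_add_x_forwarded_for(None, "192.168.10.10"))            # 192.168.10.10
print(proxy_add_x_forwarded_for("192.168.10.10", "192.168.10.7"))  # 192.168.10.10, 192.168.10.7
```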

Example:

Node7 is the nginx proxy server

Node5 is the backend web server

Node7:

server {
    root /www/c.net/;
    server_name www.c.net;
    location ~ /bbs {
        proxy_pass http://192.168.10.5:8080;
        proxy_set_header X-Real-IP $remote_addr;
        # proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;  # XFF is not set at this point
        proxy_set_header SB-HOST sbsb;
    }
}

Node5: set the log format

log_format  main  '$http_x_real_ip - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for" "$proxy_add_x_forwarded_for"';

Log entry on Node5 after the test:

192.168.10.10 - - [11/Mar/2017:13:07:25 +0800] "GET /bbs/ HTTP/1.0" 200 5 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36" "-" "192.168.10.7"

Log entry on Node5 after enabling the XFF setting on Node7 and testing again:

192.168.10.10 - - [11/Mar/2017:13:21:12 +0800] "GET /bbs/ HTTP/1.0" 200 5 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36" "192.168.10.10" "192.168.10.10, 192.168.10.7"

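The test above shows how XFF and the variable $proxy_add_x_forwarded_for relate to each other. Moving on:

With several proxy servers, setting X-Forwarded-For to $proxy_add_x_forwarded_for leads to one of two situations.

Assume two proxy servers, nginx 1 and nginx 2, and one backend web server.

1. Nginx 1 does not set the X-Forwarded-For header (this normally does not happen), while Nginx 2 sets X-Forwarded-For to $proxy_add_x_forwarded_for.

The request then reaches Nginx 2 with an empty X-Forwarded-For, so $proxy_add_x_forwarded_for on Nginx 2 is just Nginx 1's IP, because from Nginx 2's point of view the client is Nginx 1. That is the value the backend web server receives, so it cannot record the real user's IP.

2. Nginx 1 and nginx 2 both set X-Forwarded-For to $proxy_add_x_forwarded_for.

By the definition of $proxy_add_x_forwarded_for, each proxy appends its own $remote_addr, so the backend web server should receive X-Forwarded-For as "real client IP, nginx 1 IP", and $proxy_add_x_forwarded_for evaluated on the backend is that value with nginx 2's IP appended. In the author's test recorded below, the backend instead logged X-Forwarded-For as "real client IP" and $proxy_add_x_forwarded_for as "real client IP, nginx 2 IP", which is the result when the last proxy forwards the header unchanged rather than appending to it; in that setup only the last proxy's IP is recorded, not every proxy along the way.

In either case, a backend program that reads the client IP from X-Forwarded-For only needs the first comma-separated item.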
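Taking the first comma-separated item is a one-liner on the backend. A minimal sketch, assuming the header value has the "client, proxy1, proxy2" shape shown earlier; the function name is hypothetical, and "-" is treated as "no value" because that is how an empty field appears in the logs above.

```python
def client_ip_from_xff(xff):
    """Return the first address in an X-Forwarded-For value, or None if it is empty."""
    first = xff.split(",")[0].strip()
    if first in ("", "-"):
        return None
    return first


print(client_ip_from_xff("192.168.10.10, 192.168.10.7"))  # 192.168.10.10
```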

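This is where many people get confused, so let us record the values of XFF and AXFF ($proxy_add_x_forwarded_for) step by step.

For brevity 192.168.10.10 is written as 10.10, and likewise for the other addresses.

CIP (10.10) --> Nginx 1 (10.7) --> Nginx 2 (10.5) --> WEB (10.8)

Hop              XFF       AXFF
Nginx 1 (10.7)   -         10.10
Nginx 2 (10.5)   10.10     10.10,10.7
WEB (10.8)       10.10     10.10,10.5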

192.168.10.10 - - [11/Mar/2017:14:14:51 +0800] "GET /bbs/ HTTP/1.0" 200 5 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36" "192.168.10.10" "192.168.10.10, 192.168.10.5"

CIP (10.10) --> Nginx 1 (10.7) --> Nginx 2 (10.5) --> Nginx 3 (10.4) --> WEB (10.8)

Hop              XFF           AXFF
Nginx 1 (10.7)   -             10.10
Nginx 2 (10.5)   10.10         10.10,10.7
Nginx 3 (10.4)   10.10,10.7    10.10,10.7,10.5
WEB (10.8)       10.10,10.7    10.10,10.7,10.4
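To trace a whole chain hop by hop, the per-hop rule can be applied in a loop. The sketch below is a model built on the documented definition of $proxy_add_x_forwarded_for, not a reproduction of the author's servers: the function name is made up, and each proxy carries a flag saying whether it sets X-Forwarded-For to $proxy_add_x_forwarded_for or forwards the incoming header unchanged.

```python
from typing import List, Optional, Tuple


def through_proxies(client_ip: str, proxies: List[Tuple[str, bool]]) -> Tuple[Optional[str], str]:
    """Return (X-Forwarded-For, remote address) as seen by the server behind the last proxy.

    Each proxy is (ip, appends); appends=True means the proxy sets
    X-Forwarded-For to $proxy_add_x_forwarded_for.
    """
    xff = None
    remote_addr = client_ip
    for proxy_ip, appends in proxies:
        if appends:
            xff = remote_addr if xff is None else f"{xff}, {remote_addr}"
        # Without the directive the incoming header is passed on as it is.
        remote_addr = proxy_ip
    return xff, remote_addr


# Both proxies append: the backend sees the client and the first proxy.
print(through_proxies("192.168.10.10", [("192.168.10.7", True), ("192.168.10.5", True)]))
# ('192.168.10.10, 192.168.10.7', '192.168.10.5')

# The last proxy forwards the header unchanged: the backend sees only the client.
print(through_proxies("192.168.10.10", [("192.168.10.7", True), ("192.168.10.5", False)]))
# ('192.168.10.10', '192.168.10.5')
```

The second call yields the values recorded in the two-proxy log above (X-Forwarded-For 192.168.10.10, with 192.168.10.5 as the address appended on the backend), while the first shows what the backend receives when every proxy appends.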

3. So how can the backend web server record the IP of every proxy that the client's request passed through?