Varnish Cache

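Original article

qq59b54138c2c0b 2018-06-08 14:40:10

Tags: varnish, cache  Category: Operations

© Copyright belongs to the author: this is an original work by 51CTO blog author qq59b54138c2c0b. Contact the author for permission before reposting; unauthorized reposting may result in legal action.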

Web Page Cache：

squid --> varnish

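Program execution exhibits locality:
	Temporal locality: data that has just been accessed is likely to be accessed again soon;
	Spatial locality: when a piece of data is accessed, nearby data is likely to be accessed as well.
	
cache: hit
	
	Hot spots: a consequence of locality;
		Freshness:
			Cache space exhausted: evict by LRU (least recently used);
			Expired: cache cleanup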
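
The two eviction triggers above (space exhausted, entry expired) can be sketched in a few lines of Python. This is only an illustration of the idea, not how varnish implements its storage; the class name `LRUCache` and its methods are made up for this example.

```python
import time
from collections import OrderedDict

class LRUCache:
    """Tiny cache: evicts by LRU when full, drops expired entries on access."""

    def __init__(self, capacity, clock=time.monotonic):
        self.capacity = capacity
        self.clock = clock            # injectable clock keeps the example testable
        self.items = OrderedDict()    # key -> (value, expires_at)

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None                      # miss
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.items[key]              # expired: clean up and report a miss
            return None
        self.items.move_to_end(key)          # mark as most recently used
        return value

    def put(self, key, value, ttl):
        self.items[key] = (value, self.clock() + ttl)
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used entry
```

An `OrderedDict` keeps the recency order for free: every hit moves the key to the end, so the front of the dict is always the eviction candidate.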
			
Cache hit ratio: hit/(hit+miss)
	Range: (0,1)
	Document hit ratio: measured by the number of pages
	Byte hit ratio: measured by the size of the pages
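
As a small worked example of the two ratios, the following Python sketch computes both from a list of requests. The function name `hit_ratios` and the sample data are invented for illustration.

```python
def hit_ratios(requests):
    """requests: iterable of (hit: bool, size_in_bytes: int) pairs."""
    total = hits = total_bytes = hit_bytes = 0
    for hit, size in requests:
        total += 1
        total_bytes += size
        if hit:
            hits += 1
            hit_bytes += size
    if total == 0 or total_bytes == 0:
        return 0.0, 0.0
    # document hit ratio, byte hit ratio
    return hits / total, hit_bytes / total_bytes

# Three small pages served from cache, one large page fetched from the backend
log = [(True, 100), (True, 100), (True, 100), (False, 700)]
print(hit_ratios(log))   # (0.75, 0.3)
```

The same traffic gives a 75% document hit ratio but only a 30% byte hit ratio, which is why the two are tracked separately.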
	
Cacheable or not:
	Private data: private, stored only in a private cache;
	Public data: public, may be stored in a public or a private cache;

Cache-related Header Fields
	The most important caching header fields are:

		Expires: expiration time;
			Expires:Thu, 22 Oct 2026 06:34:30 GMT
		Cache-Control: max-age=
		
		
		Etag
		If-None-Match
		
		Last-Modified
		If-Modified-Since
		
		Vary: the cached data may exist in different representations, e.g. compressed
		Age

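	Mechanisms for judging cache validity:
		Expiration time:
			HTTP/1.0
				Expires: absolute expiration time
			HTTP/1.1
				Cache-Control: max-age=
				Cache-Control: s-maxage=
		Conditional requests:
			Last-Modified/If-Modified-Since: validate by the file's modification timestamp;
			Etag/If-None-Match: validate by the file's checksum (entity tag);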
			
		Expires:Thu, 13 Aug 2026 02:05:12 GMT
		Cache-Control:max-age=315360000
		ETag:"1ec5-502264e2ae4c0"
		Last-Modified:Wed, 03 Sep 2014 10:00:27 GMT
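
The two validity mechanisms can be sketched roughly as follows. This is a simplified model under stated assumptions, not a full HTTP implementation: `is_fresh` and `revalidate` are hypothetical helper names, headers are passed as a plain dict, and the ETag comparison is a simple string equality.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

def is_fresh(stored_at, max_age, now):
    """Expiration check: the copy is fresh while its age is below max-age."""
    return (now - stored_at).total_seconds() < max_age

def revalidate(request_headers, etag, last_modified):
    """Conditional request on the origin side: 304 if the client copy is still valid."""
    if "If-None-Match" in request_headers:
        # The entity tag takes precedence over the timestamp when both are sent
        return 304 if request_headers["If-None-Match"] == etag else 200
    if "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        return 304 if parsedate_to_datetime(last_modified) <= since else 200
    return 200

# Values taken from the sample headers above
etag = '"1ec5-502264e2ae4c0"'
last_modified = "Wed, 03 Sep 2014 10:00:27 GMT"
stored_at = datetime(2018, 6, 8, tzinfo=timezone.utc)   # arbitrary example time

print(is_fresh(stored_at, 315360000, stored_at + timedelta(days=1)))   # True
print(revalidate({"If-None-Match": etag}, etag, last_modified))        # 304
```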
		
	Cache hierarchy:
		Private cache: the local cache built into the user agent;
		Public cache: the cache provided by a reverse proxy server;
		
		User-Agent <--> private cache <--> public cache <--> public cache 2 <--> Origin Server
		
		Cache-Control: key=value, key=value

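Request directives tell the cache how it may use cached content to answer the request:
	cache-request-directive = 
		"no-cache"
		| "no-store"
		| "max-age" "=" delta-seconds
		| "max-stale" [ "=" delta-seconds ]  the client accepts a stale response, optionally only up to this many seconds past its expiry
		| "min-fresh" "=" delta-seconds      the response must remain fresh for at least this many more seconds
		| "no-transform"
		| "only-if-cached"
		| cache-extension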

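Response directives tell the cache server how to store the content returned by the upstream server:
	cache-response-directive =
		"public"
		| "private" [ "=" <"> 1#field-name <"> ]
		| "no-cache" [ "=" <"> 1#field-name <"> ]  may be cached, but must be revalidated before being served to a client, i.e. a conditional request must be sent to check that the cached copy is still valid;
		| "no-store"  the response must not be stored in any cache;
		| "no-transform"
		| "must-revalidate"
		| "proxy-revalidate"
		| "max-age" "=" delta-seconds           applies to both private and public caches
		| "s-maxage" "=" delta-seconds          mainly controls public (shared) caches
		| cache-extension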
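
To make the `key=value, key=value` form concrete, here is a rough Python sketch that parses a Cache-Control value and derives the TTL a public cache could use. It is deliberately simplified: the helper names are invented, and the naive comma split does not handle quoted field-name lists such as `private="a, b"`.

```python
def parse_cache_control(value):
    """Split 'key=value, key' into a dict; directives without a value map to True."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives

def shared_cache_ttl(value):
    """Seconds a public cache may serve the response without revalidating, else None."""
    d = parse_cache_control(value)
    if "no-store" in d or "private" in d or "no-cache" in d:
        return None
    for name in ("s-maxage", "max-age"):   # s-maxage wins for public caches
        if name in d:
            return int(d[name])
    return None

print(shared_cache_ttl("public, max-age=60, s-maxage=600"))   # 600
print(shared_cache_ttl("private, max-age=60"))                # None
```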
		
Open-source solutions:
	squid (named after the sea animal): the older caching software, but stable
	varnish:
	squid is to varnish what apache is to nginx
	varnish official site: http://www.varnish-cache.org/
		Community
		Enterprise
		
		 This is Varnish Cache, a high-performance HTTP accelerator. 
		
	Program architecture:
		Manager process
		Cacher process, containing several types of threads:
			accept, worker, expiry, ... 
		shared memory log:
			Statistics: counters;
			Log area: log records;
				varnishlog, varnishncsa, varnishstat... 
			
		Configuration interface: VCL
			Varnish Configuration Language, 
				vcl compiler --> c compiler --> shared object 

				
	varnish program environment:
		/etc/varnish/varnish.params: configures how the varnish service process runs, e.g. the listening address and port and the cache storage;
			Unit File: EnvironmentFile=""
		/etc/varnish/default.vcl: configures the caching policy of the Child/Cache threads;
			VCL: a DSL made of subroutines (code blocks)
		Main program:
			/usr/sbin/varnishd
		CLI interface:
			/usr/bin/varnishadm
		Shared Memory Log tools:
			/usr/bin/varnishhist
			/usr/bin/varnishlog
			/usr/bin/varnishncsa	log output tool; runs as a daemon and reads log entries as soon as they appear
			/usr/bin/varnishstat
			/usr/bin/varnishtop
		Test tool:
			/usr/bin/varnishtest
		VCL reload program:
			/usr/sbin/varnish_reload_vcl	translates, compiles, and activates the configuration
		Systemd Unit Files:
			/usr/lib/systemd/system/varnish.service	the varnish service
			/usr/lib/systemd/system/varnishlog.service	persists the log in raw format
			/usr/lib/systemd/system/varnishncsa.service	persists the log in combined format

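	varnish cache storage (Storage Types):
		-s [name=]type[,options]
		
		· malloc[,size]
			In-memory storage; [,size] sets the size; all cached objects are lost on restart;
		· file[,path[,size[,granularity]]]
			Disk-file storage, a black box; granularity is the allocation granularity; all cached objects are lost on restart;
		· persistent,path,size
			File storage, a black box; cached objects survive a restart; experimental;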

	varnishd options:
		Program options: set in /etc/varnish/varnish.params
			-a address[:port][,address[:port][...]], default port 6081;
			-T address[:port], default port 6082;
			-s [name=]type[,options], defines the cache storage;
			-u user
			-g group
			-f config: the VCL configuration file;
			-F: run in the foreground;
			...
		Runtime parameters: set in /etc/varnish/varnish.params, DAEMON_OPTS
			DAEMON_OPTS="-p thread_pool_min=5 -p thread_pool_max=500 -p thread_pool_timeout=300"

			-p param=value: sets a runtime parameter and its value; may be repeated;
			-r param[,param...]: makes the listed parameters read-only;
			
	Reloading the vcl configuration file:
		~ ]# varnish_reload_vcl
			
	varnishadm
		-S /etc/varnish/secret -T [ADDRESS:]PORT 
  
		help [<command>]
		ping [<timestamp>]		liveness test
		auth <response>
		quit
		banner	show the welcome banner
		status
		start		start the child process
		stop
		vcl.load <configname> <filename>
		vcl.inline <configname> <quoted_VCLstring>
		vcl.use <configname>
		vcl.discard <configname>
		vcl.list	list the loaded vcl configurations; "active" marks the one currently in use
		param.show [-l] [<param>]	list the tunable parameters
		param.set <param> <value>
		panic.show
		panic.clear
		storage.list
		vcl.show [-v] <configname>	show the named configuration; with -v the built-in default VCL is shown as well
		backend.list [<backend_expression>]
		backend.set_health <backend_expression> <state>
		ban <field> <operator> <arg> [&& <field> <oper> <arg>]...	invalidate cached objects matching the expression (regex supported)
		ban.list	list the active bans
		
		Configuration-related:
			vcl.list 
			vcl.load: load and compile;
			vcl.use: activate;
			vcl.discard: delete;
			vcl.show [-v] <configname>: show the details of the named configuration;
			
		Runtime parameters:
			param.show -l: show the list;
			param.show <PARAM>
			param.set <PARAM> <VALUE>
			
		Cache storage:
			storage.list
			
		Backend servers:
			backend.list 

			
	VCL：
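		A domain-specific configuration language;
		
		state engine;
		
		VCL has multiple state engines. The states are related to each other, but the engines are isolated from one another; each engine uses return(x) to name the next engine to hand over to; each engine corresponds to one configuration section of the vcl file, i.e. a subroutine
		
			vcl_hash --> return(lookup) --> vcl_hit (on a cache hit)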
			
		Default configuration of vcl_recv:
		
			sub vcl_recv {
				if (req.method == "PRI") {
					/* We do not support SPDY or HTTP/2.0 */
					/* SPDY is Google's HTTP acceleration protocol */
					return (synth(405));
				}
				if (req.method != "GET" &&
				    req.method != "HEAD" &&
				    req.method != "PUT" &&
				    req.method != "POST" &&
				    req.method != "TRACE" &&
				    req.method != "OPTIONS" &&
				    req.method != "DELETE") {
					/* Non-RFC2616 or CONNECT which is weird. */
					return (pipe);
				}

				if (req.method != "GET" && req.method != "HEAD") {
					/* We only deal with GET and HEAD by default */
					return (pass);
				}
				if (req.http.Authorization || req.http.Cookie) {
					/* Not cacheable by default */
					return (pass);
				}
				return (hash);
			}
		
			
		Client Side：
			vcl_recv, vcl_pass, vcl_hit, vcl_miss, vcl_pipe, vcl_purge, vcl_synth, vcl_deliver
			
			vcl_recv：
				hash：vcl_hash
				pass: vcl_pass 
				pipe: vcl_pipe
				synth: vcl_synth
				purge: vcl_hash --> vcl_purge
				
			vcl_hash：
				lookup：
					hit: vcl_hit
					miss: vcl_miss
					pass, hit_for_pass: vcl_pass
					purge: vcl_purge
			
		Backend Side：
			vcl_backend_fetch, vcl_backend_response, vcl_backend_error
	
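		Two special engines:
			vcl_init: vcl code run before any request is processed; mainly used to initialize VMODs;
			vcl_fini: called when the vcl configuration is discarded, after all requests have finished; mainly used to clean up VMODs;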
		
	vcl的语法格式：
		(1) VCL files start with vcl 4.0;
		(2) //, # and /* foo */ for comments;
		(3) Subroutines are declared with the sub keyword, e.g. sub vcl_recv { ...};
		(4) No loops, state-limited variables (built-in variables are only available in certain engines);
		(5) Terminating statements with a keyword for next action as argument of the return() function, i.e.: return(action); this is how state engine transitions are made;
		(6) Domain-specific;
		
	The VCL Finite State Machine
		(1) Each request is processed separately;
		(2) Each request is independent from others at any given time;
		(3) States are related, but isolated;
		(4) return(action); exits one state and instructs Varnish to proceed to the next state;
		(5) Built-in VCL code is always present and appended below your own VCL;
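
The idea that `return(action)` leaves one state and selects the next can be modelled as a lookup table. The sketch below covers only a simplified subset of the client-side flow listed earlier; the table and the `walk` helper are illustrative and are not part of varnish.

```python
# (state, action or lookup outcome) -> next state
TRANSITIONS = {
    ("vcl_recv", "hash"): "vcl_hash",
    ("vcl_recv", "pass"): "vcl_pass",
    ("vcl_recv", "pipe"): "vcl_pipe",
    ("vcl_recv", "synth"): "vcl_synth",
    ("vcl_hash", "hit"): "vcl_hit",      # outcome of the cache lookup
    ("vcl_hash", "miss"): "vcl_miss",
    ("vcl_hit", "deliver"): "vcl_deliver",
}

def walk(actions, state="vcl_recv"):
    """Follow a sequence of actions and return the states visited."""
    path = [state]
    for action in actions:
        state = TRANSITIONS[(state, action)]   # KeyError means an invalid transition
        path.append(state)
    return path

print(" --> ".join(walk(["hash", "hit", "deliver"])))
# vcl_recv --> vcl_hash --> vcl_hit --> vcl_deliver
```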
		
	Three main kinds of syntax:
		sub subroutine {
			...
		}
		
		if CONDITION {
			...
		} else {	
			...
		}
		
		return(), hash_data()
		
	VCL Built-in Functions and Keywords
		Functions:
			regsub(str, regex, sub)
			regsuball(str, regex, sub)
			ban(boolean expression)
			hash_data(input)
			synthetic(str)
			
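		Keywords:
			call subroutine, return(action), new, set, unset 
			
		Operators:
			==, !=, ~, >, >=, <, <=
			Logical operators: &&, ||, !
			Variable assignment: =
			
		Example: obj.hits is a built-in variable holding the number of times a cached object has been hit;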
			if (obj.hits>0) {
				set resp.http.X-Cache = "HIT via" + " " + server.ip;
			} else {
				set resp.http.X-Cache = "MISS from " + server.ip;
			}
					
	
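	Variable types:
		Built-in variables:
			req.*: request; the request message sent by the client;
				req.http.*
					req.http.User-Agent, req.http.Referer, ...
			bereq.*: the http request varnish sends to the BE (backend) host;
				bereq.http.*
			beresp.*: the response message the BE host returns to varnish;
				beresp.http.*
			resp.*: the response varnish sends to the client;
			obj.*: attributes of the object stored in the cache; read-only;
			
			Commonly used variables:
				bereq.*, req.*:
					bereq.http.HEADERS
					bereq.method: request method;
					bereq.url: requested url;
					bereq.proto: protocol version of the request;
					bereq.backend: the backend host to use;
					
					req.http.Cookie: value of the Cookie header in the client request; 
					req.http.User-Agent ~ "chrome"	matches the browser type
					
					
				beresp.*, resp.*:
					beresp.http.HEADERS
					beresp.status: response status code;
					beresp.proto: protocol version;
					beresp.backend.name: name of the BE host;
					beresp.ttl: remaining time the BE response may be cached;
					
				obj.*
					obj.hits: number of times this object has been hit in the cache;
					obj.ttl: the object's ttl
					
				server.*
					server.ip: IP of the varnish host;
					server.hostname: hostname of the varnish host;
				client.*
					client.ip: IP of the client sending the request to varnish;
			
		User-defined:
			set 
			unset 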
		
	Example 1: force requests for certain resources to bypass the cache:
		sub vcl_recv {
			# (?...) introduces regex flags; i means ignore case
			if (req.url ~ "(?i)^/(login|admin)") {
				return(pass);
			}
		}

	Example 2: for certain resource types, such as public images, strip the private marker and force the time varnish may cache them; define in vcl_backend_response:
		if (beresp.http.cache-control !~ "s-maxage") {
			if (bereq.url ~ "(?i)\.(jpg|jpeg|png|gif|css|js)$") {
				unset beresp.http.Set-Cookie;
				set beresp.ttl = 3600s;
			}
		}
		
	Example 3: define in vcl_recv:
		if (req.restarts == 0) {
			# req.restarts counts restarts: after the url is changed the request can be sent
			# back to vcl_recv for another check; it is 1 after the first restart, 2 after the second
			if (req.http.X-Forwarded-For) {
				set req.http.X-Forwarded-For = req.http.X-Forwarded-For + "," + client.ip;
			} else {
				set req.http.X-Forwarded-For = client.ip;
			}
		}

	Lab: on varnish, keep the private data under /login out of the cache
		vim default.vcl
			backend default {
				# backend host; the port is changed to 80 for easier access
				.host = "172.18.62.63";
				.port = "80";
			}

			sub vcl_recv {
				# requests to http://172.18.62.61/login/ bypass the cache
				if (req.url ~ "^/login" ) {
					return(pass);
				}
			}
			# Note: be sure to pick the appropriate state engine
		varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082	enter the management CLI
			vcl.load test2 default.vcl	load the configuration (translate --> compile)
			vcl.use test2	activate the compiled configuration
			vcl.list	check that test2 is in use

	Trimming cached objects: purge, ban
		(1) Make purge executable
			sub vcl_purge {
				return (synth(200,"Purged"));
			}

		(2) When to execute a purge
			sub vcl_recv {
				if (req.method == "PURGE") {
					return(purge);
				}
				...
			}
			
		Adding access control rules for such requests:
			acl purgers {
				"127.0.0.0"/8;
				"10.1.0.0"/16;
			}
			
			sub vcl_recv {
				if (req.method == "PURGE") {
					if (!client.ip ~ purgers) {
						return(synth(405,"Purging not allowed for " + client.ip));
					}
					return(purge);
				}
				...
			}
			curl -X PURGE http://127.0.0.1/test2.html	-X makes curl send the request with the PURGE method
			
		Banning:
			(1) varnishadm:
				ban <field> <operator> <arg>

				Example:
					ban req.url ~ ^/javascripts
					
			(2) Define it in the configuration file with the ban() function;
			
			Example:
				if (req.method == "BAN") {
					ban("req.http.host == " + req.http.host + " && req.url == " + req.url);
					# Throw a synthetic page so the request won't go to the backend.
					return(synth(200, "Ban added"));
				}	
				
				ban req.http.host==www.ilinux.io && req.url==/test1.html
				Recommended: issue bans on demand, as they are needed;
				http://www.ilinux.io/test1.html 
                    
			
	How to use multiple backend hosts:
		backend default {
			.host = "172.16.100.6";
			.port = "80";
		}

		backend appsrv {
			.host = "172.16.100.7";
			.port = "80";
		}
		Separating dynamic and static content:
		sub vcl_recv {				
			if (req.url ~ "(?i)\.php$") {
				set req.backend_hint = appsrv;
			} else {
				set req.backend_hint = default;
			}	
			
			...
		}
		
	
		
	Director:
		varnish module; 
			Import it before use:
				import directors;
		
		Example:
			import directors;    # load the directors

			backend server1 {
				.host = 
				.port = 
			}
			backend server2 {
				.host = 
				.port = 
			}

			sub vcl_init {
				new GROUP_NAME = directors.round_robin();
				GROUP_NAME.add_backend(server1);
				GROUP_NAME.add_backend(server2);
			}

			sub vcl_recv {
				# send all traffic to the GROUP_NAME director:
				set req.backend_hint = GROUP_NAME.backend();
			}	# Note: round-robin is only visible for requests that are not served from the cache
			
		Cookie-based session stickiness:
			sub vcl_init {
				new h = directors.hash();
				h.add_backend(one, 1);   // backend 'one' with weight '1'
				h.add_backend(two, 1);   // backend 'two' with weight '1'
			}

			sub vcl_recv {
				// pick a backend based on the cookie header of the client
				set req.backend_hint = h.backend(req.http.cookie);
			}		# no cookie is set in this test, so this scheduling has no visible effect
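
The difference between the two scheduling styles can be sketched in Python. This is only a conceptual model: it is not how the directors VMOD computes its choice, weights are ignored, and the class names are invented.

```python
import hashlib
from itertools import cycle

class RoundRobin:
    """Hands out backends in turn, one per request."""
    def __init__(self, backends):
        self._next = cycle(backends)

    def backend(self):
        return next(self._next)

class HashDirector:
    """Maps the same key (e.g. the Cookie header) to the same backend every time."""
    def __init__(self, backends):
        self.backends = list(backends)

    def backend(self, key):
        digest = hashlib.sha256(key.encode()).digest()
        return self.backends[int.from_bytes(digest[:8], "big") % len(self.backends)]

rr = RoundRobin(["server1", "server2"])
print([rr.backend() for _ in range(4)])   # ['server1', 'server2', 'server1', 'server2']

h = HashDirector(["one", "two"])
print(h.backend("sid=abc") == h.backend("sid=abc"))   # True
```

With hashing, a client that always sends the same cookie always lands on the same backend, which is what makes the session sticky.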
		
	BE Health Check：
		backend BE_NAME {
			.host =  
			.port = 
			.probe = {
				.url= 
				.timeout= 
				.interval= 
				.window=
				.threshold=
			}
		}
		
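		.probe: defines the health check method;
			.url: URL to request during the check, default "/"; 
			.request: the exact request to send;
				.request = 
					"GET /.healthtest.html HTTP/1.1"
					"Host: www.magedu.com"
					"Connection: close"
			.window: how many of the most recent checks are considered when judging health; 
			.threshold: at least this many of the last .window checks must have succeeded;
			.interval: how often to check; 
			.timeout: timeout for each check;
			.expected_response: expected response status code, default 200;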
			
		Ways to configure health checks:
			(1) probe PB_NAME  { }
			     backend NAME {
				.probe = PB_NAME;
				...
			     }
			     
			(2) backend NAME  {
				.probe = {
					...
				}
			}

		Example:
			probe check {
				.url = "/.healthcheck.html";
				.window = 5;
				.threshold = 4;
				.interval = 2s;
				.timeout = 1s;
			}

			backend default {
				.host = "10.1.0.68";
				.port = "80";
				.probe = check;
			}

			backend appsrv {
				.host = "10.1.0.69";
				.port = "80";
				.probe = check;
			}
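
How `.window` and `.threshold` interact can be illustrated with a sliding window over the most recent probe results. This is a minimal sketch under simplifying assumptions (it ignores details of the real probe such as the initial state); `ProbeWindow` is a made-up name.

```python
from collections import deque

class ProbeWindow:
    def __init__(self, window=5, threshold=4):
        self.results = deque(maxlen=window)   # keeps only the last `window` results
        self.threshold = threshold

    def record(self, ok):
        self.results.append(bool(ok))

    def healthy(self):
        # healthy when at least `threshold` of the remembered probes succeeded
        return sum(self.results) >= self.threshold

p = ProbeWindow(window=5, threshold=4)
for ok in (True, True, True, True, False):
    p.record(ok)
print(p.healthy())   # True: 4 of the last 5 probes succeeded
p.record(False)
print(p.healthy())   # False: only 3 of the last 5 succeeded
```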
			
        Manually setting the state of a BE host:
            sick: administratively down; 
            healthy: administratively up;
            auto: follow the probe result;
	Example (in the management CLI):
		backend.set_health srv1 sick
		backend.set_health srv1 healthy
		backend.set_health srv1 auto

	Setting backend host attributes:
		backend BE_NAME {
			...
			.connect_timeout = 0.5s;
			.first_byte_timeout = 20s;
			.between_bytes_timeout = 5s;
			.max_connections = 50;
		}
			
			
	 varnish runtime parameters:
		Thread model:
			cache-worker
			cache-main
			ban lurker
			acceptor:
			epoll/kqueue:
			...
			
		Thread-related parameters: threads are managed with thread pools;
			Within a thread pool, each request is handled by one thread; the maximum number of worker threads determines how many requests varnish can serve concurrently;
			
			thread_pools: Number of worker thread pools. Best kept less than or equal to the number of CPU cores; 
			thread_pool_max: The maximum number of worker threads in each pool.
			thread_pool_min: The minimum number of worker threads in each pool. It also acts as the "maximum number of idle threads";
				param.show thread_pools	show the number of thread pools
				param.set thread_pools 4	set the number of thread pools
				Maximum concurrent connections = thread_pools * thread_pool_max
				
			thread_pool_timeout: Thread idle threshold.  Threads in excess of thread_pool_min, which have been idle for at least this long, will be destroyed.
			thread_pool_add_delay: Wait at least this long after creating a thread.
			thread_pool_destroy_delay: Wait this long after destroying a thread.
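
As a quick check of the concurrency formula above, the arithmetic looks like this; the pool counts and per-pool maximum are example values only.

```python
def max_concurrency(thread_pools, thread_pool_max):
    """Upper bound on concurrently served requests: pools * threads per pool."""
    return thread_pools * thread_pool_max

print(max_concurrency(2, 500))   # 1000
print(max_concurrency(4, 500))   # 2000
```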
			
		Timer-related parameters:
			send_timeout: Send timeout for client connections. If the HTTP response hasn't been transmitted in this many seconds the session is closed.
			timeout_idle: Idle timeout for client connections. 
			timeout_req: Max time to receive clients request headers, measured from first non-white-space character to double CRNL.
			cli_timeout: Timeout for the child's replies to CLI requests from the mgt_param.
			This is the timeout of the varnishadm management connection;
			How to set:
				vcl.param 
				param.set
			
			To make the settings permanent:
				varnish.params
					DAEMON_OPTS="-p PARAM1=VALUE -p PARAM2=VALUE"
					
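	varnish log area:
		shared memory log 
			counters
			log records
			
		1. varnishstat - Varnish Cache statistics
			-1	print the current counters once instead of monitoring continuously
			-1 -f FIELD_NAME 
			-l: list the field names that can be used with -f;
			
			MAIN.cache_hit 
			MAIN.cache_miss
			
			# varnishstat -1 -f MAIN.cache_hit -f MAIN.cache_miss
                show the current values of the given counters: number of hits and number of misses
			# varnishstat -l -f MAIN -f MEMPOOL
                list the meaning of each counter in the given sections;
			
		2. varnishtop - Varnish log entry ranking	sorted by rate by default
			-1     Instead of a continuously updated display, print the statistics once and exit. Shows the log entries collected so far in one go;
			-i taglist: may be given several times, and one option may carry several tags; e.g. varnishtop -i ReqURL
			-I <[taglist:]regex>: filter the values of the given tags by regex; e.g. varnishtop -I login
			-x taglist: exclusion list; e.g. varnishtop -x ReqURL, varnishtop -I HTTP/1
			-X <[taglist:]regex>: filter the values of the given tags by regex and exclude the matches;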

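		3. varnishlog - Display Varnish logs	shows log entries in real time; existing entries are not read
			
		4. varnishncsa - Display Varnish logs in Apache / NCSA combined log format

		5. systemctl restart varnishncsa	runs the tool as a daemon that continuously writes the in-memory log to a disk file; what actually runs is the /usr/lib/systemd/system/varnishncsa.service unit;

	Built-in functions:
		hash_data(): specifies the data fed into the hash; reducing variation raises the hit ratio;
		regsub(str,regex,sub): replaces the first match of regex in str with sub; mainly used for URL rewriting
		regsuball(str,regex,sub): replaces every match of regex in str with sub;
		return():
		ban(expression)
		ban_url(regex): bans all cached objects whose URL matches regex;
		synth(status,"STRING"): generates a response message;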
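
The difference between replacing the first match and replacing every match is easy to see with Python's `re` module. The two wrappers below merely mimic the VCL functions of the same name for illustration; VCL regexes are PCRE, so behaviour can differ in corner cases.

```python
import re

def regsub(s, regex, sub):
    """Replace only the first match of regex in s."""
    return re.sub(regex, sub, s, count=1)

def regsuball(s, regex, sub):
    """Replace every match of regex in s."""
    return re.sub(regex, sub, s)

print(regsub("/a/b/a", "a", "x"))                 # /x/b/a
print(regsuball("/a/b/a", "a", "x"))              # /x/b/x
print(regsub("/img/cat.jpg?v=1", r"\?.*$", ""))   # /img/cat.jpg
```

Stripping the query string as in the last line is a typical URL rewrite: fewer URL variants for the same resource means a higher hit ratio.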

Summary:
	varnish： state engine, vcl 
		varnish 4.0：
			vcl_init 
			vcl_recv
			vcl_hash 
			vcl_hit 
			vcl_pass
			vcl_miss 
			vcl_pipe 
			vcl_waiting
			vcl_purge 
			vcl_deliver
			vcl_synth
			vcl_fini
			
			vcl_backend_fetch
			vcl_backend_response
			vcl_backend_error 
			
		sub VCL_STATE_ENGINE {
			...
		}
		backend BE_NAME {} 
		probe PB_NAME {}
		acl ACL_NAME {}
		
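Blog assignment: everything above; 
	Hands-on project: deploy wordpress on two lamp hosts behind an Nginx reverse proxy and load-test it; then put a varnish cache behind nginx, tune the vcl, and load-test several times;

	ab, http_load, webbench, siege, jmeter, loadrunner,...

Further reading: varnish book http://book.varnish-software.com/4.0/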

Example:

import directors;

backend imgsrv1 {
	.host = "192.168.10.11";
	.port = "80";
}

backend imgsrv2 {
	.host = "192.168.10.12";
	.port = "80";
}	

backend appsrv1 {
	.host = "192.168.10.21";
	.port = "80";
}

backend appsrv2 {
	.host = "192.168.10.22";
	.port = "80";
}

sub vcl_init {
	new imgsrvs = directors.random();
	imgsrvs.add_backend(imgsrv1,10);
	imgsrvs.add_backend(imgsrv2,20);
	
	new staticsrvs = directors.round_robin();
	staticsrvs.add_backend(appsrv1);
	staticsrvs.add_backend(appsrv2);
	
	new appsrvs = directors.hash();
	appsrvs.add_backend(appsrv1,1);
	appsrvs.add_backend(appsrv2,1);		
}

sub vcl_recv {
	if (req.url ~ "(?i)\.(css|js)$") {
		set req.backend_hint = staticsrvs.backend();
	} elseif (req.url ~ "(?i)\.(jpg|jpeg|png|gif)$") {
		set req.backend_hint = imgsrvs.backend();
	} else {		
		set req.backend_hint = appsrvs.backend(req.http.cookie);
	}
}
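Notes:

1. The nginx and haproxy schedulers work at layer 7; their scheduling is tied to the (early) http protocol;
2. User requests and concurrency are not one-to-one: a single client can open several threads at once and so create several concurrent connections.
3. IaaS: infrastructure as a service, e.g. renting a virtual machine for an hour to test software
4. PaaS: platform as a service; not only the virtual machine but also the software running in it
5. SaaS: software as a service; the application is opened in a browser and used as cloud software, e.g. Microsoft Word
6. Browser private cache, public cache
7. By the 80/20 rule of caching, 20% of the cached content serves 80% of the demand. With a static cache, when the cache machine fails the cache is gone and the backend machines collapse almost at once; when the service is started again and passes its health check, the flood of requests returns and brings the servers down again. Solving this requires rate limiting, taking the upper layer out of rotation during startup, or pre-warming the cache; a dynamic cache is therefore preferable;
8. kv data: the data structure is cached in memory, and a single lookup finds the cached data
9. A CDN (content delivery network) is a collection of cache servers set up in each region (placed at the user's "doorstep"). Small companies can buy it as a service; buy it twice over in case of failure, but it is billed by traffic and is not cheap;
10. GSLB: global server load balancing, i.e. global scheduling
11. The fastcgi protocol cannot be scheduled directly by haproxy, because the client would have to send fastcgi requests and it does not; normally requests are scheduled to nginx, which passes them on to the fastcgi PHP program. shift+F5 forces a refresh;
12. Caches are limited in both space and time.
13. ttl: time to live. Cached content is controlled by a ttl. If the backend content is updated while the ttl is still valid, the cached copy is effectively out of date, and the cache administrator has to purge it so that users do not keep getting old data. Cache control in http 1.0 was too crude; http 1.1 improved it with conditional requests: when a request reaches the cache server and nothing is cached, it is sent to the backend server; if a cached copy exists, the cache asks the backend whether the content has changed since it was cached. The weakness is that timestamps cannot capture very frequent updates, e.g. at millisecond granularity; for that a version tag is assigned at caching time (If-None-Match), and when the data is updated the backend's version changes, which solves the problem of relying on timestamps. Validating on every request lowers efficiency, however, so the two are combined: while the cached copy has not expired it is served directly without contacting the backend; once it has expired the cache first asks the backend whether the data has changed, keeps serving the cached copy if not, and fetches the new content from the backend if so. The cache administrator can also purge the cache or ban (remove by regex match)
14. Varnish Architecture
	Notes: each process has its own role. There are 2 interfaces, one for management and one serving the application, on ports 6082 and 6081 respectively. It is a recursive cache. The configuration takes effect as C code; anything not written in C is written in VCL and translated. To avoid the performance cost of writing logs to disk, logs first go into a fixed region of memory; when the region is full it is overwritten, and logs that are needed must be exported to disk manually. The log has two parts, statistics and log records; the statistics part is not overwritten. The CLI interface is varnish's management interface. VAC is a paid graphical management interface. Look-aside cache: the client checks the cache itself, and on a miss the cache server does not proxy the request to the backend; the client itself goes to the backend server. VCL is loaded dynamically.
15. Cache storage:
	Memory storage: lost as soon as the machine restarts
	Disk storage: a binary file is placed on disk with its own internal file system, invisible from outside; the cache is lost on restart. Changing parameters at runtime is encouraged
	Persistent disk storage: the cache survives a restart, but varnish 4.0 cannot use this feature
16. The configuration file is passed in through command-line options; the cache parameters cannot go in the configuration file, so they are supplied through the environment file; see cat /usr/lib/systemd/system/varnish.service;
17. If a request from the client cannot be understood at the application layer, varnish falls back to layer-4 forwarding to the backend server without parsing the application-layer headers; this is called pipe. Traffic from the varnish cache to the server passes through a state engine;
18. VCL is a domain-specific configuration; a section of configuration applies to only one engine. When a user request arrives and is to be served from the cache, varnish checks whether the hash of the uri is present.

19. Variable availability in VCL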