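Preface:

Have you ever run into this when installing a module with pip:

Right, that is the flaky timeout.

Yesterday I watched a friend slowly install tornado on all of his nodes. He was using the official index, which stalls or slows down from time to time. So I recommended that he use a mirror inside China, or connect to the mirror we built ourselves.

So let me talk a bit about how to build a fast environment for installing Python modules ~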


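In fact, many large companies in China run their own PyPI mirrors, but most of them are only reachable on private networks.

The nodes below are all fairly fast within China.


How to specify a PyPI index: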


pip install tornado -i http://pypi.sdutlinux.org/simple
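
Since any single mirror can be down, one option is to try several indexes in order. Below is a minimal sketch of that idea, not a tested deployment tool. The function names and the injectable `run` parameter are assumptions made for illustration; the only pip feature used is the `-i` option shown above.

```python
import subprocess
import sys


def pip_install_command(package, index_url):
    """Build the argument list for `pip install <package> -i <index_url>`."""
    # Running pip through the current interpreter targets the same environment.
    return [sys.executable, "-m", "pip", "install", package, "-i", index_url]


def install_with_fallback(package, index_urls, run=subprocess.call):
    """Try each index in order; return the first URL that worked, else None.

    `run` is a hypothetical hook (defaults to subprocess.call) so the logic
    can be exercised without touching the network.
    """
    for url in index_urls:
        # pip exits with status 0 on success.
        if run(pip_install_command(package, url)) == 0:
            return url
    return None
```

Calling `install_with_fallback("tornado", [...])` with a list of mirror URLs would stop at the first mirror for which pip exits successfully.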




It can also be set globally.


On Unix and macOS, the config file is: $HOME/.pip/pip.conf

On Windows, the config file is: %HOME%\pip\pip.ini


Add the following to the config file:


[global]
index-url=http://mirrors.tuna.tsinghua.edu.cn/pypi/simple
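
When many machines need the same setting, the file can be generated instead of edited by hand. The following is a rough sketch using the standard library's configparser. The function names are hypothetical, and the default path assumes the Unix location mentioned above.

```python
import configparser
import io
import os


def render_pip_conf(index_url):
    """Return the text of a pip config with a [global] index-url entry."""
    parser = configparser.ConfigParser()
    parser["global"] = {"index-url": index_url}
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


def write_pip_conf(index_url, path=None):
    """Write the config to `path` (assumed default: ~/.pip/pip.conf)."""
    if path is None:
        path = os.path.join(os.path.expanduser("~"), ".pip", "pip.conf")
    directory = os.path.dirname(path)
    # Create the .pip directory on first use.
    if directory and not os.path.isdir(directory):
        os.makedirs(directory)
    with open(path, "w") as fh:
        fh.write(render_pip_conf(index_url))
    return path
```

Note that `write_pip_conf` overwrites an existing file, so it only suits machines where pip has no other settings.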



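I used to use the Douban mirror, but it seems to have problems now... what a pity.

Output (note that the host is spelled pipy.douban.com here; "Name or service not known" is a DNS lookup failure, so a misspelling of pypi.douban.com may be the real cause):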


Downloading/unpacking tornado
  Getting page http://pipy.douban.com/simple/tornado/
  Could not fetch URL http://pipy.douban.com/simple/tornado/: <urlopen error [Errno -2] Name or service not known>
  Will skip URL http://pipy.douban.com/simple/tornado/ when looking for download links for tornado
  Getting page http://pipy.douban.com/simple/
  Could not fetch URL http://pipy.douban.com/simple/: <urlopen error [Errno -2] Name or service not known>
  Will skip URL http://pipy.douban.com/simple/ when looking for download links for tornado
  Cannot fetch index base URL http://pipy.douban.com/simple/
  URLs to search for versions for tornado:
  * http://pipy.douban.com/simple/tornado/
  Getting page http://pipy.douban.com/simple/tornado/
  Could not fetch URL http://pipy.douban.com/simple/tornado/: <urlopen error [Errno -2] Name or service not known>
  Will skip URL http://pipy.douban.com/simple/tornado/ when looking for download links for tornado
  Could not find any downloads that satisfy the requirement tornado
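
The "Name or service not known" message means the hostname could not be resolved. A quick check like the one below can tell a DNS problem (or a typo in the URL) apart from a slow mirror. This is only a sketch; the function name and the injectable `resolve` parameter are assumptions for illustration.

```python
import socket
from urllib.parse import urlparse


def index_host_resolves(index_url, resolve=socket.gethostbyname):
    """Return True if the hostname in index_url resolves, else False."""
    host = urlparse(index_url).hostname
    if host is None:
        # No hostname could be parsed out of the URL at all.
        return False
    try:
        resolve(host)
    except socket.gaierror:
        # Raised by socket.gethostbyname when the name lookup fails.
        return False
    return True
```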



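For more mirrors, take a look at www.pypi-mirrors.org



There you can see the status and speed of many nodes.


Our monitoring system is written in Python, and so is the client, so we often need large-scale deployments. At first we used a reverse proxy as a cache, and it worked quite well.

When we batch-deploy the monitoring client with puppet, we have pip point at the address of our reverse-proxy cache of PyPI and its package repository.


For example: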


pip install <package> -i https://10.2.20.66/qinghua


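This means the packages are fetched from Tsinghua. I assume everyone is familiar with caching proxies; just add a location with proxy_pass and you are done ~


An example:

(I won't explain the details here; you can read the articles I wrote on nginx reverse proxying...)


Install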


ulimit -SHn 65535
yum install pcre pcre-devel -y  # install pcre

wget http://labs.frickle.com/files/ngx_cache_purge-1.4.tar.gz
tar zxvf ngx_cache_purge-1.4.tar.gz
wget http://nginx.org/download/nginx-1.0.11.tar.gz
tar zxvf nginx-1.0.11.tar.gz
cd nginx-1.0.11/
./configure --user=www --group=www --add-module=../ngx_cache_purge-1.4 --prefix=/usr/local/nginx --with-http_stub_status_module --with-http_ssl_module
make && make install
cd ../


Key configuration


# cached files live under /var/lib/nginx/cache/; the key zone is named "cache"
proxy_cache_path  /var/lib/nginx/cache/ levels=1:1:2 inactive=24000h keys_zone=cache:100m;

server {
        listen   8000 default;
        server_name  localhost;

        access_log  /var/log/nginx/localhost.access.log;

        # part of the default configuration is omitted here

        location /guanfang {
                proxy_pass http://pypi.python.org/simple;
                proxy_cache cache;
                proxy_cache_valid  any 2400h;
        }
        location /douban {
                proxy_pass http://pypi.douban.com/simple;
                # proxy_cache may appear only once per location and must name the zone defined above
                proxy_cache cache;
                proxy_cache_valid  any 24000h;
                proxy_redirect off;
                proxy_set_header Host $host;
                proxy_cache_valid 200 302 1h;
                proxy_cache_valid 301 1d;
                expires 30d;
        }
        location /taiwan {
                proxy_pass http://mirrors.tuna.tsinghua.edu.cn/pypi/simple;
                proxy_cache cache;
                proxy_cache_valid  any 24000h;
                proxy_redirect off;
                proxy_set_header Host $host;
                proxy_cache_valid 200 302 1h;
                proxy_cache_valid 301 1d;
                expires 30d;
        }
        location /jiaoyu {
                proxy_pass http://mirrors.tuna.tsinghua.edu.cn/pypi/simple;
                proxy_cache cache;
                proxy_cache_valid  any 24000h;
                proxy_redirect off;
                proxy_set_header Host $host;
                proxy_cache_valid 200 302 1h;
                proxy_cache_valid 301 1d;
                expires 30d;
        }

}
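
To see which upstream URL a request ends up at, here is a small sketch of how nginx treats a proxy_pass that carries a URI: the part of the request path that matched the location prefix is replaced by that URI. The table copies two locations from the config; the function name is made up for illustration, and real nginx location matching has more rules than this prefix check.

```python
# Location prefix -> proxy_pass target, copied from the nginx config.
UPSTREAMS = {
    "/guanfang": "http://pypi.python.org/simple",
    "/jiaoyu": "http://mirrors.tuna.tsinghua.edu.cn/pypi/simple",
}


def upstream_url(request_path, upstreams=UPSTREAMS):
    """Return the upstream URL for a request path, or None if nothing matches."""
    # Check longer prefixes first so the most specific location wins.
    for prefix in sorted(upstreams, key=len, reverse=True):
        if request_path.startswith(prefix):
            # Replace the matched prefix with the proxy_pass URI.
            return upstreams[prefix] + request_path[len(prefix):]
    return None
```

So a client running pip with `-i http://<proxy>:8000/guanfang` and asking for `/guanfang/tornado/` would be served from `http://pypi.python.org/simple/tornado/`.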


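So far we have covered using PyPI nodes inside and outside China to deploy your environment quickly. Now for how we deploy in production:

Personally I think this approach fits when you need large-scale deployment, have customized modules, and some nodes cannot reach the Internet.

If your environment is small, I recommend the -i approach. After all, setting up a private PyPI service is a lot of hassle ~


Install pypimirror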

pip install z3c.pypimirror


If it fails to install, you can use the buildout-based method instead:



$ git clone https://github.com/macagua/macagua.buildout.pypimirror.git
$ cd macagua.buildout.pypimirror
$ virtualenv .
$ source ./bin/activate
$ python bootstrap.py
$ ./bin/buildout -vvvN
$ deactivate




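Once it is installed, fill in the config file. The main parameters are:

mirror_file_path: the path where downloaded packages are stored

base_url: the server address; note that it must match the address served by Apache!

create_indexes: boolean; creates an index in the directory of each downloaded package

package_matches: user-defined. PyPI hosts a huge number of packages, so use patterns (wildcards such as zope.* in the sample config) to download selectively, otherwise your disk will fill up

lock_file_name: where the runtime lock file is stored

log_filename: where the log file is stored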
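
To get a feel for what a package_matches list would select before starting a long sync, the patterns can be tried out locally. The sketch below assumes shell-style wildcard matching via Python's fnmatch module; whether pypimirror matches names in exactly this way is an assumption worth verifying, and the function names are made up.

```python
from fnmatch import fnmatch


def wanted(package_name, patterns):
    """True if the package name matches at least one wildcard pattern."""
    return any(fnmatch(package_name, pattern) for pattern in patterns)


def select_packages(names, patterns):
    """Keep only the package names that match the patterns, in order."""
    return [name for name in names if wanted(name, patterns)]
```

Under fnmatch rules, a pattern like `*.*` only matches names that contain a dot, so a name such as `tornado` would need `*` or `tornado` to be selected.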


Generate the initial index

pypimirror -c -v -I pypimirror.cfg


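This step is very slow; even with a mirror inside China it will drive you up the wall ~


I ran the test in our company's production data center and the sync still ran into problems. Want to see what that looks like?


You can also modify the config file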


[DEFAULT]
# the root folder of all mirrored packages.
# if necessary it will be created for you
mirror_file_path = /home/pypimirror/paquetes
# where's your mirror on the net?
base_url = http://pypi.python.jp/
# lock file to avoid duplicate runs of the mirror script
lock_file_name = /home/pypimirror/pypi-poll-access.lock
# Pattern for package files, only those matching will be mirrored
filename_matches =
    *.zip
    *.tgz
    *.egg
    *.tar.gz
    *.tar.bz2
# Pattern for package names; only packages having matching names will
# be mirrored
package_matches =
#   zope.*
#   plone.*
#   Products.*
#   collective.*
   *.*
# remove packages not on pypi (or externals) anymore
cleanup = True
# create index.html files
create_indexes = True
# be more verbose
verbose = True
# resolve download_url links on pypi which point to files and download
# the files from there (if they match filename_matches).
# The filename and filesize (from the download header) are used
# to find out if the file is already on the mirror. Not all servers
# support the content-length header, so be prepared to download
# a lot of data on each mirror update.
# This is highly experimental and shouldn't be used right now.
#
# NOTE: This option should only be set to True if package_matches is not
# set to '*' - otherwise you will mirror a huge amount of data. BE CAREFUL
# using this option!!!
external_links = False
# similar to 'external_links' but also follows an index page if no
# download links are available on the referenced download_url page
# of a given package.
#
# NOTE: This option should only be set to True if package_matches is not
# set to '*' - otherwise you will mirror a huge amount of data. BE CAREFUL
# using this option!!!
follow_external_index_pages = False
# logfile
log_filename = /home/pypimirror/pypimirror.log



Update

pypimirror -c -v -i -U pypimirror.cfg


Parameters:

-i creates the index

-U updates the mirror.



Suppose the download path configured earlier is /data/pypi/files; link it into the /var/www/ directory:

ln -s /data/pypi/files /var/www/pypi

Restart Apache, and http://xiaorui.cc/pypi/ should be reachable. This is the simplest configuration; in this case base_url is http://xiaorui.cc/pypi/ .


Have it update on a schedule
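
A cron entry is probably the usual way to schedule this. For those who prefer to keep it in Python, here is a rough sketch that reruns the update command from the Update section at a fixed interval. The function names and the `run` and `sleep` hooks are assumptions for illustration only.

```python
import subprocess
import time


def update_command(config_path):
    """Argument list for the update command used earlier in this post."""
    return ["pypimirror", "-c", "-v", "-i", "-U", config_path]


def run_periodically(config_path, interval_seconds, rounds,
                     run=subprocess.call, sleep=time.sleep):
    """Run the update `rounds` times, pausing between runs; return exit codes."""
    codes = []
    for i in range(rounds):
        codes.append(run(update_command(config_path)))
        # No need to wait after the final run.
        if i < rounds - 1:
            sleep(interval_seconds)
    return codes
```

The lock_file_name setting in the config is meant to avoid duplicate runs, which matters if one sync takes longer than the interval.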


Problems encountered while installing pypimirror:


Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/bottle.py", line 764, in _handle
    return route.call(**args)
  File "/usr/local/lib/python2.6/dist-packages/bottle.py", line 1575, in wrapper
    rv = callback(*a, **ka)
  File "hydenMain.py", line 39, in static
    return static_file(filename, root='{}/static'.format('.'))
ValueError: zero length field name in format


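Upgrading Python to 2.7 fixes this: the error comes from '{}'.format(...), and auto-numbered '{}' fields are only supported from 2.7 on, while 2.6 requires explicit indices such as '{0}'.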
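
If upgrading is not possible, the failing line can instead be written with an explicit field index, which Python 2.6 accepts. A tiny sketch (the function name is invented for illustration):

```python
def static_root(base):
    """Build the static directory path with an explicitly numbered field."""
    # "{0}" names the first argument explicitly, so this also works on 2.6.
    return "{0}/static".format(base)
```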


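Note that installing a local PyPI brings up all sorts of problems; please experiment and search for the cause. We can collect these problems so that others who hit them can locate and solve them more easily.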