Install Python 2.7
Dependencies
- lxml, an efficient XML and HTML parser
- parsel, an HTML/XML data extraction library written on top of lxml
- w3lib, a multi-purpose helper for dealing with URLs and web page encodings
- twisted, an asynchronous networking framework
- cryptography and pyOpenSSL, to deal with various network-level security needs
Install lxml
pip install lxml
If it fails with:
AttributeError: 'module' object has no attribute 'HTTPSConnection'
OpenSSL needs to be installed before Python is installed, so that Python is built with SSL support. libxml2, libxslt, and libffi are also required:
yum install -y libxml2 libxml2-devel libxslt libxslt-devel libffi-devel python-devel openssl-devel
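Once Python has been reinstalled against OpenSSL, a quick check along the following lines can confirm that the HTTPSConnection error is gone. This is a minimal sketch, not part of the original notes; the helper name has_https_support is hypothetical, and the code is written to run on both Python 2.7 and Python 3.

```python
# Works on both Python 2.7 and Python 3.
try:
    import httplib as http_client  # Python 2 module name
except ImportError:
    import http.client as http_client  # Python 3 module name


def has_https_support():
    """Return True if this interpreter can import ssl and exposes HTTPSConnection.

    has_https_support is a hypothetical helper name used for illustration.
    """
    try:
        import ssl  # noqa: F401  (import fails when Python lacks SSL support)
    except ImportError:
        return False
    return hasattr(http_client, "HTTPSConnection")


if __name__ == "__main__":
    print("HTTPS support: %s" % has_https_support())
```

If this prints False, Python is probably still running without SSL support and pip is likely to hit the same error again.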
pip install cryptography
pip install pyopenssl
pip install parsel
pip install twisted
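After the pip commands finish, it can be useful to confirm that each dependency listed above actually imports. The sketch below is one way to do that. The function name find_missing and the DEPENDENCIES list are illustrative assumptions; note that pyOpenSSL is imported under the module name OpenSSL.

```python
import importlib

# Import names of the dependencies; pyOpenSSL is imported as "OpenSSL".
DEPENDENCIES = ["lxml", "parsel", "w3lib", "twisted", "cryptography", "OpenSSL"]


def find_missing(module_names):
    """Return the names from module_names that cannot be imported.

    find_missing is a hypothetical helper name used for illustration.
    """
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(DEPENDENCIES)
    if missing:
        print("Missing: %s" % ", ".join(missing))
    else:
        print("All dependencies import cleanly.")
```

Any name reported as missing can be installed again with pip before moving on to Scrapy itself.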
Install Scrapy
pip install Scrapy
Run the scrapy command to verify the installation.
If it fails with:
Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/local/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/pkg_resources.py", line 2607, in <module>
  File "/usr/local/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/pkg_resources.py", line 565, in resolve
pkg_resources.DistributionNotFound: setuptools>=1.0
Fix:
pip install --upgrade scrapy
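The traceback shows pkg_resources being loaded from a setuptools 0.6c11 egg while a requirement of setuptools>=1.0 cannot be satisfied. To see which setuptools version the interpreter actually picks up, a small check like the one below may help. It is a sketch only; installed_setuptools_version is a hypothetical helper name, and it returns None when setuptools is absent or does not expose a version string.

```python
def installed_setuptools_version():
    """Return the setuptools version string seen by this interpreter, or None.

    installed_setuptools_version is a hypothetical helper name used for
    illustration.
    """
    try:
        import setuptools
    except ImportError:
        return None
    # getattr guards against builds that do not define __version__.
    return getattr(setuptools, "__version__", None)


if __name__ == "__main__":
    print("setuptools version: %s" % installed_setuptools_version())
```

If the reported version is still older than 1.0 after the upgrade, running pip install --upgrade setuptools may also be worth trying; that step is an assumption and was not tested in these notes.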