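I have a MacBook with an M1 chip. Most applications are not compatible with it, and VMware Fusion does not support Linux virtual machines on it. Parallels reportedly supports the ARM builds of Windows and Linux, but it does not seem to work well either. About the only thing the machine is still good for is machine learning: TensorFlow 2.5 now supports the M1 natively and is considerably faster than 2.4, but it requires macOS 12, which is still in beta. This post records how I set up a TensorFlow environment on the M1 and ran a few simple tests; judging by the results, the performance gain is quite noticeable.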

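Upgrade to macOS 12

The TensorFlow build that Apple developed specifically for the M1 is no longer needed, because TensorFlow 2.5 supports the M1 natively. The first step is therefore to upgrade to macOS 12; the tutorial below covers how.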

https://zhuanlan.zhihu.com/p/378946858

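Set up the Conda environment

Anaconda does not yet support the M1 processor, and the Python it bundles is 3.8, which does not run natively on ARM. Use the open-source Miniforge instead; it ships with Python 3.9.

The following is excerpted from the Miniforge GitHub home page.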

Miniforge3

Latest installers with Python 3.9 (*) in the base environment:

OS       Architecture                    Download
Linux    x86_64 (amd64)                  Miniforge3-Linux-x86_64
Linux    aarch64 (arm64) (**)            Miniforge3-Linux-aarch64
Linux    ppc64le (POWER8/9)              Miniforge3-Linux-ppc64le
OS X     x86_64                          Miniforge3-MacOSX-x86_64
OS X     arm64 (Apple Silicon) (***)     Miniforge3-MacOSX-arm64
Windows  x86_64                          Miniforge3-Windows-x86_64

(*) The Python version is specific only to the base environment. Conda can create new environments with different Python versions and implementations.

(**) While the Raspberry PI includes a 64 bit processor, the RasbianOS is built on a 32 bit kernel and is not a supported configuration for these installers. We recommend using a 64 bit linux distribution such as Ubuntu for Raspberry PI.

(***) Apple silicon builds are experimental and haven't had testing like the other platforms.

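Although conda's support for the M1 is still experimental, Python 3.9 itself runs natively on the M1, and conda is only used here to manage Python packages.

During installation, possibly because Anaconda had been installed earlier, conda kept getting killed by zsh. I tried many things, including installing the full Xcode, and none of them helped; switching to a different install path finally fixed it. In theory Xcode is not required, and installing Miniforge directly should be enough.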

https://github.com/conda-forge/miniforge/issues/190

Installation is simple: download the installer and run it directly.

 ./Miniforge3-MacOSX-arm64.sh 

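Answer yes or accept the default at every prompt. After installation, restart the terminal and check that conda and python run; in my case python reports version 3.9.6.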

(base)  ~ % python
Python 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:35:11) 
[Clang 11.1.0 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 
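To confirm that the interpreter is running natively rather than under Rosetta, a quick check of the reported machine type can help. The following is a minimal sketch; the helper name is_native_arm64 is made up for illustration, and it assumes that a native build reports "arm64" while a translated x86 build reports "x86_64".

```python
import platform
import sys


def is_native_arm64(machine=None):
    """Return True if the given (or current) machine string denotes 64-bit ARM."""
    if machine is None:
        machine = platform.machine()
    # macOS reports "arm64"; Linux on the same architecture reports "aarch64".
    return machine.lower() in ("arm64", "aarch64")


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "on", platform.machine())
    print("Native arm64 build:", is_native_arm64())
```

If the second line prints False on an M1 machine, the interpreter most likely comes from an x86_64 installer.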

Switch to mirrors hosted in China: open or create ~/.condarc and add the following:

channels:
  - https://mirrors.ustc.edu.cn/anaconda/pkgs/main/
  - https://mirrors.ustc.edu.cn/anaconda/cloud/conda-forge/
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
  - defaults
show_channel_urls: true
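As a quick sanity check on the file, the mirror hosts can be listed from Python. This is only a sketch: channel_hosts is a hypothetical helper, and it uses a naive line-based parser that handles a simple layout like the one above, not full YAML.

```python
from pathlib import Path
from urllib.parse import urlparse


def channel_hosts(condarc_text):
    """Collect host names of URL entries under 'channels:' (naive parser, not full YAML)."""
    hosts = []
    in_channels = False
    for raw in condarc_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if not raw.startswith((" ", "-")):
            # A top-level key either opens or closes the channels list.
            in_channels = line == "channels:"
            continue
        if in_channels and line.startswith("- "):
            entry = line[2:].strip()
            if "://" in entry:  # skip named channels such as "defaults"
                hosts.append(urlparse(entry).netloc)
    return hosts


if __name__ == "__main__":
    path = Path.home() / ".condarc"
    if path.exists():
        print(channel_hosts(path.read_text()))
```

Named channels such as defaults are skipped because they carry no URL of their own.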

Install a package to check whether the mirror is being used. As the output shows, it is.

(base) niuxinli@niuxinlideMacBook-Pro ~ % conda install pandas
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /Users/niuxinli/miniforge3

  added / updated specs:
    - pandas

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    bottleneck-1.3.2           |   py39heec5a64_1          96 KB  https://mirrors.ustc.edu.cn/anaconda/pkgs/main
    ca-certificates-2021.7.5   |       hca03da5_1         113 KB  

Install PyCharm

PyCharm supports the M1 processor; downloading the Community edition is enough.


Create a conda environment for PyCharm (it is named pycharm in the commands below).


Install TensorFlow

Install the dependencies

conda activate pycharm
conda install -c apple tensorflow-deps

Install tensorflow with pip

The default pip index is too slow, so use the Aliyun mirror for this command.

python -m pip install tensorflow-macos -i https://mirrors.aliyun.com/pypi/simple/ 

Install the Metal plugin

python -m pip install tensorflow-metal -i https://mirrors.aliyun.com/pypi/simple/ 

Install some other dependencies

brew install libjpeg
pip install tensorflow-datasets -i https://mirrors.aliyun.com/pypi/simple/ 
conda install -y pandas matplotlib scikit-learn jupyterlab  

After installation, import numpy fails with this error:

Original error was: dlopen(/Users/niuxinli/miniforge3/envs/pycharm/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-darwin.so, 0x0002): Library not loaded: @rpath/libcblas.3.dylib

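After some searching, I tried installing opencv on the off chance that it would help. It did fix the import error, although pip then reported that tensorflow 2.5 is incompatible with numpy 1.21.2. I ignored that for now.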

pip install opencv-python -i https://mirrors.aliyun.com/pypi/simple/

This is the error reported during that installation:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-macos 2.5.0 requires numpy~=1.19.2, but you have numpy 1.21.2 which is incompatible.
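The conflict comes from the ~= ("compatible release") operator: numpy~=1.19.2 accepts any 1.19.x release from 1.19.2 upward and nothing else. The sketch below illustrates that rule for plain dotted version numbers only; parse_version and satisfies_compatible_release are hypothetical helpers written for this illustration, not part of pip.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Split a plain dotted version such as '1.19.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_compatible_release(installed, minimum):
    """Check `installed ~= minimum`: at least `minimum`, same leading components."""
    have, want = parse_version(installed), parse_version(minimum)
    prefix = want[:-1]  # everything except the last component must match
    return have >= want and have[:len(prefix)] == prefix


if __name__ == "__main__":
    try:
        installed = version("numpy")
        # Assumes a plain numeric version string; pre-release tags are not handled.
        print(installed, satisfies_compatible_release(installed, "1.19.2"))
    except PackageNotFoundError:
        print("numpy is not installed")
```

With numpy 1.21.2 installed the check fails, which matches the message pip printed.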

Judging from the runs below, this error did not stop TensorFlow from working.
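Before running a benchmark, it may be worth confirming that TensorFlow imports and that the Metal plugin exposes a GPU. This is a minimal sketch; visible_gpus is a hypothetical helper, and it assumes the plugin registers the GPU as an ordinary physical device of type "GPU".

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if it cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow could not be imported in this environment")
    elif gpus:
        print("GPU devices:", gpus)
    else:
        print("No GPU visible; training would run on the CPU")
```

An empty list would suggest that tensorflow-metal is missing or was installed into a different environment.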

Test TensorFlow

To compare TensorFlow performance on the M1, I used the benchmark results and code that another blogger published online:

https://zhuanlan.zhihu.com/p/350955566

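That author installed TensorFlow on macOS 11, which in theory performs worse than the setup described above. I made a few small compatibility adjustments to the code and left everything else unchanged.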

import tensorflow as tf
import tensorflow_datasets as tfds
import time
from datetime import timedelta
# Run in graph mode rather than eager mode
tf.compat.v1.disable_eager_execution()

(ds_train, ds_test), ds_info = tfds.load(
    'mnist',
    split=['train', 'test'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True,
)

def normalize_img(image, label):
    return tf.cast(image, tf.float32) / 255., label

batch_size = 128

ds_train = ds_train.map(
    normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
ds_train = ds_train.cache()
ds_train = ds_train.shuffle(ds_info.splits['train'].num_examples)
ds_train = ds_train.batch(batch_size)
ds_train = ds_train.prefetch(tf.data.experimental.AUTOTUNE)

ds_test = ds_test.map(
    normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
ds_test = ds_test.batch(batch_size)
ds_test = ds_test.cache()
ds_test = ds_test.prefetch(tf.data.experimental.AUTOTUNE)

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu'),
    tf.keras.layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=tf.keras.optimizers.Adam(0.001),
    metrics=['accuracy'],
)

start = time.time()

model.fit(
    ds_train,
    epochs=10,
    # validation_steps=1,
    # steps_per_epoch=469,
    # validation_data=ds_test  # adding this line, as in the original script, makes the script fail; no fix found yet
)

delta = (time.time() - start)
elapsed = str(timedelta(seconds=delta))
print('Elapsed Time: {}'.format(elapsed))

While the script runs, GPU utilization stays close to 100%.


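The run time was almost always about 1 minute 32 seconds, less than half of that blogger's 3 minutes 20 seconds (roughly 2.2 times as fast) and close to what a Colab GPU achieves.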


So with macOS 12 and TensorFlow 2.5 on the M1, performance is close to double what the earlier setup achieved.
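The arithmetic behind that comparison can be reproduced in a few lines, using only the two times quoted in this post. The variable names are chosen for illustration.

```python
from datetime import timedelta

baseline = timedelta(minutes=3, seconds=20)  # time reported in the referenced post
measured = timedelta(minutes=1, seconds=32)  # time measured in this post

speedup = baseline / measured        # dividing two timedeltas yields a float
time_saved = 1 - measured / baseline

print(f"Speedup: {speedup:.2f}x")        # Speedup: 2.17x
print(f"Time saved: {time_saved:.0%}")   # Time saved: 54%
```

Both figures compare a single pair of timings taken on different machines, so they are a rough indication rather than a controlled benchmark.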