Mac Docker sharing

Docker just released a native macOS runtime environment that makes it easy to run containers on Macs. It fixed many issues, but the bitter truth is that it missed something important: read and write performance on mounted volumes is terrible.

(Benchmarks)

We can spin up a container and write to a mounted volume with the following command, which does three things:

  1. Start a container
  2. Mount the current directory
  3. Write about 100 MB of zeros to a file in this directory

docker run --rm -it -v "$(pwd):/pwd" -w /pwd alpine time dd if=/dev/zero of=speedtest bs=1024 count=100000

Let's compare the results on Windows, CentOS and macOS:

(Windows 10)

100000+0 records in
100000+0 records out
real    0m 0.29s
user    0m 0.03s
sys     0m 0.21s

(CentOS)

100000+0 records in
100000+0 records out
real    0m 0.21s
user    0m 0.02s
sys     0m 0.14s

(macOS)

100000+0 records in
100000+0 records out
real    0m 19.32s
user    0m 0.42s
sys     0m 1.46s

So the winner is… macOS, with 19 seconds for writing, roughly 90 times slower than CentOS. Reading is quite similar. If you develop a big dockerized application, you are in a bad spot. Usually you work on your source code and expect no slowdown when building, but the bitter truth is that a build will take ages.
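
To put the timings above into perspective, they can be converted into throughput. The sketch below assumes that the "real" time covers only the dd write, and it uses the 1024-byte block size and the 100000 blocks from the benchmark command. The helper name throughput_mb_s is made up for this example.

```shell
#!/bin/sh
# Convert the "real" time of the dd run into throughput in MB/s.
# dd wrote 100000 blocks of 1024 bytes each, i.e. 102.4 MB in total.
throughput_mb_s() {
    # $1 = elapsed seconds taken from the "real" line
    awk -v secs="$1" 'BEGIN { printf "%.1f\n", (1024 * 100000) / secs / 1000000 }'
}

echo "Windows 10: $(throughput_mb_s 0.29) MB/s"
echo "CentOS:     $(throughput_mb_s 0.21) MB/s"
echo "macOS:      $(throughput_mb_s 19.32) MB/s"
```

With the numbers measured above, this comes out at a few hundred MB/s on Windows and CentOS versus roughly 5 MB/s on macOS.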

This GitHub issue tracks the current state. There is a lot of anger in the thread, so it is better to read the comments from the project "members" than to wade through all the frustration.

@dsheetz from the Docker for Mac team nailed the issue:

Perhaps the most important thing to understand is that shared file system performance is multi-dimensional. This means that, depending on your workload, you may experience exceptional, adequate, or poor performance with osxfs, the file system server in Docker for Mac. File system APIs are very wide (20-40 message types) with many intricate semantics involving on-disk state, in-memory cache state, and concurrent access by multiple processes. Additionally, osxfs integrates a mapping between OS X's FSEvents API and Linux's inotify API which is implemented inside of the file system itself complicating matters further (cache behavior in particular).

At the highest level, there are two dimensions to file system performance: throughput (read/write IO) and latency (roundtrip time). In a traditional file system on a modern SSD, applications can generally expect throughput of a few GB/s. With large sequential IO operations, osxfs can achieve throughput of around 250 MB/s which, while not native speed, will not be the bottleneck for most applications which perform acceptably on HDDs.

Latency is the time it takes for a file system system call to complete. For instance, the time between a thread issuing write in a container and resuming with the number of bytes written. With a classical block-based file system, this latency is typically under 10μs (microseconds). With osxfs, latency is presently around 200μs for most operations or 20x slower. For workloads which demand many sequential roundtrips, this results in significant observable slow down. To reduce the latency, we need to shorten the data path from a Linux system call to OS X and back again. This requires tuning each component in the data path in turn -- some of which require significant engineering effort. Even if we achieve a huge latency reduction of 100μs/roundtrip, we will still "only" see a doubling of performance. This is typical of performance engineering, which requires significant effort to analyze slowdowns and develop optimized components.
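
The latency figures in the quote line up with the benchmark from the beginning of this article. The sketch below is a rough estimate only: it assumes that each of the 100000 one-kilobyte blocks written by dd costs exactly one file system roundtrip, which is my assumption and not something the quote states. The helper name estimate_seconds is hypothetical.

```shell
#!/bin/sh
# Estimate the total time spent waiting on roundtrips.
estimate_seconds() {
    # $1 = number of roundtrips, $2 = latency per roundtrip in microseconds
    awk -v n="$1" -v us="$2" 'BEGIN { printf "%.1f\n", n * us / 1000000 }'
}

echo "osxfs today (200 us):      $(estimate_seconds 100000 200) s"
echo "osxfs after -100 us:       $(estimate_seconds 100000 100) s"
echo "block-based upper (10 us): $(estimate_seconds 100000 10) s"
```

At 200μs per roundtrip the estimate is 20 seconds, close to the 19.32 seconds measured on macOS. Halving the latency to 100μs would still leave 10 seconds, which is the "only a doubling" the quote talks about.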

Many people have built workarounds using different approaches: NFS, Docker in Docker, Unison two-way sync, or rsync. I tried several of them, but none worked for my Docker container, which holds a big Java monolith. Either they require extra tools such as Vagrant to reduce the pain (Vagrant uses NFS, which is still slow compared to native read and write performance), or they are unreliable, hard to set up and hard to maintain.

I took a step back and thought about the root issue again. A very good approach is docker-sync, a Ruby application with a lot of options. One very mature option is file synchronization based on rsync.

(Rsync)

Rsync was first released in 1996 (20 years ago). It is used for transferring files between computer systems, and one important use case is one-way synchronization.

Sounds good, right?

Docker-sync supports rsync for synchronization. At first it worked, but a few days later I ran into connection issues between my host and my container.

Do you know the feeling when you want to fix something but it feels so far away? You realise you don't understand what's happening behind the scenes.

The rsync approach sounds right. It tackles the root of the issue: operating on mounted files right now is damn slow.

I tried other solutions but without real success.

(Build a custom image)

So let's get our hands dirty. We start an rsync server in the container and connect to it with rsync from the host. This approach has worked for many years in other use cases.

Let's set up a Docker CentOS 6 container with rsync installed and configured.

1. The Dockerfile

FROM centos:6
# install rsync and xinetd, which launches the rsync daemon on demand
RUN yum update -y
RUN yum -y install rsync xinetd
# configure rsync
COPY ./rsyncd.conf /etc/rsyncd.conf
# enable the rsync service in xinetd
RUN sed -i 's/disable[[:space:]]*=[[:space:]]*yes/disable = no/g' /etc/xinetd.d/rsync
RUN chkconfig xinetd on
# create the directory that will be synced
RUN mkdir /home/share
# start xinetd, then block so the container keeps running
CMD /etc/rc.d/init.d/xinetd start && tail -f /dev/null
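
The Dockerfile above copies an rsyncd.conf that is not shown here. Below is a minimal sketch of what it could contain, written as a shell heredoc so it can be pasted into a terminal in the repository directory. The module name example is taken from the rsync URLs used later in this article, and the path matches the mkdir in the Dockerfile. The remaining settings (root as uid and gid, a writable module) are my assumptions for a local development container, not the configuration from the original repository.

```shell
#!/bin/sh
# Write a minimal rsyncd.conf next to the Dockerfile.
cat > rsyncd.conf <<'EOF'
# Run transfers as root inside the container.
# Assumption: acceptable for a local dev container, not for production.
uid = root
gid = root
use chroot = yes
# The host pushes files into the container, so the module must be writable.
read only = false

[example]
    path = /home/share
    comment = files synced from the host
EOF
```

The section name in square brackets is what appears after the port in rsync://localhost:10873/example/.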

2. Build the image from within the repository directory.

docker build . -t docker-rsync

3. Start the container and map the rsync server port to a specific host port.

docker run -p 10873:873 docker-rsync

Now we need to sync our share directory once and then sync again as soon as anything changes. After the initial sync, rsync only transfers the changes.

# initial sync
rsync -avP ./share --delete rsync://localhost:10873/example/
# sync on change
fswatch -0 ./share | xargs -0 -n 1 -I {} rsync -avP ./share --delete rsync://localhost:10873/example/

As soon as something changes, fswatch triggers rsync, which talks to the container. We do not use any kind of Docker volume mount, so all file operations stay inside the container and are fast. Whenever we change something, rsync transfers it to the container. Of course, you can use all rsync features such as delete rules or exclude patterns.

If we change something (it does not matter whether the project is small or huge), we see something like:

2 files to consider
share/helloWorld.txt
           5 100%    0.00kB/s    0:00:00 (xfer#1, to-check=0/2)
sent 159 bytes  received 44 bytes  406.00 bytes/sec
total size is 5  speedup is 0.02

Done in a fraction of a second, great! (The 0.02 in the last line is rsync's speedup ratio, not a duration.)

Fswatch uses file system events on macOS, so it is still very fast, and you can even tweak it, for example by excluding build-related directories such as target or node_modules.
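
The exclusion mentioned above could look like the following sketch. It swaps the xargs pipeline for a small loop and uses fswatch's -o (one line per batch of events) and --exclude (regex filter) options together with rsync's --exclude. The directory list and the function names are illustrative assumptions, not part of the original repository.

```shell
#!/bin/sh
# Directories that should be neither watched nor transferred (illustrative).
EXCLUDES="target node_modules"

# Print one rsync option per excluded directory, e.g. "--exclude=target/".
rsync_excludes() {
    for dir in $EXCLUDES; do
        printf '%s ' "--exclude=$dir/"
    done
}

# Print one fswatch regex filter per excluded directory.
fswatch_excludes() {
    for dir in $EXCLUDES; do
        printf '%s ' "--exclude=/$dir/"
    done
}

# Watch ./share and run a single rsync for every batch of change events.
watch_and_sync() {
    # The option lists are deliberately unquoted so that they split into words.
    fswatch -o $(fswatch_excludes) ./share | while read -r _count; do
        rsync -a --delete $(rsync_excludes) ./share rsync://localhost:10873/example/
    done
}
```

After sourcing these functions, calling watch_and_sync starts the watcher. It runs until it is interrupted with ctrl+c.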

Sources are available on GitHub.

For small projects, the poor performance is not a critical issue. For huge applications, rsync is our hero: a good old tool that is still reliable and important.

Everyone who loves macOS and has to use a VM knows the pain. Issues like the command key mapping are annoying: either you map it to the Windows key, or in the end you stop using it. So on macOS you use cmd+c to copy something and in your container you use control. Of course, you can also map your host control key to command, but then you run into other issues. As a Mac user, everything is better when you can work in macOS instead of in a virtual machine.

I hope you enjoyed the article. If you like it and feel the need for a round of applause, follow me on Twitter.

I am a co-founder of our revolutionary journey platform called Explore The World. We are a young startup located in Dresden, Germany and will target the German market first. Reach out to me if you have feedback and questions about any topic.

Happy coding :)

Translated from: https://www.freecodecamp.org/news/speed-up-file-access-in-docker-for-mac-fbeee65d0ee7/
