介绍了主线程和合成线程各自工作的演化,为什么将raster像素化移到了合成线程
我们知道,在childcompositor里,implthread里,pendingtree被raster之后,才会被activate成activetree. 而且raster任务由一些rasterthread执行,那么这些raster操作干了什么?
看看这篇chromium的官方英文文章:
https://www.chromium.org/developers/design-documents/impl-side-painting
引入了impl-sidepainting之后,raster操作从webcore线程挪到了impl线程,impl线程安排rastertask让rasterthread去真正执行。webcore线程绘制到skpicture里面。里面都是displaylist. 并且划分了picturepile,针对一个layer,划分成多个tile(tile还有不同的缩放比例),每个tile有不同的位置,优先级,还有不同的resolution.impl线程针对tile排序,产生rastertask让raster线程去操作。
绘制的结果产生texture,通过sharedmemory供给gpumemory.
impl 线程合成时,根据resourceid, (textureid)来检查该tile对应texture是否已经在gpu中了,而后可以合成。
————————————————
版权声明:本文为「csshell2002」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。
Background & Problem Statement
Our current compositor thread architecture is built around the idea of rasterizing layers on the main webkit thread and then, on the compositor thread, drawing the bits of the layers that we have based on their various animated and scrolled positions. This allows us to move the page up and down, e.g. due to finger dragging, without having to block on the webkit thread. When a tile is exposed that does not have contents, we draw a checkerboard and wait for the main thread to rasterize that tile.
- We want to be able to fill in checkerboards without requiring a new commit, since that requires going to a busy webkit thread and pulling in a whole new tree + damage. We also want to be able to render tiles at multiple resolutions, and quality levels. These kinds of tricks reduce memory pressure, avoid the jarring interruption of checkerboards.
The Excessive Checkerboarding Problem
A lot of our unwanted checkerboarding comes from invalidates getting intermixed with "requests" from the impl thread to fill in missing tiles. In the current architecture, we can only rasterize tiles on the main thread, using webkit's rendering data structures. If webkit's rendering tree is completely unchanged, then the page scrolls, all the rasterization requests that go to the main thread are easily satisifed by webkit.
However, any time javascript changes the rendering tree, we have the following problem: we have some "newly exposed tiles" that the compositor thread needs to prevent checkerboarding. But, annoyingly, any of the previously-painted tiles that webkit says were invalidated. We can only paint the new rendering tree -- the old rendering tree is gone. So, we have two options at this point:
Our other source of heavy checkerboarding is latency related. The work we do on the main thread is based on as scroll position update message that comes from the impl thread. This message is itself not very latent, arriving on the main thread milliseconds after it is sent. However, paints for a new set of tiles can take 300ms + to complete, even with the relaxed atomicity approach described above. By the time we have painted all 300ms worth of work, the page has scrolled way past the original scroll position, and half of the tiles we worked hard to prepare are irrelevant. We have discussed a variety of solutions here, but the real core problem is that the main thread cannot be updated fast enough with the new scroll positions to really ever keep up properly.
Planned Solution
Display lists. Namely, SkPictures, modified a bit to support partial updating. We call this a Picture pile, a name borrowed from the awesome folks behind Android Browser. The idea is to only capture a display list of the webkit rendering tree on the main thread. Then, do rasterization on the impl thread, which is much more responsive.
On main thread, web content is turned into PictureLayers. Picture layers make a recording of the layer into a PicturePile. We track invalidations in SkRegions and during the display list capture process, decide between re-capturing the entire layer or just grabbing the invalidated area and drawing it on-top of the previously recorded base layer.
During commit, we pass these PicturePiles to a PictureLayerImpl. Recall, layers can change in scale over time, under animation, pinch zoom, etc. To handle this, a PictureLayerImpl manages one or more PictureLayerTiling objects (via a PictureLayerTilingSet), which is a decomposition of the layer's entire contents into tiles at a picture screenspace resolution. So for example, a 512x512 layer might have a tiling into 4 256x256 tiles for a 1:1 ratio of screenspace pixels to content pixels, but also 1 256x256 tile for a 1:2 ratio of screenspace to conten space. We manage these tilings dyanmically.
A tiling itself takes the layers entire size, not just the visible part, and breaks it up into Tiles. Each tile represents a rectangle of the PicturePile painted into a Resource ID [think, GL texture], at a given resolution and quality setting.
This approach fixes the “atomicity of commits” problem by allowing us to servie checkerboard misses without havin to go to the laggy, potentially changed main thread. In the previous example, when the compositor sees a checkerboarded tile, we can rasterize it without having to start a commit flow, allowing us to disallow commits entirely during flings and other heavy animation use cases.
Hitch-free commits
A key challenge with this approach is switching from the old tree to the new tree. In the existing architecture, when we go to switch to the new tree, we have painted and uploaded all the tiles, so the tree can be immediately switched.
Handling Giant SkPictures
One potential challenge to impl-side painting compared to our existing painting model is that the SkPicture for a given layer are potentially unbounded. We plan to mitigate this by limiting the PicturePile's size to a 10,000px (emperically determined) portion of the total layer size cenetered around the viewport at the time of the picture pile's first creation. When the impl thread starts needing tiles outside the pile's area, we will asynchronously trigger the main thread to go update the pile around the new viewport center.
Handling setPictureListener 合成器的结果可以传回给主线程???
If the embedder has a picture listener, we need to send a serialized SkPicture to the embedding process. We would need to, at every impl-side swapbuffers, serialize our SkPictures for all the active layers (plus the bitmaps) and send them to the main thread.Followup Work
The initial impl-side painting implementation is expected to enable the following followup use cases:
Low-res tiles: For tiles that take a long time to rasterize, we may want to rasterize them at half or third resolution. This often dramatically reduces (5-6x anecdotally) raster cost and allows us to avoid checkerboarding during fling. However, it is worth noting that some Android users criticized this behavior on ICS devices as making fonts look too ugly. High-dpi devices may change the UX impact of this behavior on users.
Just-in-time scaling: We currently do resizing of content at many layers in the pipeline. For example, we rasterize layers at their content resolution without consideration to their screenspace transform. Thus, a layer that is -webkit-transform: scale(0.5)’d will actually paint at its full size. Similarly, we resize images inside webkit at their content resolution. We could reduce rasterization/decode costs and memory footprint if we could do all of this scaling using the draw-time transforms on the impl thread.
Accelerated painting: An interesting property of impl-side painting is that it cleans up our accelerated painting story. We would store the SkPicture for a layer, and then can decide to rasterize a layer with the GPU without having to involve the main thread at all in the process.