A First Look at How Unity SRP Works
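This analysis is based on a very valuable piece of official Unity material, the video "SRP底层渲染流程及原理" (SRP low-level rendering flow and principles).
The video explains how SRP works internally from the profiler's point of view, but after watching it a few questions remain unanswered:
- Where is Render called on a custom SRP RenderPipeline?
- How do the context functions call into the Unity kernel?
- Debugging shows that Render() is not called from multiple threads, so why is SRP described as multi-threaded rendering?
Answering these questions requires reverse engineering the player.
Below is a simple render pipeline whose execution we will trace: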
using UnityEngine;
using UnityEngine.Rendering;

public class OpaqueAssetPipeInstance : RenderPipeline
{
    protected override void Render(ScriptableRenderContext context, Camera[] cameras)
    {
        foreach (var camera in cameras)
        {
            // Culling; skip cameras whose culling parameters cannot be computed
            if (!camera.TryGetCullingParameters(out var cullingParams))
                continue;
            var cullResults = context.Cull(ref cullingParams);
            // Setup camera for rendering (sets render target, view/projection matrices and other
            // per-camera built-in shader variables).
            context.SetupCameraProperties(camera);
            // Clear the depth buffer only (clearDepth = true, clearColor = false)
            var cmd = new CommandBuffer();
            cmd.ClearRenderTarget(true, false, Color.black);
            context.ExecuteCommandBuffer(cmd);
            cmd.Release();
            // Draw opaque objects using BasicPass shader pass
            var sst = new SortingSettings(camera) { criteria = SortingCriteria.CommonOpaque };
            var settings = new DrawingSettings(new ShaderTagId("BasicPass"), sst);
            var filterSettings = new FilteringSettings(RenderQueueRange.opaque);
            context.DrawRenderers(cullResults, ref settings, ref filterSettings);
            // Draw skybox
            context.DrawSkybox(camera);
            // Nothing above is executed until Submit is called
            context.Submit();
        }
    }
}
When is Render called
This can be read from the debugger call stack, or found by searching in IDA Pro:
UnityMain
UnityMainImpl
PlayerMainWndProc
RenderManager::RenderCameras
RenderManager::RenderCamerasWithScriptableRenderLoop
ScriptableRenderContext::ExtractAndExecuteRenderPipeline//executes the pipeline
ScriptingInvocation::Invoke
scripting_method_invoke
scripting_method_invoke then calls RenderPipeline.Render() in il2cpp, so Render does indeed run on the main thread.
This answers the first question above.
context.Cull
Now consider context.Cull(ref cullingParams). Its call path can be followed in IDA Pro and is summarized below:
ScriptableRenderContext_CUSTOM_Internal_Cull_Injected//native-side entry point of Cull inside the Unity kernel
CullScriptable
CullScene((struct CullResults *)v4);
CullDynamicScene
JobBatchDispatcher::ScheduleJobForEachInternal(
(JobBatchDispatcher *)&v28,
(struct JobFence *)&v29,
CullDynamicObjectsJob,
v19,
v19[54],
CullDynamicSceneCombineJob,
a2);
CullResults::GetOrCreateSharedRendererScene((CullResults *)v4);
CullResults::GetOrCreateSharedRendererScene(CullResults *this)
ExtractSceneRenderNodeQueue//iterates over all currently visible Renderer objects and copies their data into RenderNodes
CullSceneDynamicObjects
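The CullDynamicObjectsJob seen here ties in with the CullSceneDynamicObjects marker shown in the profiler in the original video:
CullDynamicScene splits the culling queue and hands it to JobBatchDispatcher, which schedules multiple CullDynamicObjectsJob instances to achieve multi-threaded culling.
From the job timeline we can tell that CullDynamicSceneCombineJob runs after all CullDynamicObjectsJob instances have finished, on the thread that ran the last CullDynamicObjectsJob.
This tells us what JobBatchDispatcher::ScheduleJobForEachInternal does: it fans one job function out over several workers and then runs a combine job once. That matches the process described in the video, where the culling queue is split up and the results are merged at the end.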
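The split-then-combine behavior described above can be illustrated with a small standalone model. This is only a sketch under assumptions: ScheduleJobForEach and CullVisible are hypothetical names, plain std::thread stands in for Unity's job system, and the "culling" test is a toy range check on x coordinates rather than anything Unity actually does.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for JobBatchDispatcher::ScheduleJobForEachInternal:
// runs forEachJob(i) for i in [0, jobCount) on worker threads, then runs
// combineJob exactly once, on whichever thread finishes the last job.
void ScheduleJobForEach(std::size_t jobCount,
                        const std::function<void(std::size_t)>& forEachJob,
                        const std::function<void()>& combineJob)
{
    std::atomic<std::size_t> remaining{jobCount};
    std::vector<std::thread> workers;
    workers.reserve(jobCount);
    for (std::size_t i = 0; i < jobCount; ++i) {
        workers.emplace_back([&, i] {
            forEachJob(i);
            // fetch_sub returns the previous value: 1 means this was the last job.
            if (remaining.fetch_sub(1, std::memory_order_acq_rel) == 1)
                combineJob();
        });
    }
    for (auto& w : workers) w.join();  // stands in for waiting on the JobFence
}

// Toy "culling": each job tests one slice of x coordinates against a visible
// range, and the combine step merges the per-job index lists in job order.
std::vector<int> CullVisible(const std::vector<float>& xs, float minX, float maxX,
                             std::size_t jobCount)
{
    if (jobCount == 0) jobCount = 1;
    std::vector<std::vector<int>> perJob(jobCount);
    std::vector<int> visible;
    const std::size_t chunk = (xs.size() + jobCount - 1) / jobCount;
    ScheduleJobForEach(
        jobCount,
        [&](std::size_t job) {
            const std::size_t begin = job * chunk;
            const std::size_t end = std::min(xs.size(), begin + chunk);
            for (std::size_t i = begin; i < end; ++i)
                if (xs[i] >= minX && xs[i] <= maxX)
                    perJob[job].push_back(static_cast<int>(i));
        },
        [&] {
            for (const auto& part : perJob)
                visible.insert(visible.end(), part.begin(), part.end());
        });
    return visible;
}
```

Each job writes only to its own slot in perJob, so no locking is needed; the atomic countdown is what lets the last finishing worker run the combine step, which is the behavior the job timeline suggests for CullDynamicSceneCombineJob.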
ExtractRenderNodeQueue
Next comes ExtractSceneRenderNodeQueue, which is also called during culling:
CullResults::GetOrCreateSharedRendererScene((CullResults *)v4);
CullResults::GetOrCreateSharedRendererScene(CullResults *this)
ExtractSceneRenderNodeQueue//iterates over all currently visible Renderer objects and copies their data into RenderNodes
BeginRenderQueueExtraction
JobBatchDispatcher::ScheduleJobForEachInternal(
a7,
(struct JobFence *)((char *)v17 + 2224),
ExecuteRenderQueueJob,
v17,
v20,
CopyNodesIntoJobGaps,
(const struct JobFence *)v23);
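Here ExecuteRenderQueueJob is the job function that the worker threads execute; it corresponds to ExtractRenderNodeQueue in the profiler.
How kernel functions map to profiler names
For a programmer unfamiliar with Unity's internals this mapping is hard to work out. The answer is hidden in the debug build of UnityPlayer.dll.
Here is the decompiled pseudocode of ExecuteRenderQueueJob.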
void __fastcall ExecuteRenderQueueJob(void *a1, unsigned int a2)
{
__int64 *v2; // rdi
__int64 i; // rcx
__int64 v4; // [rsp+0h] [rbp-58h] BYREF
struct profiling::Marker *v5; // [rsp+28h] [rbp-30h]
_DWORD *v6; // [rsp+38h] [rbp-20h]
__int64 v7; // [rsp+40h] [rbp-18h]
v2 = &v4;
for ( i = 20i64; i; --i )
{
*(_DWORD *)v2 = -858993460;
v2 = (__int64 *)((char *)v2 + 4);
}
v7 = -2i64;
v5 = (struct profiling::Marker *)&unk_18BDFF480;
profiler_begin((struct profiling::Marker *)&unk_18BDFF480);
v6 = (char *)a1 + 24 * a2 + 2280;
ExecuteRenderQueue(
(char *)a1 + 136 * a2 + 48,
a1,
(unsigned int)*v6,
(unsigned int)(*((_DWORD *)a1 + 6 * a2 + 572) + *v6));
profiler_end(v5);
}
Note the calls to profiler_begin and profiler_end, which mark the start and end of a profiler sample. The marker they use is initialized here:
__int64 dynamic_initializer_for__gExtractRenderNodeQueue__()
{
__int64 *v0; // rdi
__int64 i; // rcx
__int64 v3; // [rsp+0h] [rbp-28h] BYREF
v0 = &v3;
for ( i = 8i64; i; --i )
{
*(_DWORD *)v0 = -858993460;
v0 = (__int64 *)((char *)v0 + 4);
}
return profiling::Marker::Marker(&unk_18BDFF480, 0i64, "ExtractRenderNodeQueue", 0i64);
}
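unk_18BDFF480 is bound to the name ExtractRenderNodeQueue, so the samples recorded by ExecuteRenderQueueJob show up in the profiler under the name ExtractRenderNodeQueue. For programmers without access to the engine internals, this is far from easy to figure out.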
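The relationship between the function name and the marker name can be modeled in a few lines. This is a minimal sketch, not Unity's profiler: the Marker struct, the Events() log, and the two profiler functions here are simplified stand-ins placed in a hypothetical sketch namespace, and only the names taken from the decompiled code above come from the source.

```cpp
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-in for profiling::Marker: it only carries the name
// that will show up in the profiler.
struct Marker {
    explicit Marker(const char* n) : name(n) {}
    const char* name;
};

// Recorded events, standing in for the profiler's sample stream.
inline std::vector<std::string>& Events() {
    static std::vector<std::string> events;
    return events;
}

inline void profiler_begin(const Marker* m) {
    Events().push_back(std::string("begin ") + m->name);
}

inline void profiler_end(const Marker* m) {
    Events().push_back(std::string("end ") + m->name);
}

// Constructed once during static initialization, in the same way that
// dynamic_initializer_for__gExtractRenderNodeQueue__ sets up unk_18BDFF480.
static Marker gExtractRenderNodeQueue("ExtractRenderNodeQueue");

// The function is called ExecuteRenderQueueJob, but its sample is recorded
// under the marker's name, not the function's name.
inline void ExecuteRenderQueueJob() {
    profiler_begin(&gExtractRenderNodeQueue);
    // ... the actual work (ExecuteRenderQueue in the decompiled code) ...
    profiler_end(&gExtractRenderNodeQueue);
}

} // namespace sketch
```

Because the marker name is an arbitrary string chosen at initialization, nothing forces it to match the function that uses it, which is why the profiler label and the kernel symbol differ.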
context.ExecuteCommandBuffer
First, the call chain of ExecuteCommandBuffer in il2cpp:
OpaqueAssetPipeInstance_Render
ScriptableRenderContext_ExecuteCommandBuffer_m044EA375988E542EF1A03C560F924EEFD743A875
ScriptableRenderContext_ExecuteCommandBuffer_Internal
ScriptableRenderContext_ExecuteCommandBuffer_Internal_Injected
_il2cpp_icall_func(____unity_self0, ___commandBuffer1)//"UnityEngine.Rendering.ScriptableRenderContext::ExecuteCommandBuffer_Internal_Injected(UnityEngine.Rendering.ScriptableRenderContext&,UnityEngine.Rendering.CommandBuffer)" calls into the function in UnityPlayer
ScriptableRenderContext_CUSTOM_ExecuteCommandBuffer_Internal_Injected
ScriptableRenderContext::ExecuteCommandBuffer
RenderingCommandBuffer::RenderingCommandBuffer
dynamic_array<RenderingCommandBuffer *,0>::emplace_back//m_CommandBuffers
ScriptableRenderContext::AddCommandWithIndex
dynamic_array<ScriptableRenderContext::Command,0>::emplace_back//m_Commands
As the trace shows, context.ExecuteCommandBuffer does not execute anything; it only appends to m_CommandBuffers and m_Commands, which matches the video.
context.DrawRenderers
First, the call chain of DrawRenderers in il2cpp:
ScriptableRenderContext_DrawRenderers
ScriptableRenderContext_DrawRenderers_Internal
ScriptableRenderContext_DrawRenderers_Internal_Injected
_il2cpp_icall_func
//UnityEngine.Rendering.ScriptableRenderContext::DrawRenderers_Internal_Injected(UnityEngine.Rendering.ScriptableRenderContext&,System.IntPtr,UnityEngine.Rendering.DrawingSettings&,UnityEngine.Rendering.FilteringSettings&,System.IntPtr,System.IntPtr,System.Int32)
In UnityPlayer:
void __cdecl ScriptableRenderContext_CUSTOM_DrawRenderers_Internal_Injected
ScriptableRenderContext::DrawRenderers(*(ScriptableRenderContext **)a1, a2, a3, a4, Src, a6, a7);
dynamic_array<DrawRenderersCommand,0>::emplace_back//m_DrawRenderersCommands
ScriptableRenderContext::AddCommandWithIndex
dynamic_array<ScriptableRenderContext::Command,0>::emplace_back//m_Commands
As with ExecuteCommandBuffer above, this only updates the lists (here m_DrawRenderersCommands and m_Commands).
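The recording pattern seen in both traces can be sketched as follows. This is an illustrative model only: RenderContextSketch is a hypothetical class, the payloads are reduced to strings, and just the member names m_CommandBuffers, m_DrawRenderersCommands, m_Commands and the helper AddCommandWithIndex are taken from the traces above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the recording side of ScriptableRenderContext.
class RenderContextSketch {
public:
    enum class CommandType { ExecuteCommandBuffer, DrawRenderers };

    struct Command {
        CommandType type;
        std::size_t index;  // index into the payload array for this type
    };

    // Stores the command buffer payload; nothing is executed here.
    void ExecuteCommandBuffer(const std::string& commandBuffer) {
        m_CommandBuffers.push_back(commandBuffer);
        AddCommandWithIndex(CommandType::ExecuteCommandBuffer,
                            m_CommandBuffers.size() - 1);
    }

    // Stores the draw request payload; nothing is drawn here.
    void DrawRenderers(const std::string& shaderPassName) {
        m_DrawRenderersCommands.push_back(shaderPassName);
        AddCommandWithIndex(CommandType::DrawRenderers,
                            m_DrawRenderersCommands.size() - 1);
    }

    const std::vector<Command>& Commands() const { return m_Commands; }
    std::size_t CommandBufferCount() const { return m_CommandBuffers.size(); }
    std::size_t DrawRenderersCount() const { return m_DrawRenderersCommands.size(); }

private:
    // Appends an ordered entry that points back into a per-type payload array.
    void AddCommandWithIndex(CommandType type, std::size_t index) {
        m_Commands.push_back({type, index});
    }

    std::vector<std::string> m_CommandBuffers;
    std::vector<std::string> m_DrawRenderersCommands;
    std::vector<Command> m_Commands;
};
```

Keeping one ordered list of (type, index) pairs next to typed payload arrays preserves the order in which the script issued the calls, so a later submit step can replay them in sequence.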
context.Submit
In il2cpp:
ScriptableRenderContext_Submit
ScriptableRenderContext_Submit_Internal_Injected
_il2cpp_icall_func
//UnityEngine.Rendering.ScriptableRenderContext::Submit_Internal_Injected(UnityEngine.Rendering.ScriptableRenderContext&)
In UnityPlayer:
ScriptableRenderContext_CUSTOM_Submit_Internal_Injected
ScriptableRenderContext::Submit(ScriptableRenderContext *this)
ScriptableRenderContext::ExecuteScriptableRenderLoop
ExecuteDrawRenderersCommand
GfxDeviceClient::ExecuteAsync
GfxDevice::ExecuteAsync
ScriptableRenderLoopJob
ScriptableRenderLoopDraw
ScriptableRenderLoopDrawSRPBatcher
Flush
RenderMultipleMeshes
ExecuteDrawRenderersCommand is what produces the "RenderLoop.ScheduleDraw" marker in the profiler.
About RenderMultipleMeshes
void __fastcall ScriptableBatchRenderer::RenderMultipleMeshes(__int64 this, __int64 renderNodeQueue, __int64 renderMultipleData, unsigned int mask)
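The function that actually fills in and submits the data is RenderMultipleMeshes, not Flush.
Summary
context.Cull: multi-threaded culling and generation of the RenderQueue.
context.ExecuteCommandBuffer and context.DrawRenderers: fill the command lists without executing anything.
context.Submit: the main submission function, which does most of the batching and submission work.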
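The division of labor between Flush and RenderMultipleMeshes might look roughly like the sketch below. Everything in it is an assumption made for illustration: BatcherSketch and DrawItem are hypothetical, and closing a batch when the shader variant changes is a guess at the batching criterion, not something the decompiled code above confirms.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical draw request: which shader variant it uses and which mesh to draw.
struct DrawItem {
    std::string shaderVariant;
    int meshId;
};

class BatcherSketch {
public:
    // Adds one draw; a change of shader variant closes the current batch.
    void Add(const DrawItem& item) {
        if (!m_Pending.empty() &&
            m_Pending.front().shaderVariant != item.shaderVariant)
            Flush();
        m_Pending.push_back(item);
    }

    // Flush only decides when a batch ends; the submission is delegated.
    void Flush() {
        if (m_Pending.empty()) return;
        RenderMultipleMeshes(m_Pending);
        m_Pending.clear();
    }

    // Each inner vector is the list of mesh ids submitted as one batch.
    const std::vector<std::vector<int>>& Submitted() const { return m_Submitted; }

private:
    // Stand-in for the function that fills in and submits the batch data.
    void RenderMultipleMeshes(const std::vector<DrawItem>& batch) {
        std::vector<int> meshes;
        for (const auto& d : batch) meshes.push_back(d.meshId);
        m_Submitted.push_back(std::move(meshes));
    }

    std::vector<DrawItem> m_Pending;
    std::vector<std::vector<int>> m_Submitted;
};
```

In this model Flush is a thin boundary marker and all per-batch work lives in RenderMultipleMeshes, which mirrors the observation above about where the data is actually filled in and submitted.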