The OpenGL® Shading Language, Version 4.60.7: Chapter 2 Translation

Chapter 2. Overview of Shading
The OpenGL Shading Language is actually several closely related languages. These languages are
used to create shaders for each of the programmable processors contained in the API’s processing
pipeline. Currently, these processors are the vertex, tessellation control, tessellation evaluation,
geometry, fragment, and compute processors.
Unless otherwise noted in this paper, a language feature applies to all languages, and common
usage will refer to these languages as a single language. The specific languages will be referred to
by the name of the processor they target: vertex, tessellation control, tessellation evaluation,
geometry, fragment, or compute.
Most API state is not tracked or made available to shaders. Typically, user-defined variables will be
used for communicating between different stages of the API pipeline. However, a small amount of
state is still tracked and automatically made available to shaders, and there are a few built-in
variables for interfaces between different stages of the API pipeline.

2.1. Vertex Processor
The vertex processor is a programmable unit that operates on incoming vertices and their
associated data. Compilation units written in the OpenGL Shading Language to run on this
processor are called vertex shaders. When a set of vertex shaders are successfully compiled and
linked, they result in a vertex shader executable that runs on the vertex processor.
The vertex processor operates on one vertex at a time. It does not replace graphics operations that
require knowledge of several vertices at a time.

2.2. Tessellation Control Processor
The tessellation control processor is a programmable unit that operates on a patch of incoming
vertices and their associated data, emitting a new output patch. Compilation units written in the
OpenGL Shading Language to run on this processor are called tessellation control shaders. When a
set of tessellation control shaders are successfully compiled and linked, they result in a tessellation
control shader executable that runs on the tessellation control processor.
The tessellation control shader is invoked for each vertex of the output patch. Each invocation can
read the attributes of any vertex in the input or output patches, but can only write per-vertex
attributes for the corresponding output patch vertex. The shader invocations collectively produce a
set of per-patch attributes for the output patch.
After all tessellation control shader invocations have completed, the output vertices and per-patch
attributes are assembled to form a patch to be used by subsequent pipeline stages.
Tessellation control shader invocations run mostly independently, with undefined relative
execution order. However, the built-in function barrier() can be used to control execution order by
synchronizing invocations, effectively dividing tessellation control shader execution into a set of
phases. Tessellation control shaders will get undefined results if one invocation reads from a per-vertex
or per-patch attribute written by another invocation at any point during the same phase, or
if two invocations attempt to write different values to the same per-patch output 32-bit component
in a single phase.

2.3. Tessellation Evaluation Processor
The tessellation evaluation processor is a programmable unit that evaluates the position and other
attributes of a vertex generated by the tessellation primitive generator, using a patch of incoming
vertices and their associated data. Compilation units written in the OpenGL Shading Language to
run on this processor are called tessellation evaluation shaders. When a set of tessellation
evaluation shaders are successfully compiled and linked, they result in a tessellation evaluation
shader executable that runs on the tessellation evaluation processor.
Each invocation of the tessellation evaluation executable computes the position and attributes of a
single vertex generated by the tessellation primitive generator. The executable can read the
attributes of any vertex in the input patch, plus the tessellation coordinate, which is the relative
location of the vertex in the primitive being tessellated. The executable writes the position and
other attributes of the vertex.

2.4. Geometry Processor
The geometry processor is a programmable unit that operates on data for incoming vertices for a
primitive assembled after vertex processing and outputs a sequence of vertices forming output
primitives. Compilation units written in the OpenGL Shading Language to run on this processor are
called geometry shaders. When a set of geometry shaders are successfully compiled and linked, they
result in a geometry shader executable that runs on the geometry processor.
A single invocation of the geometry shader executable on the geometry processor will operate on a
declared input primitive with a fixed number of vertices. This single invocation can emit a variable
number of vertices that are assembled into primitives of a declared output primitive type and
passed to subsequent pipeline stages.

2.5. Fragment Processor
The fragment processor is a programmable unit that operates on fragment values and their
associated data. Compilation units written in the OpenGL Shading Language to run on this
processor are called fragment shaders. When a set of fragment shaders are successfully compiled
and linked, they result in a fragment shader executable that runs on the fragment processor.
A fragment shader cannot change a fragment’s (x, y) position. Access to neighboring fragments is
not allowed. The values computed by the fragment shader are ultimately used to update
framebuffer memory or texture memory, depending on the current API state and the API command
that caused the fragments to be generated.

2.6. Compute Processor
The compute processor is a programmable unit that operates independently from the other shader
processors. Compilation units written in the OpenGL Shading Language to run on this processor are
called compute shaders. When a set of compute shaders are successfully compiled and linked, they
result in a compute shader executable that runs on the compute processor.
A compute shader has access to many of the same resources as fragment and other shader
processors, such as textures, buffers, image variables, and atomic counters. It does not have fixed-function
outputs. It is not part of the graphics pipeline and its visible side effects are through
changes to images, storage buffers, and atomic counters.
A compute shader operates on a group of work items called a workgroup. A workgroup is a
collection of shader invocations that execute the same code, potentially in parallel. An invocation
within a workgroup may share data with other members of the same workgroup through shared
variables and issue memory and control flow barriers to synchronize with other members of the
same workgroup.
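
To make the role of shared variables and barriers concrete, here is a rough CPU-side sketch in C++ that mimics one workgroup summing its inputs. It is not GLSL and not how a driver executes anything: the function name `workgroupSum`, the `shared` vector standing in for a `shared` array, and the choice of a tree reduction are all assumptions made for illustration. The end of each outer-loop pass marks where a real compute shader would call barrier().

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU emulation of one workgroup summing its inputs.
// `shared` plays the role of a GLSL `shared` array visible to every
// invocation in the workgroup.
float workgroupSum(const std::vector<float>& input) {
    std::vector<float> shared(input);  // each invocation stores its own element
    const std::size_t n = shared.size();
    if (n == 0) return 0.0f;

    // Tree reduction. Within one phase, the invocation with index `id`
    // writes shared[id] and reads shared[id + stride]; no invocation reads
    // a slot that another one writes during the same phase.
    for (std::size_t stride = 1; stride < n; stride *= 2) {
        for (std::size_t id = 0; id + stride < n; id += 2 * stride) {
            shared[id] += shared[id + stride];
        }
        // A real shader would synchronize here (barrier()) before the
        // next phase reads the partial sums written in this one.
    }
    return shared[0];
}
```

Calling `workgroupSum({1, 2, 3, 4, 5})` walks through three phases (strides 1, 2, and 4) and leaves the total in slot 0.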

Supplementary Explanation

The code below uses tessellation shaders to draw a sine curve connecting two control points that are passed in as a patch.

The results are shown below for segment counts of 1, 8, and 32:

[Figures: the rendered curve with 1, 8, and 32 segments]

 

Vertex Shader:

    #version 400
    layout (location = 0) in vec3 in_Vertex;
    // Not used in this stage; the transform is applied in the tessellation evaluation shader.
    uniform mat4 ModelViewProjectionMatrix;

    void main()
    {
        // Pass the control point through untransformed.
        gl_Position = vec4(in_Vertex, 1.0);
    }

That is all there is to it. The vertex shader passes the incoming points unchanged to the next stage: the tessellation control shader.

 

Tessellation Control Shader (TCS):

    #version 400
    // Output patch size; matches the 2 control points submitted per patch.
    layout( vertices=2 ) out;
    void main()
    {
        // Pass along the vertex position unmodified
        gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

        // Isolines: [0] = number of curves, [1] = segments per curve.
        gl_TessLevelOuter[0] = 1.0;
        gl_TessLevelOuter[1] = 32.0;
    }

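The tessellation control shader does two things.

First, it outputs the position of each incoming control point (gl_in[gl_InvocationID].gl_Position) unchanged.

Second, it sets the tessellation levels. Their meaning depends on the tessellation mode (isolines here; the mode is selected in the next shader).

gl_TessLevelOuter[0] is the number of curves to generate. Here we choose 1.

gl_TessLevelOuter[1] is the number of segments each curve is divided into. This is the key parameter controlling how finely the curve is subdivided. We test with 1, 8, and 32.

[Note] In my experience the meanings of gl_TessLevelOuter[0] and gl_TessLevelOuter[1] are swapped between ATI and NVIDIA cards! I use an NVIDIA card here, whose behavior matches the assignment described above, which is also the order the OpenGL specification defines for isolines.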
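
As a rough illustration of what the two levels mean for isolines, the C++ sketch below lists the distinct (u, v) tessellation coordinates one would expect with equal spacing. The function name `isolineCoords` and its parameters are hypothetical, and the sketch assumes both levels are already integers of at least 1; the real tessellator also clamps and rounds the levels, which is not modeled here.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: distinct (u, v) coordinates for the isolines domain
// with equal spacing. lineCount corresponds to gl_TessLevelOuter[0] and
// segmentCount to gl_TessLevelOuter[1]; both are assumed to be >= 1.
std::vector<std::pair<float, float>> isolineCoords(int lineCount, int segmentCount) {
    std::vector<std::pair<float, float>> coords;
    for (int line = 0; line < lineCount; ++line) {
        // Lines sit at v = 0, 1/n, ..., (n-1)/n; no line is generated at v = 1.
        const float v = static_cast<float>(line) / static_cast<float>(lineCount);
        for (int i = 0; i <= segmentCount; ++i) {
            // Each line runs from u = 0 to u = 1 in segmentCount equal steps.
            const float u = static_cast<float>(i) / static_cast<float>(segmentCount);
            coords.emplace_back(u, v);
        }
    }
    return coords;
}
```

With the values used in this example, `isolineCoords(1, 32)` yields a single line at v = 0 sampled at 33 points, which is why only gl_TessCoord.x matters in the evaluation shader.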

Tessellation Evaluation Shader (TES):

    #version 400
    layout( isolines ) in;
    uniform mat4 ModelViewProjectionMatrix;
    void main()
    {
        float u = gl_TessCoord.x;
        vec3 p0 = gl_in[0].gl_Position.xyz;
        vec3 p1 = gl_in[1].gl_Position.xyz;
        // Amplitude: half the distance between the control points
        float leng = length(p1 - p0) / 2.0;
        // Linear interpolation in x and z, sine offset in y
        vec3 p;
        p.x = p0.x*u + p1.x*(1.0 - u);
        p.y = p0.y + leng*sin(u*2.0*3.1415);
        // z must be written too; an unwritten component is undefined
        p.z = p0.z*u + p1.z*(1.0 - u);
        // Transform to clip coordinates
        gl_Position = ModelViewProjectionMatrix * vec4(p, 1.0);
    }

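This is the core of the curve tessellation.

First, note the layout( isolines ) declaration. It tells the TES that the tessellation mode is isolines. For a curve we only need the x component of gl_TessCoord; it takes evenly spaced values in [0, 1], determined by the segment count set in the tessellation control shader.

For the two incoming control points p0 and p1, we compute half the distance between them, use it as the amplitude, and draw one period of a sine curve over [0, 2π]. Finally we apply the MVP transform and output the result.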
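
To see how the segment count changes the sampled curve, the same formula can be evaluated on the CPU. The C++ sketch below mirrors the shader arithmetic before the MVP transform; `Vec3`, `evalCurve`, and `sampleCurve` are names invented for this illustration, and interpolating z the same way as x is an assumption, since the explanation above only describes x and y.

```cpp
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

// Evaluate the curve at parameter u in [0, 1], mirroring the TES arithmetic.
Vec3 evalCurve(const Vec3& p0, const Vec3& p1, float u) {
    const float dx = p1.x - p0.x;
    const float dy = p1.y - p0.y;
    const float dz = p1.z - p0.z;
    // Amplitude is half the distance between the control points.
    const float amplitude = std::sqrt(dx * dx + dy * dy + dz * dz) / 2.0f;
    Vec3 p;
    p.x = p0.x * u + p1.x * (1.0f - u);                     // runs from p1.x to p0.x
    p.y = p0.y + amplitude * std::sin(u * 2.0f * 3.1415f);  // one sine period
    p.z = p0.z * u + p1.z * (1.0f - u);                     // assumption: same as x
    return p;
}

// Sample the curve at the segments + 1 evenly spaced parameter values that
// `segments` equal subdivisions produce (segments assumed to be >= 1).
std::vector<Vec3> sampleCurve(const Vec3& p0, const Vec3& p1, int segments) {
    std::vector<Vec3> points;
    for (int i = 0; i <= segments; ++i) {
        const float u = static_cast<float>(i) / static_cast<float>(segments);
        points.push_back(evalCurve(p0, p1, u));
    }
    return points;
}
```

With 1 segment only the two endpoints are sampled, so the result is a straight line; with 8 or 32 segments the intermediate samples trace the sine shape progressively more smoothly.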

 

Fragment Shader:

    #version 400
    // gl_FragColor is not available in the core profile; declare the output explicitly.
    out vec4 fragColor;

    void main()
    {
        fragColor = vec4(1.0, 0.0, 0.0, 1.0);
    }

This shader is not the point of the example, so it is as simple as I could make it.

OpenGL code:

    // Upload the MVP matrix used by the tessellation evaluation shader.
    pShader->sendUniform(string("ModelViewProjectionMatrix"), value_ptr(matMVP2));
    gl::BindVertexArray(_vertexArrayBlock);
    gl::EnableVertexAttribArray(0);

    // Each patch consists of 2 control points.
    gl::PatchParameteri(GL_PATCH_VERTICES, 2);
    gl::DrawArrays(GL_PATCHES, 0, 2);

    // Disable the attribute while the VAO is still bound, then unbind.
    gl::DisableVertexAttribArray(0);
    gl::BindVertexArray(0);

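Note that the primitive type must be GL_PATCHES, and that the patch size has to be set through GL_PATCH_VERTICES. Because there are only 2 control points, DrawArrays is given a count of 2.

[Note] Setting up the VBO/VAO for the two control points and compiling and linking the four shaders are omitted here.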