cinera_handmade.network/cmuratori/hero/code/code478.hmml

[video output=day478 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Changing to Single Dispatch Per Pass (Part 2)" vod_platform=youtube id=0d0_NitChCY annotator=Miblo]
2018-09-04 21:49:51 +00:00
[0:00][Recap our switch to texture vertex indices][:hardware :rendering :speech]
[1:36][Augment game_render_commands with an IndexArray, with a few words on the GPU's ability to cache vertex shader transforms][:hardware :rendering]
[7:29][Remove renderer_texture_group and renderer_memory_layout, and pass our new IndexArray down the pipe, also augmenting open_gl with IndexArray and MaxIndexCount][:hardware :rendering]
[14:00][:Run the Renderer Test successfully][:hardware :rendering]
[14:12][Reduce the MaxQuadCountPerFrame in RenderTest() and the (grass) CoverIndex in PushSimpleScene(), with a few words on breaking the problem into steps][:hardware :rendering]
[16:30][:Run the Renderer Test with our single vertex buffer working][:hardware :rendering]
[16:51][Augment render_entry_textured_quads with IndexArrayOffset for OpenGLEndFrame() to use the index buffer passed in][:hardware :rendering]
[20:19][Change OpenGLEndFrame() to call glBufferData() once, instead of while processing every render command][:hardware :rendering]
[22:17][:Run it to see problems with the depth peeling][:hardware :rendering]
[22:44][Make OpenGLInit() generate a separate CompositeVertexBuffer and ScreenFillVertexBuffer[ref
site="OpenGL 4 Reference Pages"
page=glBufferData
url=https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glBufferData.xhtml]][:hardware :rendering]
[28:52][:Run it with the depth peel working correctly][:hardware :rendering]
[29:03][Revert the MaxQuadCountPerFrame in RenderTest() and the (grass) CoverIndex in PushSimpleScene() to their original high values][:hardware :rendering]
[29:55][:Run it to see that it is much zippier][:hardware :performance :rendering]
[30:34][Temporarily disable the :lighting][:rendering]
[31:15][:Run it to see that disabling the lighting didn't affect performance much][:lighting :performance :rendering]
[31:51][Make RenderLoop() disable the :lighting][:rendering]
[33:17][:Run it and gauge the :performance with the :lighting disabled][:rendering]
[35:40][Turn off all the grass and augment open_gl with an IndexBuffer for OpenGLEndFrame() to use, rather than initialising Indices itself[ref
site="OpenGL 4 Reference Pages"
page=glBufferData
url=https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glBufferData.xhtml][ref
site=Khronos
page=glcorearb.h
url=https://www.khronos.org/registry/OpenGL/api/GL/glcorearb.h]][:hardware :rendering]
[45:31][Make PushQuad() fill our new IndexBuffer with the VertIndex][:hardware :rendering]
[49:08][:Run it without crashing, but also without seeing anything][:hardware :rendering]
[49:37][Fix OpenGLEndFrame() to pass GL_UNSIGNED_SHORT to glDrawElements()][:hardware :rendering]
[50:36][:Run it to see that our triangle must be wound wrong][:hardware :rendering]
[50:46][Triangle winding][:blackboard :hardware :rendering]
[51:25][Fix the vertex winding in PushQuad()][:hardware :rendering]
[51:29][:Run it to see that it looks about right][:hardware :rendering]
[51:38][Let OpenGLEndFrame() texture our quads with their bitmaps][:hardware :rendering]
[51:58][:Run it to see that it looks about right][:hardware :rendering]
[52:05][Switch OpenGLEndFrame() to perform one draw call per scene, using the white bitmap for each sprite][:hardware :rendering]
[53:21][:Run it to see that our :performance has improved][:hardware :rendering]
[54:56][Re-enable all the grass with a view to overcoming the unsigned short vertex buffer index limit][:hardware :rendering]
[55:40][:Run it and hit our VI == VertIndex assertion][:hardware :rendering]
[56:23][Enable GetCurrentQuads() and PushQuad() to handle more than 65535 / 4 quads using glDrawElementsBaseVertex()[ref
site=docs.GL
page=glDrawElementsBaseVertex
url=http://docs.gl/gl2/glDrawElementsBaseVertex][ref
site=Khronos
page=glcorearb.h
url=https://www.khronos.org/registry/OpenGL/api/GL/glcorearb.h]][:hardware :rendering]
[1:04:41][:Run it to see that it looks good][:hardware :rendering]
[1:05:29][Let OpenGLEndFrame() texture our quads with the grass bitmap][:hardware :rendering]
[1:06:06][:Run it to see that we now don't have any :performance spikes][:hardware :rendering]
[1:08:30][Begin to switch the renderer over to use texture arrays, first enabling the vertex shader to understand the notion of a vertex index][:hardware :rendering]
[1:18:44][:Run it successfully][:hardware :rendering]
[1:18:51][Switch OpenGLAllocateTexture() to specify a feed-forward texture array, calling glTexSubImage3D()[ref
site="OpenGL Wiki"
page="Array Texture"
url=https://www.khronos.org/opengl/wiki/Array_Texture][ref
site=docs.GL
page=glTexSubImage3D
url=http://docs.gl/gl3/glTexSubImage3D]][:hardware :memory :rendering]
[1:38:09][Switch [~hero Handmade Hero] over to use the renderer's new texture array][:"asset loading" :hardware :memory :rendering]
[1:45:01][:Run it and hit an OpenGL error: "Texture name does not refer to a texture object generated by OpenGL"][:hardware :rendering]
[1:46:08][Temporarily add a TextureHandles array to the open_gl struct][:hardware :rendering]
[1:48:42][:Run it to see our tree-texture scene][:hardware :rendering]
[1:49:00][Remove the TextureHandles in favour of sampling slices into the lone texture array[ref
site=docs.GL
page=glBindTexture
url=http://docs.gl/gl3/glBindTexture][ref
site="OpenGL Wiki"
page="Array Texture"
url=https://www.khronos.org/opengl/wiki/Array_Texture][ref
site=Khronos
page=glcorearb.h
url=https://www.khronos.org/registry/OpenGL/api/GL/glcorearb.h][ref
site="OpenGL Registry"
page="The OpenGL Shading Language 1.50 Quick Reference Card"
url=https://www.khronos.org/files/opengl-quick-reference-card.pdf]][:hardware :rendering]
[2:02:05][:Run it and hit an OpenGL error: "Invalid texture format"][:hardware :rendering]
[2:03:38][Prevent OpenGLInit() from allocating the WhiteBitmap, and make it specify a texture level of 1 when calling glTexStorage2D()][:hardware :rendering]
[2:04:29][:Run it to see it all working with the correct textures, except that they are not always correctly sized[ref
site="OpenGL 4 Reference Pages"
page=glGenerateMipmap
url=https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glGenerateMipmap.xhtml]][:"asset loading" :hardware :rendering]
[2:07:26][:Run [~hero Handmade Hero] to see that it has issues][:"asset loading" :hardware :rendering]
[2:08:09][Remove the WhiteBitmap from the renderer in favour of allowing the game itself to specify it if needed][:"asset loading"]
[2:11:56][Enable AllocateGameAssets() in [~hero Handmade Hero] to create its needed WhiteBitmap][:"asset loading"]
[2:15:24][Augment the renderer_texture with Width and Height for PushQuad() to use and so fix the texture sizing, renaming TextureHandle() to ReferToTexture()][:"asset loading"]
[2:27:13][Make PushQuad() scale the texture UVs based on their sizes][:rendering]
[2:29:14][:Run the Renderer Test and [~hero Handmade Hero] to see that the textures are correctly sized][:rendering]
[2:29:37][Define TEXTURE_ARRAY_DIM][:rendering]
[2:30:42][:Run it, doing texture arrays][:rendering]
[2:31:22][Q&A][:speech]
[2:31:46][@lkalinovcic][Q: Are there any benefits to using PBOs for async texture transfers these days? I've heard that drivers copy texture data into their internal :memory anyway, so the call returns immediately and behaves as if it were asynchronous][:hardware :rendering]
[2:34:27][@mmozeiko][Q: In GL Core Profile you need to generate all "names" (textures, buffers, FBO, ...). Only in Compatibility Profile are you allowed to provide your own "name" values][:hardware :rendering]
[2:34:49][@lkalinovcic][Q: Unrelated: What's the recommended way to deal with Windows doing an infinite WindowProc loop while moving or resizing a window? My current approach is to jump in and out of the event processing loop with fibers. Is there something less hacky?][:"platform layer"]
[2:36:14][@mmozeiko][Q: You would probably want to replace the glTexStorage3D() call with glTexImage3D(). TexStorage is from OpenGL 4.2 (or the ARB_texture_storage extension). TexStorage differs from TexImage in that its texture properties (size / format) are immutable: they cannot be changed once the texture is created. It may help :performance, but often it makes no difference][:hardware :rendering]
[2:36:58][Replace glTexStorage3D() with glTexImage3D()[ref
site=docs.GL
page=glTexImage3D
url=http://docs.gl/gl3/glTexImage3D]][:hardware :rendering]
[2:40:04][@vateferfout][Q: Do you still multiply the offset in the glDrawElementsBaseVertex() call? Because I never do it and never had any problem, so I don't understand what's going on[ref
site=docs.GL
page=glDrawElementsBaseVertex
url=http://docs.gl/gl2/glDrawElementsBaseVertex][ref
site=OpenGL
page="OpenGL 3.2 (Core Profile)"
url=https://www.khronos.org/registry/OpenGL/specs/gl/glspec32.core.pdf]][:hardware :rendering]
[2:44:19][@mmozeiko][Q: Please show glTexSubImage3D() call that fails][:hardware :rendering :run]
[2:47:38][@mmozeiko][Q: It needs to be GL_RGBA or whatever (the value does not matter, because it describes the last argument, the pixel data, which is NULL, so no data is specified)][:hardware :rendering]
[2:48:28][@mmozeiko][Q: "does not matter", as long as it is valid, nonzero][:hardware :rendering]
[2:49:49][@cemuzunlar][Q: I think glTexImage3D() is the failing call. Can you please check the call stack again at crash time?][:hardware :rendering]
[2:51:45][Wrap it up with the determination to find out why we can't use glTexImage3D() here in OpenGLInit(), and a plug of [~cmuratori Casey] and [~checker Chris Hecker]'s upcoming talks at Busan Indie Connect Festival 2018[ref
site="BIC Festival"
url=http://www.bicfest.org/en/]][:speech]
[/video]
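
The steps at 45:31 and 56:23 (filling an index buffer from PushQuad() and getting past the unsigned short index limit with glDrawElementsBaseVertex()) can be sketched on the CPU side roughly as follows. This is a minimal sketch under stated assumptions, not the episode's code: quad_batch, PushQuadIndices(), BuildQuadBatches() and the corner order within a quad are hypothetical names and choices made for illustration. Only the figures of 4 vertices per quad, 16-bit indices and the 65535 / 4 limit come from the annotations above.

```c
#include <assert.h>
#include <stdint.h>

#define VERTS_PER_QUAD 4
#define INDICES_PER_QUAD 6
/* Quads addressable with 16-bit indices, matching the 65535 / 4 figure above. */
#define MAX_QUADS_PER_BATCH (UINT16_MAX / VERTS_PER_QUAD)

/* Hypothetical record of one draw call's worth of quads. */
typedef struct quad_batch
{
    uint32_t BaseVertex;  /* added to every index in the batch at draw time */
    uint32_t IndexOffset; /* position of the batch's first index, in indices */
    uint32_t IndexCount;
} quad_batch;

/* Writes the six indices of one quad, relative to its batch's base vertex.
   Assumed corner order: 0 lower-left, 1 lower-right, 2 upper-left, 3 upper-right. */
static void PushQuadIndices(uint16_t *Dest, uint32_t QuadInBatch)
{
    assert(QuadInBatch < MAX_QUADS_PER_BATCH);
    uint16_t VI = (uint16_t)(QuadInBatch * VERTS_PER_QUAD);
    Dest[0] = (uint16_t)(VI + 0);
    Dest[1] = (uint16_t)(VI + 1);
    Dest[2] = (uint16_t)(VI + 2);
    Dest[3] = (uint16_t)(VI + 1);
    Dest[4] = (uint16_t)(VI + 3);
    Dest[5] = (uint16_t)(VI + 2);
}

/* Fills Indices for QuadCount quads and splits them into batches, each small
   enough for 16-bit indices. Returns the number of batches written. */
static uint32_t BuildQuadBatches(uint32_t QuadCount, uint16_t *Indices,
                                 quad_batch *Batches, uint32_t MaxBatchCount)
{
    uint32_t BatchCount = 0;
    uint32_t QuadsDone = 0;
    while(QuadsDone < QuadCount)
    {
        assert(BatchCount < MaxBatchCount);
        uint32_t InBatch = QuadCount - QuadsDone;
        if(InBatch > MAX_QUADS_PER_BATCH)
        {
            InBatch = MAX_QUADS_PER_BATCH;
        }

        quad_batch *Batch = Batches + BatchCount++;
        Batch->BaseVertex = QuadsDone * VERTS_PER_QUAD;
        Batch->IndexOffset = QuadsDone * INDICES_PER_QUAD;
        Batch->IndexCount = InBatch * INDICES_PER_QUAD;

        for(uint32_t Q = 0; Q < InBatch; ++Q)
        {
            PushQuadIndices(Indices + Batch->IndexOffset + Q * INDICES_PER_QUAD, Q);
        }
        QuadsDone += InBatch;
    }
    return BatchCount;
}
```

Each batch would then map to one glDrawElementsBaseVertex(GL_TRIANGLES, IndexCount, GL_UNSIGNED_SHORT, offset, BaseVertex) call. With an element array buffer bound, the offset argument is measured in bytes, so in this sketch it would be IndexOffset * sizeof(uint16_t), which is one place a multiply can appear in such a call.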
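
The winding fix at 50:46 and 51:25 comes down to the order in which a triangle's three vertices are listed. The sketch below is one way to check that order with a signed area test. It assumes back-face culling is enabled with OpenGL's default front face (GL_CCW) and a y-up coordinate system. The v2 type, both function names and the corner layout used in the usage note are hypothetical and are not taken from the episode.

```c
typedef struct v2
{
    float x;
    float y;
} v2;

/* Twice the signed area of triangle ABC. Positive when A, B, C run
   counter-clockwise in a y-up coordinate system, negative when clockwise. */
static float SignedArea2(v2 A, v2 B, v2 C)
{
    return (B.x - A.x) * (C.y - A.y) - (B.y - A.y) * (C.x - A.x);
}

/* Returns 1 when the triangle would count as front-facing under GL_CCW. */
static int IsCounterClockwise(v2 A, v2 B, v2 C)
{
    return SignedArea2(A, B, C) > 0.0f;
}
```

For a quad with corners 0 lower-left, 1 lower-right, 2 upper-left and 3 upper-right, the triangles (0, 1, 2) and (1, 3, 2) both pass this test, while (0, 2, 1) does not and would be culled under these assumptions.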
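
The texture sizing fix at 2:15:24 to 2:29:37 (giving renderer_texture a Width and Height, scaling the UVs in PushQuad() and defining TEXTURE_ARRAY_DIM) can be illustrated as below. This is a sketch under assumptions rather than the episode's code. The value 512 for TEXTURE_ARRAY_DIM, the Index field, the array_uv type and the ScaleUVToSlice() name are all hypothetical, and it assumes each bitmap is stored at the origin corner of a square slice.

```c
#include <assert.h>
#include <stdint.h>

#define TEXTURE_ARRAY_DIM 512 /* assumed; the real value is not given above */

typedef struct renderer_texture
{
    uint32_t Index;  /* hypothetical: which slice of the texture array */
    uint16_t Width;  /* size of the bitmap stored in that slice */
    uint16_t Height;
} renderer_texture;

/* Hypothetical result type: UV within the slice, plus the slice to sample. */
typedef struct array_uv
{
    float U;
    float V;
    float Slice;
} array_uv;

/* Maps a UV in [0, 1] over the bitmap to a UV over the whole slice, so that
   a bitmap smaller than the slice samples only the region it occupies. */
static array_uv ScaleUVToSlice(renderer_texture Texture, float U, float V)
{
    assert(Texture.Width <= TEXTURE_ARRAY_DIM);
    assert(Texture.Height <= TEXTURE_ARRAY_DIM);

    array_uv Result;
    Result.U = U * ((float)Texture.Width / (float)TEXTURE_ARRAY_DIM);
    Result.V = V * ((float)Texture.Height / (float)TEXTURE_ARRAY_DIM);
    Result.Slice = (float)Texture.Index;
    return Result;
}
```

Without this scaling, a UV of 1 would reach the edge of the slice rather than the edge of the bitmap, which is consistent with the textures appearing incorrectly sized at 2:04:29.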