From 7540421ed273e20fbd0e1a51fc1a0772693a48ba Mon Sep 17 00:00:00 2001
From: Matt Mascarenhas
Date: Tue, 4 Sep 2018 22:49:51 +0100
Subject: [PATCH] Annotate hero/code478

---
 cmuratori/hero/code/code478.hmml | 117 ++++++++++++++++++++++++++++++-
 1 file changed, 115 insertions(+), 2 deletions(-)

diff --git a/cmuratori/hero/code/code478.hmml b/cmuratori/hero/code/code478.hmml
index 0ee37ef..1e923c4 100644
--- a/cmuratori/hero/code/code478.hmml
+++ b/cmuratori/hero/code/code478.hmml
@@ -1,2 +1,115 @@
-[video member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Changing to Single Dispatch Per Pass (Part 2)" vod_platform=youtube id=0d0_NitChCY annotator=]
-[/video]
\ No newline at end of file
+[video member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Changing to Single Dispatch Per Pass (Part 2)" vod_platform=youtube id=0d0_NitChCY annotator=Miblo]
+[0:00][Recap our switch to texture vertex indices][:hardware :rendering :speech]
+[1:36][Augment game_render_commands with an IndexArray, with a few words on the GPU's ability to cache vertex shader transforms][:hardware :rendering]
+[7:29][Remove renderer_texture_group and renderer_memory_layout, and pass our new IndexArray down the pipe, also augmenting open_gl with IndexArray and MaxIndexCount][:hardware :rendering]
+[14:00][:Run the Renderer Test successfully][:hardware :rendering]
+[14:12][Reduce the MaxQuadCountPerFrame in RenderTest() and the (grass) CoverIndex in PushSimpleScene(), with a few words on breaking the problem into steps][:hardware :rendering]
+[16:30][:Run the Renderer Test with our single vertex buffer working][:hardware :rendering]
+[16:51][Augment render_entry_textured_quads with IndexArrayOffset for OpenGLEndFrame() to use the index buffer passed in][:hardware :rendering]
+[20:19][Change OpenGLEndFrame() to call glBufferData() once, instead of while processing every render command][:hardware :rendering]
+[22:17][:Run it to see problems with the depth peeling][:hardware :rendering]
+[22:44][Make OpenGLInit() generate a separate CompositeVertexBuffer and ScreenFillVertexBuffer[ref
+    site="OpenGL 4 Reference Pages"
+    page=glBufferData
+    url=https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glBufferData.xhtml]][:hardware :rendering]
+[28:52][:Run it with the depth peel working correctly][:hardware :rendering]
+[29:03][Revert the MaxQuadCountPerFrame in RenderTest() and the (grass) CoverIndex in PushSimpleScene() to their original high values][:hardware :rendering]
+[29:55][:Run it to see that it is much zippier][:hardware :performance :rendering]
+[30:34][Temporarily disable the :lighting][:rendering]
+[31:15][:Run it to see that it didn't affect it much][:lighting :performance :rendering]
+[31:51][Make RenderLoop() disable the :lighting][:rendering]
+[33:17][:Run it and gauge the :performance with the :lighting disabled][:rendering]
+[35:40][Turn off all the grass and augment open_gl with an IndexBuffer for OpenGLEndFrame() to use, rather than initialising Indices itself[ref
+    site="OpenGL 4 Reference Pages"
+    page=glBufferData
+    url=https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glBufferData.xhtml][ref
+    site=Khronos
+    page=glcorearb.h
+    url=https://www.khronos.org/registry/OpenGL/api/GL/glcorearb.h]][:hardware :rendering]
+[45:31][Make PushQuad() fill our new IndexBuffer with the VertIndex][:hardware :rendering]
+[49:08][:Run it without crashing, but also without seeing anything][:hardware :rendering]
+[49:37][Fix OpenGLEndFrame() to pass GL_UNSIGNED_SHORT to glDrawElements()][:hardware :rendering]
+[50:36][:Run it to see that our triangle must be wound wrong][:hardware :rendering]
+[50:46][Triangle winding][:blackboard :hardware :rendering]
+[51:25][Fix the vertex winding in PushQuad()][:hardware :rendering]
+[51:29][:Run it to see that it looks about right][:hardware :rendering]
+[51:38][Let OpenGLEndFrame() texture our quads with their bitmaps][:hardware :rendering]
+[51:58][:Run it to see that it looks about right][:hardware :rendering]
+[52:05][Switch OpenGLEndFrame() to perform one draw call per scene, using the white bitmap for each sprite][:hardware :rendering]
+[53:21][:Run it to see that our :performance has improved][:hardware :rendering]
+[54:56][Re-enable all the grass with a view to overcoming the unsigned short vertex buffer index limit][:hardware :rendering]
+[55:40][:Run it and hit our VI == VertIndex assertion][:hardware :rendering]
+[56:23][Enable GetCurrentQuads() and PushQuad() to handle more than 65535 / 4 quads using glDrawElementsBaseVertex()[ref
+    site=docs.GL
+    page=glDrawElementsBaseVertex
+    url=http://docs.gl/gl2/glDrawElementsBaseVertex][ref
+    site=Khronos
+    page=glcorearb.h
+    url=https://www.khronos.org/registry/OpenGL/api/GL/glcorearb.h]][:hardware :rendering]
+[1:04:41][:Run it to see that it looks good][:hardware :rendering]
+[1:05:29][Let OpenGLEndFrame() texture our quads with the grass bitmap][:hardware :rendering]
+[1:06:06][:Run it to see that we now don't have any :performance spikes][:hardware :rendering]
+[1:08:30][Begin to switch the renderer over to use texture arrays, first enabling the vertex shader to understand the notion of a vertex index][:hardware :rendering]
+[1:18:44][:Run it successfully][:hardware :rendering]
+[1:18:51][Switch OpenGLAllocateTexture() to specify a feed-forward texture array, calling glTexSubImage3D()[ref
+    site="OpenGL Wiki"
+    page="Array Texture"
+    url=https://www.khronos.org/opengl/wiki/Array_Texture][ref
+    site=docs.GL
+    page=glTexSubImage3D
+    url=http://docs.gl/gl3/glTexSubImage3D]][:hardware :memory :rendering]
+[1:38:09][Switch [~hero Handmade Hero] over to use the renderer's new texture array][:"asset loading" :hardware :memory :rendering]
+[1:45:01][:Run it and hit an OpenGL error: "Texture name does not refer to a texture object generated by OpenGL"][:hardware :rendering]
+[1:46:08][Temporarily add a TextureHandles array to the open_gl struct][:hardware :rendering]
+[1:48:42][:Run it to see our tree-texture scene][:hardware :rendering]
+[1:49:00][Remove the TextureHandles in favour of sampling slices into the lone texture array[ref
+    site=docs.GL
+    page=glBindTexture
+    url=http://docs.gl/gl3/glBindTexture][ref
+    site="OpenGL Wiki"
+    page="Array Texture"
+    url=https://www.khronos.org/opengl/wiki/Array_Texture][ref
+    site=Khronos
+    page=glcorearb.h
+    url=https://www.khronos.org/registry/OpenGL/api/GL/glcorearb.h][ref
+    site="OpenGL Registry"
+    page="The OpenGL Shading Language 1.50 Quick Reference Card"
+    url=https://www.khronos.org/files/opengl-quick-reference-card.pdf]][:hardware :rendering]
+[2:02:05][:Run it and hit an OpenGL error: "Invalid texture format"][:hardware :rendering]
+[2:03:38][Prevent OpenGLInit() from allocating the WhiteBitmap, and make it specify a texture level of 1 when calling glTexStorage2D()][:hardware :rendering]
+[2:04:29][:Run it to see it all working with the correct textures, just that they are not always correctly sized[ref
+    site="OpenGL 4 Reference Pages"
+    page=glGenerateMipmap
+    url=https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glGenerateMipmap.xhtml]][:"asset loading" :hardware :rendering]
+[2:07:26][:Run [~hero Handmade Hero] to see that it has issues][:"asset loading" :hardware :rendering]
+[2:08:09][Remove the WhiteBitmap from the renderer in favour of allowing the game itself to specify it if needed][:"asset loading"]
+[2:11:56][Enable AllocateGameAssets() in [~hero Handmade Hero] to create its needed WhiteBitmap][:"asset loading"]
+[2:15:24][Augment the renderer_texture with Width and Height for PushQuad() to use and so fix the texture sizing, renaming TextureHandle() to ReferToTexture()][:"asset loading"]
+[2:27:13][Make PushQuad() scale the texture UVs based on their sizes][:rendering]
+[2:29:14][:Run the Renderer Test and [~hero Handmade Hero] to see that the textures are correctly sized][:rendering]
+[2:29:37][Define TEXTURE_ARRAY_DIM][:rendering]
+[2:30:42][:Run it, doing texture arrays][:rendering]
+[2:31:22][Q&A][:speech]
+[2:31:46][@lkalinovcic][Q: Are there any benefits to using PBOs for async texture transfers these days? I've heard that drivers copy texture data into their internal :memory anyway, so the call returns immediately and behaves as if it were asynchronous][:hardware :rendering]
+[2:34:27][@mmozeiko][Q: In GL Core Profile you need to generate all "names" (textures, buffers, FBO, ...). Only in Compatibility Profile you are allowed to provide your own "name" values][:hardware :rendering]
+[2:34:49][@lkalinovcic][Q: Unrelated: What's the recommended way to deal with Windows doing an infinite WindowProc loop while moving or resizing a window? My current approach is to jump in and out of the event processing loop with fibers. Is there something less hacky?][:"platform layer"]
+[2:36:14][@mmozeiko][Q: You probably would want to replace glTexStorage3D() call with glTexImage3D(). TexStorage is from OpenGL 4.2 version (or ARB_texture_storage extension). TexStorage differs from TexImage in a way that its texture properties (size / format) are immutable – it cannot be changed once texture is created. It may help :performance but often it does not change][:hardware :rendering]
+[2:36:58][Replace glTexStorage3D() with glTexImage3D()[ref
+    site=docs.GL
+    page=glTexImage3D
+    url=http://docs.gl/gl3/glTexImage3D]][:hardware :rendering]
+[2:40:04][@vateferfout][Q: Do you still multiply the offset in the glDrawElementsBaseVertex() call? Because I never do it and never had any problem, so I don't understand what's going on[ref
+    site=docs.GL
+    page=glDrawElementsBaseVertex
+    url=http://docs.gl/gl2/glDrawElementsBaseVertex][ref
+    site=OpenGL
+    page="OpenGL 3.2 (Core Profile)"
+    url=https://www.khronos.org/registry/OpenGL/specs/gl/glspec32.core.pdf]][:hardware :rendering]
+[2:44:19][@mmozeiko][Q: Please show glTexSubImage3D() call that fails][:hardware :rendering :run]
+[2:47:38][@mmozeiko][Q: It needs to be GL_RGBA or whatever (value does not matter, because it is for last argument pixel data, which is NULL, so no data specified)][:hardware :rendering]
+[2:48:28][@mmozeiko][Q: "does not matter", as long as it is valid, nonzero][:hardware :rendering]
+[2:49:49][@cemuzunlar][Q: I think glTexImage3D() is the failing call. Can you please check the call stack again at crash time?][:hardware :rendering]
+[2:51:45][Wrap it up with the determination to find out why we can't use glTexImage3D() here in OpenGLInit(), and a plug of [~cmuratori Casey] and [~checker Chris Hecker]'s upcoming talks at Busan Indie Connect Festival 2018[ref
+    site="BIC Festival"
+    url=http://www.bicfest.org/en/]][:speech]
+[/video]