[video output=day478 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Changing to Single Dispatch Per Pass (Part 2)" vod_platform=youtube id=0d0_NitChCY annotator=Miblo]
[0:00][Recap our switch to texture vertex indices][:hardware :rendering :speech]
[1:36][Augment game_render_commands with an IndexArray, with a few words on the GPU's ability to cache vertex shader transforms][:hardware :rendering]
[7:29][Remove renderer_texture_group and renderer_memory_layout, and pass our new IndexArray down the pipe, also augmenting open_gl with IndexArray and MaxIndexCount][:hardware :rendering]
[14:00][:Run the Renderer Test successfully][:hardware :rendering]
[14:12][Reduce the MaxQuadCountPerFrame in RenderTest() and the (grass) CoverIndex in PushSimpleScene(), with a few words on breaking the problem into steps][:hardware :rendering]
[16:30][:Run the Renderer Test with our single vertex buffer working][:hardware :rendering]
[16:51][Augment render_entry_textured_quads with IndexArrayOffset for OpenGLEndFrame() to use the index buffer passed in][:hardware :rendering]
[20:19][Change OpenGLEndFrame() to call glBufferData() once, instead of while processing every render command][:hardware :rendering]
[22:17][:Run it to see problems with the depth peeling][:hardware :rendering]
[22:44][Make OpenGLInit() generate a separate CompositeVertexBuffer and ScreenFillVertexBuffer[ref
site="OpenGL 4 Reference Pages"
page=glBufferData
url=https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glBufferData.xhtml]][:hardware :rendering]
[28:52][:Run it with the depth peel working correctly][:hardware :rendering]
[29:03][Revert the MaxQuadCountPerFrame in RenderTest() and the (grass) CoverIndex in PushSimpleScene() to their original high values][:hardware :rendering]
[29:55][:Run it to see that it is much zippier][:hardware :performance :rendering]
[30:34][Temporarily disable the :lighting][:rendering]
[31:15][:Run it to see that it didn't affect it much][:lighting :performance :rendering]
[31:51][Make RenderLoop() disable the :lighting][:rendering]
[33:17][:Run it and gauge the :performance with the :lighting disabled][:rendering]
[35:40][Turn off all the grass and augment open_gl with an IndexBuffer for OpenGLEndFrame() to use, rather than initialising Indices itself[ref
site="OpenGL 4 Reference Pages"
page=glBufferData
url=https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glBufferData.xhtml][ref
site=Khronos
page=glcorearb.h
url=https://www.khronos.org/registry/OpenGL/api/GL/glcorearb.h]][:hardware :rendering]
[45:31][Make PushQuad() fill our new IndexBuffer with the VertIndex][:hardware :rendering]
[49:08][:Run it without crashing, but also without seeing anything][:hardware :rendering]
[49:37][Fix OpenGLEndFrame() to pass GL_UNSIGNED_SHORT to glDrawElements()][:hardware :rendering]
[50:36][:Run it to see that our triangle must be wound wrong][:hardware :rendering]
[50:46][Triangle winding][:blackboard :hardware :rendering]
[51:25][Fix the vertex winding in PushQuad()][:hardware :rendering]
[51:29][:Run it to see that it looks about right][:hardware :rendering]
[51:38][Let OpenGLEndFrame() texture our quads with their bitmaps][:hardware :rendering]
[51:58][:Run it to see that it looks about right][:hardware :rendering]
[52:05][Switch OpenGLEndFrame() to perform one draw call per scene, using the white bitmap for each sprite][:hardware :rendering]
[53:21][:Run it to see that our :performance has improved][:hardware :rendering]
[54:56][Re-enable all the grass with a view to overcoming the unsigned short vertex buffer index limit][:hardware :rendering]
[55:40][:Run it and hit our VI == VertIndex assertion][:hardware :rendering]
[56:23][Enable GetCurrentQuads() and PushQuad() to handle more than 65535 / 4 quads using glDrawElementsBaseVertex()[ref
site=docs.GL
page=glDrawElementsBaseVertex
url=http://docs.gl/gl2/glDrawElementsBaseVertex][ref
site=Khronos
page=glcorearb.h
url=https://www.khronos.org/registry/OpenGL/api/GL/glcorearb.h]][:hardware :rendering]
[1:04:41][:Run it to see that it looks good][:hardware :rendering]
[1:05:29][Let OpenGLEndFrame() texture our quads with the grass bitmap][:hardware :rendering]
[1:06:06][:Run it to see that we now don't have any :performance spikes][:hardware :rendering]
[1:08:30][Begin to switch the renderer over to use texture arrays, first enabling the vertex shader to understand the notion of a vertex index][:hardware :rendering]
[1:18:44][:Run it successfully][:hardware :rendering]
[1:18:51][Switch OpenGLAllocateTexture() to specify a feed-forward texture array, calling glTexSubImage3D()[ref
site="OpenGL Wiki"
page="Array Texture"
url=https://www.khronos.org/opengl/wiki/Array_Texture][ref
site=docs.GL
page=glTexSubImage3D
url=http://docs.gl/gl3/glTexSubImage3D]][:hardware :memory :rendering]
[1:38:09][Switch [~hero Handmade Hero] over to use the renderer's new texture array][:"asset loading" :hardware :memory :rendering]
[1:45:01][:Run it and hit an OpenGL error: "Texture name does not refer to a texture object generated by OpenGL"][:hardware :rendering]
[1:46:08][Temporarily add a TextureHandles array to the open_gl struct][:hardware :rendering]
[1:48:42][:Run it to see our tree-texture scene][:hardware :rendering]
[1:49:00][Remove the TextureHandles in favour of sampling slices into the lone texture array[ref
site=docs.GL
page=glBindTexture
url=http://docs.gl/gl3/glBindTexture][ref
site="OpenGL Wiki"
page="Array Texture"
url=https://www.khronos.org/opengl/wiki/Array_Texture][ref
site=Khronos
page=glcorearb.h
url=https://www.khronos.org/registry/OpenGL/api/GL/glcorearb.h][ref
site="OpenGL Registry"
page="The OpenGL Shading Language 1.50 Quick Reference Card"
url=https://www.khronos.org/files/opengl-quick-reference-card.pdf]][:hardware :rendering]
[2:02:05][:Run it and hit an OpenGL error: "Invalid texture format"][:hardware :rendering]
[2:03:38][Prevent OpenGLInit() from allocating the WhiteBitmap, and make it specify a texture level of 1 when calling glTexStorage2D()][:hardware :rendering]
[2:04:29][:Run it to see it all working with the correct textures, just that they are not always correctly sized[ref
site="OpenGL 4 Reference Pages"
page=glGenerateMipmap
url=https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glGenerateMipmap.xhtml]][:"asset loading" :hardware :rendering]
[2:07:26][:Run [~hero Handmade Hero] to see that it has issues][:"asset loading" :hardware :rendering]
[2:08:09][Remove the WhiteBitmap from the renderer in favour of allowing the game itself to specify it if needed][:"asset loading"]
[2:11:56][Enable AllocateGameAssets() in [~hero Handmade Hero] to create its needed WhiteBitmap][:"asset loading"]
[2:15:24][Augment the renderer_texture with Width and Height for PushQuad() to use and so fix the texture sizing, renaming TextureHandle() to ReferToTexture()][:"asset loading"]
[2:27:13][Make PushQuad() scale the texture UVs based on their sizes][:rendering]
[2:29:14][:Run the Renderer Test and [~hero Handmade Hero] to see that the textures are correctly sized][:rendering]
[2:29:37][Define TEXTURE_ARRAY_DIM][:rendering]
[2:30:42][:Run it, doing texture arrays][:rendering]
[2:31:22][Q&A][:speech]
[2:31:46][@lkalinovcic][Q: Are there any benefits to using PBOs for async texture transfers these days? I've heard that drivers copy texture data into their internal :memory anyway, so the call returns immediately and behaves as if it were asynchronous][:hardware :rendering]
[2:34:27][@mmozeiko][Q: In GL Core Profile you need to generate all "names" (textures, buffers, FBO, ...). Only in Compatibility Profile you are allowed to provide your own "name" values][:hardware :rendering]
[2:34:49][@lkalinovcic][Q: Unrelated: What's the recommended way to deal with Windows doing an infinite WindowProc loop while moving or resizing a window? My current approach is to jump in and out of the event processing loop with fibers. Is there something less hacky?][:"platform layer"]
[2:36:14][@mmozeiko][Q: You probably would want to replace glTexStorage3D() call with glTexImage3D(). TexStorage is from OpenGL 4.2 version (or ARB_texture_storage extension). TexStorage differs from TexImage in that its texture properties (size / format) are immutable: they cannot be changed once the texture is created. It may help :performance but often it does not change][:hardware :rendering]
[2:36:58][Replace glTexStorage3D() with glTexImage3D()[ref
site=docs.GL
page=glTexImage3D
url=http://docs.gl/gl3/glTexImage3D]][:hardware :rendering]
[2:40:04][@vateferfout][Q: Do you still multiply the offset in the glDrawElementsBaseVertex() call? Because I never do it and never had any problem, so I don't understand what's going on[ref
site=docs.GL
page=glDrawElementsBaseVertex
url=http://docs.gl/gl2/glDrawElementsBaseVertex][ref
site=OpenGL
page="OpenGL 3.2 (Core Profile)"
url=https://www.khronos.org/registry/OpenGL/specs/gl/glspec32.core.pdf]][:hardware :rendering]
[2:44:19][@mmozeiko][Q: Please show glTexSubImage3D() call that fails][:hardware :rendering :run]
[2:47:38][@mmozeiko][Q: It needs to be GL_RGBA or whatever (value does not matter, because it is for last argument pixel data, which is NULL, so no data specified)][:hardware :rendering]
[2:48:28][@mmozeiko][Q: "does not matter", as long as it is valid, nonzero][:hardware :rendering]
[2:49:49][@cemuzunlar][Q: I think glTexImage3D() is the failing call. Can you please check the call stack again at crash time?][:hardware :rendering]
[2:51:45][Wrap it up with the determination to find out why we can't use glTexImage3D() here in OpenGLInit(), and a plug of [~cmuratori Casey] and [~checker Chris Hecker]'s upcoming talks at Busan Indie Connect Festival 2018[ref
site="BIC Festival"
url=http://www.bicfest.org/en/]][:speech]
[/video]