[video member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Basic Dynamic Quad Output Optimizations" vod_platform=youtube id=4RQ8fMyN4Tw annotator=Miblo] [0:00][Plug Molly Rocket's Discord channel[ref site=Discord page=MollyRocket url=https://discord.gg/mollyrocket]][:speech] [2:03][Recap our multisampling improvements][:speech] [4:11][A few words on deferred :rendering][:speech] [7:10][Lessening the bandwidth requirements of the :lighting][:rendering :speech] [8:38][Determine to optimise our ground cover output][:speech] [9:39][Check out our time per frame: 10ms][:performance :run] [11:16][Reacquaint ourselves with DrawGroundCover()][:"entity system" :research] [12:17][Enable DrawGroundCover()][:"entity system"] [12:24][RenderGroundCover took 0.3%, and now takes 17% of our frame time][:"entity system" :performance :rendering :run] [13:35][Hit our MaxVertexCount assertion][:rendering :run] [14:05][Increase MaxQuadCountPerFrame fourfold][:rendering] [14:52][Again consult the Frames breakdown and hit our MaxVertexCount assertion][:rendering :run] [16:22][Remove entity timing blocks][:"debug system" :"entity system"] [17:12][Again consult the Frames breakdown and hit our MaxVertexCount assertion][:rendering :run] [18:50][Investigate why we are hitting the MaxVertexCount][:rendering :run] [20:24][Stop GetBestMatchAssetFrom() being a TIMED_FUNCTION][:"debug system" :"asset system"] [21:40][Consult the Frames breakdown without hitting our MaxVertexCount assertion][:performance :rendering :run] [22:26][Rather than timing UpdateAndRenderEntities() entirely, just time DrawGroundCover][:"debug system" :"entity system"] [24:18][DrawGroundCover takes 26% of our frame time][:"entity system" :performance :rendering :run] [24:52][Capture a frame to see that the GPU is not struggling to render our ground cover][:performance :rendering :run] [26:33][Plan to time DrawGroundCover(), and maybe cache PushQuad()][:caching :"debug system" :"entity 
system" :research] [29:34][Time the "Computation" code in DrawGroundCover()][:"debug system" :"entity system"] [30:40][DrawGroundCover takes 4% and the Computation takes 19% of our frame time][:"entity system" :performance :run] [31:31][Endeavour to speed up our DrawGroundCover Computation, using :caching][:"debug system" :"entity system" :research] [36:32][Rework ground_cover to facilitate faster ground cover generation][:"data structure" :"entity system"] [39:55][Enable DrawGroundCover() to operate on cached quads, calling OutputQuads()][:caching :"entity system"] [45:41][The freeness of the CPU moving a value into a destination + offset vs incrementing the destination itself][:hardware :language :performance :speech] [50:58][Make DrawGroundCover() call explicitly indexed VertexOut() and IndexOut()][:caching :"entity system"] [53:37][Make DrawGroundCover() acquire the TextureIndex, and relieve it of getting the bitmap info][:caching :"entity system"] [56:24][Make FillUnpackedEntity() get the bitmap info, introducing v4 and v3 versions of FinalizeColor()][:caching :"entity system"] [1:01:54][Clean up PushSprite()][:"entity system"] [1:02:15][Consider making PushSprite() work with pre-calculated values][:"entity system" :research] [1:05:11][Make FillUnpackedEntity() compute the ground cover's P and UV values previously done by PushSprite() and PushQuad(), introducing GetUVScaleForBitmap(), GetUVScaleForRegularTexture() and GetUVScaleForSpecialTexture()][:"entity system"] [1:18:27][Consider why SpriteValuesForUpright() problematically takes a RenderGroup][:"entity system" :research] [1:20:21][Introduce a version of SpriteValuesForUpright() that takes a WorldUp, CameraUp and XAxisH, instead of a RenderGroup, for FillUnpackedEntity() to call][:"entity system"] [1:26:00][Introduce indexed_vertex_output, OutputQuads(), VertexOut(), QuadIndexOut() and a version of Advance that takes an indexed_vertex_output][:"entity system"] [1:47:30][Hit our assertion in the 
SafeTruncateToU16() call in OutputQuads()][:"entity system" :run] [1:48:09][Let OutputQuads() early-out if the VertexCount and IndexCount are both 0][:"entity system"] [1:48:52][Again hit our assertion in the SafeTruncateToU16() call in OutputQuads(), and investigate why][:"entity system" :run] [1:52:44][Understanding the batch-fitting check in GetCurrentQuads()][:"entity system" :research] [1:53:56][Step in to OutputQuads() to see a QuadCount of 7670, which suggests we should have sunk fewer than 65752 vertices][:"entity system" :run] [1:54:51][Scrutinise the indexed_vertex_output versions of Advance(), and OutputQuads() and DrawGroundCover()][:"entity system" :research] [1:57:33][Fix DrawGroundCover() to Advance() by 1 quad][:"entity system"] [1:57:49][Again hit our assertion in the SafeTruncateToU16() call in OutputQuads(), and investigate why][:"entity system" :run] [1:59:29][Step through OutputQuads() successfully for a while, concluding that it and GetCurrentQuads() may not be agreeing][:"entity system" :run] [2:01:43][Fix OutputQuads() to increment the QuadCount][:"entity system"] [2:03:06][Find that we are running okay, but not seeing the expected ground cover][:"entity system" :run] [2:03:28][Step in to DrawGroundCover() and inspect its values][:"entity system" :run] [2:05:38][Fix FillUnpackedEntity() to set the ground cover's correct P values][:"entity system"] [2:06:28][See our ground cover, just with an incorrect camera up vector][:"entity system" :run] [2:06:52][DrawGroundCover takes 12% of our frame time][:"entity system" :performance :rendering :run] [2:07:56][Step in to UpdateAndRenderWorld() to get the camera's up vector][:camera :run] [2:09:06][Fix the DEFAULT_CAMERA_UP][:camera] [2:09:31][See that our ground cover is sheared][:"entity system" :run] [2:10:03][Try to fix the winding in QuadIndexOut()][:"entity system"] [2:10:41][See that our ground cover remains wrong][:"entity system" :run] [2:10:51][Scrutinise QuadIndexOut(), and wonder if a P 
value is wrong][:"entity system" :research] [2:12:08][Fix FillUnpackedEntity() to set the ground cover's correct P\[3\]][:"entity system"] [2:12:16][Admire our more performant ground cover][:"entity system" :run] [2:13:25][Q&A[ref site=Discord page=MollyRocket url=https://discord.gg/mollyrocket]][:speech] [2:14:00][@x13pixels][Q: Should OutputQuads() (line \~1078) use a '&&' rather than '||' to check that both vertex / index counts are nonzero?][:"entity system"] [2:15:11][@vaualbus][Q: So next step to optimize that is to do instancing :rendering for the ground cover?] [2:17:47][@pragma][Q: Maybe it would be easier to do the upright sprite calculation in the vertex shader instead?] [2:18:43][@Ryan][Q: Speaking of getting the compiler to output the right instructions, I have been toying with a lower level 'language' (assembly without the friction really). Would love to hear your thoughts (super simple example)[ref site=pastebin page=ML2TLeAf url=https://pastebin.com/ML2TLeAf]][:asm :language] [2:23:33][@frostyNinja][Q: Do you ever feel like the verbosity of passing around an often-used pointer like game_assets in [~hero Handmade Hero] is too much hassle or do you find it's more beneficial in the long run over a global?][:language] [2:26:27][@ssd][Q: I think you might have missed @pragma's question above? Or maybe I missed the answer?] [2:26:32][@platin21][Q: How would you go about breakable surfaces, not like in a voxel style?[ref author=TeranGroup title="A Thermomechanical Material Point Method for Baking and Cooking" publisher=YouTube url=https://www.youtube.com/watch?v=iBpolaB4DqA]][:geometry] [2:32:47][@simpalaxy][Q: Early in the series, you said you wanted to try to make your own math functions. Have you gotten around to doing that yet?][:mathematics] [2:33:49][Wrap it up with a plug of Molly Rocket's Discord channel[ref site=Discord page=MollyRocket url=https://discord.gg/mollyrocket] and glimpse into the future of :lighting][:speech] [/video]