diff --git a/cmuratori/hero/code/code555.hmml b/cmuratori/hero/code/code555.hmml
new file mode 100644
index 0000000..bc63188
--- /dev/null
+++ b/cmuratori/hero/code/code555.hmml
@@ -0,0 +1,131 @@
+[video member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Looking for GPU Performance Issues" vod_platform=youtube id=fduWZsh1riQ annotator=Miblo]
+[0:02][Welcome to the stream][:speech]
+[0:14][A few words on RenderDoc's crash message yesterday, with praise for their tech support, and plans to enable the game to fail gracefully when launched with incorrect parameters][:speech]
+[3:48][Launch [~hero Handmade Hero] in RenderDoc][:run]
+[4:13][Set the wrong Working Directory in RenderDoc][:admin]
+[4:34][Crash RenderDoc upon launching [~hero Handmade Hero]][:run]
+[6:12][Launch [~hero Handmade Hero] in ~RemedyBG with the Working Directory set wrong][:run]
+[7:15][Hit our assertion in GetFontInfo()][:"asset system" :run]
+[7:39][Make GetFontInfo() additionally assert that the Asset's Type is HHAAsset_None][:"asset system"]
+[8:34][Hit our TextureIndex assertion in PushQuad()][:"asset system" :run]
+[9:55][Enable all our asset Get*() functions to handle the absence of assets][:"asset system"]
+[16:43][:Run successfully with our incorrect Working Directory]
+[17:13][Enable AllocateGameAssets() to issue a warning notification when no assets were available][:"error handling"]
+[23:17][Crash ~RemedyBG apparently on a jump-to-zero][:run]
+[23:58][Create jump_to_zero_crash.cpp]
+[25:07][:Run jump_to_zero to find that ~RemedyBG is fine with it]
+[25:48][:Run [~hero Handmade Hero] successfully with our new warning code][:"error handling"]
+[26:38][Introduce Win32ErrorMessage()][:"error handling" :"platform layer"]
+[28:31][:Run and close cleanly][:"error handling" :"platform layer"]
+[28:44][Make Win32ErrorMessage() print out an error message[ref
+    site=MSDN
+    page="MessageBoxExA function"
+    url=https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-messageboxexa]][:"error handling" :"platform layer"]
+[34:16][:Run and close with our warning box][:"error handling" :"platform layer"]
+[34:39][Fix the MBoxType in Win32ErrorMessage()[ref
+    site=MSDN
+    page="MessageBox function"
+    url=https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-messagebox]]
+[35:46][:Run and close with our warning box][:"error handling" :"platform layer"]
+[35:51][Temporarily try making AllocateGameAssets() produce a Fatal error][:"error handling"]
+[36:17][:Run and close with our error box][:"error handling" :"platform layer"]
+[36:29][Make WinMainCRTStartup() emit errors using Win32ErrorMessage()][:"error handling" :"platform layer"]
+[38:58][:Run and close with our warning box][:"error handling" :"platform layer"]
+[39:06][:Run unsuccessfully with the correct Working Directory][:"error handling" :"platform layer"]
+[41:16][We've got a [@Molly saucy bean]][:speech]
+[42:12][Fix GetBitmap() to correctly set the TextureHandle][:"asset system"]
+[43:45][:Run successfully][:"asset system"]
+[44:02][:Run in RenderDoc with the wrong Working Directory]
+[44:25][Crash RenderDoc post-exit][:run]
+[45:40][:Run in RenderDoc with the correct Working Directory, noting that we sometimes miss 60 FPS][:performance]
+[46:54][Capture a frame in RenderDoc and see that it took 35590 µs][:run]
+[48:48][Look at our four Colour Passes, and plan to submit vertices economically][:rendering :run]
+[51:37][Make OpenGLInit() disable RequestVSync][:rendering]
+[52:40][Find that our frame time hovers around 14 ms per frame][:performance :rendering :run]
+[53:27][Reacquaint ourselves with our render dispatch in OpenGLEndFrame()][:rendering :research]
+[55:31][Understanding glMapBuffer()[ref
+    site=docs.GL
+    page=glMapBuffer
+    url=http://docs.gl/gl3/glMapBuffer]][:api :rendering :research]
+[57:57][Nsight rendering time: 14ms to 18ms / frame][:rendering :run]
+[59:37][Scrub through Events to see that glDrawArrays takes 1.27ms][:rendering :run]
+[1:00:35][Scrutinise ResolveMultisample() with a view to speeding it up][:rendering :research]
+[1:05:00][Make CompileResolveMultisample() bake the SampleCount in to the shader, to hopefully permit the loop to be unrolled][:hardware :rendering]
+[1:08:03][Find that it actually works][:hardware :rendering :run]
+[1:08:19][Make CompileResolveMultisample() bake InvSampleCount in to the shader, and slightly reorganise it][:hardware :rendering]
+[1:12:35][Find that that made no difference to the frame time, and that UpdateAndRenderEntities() takes a while][:hardware :performance :rendering :run]
+[1:14:04][Temporarily Disable DrawGroundCover()][:rendering]
+[1:14:37][Find that Game Update takes less of the total time, but we are not hitting 60 FPS][:performance :rendering :run]
+[1:17:11][Disable HANDMADE_SLOW]
+[1:18:23][Find that Debug Collation takes a lot of the total time][:"debug system" :performance :run]
+[1:18:46][Compile out some of the :"debug system" if HANDMADE_SLOW is off]
+[1:20:43][See the debug UI][:"debug system" :run]
+[1:21:07][Instead compile out that part of the :"debug system" if HANDMADE_SLOW is on, and rearrange the code to fix compile errors]
+[1:23:33][Find that lots of the :"debug system" is absent][:run]
+[1:24:06][Compile out timing if HANDMADE_SLOW is off][:"debug system"]
+[1:25:05][Still see jerkiness with debug collation off][:performance :run]
+[1:26:09][Compile in our frame marker in all situations][:"debug system"]
+[1:26:45][See that our frame time is well below 16ms, but we are not actually seeing 60 FPS][:performance :run]
+[1:27:22][Nsight rendering time: 10ms to 14ms / frame][:rendering :run]
+[1:29:07][Capture a frame in Nsight to see that glDrawArrays remains by far the most expensive call][:performance :rendering :run]
+[1:33:47][Hard set SampleCount to 1 in CompileResolveMultisample()][:rendering]
+[1:34:44][Nsight rendering time: 4ms / frame][:rendering :run]
+[1:35:03][Capture a frame in Nsight to find that the resolve and draw calls take similar times][:performance :rendering :run]
+[1:36:09][Understanding multisampling][:performance :rendering :speech]
+[1:37:47][See our crispy lines, without multisampling][:rendering :run]
+[1:38:31][:Research multisampling in GLSL[ref
+    site="OpenGL Registry"
+    page="The OpenGL ES Shading Language"
+    url=https://khronos.org/registry/OpenGL/specs/es/3.0/GLSL_ES_Specification_3.00.pdf][ref
+    site="Khronos Wiki"
+    page="Multisampling"
+    url=https://www.khronos.org/opengl/wiki/Multisampling][ref
+    site=NVIDIA
+    page="Deferred Shading MSAA Sample"
+    url=http://gameworksdocs.nvidia.com/GraphicsSamples/DeferredShadingMSAASample.htm][ref
+    site="OpenGL 4 Reference Pages"
+    page="texelFetch"
+    url=https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/texelFetch.xhtml][ref
+    author="Johan Andersson"
+    title="DirectX 11 Rendering in Battlefield 3"
+    url=http://www.dice.se/wp-content/uploads/2014/12/GDC11_DX11inBF3_Public.pdf]][:rendering]
+[1:45:42][Spec out our desired fast path in CompileResolveMultisample(), based on there being only one sample in a multisample][:rendering]
+[1:48:00][See nothing][:rendering :run]
+[1:48:04][Revert CompileResolveMultisample()][:rendering]
+[1:50:08][See everything][:rendering :run]
+[1:50:10][Spec out our desired fast path in CompileResolveMultisample() again][:rendering]
+[1:51:40][See nothing][:rendering :run]
+[1:51:52][Cut out the else and the if(1) in CompileResolveMultisample()][:rendering]
+[1:52:41][Find that the if(1) was the problem, somehow][:rendering :run]
+[1:53:08][Reinsert the if(1) in CompileResolveMultisample()][:rendering]
+[1:53:30][Capture a frame in Nsight, but still see no error message][:rendering :run]
+[1:55:24][Enable HANDMADE_SLOW]
+[1:55:46][Trigger a fragment shader error: implicit cast from "int" to "bool"][:rendering :run]
+[1:56:21][Change the if(1) to if(true) in CompileResolveMultisample()][:rendering]
+[1:57:20][See our standard output][:rendering :run]
+[1:57:29][Trigger our non-multisampled fast path in CompileResolveMultisample()][:rendering]
+[1:57:34][See our crispy edges][:rendering :run]
+[1:57:47][Make a note to ask GPU people about our fast path]
+[1:58:28][Q&A][:speech]
+[1:59:13][@nxsy][Q: textureSamples(sampler)?[ref
+    site="OpenGL 4 Reference Pages"
+    page="textureSamples"
+    url=https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/textureSamples.xhtml]][:rendering]
+[2:00:27][@Brian][Q: You can have Windows automatically capture crash dumps for you. Check if this key exists: HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows\\Windows Error Reporting\\LocalDumps and, if so, it will automatically capture dumps to %LOCALAPPDATA%\\CrashDumps. You can read more[ref
+    site="Windows Dev Center"
+    page="Collecting User-Mode Dumps"
+    url=https://docs.microsoft.com/en-us/windows/win32/wer/collecting-user-mode-dumps]]
+[2:03:47][@aaronnickovich][Q: The Watch page[ref
+    site="Handmade Hero"
+    page="Watch"
+    url=https://handmadehero.org/watch] has no more scheduled appearances. When will you be streaming next?]
+[2:04:14][@blazeitfury][Q: You should switch to Linux][:"operating system"]
+[2:05:41][@rationalcoder][Q: Someone had asked whether or not that would optimize anything since the GPU has to execute both branches][:language]
+[2:07:34][@bulmanator][Q: I’m not sure if you were joking when you said to ask, but if you have the time and feel like it I would 100% be down to see lectures from you about the rest of the :animation system like you did with skinning]
+[2:09:26][@philliptrudeau][Q: I think that putting it under "Chat" made it less clickable. Also, flashy thumbnails are really important]
+[2:11:20][@blazeitfury][Q: My girlfriend watches and tells you most of the things you said about Linux are old school issues][:"operating system"]
+[2:16:13][@aaronnickovich][I have an Arch Linux machine. Its dependencies are now broken and will take weeks to get working again][:"operating system"]
+[2:18:37][@ivereadthesequel][Q: [@cmuratori Casey] please dispel the myth that @molly123 and I are the same person. @rupan3 has a crazy conspiracy going on I can't shake]
+[2:19:01][@pythno][Q: For what reason do you like Linux if it breaks all the time?][:"operating system"]
+[2:23:12][Wrap it up][:speech]
+[/video]