[video output=day555 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Looking for GPU Performance Issues" vod_platform=youtube id=fduWZsh1riQ annotator=Miblo]
[0:02][Welcome to the stream][:speech]
[0:14][A few words on RenderDoc's crash message yesterday, with praise for their tech support, and plans to enable the game to fail gracefully when launched with incorrect parameters][:speech]
[3:48][Launch [~hero Handmade Hero] in RenderDoc][:run]
[4:13][Set the wrong Working Directory in RenderDoc][:admin]
[4:34][Crash RenderDoc upon launching [~hero Handmade Hero]][:run]
[6:12][Launch [~hero Handmade Hero] in ~RemedyBG with the Working Directory set wrong][:run]
[7:15][Hit our assertion in GetFontInfo()][:"asset system" :run]
[7:39][Make GetFontInfo() additionally assert that the Asset's Type is HHAAsset_None][:"asset system"]
[8:34][Hit our TextureIndex assertion in PushQuad()][:"asset system" :run]
[9:55][Enable all our asset Get*() functions to handle the absence of assets][:"asset system"]
[16:43][:Run successfully with our incorrect Working Directory]
[17:13][Enable AllocateGameAssets() to issue a warning notification when no assets were available][:"error handling"]
[23:17][Crash ~RemedyBG apparently on a jump-to-zero][:run]
[23:58][Create jump_to_zero_crash.cpp]
[25:07][:Run jump_to_zero to find that ~RemedyBG is fine with it]
[25:48][:Run [~hero Handmade Hero] successfully with our new warning code][:"error handling"]
[26:38][Introduce Win32ErrorMessage()][:"error handling" :"platform layer"]
[28:31][:Run and close cleanly][:"error handling" :"platform layer"]
[28:44][Make Win32ErrorMessage() print out an error message[ref
site=MSDN
page="MessageBoxExA function"
url=https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-messageboxexa]][:"error handling" :"platform layer"]
[34:16][:Run and close with our warning box][:"error handling" :"platform layer"]
[34:39][Fix the MBoxType in Win32ErrorMessage()[ref
site=MSDN
page="MessageBox function"
url=https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-messagebox]]
[35:46][:Run and close with our warning box][:"error handling" :"platform layer"]
[35:51][Temporarily try making AllocateGameAssets() produce a Fatal error][:"error handling"]
[36:17][:Run and close with our error box][:"error handling" :"platform layer"]
[36:29][Make WinMainCRTStartup() emit errors using Win32ErrorMessage()][:"error handling" :"platform layer"]
[38:58][:Run and close with our warning box][:"error handling" :"platform layer"]
[39:06][:Run unsuccessfully with the correct Working Directory][:"error handling" :"platform layer"]
[41:16][We've got a [@Molly saucy bean]][:speech]
[42:12][Fix GetBitmap() to correctly set the TextureHandle][:"asset system"]
[43:45][:Run successfully][:"asset system"]
[44:02][:Run in RenderDoc with the wrong Working Directory]
[44:25][Crash RenderDoc post-exit][:run]
[45:40][:Run in RenderDoc with the correct Working Directory, noting that we sometimes miss 60 FPS][:performance]
[46:54][Capture a frame in RenderDoc and see that it took 35590 µs][:run]
[48:48][Look at our four Colour Passes, and plan to submit vertices economically][:rendering :run]
[51:37][Make OpenGLInit() disable RequestVSync][:rendering]
[52:40][Find that our frame time hovers around 14 ms per frame][:performance :rendering :run]
[53:27][Reacquaint ourselves with our render dispatch in OpenGLEndFrame()][:rendering :research]
[55:31][Understanding glMapBuffer()[ref
site=docs.GL
page=glMapBuffer
url=http://docs.gl/gl3/glMapBuffer]][:api :rendering :research]
[57:57][Nsight rendering time: 14ms to 18ms / frame][:rendering :run]
[59:37][Scrub through Events to see that glDrawArrays takes 1.27ms][:rendering :run]
[1:00:35][Scrutinise ResolveMultisample() with a view to speeding it up][:rendering :research]
[1:05:00][Make CompileResolveMultisample() bake the SampleCount in to the shader, to hopefully permit the loop to be unrolled][:hardware :rendering]
[1:08:03][Find that it actually works][:hardware :rendering :run]
[1:08:19][Make CompileResolveMultisample() bake InvSampleCount in to the shader, and slightly reorganise it][:hardware :rendering]
[1:12:35][Find that that made no difference to the frame time, and that UpdateAndRenderEntities() takes a while][:hardware :performance :rendering :run]
[1:14:04][Temporarily disable DrawGroundCover()][:rendering]
[1:14:37][Find that Game Update takes less of the total time, but we are not hitting 60 FPS][:performance :rendering :run]
[1:17:11][Disable HANDMADE_SLOW]
[1:18:23][Find that Debug Collation takes a lot of the total time][:"debug system" :performance :run]
[1:18:46][Compile out some of the :"debug system" if HANDMADE_SLOW is off]
[1:20:43][See the debug UI][:"debug system" :run]
[1:21:07][Instead compile out that part of the :"debug system" if HANDMADE_SLOW is on, and rearrange the code to fix compile errors]
[1:23:33][Find that lots of the :"debug system" is absent][:run]
[1:24:06][Compile out timing if HANDMADE_SLOW is off][:"debug system"]
[1:25:05][Still see jerkiness with debug collation off][:performance :run]
[1:26:09][Compile in our frame marker in all situations][:"debug system"]
[1:26:45][See that our frame time is well below 16ms, but we are not actually seeing 60 FPS][:performance :run]
[1:27:22][Nsight rendering time: 10ms to 14ms / frame][:rendering :run]
[1:29:07][Capture a frame in Nsight to see that glDrawArrays remains by far the most expensive call][:performance :rendering :run]
[1:33:47][Hard set SampleCount to 1 in CompileResolveMultisample()][:rendering]
[1:34:44][Nsight rendering time: 4ms / frame][:rendering :run]
[1:35:03][Capture a frame in Nsight to find that the resolve and draw calls take similar times][:performance :rendering :run]
[1:36:09][Understanding multisampling][:performance :rendering :speech]
[1:37:47][See our crispy lines, without multisampling][:rendering :run]
[1:38:31][:Research multisampling in GLSL[ref
site="OpenGL Registry"
page="The OpenGL ES Shading Language"
url=https://khronos.org/registry/OpenGL/specs/es/3.0/GLSL_ES_Specification_3.00.pdf][ref
site="Khronos Wiki"
page="Multisampling"
url=https://www.khronos.org/opengl/wiki/Multisampling][ref
site=NVIDIA
page="Deferred Shading MSAA Sample"
url=http://gameworksdocs.nvidia.com/GraphicsSamples/DeferredShadingMSAASample.htm][ref
site="OpenGL 4 Reference Pages"
page="texelFetch"
url=https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/texelFetch.xhtml][ref
author="Johan Andersson"
title="DirectX 11 Rendering in Battlefield 3"
url=http://www.dice.se/wp-content/uploads/2014/12/GDC11_DX11inBF3_Public.pdf]][:rendering]
[1:45:42][Spec out our desired fast path in CompileResolveMultisample(), based on there being only one sample in a multisample][:rendering]
[1:48:00][See nothing][:rendering :run]
[1:48:04][Revert CompileResolveMultisample()][:rendering]
[1:50:08][See everything][:rendering :run]
[1:50:10][Spec out our desired fast path in CompileResolveMultisample() again][:rendering]
[1:51:40][See nothing][:rendering :run]
[1:51:52][Cut out the else and the if(1) in CompileResolveMultisample()][:rendering]
[1:52:41][Find that the if(1) was the problem, somehow][:rendering :run]
[1:53:08][Reinsert the if(1) in CompileResolveMultisample()][:rendering]
[1:53:30][Capture a frame in Nsight, but still see no error message][:rendering :run]
[1:55:24][Enable HANDMADE_SLOW]
[1:55:46][Trigger a fragment shader error: implicit cast from "int" to "bool"][:rendering :run]
[1:56:21][Change the if(1) to if(true) in CompileResolveMultisample()][:rendering]
[1:57:20][See our standard output][:rendering :run]
[1:57:29][Trigger our non-multisampled fast path in CompileResolveMultisample()][:rendering]
[1:57:34][See our crispy edges][:rendering :run]
[1:57:47][Make a note to ask GPU people about our fast path]
[1:58:28][Q&A][:speech]
[1:59:13][@nxsy][Q: textureSamples(sampler)?[ref
site="OpenGL 4 Reference Pages"
page="textureSamples"
url=https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/textureSamples.xhtml]][:rendering]
[2:00:27][@Brian][Q: You can have Windows automatically capture crash dumps for you. Check if this key exists: HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows\\Windows Error Reporting\\LocalDumps and, if so, it will automatically capture dumps to %LOCALAPPDATA%\\CrashDumps. You can read more[ref
site="Windows Dev Center"
page="Collecting User-Mode Dumps"
url=https://docs.microsoft.com/en-us/windows/win32/wer/collecting-user-mode-dumps]]
[2:03:47][@aaronnickovich][Q: The Watch page[ref
site="Handmade Hero"
page="Watch"
url=https://handmadehero.org/watch] has no more scheduled appearances. When will you be streaming next?]
[2:04:14][@blazeitfury][Q: You should switch to Linux][:"operating system"]
[2:05:41][@rationalcoder][Q: Someone had asked whether or not that would optimize anything since the GPU has to execute both branches][:language]
[2:07:34][@bulmanator][Q: I'm not sure if you were joking when you said to ask, but if you have the time and feel like it I would 100% be down to see lectures from you about the rest of the :animation system like you did with skinning]
[2:09:26][@philliptrudeau][Q: I think that putting it under "Chat" made it less clickable. Also, flashy thumbnails are really important]
[2:11:20][@blazeitfury][Q: My girlfriend watches and tells you most of the things you said about Linux are old school issues][:"operating system"]
[2:16:13][@aaronnickovich][I have an Arch Linux machine. Its dependencies are now broken and will take weeks to get working again][:"operating system"]
[2:18:37][@ivereadthesequel][Q: [@cmuratori Casey] please dispel the myth that @molly123 and I are the same person. @rupan3 has a crazy conspiracy going on I can't shake]
[2:19:01][@pythno][Q: For what reason do you like Linux if it breaks all the time?][:"operating system"]
[2:23:12][Wrap it up][:speech]
[/video]