From 0f145a11ab0e73b0c5f2e857616f8e87c12443ad Mon Sep 17 00:00:00 2001
From: Matt Mascarenhas
Date: Fri, 20 Sep 2019 17:27:10 +0100
Subject: [PATCH] Index hero/code554

---
 cmuratori/hero/code/code554.hmml | 62 ++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)
 create mode 100644 cmuratori/hero/code/code554.hmml

diff --git a/cmuratori/hero/code/code554.hmml b/cmuratori/hero/code/code554.hmml
new file mode 100644
index 0000000..c567469
--- /dev/null
+++ b/cmuratori/hero/code/code554.hmml
@@ -0,0 +1,62 @@
+[video member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Reducing GPU Memory Footprint" vod_platform=youtube id=dl4QKPK8LMo annotator=Miblo]
+[0:02][Update ~Milton and ~RemedyBG, with thanks to Ameen Sayegh[ref
+    site=Twitter
+    page="Ameen Sayegh"
+    url=https://twitter.com/ameensayegh] for the ~Milton grid patch][:admin]
+[2:46][Try out ~Milton's grids][:blackboard]
+[6:18][~Milton or tablet driver issue: Smoothing][:blackboard]
+[8:18][Demo our :lighting :sampling sphere][:run]
+[9:29][Note the fast :lighting computation][:research]
+[12:35][8-Wide Light Probes[ref
+    author="Jordan Leigh"
+    title="3D Interactive Cube with Rotating Sides using CSS3 and JavaScript"
+    publisher=Codepen
+    url=https://codepen.io/jordizle/pen/haIdo/]][:blackboard :geometry :lighting :sampling]
+[22:15][Plan to work on both the pixel shader and light propagation][:lighting :speech]
+[24:06][See stutter in our frame rate][:performance :run]
+[25:55][We've got an interloping [@Molly puss]][:speech]
+[27:18][Consider that we're using too much GPU :memory][:run]
+[28:42][Calculate our GPU :memory requirements][:admin]
+[31:51][Crash Nsight upon launching [~hero Handmade Hero]][:run]
+[33:35][Add a new Nsight project for [~hero Handmade Hero]][:admin]
+[35:13][Crash Nsight upon launching [~hero Handmade Hero]][:run]
+[36:36][Install RenderDoc[ref
+    site=RenderDoc
+    url=https://renderdoc.org] and configure it for [~hero Handmade Hero]][:admin]
+[40:26][Crash RenderDoc upon launching [~hero Handmade Hero]][:run]
+[41:36][cloc [~hero Handmade Hero]: 32,936 lines][:admin]
+[43:24][Remove the light_buffer array from open_gl][:hardware :memory]
+[47:41][Find that there is no change][:hardware :memory :run]
+[47:51][Track the frame buffer :memory usage in FreeFramebuffer() and CreateFramebuffer()]
+[57:41][Find that we may get a faster frame rate with V-Sync disabled][:performance :run]
+[58:13][Add a "Renderer" DEBUG_DATA_BLOCK in WinMainCRTStartup() for the framebuffer and texture :memory][:"debug system" :memory]
+[1:04:58][Enable the :"debug system" to handle umm type]
+[1:07:57][Find that we have a TotalFramebufferMemory of 3GB][:memory :run]
+[1:08:22][Add UsedMultisampleCount to the "Renderer" DEBUG_DATA_BLOCK in WinMainCRTStartup()][:"debug system"]
+[1:10:48][Find that our UsedMultisampleCount is 16, and calculate the :memory requirements as 2.47GB][:run]
+[1:13:14][Consider :memory usage improvements: Only store one depth buffer][:speech]
+[1:18:59][Consider :memory usage improvements: Streamline the colour buffer][:speech]
+[1:23:02][Consider :rendering solid cubes into a single multisampling buffer with depth peeling disabled, resolve and composite this with our alpha items][:memory :speech]
+[1:30:15][Consider depth peeling in only two buffers][:memory :speech]
+[1:36:26][Spot our DepthPeelResolveBuffer, and make OpenGLEndFrame() use a single depth peel buffer][:memory :rendering]
+[1:39:21][Find that we render just fine in one DepthPeelBuffer][:rendering :memory :run]
+[1:40:11][Shrink open_gl down to only contain one DepthPeelBuffer][:memory :rendering]
+[1:42:15][Find that we render correctly and have a TotalFramebufferMemory of 2GB][:memory :run]
+[1:43:16][Step in to CreateFramebuffer() and watch the :memory usage][:rendering :run]
+[1:46:00][Add a new Nsight project for [~hero Handmade Hero] using the correct Working Directory][:admin]
+[1:46:51][:Run successfully in Nsight][:owl]
+[1:48:07][Continue to step through CreateFramebuffer() and spot that we're multiplying the MaxMultiSampleCount into the GPUMemoryUsed even when multisampling is disabled][:owl :run]
+[1:48:53][Fix CreateFramebuffer() to multiply the correct SampleCount into the GPUMemoryUsed][:memory :owl]
+[1:49:44][Step through CreateFramebuffer() to see that our GPUMemoryUsed is actually okay][:memory :owl :rendering :run]
+[1:55:55][Find that we have a TotalFramebufferMemory of 325MB][:memory :run]
+[1:57:46][Nsight rendering time: 14ms / frame][:memory :rendering :run]
+[1:58:38][Permit NVIDIA GPU performance counters[ref
+    site="NVIDIA Developer"
+    page="NVIDIA Development Tools Solutions - ERR_NVGPUCTRPERM: Nsight Graphics Permission issue with Performance Counters"
+    url=https://developer.nvidia.com/nvidia-development-tools-solutions-err-nvgpuctrperm-nsight-graphics]][:admin]
+[2:01:09][Capture for Live Analysis in Nsight, and look for a combined :memory count][:rendering :run]
+[2:07:39][Nsight: Range Profiler View][:performance :rendering :run]
+[2:14:57][Reflect on our peel buffer :memory usage reduction, and plan to pseudo-simulate the light probes][:lighting :rendering :speech]
+[2:16:29][Try disabling V-Sync and multisampling, to find that the latter fails][:rendering :run]
+[2:17:17][Plan to fix the multisampling read][:lighting :rendering :speech]
+[/video]