[video member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Reducing GPU Memory Footprint" vod_platform=youtube id=dl4QKPK8LMo annotator=Miblo]
[0:02][Update ~Milton and ~RemedyBG, with thanks to Ameen Sayegh[ref site=Twitter page="Ameen Sayegh" url=https://twitter.com/ameensayegh] for the ~Milton grid patch][:admin]
[2:46][Try out ~Milton's grids][:blackboard]
[6:18][~Milton or tablet driver issue: Smoothing][:blackboard]
[8:18][Demo our :lighting :sampling sphere][:run]
[9:29][Note the fast :lighting computation][:research]
[12:35][8-Wide Light Probes[ref author="Jordan Leigh" title="3D Interactive Cube with Rotating Sides using CSS3 and JavaScript" publisher=Codepen url=https://codepen.io/jordizle/pen/haIdo/]][:blackboard :geometry :lighting :sampling]
[22:15][Plan to work on both the pixel shader and light propagation][:lighting :speech]
[24:06][See stutter in our frame rate][:performance :run]
[25:55][We've got an interloping [@Molly puss]][:speech]
[27:18][Consider that we're using too much GPU :memory][:run]
[28:42][Calculate our GPU :memory requirements][:admin]
[31:51][Crash Nsight upon launching [~hero Handmade Hero]][:run]
[33:35][Add a new Nsight project for [~hero Handmade Hero]][:admin]
[35:13][Crash Nsight upon launching [~hero Handmade Hero]][:run]
[36:36][Install RenderDoc[ref site=RenderDoc url=https://renderdoc.org] and configure it for [~hero Handmade Hero]][:admin]
[40:26][Crash RenderDoc upon launching [~hero Handmade Hero]][:run]
[41:36][cloc [~hero Handmade Hero]: 32,936 lines][:admin]
[43:24][Remove the light_buffer array from open_gl][:hardware :memory]
[47:41][Find that there is no change][:hardware :memory :run]
[47:51][Track the frame buffer :memory usage in FreeFramebuffer() and CreateFramebuffer()]
[57:41][Find that we may get a faster frame rate with V-Sync disabled][:performance :run]
[58:13][Add a "Renderer" DEBUG_DATA_BLOCK in WinMainCRTStartup() for the framebuffer and texture :memory][:"debug system" :memory]
[1:04:58][Enable the :"debug system" to handle umm type]
[1:07:57][Find that we have a TotalFramebufferMemory of 3GB][:memory :run]
[1:08:22][Add UsedMultisampleCount to the "Renderer" DEBUG_DATA_BLOCK in WinMainCRTStartup()][:"debug system"]
[1:10:48][Find that our UsedMultisampleCount is 16, and calculate the :memory requirements as 2.47GB][:run]
[1:13:14][Consider :memory usage improvements: Only store one depth buffer][:speech]
[1:18:59][Consider :memory usage improvements: Streamline the colour buffer][:speech]
[1:23:02][Consider :rendering solid cubes into a single multisampling buffer with depth peeling disabled, resolve and composite this with our alpha items][:memory :speech]
[1:30:15][Consider depth peeling in only two buffers][:memory :speech]
[1:36:26][Spot our DepthPeelResolveBuffer, and make OpenGLEndFrame() use a single depth peel buffer][:memory :rendering]
[1:39:21][Find that we render just fine in one DepthPeelBuffer][:rendering :memory :run]
[1:40:11][Shrink open_gl down to only contain one DepthPeelBuffer][:memory :rendering]
[1:42:15][Find that we render correctly and have a TotalFramebufferMemory of 2GB][:memory :run]
[1:43:16][Step in to CreateFramebuffer() and watch the :memory usage][:rendering :run]
[1:46:00][Add a new Nsight project for [~hero Handmade Hero] using the correct Working Directory][:admin]
[1:46:51][:Run successfully in Nsight][:owl]
[1:48:07][Continue to step through CreateFramebuffer() and spot that we're multiplying the MaxMultiSampleCount into the GPUMemoryUsed even when multisampling is disabled][:owl :run]
[1:48:53][Fix CreateFramebuffer() to multiply the correct SampleCount into the GPUMemoryUsed][:memory :owl]
[1:49:44][Step through CreateFramebuffer() to see that our GPUMemoryUsed is actually okay][:memory :owl :rendering :run]
[1:55:55][Find that we have a TotalFramebufferMemory of 325MB][:memory :run]
[1:57:46][Nsight rendering time: 14ms / frame][:memory :rendering :run]
[1:58:38][Permit NVIDIA GPU performance counters[ref site="NVIDIA Developer" page="NVIDIA Development Tools Solutions - ERR_NVGPUCTRPERM: Nsight Graphics Permission issue with Performance Counters" url=https://developer.nvidia.com/nvidia-development-tools-solutions-err-nvgpuctrperm-nsight-graphics]][:admin]
[2:01:09][Capture for Live Analysis in Nsight, and look for a combined :memory count][:rendering :run]
[2:07:39][Nsight: Range Profiler View][:performance :rendering :run]
[2:14:57][Reflect on our peel buffer :memory usage reduction, and plan to pseudo-simulate the light probes][:lighting :rendering :speech]
[2:16:29][Try disabling V-Sync and multisampling, to find that the latter fails][:rendering :run]
[2:17:17][Plan to fix the multisampling read][:lighting :rendering :speech]
[/video]