cinera_handmade.network/cmuratori/hero/code/code554.hmml

[video output=day554 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Reducing GPU Memory Footprint" vod_platform=youtube id=dl4QKPK8LMo annotator=Miblo]
2019-09-20 16:27:10 +00:00
[0:02][Update ~Milton and ~RemedyBG, with thanks to Ameen Sayegh[ref
site=Twitter
page="Ameen Sayegh"
url=https://twitter.com/ameensayegh] for the ~Milton grid patch][:admin]
[2:46][Try out ~Milton's grids][:blackboard]
[6:18][~Milton or tablet driver issue: Smoothing][:blackboard]
[8:18][Demo our :lighting :sampling sphere][:run]
[9:29][Note the fast :lighting computation][:research]
[12:35][8-Wide Light Probes[ref
author="Jordan Leigh"
title="3D Interactive Cube with Rotating Sides using CSS3 and JavaScript"
publisher=Codepen
url=https://codepen.io/jordizle/pen/haIdo/]][:blackboard :geometry :lighting :sampling]
[22:15][Plan to work on both the pixel shader and light propagation][:lighting :speech]
[24:06][See stutter in our frame rate][:performance :run]
[25:55][We've got an interloping [@Molly puss]][:speech]
[27:18][Consider that we're using too much GPU :memory][:run]
[28:42][Calculate our GPU :memory requirements][:admin]
[31:51][Crash Nsight upon launching [~hero Handmade Hero]][:run]
[33:35][Add a new Nsight project for [~hero Handmade Hero]][:admin]
[35:13][Crash Nsight upon launching [~hero Handmade Hero]][:run]
[36:36][Install RenderDoc[ref
site=RenderDoc
url=https://renderdoc.org] and configure it for [~hero Handmade Hero]][:admin]
[40:26][Crash RenderDoc upon launching [~hero Handmade Hero]][:run]
[41:36][cloc [~hero Handmade Hero]: 32,936 lines][:admin]
[43:24][Remove the light_buffer array from open_gl][:hardware :memory]
[47:41][Find that there is no change][:hardware :memory :run]
[47:51][Track the frame buffer :memory usage in FreeFramebuffer() and CreateFramebuffer()]
[57:41][Find that we may get a faster frame rate with V-Sync disabled][:performance :run]
[58:13][Add a "Renderer" DEBUG_DATA_BLOCK in WinMainCRTStartup() for the framebuffer and texture :memory][:"debug system" :memory]
[1:04:58][Enable the :"debug system" to handle umm type]
[1:07:57][Find that we have a TotalFramebufferMemory of 3GB][:memory :run]
[1:08:22][Add UsedMultisampleCount to the "Renderer" DEBUG_DATA_BLOCK in WinMainCRTStartup()][:"debug system"]
[1:10:48][Find that our UsedMultisampleCount is 16, and calculate the :memory requirements as 2.47GB][:run]
[1:13:14][Consider :memory usage improvements: Only store one depth buffer][:speech]
[1:18:59][Consider :memory usage improvements: Streamline the colour buffer][:speech]
[1:23:02][Consider :rendering solid cubes into a single multisampling buffer with depth peeling disabled, then resolving and compositing this with our alpha items][:memory :speech]
[1:30:15][Consider depth peeling in only two buffers][:memory :speech]
[1:36:26][Spot our DepthPeelResolveBuffer, and make OpenGLEndFrame() use a single depth peel buffer][:memory :rendering]
[1:39:21][Find that we render just fine in one DepthPeelBuffer][:rendering :memory :run]
[1:40:11][Shrink open_gl down to only contain one DepthPeelBuffer][:memory :rendering]
[1:42:15][Find that we render correctly and have a TotalFramebufferMemory of 2GB][:memory :run]
[1:43:16][Step into CreateFramebuffer() and watch the :memory usage][:rendering :run]
[1:46:00][Add a new Nsight project for [~hero Handmade Hero] using the correct Working Directory][:admin]
[1:46:51][:Run successfully in Nsight][:owl]
[1:48:07][Continue to step through CreateFramebuffer() and spot that we're multiplying the MaxMultiSampleCount into the GPUMemoryUsed even when multisampling is disabled][:owl :run]
[1:48:53][Fix CreateFramebuffer() to multiply the correct SampleCount into the GPUMemoryUsed][:memory :owl]
[1:49:44][Step through CreateFramebuffer() to see that our GPUMemoryUsed is actually okay][:memory :owl :rendering :run]
[1:55:55][Find that we have a TotalFramebufferMemory of 325MB][:memory :run]
[1:57:46][Nsight rendering time: 14ms / frame][:memory :rendering :run]
[1:58:38][Permit NVIDIA GPU performance counters[ref
site="NVIDIA Developer"
page="NVIDIA Development Tools Solutions - ERR_NVGPUCTRPERM: Nsight Graphics Permission issue with Performance Counters"
url=https://developer.nvidia.com/nvidia-development-tools-solutions-err-nvgpuctrperm-nsight-graphics]][:admin]
[2:01:09][Capture for Live Analysis in Nsight, and look for a combined :memory count][:rendering :run]
[2:07:39][Nsight: Range Profiler View][:performance :rendering :run]
[2:14:57][Reflect on our peel buffer :memory usage reduction, and plan to pseudo-simulate the light probes][:lighting :rendering :speech]
[2:16:29][Try disabling V-Sync and multisampling, to find that the latter fails][:rendering :run]
[2:17:17][Plan to fix the multisampling read][:lighting :rendering :speech]
[/video]
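
The accounting bug noted at 1:48:07 and fixed at 1:48:53 can be illustrated with a minimal sketch. This is not the Handmade Hero source: the function name `framebuffer_bytes`, its parameters, and the resolution and pixel size used in the usage note are all assumptions chosen for illustration. The only point taken from the annotations is that the sample count multiplied into the memory estimate should be the maximum multisample count only when the framebuffer is actually multisampled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the Handmade Hero codebase): estimate the
   GPU memory taken by one framebuffer attachment.

   The bug described in the annotations amounts to always multiplying by
   max_sample_count. The corrected accounting uses a sample count of 1
   whenever multisampling is disabled for this framebuffer. */
static uint64_t framebuffer_bytes(uint32_t width, uint32_t height,
                                  uint32_t bytes_per_pixel,
                                  int multisampled,
                                  uint32_t max_sample_count)
{
    assert(max_sample_count >= 1);

    /* Pick the sample count that applies to this framebuffer. */
    uint32_t sample_count = multisampled ? max_sample_count : 1;

    /* Widen before multiplying so large buffers do not overflow 32 bits. */
    return (uint64_t)width * height * bytes_per_pixel * sample_count;
}
```

Under these assumed numbers, a 1920x1080 attachment at 4 bytes per pixel comes to 8,294,400 bytes when single-sampled and 132,710,400 bytes at 16 samples, so counting every buffer as multisampled inflates the reported total by a factor of 16 for each non-multisampled buffer.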