From d4c07f7a3327353853426b99bf57b94385f127dd Mon Sep 17 00:00:00 2001
From: Matt Mascarenhas
Date: Tue, 24 Mar 2020 21:57:07 +0000
Subject: [PATCH] Index hero/code587

---
 cmuratori/hero/code/code587.hmml | 87 ++++++++++++++++++++++++++++++++
 1 file changed, 87 insertions(+)
 create mode 100644 cmuratori/hero/code/code587.hmml

diff --git a/cmuratori/hero/code/code587.hmml b/cmuratori/hero/code/code587.hmml
new file mode 100644
index 0000000..ab56a28
--- /dev/null
+++ b/cmuratori/hero/code/code587.hmml
@@ -0,0 +1,87 @@
+[video member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Optimizing the Specular to Diffuse Transform" vod_platform=youtube id=J0Z4rdTYM0Y annotator=Miblo]
+[0:03][Demo the current state and :performance of our :lighting][:run]
+[1:37][Reacquaint ourselves with the :lighting's blend-over-time parameter in EndLightingComputation()][:research]
+[3:02][Demo the fast-response :lighting blend][:run]
+[3:11][Decrease tUpdateBlend from 10/60 to 1/60][:lighting]
+[3:14][Check out the slower-response, but noiseless :lighting blend][:run]
+[3:36][Increase tUpdateBlend from 1/60 to 5/60][:lighting]
+[3:50][Check out the usable-response, but flickery :lighting blend][:run]
+[4:21][Decrease tUpdateBlend from 5/60 to 2/60][:lighting]
+[4:22][Check out the slower-response, but less flickery :lighting blend][:run]
+[4:39][Increase tUpdateBlend from 2/60 to 8/60][:lighting]
+[4:45][Check out the faster-response, but noisy :lighting blend][:run]
+[5:22][Notice light buildup in the dungeon][:lighting :run]
+[5:56][Check that light buildup in the dungeon, possibly due to the voxel switch][:lighting :run]
+[6:58][Determine to gauge the :performance of our specular–diffuse transform][:lighting :speech]
+[7:39][Consider shrinking the :lighting lookup voxel in Z][:run]
+[9:13][Comment out LIGHT_LOOKUP_VOXEL_DIM, and respecify ComputeLightPropagationWork() and EndLightingComputation() to operate in X-slices][:lighting]
+[14:52][Define MAX_LIGHT_LOOKUP_VOXEL_DIM for InitLighting() to use][:lighting]
+[17:16][Replace mentions of LIGHT_LOOKUP_VOXEL_DIM in ComputeLightPropagationWork()][:lighting]
+[20:35][Replace mentions of LIGHT_LOOKUP_VOXEL_DIM in CompileZBiasProgram()][:lighting]
+[27:34][Reintroduce LIGHT_LOOKUP_VOXEL_DIM for Win32InitOpenGL() to use][:lighting]
+[27:56][Get the same thing we saw before][:lighting :run]
+[28:10][Split out LIGHT_LOOKUP_VOXEL_DIM to all three dimensions for Win32InitOpenGL() to use][:lighting]
+[28:28][Check out our cubic :lighting lookup voxel][:run]
+[28:41][Decrease the LIGHT_LOOKUP_VOXEL_DIM_Z from 32 to 16][:lighting]
+[28:49][Check out our squatter, faster :lighting lookup voxel][:run]
+[29:34][Increase the LIGHT_LOOKUP_VOXEL_DIM_Z from 16 to 32][:lighting]
+[29:43][125ms per frame, with a 32×32×32 voxel][:lighting :performance :run]
+[29:55][Decrease the LIGHT_LOOKUP_VOXEL_DIM_Z from 32 to 16][:lighting]
+[30:04][75ms per frame, with a 32×32×16 voxel][:lighting :performance :run]
+[31:02][24% frame time spent in ComputeLightPropagationWork()][:lighting :performance :run]
+[31:17][Disable the specular–diffuse transform in ComputeLightPropagationWork()][:lighting]
+[31:45][65ms per frame with 3% frame time spent in ComputeLightPropagationWork(), without the specular–diffuse transform][:lighting :performance :run]
+[32:18][Prepare to optimise the specular–diffuse transform][:lighting :research]
+[33:36][Re-enable the specular–diffuse transform in ComputeLightPropagationWork()][:lighting]
+[33:52][25% frame time spent in ComputeLightPropagationWork()][:lighting :performance :run]
+[33:59][Disable the specular–diffuse transform in ComputeLightPropagationWork()][:lighting]
+[34:03][Hit assertion in DEBUGGetArenaByLookupBlock()][:"debug system" :run]
+[34:33][3% frame time spent in ComputeLightPropagationWork(), without the specular–diffuse transform][:lighting :performance :run]
+[35:14][Enable ComputeLightPropagationWork() to count up the zero weights][:lighting :optimisation]
+[38:01][Step in to ComputeLightPropagationWork() to find a ZeroWCount of 196][:lighting :optimisation :run]
+[39:22][Consider our potential for optimising ComputeLightPropagationWork()][:optimisation :research]
+[40:09][Inspect the assembly of the specular–diffuse transform in ComputeLightPropagationWork()][:asm :lighting :run]
+[41:54][Define LIGHT_ATLAS_ASSERT()][:lighting]
+[43:19][Inspect the assembly of the specular–diffuse transform in ComputeLightPropagationWork()][:asm :lighting :run]
+[43:33][Disable multithreading of the :lighting, wondering if ~RemedyBG supports step-single-thread][:threading]
+[44:43][Inspect the assembly of the specular–diffuse transform in ComputeLightPropagationWork()][:asm :lighting :run]
+[49:35][Optimise ComputeLightPropagationWork() to load and shuffle a row at once, introducing LoadF32_4X() and Broadcast4x()][:lighting :optimisation :simd]
+[1:15:35][Inspect the assembly of the specular–diffuse transform in ComputeLightPropagationWork()][:asm :lighting :run]
+[1:16:28][Re-enable multithreading of the :lighting][:threading]
+[1:16:46][11% frame time spent in ComputeLightPropagationWork(), but with chromatic aberration][:lighting :performance :run]
+[1:17:32][Double-check the specular–diffuse transform][:lighting :research]
+[1:20:21][Fix ComputeLightPropagationWork() to load the specular texels in strides of 4, rather than 12][:lighting :optimisation :simd]
+[1:20:52][Admire our correct and faster :lighting][:performance :run]
+[1:21:51][Consider our potential for optimising the specular–diffuse transform: Separable blur[ref
+ site=Desmos
+ page="Untitled Graph"
+ url=https://desmos.com/calculator][ref
+ site=Wikipedia
+ page="Gaussian function"
+ url=https://en.wikipedia.org/wiki/Gaussian_function][ref
+ site=Wikipedia
+ page="Raised-cosine filter"
+ url=https://en.wikipedia.org/wiki/Raised-cosine_filter]][:lighting :optimisation :research]
+[1:31:00][Check out our :lighting][:run]
+[1:31:10][Decrease the light transmission rate from 0.975 to 0.75 in BuildDiffuseLightMaps()][:lighting]
+[1:31:23][More readily see our darker light map viewer][:lighting :run]
+[1:32:28][Set up ComputeLightPropagationWork() to perform the specular–diffuse transform as a separable filter][:lighting :optimisation]
+[1:48:04][Check out our :lighting][:run]
+[1:48:20][Q&A][:speech]
+[1:49:15][@lucid_frost][Q: Are there any :caching concerns? I'm not familiar with how much data is being pushed around here][:lighting :performance]
+[1:52:16][@philliptrudeau][Q: This scene has a little bit of variance in the :lighting between frames. Is there a way to set up this solution so that the scene looks more "static", without taking a significant :performance hit?]
+[1:53:35][@somebody_took_my_name][Q: The light seems to be repeating outside of the light box (before the rewrite). Is it still there and, if so, is it a modulus issue?][:lighting]
+[1:54:53][@mattiamanzati][Q: You mentioned something about shaders API being better at this kind of job. I lost your point on that because of me being unfamiliar with the environment. Can you please explain that a little bit more?][:hardware :lighting]
+[1:59:38][@czapa10][Q: You often say that there should be some high level :language feature which allows you to write :SIMD code easier. Can you tell how this feature would exactly look like? Do you mean something like [@naysayer88 Jon Blow] has in Jai (fast SOA, AOS switching)? Can't you do this feature yourself using :metaprogramming?]
+[2:01:05][@vtlmks][Intel Intrinsics Guide[ref
+ site=Intel
+ page="Intel Intrinsics Guide"
+ url=https://software.intel.com/sites/landingpage/IntrinsicsGuide/] is broken, it seems]
+[2:01:45][@i_am_seabass][He's got it cached]
+[2:01:56][@czapa10][You can't specify architecture]
+[2:02:25][Plug uops[ref
+ site=uops.info
+ url=https://uops.info/table.html]][:research]
+[2:03:47][Admire the :lighting][:run]
+[2:04:22][Close it on up][:speech]
+[/video]