[video output=day587 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Optimizing the Specular to Diffuse Transform" vod_platform=youtube id=J0Z4rdTYM0Y annotator=Miblo]
[0:03][Demo the current state and :performance of our :lighting][:run]
[1:37][Reacquaint ourselves with the :lighting's blend-over-time parameter in EndLightingComputation()][:research]
[3:02][Demo the fast-response :lighting blend][:run]
[3:11][Decrease tUpdateBlend from 10/60 to 1/60][:lighting]
[3:14][Check out the slower-response, but noiseless :lighting blend][:run]
[3:36][Increase tUpdateBlend from 1/60 to 5/60][:lighting]
[3:50][Check out the usable-response, but flickery :lighting blend][:run]
[4:21][Decrease tUpdateBlend from 5/60 to 2/60][:lighting]
[4:22][Check out the slower-response, but less flickery :lighting blend][:run]
[4:39][Increase tUpdateBlend from 2/60 to 8/60][:lighting]
[4:45][Check out the faster-response, but noisy :lighting blend][:run]
[5:22][Notice light buildup in the dungeon][:lighting :run]
[5:56][Check that light buildup in the dungeon, possibly due to the voxel switch][:lighting :run]
[6:58][Determine to gauge the :performance of our specular-to-diffuse transform][:lighting :speech]
[7:39][Consider shrinking the :lighting lookup voxel in Z][:run]
[9:13][Comment out LIGHT_LOOKUP_VOXEL_DIM, and respecify ComputeLightPropagationWork() and EndLightingComputation() to operate in X-slices][:lighting]
[14:52][Define MAX_LIGHT_LOOKUP_VOXEL_DIM for InitLighting() to use][:lighting]
[17:16][Replace mentions of LIGHT_LOOKUP_VOXEL_DIM in ComputeLightPropagationWork()][:lighting]
[20:35][Replace mentions of LIGHT_LOOKUP_VOXEL_DIM in CompileZBiasProgram()][:lighting]
[27:34][Reintroduce LIGHT_LOOKUP_VOXEL_DIM for Win32InitOpenGL() to use][:lighting]
[27:56][Get the same thing we saw before][:lighting :run]
[28:10][Split out LIGHT_LOOKUP_VOXEL_DIM to all three dimensions for Win32InitOpenGL() to use][:lighting]
[28:28][Check out our cubic :lighting lookup voxel][:run]
[28:41][Decrease the LIGHT_LOOKUP_VOXEL_DIM_Z from 32 to 16][:lighting]
[28:49][Check out our squatter, faster :lighting lookup voxel][:run]
[29:34][Increase the LIGHT_LOOKUP_VOXEL_DIM_Z from 16 to 32][:lighting]
[29:43][125ms per frame, with a 32×32×32 voxel][:lighting :performance :run]
[29:55][Decrease the LIGHT_LOOKUP_VOXEL_DIM_Z from 32 to 16][:lighting]
[30:04][75ms per frame, with a 32×32×16 voxel][:lighting :performance :run]
[31:02][24% frame time spent in ComputeLightPropagationWork()][:lighting :performance :run]
[31:17][Disable the specular-to-diffuse transform in ComputeLightPropagationWork()][:lighting]
[31:45][65ms per frame with 3% frame time spent in ComputeLightPropagationWork(), without the specular-to-diffuse transform][:lighting :performance :run]
[32:18][Prepare to optimise the specular-to-diffuse transform][:lighting :research]
[33:36][Re-enable the specular-to-diffuse transform in ComputeLightPropagationWork()][:lighting]
[33:52][25% frame time spent in ComputeLightPropagationWork()][:lighting :performance :run]
[33:59][Disable the specular-to-diffuse transform in ComputeLightPropagationWork()][:lighting]
[34:03][Hit assertion in DEBUGGetArenaByLookupBlock()][:"debug system" :run]
[34:33][3% frame time spent in ComputeLightPropagationWork(), without the specular-to-diffuse transform][:lighting :performance :run]
[35:14][Enable ComputeLightPropagationWork() to count up the zero weights][:lighting :optimisation]
[38:01][Step into ComputeLightPropagationWork() to find a ZeroWCount of 196][:lighting :optimisation :run]
[39:22][Consider our potential for optimising ComputeLightPropagationWork()][:optimisation :research]
[40:09][Inspect the assembly of the specular-to-diffuse transform in ComputeLightPropagationWork()][:asm :lighting :run]
[41:54][Define LIGHT_ATLAS_ASSERT()][:lighting]
[43:19][Inspect the assembly of the specular-to-diffuse transform in ComputeLightPropagationWork()][:asm :lighting :run]
[43:33][Disable multithreading of the :lighting, wondering if ~RemedyBG supports step-single-thread][:threading]
[44:43][Inspect the assembly of the specular-to-diffuse transform in ComputeLightPropagationWork()][:asm :lighting :run]
[49:35][Optimise ComputeLightPropagationWork() to load and shuffle a row at once, introducing LoadF32_4X() and Broadcast4x()][:lighting :optimisation :simd]
[1:15:35][Inspect the assembly of the specular-to-diffuse transform in ComputeLightPropagationWork()][:asm :lighting :run]
[1:16:28][Re-enable multithreading of the :lighting][:threading]
[1:16:46][11% frame time spent in ComputeLightPropagationWork(), but with chromatic aberration][:lighting :performance :run]
[1:17:32][Double-check the specular-to-diffuse transform][:lighting :research]
[1:20:21][Fix ComputeLightPropagationWork() to load the specular texels in strides of 4, rather than 12][:lighting :optimisation :simd]
[1:20:52][Admire our correct and faster :lighting][:performance :run]
[1:21:51][Consider our potential for optimising the specular-to-diffuse transform: Separable blur[ref
site=Desmos
page="Untitled Graph"
url=https://desmos.com/calculator][ref
site=Wikipedia
page="Gaussian function"
url=https://en.wikipedia.org/wiki/Gaussian_function][ref
site=Wikipedia
page="Raised-cosine filter"
url=https://en.wikipedia.org/wiki/Raised-cosine_filter]][:lighting :optimisation :research]
[1:31:00][Check out our :lighting][:run]
[1:31:10][Decrease the light transmission rate from 0.975 to 0.75 in BuildDiffuseLightMaps()][:lighting]
[1:31:23][More readily see our darker light map viewer][:lighting :run]
[1:32:28][Set up ComputeLightPropagationWork() to perform the specular-to-diffuse transform as a separable filter][:lighting :optimisation]
[1:48:04][Check out our :lighting][:run]
[1:48:20][Q&A][:speech]
[1:49:15][@lucid_frost][Q: Are there any :caching concerns? I'm not familiar with how much data is being pushed around here][:lighting :performance]
[1:52:16][@philliptrudeau][Q: This scene has a little bit of variance in the :lighting between frames. Is there a way to set up this solution so that the scene looks more "static", without taking a significant :performance hit?]
[1:53:35][@somebody_took_my_name][Q: The light seems to be repeating outside of the light box (before the rewrite). Is it still there and, if so, is it a modulus issue?][:lighting]
[1:54:53][@mattiamanzati][Q: You mentioned something about shaders API being better at this kind of job. I lost your point on that because of me being unfamiliar with the environment. Can you please explain that a little bit more?][:hardware :lighting]
[1:59:38][@czapa10][Q: You often say that there should be some high level :language feature which allows you to write :SIMD code easier. Can you tell how this feature would exactly look like? Do you mean something like [@naysayer88 Jon Blow] has in Jai (fast SOA, AOS switching)? Can't you do this feature yourself using :metaprogramming?]
[2:01:05][@vtlmks][Intel Intrinsics Guide[ref
site=Intel
page="Intel Intrinsics Guide"
url=https://software.intel.com/sites/landingpage/IntrinsicsGuide/] is broken, it seems]
[2:01:45][@i_am_seabass][He's got it cached]
[2:01:56][@czapa10][You can't specify specific architecture]
[2:02:25][Plug uops[ref
site=uops.info
url=https://uops.info/table.html]][:research]
[2:03:47][Admire the :lighting][:run]
[2:04:22][Close it on up][:speech]
[/video]