cinera_handmade.network/cmuratori/hero/code/code616.hmml

2020-07-08 21:36:52 +00:00
[video output=day616 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Tableless Grid Walk" vod_platform=youtube id=rrcYMDRE9wA annotator=Miblo]
[0:02][Recap and set the stage for the day][:speech]
[0:39][hhlightprof total seconds elapsed: 4.508690][:lighting :performance :run]
[1:32][Set up to gauge the :performance of the inline lighting grid traversal computation][:lighting :speech]
[3:49][Walk through ComputeWalkTableFast()][:lighting :research]
[6:11][Begin to enable GridRayCast() to invoke and verify the inline lighting grid traversal][:lighting :optimisation]
[9:39][Consider the problem with our WalkTable X, Y, X, Z ordering][:lighting :optimisation :research]
[10:31][Revert GridRayCast()][:lighting :optimisation]
[10:41][Consider ordering our WalkTable X, Y, Z, X][:lighting :optimisation :research]
[14:00][:Run hhlightprof successfully][:lighting :optimisation]
[14:08][Provoke an error in the BestTable][:lighting :optimisation]
[14:21][:Run hhlightprof with a fault][:lighting :optimisation]
[14:28][Change ComputeWalkTableFast() to pack our :SIMD lanes X, Y, Z, X][:lighting :optimisation]
[19:48][:Run hhlightprof with a fault][:lighting :optimisation]
[19:54][Update the MaskTable in ComputeWalkTableFast() to work with X, Y, Z, X ordering][:lighting :optimisation :simd]
[20:30][:Run hhlightprof with a fault][:lighting :optimisation]
[21:11][Fix the X, Y, Z, X ordered MaskTable in ComputeWalkTableFast()][:lighting :optimisation :simd]
[21:28][:Run hhlightprof with a fault][:lighting :optimisation]
[21:57][Update ComputeWalkTableFast() to set t4s from shuffled X, Y, Z, X ordering][:lighting :optimisation :simd]
[22:16][:Run hhlightprof with a fault][:lighting :optimisation]
[27:02][Fix the BestTable :documentation][:lighting :optimisation :simd]
[27:57][Investigate why t4's fourth :SIMD lane is unset][:lighting :optimisation :run :simd]
[28:36][Fix MaskTable to set the fourth :SIMD lane][:lighting :optimisation]
[29:00][:Run hhlightprof successfully][:lighting :optimisation]
[29:03][Change ComputeWalkTableFast() to leave the fourth :SIMD lane blank, to give X, Y, Z][:lighting :optimisation :simd]
[34:35][:Run hhlightprof with a fault][:lighting :optimisation]
[35:00][Make ComputeWalkTableFast() zero out the fourth lane of InvRayD4][:lighting :optimisation :simd]
[35:19][:Run hhlightprof with a fault][:lighting :optimisation]
[36:40][Update the MaskTable in ComputeWalkTableFast() for X, Y, Z ordering][:lighting :optimisation :simd]
[37:13][:Run hhlightprof with a fault][:lighting :optimisation]
[37:43][Update the ShuffleTable in ComputeWalkTableFast() for X, Y, Z ordering][:lighting :optimisation :simd]
[38:09][:Run hhlightprof with a fault][:lighting :optimisation]
[38:40][Step through ComputeWalkTableFast()][:lighting :optimisation :run :simd]
[40:31][Document the BestTable in ComputeWalkTableFast()][:documentation :lighting :optimisation :simd]
[47:57][Make ComputeWalkTableFast() set the BestTable entry for the all-equal case to 0 (i.e. X)][:lighting :optimisation :simd]
[48:25][:Run hhlightprof until faulting on SampleDirIndex 1012 (out of 1024)][:lighting :optimisation]
[50:26][Document and determine that the Y > Z case simply involves a preference problem][:documentation :lighting :optimisation :simd]
[52:53][Change the old ComputeWalkTable() to prefer Z in the Y > Z case][:lighting :optimisation :simd]
[54:25][:Run hhlightprof successfully][:lighting :optimisation]
[54:37][Loft up the BestTable, ShuffleTable, MaskTable and related values from ComputeWalkTableFast() to GridRayCast(), prefixing their names with t][:lighting :optimisation :simd]
[1:07:24][:Run hhlightprof with a fault][:lighting :optimisation]
[1:08:19][Fix GridRayCast() to update tTerminateVerify after verifying][:lighting :optimisation :simd]
[1:08:27][:Run hhlightprof with a fault][:lighting :optimisation]
[1:10:34][Double-check what ComputeWalkTable() does when a Ray is pointing backwards][:lighting :research]
[1:12:04][Step through GridRayCast()][:lighting :optimisation :run :simd]
[1:16:31][The WalkTable also contains garbage in-game][:lighting :optimisation :run :simd]
[1:18:49][Step through GridRayCast() watching the tTerminate values][:lighting :optimisation :run :simd]
[1:21:44][Break into GridRayCast() and investigate the inf tTerminate][:lighting :optimisation :run :simd]
[1:24:17][Add a breakpoint in ComputeWalkTable() on DestIndex 6913][:lighting]
[1:24:57][Break into ComputeWalkTable() on DestIndex 6913, with inf tTerminate][:lighting :run]
[1:27:59][Add a breakpoint earlier in ComputeWalkTable() on DestIndex 6912][:lighting]
[1:28:16][Step through ComputeWalkTable() on DestIndex 6912][:lighting :run]
[1:29:32][Make ComputeWalkTable() and ComputeWalkTableFast() use the AbsoluteValue of the ray direction][:lighting :optimisation :simd]
[1:30:31][Successfully step through ComputeWalkTable() on DestIndex 6912, and into the game][:lighting :run]
[1:31:04][:Run the game successfully in -O2][:lighting :optimisation :simd]
[1:31:32][Switch GridRayCast() over to use the inline lighting grid traversal][:lighting :optimisation :simd]
[1:32:46][:Run the game successfully][:lighting :optimisation :simd]
[1:33:01][hhlightprof total seconds elapsed: 6.826915][:lighting :optimisation :performance :run :simd]
[1:33:47][Toggle GridRayCast() to the precomputed WalkTable][:lighting :optimisation :simd]
[1:33:57][hhlightprof total seconds elapsed: 4.458800][:lighting :optimisation :performance :run :simd]
[1:35:45][Change GridRayCast() to compute InvRayDPacked more concisely][:lighting :optimisation :simd]
[1:37:16][hhlightprof total seconds elapsed: 4.470019][:lighting :optimisation :performance :run :simd]
[1:37:34][Toggle GridRayCast() to the inline lighting grid traversal][:lighting :optimisation :simd]
[1:37:43][hhlightprof total seconds elapsed: 7.426753][:lighting :optimisation :performance :run :simd]
[1:37:55][Try letting GridRayCast() both use the precomputed WalkTable and compute the traversal inline][:lighting :optimisation :simd]
[1:38:32][:Run hhlightprof with a fault][:lighting :optimisation]
[1:39:31][Break into GridRayCast() on our fault][:lighting :optimisation :run :simd]
[1:41:16][Prevent GridRayCast() from setting tTerminate to tTerminateVerify][:lighting :optimisation :simd]
[1:41:26][:Run the game with a fault][:lighting :optimisation]
[1:42:12][Prevent GridRayCast() from setting GridIndex to dGridResult][:lighting :optimisation :simd]
[1:42:24][:Run the game successfully][:lighting :optimisation]
[1:42:37][:Run the game in -O2][:lighting :optimisation]
[1:42:57][hhlightprof total seconds elapsed: 6.543860][:lighting :optimisation :performance :run :simd]
[1:43:29][Inspect the assembly of our inline lighting grid traversal][:asm :lighting :optimisation :run :simd]
[1:45:36][Q&A][:speech]
[1:45:55][@somebody_took_my_name][Q: I think the first entries of the tShuffle and tMask tables haven't changed to x, as you have done for the BestDim table (the all equal case). It is a rare case, though (if it happens at all)][:lighting :simd]
[1:46:38][Fix the ShuffleTable and MaskTable for X, Y, Z ordering][:lighting :optimisation :simd]
[1:47:15][:Run the game successfully][:lighting :optimisation]
[1:47:44][@billdstrong][Q: Could we be going out of the cache and that made it slow? How would we check that?][:lighting :performance]
[1:49:48][Wrap it up][:speech]
[/video]