From f676c915c01067c7300a03ee06a7f04abf420764 Mon Sep 17 00:00:00 2001
From: Matt Mascarenhas
Date: Wed, 8 Jul 2020 22:36:52 +0100
Subject: [PATCH] Index hero/code616

---
 cmuratori/hero/code/code616.hmml | 81 ++++++++++++++++++++++++++++++++
 1 file changed, 81 insertions(+)
 create mode 100644 cmuratori/hero/code/code616.hmml

diff --git a/cmuratori/hero/code/code616.hmml b/cmuratori/hero/code/code616.hmml
new file mode 100644
index 0000000..78732d9
--- /dev/null
+++ b/cmuratori/hero/code/code616.hmml
@@ -0,0 +1,81 @@
+[video output=day616 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Tableless Grid Walk" vod_platform=youtube id=rrcYMDRE9wA annotator=Miblo]
+[0:02][Recap and set the stage for the day][:speech]
+[0:39][hhlightprof total seconds elapsed: 4.508690][:lighting :performance :run]
+[1:32][Set up to gauge the :performance of the inline lighting grid traversal computation][:lighting :speech]
+[3:49][Walk through ComputeWalkTableFast()][:lighting :research]
+[6:11][Begin to enable GridRayCast() to invoke and verify the inline lighting grid traversal][:lighting :optimisation]
+[9:39][Consider the problem with our WalkTable X, Y, X, Z ordering][:lighting :optimisation :research]
+[10:31][Revert GridRayCast()][:lighting :optimisation]
+[10:41][Consider ordering our WalkTable X, Y, Z, X][:lighting :optimisation :research]
+[14:00][:Run hhlightprof successfully][:lighting :optimisation]
+[14:08][Provoke an error in the BestTable][:lighting :optimisation]
+[14:21][:Run hhlightprof with a fault][:lighting :optimisation]
+[14:28][Change ComputeWalkTableFast() to pack our :SIMD lanes X, Y, Z, X][:lighting :optimisation]
+[19:48][:Run hhlightprof with a fault][:lighting :optimisation]
+[19:54][Update the MaskTable in ComputeWalkTableFast() to work with X, Y, Z, X ordering][:lighting :optimisation :simd]
+[20:30][:Run hhlightprof with a fault][:lighting :optimisation]
+[21:11][Fix the X, Y, Z, X ordered MaskTable in ComputeWalkTableFast()][:lighting :optimisation :simd]
+[21:28][:Run hhlightprof with a fault][:lighting :optimisation]
+[21:57][Update ComputeWalkTableFast() to set t4s from shuffled X, Y, Z, X ordering][:lighting :optimisation :simd]
+[22:16][:Run hhlightprof with a fault][:lighting :optimisation]
+[27:02][Fix the BestTable :documentation][:lighting :optimisation :simd]
+[27:57][Investigate why t4's fourth :SIMD lane is unset][:lighting :optimisation :run :simd]
+[28:36][Fix MaskTable to set the fourth :SIMD lane][:lighting :optimisation]
+[29:00][:Run hhlightprof successfully][:lighting :optimisation]
+[29:03][Change ComputeWalkTableFast() to leave the fourth :SIMD lane blank, to give X, Y, Z][:lighting :optimisation :simd]
+[34:35][:Run hhlightprof with a fault][:lighting :optimisation]
+[35:00][Make ComputeWalkTableFast() zero out the fourth lane of InvRayD4][:lighting :optimisation :simd]
+[35:19][:Run hhlightprof with a fault][:lighting :optimisation]
+[36:40][Update the MaskTable in ComputeWalkTableFast() for X, Y, Z ordering][:lighting :optimisation :simd]
+[37:13][:Run hhlightprof with a fault][:lighting :optimisation]
+[37:43][Update the ShuffleTable in ComputeWalkTableFast() for X, Y, Z ordering][:lighting :optimisation :simd]
+[38:09][:Run hhlightprof with a fault][:lighting :optimisation]
+[38:40][Step through ComputeWalkTableFast()][:lighting :optimisation :run :simd]
+[40:31][Document the BestTable in ComputeWalkTableFast()][:documentation :lighting :optimisation :simd]
+[47:57][Make ComputeWalkTableFast() set the BestTable entry for all-equal as 0 (or X)][:lighting :optimisation :simd]
+[48:25][:Run hhlightprof until faulting on SampleDirIndex 1012 (out of 1024)][:lighting :optimisation]
+[50:26][Document and determine that the Y > Z case simply involves a preference problem][:documentation :lighting :optimisation :simd]
+[52:53][Change the old ComputeWalkTable() to prefer Z in the Y > Z case][:lighting :optimisation :simd]
+[54:25][:Run hhlightprof successfully][:lighting :optimisation]
+[54:37][Loft up the BestTable, ShuffleTable, MaskTable and related values from ComputeWalkTableFast() to GridRayCast(), prefixing their names with t][:lighting :optimisation :simd]
+[1:07:24][:Run hhlightprof with a fault][:lighting :optimisation]
+[1:08:19][Fix GridRayCast() to update tTerminateVerify after verifying][:lighting :optimisation :simd]
+[1:08:27][:Run hhlightprof with a fault][:lighting :optimisation]
+[1:10:34][Double-check what ComputeWalkTable() does when a Ray is pointing backwards][:lighting :research]
+[1:12:04][Step through GridRayCast()][:lighting :optimisation :run :simd]
+[1:16:31][The WalkTable contains garbage also in-game][:lighting :optimisation :run :simd]
+[1:18:49][Step through GridRayCast() watching the tTerminate values][:lighting :optimisation :run :simd]
+[1:21:44][Break into GridRayCast() and investigate the inf tTerminate][:lighting :optimisation :run :simd]
+[1:24:17][Add a breakpoint in ComputeWalkTable() on DestIndex 6913][:lighting]
+[1:24:57][Break into ComputeWalkTable() on DestIndex 6913, with inf tTerminate][:lighting :run]
+[1:27:59][Add a breakpoint earlier in ComputeWalkTable() on DestIndex 6912][:lighting]
+[1:28:16][Step through ComputeWalkTable() on DestIndex 6912][:lighting :run]
+[1:29:32][Make ComputeWalkTable() and ComputeWalkTableFast() use the AbsoluteValue of the ray direction][:lighting :optimisation :simd]
+[1:30:31][Successfully step through ComputeWalkTable() on DestIndex 6912, and into the game][:lighting :run]
+[1:31:04][:Run the game successfully in -O2][:lighting :optimisation :simd]
+[1:31:32][Switch GridRayCast() over to use the inline lighting grid traversal][:lighting :optimisation :simd]
+[1:32:46][:Run the game successfully][:lighting :optimisation :simd]
+[1:33:01][hhlightprof total seconds elapsed: 6.826915][:lighting :optimisation :performance :run :simd]
+[1:33:47][Toggle GridRayCast() to the precomputed WalkTable][:lighting :optimisation :simd]
+[1:33:57][hhlightprof total seconds elapsed: 4.458800][:lighting :optimisation :performance :run :simd]
+[1:35:45][Change GridRayCast() to compute InvRayDPacked more concisely][:lighting :optimisation :simd]
+[1:37:16][hhlightprof total seconds elapsed: 4.470019][:lighting :optimisation :performance :run :simd]
+[1:37:34][Toggle GridRayCast() to the inline lighting grid traversal][:lighting :optimisation :simd]
+[1:37:43][hhlightprof total seconds elapsed: 7.426753][:lighting :optimisation :performance :run :simd]
+[1:37:55][Try letting GridRayCast() both use the precomputed WalkTable, and compute the traversal inline][:lighting :optimisation :simd]
+[1:38:32][:Run hhlightprof with a fault][:lighting :optimisation]
+[1:39:31][Break into GridRayCast() on our fault][:lighting :optimisation :run :simd]
+[1:41:16][Prevent GridRayCast() from setting tTerminate to tTerminateVerify][:lighting :optimisation :simd]
+[1:41:26][:Run the game with a fault][:lighting :optimisation]
+[1:42:12][Prevent GridRayCast() from setting GridIndex to dGridResult][:lighting :optimisation :simd]
+[1:42:24][:Run the game successfully][:lighting :optimisation]
+[1:42:37][:Run the game in -O2][:lighting :optimisation]
+[1:42:57][hhlightprof total seconds elapsed: 6.543860][:lighting :optimisation :performance :run :simd]
+[1:43:29][Inspect the assembly of our inline lighting grid traversal][:asm :lighting :optimisation :run :simd]
+[1:45:36][Q&A][:speech]
+[1:45:55][@somebody_took_my_name][Q: I think the first entry of the tShuffle and tMask tables haven't changed to x, as you have done for the BestDim table (the all equal case). It is a rare case, though (if it happens at all)][:lighting :simd]
+[1:46:38][Fix the ShuffleTable and MaskTable for X, Y, Z ordering][:lighting :optimisation :simd]
+[1:47:15][:Run the game successfully][:lighting :optimisation]
+[1:47:44][@billdstrong][Q: Could we be going out of the cache and that made it slow? How would we check that?][:lighting :performance]
+[1:49:48][Wrap it up][:speech]
+[/video]