diff --git a/cmuratori/hero/code/code611.hmml b/cmuratori/hero/code/code611.hmml
index aea038a..9e4ac0e 100644
--- a/cmuratori/hero/code/code611.hmml
+++ b/cmuratori/hero/code/code611.hmml
@@ -61,7 +61,7 @@
     site=uops.info
     url=https://uops.info/table.html]][:lighting :optimisation :simd]
 [1:49:07][Dependents, and Cycle Ordering][:hardware :performance]
-[1:52:32][Finish making ComputeVoxelIrradianceAt() operate wide[ref
+[1:52:32][Continue to make ComputeVoxelIrradianceAt() operate wide[ref
     site=Intel
     page="Intel Intrinsics Guide"
     url=https://software.intel.com/sites/landingpage/IntrinsicsGuide/][ref
diff --git a/cmuratori/hero/code/code612.hmml b/cmuratori/hero/code/code612.hmml
new file mode 100644
index 0000000..0ca4d85
--- /dev/null
+++ b/cmuratori/hero/code/code612.hmml
@@ -0,0 +1,64 @@
+[video output=day612 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="First Pass Optimization of Voxel Sampling" vod_platform=youtube id=W3ml7cO96F0 annotator=Miblo]
+[0:01][Recap and set the stage for the day][:speech]
+[2:08][Describe our vectorisation of ComputeVoxelIrradianceAt()][:lighting :optimisation :research :simd]
+[3:11][Instrument ComputeVoxelIrradianceAt() to verify the new :SIMD against the old scalar code][:lighting :optimisation]
+[9:08][Continue to make ComputeVoxelIrradianceAt() operate wide][:lighting :optimisation :simd]
+[13:14][Introduce an f32_4x version of Clamp01(), with a few words on optimising compilers][:language :mathematics :simd]
+[16:19][Continue to make ComputeVoxelIrradianceAt() operate wide][:lighting :optimisation :simd]
+[29:10][Change the f32_4x version of Clamp01() to use ZeroF32_4x()][:mathematics :simd]
+[29:57][@billdstrong][How is [@cmuratori he] going to test the Clamp01() if [@cmuratori he] deleted it from the original code?]
+[30:04][Introduce an f32_4x version of Floor()[ref
+    site=Intel
+    page="Intel Intrinsics Guide"
+    url=https://software.intel.com/sites/landingpage/IntrinsicsGuide/]][:mathematics :simd]
+[31:46][Fix compile errors in our ComputeVoxelIrradianceAt() vectorisation][:lighting :optimisation :simd]
+[35:52][Optimise ComputeVoxelIrradianceAt() to sum weights before broadcasting them][:lighting :optimisation :simd]
+[38:19][On the cognitive demand of :SIMD, as opposed to instruction sets like AVX-512 and NEON][:isa :speech]
+[42:01][Continue to make ComputeVoxelIrradianceAt() operate wide, loading in the tiles[ref
+    site=Intel
+    page="Intel Intrinsics Guide"
+    url=https://software.intel.com/sites/landingpage/IntrinsicsGuide/][ref
+    site=uops.info
+    url=https://uops.info/table.html]][:lighting :optimisation :simd]
+[1:09:06][Introduce ConvertS32()[ref
+    site=Intel
+    page="Intel Intrinsics Guide"
+    url=https://software.intel.com/sites/landingpage/IntrinsicsGuide/]][:simd]
+[1:12:44][Finish making ComputeVoxelIrradianceAt() operate wide, introducing an f32_4x version of Clamp()][:lighting :optimisation :simd]
+[1:18:39][:Run the game][:lighting :optimisation :simd]
+[1:19:02][Step through ComputeVoxelIrradianceAt() to find that our vectorised code has been compiled out][:lighting :optimisation :run :simd]
+[1:19:28][Make ComputeVoxelIrradianceAt() return the :SIMD computed result][:lighting :optimisation]
+[1:19:50][Step through ComputeVoxelIrradianceAt() and try to check out our vectorised code][:asm :lighting :optimisation :run :simd]
+[1:21:15][Disable multithreading of the :lighting][:threading]
+[1:21:45][Step through our multithreaded ComputeVoxelIrradianceAt()][:asm :lighting :optimisation :run :simd]
+[1:22:19][Comment out the old scalar ComputeVoxelIrradianceAt()][:lighting :optimisation]
+[1:23:16][Step through our single-threaded ComputeVoxelIrradianceAt()][:asm :lighting :optimisation :run :simd]
+[1:23:54][Update ~RemedyBG][:admin]
+[1:26:30][Step through the assembly of our new vectorised ComputeVoxelIrradianceAt()][:asm :lighting :optimisation :run :simd]
+[1:28:41][Our :lighting looks like the vectorisation just worked][:optimisation :run :simd]
+[1:28:47][Enable multithreading of the :lighting][:threading]
+[1:29:05][Our :lighting looks like it did before][:optimisation :run :simd]
+[1:29:18][hhlightprof total seconds elapsed: 5.110175][:lighting :performance :run]
+[1:30:56][Disable LIGHTING_USE_GRID][:lighting]
+[1:31:13][hhlightprof total seconds elapsed: 6.390334][:lighting :performance :run]
+[1:32:36][Enable LIGHTING_USE_GRID][:lighting]
+[1:32:54][77% of our frame time spent in ComputeLightPropagationWork][:lighting :performance :run]
+[1:34:09][Q&A][:speech]
+[1:35:00][@billdstrong][Q: Do you plan on bringing your editor on stream, or not? You keep bragging about it]
+[1:35:05][@mindmark42][Q: Can you run lightprof without any days?][:lighting]
+[1:35:17][@mindmark42][rays][:lighting]
+[1:35:28][Try decreasing the CostMetric from 16 to 0 in GridRayCast()][:lighting]
+[1:36:03][hhlightprof total seconds elapsed: 2.583887][:lighting :performance :run]
+[1:36:47][@vaualbus][Q: Can we time that function with the :"debug system"? So we see how long the top part of that function takes?]
+[1:37:01][@equivocatorrr][Q: Why is frame time stability such a rare / impossible thing without leaving headroom?][:performance]
+[1:38:54][@pragmascrypt][Q: Did you activate :threading again for the benchmark?]
+[1:39:18][@sagian2005][Q: [@cmuratori Casey], I just sent you an email. It's re: the SSE stuff you did on today's stream. You might get a smile out of it][:simd]
+[1:39:29][@nobodad][Q: @naysayer88 mentioned that you discussed with him why programming languages shouldn't have unsigned integers. Have you posted your rationale somewhere that I can read? Would you be willing to?][:language]
+[1:40:49][@fl_aw3n][Q: Can I compile all files in all subdirectories with CL recursively?][:language]
+[1:41:06][@yesyesyourmother][Q: Can you use some of the :lighting work you do on [~hero Handmade Hero] in different projects?]
+[1:41:31][@relvet][Q: When do we add special sauce, and how much of it? I feel this game needs a Sauce-O-Meter]
+[1:42:31][@mindmark42][Q: Couldn't the v3 XYZ be loaded with a single load if we pad them?][:simd]
+[1:46:11][@exp_ix][Q: Are there any fundamental differences between games engines that use low poly models vs this one?]
+[1:47:34][@noobgirrafe][How can I get your emacs config?]
+[1:48:08][Shut it down][:speech]
+[/video]