From b9a18742baecfaac13ba9519f64960d75a2650c2 Mon Sep 17 00:00:00 2001
From: Matt Mascarenhas
Date: Sat, 16 May 2020 19:26:09 +0100
Subject: [PATCH] Index hero/code601

---
 cmuratori/hero/code/code601.hmml | 74 ++++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)
 create mode 100644 cmuratori/hero/code/code601.hmml

diff --git a/cmuratori/hero/code/code601.hmml b/cmuratori/hero/code/code601.hmml
new file mode 100644
index 0000000..ec99df5
--- /dev/null
+++ b/cmuratori/hero/code/code601.hmml
@@ -0,0 +1,74 @@
+[video output=day601 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Sketching Out the Walk Table Generator" vod_platform=youtube id=01utCzOSruc annotator=Miblo]
+[0:01][Recap and set the stage for the day improving our :lighting :performance][:speech]
+[3:13][Reflect on yesterday's work on GridRayCast()][:lighting :research]
+[5:01][Determine to generate the WalkTable][:lighting :research]
+[5:50][Set up GridRayCast() to index into the WalkTable][:lighting]
+[7:03][On striding through the voxel, and the need to bounds-check in multiple dimensions][:lighting :research]
+[10:02][Make RayCast() set a WalkCount for GridRayCast() to use when striding through the WalkTable, augmenting light_sample_direction with an EndAtWalkOffset][:"data structure" :lighting]
+[12:28][Micro- and Macro-op Fusion[ref
+ site="Compiler Explorer"
+ url=https://godbolt.org]][:asm :performance :research]
+[19:15][Looping in clang -O2[ref
+ site="Compiler Explorer"
+ url=https://godbolt.org]][:asm :performance :research]
+[20:33][clang's loop unrolling[ref
+ site="Compiler Explorer"
+ url=https://godbolt.org]][:asm :performance :research]
+[24:54][Looping in msvc -O2[ref
+ site="Compiler Explorer"
+ url=https://godbolt.org]][:asm :performance :research]
+[27:04][Looping in clang -O3, with its micro-architectural analysis[ref
+ site="Compiler Explorer"
+ url=https://godbolt.org]][:asm :performance :research]
+[28:24][Micro- and Macro-op Fusion[ref
+ site="Compiler Explorer"
+ url=https://godbolt.org]][:asm :performance :research]
+[30:34][Micro- and Macro-op Fusion of non-deterministic looping[ref
+ site="Compiler Explorer"
+ url=https://godbolt.org][ref
+ site=uops.info
+ url=https://uops.info/table.html]][:asm :performance :research]
+[37:45][Non-deterministic looping in clang -O3, with its micro-architectural analysis[ref
+ site="Compiler Explorer"
+ url=https://godbolt.org][ref
+ site="LLVM 10 Documentation"
+ page="llvm-mca - LLVM Machine Code Analyzer"
+ url=https://www.llvm.org/docs/CommandGuide/llvm-mca.html]][:asm :performance :research]
+[44:49][Make GridRayCast() loop through the WalkTable, renaming the indices in light_sample_direction][:"data structure" :lighting]
+[50:03][Consider how to generate the WalkCount][:lighting :research]
+[52:20][Walk Table Truncation][:blackboard :lighting]
+[55:10][Replace the WalkCount with a SPATIAL_GRID_NODE_TERMINATOR skirt surrounding the spatial grid][:"data structure" :lighting]
+[57:20][Prepare RayCast(), GridRayCast(), FullCast() and ComputeLightPropagationWork() to develop the WalkTable generation and compare the current and resulting routines, introducing GridIndexFrom()][:"data structure" :lighting]
+[1:23:18][WalkTable generation, à la Bresenham Line Drawing][:blackboard :geometry :lighting]
+[1:28:05][Introduce ComputeWalkTable()][:geometry :lighting]
+[1:39:51][Q&A][:speech]
+[1:40:14][@naysayer88][Q: https://imgur.com/a/A28qRIw[ref
+ site=imgur
+ page=A28qRIw
+ url=https://imgur.com/a/A28qRIw]]
+[1:41:15][@thetamiel][Billy Mitchell has recently sued some people for slandering him]
+[1:42:31][@sneakybob_wot][Did they say he used an emulator rather than a real arcade?]
+[1:43:15][@mindmark42][Q: Can you explain more what a micro-op is?][:isa]
+[1:43:32][Skylake-ish Core Micro-op (µop)][:asm :blackboard :hardware :isa]
+[1:58:29][Micro-op Fusion and Limits][:asm :blackboard :hardware :isa]
+[2:01:39][Macro-op Fusion][:asm :blackboard :hardware :isa]
+[2:07:02][@somebody_took_my_name][Q: I don't think they fuse the instructions anymore. Take a look at LOOP, LOOPE and LOOPNE on uops.info.[ref
+ site=uops.info
+ url=https://uops.info/table.html] They take a lot of uops. (No expert here, but pure from poking Godbolt and the x86_64 instruction set / byte codes[ref
+ site="x86asm.net"
+ page="coder64 edition | X86 Opcode and Instruction Reference 1.12"
+ url=http://ref.x86asm.net/coder64.html])][:asm :hardware :isa]
+[2:07:37][@somebody_took_my_name][Q: Never mind, I thought you meant that the compiler fused the instructions, not the uops in the pipe][:asm :hardware :isa]
+[2:07:48][@naysayer88][Q: What if it is an HTML DIV in the scheduler?]
+[2:07:53][WebAssembly][:asm :blackboard]
+[2:08:23][@printf_armin][They do even crazier instruction squashing now][:asm :hardware :isa]
+[2:10:14][We're all good here][:speech]
+[2:11:04][@naysayer88][Probably not!]
+[2:11:28][@naysayer88][I should have OBS'd it]
+[2:11:47][@somebody_took_my_name][Does he have a save game? Or are there no save games?]
+[2:12:01][@naysayer88][Look, if my screenshot[ref
+ site=imgur
+ page=A28qRIw
+ url=https://imgur.com/a/A28qRIw] is invalid, then YOURS is invalid, and my score being higher, my invalid screenshot is better than yours]
+[2:12:16][That's it, everybody][:speech]
+[/video]