diff --git a/cmuratori/hero/code/code121.hmml b/cmuratori/hero/code/code121.hmml
index b44b940..a5b5d44 100644
--- a/cmuratori/hero/code/code121.hmml
+++ b/cmuratori/hero/code/code121.hmml
@@ -10,7 +10,7 @@
 [42:14][Taking a look at the total throughput count]
 [43:18][Casey needs some more soya \[sic\] milk]
 [44:17][Could we do a load once, and grab out the two values that we needed?]
-[45:48][Day 121 Blackboard: Explanation of possible texel loading optimisation]
+[45:48][Explanation of possible texel loading optimisation][:blackboard]
 [50:32][Figuring out how the compiler is loading the texel data]
 [1:00:18][This is fine, then]
 [1:01:01][We multiply by TexturePitch and sizeof(uint32) four-wide manually, which is stupid]
@@ -19,35 +19,35 @@
 [1:04:07][Give the compiler the wide stuff so that it can see it as wide]
 [1:11:21][_mm_mul_epi32 does not do integer * integer]
 [1:13:43][Port pressure (we're back to InterIteration)]
-[1:17:46][Blackboard: Hyperthreading]
-[1:27:22][Blackboard: Designing how to break up the renderer for multithreading to ease pressure on the caches]
-[1:32:22][Blackboard: Divide the frame buffer into chunks that are sized appropriately for the cache]
-[1:39:55][Blackboard: The plan for setting up the renderer]
+[1:17:46][Hyperthreading][:blackboard]
+[1:27:22][Designing how to break up the renderer for multithreading to ease pressure on the caches][:blackboard]
+[1:32:22][Divide the frame buffer into chunks that are sized appropriately for the cache][:blackboard]
+[1:39:55][The plan for setting up the renderer][:blackboard]
 [1:40:47][Implementation of interleaved scanlines, in readiness for hyperthreading]
-[1:46:36][Blackboard: The logic of interleaved scanlines]
+[1:46:36][The logic of interleaved scanlines][:blackboard]
 [1:52:37][Updating compiler directives for folks who use LLVM]
 [1:55:20][Implementation of frame buffer divisions, in readiness for multi-core processing]
-[2:05:30][Go to Disassembly of DrawRectangleSlowly in order to diagnose bogus cycle count]
+[2:05:30][Go to Disassembly of DrawRectangleQuickly() in order to diagnose bogus cycle count]
 [2:10:04][Frame buffer divisions, continued]
 [2:20:50][Introduce GetClampedRectArea]
 [2:22:12][Problematic thing: Our convention for rectangles before was that they did not include their final value]
-[2:27:33][Fix the cycle counter for DrawRectangleSlowly again]
+[2:27:33][Fix the cycle counter for DrawRectangleQuickly() again]
 [2:29:42][A shortcut didn't work out. (!quote 297 + !quote 298)]
 [2:30:56][Loft FillRect above the loop]
 [2:36:34][Introduce PixelPxRow in order to keep PixelPx as a wide value rather than having to set it each time]
 [2:39:50][Check IACA for performance difference and revert to setting PixelPx each time through the loop]
 [2:43:28][Shuffle calculations around to figure out how the performance is affected, for good or ill]
-[2:51:17][Blackboard: Thinking about that alignment problem]
+[2:51:17][Thinking about that alignment problem][:blackboard]
 [2:55:58][Align MinX and MaxX]
 [3:00:18][Microsoft Visual Studio 2013 has stopped working]
 [3:02:03][Dancing trees]
 [3:03:03][Change our loads and stores to no longer be unaligned]
 [3:04:05][Assess performance difference and revert back to the unaligned load and store instructions]
 [3:05:12][Make sure that we actually always fill the real clip region and not write outside the clip region]
-[3:07:10][Blackboard: Our options for filling the pixels]
+[3:07:10][Our options for filling the pixels][:blackboard]
 [3:09:12][Implementation of alignment to the ending edge]
 [3:16:48][Clip the leading edge]
-[3:19:41][Blackboard: ClipMask]
+[3:19:41][ClipMask][:blackboard]
 [3:21:33][Try setting StartupClipMask by using _mm_srli_si128]
 [3:22:28][// TODO(casey): This is stupid.]
 [3:26:10][Early-out the FillRect tests]