[video member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Aligning Rendering Memory" vod_platform=youtube id=blcNbU70I9o annotator=dspecht annotator=Miblo]
[1:23][Set the stage for this week]
[2:51][Demo the problem with the tile size]
[4:04][Confirm that _mm_sfence is unnecessary]
[5:46][Blackboard: "Write-Combining Memory"]
[13:43][Blackboard: Tiles and Alignment]
[21:02][Suggest rounding TileWidth and TileHeight to the nearest 4 and aligning the tiles to the 4-byte boundary]
[21:35][Think some more, and then unthink it]
[22:11][handmade_render_group.cpp: Ensure that we always get aligned tiles]
[23:27][Blackboard: Calculating and filling the correct number of 4-pixel units]
[26:10][Blackboard: The destination buffer must always allow us to overwrite it by a certain amount]
[29:49][Assert that OutputTarget->Memory is aligned to 16 bytes]
[31:45][Round TileWidth up to the nearest 4]
[33:13][See what numbers we're getting]
[34:14][Ensure that FinalTileWidth accounts for the fact that it'll be smaller]
[34:54][Remove FinalTileWidth and limit ClipRect.MaxX to the OutputTarget->Width]
[35:23][Remove the ClipRect adjustments]
[36:17][Run single-threaded for the moment]
[37:06][Moment of realisation: We're only clipping to the end, but don't handle clipping at the beginning]
[37:34][Blackboard: The clipping problem]
[41:58][Align the MinX and MaxX]
[47:08][Blackboard: Setting the EndClipMasks]
[48:24][Figure out when to use the EndClipMask]
[50:01][Compile and see what's happening]
[50:30][Reverse the sense of the test and run it again]
[50:55][See if we're always aligned]
[51:16][Compile in -O2 and turn on multi-threading to ensure everything's still kosher]
[52:02][win32_handmade.cpp: Double-check that the platform code is allocating the memory aligned]
[53:20][Align Buffer->Pitch to 16 bytes]
[54:10][handmade_platform.h: Introduce Align16]
[55:13][Everything looks fine at 1920x1080]
[56:13][handmade.cpp: Set up the PlatformAddEntry and PlatformCompleteAllWork at the beginning]
[58:00][Q&A]
[58:43][@robotchocolatedino][The bitmap memory size calculation squares BytesPerPixel which allocates more memory than necessary]
[58:50][win32_handmade.cpp: Remove the multiplication by BytesPerPixel]
[59:19][@ttbjm][Can you test if weird resolutions actually are working?]
[1:00:33][handmade_render_group.cpp: Ensure that it goes all the way to the end if this is the last tile]
[1:07:06][handmade.cpp: Hard-set the game's DrawBuffer dimensions to test our support of arbitrary resolutions]
[1:08:28][@braincruser][Will you check for Cache Aliasing? With this much alignment hitting cache aliasing is much easier]
[1:08:51][@braincruser][Cache Associativity aliasing...]
[1:09:27][@flyingwafflenyc][What about the extra pixel around the edges to deal with bilinear filtering? Will you get rid of that?]
[1:10:33][Compile with that #if 0'd out and glimpse into the future of 4K art]
[1:11:50][@pragmascrypt][Will you keep the rendering of the two lines on separate hyperthreads?]
[1:13:33][@terrorscout][So this game is going to be a 4K Zelda 1-style game?]
[1:14:43][Assess our progress and glimpse into the future of tightening up the renderer]
[1:16:38][Shut things down here]
[/video]