[video output=day113 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Simple Performance Counters" vod_platform=youtube id=QdTqYhv8tL0 annotator=dspecht annotator=Miblo]
[2:04][Blackboard: Basic process of optimization]
[6:02][Gather statistics]
[8:45][Revisit __rdtsc]
[10:22][GAME_UPDATE_AND_RENDER: Add __rdtsc cycle counts]
[11:23][Introduce debug_cycle_counter in handmade_platform.h]
[13:52][Support folks on other platforms such as Linux and Mac]
[18:42][Compile, clean and run]
[19:44][FillGroundChunk: Turn off the ground chunks]
[20:24][Get a look at those timer values]
[26:09][Run the game and see the debug cycle count]
[27:25][We are already missing our budget][quote 84]
[28:16][Confirm what we know to be true]
[28:45][RenderGroupToOutput: Add a timed block]
[29:55][Introduce DebugGlobalMemory to enable us to access this timing stuff even when we shouldn't have access to it]
[32:10][Note the difference between the two cycle counts]
[33:34][Confirm that DrawRectangleSlowly is the culprit]
[35:16][Introduce HitCount to discover if DrawRectangleSlowly is slow because it is slow, or because it's called so often]
[37:45][Discover that we're calling the renderer 13 times and DrawRectangleSlowly 64 times]
[39:05][Figure out how many pixels we're filling]
[42:49][Interpret the data]
[43:28][Bust out some emacs-fu][quote 85]
[43:42][Note that we're not operating on that many more pixels than the total on the screen]
[44:21][Blackboard: Overdraw]
[46:30][Blackboard: Progress report]
[47:24][Turn off the NormalMap]
[48:58][Understanding ballpark timings]
[51:53][Make an estimate]
[1:00:48][AVX-512 hype]
[1:01:15][Q&A][:speech]
[1:01:54][@grumpygiant256][Why no text labels on the counters?]
[1:02:10][@childz][I'm sorry if this is something you've gone over before, but can you explain the difference between new and malloc() in C++ and when each is useful?]
[1:03:04][@mrcowking][Is your handwriting as bad in real life as it is with a tablet?]
[1:03:22][@mr4thdimention][Can you also cut out instructions by doing more work to save previous computations, like d - XAxis followed by d - XAxis - YAxis? Should that just be 2 instructions?]
[1:03:50][@davidthomas426][Do you think we'll do multithreading in the software renderer?]
[1:04:07][@braincruser][Is it possible to Quad-pump every operation?]
[1:07:29][@braincruser][I meant, put it in wide instruction (SIMD)]
[1:09:02][@garryjohanson][Would you talk Jeff into doing an optimization stream where you are the TA?]
[1:10:01][That is the end of the stream][:speech]
[/video]