[video member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Thread-safe Performance Counters" vod_platform=youtube id=oDZ-sh0cKoY annotator=Miblo annotator=debiatan]
[0:16][Recap and plan for today]
[2:27][Packing a tiny monospaced font into the asset file]
[4:05][Supporting font selection]
[6:54][Adding the debug font]
[9:09][Testing the packing of multiple fonts]
[10:08][The AssetCount is wrong. Let's debug that]
[11:50][We're missing one entire character set]
[14:45][Are TypeIDs not initialized?]
[15:22][Let's see if it's the asset processor's fault]
[17:36][The way the code is structured, we can't BeginAssetType twice for the same AssetType...]
[18:13][... but we can group all the font packing calls inside WriteFonts to sidestep that issue]
[19:16][Tagging the fonts]
[22:05][Testing the new font packing]
[22:40][Picking a different font]
[23:55][It works!]
[24:33][Specifying the size of the font]
[27:30][Adding some extra TIMED_BLOCK calls]
[31:59][There are still two problems that we need to solve:]
[32:02][1) The timing of anything that encloses the render process is wrong]
[32:40][2) The access to the timers is not thread-safe]
[35:00][The first problem can be solved by displaying the counters with one frame of delay...]
[36:20][... but we'll first solve the threading problem]
[36:42][(Blackboard) Thread-safe performance counters]
[36:55][Review of implementation of performance counters. Using Record->CycleCount as temporary storage is problematic, because it should contain the end result]
[39:20][Using a separate value to keep track of the StartCycles]
[40:15][The concurrent access to HitCount and CycleCount is also a problem]
[42:28][Implementing AtomicAddU32]
[44:15][Consulting the _InterlockedExchangeAdd docs]
[45:56][The result does not look correct yet]
[46:20][Resetting the counters atomically]
[47:44][Correcting the _snprintf_s format specifiers of the time counters]
[48:10][Updating CycleCount and HitCount atomically together by merging them into a single U64]
[52:32][Making the reading and resetting of the counters also atomic using an unconditional atomic exchange]
[53:56][It seems to be working]
[54:05][We are now thread-safe]
[55:20][Reporting both DebugRecordsMain and DebugRecordsOptimized]
[58:48][Also reporting the line number of TIMED_BLOCKs]
[59:45][Adding some more TIMED_BLOCKs]
[1:00:30][The path of least resistance should be the right path, so that doing the right thing is never drudge work]
[1:01:10][Q&A][:speech]
[1:01:36][@andsz_][This was, again, pretty awesome! Thank you]
[1:01:44][@TheBuzzSaw][Wouldn't a union be helpful for that HitCount_CycleCount? It just seems unnecessary having to remember their offsets and / or lengths]
[1:02:14][@elxenoaizd][Yesterday I asked about preprocessor constants other than __FILE__, __LINE__ etc. I did some searching and found out about __TIME__ and __DATE__. Maybe it's useful for us to include a date-time stamp in some of our logs?]
[1:02:50][@insofaras][Can you align the text into columns with %32s in the sprintf or something?]
[1:03:09][handmade.cpp: Align the text into columns]
[1:04:25][@elxenoaizd][You mentioned yesterday that destructors are called when the scope of the object ends. I just wanted to note it seems that if you use it to exit a function instead of 'return' then the destructors won't get called!]
[1:04:58][@dandymcgee][Will you marry me?]
[1:05:27][@panic00][Won't rdtsc give wrong results if your thread is pre-empted and scheduled onto a different CPU between the constructor and destructor calls?]
[1:08:18][@MannySlain][Are you working on any games other than Handmade Hero?]
[1:08:37][@twitch_makes_me_itch][General Programming Question: Do you have any general advice for optimizing code performance (e.g. multithreading, algorithm complexity analysis, etc.)?]
[1:09:00][@elxenoaizd][Sorry I made a typo in the previous question: You mentioned yesterday that destructors are called when the scope of the object ends. I just wanted to note it seems that if you use "exit" to exit a function instead of 'return' then the destructors won't get called!]
[1:09:44][@Wisteso][The values are pretty hard to read when they're changing literally every frame. Wouldn't it be better to average them?]
[1:10:11][@panic00][Have you considered keeping the timers in thread-local storage instead of using atomics every time you write to them?]
[1:12:11][@cubercaleb][Are lock-free data structures worth the time they take to write?]
[1:12:35][@teryrorDS][Do you think it would be worthwhile to make the time records hierarchical (like a call tree), and how would you go about it?]
[1:14:43][@SeaOfSorrows][Can you explain the difference between mutexes and interlocks?]
[1:15:17][Blackboard: Interlocks and Mutexes]
[1:20:43][@ijustwantfood][Wait, about the values, can you draw a graph out of them?]
[1:20:55][@elxenoaizd][NOOO! Save the Cherry MX Blues]
[1:21:03][@ttbjm][Will you be able to fix the click bug at the end of sounds with the debug system?]
[1:21:36][@guit4rfreak][Can you still time a smaller section inside the code or does it just work for whole functions now?]
[1:22:21][@elxenoaizd][I noticed you prefer to keep things on the stack and return by value. 0) This helps with locality of reference and is more cache friendly, am I right? 1) Is the stack size something to be concerned about? 2) When returning by value, is the time taken to copy the object and return it something to be worried about? Maybe when the struct is large? In that case, do you prefer to return a pointer to the object?]
[1:23:06][@sparklyguy][How do threads wait on lock-free structures? Do they?[ref
    site="Maurice Herlihy"
    page="Wait-Free Synchronization"
    url="http://cs.brown.edu/~mph/Herlihy91/p124-herlihy.pdf"]]
[1:28:20][@gasto5][What's wrong with pointers to inline functions?]
[1:28:59][@elxenoaizd][Why you skips my questions?]
[1:29:09][@BIurberry][Did you get a degree in CS? What's your background on programming?]
[1:29:42][Wind it down][:speech]
[/video]