[video output=day181 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Log-based Performance Counters" vod_platform=youtube id=s_qSvBp6nFw annotator=Miblo annotator=debiatan]
[00:11][Recap and plan for today]
[02:10][Our performance counters are inexpensive, convenient and thread-safe]
[03:39][They report the amount of time an operation takes, but not _when_ it happens]
[04:18][In order to improve our profile view, we need to collect more data]
[06:02][The action of collecting the data should be cheap. Any real work should be deferred till the end of the frame]
[07:36][A log-based system could fit our requirements]
[08:18][Is the number of logged events going to be too large?]
[10:27][Let's give the log-based approach a try]
[10:55][Can we call rdtscp?]
[12:15][We seem to be in luck][quote 218]
[13:50][rdtsc tells us about elapsed processor time]
[14:51][rdtscp should also report which processor/core is running our code]
[19:00][Trying to see if rdtscp returns the core identity]
[20:20][We don't have a way of disambiguating threads at the moment]
[20:58]["It's kinda nice to get stuff for free"][quote 220]
[21:13][We'll use a lighter version of debug_record to record debug entries]
[23:22][Disambiguating threads, cores, debug_records and debug_record_arrays inside debug_event]
[24:19][We only need a single big DebugEventArray]
[25:26][Keeping track of the position inside the debug event array]
[25:55][Double-buffering the debug event arrays]
[27:12][Filling in the debug event records]
[28:22][The array index will be determined by a preprocessor symbol defined inside build.bat]
[29:49][Defining AtomicAddU32]
[32:34][Telling apart the beginning and end of a timed block using entry types]
[33:46][Pulling together duplicated event recording code into the RecordDebugEvent macro]
[37:30][What is the 32-bit interlocked add-exchange instruction? It's _InterlockedExchangeAdd]
[40:38][We can't do a synchronous exchange of pointers and clear the event index at the same time...]
[42:08][... unless we pack the position and the event index together into a single 64-bit variable]
[45:41][Exchanging debug event arrays at the end of the frame]
[46:30][Transforming the macros into inline functions temporarily for easy stepping]
[47:40][Our debug array is not large enough for the number of entries we're recording]
[49:04][Bump that number temporarily just to see if that really is happening for reals][quote 221]
[49:30][The information we log will allow more in-depth profiling operations]
[50:05][Recap on the bundling together of array and event indices]
[50:49][Switching event array indices correctly]
[53:29][CollateRecords() can reproduce the behavior of our old non-log-based event system]
[57:12][We have two event arrays (one for each compilation unit) and that makes the code uglier than it would be if we had better tools]
[58:25][Linearizing the access to the debug_record arrays]
[1:02:21][Fixing compilation errors]
[1:03:59][Accessing the filenames, function names and line numbers of debug_records]
[1:06:50][The log-based system seems to be working]
[1:07:37][Q&A][:speech]
[1:08:04][@elxenoaizd][You mentioned that you use a known base address for your memory management. Could you talk a bit more about that? Does that mean I can now find things by just offsetting from that address, and does it mean that if I fwrite this whole block I'll essentially be fwriting the whole game?]
[1:09:02][@butwhynot1][You can specify which functions to optimize by enabling optimization on the command line and surrounding code you don't want optimized with #pragma optimize("", off) ..... #pragma optimize("", on)]
[1:09:51][@elxenoaizd][Do you ever find a use for size_t or do you just use u32, u64, etc?]
[1:10:20][@elxenoaizd][Do you keep track of struct padding when you add / remove fields, or is it something you don't think about too much, so order of fields doesn't matter much?]
[1:10:56][@butwhynot1][Also, what's the point of the core number in the debug info? It seems the thread ID is the important part]
[1:11:54][Look forward to tomorrow]
[1:12:35][@JamesWidman][How will we avoid collecting or displaying stats on debug-rendering code?]
[1:12:53][@plain_flavored][Are profilers as bad as debuggers?]
[1:12:58][@inliferty][What do you think about checked exceptions?]
[1:13:29][Thanks and a few words on preordering the source and using the GitHub repositories][:speech]
[/video]