diff --git a/pervognsen/bitwise/bitwise/bitwise003.hmml b/pervognsen/bitwise/bitwise/bitwise003.hmml
index 33a1b8d..4620ba8 100644
--- a/pervognsen/bitwise/bitwise/bitwise003.hmml
+++ b/pervognsen/bitwise/bitwise/bitwise003.hmml
@@ -58,7 +58,7 @@
 [1:48:00][The homework seems kind of daunting][:authored]
 [1:49:15][@xanatos387][@pervognsen You've used the phrase "left-fold" several times tonight. What does that refer to?]
 [1:53:19][@elavid][@gargltk Yes, grammar is broken right now, doesn't use expr3][:language :parsing]
-[1:54:34][@captainkraft][@pervognsen If you were on Linux, would you use gdb to frequently run the code through the debugger, use a different debugger, or use some other method entirely?]
+[1:54:34][A few words on having a finely interleaved debugging and editing flow, in response to a question from @CaptainKraft about debugging with GDB on Linux][:speech]
 [1:55:18][There's a typo in the Wirth book[ref
  author="Niklaus Wirth"
  title="Compiler Construction"
diff --git a/pervognsen/bitwise/bitwise/bitwise012.hmml b/pervognsen/bitwise/bitwise/bitwise012.hmml
new file mode 100644
index 0000000..dc163df
--- /dev/null
+++ b/pervognsen/bitwise/bitwise/bitwise012.hmml
@@ -0,0 +1,48 @@
+[video member=pervognsen stream_platform=twitch project=bitwise title="More Optimization & Clean-Up" vod_platform=youtube id=Hf5PevrUx4g annotator=Miblo]
+[0:08][Recap and set the stage for the day][:speech]
+[3:05][A few words on the slowness of linearly :searching while performing the string interning][:optimisation :speech]
+[5:56][:Run the program on the 6160384-line test3.ion with MSVC's statistical profiler, providing some background on :profiling :speech]
+[12:58][Check out our :performance statistics, to see that _malloc_base() is our biggest consumer of clock cycles][:run]
+[17:46][Review the work implementing an open addressing hash table[ref
+ site=Wikipedia
+ page="Open addressing"
+ url=https://en.wikipedia.org/wiki/Open_addressing] for the string interning][:"data structure" :optimisation :parsing :research]
+[26:23][Recommend [@nothings Sean Barrett]'s 'A Performance Comparison of Judy to Hash Tables'[ref
+ author="Sean Barrett"
+ title="A Performance Comparison of Judy to Hash Tables"
+ url=http://nothings.org/computer/judy/]][:"data structure" :optimisation :research]
+[30:14][Review the work on str_hash_range(), and str_intern_range() with a mention of the Birthday Paradox[ref
+ site=Wikipedia
+ page="Birthday problem"
+ url=https://en.wikipedia.org/wiki/Birthday_problem]][:"data structure" :optimisation :parsing :research]
+[38:06][Review the work using a hash table to optimise the symbol table lookup and type resolution][:"data structure" :optimisation :parsing :research]
+[42:03][Re-enable all of ion_compile_file() in addition to the :parsing][:"code generation"]
+[42:23][:Run it to find that it doesn't work, and investigate why]
+[47:32][Revert the uncommitted changes][:admin]
+[48:07][Regenerate test3.ion and :run our program on it performing the full :"code generation"]
+[49:11][@xanatos387][So in the last stream, at a few points in passing was mentioned some use of randomization in hashing. I didn't understand that - isn't determinism the one thing you need in a hash table? When would you ever use randomization?][:"data structure" :optimisation]
+[52:18][Consult the profile of the full :"code generation" to see that common_vsprintf() - called by strf() and buf__printf() - is our biggest :performance hotspot][:run]
+[56:58][See that _malloc_base() - also called by strf() - is the second :performance hotspot][:memory :run]
+[1:00:22][Toggle off the :"code generation" part of ion_compile_file()]
+[1:00:31][:Run it to see that this takes ~4 seconds][:performance]
+[1:01:23][Determine to ignore the C :"code generation" :performance for now, and rather remove the temporary allocations for the stretchy buffers][:memory :optimisation :speech]
+[1:02:47][:Consult the profile with ion_compile_file() only performing the :parsing, to see where buf__grow() creates a :performance hotspot]
+[1:05:24][Introduce temp_alloc() to do arena-style :memory allocation for buf__grow() to call][:optimisation]
+[1:13:29][Relieve buf__grow() of calling realloc()][:memory]
+[1:16:07][:Run it to see that it all actually works][:"code generation" :memory :parsing]
+[1:16:56][Make buf__grow() call temp_alloc(), and enable the latter to allocate from main :memory][:optimisation]
+[1:19:02][:Run it under the profiler, and compare the reports for our two allocators, noting that their total CPU time is similar][:memory :optimisation :performance]
+[1:25:50][Q&A][:speech]
+[1:25:57][@barubro][Aren't the profile buckets relative time? Not absolute time?]
+[1:27:38][@synchronizerman][Wait, without malloc how is he allocating :memory? Did he make his own memory allocator?]
+[1:28:38][@miotatsu][stb_sprintf time!]
+[1:29:27][Determine to bug [@rygorous Fabian] about the unexpectedly comparable :performance of the :memory arena allocation vs malloc]
+[1:30:26][@barubro][Why not try the other profiler? "Performance wizard"]
+[1:30:48][:Run it via the "Performance wizard"][:performance]
+[1:33:26][@xanatos387][@pervognsen Will we do direct :"code generation" now or will we wait on that until we have a CPU to target?]
+[1:33:48][@cmdrkroz][@pervognsen Is the eventual plan to write an Ion compiler in Ion? Reason why I ask is whether this will have the same behavior in the RISC-V itself]
+[1:35:01][@artexx2000][@pervognsen Any chance make Ion func call return more than one value in the future when C backend no longer needed?][:language]
+[1:36:55][@barubro][Is the Ion compiler going to be modified to take multiple files? How are you going to handle things like directories for multiple platforms?]
+[1:38:10][@aks_ism][If I recall correctly, malloc() does only page table setup, doesn't really allocate any pages. Maybe that's why we are seeing this issue][:memory]
+[1:39:45][We're coming up to the end, with a glimpse into next week][:speech]
+[/video]