cinera_handmade.network/pervognsen/bitwise/bitwise/bitwise012.hmml

[video member=pervognsen stream_platform=twitch project=bitwise title="More Optimization & Clean-Up" vod_platform=youtube id=Hf5PevrUx4g annotator=Miblo]
[0:08][Recap and set the stage for the day][:speech]
[3:05][A few words on the slowness of linearly :searching while performing string interning][:optimisation :speech]
[5:56][:Run the program on the 6160384-line test3.ion with MSVC's statistical profiler, providing some background on :profiling][:speech]
[12:58][Check out our :performance statistics, to see that _malloc_base() is our biggest consumer of clock cycles][:run]
[17:46][Review the work implementing an open addressing hash table[ref
site=Wikipedia
page="Open addressing"
url=https://en.wikipedia.org/wiki/Open_addressing] for the string interning][:"data structure" :optimisation :parsing :research]
[26:23][Recommend [@nothings Sean Barrett]'s 'A Performance Comparison of Judy to Hash Tables'[ref
author="Sean Barrett"
title="A Performance Comparison of Judy to Hash Tables"
url=http://nothings.org/computer/judy/]][:"data structure" :optimisation :research]
[30:14][Review the work on str_hash_range() and str_intern_range(), with a mention of the Birthday Paradox[ref
site=Wikipedia
page="Birthday problem"
url=https://en.wikipedia.org/wiki/Birthday_problem]][:"data structure" :optimisation :parsing :research]
[38:06][Review the work using a hash table to optimise the symbol table lookup and type resolution][:"data structure" :optimisation :parsing :research]
[42:03][Re-enable all of ion_compile_file() in addition to the :parsing][:"code generation"]
[42:23][:Run it to find that it doesn't work, and investigate why]
[47:32][Revert the uncommitted changes][:admin]
[48:07][Regenerate test3.ion and :run our program on it performing the full :"code generation"]
[49:11][@xanatos387][So in the last stream, at a few points in passing was mentioned some use of randomization in hashing. I didn't understand that - isn't determinism the one thing you need in a hash table? When would you ever use randomization?][:"data structure" :optimisation]
[52:18][Consult the profile of the full :"code generation" to see that common_vsprintf() - called by strf() and buf__printf() - is our biggest :performance hotspot][:run]
[56:58][See that _malloc_base() - also called by strf() - is the second :performance hotspot][:memory :run]
[1:00:22][Toggle off the :"code generation" part of ion_compile_file()]
[1:00:31][:Run it to see that this takes ~4 seconds][:performance]
[1:01:23][Determine to ignore the C :"code generation" :performance for now, and instead remove the temporary allocations for the stretchy buffers][:memory :optimisation :speech]
[1:02:47][Consult the profile with ion_compile_file() only performing the :parsing, to see where buf__grow() creates a :performance hotspot]
[1:05:24][Introduce temp_alloc() to do arena-style :memory allocation for buf__grow() to call][:optimisation]
[1:13:29][Relieve buf__grow() of calling realloc()][:memory]
[1:16:07][:Run it to see that it all actually works][:"code generation" :memory :parsing]
[1:16:56][Make buf__grow() call temp_alloc(), and enable the latter to allocate from main :memory][:optimisation]
[1:19:02][:Run it under the profiler, and compare the reports for our two allocators, noting that their total CPU time is similar][:memory :optimisation :performance]
[1:25:50][Q&A][:speech]
[1:25:57][@barubro][Aren't the profile buckets relative time? Not absolute time?]
[1:27:38][@synchronizerman][Wait, without malloc how is he allocating :memory? Did he make his own memory allocator?]
[1:28:38][@miotatsu][stb_sprintf time!]
[1:29:27][Determine to bug [@rygorous Fabian] about the unexpectedly comparable :performance of the :memory arena allocation vs malloc]
[1:30:26][@barubro][Why not try the other profiler? "Performance wizard"]
[1:30:48][:Run it via the "Performance wizard"][:performance]
[1:33:26][@xanatos387][@pervognsen Will we do direct :"code generation" now or will we wait on that until we have a CPU to target?]
[1:33:48][@cmdrkroz][@pervognsen Is the eventual plan to write an Ion compiler in Ion? Reason why I ask is whether this will have the same behavior in the RISC-V itself]
[1:35:01][@artexx2000][@pervognsen Any chance make Ion func call return more than one value in the future when C backend no longer needed?][:language]
[1:36:55][@barubro][Is the Ion compiler going to be modified to take multiple files? How are you going to handle things like directories for multiple platforms?]
[1:38:10][@aks_ism][If I recall correctly, malloc() does only page table setup, doesn't really allocate any pages. Maybe that's why we are seeing this issue][:memory]
[1:39:45][We're coming up to the end, with a glimpse into next week][:speech]
[/video]