cinera_handmade.network/cmuratori/hero/code/code594.hmml


[video output=day594 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Switching from Center-Radius to Min-Max" vod_platform=youtube id=TzoW0CjmPl8 annotator=Miblo]
[0:03][Plug the Meow the Infinite printed comic Kickstarter[ref
site=Kickstarter
page="Meow the Infinite: Book One"
url=https://www.kickstarter.com/projects/annarettberg/meow-the-infinite-book-one]][:research]
[0:25][Recap our creation of hhlightprof][:lighting :optimisation :speech]
[1:26][Demo hhlightprof (7.150223 total seconds elapsed)][:lighting :optimisation :run]
[2:31][Describe the two distinct ray casting / box categorisation routines in RayCast()][:lighting :optimisation :research]
[4:46][Describe the lighting_box structure, with the determination to reduce extraneous computations by storing the BoxMin and BoxMax][:"data structure" :lighting :optimisation :research]
[6:50][Replace P and Radius with BoxMin and BoxMax in lighting_box, and begin to propagate this change][:"data structure" :lighting :optimisation]
[7:28][Why compute the BoxMin and BoxMax part way down the :lighting pipeline?][:optimisation :speech]
[8:44][Continue to update the :lighting routines to use BoxMin and BoxMax][:optimisation]
[16:23][:Run hhlightprof to find that the light boxes and boxrefs don't match, because our storage format has changed][:lighting]
[17:02][Enable hhlightprof to transform the old Center-Radius dump to the new Min-Max format][:lighting]
[18:49][hhlightprof total seconds elapsed (with the same results as earlier): 6.702761][:lighting :optimisation :run]
[19:21][Make a note to remove the dump transform from hhlightprof][:lighting]
[19:47][Reflect on our data storage / computation improvement][:lighting :optimisation :research]
[20:34][Further RayCast() optimisations][:lighting :optimisation :research]
[22:17][Further RayCast() optimisations: 1) How do we want to store our tree?][:"data structure" :lighting :optimisation :research]
[22:40][Further RayCast() optimisations: 2) What does our tree want to look like (e.g. k-d tree)?][:"data structure" :lighting :optimisation :research]
[23:11][Further RayCast() optimisations: 3) Do we want the ray casting and the box categorisation in the same loop?][:lighting :optimisation :research]
[23:59][Determine to investigate the efficiency of our information streaming][:lighting :optimisation :research]
[24:54][Separate the ray casting and box categorisation loops in RayCast()][:lighting :optimisation]
[26:17][Augment lighting_work with a lighting_box_pack ScratchSpace array][:"data structure" :lighting :optimisation]
[27:26][Make InternalLightingCore() initialise our new ScratchSpace, and make ProfileRun() in hhlightprof allocate space for it][:lighting :memory :optimisation]
[30:56][:Memory per thread][:research :threading]
[32:59][Make a note to add :memory per thread][:threading]
[33:50][A few words on :memory per thread][:research :threading]
[34:55][:Run hhlightprof with the same results][:lighting :optimisation]
[35:07][Temporarily make RayCast() push lighting boxes onto the ScratchSpace][:lighting :memory :optimisation]
[36:00][:Run hhlightprof with the same results][:lighting :memory :optimisation]
[36:07][Revert RayCast() to push boxes onto the old stack array][:lighting :memory :optimisation]
[36:40][CTAssert() lighting_box_pack][:"data structure"]
[36:59][Make RayCast() do the box pushing and box categorisation in two passes][:lighting :optimisation]
[42:56][:Run hhlightprof with different results][:lighting :optimisation]
[43:14][Scour RayCast() for bugs][:lighting :optimisation :research]
[44:48][Use consistent :language in both loops for accessing lighting_box_pack][:lighting]
[45:05][Continue to scour RayCast() for bugs][:lighting :optimisation :research]
[45:55][Remove the compiled-out code in RayCast()][:lighting :optimisation]
[46:04][Continue to scour RayCast() for bugs][:lighting :optimisation :research]
[46:33][:Run hhlightprof still with different results from originally][:lighting :optimisation]
[47:01][Toggle RayCast() back to the old one-pass code][:lighting :optimisation]
[47:09][:Run hhlightprof with the same results as originally][:lighting :optimisation]
[47:12][Toggle RayCast() to the new two-pass code][:lighting :optimisation]
[47:16][Continue to hunt for inconsistencies between the one-pass and two-pass code in RayCast()][:lighting :optimisation :research]
[48:33][Fix the box pushing loop to only push a box if necessary][:lighting :optimisation]
[49:17][:Run hhlightprof with different results from originally][:lighting :optimisation]
[50:08][Continue to scour RayCast() for bugs][:lighting :optimisation :research]
[51:01][Assert in the box categorisation loop that the previously pushed box is a LeafContainer][:lighting :optimisation]
[51:23][:Run hhlightprof without hitting that assertion][:lighting :optimisation]
[51:29][Continue to scour RayCast() for bugs][:lighting :optimisation :research]
[54:22][@redunlocked][Q: The while(Depth) loop doesn't wrap the second loop][:lighting :optimisation]
[54:56][Continue to scour RayCast() for bugs][:lighting :optimisation :research]
[55:14][Make hhlightprof stream out to files the box leaves and partitions, introducing RECORD_LEAF_BOX() and RECORD_PARTITION_BOX()][:"file io" :lighting :optimisation]
[59:53][:Run hhlightprof][:lighting :optimisation]
[1:00:34][Make hhlightprof only stream out the first run of box leaves and partitions][:"file io" :lighting :optimisation]
[1:00:54][:Run hhlightprof][:lighting :optimisation]
[1:00:59][Rename our box leaves and partitions files to new*][:admin :lighting :optimisation]
[1:01:33][Toggle RayCast() back to the old one-pass code][:lighting :optimisation]
[1:01:44][:Run hhlightprof][:lighting :optimisation]
[1:01:48][Rename our box leaves and partitions files to old*][:admin :lighting :optimisation]
[1:02:02][Try to compare our partitions in Meld][:admin :lighting :optimisation]
[1:03:54][Install Beyond Compare[ref
site="Scooter Software: Home of Beyond Compare"
url=http://www.scootersoftware.com/]][:admin]
[1:04:59][Compare our box leaves in Beyond Compare, to see differences][:admin :lighting :optimisation]
[1:06:13][Compare our partitions in Beyond Compare, to see differences][:admin :lighting :optimisation]
[1:07:09][Toggle RayCast() to the new two-pass code][:lighting :optimisation]
[1:07:16][Scour RayCast() for a reason why we're pushing more boxes][:lighting :optimisation :research]
[1:08:17][Consult our box leaves differences in Beyond Compare][:admin :lighting :optimisation]
[1:10:16][Continue to scour RayCast() for a reason why we're pushing more boxes][:lighting :optimisation :research]
[1:10:36][Assert in RayCast() that the partitioning loop does not receive a non-leaf container][:lighting :optimisation]
[1:10:55][Toggle off the box / partition streaming in hhlightprof][:"file io" :lighting :optimisation]
[1:11:54][:Run hhlightprof without hitting that assertion][:lighting :optimisation]
[1:11:57][Puzzle over our bug in RayCast()][:lighting :optimisation :research]
[1:13:01][Make ProfileRun() print out the total input and output boxes][:lighting :optimisation]
[1:14:05][:Run hhlightprof to find a mismatch between the total box counts][:lighting :optimisation]
[1:14:28][Consult our box leaves differences in Beyond Compare][:admin :lighting :optimisation]
[1:15:26][Restrict our dumping to one ray's worth of work][:lighting :optimisation]
[1:18:00][:Run hhlightprof][:lighting :optimisation]
[1:18:15][Enable RECORD_RAYCAST_STACK][:lighting :optimisation]
[1:18:24][:Run hhlightprof][:lighting :optimisation]
[1:18:30][Rename our box leaves and partitions files to new*][:admin :lighting :optimisation]
[1:19:05][Toggle RayCast() back to the old one-pass code][:lighting :optimisation]
[1:19:15][:Run hhlightprof][:lighting :optimisation]
[1:19:16][Find our box leaves and partitions dumps to match][:admin :lighting :optimisation]
[1:19:31][Increase our dumping to sixteen rays' worth of work][:lighting :optimisation]
[1:20:13][:Run hhlightprof][:lighting :optimisation]
[1:20:19][Consult our box leaves and partitions files to see partition box indices lower than the input box count][:admin :lighting :optimisation]
[1:22:50][Reacquaint ourselves with SplitBox()][:lighting :optimisation :research]
[1:25:44][Disable RECORD_RAYCAST_STACK][:lighting :optimisation]
[1:25:55][hhlightprof total seconds elapsed: 6.682471][:lighting :optimisation :run]
[1:26:08][Prepare to remove the indirect table-read from SplitBox()][:lighting :optimisation]
[1:28:23][Change SplitBox() to split boxes directly into storage, removing AddBoxReferences() and AddBoxReference()][:lighting :memory :optimisation]
[1:36:38][Update BuildSpatialPartitionForLighting() to use AddBoxStorage() instead of AddBoxReference()][:lighting :memory :optimisation]
[1:37:36][:Run hhlightprof with a high max error / texel][:lighting :optimisation]
[1:38:00][Check all GetBox() calls, and SplitBox() for bugs][:lighting :optimisation :research]
[1:40:28][Fix AddBoxStorage() to correctly store the box index][:lighting :optimisation]
[1:40:45][hhlightprof total seconds elapsed: 6.460956][:lighting :optimisation :run]
[1:41:31][Make a note to remove the BoxRefTable from lighting_solution, removing all mentions of them][:"data structure" :lighting]
[1:42:45][hhlightprof total seconds elapsed: 6.449993][:lighting :optimisation :run]
[1:43:34][Enable RECORD_RAYCAST_STACK][:lighting :optimisation]
[1:43:53][:Run hhlightprof][:lighting :optimisation]
[1:44:02][Consult our box leaves and partitions files to see high partition box indices, as expected, and rename the files to old*][:admin :lighting :optimisation]
[1:44:17][Toggle RayCast() to the new two-pass code][:lighting :optimisation]
[1:44:26][:Run hhlightprof][:lighting :optimisation]
[1:44:31][Rename our box leaves and partitions files to new*, and diff them in Beyond Compare][:admin :lighting :optimisation]
[1:45:26][Make RayCast() record box partition pushes, introducing RECORD_PARTITION_PUSH()][:"file io" :lighting :optimisation]
[1:48:55][Toggle RayCast() back to the old one-pass code][:lighting :optimisation]
[1:49:11][:Run hhlightprof and rename the files to old*][:admin :lighting :optimisation]
[1:49:52][Fix RayCast() to call RECORD_PARTITION_PUSH() in the correct place][:"file io" :lighting :optimisation]
[1:50:08][:Run hhlightprof and see our pushes in the files][:admin :lighting :optimisation]
[1:50:31][Make RECORD_RAYCAST_END() append a newline after each ray][:"file io" :lighting :optimisation]
[1:50:52][:Run hhlightprof, see our rays and rename the files to old*][:admin :lighting :optimisation]
[1:51:18][Toggle RayCast() to the new two-pass code][:lighting :optimisation]
[1:51:34][:Run hhlightprof and rename the files to new*][:admin :lighting :optimisation]
[1:51:43][Diff our partitions files in Beyond Compare, to see different treatment of boxes 1638 and 1844][:admin :lighting :optimisation]
[1:54:17][Set up a breakpoint on box 1638 in RayCast(), and switch the build to -Od][:lighting :optimisation]
[1:55:54][Toggle RayCast() back to the old one-pass code, and comment out the __debugbreak()][:lighting :optimisation]
[1:56:07][:Run hhlightprof and compare these box leaves and partitions with the existing old ones][:admin :lighting :optimisation]
[1:56:26][Reinstate the __debugbreak() in RayCast()][:lighting :optimisation]
[1:56:36][:Run to our __debugbreak() and inspect the values][:lighting :optimisation]
[1:57:43][Determine to check that the leaf count is not larger than our available size][:lighting :optimisation :research]
[1:59:00][Make ProfileRun() save the GlobalMaxWorkStackDepth for RayCast() to assert the LeafCount against, toggling RayCast() back to the old one-pass code][:lighting :optimisation :research]
[2:00:19][Toggle RayCast() back to the old one-pass code][:lighting :optimisation]
[2:00:31][:Run hhlightprof and hit an assertion][:lighting :optimisation]
[2:01:40][Toggle off the __debugbreak() in RayCast()][:lighting :optimisation]
[2:02:07][:Run hhlightprof without hitting our LeafCount assertion][:lighting :optimisation]
[2:02:14][Step through RayCast() to see a consistently low LeafCount][:lighting :optimisation :run]
[2:02:50][Consider ProfileRun() to be allocating enough :memory][:lighting :optimisation :research]
[2:03:29][Remove our LeafCount assertion from RayCast(), and reinstate the __debugbreak() calls on box 1638][:lighting :optimisation]
[2:04:14][:Run to our __debugbreak() and inspect Box 1638][:lighting :optimisation]
[2:04:29][~RemedyBG feature request: Save Watch Window to File][:admin]
[2:04:49][Continue to inspect Box 1638, and save off its Min and Max][:lighting :optimisation :run]
[2:07:27][Realise that our two-pass RayCast() will necessarily test more boxes for partitioning, because it pushes more leaves onto the stack][:lighting :optimisation :run]
[2:08:43][Remove the two-pass streaming from RayCast(), because we want to early-out][:lighting :optimisation]
[2:09:01][Plan to go forward with a k-d tree, testing the near side before the far side and early-outing if we hit][:lighting :optimisation :research]
[2:09:52][Remove the __debugbreak() calls from RayCast()][:lighting :optimisation]
[2:10:11][Remove the ScratchSpace from lighting_work][:"data structure" :lighting :optimisation]
[2:11:15][:Run hhlightprof successfully][:lighting :optimisation]
[2:11:24][:Run the game successfully][:lighting :optimisation]
[2:12:01][Q&A][:speech]
[2:12:38][@somebody_took_my_name][Q: Just wanted to let you know we have an IsMemoryEqual() function. Nice debug work today][:memory]
[2:13:48][@xxthebigfoxx][Q: Off-topic, but I noticed your last blog post is dated 2019]
[2:14:45][@x1bzzr][Q: Why is it that software companies increasingly seem to not care about :performance? I've heard you talk about the technical reasons, but I still don't understand why companies would give up on having better software quality than their competitors. Is it because education is bad? Is it because non-performance oriented software is being normalized to the point people don't care about it? Is it because hardware has been compensating for the lack of software performance?]
[2:17:20][@recursivechat][Q: Do you have a git repo for any of your code? Not necessarily [~hero Handmade Hero] code but something small that I could read to see what sort of "patterns" you use to write good code]
[2:19:04][@mattiamanzati][Q: About my question before the stream started, I am looking for a doubly linked list structure, and usually that's done through a structure with a first and last entry pointers, and those have a structure with a next, prev and value pointers. (Let's call this entry_structure.) In order to add an item to the end of the list you need to create a new entry_structure, set the prev pointer to the previous last entry, and UPDATE THE LAST entry_structure in order to have next point to the new last item. Is there any other structure that represents a doubly linked list without this prerequisite?][:"data structure"]
[2:21:02][@xsinxdx][Q: Are there any scalability / usability issues when using enums vs something like a hash table for indexing assets?]
[2:23:10][@buildervanished][Q: Do you know of some sane way of learning a bit of DirectX 11 other than trial and error while scavenging information on MSDN?]
[2:23:20][@mindmark42][Q: How would you recommend writing assembly directly for a function if the compiler refuses to optimize it the way you want?][:asm]
[2:24:17][@chronic_quagga][Q: How important is it, in game development and in general, to be concerned with microarchitecture-level optimizations and using metrics from CPU performance counters?][:optimisation]
[2:24:51][@theintrojuicer][Q: How many lines of code are there in the [~hero Handmade Hero] project?]
[2:25:16][34,292 lines of code][:admin]
[2:25:47][@thordura][Q: If you do another programming discussion with [@naysayer88 Jon], do you think it would be valuable to do a discussion about entity systems?][:"entity system"]
[2:26:14][@kniffel5][Q: Visual Studio 19 has support for x86 but not x64 assembly. Why's one in it, but the other isn't? Are they so different from each other?][:asm]
[2:27:16][@thordura][Q: I meant like what you guys consider is a good go-to approach to entities][:"entity system"]
[2:28:02][Wonder if we heard back from @mattiamanzati about the doubly linked list question][:"data structure" :research]
[2:29:21][Wander around the orphanage][:run]
[2:29:59][@sagian2005][Q: I think they wanted to know if there is something that acts like a doubly linked list that isn't actually a doubly linked list][:"data structure"]
[2:30:42][Wrap it up with a plug of the Meow the Infinite printed comic Kickstarter[ref
site=Kickstarter
page="Meow the Infinite: Book One"
url=https://www.kickstarter.com/projects/annarettberg/meow-the-infinite-book-one] and the determination to raid [@naysayer88 Jon]'s stream[ref
site=Twitch
page=Naysayer88
url=https://twitch.tv/naysayer88]][:speech]
[/video]