[video output=day592 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Capturing the Entire Lighting Data" vod_platform=youtube id=YTIz_eV_BsE annotator=Miblo]
[0:02][Welcome to the stream][:speech]
[0:08][Plug the Meow the Infinite printed comic Kickstarter[ref
site=Kickstarter
page="Meow the Infinite: Book One"
url=https://www.kickstarter.com/projects/annarettberg/meow-the-infinite-book-one] and the related fun videos at Molly Rocket's YouTube channel[ref
site=YouTube
page="Molly Rocket"
url=https://www.youtube.com/c/MollyRocket]][:research]
[4:06][Dive into the code][:speech]
[4:46][Fix hhlightprof to allocate :memory for four times the BoxCount to allow room for child boxes][:lighting]
[5:12][Determine to pseudo-verify our standalone ray tracer][:lighting :speech]
[6:44][Determine to learn VTune][:profiling :speech]
[7:19][:Run hhlightprof for 16 seconds, to completion][:lighting]
[7:41][Decrease the ray multiplier from 256 to 64 in TestRayCast()][:lighting]
[7:51][:Run hhlightprof for 4 seconds, to completion][:lighting]
[8:00][Increase the ray multiplier from 64 to 128 in TestRayCast()][:lighting]
[8:08][:Run hhlightprof for 7 seconds, to completion][:lighting]
[8:15][Launch VTune and hunt its interface for custom analysis capabilities][:profiling :run]
[11:45][Launch VTune as administrator][:profiling :run]
[12:11][Consult VTune's documentation[ref
site="Intel Developer Zone"
page="Intel® VTune™ Profiler: Featured Documentation"
url=https://software.intel.com/en-us/vtune/documentation/featured-documentation]][:profiling :research]
[14:09][:Research VTune's Hardware Event-based Sampling Collection[ref
site="Intel Developer Zone"
page="Intel® VTune™ Profiler User Guide: Hardware Event-based Sampling Collection"
url=https://software.intel.com/en-us/vtune-help-hardware-event-based-sampling-collection]][:profiling]
[15:45][:Research VTune's Microarchitecture Exploration Analysis for Hardware Issues[ref
site="Intel Developer Zone"
page="Intel® VTune™ Profiler User Guide: Microarchitecture Exploration Analysis for Hardware Issues"
url=https://software.intel.com/en-us/vtune-help-general-exploration-analysis]][:profiling]
[16:02][:Research VTune's Custom Analysis[ref
site="Intel Developer Zone"
page="Intel® VTune™ Profiler User Guide: Custom Analysis"
url=https://software.intel.com/en-us/vtune-help-custom-analysis]][:profiling]
[16:27][Create a new "Fruit Salad x12" custom analysis in VTune][:admin :profiling]
[18:15][Some words on x64 performance counters][:timing :speech]
[21:14][Continue to create our "Fruit Salad x12" custom analysis in VTune][:admin :profiling]
[23:25][Rename "Fruit Salad x12" to "No Counters Template" and clone it as "The Ultimate Fruit Salad"][:admin :profiling]
[24:36][Enable our desired counters in "The Ultimate Fruit Salad"][:admin :profiling]
[25:42][:Run hhlightprof in VTune][:lighting :profiling]
[25:58][Consult our VTune Hardware Events analysis][:lighting :profiling :run]
[27:44][A few words on 4 µOPs per cycle issuance][:hardware :speech]
[29:54][Continue to consult our VTune Hardware Events analysis][:lighting :profiling :run]
[30:44][Rename "The Ultimate Fruit Salad" to "Instructions Per Clock" and swap out the UOPS_EXECUTED.CORE_CYCLES_NONE counter for UOPS_EXECUTED.CORE_CYCLES_GE_4][:admin :profiling]
[31:58][:Run our "Instructions Per Clock" analysis of hhlightprof][:lighting :profiling]
[32:13][Consult our "Instructions Per Clock" VTune analysis][:lighting :profiling :run]
[32:42][Interpret our "Instructions Per Clock" VTune analysis][:lighting :profiling :run]
[34:36][Get the Work In More Efficiently vs Do the Work More Efficiently][:performance :speech]
[35:04][Create a new "Arithmetic Port Usage" custom analysis in VTune][:admin :profiling]
[38:16][Understanding port usage with uops.info[ref
site=uops.info
url=https://uops.info/table.html]][:profiling :research]
[43:08][Enable the counters of ports 0, 1 and 5, renaming "Arithmetic Port Usage" to "Float Port Usage"][:admin :profiling]
[44:08][:Run our "Float Port Usage" analysis of hhlightprof][:lighting :profiling]
[44:23][Consult our "Float Port Usage" VTune analysis][:lighting :profiling :run]
[46:54][Create a new "All Ports Usage" custom analysis in VTune][:admin :profiling]
[47:44][:Run our "All Ports Usage" analysis of hhlightprof][:lighting :profiling]
[48:00][Consult our "All Ports Usage" VTune analysis, noting that port 4 is the write port[ref
site=uops.info
url=https://uops.info/table.html]][:lighting :profiling :run]
[50:26][Interpret our "All Ports Usage" VTune analysis, desiring to Get the Work In More Efficiently][:lighting :profiling :run]
[52:34][Determine to validate our code][:lighting :profiling :speech]
[53:39][Step through an -Od build of hhlightprof][:lighting :profiling :run]
[58:38][Disable the LightBoxDumpTrigger][:"file io" :lighting]
[58:46][Continue to step through hhlightprof][:lighting :profiling :run]
[1:02:25][Make EndLightingComputation() call DEBUGDumpData() on the SpecAtlas, DiffuseAtlas, and the entire :lighting Solution][:"file io"]
[1:07:31][Dump the :lighting data to file][:"file io" :run]
[1:08:29][Check out our :lighting debug dumps][:admin]
[1:08:52][Make TestRayCast() in hhlightprof do the full ComputeLightPropagationWork(), replacing Commands in the lighting_work with the DiffuseAtlas and SpecAtlas][:"data structure" :lighting]
[1:14:53][Introduce LoadEntireFile() in hhlightprof, and load in all our :lighting dumps][:"file io"]
[1:21:01][Set up hhlightprof to use our loaded dumps][:lighting]
[1:25:12][Hit a write access violation on the Solution->SamplingSpheres][:lighting :run]
[1:25:41][Hit a write access violation on Byte in ZeroSize()][:lighting :run]
[1:25:45][Allocate :memory for the Solution->Works][:lighting]
[1:28:11][:Run hhlightprof successfully][:lighting]
[1:28:21][Make hhlightprof validate our SpecAtlas texels][:lighting]
[1:31:14][:Run hhlightprof with non-zero errors][:lighting]
[1:31:57][:Run hhlightprof in -O2 with non-zero errors][:lighting]
[1:32:02][Make hhlightprof print the TexelCount][:lighting]
[1:32:28][:Run hhlightprof with non-zero errors][:lighting]
[1:32:51][Try making hhlightprof overwrite the tUpdateBlend as 1.0f][:lighting]
[1:33:49][:Run hhlightprof with a similar number of errors][:lighting]
[1:34:32][Double-check our SpecAtlas validity checking][:lighting :research]
[1:35:04][Try making hhlightprof overwrite the tUpdateBlend as 0.0f][:lighting]
[1:35:22][:Run hhlightprof with a changed number of errors][:lighting]
[1:35:41][Try making hhlightprof overwrite the tUpdateBlend as 100.0f][:lighting]
[1:35:55][:Run hhlightprof with an infinite number of errors][:lighting]
[1:36:03][Try making hhlightprof overwrite the tUpdateBlend as 10.0f][:lighting]
[1:36:14][:Run hhlightprof with many errors][:lighting]
[1:36:16][Try making hhlightprof overwrite the tUpdateBlend as 1.0f][:lighting]
[1:36:22][:Run hhlightprof with few errors][:lighting]
[1:36:33][Prevent hhlightprof from overwriting the tUpdateBlend][:lighting]
[1:36:41][Continue to scour hhlightprof and [~hero Handmade Hero]'s :lighting system for inconsistencies][:research]
[1:43:47][Make TestRayCast() set the Work->VoxelX][:lighting]
[1:44:39][:Run hhlightprof, still with errors][:lighting]
[1:44:54][Introduce InternalLightingCore() to perform our dumps][:lighting]
[1:49:22][:Run hhlightprof, still with errors][:lighting]
[1:49:33][Q&A][:speech]
[1:49:58][@somebody_took_my_name][Q: Is it still building under the overdose flag?]
[1:50:32][@uplinkcoder][Q: Maybe capture a single threaded run?][:lighting :threading]
[1:52:05][@wakeuphate][Q: On your UOPS info table[ref
site=uops.info
url=https://uops.info/table.html] earlier in the stream, you were set to Skylake rather than Kaby Lake, just in case that's why you were seeing issues then]
[1:55:29][@dataqsloth][Q: Have you considered releasing your C course earlier as a beta release (even full price) just because so many of us and people we know are currently on quarantine!]
[1:55:59][@ikojan][Q: How long would the course be?]
[1:57:24][@vaualbus][Q: Is it really bad on x86 to do unaligned pointers? Because I have a packed structure with pointers and the compiler is warning me about unaligned pointers. (Apparently there is a declspec to allow unaligned pointers?)][:memory]
[2:00:15][@leo0230][Q: Do instructions with memory operands actually use the arithmetic units before the data is ready?[ref
site=uops.info
url=https://uops.info/table.html]][:hardware]
[2:03:03][@enyo_enev][Q: Is it going to be useful for hardcore fans of [~hero Handmade Hero]?]
[2:03:51][@ikojan][Q: Will the course be project-based, where throughout the course you'll be making x or y project?]
[2:04:00][@mindmark42][Q: Is there an API to access the performance counters so you don't have to use VTune?]
[2:05:45][Wrap it up with a plug of the Meow the Infinite printed comic Kickstarter[ref
site=Kickstarter
page="Meow the Infinite: Book One"
url=https://www.kickstarter.com/projects/annarettberg/meow-the-infinite-book-one] and related fun videos at Molly Rocket's YouTube channel[ref
site=YouTube
page="Molly Rocket"
url=https://www.youtube.com/c/MollyRocket]][:speech]
[/video]