Index hero/code614
parent bd1f0190ab
commit b62ffb4f4e
| @@ -0,0 +1,59 @@ | ||||
| [video output=day614 member=cmuratori stream_platform=twitch stream_username=handmade_hero  project=code title="Continuing Streamlining the Raycaster" vod_platform=youtube id=IxeKOAcvgK0 annotator=Miblo] | ||||
| [0:01][Welcome to the stream][:speech] | ||||
| [0:06][Determine to continue with :optimisation][:lighting :run] | ||||
| [0:57][Recap yesterday's welding :optimisation in GridRayCast()][:lighting :research] | ||||
| [4:09][Consider :optimisation potential of the SpecTexel load / stores in GridRayCast()][:lighting :research] | ||||
| [7:22][Illustrate the possibility of loading in the SpecTexel values and InvBlend at the outset][:lighting :optimisation] | ||||
| [9:23][Seek easier :optimisation opportunities in GridRayCast()][:lighting :research] | ||||
| [11:43][Simplify out OcclusionN from GridRayCast()][:lighting :optimisation] | ||||
| [12:27][Seek :optimisation with OcclusionD and RayD in GridRayCast()][:lighting :research] | ||||
| [18:48][Streamline the SignRayD and NormalXYZ computations in GridRayCast()][:lighting :optimisation :simd] | ||||
| [25:35][Reacquaint ourselves with the hit testing and shuffling code in GridRayCast()][:lighting :research :simd] | ||||
| [30:30][Streamline the Normal selection in GridRayCast()][:lighting :optimisation :simd] | ||||
| [34:46][Check out the port usage of various instructions, noting that we may get an AND for free[ref | ||||
|     site=uops.info | ||||
|     url=https://uops.info/table.html]][:isa :research] | ||||
| [40:23][Continue to streamline the Normal selection in GridRayCast(), introducing a NormalTable, before toggling back to the old code][:lighting :optimisation :simd] | ||||
| [48:12][:Run successfully][:lighting] | ||||
| [48:31][Streamline the ProbeSampleNSingle usage in GridRayCast()][:lighting :optimisation :simd] | ||||
| [55:01][:Run successfully, and consider unit testing the grid ray cast][:lighting] | ||||
| [56:49][Treat ProbeSampleNSingle wide in GridRayCast()][:lighting :optimisation :simd] | ||||
| [1:01:34][:Run successfully][:lighting] | ||||
| [1:01:50][Treat OcclusionD wide in GridRayCast()][:lighting :optimisation :simd] | ||||
| [1:03:28][:Run successfully][:lighting] | ||||
| [1:04:02][Finish streamlining the Normal selection in GridRayCast()][:lighting :optimisation :simd] | ||||
| [1:07:46][:Run successfully][:lighting] | ||||
| [1:08:13][Temporarily try hard setting the NormalIndex to 0 in GridRayCast()][:lighting :optimisation :simd] | ||||
| [1:08:27][We can't tell it's wrong][:lighting :optimisation :run :simd] | ||||
| [1:08:56][Let GridRayCast() set the computed NormalIndex and make a note to test this][:lighting :optimisation :simd] | ||||
| [1:09:36][hhlightprof total seconds elapsed: 4.534789][:lighting :performance :run :simd] | ||||
| [1:10:20][Simplify out tUpdateBlend in GridRayCast()][:lighting :optimisation :simd] | ||||
| [1:12:49][Augment light_atlas with StrideXYZ_4x and VoxelDim_4x][:"data structure" :lighting :optimisation :simd] | ||||
| [1:17:45][:Run successfully][:lighting] | ||||
| [1:17:54][Make MakeLightAtlas() set the StrideXYZ and VoxelDim, for GridRayCast() to load out of that atlas, changing their format in light_atlas to be an array of 4][:"data structure" :lighting :optimisation :simd] | ||||
| [1:20:37][:Run successfully][:lighting] | ||||
| [1:20:46][hhlightprof total seconds elapsed: 4.513986][:lighting :performance :run :simd] | ||||
| [1:22:09][Remove the old AABBRayCast()][:lighting] | ||||
| [1:24:42][:Run successfully][:lighting] | ||||
| [1:24:51][Prepare lighting_box to pack down to 64-bits total, propagating this change][:"data structure" :lighting] | ||||
| [1:28:29][:Run successfully][:lighting] | ||||
| [1:28:38][Clean out the sprawl from FullCast()][:lighting :optimisation :simd] | ||||
| [1:36:20][:Run successfully][:lighting] | ||||
| [1:36:25][Look into welding the GridRayCast() calling loop from FullCast() into GridRayCast() itself][:lighting :optimisation :research :simd] | ||||
| [1:39:21][hhlightprof total seconds elapsed: 4.511818][:lighting :performance :run :simd] | ||||
| [1:39:36][Extend GridRayCast() to operate on twice as many samples][:lighting :optimisation :simd] | ||||
| [1:40:44][:Run successfully][:lighting] | ||||
| [1:40:46][hhlightprof total seconds elapsed: 4.394170][:lighting :performance :run :simd] | ||||
| [1:41:52][Toggle off the debug code in FullCast()][:"debug system" :lighting] | ||||
| [1:43:26][hhlightprof total seconds elapsed: 4.392245][:lighting :performance :run :simd] | ||||
| [1:43:41][Consider welding the GridRayCast() calling loop from FullCast() into GridRayCast() itself][:lighting :optimisation :research :simd] | ||||
| [1:45:57][Q&A][:speech] | ||||
| [1:47:07][@mindmark42][Q: Yesterday you changed your :SIMD extract functions to use shuffles instead. Could you explain again why that is better?][:performance] | ||||
| [1:47:26][Extract vs Shuffle][:blackboard :performance :simd] | ||||
| [1:56:14]["Semantic" Extraction][:blackboard :language :performance :simd] | ||||
| [1:58:02][Unnecessary extract and cast, with thanks to @mmozeiko][:blackboard :performance :simd] | ||||
| [1:59:05][Shuffle][:blackboard :performance :simd] | ||||
| [2:00:41][@3ygun][Q: Is there such a thing as smooching too much and causing the compiler to bail before doing optimizations?][:language] | ||||
| [2:01:11][@billdstrong][Q: Would we gain any speed by moving ahead 16 and doing 12 ops per pass?][:lighting :performance] | ||||
| [2:01:40][Thank you, everyone] | ||||
| [/video] | ||||