[video output=day114 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="Preparing a Function for Optimization" vod_platform=youtube id=_vkI9BedvKA annotator=dspecht annotator=Miblo]
[53:28][@braincruser][The way the code is written now, you have a very long dependency chain (between instructions). Will you break down the code to remove it?]
[56:42][@stelar7][Why did you write float instead of real32 this stream?]
[57:14][@stelar7][Why use -O2 instead of -O3 or -Ofast (possibly with -fverbose-asm)?]
[58:06][@garryjohanson][Do you ever use exclusive-or (XOR) operations to avoid pipeline stalls? If not, what do you use?]
[59:04][@g3rain1][Aren't those square roots pretty expensive?]
[1:05:40][@davidthomas426][Since xAxis and yAxis are usually perpendicular, should we special-case for that? In the same vein, should we special-case for axis-aligned?]
[1:06:56][@waterlimon][Does the compiler do any automatic SSE optimization (or have an option for it)?]
[1:09:01][@stelar7][sqrt_ss vs sqrt_ps vs sqrt_pd?]
[1:11:56][@waterlimon][Would SSE allow doing sRGB using an exponent of 2.2, instead of approximating it with an exponent of 2, without a huge performance hit?]
[1:12:41][@pseudonym73][The main reason why you don't get automatic SIMD is precise exceptions. You probably need to tell the compiler that you don't need them]
[1:14:44][@waterlimon][What happens if "/arch:AVX2" switch is enabled?]