[video member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="SSE Mixer Pre and Post Loops" vod_platform=youtube id=l3zbzEYRLJc annotator=ZedZull annotator=Miblo annotator=debiatan]
|
|
[00:02:11][Plan for today: SIMDizing the mixer]
|
|
[00:03:41][Aligning the temporary buffer]
|
|
[00:05:00][Making sure the temporary sound buffers are big enough to fit all samples]
|
|
[00:05:29][Explanation of Align16]
|
|
[00:06:23][Alignment macro for any power of two: AlignPow2]
|
|
[00:11:17][Clamping samples to the signed 16-bit integer range]
|
|
[00:18:09][(intermission) Two's complement]
|
|
[00:34:44][Back to SIMD]
|
|
[00:36:48][Rounding the samples]
|
|
[00:37:37][Downconverting from 32-bit to 16-bit integers. No clamping necessary!]
|
|
[00:39:54][Looking for intrinsics that interleave 16-bit values]
|
|
[00:44:18][Interleaving the samples before packing them]
|
|
[00:47:27][Making sure we don't write out of bounds]
|
|
[00:49:00][Debugging output using structured input]
|
|
[00:52:50][Padding the buffer in the platform layer to make sure we always have space for overwrites]
|
|
[00:54:20][Casey remembers that the horizontal mouse position was linked to music panning]
|
|
[00:54:52][Getting rid of unnecessary clamping operations]
|
|
[00:55:45][Using aligned loads and stores]
|
|
[00:57:24][Plan for next episode]
|
|
[01:01:30][More two's complement. Full example]
|
|
[01:11:30][Q&A]
|
|
[01:11:37][@cubercaleb][Why isn't 2's complement used for floating-point numbers if it makes signed arithmetic easy?]
|
|
[01:16:35][@poohshoes][Are you not going to profile it to see how much faster it gets?]
|
|
[01:16:55][@dr_s80][When you implemented streaming in chunks of audio; I believe the code actually loads the entire file (with a platform layer VirtualAlloc) for each chunk. Is this just an artifact of the debug nature of that code?]
|
|
[01:17:33][@ishytarus][Does the audio make the framerate in debug mode?]
|
|
[01:26:09][@cubercaleb][If 1111 (-1) is supposed to be less than 0000 (0) then how do number comparisons work on the CPU level?]
|
|
[01:32:39][@marumoto][Do you have any tips for speeding up compile time when using multiple translation units?]
|
|
[01:32:55][@sssmcgrath][It's movsx for signed]
|
|
[/video]
|
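Note on the Align16 and AlignPow2 entries (00:05:29, 00:06:23): the index only names these macros, so the following is a minimal sketch of the usual power-of-two round-up trick, not the code written in the episode. The function name align_pow2 and its parameter types are assumptions made for illustration. It assumes the alignment is a nonzero power of two and that the addition does not overflow.

```c
#include <stdint.h>

/* Hypothetical helper (not taken from the episode): round `value` up to the
   next multiple of `alignment`. Assumes `alignment` is a nonzero power of
   two, so that `alignment - 1` is a mask of the low bits. */
static inline uint32_t align_pow2(uint32_t value, uint32_t alignment)
{
    uint32_t mask = alignment - 1;   /* e.g. 16 -> 0x0F */
    /* Adding the mask pushes any non-aligned value past the next boundary;
       clearing the low bits then snaps it back down onto that boundary. */
    return (value + mask) & ~mask;
}
```

With this sketch, a sample count of 17 would round up to 32 for a 16-wide alignment, while a count that is already a multiple of 16 is left unchanged. That is the property an aligned temporary buffer needs before it can be processed in fixed-width SIMD chunks.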