cinera_handmade.network/cmuratori/hero/code/code112.hmml

[video output=day112 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="A Mental Model of CPU Performance" vod_platform=youtube id=qin-Eps3U_E annotator=dspecht annotator=Miblo]
[2:17][Blackboard: Optimization]
[3:58][Blackboard: CPU + GPU instructions]
[5:28][Blackboard: Math operations done wide (SIMD)]
[8:26][Blackboard: An example instruction]
[10:10][Blackboard: Issuing an instruction is expensive]
[13:13][Blackboard: Optimization considerations]
[15:56][Blackboard: Memory access costs]
[17:54][Blackboard: Cycles]
[22:05][Blackboard: You should always know how many cycles you have to work with]
[24:37][Blackboard: You won't always have all cycles available for use]
[25:57][Blackboard: What is a cycle?]
[31:38][Blackboard: Pipeline stages]
[34:36][Blackboard: Why pipeline? (Doing the laundry)]
[39:41][Blackboard: Latency and Throughput]
[43:53][Blackboard: Where latency causes us a problem]
[48:33][Blackboard: Cache miss]
[51:01][Blackboard: Hyperthreading]
[52:42][Blackboard: Optimization, the platform]
[55:07][Blackboard: So that is optimization][quote 83]
[55:25][Blackboard: Efficiency]
[59:30][Q&A][:speech]
[1:00:13][@atomiclich][Would you be willing to make more blackboard episodes? This is very informative]
[1:00:46][@grumpygiant256][Are you going to be using anything like VTune for measuring performance?]
[1:01:26][@bakeheart][How are instructions written in cache memory?]
[1:02:33][@d7samurai][Do we manually issue prefetching or is that something inferred by the CPU by looking at how we access memory?]
[1:05:08][@childz][I know this is a long way off, but after Handmade Hero is done, do you plan to continue educational streams?]
[1:05:34][@andsz_][How often do you estimate the actual amount of work prior to implementing a feature vs just implementing it and measuring it?]
[1:08:59][@snobrdr97][So if memory takes a few hundred cycles if the instructions have to reach out to the hard drive, what impact would that have?]
[1:11:14][@starchypancakes][Two questions: 1) Are there ever any cases where we have to worry about one of our instructions being decoded into multiple microcode instructions without our knowledge?]
[1:13:16][@starchypancakes][2) In optimizing, have you set up the code in such a way that you can optimize things function-by-function with this eventuality in mind, or will we have to restructure some of the functions to allow them to be optimized?]
[1:16:37][@hyco24][Would it be inefficient to offload the cache to an SSD over/or with minimal RAM usage or would the latency be too much?]
[1:17:32][@vertex_]["Premature Optimization is the root of all evil." What's your take on that quote?]
[1:19:28][@noxy_key][Is there any way to use or avoid hyperthreading to your advantage?]
[1:19:44][@zjadekkarenvae][What do you tell someone who doesn't like emacs?]
[1:20:39][@bakeheart][Does hyperthreading reduce maximum bandwidth because it has to switch between states, or can both states operate at the same time?]
[1:21:20][@quatzequatel][In your experience what drives the "good enough" optimization and how do the novice guys get a handle on that?]
[1:24:00][Wrap things up][:speech]
[/video]