[video output=day112 member=cmuratori stream_platform=twitch stream_username=handmade_hero project=code title="A Mental Model of CPU Performance" vod_platform=youtube id=qin-Eps3U_E annotator=dspecht annotator=Miblo]
[2:17][Blackboard: Optimization]
[3:58][Blackboard: CPU + GPU instructions]
[5:28][Blackboard: Math operations done wide (SIMD)]
[8:26][Blackboard: An example instruction]
[10:10][Blackboard: Issuing an instruction is expensive]
[13:13][Blackboard: Optimization considerations]
[15:56][Blackboard: Memory access costs]
[17:54][Blackboard: Cycles]
[22:05][Blackboard: You should always know how many cycles you have to work with]
[24:37][Blackboard: You won't always have all cycles available for use]
[25:57][Blackboard: What is a cycle?]
[31:38][Blackboard: Pipeline stages]
[34:36][Blackboard: Why pipeline? (Doing the laundry)]
[39:41][Blackboard: Latency and Throughput]
[43:53][Blackboard: Where latency causes us a problem]
[48:33][Blackboard: Cache miss]
[51:01][Blackboard: Hyperthreading]
[52:42][Blackboard: Optimization, the platform]
[55:07][Blackboard: So that is optimization][quote 83]
[55:25][Blackboard: Efficiency]
[59:30][Q&A][:speech]
[1:00:13][@atomiclich][Would you be willing to make more blackboard episodes? This is very informative]
[1:00:46][@grumpygiant256][Are you going to be using anything like VTune for measuring performance?]
[1:01:26][@bakeheart][How are instructions written in cache memory?]
[1:02:33][@d7samurai][Do we manually issue prefetching or is that something inferred by the CPU by looking at how we access memory?]
[1:05:08][@childz][I know this is a long way off, but after Handmade Hero is done, do you plan to continue educational streams?]
[1:05:34][@andsz_][How often do you estimate the actual amount of work prior to implementing a feature vs just implementing it and measuring it?]
[1:08:59][@snobrdr97][So if memory takes a few hundred cycles, what impact would it have if the instructions have to reach out to the hard drive?]
[1:11:14][@starchypancakes][Two questions: 1) Are there ever any cases where we have to worry about one of our instructions being decoded into multiple microcode instructions without our knowledge?]
[1:13:16][@starchypancakes][2) In optimizing, have you set up the code in such a way that you can optimize things function-by-function with this eventuality in mind, or will we have to restructure some of the functions to allow them to be optimized?]
[1:16:37][@hyco24][Would it be inefficient to offload the cache to an SSD instead of, or along with, minimal RAM usage, or would the latency be too much?]
[1:17:32][@vertex_]["Premature Optimization is the root of all evil." What's your take on that quote?]
[1:19:28][@noxy_key][Is there any way to use or avoid hyperthreading to your advantage?]
[1:19:44][@zjadekkarenvae][What do you tell someone who doesn't like Emacs?]
[1:20:39][@bakeheart][Does hyperthreading reduce maximum bandwidth because it has to switch between states, or can both states operate at the same time?]
[1:21:20][@quatzequatel][In your experience what drives the "good enough" optimization and how do the novice guys get a handle on that?]
[1:24:00][Wrap things up][:speech]
[/video]