<html>
<head>
<!-- __CINERA_INCLUDES__ -->
</head>
<body>
<!-- __CINERA_MENUS__ -->
<!-- __CINERA_PLAYER__ -->
<!-- __CINERA_SCRIPT__ -->
<article id="video-notes">
<h1><!-- __CINERA_TITLE__ --></h1>
<p>Today we look at some techniques to get basic timing information from your running game. Timing, like everything, is
more complicated than it first appears.</p>
<p>A couple of notions of time:</p>
<ul>
<li>Wall clock time - time as it passes in the real world. Measured in seconds.</li>
<li>Processor time - time as the processor counts it. Measured in cycles. This is related to wall clock time by the
processor frequency, but on modern processors the frequency changes often and quickly.</li>
</ul>
<h2 id="wall-clock-time">Wall Clock Time</h2>
<p>The Windows platform attempts to provide us with some tools for high precision timing, but as it is a complicated topic,
there are some gotchas.</p>
<p><a href="http://msdn.microsoft.com/en-us/library/windows/desktop/ms644905.aspx"><code>QueryPerformanceFrequency()</code></a> fills in a
LARGE_INTEGER with the counter&#39;s frequency in counts/sec. The frequency is fixed at system boot, so you can get away
with just calling it once at startup. <a href="http://msdn.microsoft.com/en-us/library/windows/desktop/ms644904.aspx"><code>QueryPerformanceCounter()</code></a> fills in
a LARGE_INTEGER with the current number of counts.</p>
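<p>As a rough sketch of how the two calls might be wrapped: the helper names <code>perf_frequency()</code> and
<code>perf_counter()</code> are hypothetical, and the non-Windows branch is an assumption added only so the snippet
builds elsewhere. It substitutes a nanosecond monotonic clock for the performance counter.</p>

```c
#define _POSIX_C_SOURCE 199309L /* exposes clock_gettime() on strict POSIX builds */
#include <assert.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>

/* Counts per second. Fixed at boot, so one call at startup is enough. */
int64_t perf_frequency(void)
{
    LARGE_INTEGER freq;
    QueryPerformanceFrequency(&freq);
    return freq.QuadPart;
}

/* Current counter value: counts since some unspecified point in the past. */
int64_t perf_counter(void)
{
    LARGE_INTEGER counter;
    QueryPerformanceCounter(&counter);
    return counter.QuadPart;
}

#else
#include <time.h>

/* Assumption: off Windows, a nanosecond monotonic clock stands in. */
int64_t perf_frequency(void)
{
    return 1000000000LL; /* one count per nanosecond */
}

int64_t perf_counter(void)
{
    struct timespec ts;
    int ok = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(ok == 0);
    (void)ok; /* silences the unused warning when NDEBUG is defined */
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
#endif
```

<p>Returning plain 64-bit integers keeps the rest of the timing code independent of the platform type.</p>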
<p>So, dividing counter/freq will give you a number of seconds since some unknown time in the past. More useful would be
(counter - last_counter)/freq. This will allow us to get an elapsed time since some known point in the past. However,
almost anything we want to time should take less than a second, and since this is an integer divide, any interval
shorter than one second will come out as 0. Not super useful. So, we instead multiply the elapsed counts by 1000
before dividing, which gives us elapsed milliseconds:</p>
<pre><code>elapsedMs = (1000*(counter - last_counter)) / freq
</code></pre><p>To get instantaneous frames per second, we can just divide without changing to milliseconds:</p>
<pre><code>fps = freq / (counter - last_counter)
</code></pre><p>Important ideas:</p>
<ul>
<li>To time a frame, query the timer only once per frame and use that reading as both the end of the last frame and the
start of this one. Otherwise your timer will leave out the time between last frame&#39;s end and this frame&#39;s start.</li>
</ul>
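<p>The two formulas and the once-per-frame rule can be sketched in C as follows. The function names and the
<code>FrameTimer</code> struct are made up for illustration; the counter readings are passed in as plain 64-bit
integers so the arithmetic can be checked without a real timer.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed milliseconds between two counter readings.
   Multiplying before dividing keeps sub-second intervals from truncating to 0. */
int64_t elapsed_ms(int64_t last_counter, int64_t counter, int64_t freq)
{
    assert(freq > 0);
    return (1000 * (counter - last_counter)) / freq;
}

/* Instantaneous frames per second. Integer division truncates toward zero. */
int64_t frames_per_second(int64_t last_counter, int64_t counter, int64_t freq)
{
    int64_t elapsed = counter - last_counter;
    assert(elapsed > 0);
    return freq / elapsed;
}

/* Hypothetical per-frame timer state. */
typedef struct {
    int64_t freq;         /* counts per second, queried once at startup */
    int64_t last_counter; /* reading taken at the previous frame boundary */
} FrameTimer;

/* Call once per frame with a single fresh reading. Returns the milliseconds
   since the previous call, then reuses the same reading as the start of the
   next frame so that no time falls between frames. */
int64_t frame_timer_tick(FrameTimer *timer, int64_t counter)
{
    int64_t ms = elapsed_ms(timer->last_counter, counter, timer->freq);
    timer->last_counter = counter;
    return ms;
}
```

<p>With a 10 MHz counter, a frame of 166667 counts comes out as 16 ms and 59 fps, both truncated by the integer divide,
whereas the naive (counter - last_counter)/freq gives 0.</p>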
<h2 id="processor-time">Processor Time</h2>
<p>Every x86 family processor has a <a href="http://en.wikipedia.org/wiki/Time_Stamp_Counter">Timestamp Counter (TSC)</a>, which
counts clock cycles since the processor was last reset. RDTSC is a processor instruction that reads the TSC into
general purpose registers.</p>
<p>For processors before Sandy Bridge but after dynamic clocking, RDTSC gave us actual clocks, but it was difficult to
correlate to wall time because of the variable frequency. Since Sandy Bridge, it gives us &quot;nominal&quot; clocks, which
is to say the number of clocks that would have elapsed at the chip&#39;s nominal frequency. These should correlate closely
to wall clock time, but they make tracking the &quot;number of cycles&quot; notion of processor time more difficult.</p>
<p>RDTSC is usually exposed as a compiler intrinsic, for example <code>__rdtsc()</code> in MSVC, GCC and Clang. Check the docs for your compiler.</p>
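<p>A minimal sketch of reading the TSC through the <code>__rdtsc()</code> intrinsic, which MSVC declares in
<code>&lt;intrin.h&gt;</code> and GCC and Clang declare in <code>&lt;x86intrin.h&gt;</code>. The wrapper names
<code>read_tsc()</code> and <code>tsc_elapsed()</code> are hypothetical, and returning 0 on non-x86 targets is an
assumption made only so the snippet builds everywhere.</p>

```c
#include <assert.h>
#include <stdint.h>

#if defined(_M_IX86) || defined(_M_X64)
#include <intrin.h>    /* MSVC: declares __rdtsc() */
#define HAVE_RDTSC 1
#elif defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h> /* GCC and Clang: declares __rdtsc() */
#define HAVE_RDTSC 1
#else
#define HAVE_RDTSC 0   /* not an x86 target: no RDTSC instruction */
#endif

/* Reads the timestamp counter. Returns 0 where RDTSC is unavailable. */
uint64_t read_tsc(void)
{
#if HAVE_RDTSC
    return __rdtsc();
#else
    return 0;
#endif
}

/* Clocks elapsed between two readings taken with read_tsc(). */
uint64_t tsc_elapsed(uint64_t start, uint64_t end)
{
    assert(end >= start);
    return end - start;
}
```

<p>Taking one reading at the start of a frame and one at the end gives a clocks-per-frame figure to sit alongside the
milliseconds-per-frame figure from the performance counter.</p>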
<p>Resources:</p>
<ul>
<li><a href="http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html">Intel Architecture Manuals</a></li>
</ul>
<h2 id="other-topics">Other topics</h2>
<p>Casey had to cover a couple of new corners of C in order to work with the techniques above.</p>
<h3 id="union-types">Union types</h3>
<p><a href="http://en.wikipedia.org/wiki/Union_type">Union types</a> are a C feature that let you superimpose a number of different
layouts over the same chunk of memory. For example, LARGE_INTEGER, the type the QueryPerf calls fill in, is a union: you
can treat it as an int64 by accessing its QuadPart, or as two 32-bit halves via HighPart and LowPart.</p>
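<p>To illustrate, here is a simplified stand-in for such a union. <code>LargeInt</code> and
<code>quad_from_parts()</code> are hypothetical names, not the Windows definitions, and the field order assumes a
little-endian machine, where the low half comes first in memory.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for LARGE_INTEGER. Both layouts cover the same
   8 bytes. Assumes little-endian: the low 32 bits sit at the lower address. */
typedef union {
    struct {
        uint32_t LowPart;  /* low 32 bits, unsigned */
        int32_t  HighPart; /* high 32 bits, carries the sign */
    };
    int64_t QuadPart;      /* the whole value as one 64-bit integer */
} LargeInt;

static_assert(sizeof(LargeInt) == sizeof(int64_t),
              "both layouts must occupy the same 8 bytes");

/* Writes the two halves, then reads them back as one 64-bit value. */
int64_t quad_from_parts(int32_t high, uint32_t low)
{
    LargeInt value;
    value.HighPart = high;
    value.LowPart = low;
    return value.QuadPart;
}
```

<p>Writing one member and reading another like this is permitted in C. C++ is stricter about it, so treat the sketch
as C only.</p>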
<h3 id="compiler-intrinsics">Compiler Intrinsics</h3>
<p>An <a href="http://en.wikipedia.org/wiki/Intrinsic_function">intrinsic</a> is a compiler-specific extension that allows direct
invocation of some processor instruction. Intrinsics generally need to be built into the compiler, rather than written
as ordinary functions, so they can avoid the calling overhead that compilers have to afford functions.</p>
</article>
</body>
</html>