cinera_handmade.network/cmuratori/hero/code/code118_template.html

<html>
<head>
<!-- __CINERA_INCLUDES__ -->
</head>
<body>
<div>
<!-- __CINERA_MENUS__ -->
<!-- __CINERA_PLAYER__ -->
</div>
<!-- __CINERA_SCRIPT__ -->
<article id="video-notes">
<h1><!-- __CINERA_TITLE__ --></h1>
<p>Masking the write:</p>
<p>In SIMD, doing operations &quot;4-wide&quot; means that one wide (packed) operation processes four pixels at once. The
arithmetic therefore costs the same whether one, two, three, or all four of those pixels are meaningful. The only place
the difference matters is when reading from and writing to memory.</p>
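<p>As a rough illustration of what &quot;4-wide&quot; means in practice, here is a minimal sketch that assumes an x86 target
with SSE2. The helper name <code>add_four_wide</code> is hypothetical and is not taken from the episode. Note that the load
and the store each touch all four values, which is why the write needs extra care at the edges.</p>

```c
#include <emmintrin.h> /* SSE2 intrinsics (assumes an x86 target) */
#include <stdint.h>

/* Hypothetical helper: adds `amount` to four adjacent 32-bit values at once. */
static void add_four_wide(uint32_t *values, uint32_t amount)
{
    /* One load reads all four values (16 bytes). */
    __m128i v = _mm_loadu_si128((const __m128i *)values);

    /* One packed add operates on all four lanes. */
    __m128i sum = _mm_add_epi32(v, _mm_set1_epi32((int)amount));

    /* One store writes all four values back, wanted or not. */
    _mm_storeu_si128((__m128i *)values, sum);
}
```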
<p>To make sure we only write the pixels we&#39;re meaningfully operating on, we mask out the ones we aren&#39;t. Instead of
doing a conditional check on every loop iteration, we build a mask that is filled with 1s in the lanes where we&#39;ll
keep the pixels and 0s in the lanes where we&#39;ll throw them out.
If we&#39;re operating on four pixels at once and two of them hang off the edge, the mask might look like:</p>
<p>[0x00000000,0x00000000,0xFFFFFFFF,0xFFFFFFFF]</p>
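<p>One possible way to build such a mask without branching is to compare the four pixel x coordinates against the
clip bounds, since SSE2 comparisons already produce all 1s or all 0s per lane. This is only a sketch under assumptions:
the helper <code>build_clip_mask</code>, its parameters, and the convention that lane 0 is the leftmost pixel are
hypothetical, and the episode may construct its mask differently.</p>

```c
#include <emmintrin.h> /* SSE2 intrinsics (assumes an x86 target) */
#include <stdint.h>

/* Hypothetical helper: returns a mask for the four pixels x, x+1, x+2, x+3.
   A lane is 0xFFFFFFFF when min_x <= pixel x < max_x, and 0 otherwise.
   Assumes min_x is greater than INT_MIN so that min_x - 1 does not overflow. */
static __m128i build_clip_mask(int x, int min_x, int max_x)
{
    /* Lane i holds the x coordinate of pixel i (lane 0 is the leftmost). */
    __m128i px = _mm_add_epi32(_mm_set1_epi32(x), _mm_setr_epi32(0, 1, 2, 3));

    /* Each comparison yields all 1s in lanes where it is true. */
    __m128i at_or_after_min = _mm_cmpgt_epi32(px, _mm_set1_epi32(min_x - 1));
    __m128i before_max = _mm_cmplt_epi32(px, _mm_set1_epi32(max_x));

    /* A lane is valid only if both tests passed. */
    return _mm_and_si128(at_or_after_min, before_max);
}
```

<p>With these assumptions, a group starting two pixels to the left of the clip region gets 0s in its first two lanes
and 1s in its last two.</p>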
<p>By doing a bitwise AND with the pixel data we generate, we can mask out the values that are invalid, since the zeroes in
the mask will knock out any bits set in our data. Likewise, the 1s will ensure any values we want to keep will remain in
place.</p>
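<p>In SSE2 terms, this AND step might be sketched as follows. The helper name <code>keep_valid_lanes</code> is
hypothetical; <code>_mm_and_si128</code> is the standard SSE2 bitwise AND over the whole 128-bit register.</p>

```c
#include <emmintrin.h> /* SSE2 intrinsics (assumes an x86 target) */
#include <stdint.h>

/* Hypothetical helper: zeroes every lane of `pixels` whose mask lane is 0
   and leaves lanes whose mask lane is 0xFFFFFFFF untouched. */
static __m128i keep_valid_lanes(__m128i pixels, __m128i write_mask)
{
    return _mm_and_si128(pixels, write_mask);
}
```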
<p>We still need to leave the destination unchanged in the lanes we masked out. The easiest way to do that is to
remember what the destination looked like beforehand and use those values wherever we knocked out values in our data.
So we generate an inverted mask that might look something like:</p>
<p>[0xFFFFFFFF,0xFFFFFFFF,0x00000000,0x00000000]</p>
<p>Using the same AND technique with the inverted mask, we can pick out the destination values that should remain
unchanged. Then we combine them, using a bitwise OR, with the valid pixel values we kept using the first mask. Since
every lane is zeroed in exactly one of the two sets of values, the OR simply copies each lane from whichever set still
holds data, with no interference.</p>
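<p>Put together, the whole masked write might look like the sketch below, again assuming an x86 target with SSE2. The
helper name <code>masked_store_4</code> is hypothetical. Rather than building the inverted mask as a separate value, this
version uses <code>_mm_andnot_si128</code>, which inverts its first argument before the AND and so gives the same
result.</p>

```c
#include <emmintrin.h> /* SSE2 intrinsics (assumes an x86 target) */
#include <stdint.h>

/* Hypothetical helper: writes four new pixels to `dest`, but only in lanes
   where write_mask is 0xFFFFFFFF. Lanes where write_mask is 0 keep whatever
   was already in the destination. */
static void masked_store_4(uint32_t *dest, __m128i new_pixels, __m128i write_mask)
{
    /* Remember what the destination looked like before. */
    __m128i original = _mm_loadu_si128((const __m128i *)dest);

    /* Keep only the valid lanes of the new data. */
    __m128i kept_new = _mm_and_si128(new_pixels, write_mask);

    /* _mm_andnot_si128(a, b) computes (~a) & b: the AND with the inverted mask. */
    __m128i kept_old = _mm_andnot_si128(write_mask, original);

    /* Each lane is zero in exactly one operand, so OR merges without interference. */
    __m128i blended = _mm_or_si128(kept_new, kept_old);

    _mm_storeu_si128((__m128i *)dest, blended);
}
```

<p>Note that this sketch still reads and writes all 16 bytes, so it assumes the memory behind the masked-out lanes is
accessible even though its contents end up unchanged.</p>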
</article>
</body>
</html>