<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Exploring memcmp</title>
	<atom:link href="http://justin.harmonize.fm/index.php/2009/05/exploring-memcmp/feed/" rel="self" type="application/rss+xml" />
	<link>http://justin.harmonize.fm/index.php/2009/05/exploring-memcmp/</link>
	<description>A cup of coffee and a soapbox is like a bottle of Jack and a gun.</description>
	<lastBuildDate>Sun, 13 Jun 2010 06:26:32 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: justin</title>
		<link>http://justin.harmonize.fm/index.php/2009/05/exploring-memcmp/comment-page-1/#comment-178</link>
		<dc:creator>justin</dc:creator>
		<pubDate>Thu, 14 May 2009 00:05:04 +0000</pubDate>
		<guid isPermaLink="false">http://justin.harmonize.fm/?p=231#comment-178</guid>
		<description>I&#039;m aware of how poor this is algorithmically, but it&#039;s fast enough for what&lt;br&gt;I&#039;m doing. This was just a fun experiment in the impact of SIMD&lt;br&gt;instructions; I have no plans to use anything except the most naive approach&lt;br&gt;with the dumbest compiler settings for actually accomplishing my task.&lt;br&gt;&lt;br&gt;That algorithm looks interesting though, and pretty much exactly what&#039;s&lt;br&gt;needed. Thanks!</description>
		<content:encoded><![CDATA[<p>I&#39;m aware of how poor this is algorithmically, but it&#39;s fast enough for what<br />I&#39;m doing. This was just a fun experiment in the impact of SIMD<br />instructions; I have no plans to use anything except the most naive approach<br />with the dumbest compiler settings for actually accomplishing my task.</p>
<p>That algorithm looks interesting though, and pretty much exactly what&#39;s<br />needed. Thanks!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: justin</title>
		<link>http://justin.harmonize.fm/index.php/2009/05/exploring-memcmp/comment-page-1/#comment-177</link>
		<dc:creator>justin</dc:creator>
		<pubDate>Wed, 13 May 2009 21:25:57 +0000</pubDate>
		<guid isPermaLink="false">http://justin.harmonize.fm/?p=231#comment-177</guid>
		<description>The naive implementation was fast enough, so I was just playing around. For any serious optimization, I agree that you definitely want to look at algorithmic differences first. &lt;br&gt;&lt;br&gt;Nice work on actually coming up with some numbers!</description>
		<content:encoded><![CDATA[<p>The naive implementation was fast enough, so I was just playing around. For any serious optimization, I agree that you definitely want to look at algorithmic differences first. </p>
<p>Nice work on actually coming up with some numbers!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael Hanson</title>
		<link>http://justin.harmonize.fm/index.php/2009/05/exploring-memcmp/comment-page-1/#comment-176</link>
		<dc:creator>Michael Hanson</dc:creator>
		<pubDate>Wed, 13 May 2009 20:57:54 +0000</pubDate>
		<guid isPermaLink="false">http://justin.harmonize.fm/?p=231#comment-176</guid>
		<description>I got curious and dug up an implementation of Boyer-Moore-Horspool (which has a worse worst case than Boyer-Moore, but a comparable average case) and ran it on a theoretical worst-case file.    The implementation I used was at &lt;a href=&quot;http://www.dcc.uchile.cl/%7Erbaeza/handbook/algs/7/713b.srch.p.html&quot; rel=&quot;nofollow&quot;&gt;http://www.dcc.uchile.cl/~rbaeza/handbook/algs/...&lt;/a&gt; but there are others (and better documented) elsewhere, I&#039;m sure.&lt;br&gt;&lt;br&gt;Running on a 328 MB haystack, 16 KB needle, using mmap for I/O, with a match at the last possible spot, on a MB Pro, the BMH algorithm found the match in 0.625 sec of real time.  The naive memcmp algorithm, even with -O3 and -march=opteron, took 157 seconds!&lt;br&gt;&lt;br&gt;As usual, algorithm choice is much, much more important than processor optimizations, especially when N is large.</description>
		<content:encoded><![CDATA[<p>I got curious and dug up an implementation of Boyer-Moore-Horspool (which has a worse worst case than Boyer-Moore, but a comparable average case) and ran it on a theoretical worst-case file. The implementation I used was at <a href="http://www.dcc.uchile.cl/%7Erbaeza/handbook/algs/7/713b.srch.p.html" rel="nofollow">http://www.dcc.uchile.cl/~rbaeza/handbook/algs/...</a> but there are others (and better documented) elsewhere, I&#39;m sure.</p>
<p>Running on a 328 MB haystack, 16 KB needle, using mmap for I/O, with a match at the last possible spot, on a MB Pro, the BMH algorithm found the match in 0.625 sec of real time.  The naive memcmp algorithm, even with -O3 and -march=opteron, took 157 seconds!</p>
<p>As usual, algorithm choice is much, much more important than processor optimizations, especially when N is large.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jordan</title>
		<link>http://justin.harmonize.fm/index.php/2009/05/exploring-memcmp/comment-page-1/#comment-175</link>
		<dc:creator>Jordan</dc:creator>
		<pubDate>Wed, 13 May 2009 20:15:47 +0000</pubDate>
		<guid isPermaLink="false">http://justin.harmonize.fm/?p=231#comment-175</guid>
		<description>You could speed this up even more by using a smarter searching method than memcmp. Something like &lt;a href=&quot;http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm&quot; rel=&quot;nofollow&quot;&gt;http://en.wikipedia.org/wiki/Knuth–Morris–Pratt...&lt;/a&gt; should be much better than just moving ahead one character at a time. Depending on the size of the two files, though, memcmp might still be faster simply because of the (much) faster SIMD instructions, which could outweigh the algorithmic improvements from KMP.</description>
		<content:encoded><![CDATA[<p>You could speed this up even more by using a smarter searching method than memcmp. Something like <a href="http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm" rel="nofollow">http://en.wikipedia.org/wiki/Knuth–Morris–Pratt...</a> should be much better than just moving ahead one character at a time. Depending on the size of the two files, though, memcmp might still be faster simply because of the (much) faster SIMD instructions, which could outweigh the algorithmic improvements from KMP.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael Hanson</title>
		<link>http://justin.harmonize.fm/index.php/2009/05/exploring-memcmp/comment-page-1/#comment-174</link>
		<dc:creator>Michael Hanson</dc:creator>
		<pubDate>Wed, 13 May 2009 20:10:02 +0000</pubDate>
		<guid isPermaLink="false">http://justin.harmonize.fm/?p=231#comment-174</guid>
		<description>Interesting article.  Your approach has poor runtime if your haystack contains many instances of a &quot;needle prefix&quot; -- that is, regions where some X bytes of the needle are present in the haystack but the pattern then fails to match.  You&#039;ll be forced to examine those same bytes over and over again in order to get through the prefix, looking for the correct starting point.&lt;br&gt;&lt;br&gt;Take a look at search algorithms based on incremental approaches, for example the Boyer-Moore algorithm, for a much more efficient solution.</description>
		<content:encoded><![CDATA[<p>Interesting article.  Your approach has poor runtime if your haystack contains many instances of a &#8220;needle prefix&#8221; &#8212; that is, regions where some X bytes of the needle are present in the haystack but the pattern then fails to match.  You&#39;ll be forced to examine those same bytes over and over again in order to get through the prefix, looking for the correct starting point.</p>
<p>Take a look at search algorithms based on incremental approaches, for example the Boyer-Moore algorithm, for a much more efficient solution.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
