<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="https://shazwazza.com/rss/xslt"?>
<rss xmlns:a10="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Shazwazza</title>
    <link>https://shazwazza.com/</link>
    <description>My blog which is pretty much just all about coding</description>
    <generator>Articulate, blogging built on Umbraco</generator>
    <image>
      <url>/media/0libq25y/frog.png?rmode=max&amp;v=1da0e911f4e6890</url>
      <title>Shazwazza</title>
      <link>https://shazwazza.com/</link>
    </image>
    <item>
      <guid isPermaLink="false">1329</guid>
      <link>https://shazwazza.com/post/petapoco-may-cause-high-memory-usage-with-certain-queries/</link>
      <category>ASP.Net</category>
      <category>Web Development</category>
      <title>PetaPoco may cause high memory usage with certain queries</title>
      <description>&lt;p&gt;If you are using &lt;a href="https://github.com/toptensoftware/PetaPoco" target="_blank"&gt;PetaPoco&lt;/a&gt;, or &lt;a href="https://github.com/schotime/NPoco" target="_blank"&gt;NPoco&lt;/a&gt; (&lt;em&gt;which seems to be the most up-to-date fork of the project&lt;/em&gt;), the title of this post might be a bit scary… but hopefully you won’t have to worry. Whether you are affected really depends on how you create your queries and how many different query structures you are executing.&lt;/p&gt;
&lt;h2&gt;High memory usage&lt;/h2&gt;
&lt;p&gt;Here is the PetaPoco code responsible for the memory growth:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/toptensoftware/PetaPoco/blob/master/PetaPoco/PetaPoco.cs#L1836" title="https://github.com/toptensoftware/PetaPoco/blob/master/PetaPoco/PetaPoco.cs#L1836"&gt;https://github.com/toptensoftware/PetaPoco/blob/master/PetaPoco/PetaPoco.cs#L1836&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;What is happening here is that every time a POCO needs to be mapped from a data source, this will add more values to a static cache, specifically this one:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/toptensoftware/PetaPoco/blob/master/PetaPoco/PetaPoco.cs#L2126" title="https://github.com/toptensoftware/PetaPoco/blob/master/PetaPoco/PetaPoco.cs#L2126"&gt;https://github.com/toptensoftware/PetaPoco/blob/master/PetaPoco/PetaPoco.cs#L2126&lt;/a&gt;  (&lt;em&gt;m_PocoDatas&lt;/em&gt;)&lt;/p&gt;
&lt;p&gt;This isn’t a bad thing… but it can be if you are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;using non-parameterized where clauses&lt;/li&gt;
&lt;li&gt;using dynamically generated where clauses&lt;/li&gt;
&lt;li&gt;using a lot of SQL ‘IN’ clauses, since the number of items in the array passed to an ‘IN’ clause varies from call to call&lt;/li&gt;
&lt;li&gt;using a very large number of distinct, statically defined where clauses&lt;/li&gt;
&lt;/ul&gt;
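&lt;p&gt;To see why ‘IN’ clauses are a problem, consider what happens when a list argument is expanded into one parameter per item, which is roughly what happens to the SQL text before it is used as a cache key. The sketch below is a simplified stand-in and not PetaPoco’s own code; &lt;em&gt;ExpandIn&lt;/em&gt; and the &lt;em&gt;@ids&lt;/em&gt; placeholder are hypothetical names used only for illustration:&lt;/p&gt;
&lt;pre class="csharpcode"&gt;using System;
using System.Linq;

public static class InClauseDemo
{
    // Hypothetical helper: expands "@ids" into "@0,@1,..." with one parameter per item,
    // roughly mimicking how a list argument ends up in the final SQL text.
    public static string ExpandIn(string sql, int itemCount)
    {
        var names = Enumerable.Range(0, itemCount).Select(i =&amp;gt; "@" + i);
        return sql.Replace("@ids", string.Join(",", names));
    }

    public static void Main()
    {
        const string sql = "SELECT * FROM MyTable WHERE Id IN (@ids)";
        Console.WriteLine(ExpandIn(sql, 2)); // SELECT * FROM MyTable WHERE Id IN (@0,@1)
        Console.WriteLine(ExpandIn(sql, 3)); // SELECT * FROM MyTable WHERE Id IN (@0,@1,@2)
    }
}&lt;/pre&gt;
&lt;p&gt;Every distinct list length yields a distinct SQL string, so each length would end up with its own cache entry even though the query is fully parameterized.&lt;/p&gt;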
&lt;p&gt;Each time a unique SQL query is sent to PetaPoco it will store this SQL string and associate it to a delegate (which is also cached). Over time, as these unique SQL queries are executed, the internal static cache will grow. In some cases this could consume quite a lot of memory.&lt;/p&gt;
&lt;p&gt;The other thing to note is how large the ‘key’ that PetaPoco/NPoco uses is:&lt;/p&gt;
&lt;pre class="csharpcode"&gt;var key = &lt;span class="kwrd"&gt;string&lt;/span&gt;.Format(&lt;span class="str"&gt;"{0}:{1}:{2}:{3}:{4}"&lt;/span&gt;, sql, connString, ForceDateTimesToUtc, firstColumn, countColumns);&lt;/pre&gt;
&lt;p&gt;Considering how many queries might be executing in your application, the storage for these keys alone could take up quite a lot of memory! An SQL statement combined with a connection string could be very long, and each of these combinations gets stored in memory for every unique SQL query executed that returns mapped POCO objects.&lt;/p&gt;
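&lt;p&gt;To get a rough feel for the size of a single key, here is a minimal sketch that builds a key in the same shape as the format string above. The SQL and the connection string are made-up values, and &lt;em&gt;BuildKey&lt;/em&gt; is a hypothetical helper, not part of either library:&lt;/p&gt;
&lt;pre class="csharpcode"&gt;using System;

public static class KeySizeDemo
{
    // Builds a key in the same shape as the format string shown above.
    public static string BuildKey(string sql, string connString, bool forceDateTimesToUtc, int firstColumn, int countColumns)
    {
        return string.Format("{0}:{1}:{2}:{3}:{4}", sql, connString, forceDateTimesToUtc, firstColumn, countColumns);
    }

    public static void Main()
    {
        // Hypothetical values, for illustration only.
        var sql = "SELECT * FROM MyTable WHERE MyColumn = @myValue";
        var connString = "Server=myServer;Database=myDb;User Id=myUser;Password=myPassword;";
        var key = BuildKey(sql, connString, true, 0, 5);

        // .NET strings are UTF-16, so each char takes 2 bytes (plus per-object overhead).
        Console.WriteLine("{0} chars, roughly {1} bytes", key.Length, key.Length * 2);
    }
}&lt;/pre&gt;
&lt;p&gt;Multiply that by the number of unique SQL strings your application produces and the cost of the keys alone becomes easier to estimate.&lt;/p&gt;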
&lt;h2&gt;Parameterized queries vs. non-parameterized&lt;/h2&gt;
&lt;p&gt;Here are some examples of why non-parameterized queries cause lots of memory consumption. Let’s say we have a simple query like:&lt;/p&gt;
&lt;pre class="csharpcode"&gt;db.Query&amp;lt;MyTable&amp;gt;(&lt;span class="str"&gt;"WHERE MyColumn=@myValue"&lt;/span&gt;, &lt;span class="kwrd"&gt;new&lt;/span&gt; {myValue = &lt;span class="str"&gt;"test"&lt;/span&gt;})&lt;/pre&gt;
&lt;p&gt;Which results in this SQL:&lt;/p&gt;
&lt;pre class="csharpcode"&gt;&lt;span class="kwrd"&gt;SELECT&lt;/span&gt; * &lt;span class="kwrd"&gt;FROM&lt;/span&gt; MyTable &lt;span class="kwrd"&gt;WHERE&lt;/span&gt; MyColumn = @myValue&lt;/pre&gt;
&lt;p&gt;This query can be used over and over again with a different value and PetaPoco will simply store a single SQL key in its internal cache. However, if you are executing queries without real parameters such as:&lt;/p&gt;
&lt;pre class="csharpcode"&gt;db.Query&amp;lt;MyTable&amp;gt;(&lt;span class="str"&gt;"WHERE MyColumn='hello'"&lt;/span&gt;);
db.Query&amp;lt;MyTable&amp;gt;(&lt;span class="str"&gt;"WHERE MyColumn='world'"&lt;/span&gt;);
db.Query&amp;lt;MyTable&amp;gt;(&lt;span class="str"&gt;"WHERE MyColumn='hello world'"&lt;/span&gt;);&lt;/pre&gt;
&lt;p&gt;Which results in this SQL:&lt;/p&gt;
&lt;pre class="csharpcode"&gt;&lt;span class="kwrd"&gt;SELECT&lt;/span&gt; * &lt;span class="kwrd"&gt;FROM&lt;/span&gt; MyTable &lt;span class="kwrd"&gt;WHERE&lt;/span&gt; MyColumn = &lt;span class="str"&gt;'hello'&lt;/span&gt;;
&lt;span class="kwrd"&gt;SELECT&lt;/span&gt; * &lt;span class="kwrd"&gt;FROM&lt;/span&gt; MyTable &lt;span class="kwrd"&gt;WHERE&lt;/span&gt; MyColumn = &lt;span class="str"&gt;'world'&lt;/span&gt;;
&lt;span class="kwrd"&gt;SELECT&lt;/span&gt; * &lt;span class="kwrd"&gt;FROM&lt;/span&gt; MyTable &lt;span class="kwrd"&gt;WHERE&lt;/span&gt; MyColumn = &lt;span class="str"&gt;'hello world'&lt;/span&gt;;&lt;/pre&gt;
&lt;p&gt;Then PetaPoco will store each of these statements against a delegate in its internal cache, since none of these strings are equal to each other.&lt;/p&gt;
&lt;p&gt;Depending on your application you still might have a very large number of unique parameterized queries, though I’d assume you’d need a truly huge number of them for it to be a worry.&lt;/p&gt;
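&lt;p&gt;The difference between the two styles can be illustrated with a small simulation. This is a sketch of the general idea, assuming a static dictionary keyed by the SQL text; it is a stand-in for the real cache, and &lt;em&gt;Run&lt;/em&gt; is a hypothetical method rather than anything in PetaPoco:&lt;/p&gt;
&lt;pre class="csharpcode"&gt;using System;
using System.Collections.Generic;

public static class CacheGrowthDemo
{
    // Stand-in for a static cache keyed by the SQL text (not PetaPoco's actual cache).
    private static readonly Dictionary&amp;lt;string, Func&amp;lt;object&amp;gt;&amp;gt; Cache = new Dictionary&amp;lt;string, Func&amp;lt;object&amp;gt;&amp;gt;();

    // Simulates running a query: an entry is added only for SQL text not seen before.
    public static void Run(string sql)
    {
        if (!Cache.ContainsKey(sql))
        {
            Cache[sql] = () =&amp;gt; new object(); // placeholder for the mapping delegate
        }
    }

    public static int Count { get { return Cache.Count; } }

    public static void Main()
    {
        var values = new[] { "hello", "world", "hello world" };

        // Parameterized: the SQL text is identical whatever value is passed.
        for (var i = 0; i &amp;lt; values.Length; i++)
        {
            Run("SELECT * FROM MyTable WHERE MyColumn = @myValue");
        }
        Console.WriteLine(Count); // 1

        // Non-parameterized: every value produces a different SQL string.
        foreach (var value in values)
        {
            Run("SELECT * FROM MyTable WHERE MyColumn = '" + value + "'");
        }
        Console.WriteLine(Count); // 4
    }
}&lt;/pre&gt;
&lt;p&gt;The parameterized query adds one entry no matter how many values are used, while the inlined values add one entry each.&lt;/p&gt;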
&lt;h2&gt;Order by queries&lt;/h2&gt;
&lt;p&gt;Unfortunately, even if you use parameterized queries, PetaPoco will store the SQL query key with its &lt;em&gt;Order By&lt;/em&gt; clause, which isn’t necessary and will again mean more duplicate SQL keys and delegates being tracked. For example, if you have these resulting queries:&lt;/p&gt;
&lt;pre class="csharpcode"&gt;&lt;span class="kwrd"&gt;SELECT&lt;/span&gt; * &lt;span class="kwrd"&gt;FROM&lt;/span&gt; MyTable &lt;span class="kwrd"&gt;WHERE&lt;/span&gt; MyColumn = @myValue &lt;span class="kwrd"&gt;ORDER&lt;/span&gt; &lt;span class="kwrd"&gt;BY&lt;/span&gt; SomeField;
&lt;span class="kwrd"&gt;SELECT&lt;/span&gt; * &lt;span class="kwrd"&gt;FROM&lt;/span&gt; MyTable &lt;span class="kwrd"&gt;WHERE&lt;/span&gt; MyColumn = @myValue &lt;span class="kwrd"&gt;ORDER&lt;/span&gt; &lt;span class="kwrd"&gt;BY&lt;/span&gt; AnotherField;&lt;/pre&gt;
&lt;p&gt;PetaPoco will store each of these statements in its internal cache separately since the strings don’t match. However, the delegate that PetaPoco stores against these SQL statements isn’t concerned with the ordering of the output; it’s only concerned with the column and table selection. In theory, it should be stripping off the last &lt;em&gt;Order By&lt;/em&gt; clause (&lt;em&gt;and other irrelevant clauses&lt;/em&gt;) to avoid this duplication.&lt;/p&gt;
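&lt;p&gt;As a rough idea of what stripping the clause could look like, here is a deliberately naive sketch. &lt;em&gt;StripOrderBy&lt;/em&gt; is a hypothetical helper and the regular expression is an assumption of mine; it only handles a trailing clause without parentheses and is not what either library does:&lt;/p&gt;
&lt;pre class="csharpcode"&gt;using System;
using System.Text.RegularExpressions;

public static class OrderByDemo
{
    // Naive sketch: removes a trailing ORDER BY clause that contains no parentheses.
    // Real SQL (paging clauses, comments, string literals) needs more careful handling.
    private static readonly Regex TrailingOrderBy =
        new Regex(@"\s+ORDER\s+BY\s+[^()]*$", RegexOptions.IgnoreCase);

    public static string StripOrderBy(string sql)
    {
        return TrailingOrderBy.Replace(sql, string.Empty);
    }

    public static void Main()
    {
        var a = StripOrderBy("SELECT * FROM MyTable WHERE MyColumn = @myValue ORDER BY SomeField");
        var b = StripOrderBy("SELECT * FROM MyTable WHERE MyColumn = @myValue ORDER BY AnotherField");
        Console.WriteLine(a);      // SELECT * FROM MyTable WHERE MyColumn = @myValue
        Console.WriteLine(a == b); // True
    }
}&lt;/pre&gt;
&lt;p&gt;With the ordering removed, both queries reduce to the same string and could share a single cache entry.&lt;/p&gt;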
&lt;h2&gt;A slightly better implementation&lt;/h2&gt;
&lt;p&gt;First, if you are using PetaPoco/NPoco, you shouldn’t use dynamic queries, for the reasons mentioned above. If you need this functionality then I suppose it might be worth these libraries adding some sort of property on the Database object or a parameter in either the Fetch or Query methods to specify that you don’t want to use the internal cache (this will be slower, but you won’t get unwanted memory growth). I’d really just suggest not using dynamically created where clauses ;-)&lt;/p&gt;
&lt;p&gt;Next, there are a few things that could be fixed in the PetaPoco/NPoco core to manage memory a little better:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The size of the key that is stored in memory doesn’t need to be that big. A better implementation would be to use a hash combiner class to combine the &lt;em&gt;GetHashCode&lt;/em&gt; result of each of the parameters that make up the key. This is a very fast way to create a hash of some strings that will result in a much smaller key. An example of a hash combiner class is here (which is actually inspired by the various internal hash code combiner classes in .NET): &lt;a href="https://github.com/umbraco/Umbraco-CMS/blob/7.2.0/src/Umbraco.Core/HashCodeCombiner.cs" title="https://github.com/umbraco/Umbraco-CMS/blob/7.2.0/src/Umbraco.Core/HashCodeCombiner.cs"&gt;&lt;em&gt;https://github.com/umbraco/Umbraco-CMS/blob/7.2.0/src/Umbraco.Core/HashCodeCombiner.cs&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Instead of storing all of this cache in static variables, store it in an &lt;em&gt;ObjectCache/MemoryCache&lt;/em&gt; (&lt;a href="http://msdn.microsoft.com/en-us/library/system.runtime.caching.objectcache(v=vs.110).aspx" title="http://msdn.microsoft.com/en-us/library/system.runtime.caching.objectcache(v=vs.110).aspx"&gt;&lt;em&gt;http://msdn.microsoft.com/en-us/library/system.runtime.caching.objectcache(v=vs.110).aspx&lt;/em&gt;&lt;/a&gt;) with a sliding expiration so the memory can be collected when it’s unused&lt;/li&gt;
&lt;li&gt;The Order By clause should be left out of the cache key, for the reason described above&lt;/li&gt;
&lt;/ul&gt;
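&lt;p&gt;The first two points could be combined along these lines. This is only a sketch under my own assumptions, not the code from the PR: &lt;em&gt;CombineHashes&lt;/em&gt; and &lt;em&gt;GetOrAddFactory&lt;/em&gt; are hypothetical names, the cache name and the 10 minute expiration are arbitrary, and the combiner is a simple multiply-and-add rather than the Umbraco class linked above. It needs a reference to the &lt;em&gt;System.Runtime.Caching&lt;/em&gt; assembly:&lt;/p&gt;
&lt;pre class="csharpcode"&gt;using System;
using System.Runtime.Caching;

public static class SmallKeyCacheDemo
{
    // A named in-memory cache instead of a static dictionary (the name is arbitrary).
    private static readonly MemoryCache Cache = new MemoryCache("PocoFactories");

    // Combines the hash codes of the key parts into a single int.
    // Different inputs can collide, so this trades exactness for a much smaller key.
    public static int CombineHashes(params object[] parts)
    {
        unchecked
        {
            var hash = 17;
            foreach (var part in parts)
            {
                hash = hash * 31 + (part == null ? 0 : part.GetHashCode());
            }
            return hash;
        }
    }

    // Returns the cached delegate for this combination, adding the supplied one if none exists.
    public static Func&amp;lt;object&amp;gt; GetOrAddFactory(string sql, string connString, bool forceUtc,
        int firstColumn, int countColumns, Func&amp;lt;object&amp;gt; factory)
    {
        var key = CombineHashes(sql, connString, forceUtc, firstColumn, countColumns).ToString();

        // Entries that are not read for 10 minutes become eligible for removal.
        var policy = new CacheItemPolicy { SlidingExpiration = TimeSpan.FromMinutes(10) };

        // AddOrGetExisting returns the existing entry, or null if the new value was added.
        var existing = (Func&amp;lt;object&amp;gt;)Cache.AddOrGetExisting(key, factory, policy);
        return existing ?? factory;
    }
}&lt;/pre&gt;
&lt;p&gt;Keep in mind that a 32-bit hash can collide for different inputs, so a real implementation would need to decide how to deal with that, for example by using a stronger combiner or by verifying the match on lookup.&lt;/p&gt;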
&lt;p&gt;I’ve created a PR for NPoco &lt;a href="https://github.com/schotime/NPoco/pull/134" target="_blank"&gt;here&lt;/a&gt;, and also created an issue on the original PetaPoco source &lt;a href="https://github.com/toptensoftware/PetaPoco/issues/185" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;</description>
      <pubDate>Thu, 23 Mar 2023 15:08:15 Z</pubDate>
      <a10:updated>2023-03-23T15:08:15Z</a10:updated>
    </item>
  </channel>
</rss>