<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
     xmlns:atom="http://www.w3.org/2005/Atom"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:wfw="http://wellformedweb.org/CommentAPI/"
     >
  <channel>
    <title>The Basho Blog</title>
    <link>http://blog.basho.com</link>
    <description>The Basho Blog</description>
    <pubDate>Thu, 02 Sep 2010 00:08:05 GMT</pubDate>
    <generator>Blogofile</generator>
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>1</sy:updateFrequency>
    <item>
      <title>A New Community Editor - Marten Gustafson</title>
      <link>http://blog.basho.com/2010/09/01/a-new-community-editor---marten-gustafson/</link>
      <pubDate>Wed, 01 Sep 2010 17:00:00 EDT</pubDate>
      <category><![CDATA[Riak]]></category>
      <category><![CDATA[Community]]></category>
      <guid isPermaLink="true">http://blog.basho.com/2010/09/01/a-new-community-editor---marten-gustafson/</guid>
      <description>A New Community Editor - Marten Gustafson</description>
      <content:encoded><![CDATA[

<p>Community Editors are non-Basho developers who have edit permissions on the <a href="http://wiki.basho.com">Riak Wiki.</a> How do you become one? Check out the <a href="http://wiki.basho.com/display/RIAK/Contributing+to+the+Riak+Wiki">Community Processes Section</a> on the Riak Wiki for details.</p>

<p>We added a new Community Editor this week. His name is <a href="http://twitter.com/martengustafson">Mårten Gustafson.</a></p>

<p>This is Mårten:</p>

<img src="/images/gustafsen-riak.jpg">


<p>Aside from being involved in putting a Riak application into production, Mårten has been active, knowledgeable and helpful on the Riak Mailing list and in the IRC Room (where he goes by the unassuming "chids"). He recently came forward and expressed interest in being a Community Editor, and based on his Riak credentials, the team here at Basho was more than happy to bring him aboard.</p>  

<p>Welcome, Mårten! We are looking forward to your contributions.</p> 

<p>If you're interested in being a Community Editor for the Riak Wiki, let us know. We would love to talk to you.</p>

<p>Best,</p>

<p><a href="http://twitter.com/pharkmillups">Mark</a></p>]]></content:encoded>
    </item>
    <item>
      <title>Ripple 0.8 is Full of Good Stuff</title>
      <link>http://blog.basho.com/2010/08/31/ripple-0.8-is-full-of-good-stuff/</link>
      <pubDate>Tue, 31 Aug 2010 23:00:00 EDT</pubDate>
      <category><![CDATA[Developers]]></category>
      <category><![CDATA[Riak]]></category>
      <category><![CDATA[Ruby]]></category>
      <category><![CDATA[NoSQL]]></category>
      <guid isPermaLink="true">http://blog.basho.com/2010/08/31/ripple-0.8-is-full-of-good-stuff/</guid>
      <description>Ripple 0.8 is Full of Good Stuff</description>
      <content:encoded><![CDATA[
<p><strong><em>This is a repost from the blog of Sean Cribbs, our Developer Advocate. You can read the <a href="http://ow.ly/2xGkN">original post</a> there.</em></strong></p>

	<p>It&#8217;s been a while since I&#8217;ve blogged about a release of Ripple; in fact, it&#8217;s been a long time since I&#8217;ve <em>released</em> Ripple. So this post is going to dig into <a href="http://rubygems.org/gems/ripple/versions/0.8.0">Ripple 0.8</a> (released today, August 31) and catch you up on what has happened since 0.7.1 (and 0.5 if you don&#8217;t follow the <a href="http://github.com/seancribbs/ripple">Github project</a>).</p>

	<p>The major features, which I&#8217;ll describe in more detail below, are:</p>

	<ul>
		<li>Supports Riak 0.12 features</li>
		<li>Runs on Rails 3 (non-prerelease)</li>
		<li>Adds Linked associations</li>
		<li>Adds session stores for Rack and Rails 3 apps</li>
	</ul>

	<h2>Riak 0.12 Features</h2>

	<p>The biggest changes here were some bucket-related features. First of all, you can <strong>define default quorum parameters for requests on a per-bucket basis</strong>, exposed as bucket properties. Riak 0.12 also allows you to specify &#8220;symbolic&#8221; quorums, that is, &#8220;all&#8221; (N replies), &#8220;quorum&#8221; (N/2 + 1 replies), or &#8220;one&#8221; (1 reply). <a href="http://seancribbs.github.com/ripple/Riak/Bucket.html">Riak::Bucket</a> has support for these new properties and exposes them as <code>attr_accessor</code>-like methods. This is a big time saver if you need to tune your quorums for different use-cases or N-values.</p>
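	<p>To make the symbolic values concrete, here is a minimal JavaScript sketch of how they map to reply counts for a given N. The <code>resolveQuorum</code> helper is hypothetical and written only for illustration; it is not part of Ripple or Riak, and it assumes N/2 is rounded down.</p>

```javascript
// Hypothetical helper: turns a symbolic quorum into a reply count.
// Assumes N/2 is rounded down, so N=3 gives a quorum of 2.
function resolveQuorum(value, n) {
  if (typeof value === 'number') {
    return value; // explicit integer quorums pass through unchanged
  }
  switch (value) {
    case 'all':
      return n;
    case 'quorum':
      return Math.floor(n / 2) + 1;
    case 'one':
      return 1;
    default:
      throw new Error('unknown quorum value: ' + value);
  }
}

console.log(resolveQuorum('quorum', 3)); // 2
console.log(resolveQuorum('all', 5));    // 5
```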

	<p>Second, <strong>keys are not listed by default</strong>. There used to be a big flashing warning sign on <a href="http://seancribbs.github.com/ripple/Riak/Client.html#bucket-instance_method">Riak::Client#bucket</a> that encouraged you to pass <code>:keys =&#62; false</code>. In Ripple 0.8 that&#8217;s the default, but it&#8217;s also explicit so that if you use the latest gem on Riak 0.11 or earlier, you should get the same behavior.</p>

	<h2>Runs on Rails 3</h2>

	<p>I&#8217;ve been pushing for Rails 3 ever since Ripple was conceived, but now that the <em>actual</em> release of Rails 3 is out, it&#8217;s an easier sell. Thanks to all the <a href="http://github.com/seancribbs/ripple/blob/master/CONTRIBUTORS.textile">contributors</a> who helped me keep Ripple up-to-date with the latest prereleases.</p>

	<h2>Linked associations</h2>

	<p>These are <strong>HOT</strong>, and were the missing features that held me back from saying &#8220;Yes, you should use Ripple in your app.&#8221; The underlying concepts take some time to understand (the upcoming link-walking page in the <a href="http://wiki.basho.com/display/RIAK/The+Riak+Fast+Track">Fast Track</a> will help), but you actually have a lot more freedom than with foreign keys. Here are some examples (with a little detail of how they work):</p>

	<p><script src="http://gist.github.com/560126.js?file=gistfile1.rb"></script></p>

	<p>You&#8217;ll notice only <code>one</code> and <code>many</code> in the above examples. From the beginning, I&#8217;ve eschewed creating the <code>belongs_to</code> macro because I think it has the wrong semantics for how linked associations work (links are all on the origin side). It&#8217;s more like you &#8220;point to one of&#8221; or &#8220;point to many of&#8221;. Minor point, but often it&#8217;s the language you choose that frames how you think about things.</p>

	<h2>Session stores</h2>

	<p>Outside the Ruby-sphere, web session storage is one of Riak&#8217;s most popular use-cases. Both <a href="http://mochimedia.com/">Mochi</a> and <a href="http://wikia.com/">Wikia</a> are using it for this. Now, it&#8217;s really easy to do the same for your Rails or Sinatra app.</p>

	<p>For Sinatra, Padrino and other pure Rack apps, use <code>Riak::SessionStore</code>:</p>

	<p><script src="http://gist.github.com/560126.js?file=gistfile2.rb"></script></p>

	<p>For Rails 3, use <code>Ripple::SessionStore</code>:</p>

	<p><script src="http://gist.github.com/560126.js?file=gistfile3.rb"></script></p>

	<h2>Roadmap</h2>

	<p>If you&#8217;re curious what&#8217;s coming next in Ripple, or to provide feedback on the direction, be sure to check out the <a href="http://github.com/seancribbs/ripple/issues">issue tracker</a>.  As always, questions can be sent direct to me (<code>sean AT basho DOT com</code>) or to the <a href="http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com">riak-users mailing list</a>. Documentation is available on <a href="http://seancribbs.github.com/ripple">Github Pages</a>.</p>

	<p>Cheers!</p>]]></content:encoded>
    </item>
    <item>
      <title>Webinar Recap - Riak in Action: Wriaki</title>
      <link>http://blog.basho.com/2010/08/20/webinar-recap---riak-in-action:-wriaki/</link>
      <pubDate>Fri, 20 Aug 2010 11:00:00 EDT</pubDate>
      <category><![CDATA[Riak]]></category>
      <category><![CDATA[Erlang]]></category>
      <category><![CDATA[Webmachine]]></category>
      <category><![CDATA[NoSQL]]></category>
      <category><![CDATA[Database]]></category>
      <guid isPermaLink="true">http://blog.basho.com/2010/08/20/webinar-recap---riak-in-action:-wriaki/</guid>
      <description>Webinar Recap - Riak in Action: Wriaki</description>
      <content:encoded><![CDATA[

<p>Thank you to those who attended our  webinar yesterday. Like before, we're recapping
  the questions below for everyone's sake (in no particular order). If you
  missed the webinar, want some more information, or want to ask us
  some more questions, we've prepared
  a <a href="http://forms.basho.com/riak-in-action-wriaki-p">resource 
    page</a> for you. As always, you can also <a href="http://wiki.basho.com/display/RIAK/Contact+Basho">get ahold of us directly</a>.</p>

<p><strong>Q: How would you solve full text search with the current versions
    of Riak? One could also take Wriaki as an example as most wikis
    have some sort of fulltext search functionality.</strong></p>

<p>I recommend using existing fulltext solutions. Solr has matched up
  well with most of the web applications I have written, and would
  certainly work for Wriaki as well.</p>

<p><strong>Q: Where in the course of the interaction (shown on slide
    18) are you defining the client ID? Don't you need the client ID
    and vclock to match between updates?</strong></p>

<p>On slide 42, we talk about "actors" which are essentially client
  IDs. Using the logged-in user as the client ID can help prevent
  vclock explosion and is a sensible way of structuring your updates.</p>
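<p>To see why the choice of client ID matters, here is a minimal JavaScript sketch of a vector clock keyed by actor. It is a simplified illustration, not Riak's vclock implementation, and the actor names are made up. Each distinct client ID adds an entry to the clock, so reusing a stable ID such as the logged-in user keeps the clock small.</p>

```javascript
// Simplified vector clock: a plain object mapping actor ID to a counter.
function increment(clock, actor) {
  const next = Object.assign({}, clock);
  next[actor] = (next[actor] || 0) + 1;
  return next;
}

// a descends from b when a has seen every update recorded in b.
function descends(a, b) {
  return Object.keys(b).every(function (actor) {
    return (a[actor] || 0) >= b[actor];
  });
}

// The same user editing three times: the clock keeps a single entry.
let stable = {};
['user:alice', 'user:alice', 'user:alice'].forEach(function (id) {
  stable = increment(stable, id);
});

// A fresh ID per request: the clock gains one entry per update.
let growing = {};
['req-1', 'req-2', 'req-3'].forEach(function (id) {
  growing = increment(growing, id);
});

console.log(Object.keys(stable).length);  // 1
console.log(Object.keys(growing).length); // 3
```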

<p>&mdash; <a href="http://twitter.com/hobbyist">Bryan</a></p>
]]></content:encoded>
    </item>
    <item>
      <title>Free Webinar - Riak in Action: Wriaki - August 19 @ 2PM Eastern</title>
      <link>http://blog.basho.com/2010/08/13/free-webinar---riak-in-action:-wriaki---august-19-@-2pm-eastern/</link>
      <pubDate>Fri, 13 Aug 2010 11:00:00 EDT</pubDate>
      <category><![CDATA[Riak]]></category>
      <category><![CDATA[Webmachine]]></category>
      <category><![CDATA[NoSQL]]></category>
      <category><![CDATA[Database]]></category>
      <guid isPermaLink="true">http://blog.basho.com/2010/08/13/free-webinar---riak-in-action:-wriaki---august-19-@-2pm-eastern/</guid>
      <description>Free Webinar - Riak in Action: Wriaki - August 19 @ 2PM Eastern</description>
      <content:encoded><![CDATA[

<p>Documentation is great, but playing with examples can also be a
  helpful way to tackle steep learning curves.  To help you learn about
  ways of using Riak, we'd like to present "Wriaki", an example
  implementation of a wiki that stores its data in Riak.</p>

<p>We invite you to join us for a <strong>free webinar</strong>
  on <strong>Thursday, August 19 at 2:00PM Eastern Time
    (UTC-4)</strong> about <strong>Riak in Action: Wriaki</strong>.
  During the
  presentation, <a href="http://www.basho.com/bios.html#Bryan">Bryan Fink</a> will cover:</p>

<ul>
  <li>Modeling wiki data in the Riak key/value store</li>
  <li>Access patterns using both get/put and map/reduce</li>
  <li>Three strategies that Wriaki uses for dealing with eventual
    consistency</li>
  <li>How the user interface changes to accommodate Wriaki's models</li>
</ul>

<p>The code for Wriaki will be open-source at the time of the
  presentation. <strong>The presentation will last 30 to 45
    minutes, with time for questions at the end.</strong> <del>Fill in the
  form below to reserve your seat!</del> Sorry, registration has closed!</p>

<p><em>If you cannot attend, the video and slides will be made
    available afterward in the recap post on the blog.</em></p>
]]></content:encoded>
    </item>
    <item>
      <title>Basho Partners with Joyent to Bring You Hosted Riak</title>
      <link>http://blog.basho.com/2010/08/08/basho-partners-with-joyent-to-bring-you-hosted-riak/</link>
      <pubDate>Sun, 08 Aug 2010 00:30:00 EDT</pubDate>
      <category><![CDATA[Riak]]></category>
      <category><![CDATA[NoSQL]]></category>
      <category><![CDATA[Database]]></category>
      <guid isPermaLink="true">http://blog.basho.com/2010/08/08/basho-partners-with-joyent-to-bring-you-hosted-riak/</guid>
      <description>Basho Partners with Joyent to Bring You Hosted Riak</description>
      <content:encoded><![CDATA[

<p>This is a huge day for Basho Technologies, Riak, and our growing community of users.</p>

<p>We are thrilled to announce Basho's partnership with <a href="http://www.joyent.com">Joyent</a> to bring our community hosted Riak on Joyent's Smart platform.  With both open source and enterprise versions available, anyone can quickly spin up a Riak cluster and start building applications.</p>

<p>When we first began talking to Jason and David and the rest of the Joyent team early this year, we realized we shared a common vision for the future of infrastructure.  The past several months have been spent finalizing the details, and in just a few weeks you'll be able to go to <a href="http://my.joyent.com">my.joyent.com</a> and, with a few clicks, purchase and deploy as many nodes of Riak as you want, need, and can handle.</p>

<p>Making pre-configured Riak SmartMachines available in the Joyent cloud
will enable developers to combine all the benefits of Riak with the
proven, advanced hosting platform that  businesses like LinkedIn, Gilt, and Backstage rely on every day.</p>

<p>The team at Joyent has <a href="http://www.joyent.com/2010/08/joyent-and-basho-partner-to-deliver-the-best-internet-scale-data-store">posted more details about the partnership on their blog.</a> Go read it, and mark your calendar, because hosted Riak is here!</p>

<p>Thanks,</p>

<p><a href="http://twitter.com/bashot">Earl</a></p>]]></content:encoded>
    </item>
    <item>
      <title>Webinar Recap - Riak with Rails</title>
      <link>http://blog.basho.com/2010/08/06/webinar-recap---riak-with-rails/</link>
      <pubDate>Fri, 06 Aug 2010 10:00:00 EDT</pubDate>
      <category><![CDATA[Riak]]></category>
      <category><![CDATA[Ruby]]></category>
      <category><![CDATA[NoSQL]]></category>
      <category><![CDATA[Database]]></category>
      <guid isPermaLink="true">http://blog.basho.com/2010/08/06/webinar-recap---riak-with-rails/</guid>
      <description>Webinar Recap - Riak with Rails</description>
      <content:encoded><![CDATA[

<p>Thank you to those who attended our Rails-oriented webinar yesterday. Like before, we're recapping
  the questions below for everyone's sake (in no particular order). If you
  missed the webinar, want some more information, or want to ask us
  some more questions, we've prepared
  a <a href="http://forms.basho.com/riak-and-rails-a-powerful-combination-attended-p">resource 
    page</a> for you. As always, you can also <a href="http://wiki.basho.com/display/RIAK/Contact+Basho">get ahold of us directly</a>.</p>

<p><strong>Q: When you have multiple application servers and Riak
    nodes, how do you handle "replication lag"?</strong></p>

<p>Most web applications have some element of eventual consistency
  (or potential inconsistency) in them by their nature. Object and
  view caches sacrifice immediate consistency for gains in throughput
  and latency, and hopefully provide a better user experience.  With
  Riak, you can achieve acceptable data freshness by "reading your
  writes". That is, use the same read quorum as your write
  quorum and make sure that the R+W is greater than N. For example,
  using R=W=DW=2 when N=3 will give a strong assurance of consistency.</p>
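<p>The arithmetic behind that example can be checked with a short JavaScript sketch. The <code>readsSeeWrites</code> function is a hypothetical helper, not a Riak API; it only tests whether every read quorum must overlap every write quorum.</p>

```javascript
// Hypothetical helper: with N replicas, any R readers and any W writers
// must share at least one replica when R + W is greater than N.
function readsSeeWrites(n, r, w) {
  if (r > n || w > n) {
    throw new Error('R and W cannot exceed N');
  }
  return r + w > n;
}

console.log(readsSeeWrites(3, 2, 2)); // true
console.log(readsSeeWrites(3, 1, 1)); // false
```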

<p><strong>Q: I find myself doing <code>def key; id; end</code>. Is
    there any easier way to tell Ripple the key?</strong></p>

<p>Currently there is not. However, I've found myself using this
  pattern frequently when I want a meaningful key that is also an
  attribute. There's
  an <a href="http://github.com/seancribbs/ripple/issues#issue/3">issue</a>
  on the tracker just for this feature. In the meantime, you could use
  two method aliases:</p>

<pre>class User
  include Ripple::Document
  property :email, String, :presence => true
  
  # This forces all attribute methods to be defined
  define_attribute_methods
  alias_method :key, :email
  alias_method :key=, :email=
end</pre>

<p>As long as your property is a string, this should work just
  fine.</p>

<p><strong>Q: Any tips on how to handle pagination over MapReduce
    queries?</strong></p>

<p>The challenge with pagination in Riak is that reduce phases are
  not guaranteed to run only once, but instead are run in parallel as
  results from the previous phase come in asynchronously, and then
  followed by a final reduce.  So in a sense, you have to treat all
  invocations of your reduce function as a "re-reduce". We have plans to allow
  reduce phases to specify that they should be run only once, but for
  right now you can get around this limitation.</p>

<p>Reduce phases are always run on the coordinating node, so if you
  put a reduce phase before the one where you want to perform
  pagination, you are pretty much guaranteed that the whole result
  set is going to be available in a single application of the final
  reduce. A typical combination would be a "sorting" phase followed
  by a "pagination" phase. <code>Riak.reduceSort</code>
  and <code>Riak.reduceSlice</code> are
  two <a href="http://bitbucket.org/basho/riak_kv/src/tip/priv/mapred_builtins.js">built-in
  functions</a> that could help accomplish this task.</p>
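<p>As a rough illustration of the sort-then-slice idea, here is a self-contained JavaScript sketch. The two functions are local stand-ins with assumed signatures, not the built-ins themselves, and the data and page size are invented. Sorting gives the same answer when applied more than once, so it tolerates re-reduce; slicing is applied only to the complete, sorted list.</p>

```javascript
// Stand-in for a "sorting" reduce phase. Sorting already-sorted partial
// results again gives the same answer, so repeated invocations are harmless.
function sortPhase(values) {
  return values.slice().sort(function (a, b) { return a - b; });
}

// Stand-in for a "pagination" reduce phase; arg is [start, end).
function slicePhase(values, arg) {
  return values.slice(arg[0], arg[1]);
}

// Simulate partial results arriving in two batches, then a final pass.
const batchA = sortPhase([42, 7, 19]);
const batchB = sortPhase([3, 88, 15]);
const sorted = sortPhase(batchA.concat(batchB));

// Page 2 with a page size of 2: the items at offsets 2 and 3.
const page = slicePhase(sorted, [2, 4]);
console.log(page); // [ 15, 19 ]
```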

<p>&mdash; <a href="http://twitter.com/seancribbs">Sean</a> and <a href="http://twitter.com/schofield">Grant</a></p>
]]></content:encoded>
    </item>
    <item>
      <title>Introducing Riak Core</title>
      <link>http://blog.basho.com/2010/07/30/introducing-riak-core/</link>
      <pubDate>Fri, 30 Jul 2010 13:30:00 EDT</pubDate>
      <category><![CDATA[Riak]]></category>
      <category><![CDATA[Riak Core]]></category>
      <category><![CDATA[NoSQL]]></category>
      <category><![CDATA[Database]]></category>
      <guid isPermaLink="true">http://blog.basho.com/2010/07/30/introducing-riak-core/</guid>
      <description>Introducing Riak Core</description>
      <content:encoded><![CDATA[



<p><strong> This post was originally published on <a href="http://twitter.com/kevsmith">Kevin Smith's</a> Blog, <a href="http://weblog.hypotheticalabs.com/?p=539">Hypothetical Labs.</a> If you have questions or comments, please use the original post.</strong></p>

<h2 id="sec-2">What is riak_core?</h2> 
<p> 
  <code>riak_core</code> is a single OTP application which provides all the services necessary to write a modern, well-behaved distributed application. <code>riak_core</code> began as part of <a href="http://wiki.basho.com">Riak</a>. Since the code was generally useful in building all kinds of distributed applications <a href="http://www.basho.com">we</a> decided to refactor and separate the core bits into their own codebase to make it easier to use.
  </p> 
<p> 
  Distributed systems are complex and some of that complexity shows in the number of features available in <code>riak_core</code>. Rather than dive deeply into code, I&#8217;m going to separate the features into broad categories and give an overview of each.
  </p> 
<p> 
  <em>Note: If you&#8217;re the impatient type and want to skip ahead and start reading code, you can check out the source to <code>riak_core</code> via <a href="http://bitbucket.org/basho/riak_core">hg</a> or <a href="http://github.com/basho/riak_core">git</a>.</em> 
  </p> 
<h2 id="sec-2">Node Liveness &amp; Membership</h2> 
<p> 
  <code>riak_core_node_watcher</code> is the process responsible for tracking the status of nodes within a riak_core cluster. It uses <code>net_kernel</code> to efficiently monitor many nodes. <code>riak_core_node_watcher</code> also has the capability to take a node out of the cluster programmatically. This is useful in situations where a brief node outage is necessary but you don&#8217;t want to stop the server software completely.
  </p> 
<p> 
  <code>riak_core_node_watcher</code> also provides an API for advertising and locating services around the cluster. This is useful in clusters where nodes provide a specialized service, like a CUDA compute node, which is used by other nodes in the cluster.
  </p> 
<p> 
  <code>riak_core_node_watch_events</code> cooperates with <code>riak_core_node_watcher</code> to generate events based on node activity, i.e. joining or leaving the cluster, etc. Interested parties can register callback functions which will be called as events occur.
  </p> 
<h2 id="sec-2">Partitioning &amp; Distributing Work</h2> 
<p><code>riak_core</code> uses a master/worker configuration on each node to manage the execution of work units. Consistent hashing is used to determine which target node(s) should receive a request, and the master process on each node farms out the request to the actual workers. <code>riak_core</code> calls worker processes <code>vnode</code>s. The coordinating process is the <code>vnode_master</code>.
  </p> 
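<p>To give a feel for how consistent hashing picks target nodes, here is a simplified JavaScript sketch. It is an illustration under stated assumptions, not the <code>riak_core</code> implementation: the partition count, the round-robin ownership, the node names, and the use of only the first 32 bits of a SHA-1 hash are all choices made for brevity.</p>

```javascript
const crypto = require('crypto');

// Simplified ring: a fixed number of equal partitions, assigned to nodes
// round-robin. The partition count and node names are assumptions.
function buildRing(nodes, numPartitions) {
  const owners = [];
  for (let i = 0; i < numPartitions; i++) {
    owners.push(nodes[i % nodes.length]);
  }
  return owners;
}

// Hash a key onto the ring and return the index of the partition
// whose range contains that point.
function partitionFor(key, numPartitions) {
  const digest = crypto.createHash('sha1').update(key).digest();
  const point = digest.readUInt32BE(0); // first 32 bits of the hash
  const rangeSize = Math.pow(2, 32) / numPartitions;
  return Math.floor(point / rangeSize);
}

// The primary partition plus the next n-1 partitions around the ring.
function preferenceList(ring, key, n) {
  const first = partitionFor(key, ring.length);
  const list = [];
  for (let i = 0; i < n; i++) {
    const index = (first + i) % ring.length;
    list.push({ partition: index, node: ring[index] });
  }
  return list;
}

const ring = buildRing(['node1', 'node2', 'node3', 'node4'], 8);
console.log(preferenceList(ring, 'wiki/FrontPage', 3));
```

<p>Because the key-to-partition mapping depends only on the hash, every node computes the same answer without coordination; only partition ownership changes as nodes join or leave.</p>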
<p>The partitioning and distribution logic inside <code>riak_core</code> also handles hinted handoff when required. Hinted handoff occurs as a result of a node failure or outage. In order to assure availability, most clustered systems will use operational nodes in place of down nodes. When the down node comes back the cluster needs to migrate the data from its temporary home on the substitute nodes to the data&#8217;s permanent home on the restored node. This process is called hinted handoff and is managed by components inside <code>riak_core</code>. <code>riak_core</code> also handles migrating partitions to new nodes when they join the cluster such that all work continues to be evenly partitioned to all cluster members.
  </p> 
<p> 
  <code>riak_core_vnode_master</code> starts all the worker vnodes on a given node and routes requests to<br /> 
  the vnodes as the cluster runs.
  </p> 
<p> 
  <code>riak_core_vnode</code> is an OTP behavior wrapping all the boilerplate logic required to implement a vnode. Application-specific vnodes need to implement a handful of callback functions in order to participate in handoff sessions and receive work units from the master.
  </p> 
<h2 id="sec-2">Cluster State</h2> 
<p>A <code>riak_core</code> cluster stores global state in a ring structure. The state information is transferred between nodes in the cluster in a controlled manner to keep all cluster members in sync. This process is referred to as &#8220;gossiping&#8221;.
  </p> 
<p><code>riak_core_ring</code> is the module used to create and manipulate the ring state data shared by all nodes in the cluster. Ring state data includes items like partition ownership and cluster-specific ring metadata. Riak KV stores bucket metadata in the ring metadata, for example.
  </p> 
<p><code>riak_core_ring_manager</code> manages the cluster ring for a node. It is the main entry point for application code accessing the ring, via <code>riak_core_ring_manager:get_my_ring/0</code>, and also keeps a persistent snapshot of the ring in sync with the current ring state.
  </p> 
<p><code>riak_core_gossip</code> manages the ring gossip process and ensures the ring is generally consistent across the cluster.</p> 
<h2 id="sec-2">What&#8217;s the plan?</h2> 

<p>Over the next several months I&#8217;m going to cover the process of building a real application in a series of posts to this blog, where each post covers some aspect of system building with <code>riak_core</code>. All of the source to the application will be published under the Apache 2 license and shared via a public repo on github.
  </p> 
<p>And what type of application will we build? Since the goal of this series is to illustrate how to build distributed systems using <code>riak_core</code> and also satisfy my own technical curiosity I&#8217;ve decided to build a distributed graph database. A graph database should provide enough use cases to really exercise <code>riak_core</code> while at the same time not obscuring the core learning experience in tons of complexity.
  </p> 
<p> 
  Thanks to Sean Cribbs and Andy Gross for providing helpful review and feedback.
 </p>

]]></content:encoded>
    </item>
    <item>
      <title>Free Webinar - Riak with Rails - August 5 @ 2PM Eastern</title>
      <link>http://blog.basho.com/2010/07/29/free-webinar---riak-with-rails---august-5-@-2pm-eastern/</link>
      <pubDate>Thu, 29 Jul 2010 17:00:00 EDT</pubDate>
      <category><![CDATA[Riak]]></category>
      <category><![CDATA[Ruby]]></category>
      <category><![CDATA[NoSQL]]></category>
      <category><![CDATA[Database]]></category>
      <guid isPermaLink="true">http://blog.basho.com/2010/07/29/free-webinar---riak-with-rails---august-5-@-2pm-eastern/</guid>
      <description>Free Webinar - Riak with Rails - August 5 @ 2PM Eastern</description>
      <content:encoded><![CDATA[

<p><a href="http://rubyonrails.org/">Ruby on Rails</a> is a powerful web framework that focuses on developer
  productivity. Riak is a friendly key-value store that is
  simple, flexible and scalable. Put them together and you have lots
  of exciting possibilities!</p>

<p>We invite you to join us for a <strong>free webinar</strong>
  on <strong>Thursday, August 5 at 2:00PM Eastern Time (UTC-4)</strong>
  to talk about <strong>Riak with Rails</strong>.  In this hands-on
  webinar, we'll discuss:</p>

<ul>
  <li>Setting up a new Rails 3 project for Riak</li>
  <li>Storing, retrieving, manipulating key-value data from Ruby</li>
  <li>Issuing map-reduce queries</li>
  <li>Creating rich document models with Ripple</li>
  <li>Using Riak as a distributed cache and session store</li>
</ul>

<p><strong>The presentation will last 30 to 45
    minutes, with time for questions at the end.</strong> <del>Fill in the
    form below if you want to get started building Rails applications
    on top of Riak!</del> Sorry, registration is closed.</p>

<p>&mdash; <a href="http://twitter.com/basho/team">The Basho Team</a></p>
]]></content:encoded>
    </item>
    <item>
      <title>Consistent Smashing</title>
      <link>http://blog.basho.com/2010/07/28/consistent-smashing/</link>
      <pubDate>Wed, 28 Jul 2010 13:30:00 EDT</pubDate>
      <category><![CDATA[Riak]]></category>
      <category><![CDATA[NoSQL]]></category>
      <category><![CDATA[Database]]></category>
      <guid isPermaLink="true">http://blog.basho.com/2010/07/28/consistent-smashing/</guid>
      <description>Consistent Smashing</description>
      <content:encoded><![CDATA[


<p>Sometimes you need more than words to illustrate a point. Here is Basho's humble attempt to clarify the difference between "Dynamo-Style" systems (like Riak) that use <a href="http://wiki.basho.com/display/RIAK/Riak+Glossary#RiakGlossary-ConsistentHashing">consistent hashing</a> to achieve fault tolerance, simple scaling, and prevent data loss, and systems that use techniques like <a href="http://en.wikipedia.org/wiki/Sharding">sharding</a>.</p>

<p>Enjoy!</p>

<p><a href="http://twitter.com/pharkmillups">Mark</a></p>


<object width="400" height="300"><param name="allowfullscreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=13667174&amp;server=vimeo.com&amp;show_title=1&amp;show_byline=1&amp;show_portrait=0&amp;color=ff9933&amp;fullscreen=1" /><embed src="http://vimeo.com/moogaloop.swf?clip_id=13667174&amp;server=vimeo.com&amp;show_title=1&amp;show_byline=1&amp;show_portrait=0&amp;color=ff9933&amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"></embed></object><p><a href="http://vimeo.com/13667174">Consistent Smashing</a> from <a href="http://vimeo.com/user2820657">Basho Technologies</a> on <a href="http://vimeo.com">Vimeo</a>.</p>
]]></content:encoded>
    </item>
    <item>
      <title>Webinar Recap - MapReduce Querying in Riak</title>
      <link>http://blog.basho.com/2010/07/27/webinar-recap---mapreduce-querying-in-riak/</link>
      <pubDate>Tue, 27 Jul 2010 17:00:00 EDT</pubDate>
      <category><![CDATA[Riak]]></category>
      <category><![CDATA[Map/Reduce]]></category>
      <category><![CDATA[NoSQL]]></category>
      <category><![CDATA[Database]]></category>
      <guid isPermaLink="true">http://blog.basho.com/2010/07/27/webinar-recap---mapreduce-querying-in-riak/</guid>
      <description>Webinar Recap - MapReduce Querying in Riak</description>
      <content:encoded><![CDATA[

<p>Thank you to all who attended the webinar last Thursday. It was a
  great turnout with awesome engagement. Like before, we're recapping
  the questions below for everyone's sake (in no particular order). If you
  missed the webinar, want some more information, or want to ask us
  some more questions, we've prepared
  a <a href="http://forms.basho.com/basho-riak-mapreduce-and-querying-riak-key-value-nosql">resource 
    page</a> for you. As always, you can also <a href="http://wiki.basho.com/display/RIAK/Contact+Basho">get ahold of us directly</a>.</p>

<p><strong>Q: Say I want to perform two-fold link walking but don't
    want to keep the "walk-through" results, including the initial
    one. Can I do something to keep only the last result?</strong></p>

<p>In a MapReduce query, you can specify any number of phases to keep
  or ignore using the "keep" parameter on the phase.  Usually you only
  want to keep the final  phase. If you're using the link-walker resource, it'll return
  results from any phases whose specs end in "1". See
  the <a href="http://wiki.basho.com/display/RIAK/REST+API#RESTAPI-Linkwalking">REST
    API wiki page</a> for more information on link-walking.</p>
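<p>As an illustration, a two-hop link walk that keeps only the final phase might be described by a job like the one below, written here as a JavaScript object. The bucket, key, and tag names are hypothetical, and the exact field names should be checked against the MapReduce documentation for your Riak version.</p>

```javascript
// Hypothetical job: follow "friend" links twice, then fetch the values.
// Only the last phase sets keep to true, so only its results come back.
const job = {
  inputs: [['people', 'alice']],
  query: [
    { link: { bucket: 'people', tag: 'friend', keep: false } },
    { link: { bucket: 'people', tag: 'friend', keep: false } },
    { map: { language: 'javascript', name: 'Riak.mapValuesJson', keep: true } }
  ]
};

// List the indexes of the phases whose results will be returned.
const kept = job.query
  .map(function (phase, index) {
    const spec = phase.link || phase.map || phase.reduce;
    return spec.keep ? index : null;
  })
  .filter(function (index) { return index !== null; });

console.log(JSON.stringify(job));
console.log(kept); // [ 2 ]
```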

<p><strong>Q: Will Riak Search work along with MapReduce, for example,
    to avoid queries over an entire bucket? Will there be a webinar
    about Riak Search?</strong></p>

<p>Yes, we intend to have this feature in the Generally Available
  release of Riak Search. We will definitely have a webinar about Riak
  Search close to its public release.</p>

<p><strong>Q: Are there still problems with executing "qfun" functions
    from Erlang during MapReduce?</strong></p>

<p>"qfun" phases (that use anonymous Erlang functions) will work on a
  one-node cluster, but not across a multi-node cluster. You can use
  them in development but it's best to switch to a compiled module
  function or Javascript function when moving to production.</p>

<p><strong>Q: Although streams weren't mentioned, do you have any
    recommendations on when to use streaming map/reduce versus normal
    map/reduce?</strong></p>

<p>Streaming MapReduce sends results back as they get produced from
  the last phase, in a <code>multipart/mixed</code> format. To invoke
  this, add <code>?chunked=true</code> to the URL when you submit the
  job. Streaming might be appropriate when you expect the result set
  to be very large and have constructed your application such that
  incomplete results are useful to it. For example, in an AJAX web application, it
  might make sense to send some results to the browser before the entire
  query is complete.</p>
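<p>A small JavaScript sketch of building the submission URL follows. The host, port, and <code>/mapred</code> path are assumptions about a default local setup, and <code>mapredUrl</code> is a hypothetical helper; only the <code>chunked=true</code> parameter comes from the answer above.</p>

```javascript
// Assumed defaults: a local node on port 8098 and the /mapred resource.
function mapredUrl(base, streaming) {
  const url = new URL('/mapred', base);
  if (streaming) {
    url.searchParams.set('chunked', 'true');
  }
  return url.toString();
}

console.log(mapredUrl('http://127.0.0.1:8098', true));
// http://127.0.0.1:8098/mapred?chunked=true
```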

<p><strong>Q: How do you indicate to Riak that the input key is a list
    of keys rather than a key whose value should be passed to the map
    function? </strong></p>

<p>A custom map function could accomplish this, like the Javascript example
  below. The example assumes its input has a JSON Array of keys in the
  target bucket, and that the target bucket is the key of the input
  object.</p>

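<pre>function(object, keyData, arg){
  // The stored value is a JSON array of keys in the target bucket.
  var keys = Riak.mapValuesJson(object)[0];
  // The input object's own key names the target bucket.
  return keys.map(function(item){ return [object.key, item]; });
}</pre>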

<p>There's more to this issue &mdash; we discuss it in the next
  question.</p>

<p><strong>Q: Which way is faster: storing a lot of links or storing
    the target keys in the value as a list? Are there any limits to
    the maximum number of links on a key?</strong></p>

<p>How the links are stored will likely not have a huge impact on
  performance. Either method (links or a list of keys in the value)
  would work for a key list document. Two operations on that document
  are relevant here: updating and traversal.</p>

<p>The update process would involve retrieving the list, adding a
  value, and saving the list. If you are using the REST interface you
  will need to be aware of limitations in the number of allowed
  headers and the allowed header length. Mochiweb restricts the number
  of allowed headers to 1000. Header length is limited to 8192
  characters. This imposes an upper limit for the number of Links that
  can be set through the REST interface.</p>

<p>The best method for updating a key list would be to write a
  post-commit hook that performs the update. This avoids the need to
  access the key list using the REST interface, so header limitations
  are no longer a concern. However, the post-commit hook could become
  a bottleneck in your update path if the number of links grows
  large.</p>

<p>Traversal involves retrieving the key list document, collecting the
  related keys, and outputting a bucket/key list to be used in
  subsequent map phases. A built-in function is provided to process
  links. If you were to store keys in the value, you would need to
  write a custom function to parse the keys and generate a bucket/key
  list (see the previous question).</p>
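<p>For comparison, a job that relies on stored links might look roughly like the sketch below, where a link phase takes the place of the custom key-parsing map function. The bucket and key names are made up for illustration.</p>

```javascript
// Sketch of a job that follows links from one object and then maps the targets.
// "lists", "my_key_list" and "items" are illustrative names.
var linkWalkJob = {
  inputs: [["lists", "my_key_list"]],
  query: [
    // Link phase: follow links on the input object that point into "items".
    { link: { bucket: "items", keep: false } },
    // Map phase: return the JSON value of each linked object.
    { map: { language: "javascript", name: "Riak.mapValuesJson", keep: true } }
  ]
};

// The JSON body that would be submitted with the job.
var linkWalkBody = JSON.stringify(linkWalkJob);
```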

<p><strong>Q: Are you planning to run distributed reduce phases in the
    future?</strong></p>

<p>Yes, here are two relevant feature requests you can track:</p>

<ul>
  <li><a href="https://issues.basho.com/148">#148 - Allow users to
      toggle the number of processes used during reduce</a></li>
  <li><a href="https://issues.basho.com/149">#149 - Distribute reduce phases
      across the cluster</a></li>
</ul>

<p><strong>Q: What's the benefit of passing an arg to a map or reduce
    phase? Couldn't you just send the function body with the arg value
    filled in? Can I pass in a list of args or an arbitrary number of
    args?</strong></p>

<p>When you have a lot of queries that are similar but with minor
  differences, you might be able to generalize a map or reduce
  function so that it can vary based on the 'arg' parameter. Then you
  could store that function in a built-ins library (see the question
  below) so it's preloaded rather than evaluated at query-time. The
  arg parameter can be any valid JSON value.</p>
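<p>As an illustration, a single generalized map function could cover a whole family of "field equals value" queries, with the field and value supplied through <code>arg</code>. The function name, the field names and the sample object below are hypothetical; the sample object only imitates the shape of the object passed to a Javascript map function so the sketch can be run outside Riak.</p>

```javascript
// Hypothetical general-purpose map function: keeps objects whose JSON value
// has arg.field equal to arg.value.
function mapFieldEquals(object, keyData, arg) {
  // The stored value is read from the first entry of object.values.
  var doc = JSON.parse(object.values[0].data);
  return doc[arg.field] === arg.value ? [doc] : [];
}

// Local stand-in for a stored object (illustrative data).
var sampleObject = {
  bucket: "users",
  key: "alice",
  values: [{ metadata: {}, data: JSON.stringify({ name: "Alice", city: "Boston" }) }]
};

// The same function serves different queries by varying only the arg.
var bostonMatches = mapFieldEquals(sampleObject, undefined, { field: "city", value: "Boston" });
var austinMatches = mapFieldEquals(sampleObject, undefined, { field: "city", value: "Austin" });
```

<p>Because <code>arg</code> can be any JSON value, passing an object or an array is one way to hand several parameters to a single phase.</p>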

<p><strong>Q: What's the behavior if the map function is missing from
    one or more nodes but present on others?</strong></p>

<p>The entire query will fail. It's best to make sure, perhaps via
  automated deployment, that all of your functions are available on
  all nodes. Alternatively, you can store Javascript functions
  directly in Riak and use them in a phase with "bucket" and "key" instead
  of "source" or "name".</p>
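<p>The three ways of pointing a Javascript phase at its function can be sketched as follows. The inline function body, the bucket and the key are placeholders chosen for illustration.</p>

```javascript
// Inline source: the function text travels with the query.
var inlinePhase = {
  map: { language: "javascript", source: "function(v) { return [v.key]; }" }
};

// Named function: must already be loaded on every node that runs the phase.
var namedPhase = {
  map: { language: "javascript", name: "Riak.mapValuesJson" }
};

// Stored function: the source lives in Riak under an (illustrative) bucket and key.
var storedPhase = {
  map: { language: "javascript", bucket: "my_functions", key: "map_keys" }
};
```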

<p><strong>Q: If there are 2 map phases, for example, then does that
    mean that both phases will be run back to back on each individual
    node and *then* it's all sent back for reduce? Or is there some
    back and forth between phases?</strong></p>

<p>It's more like a pipeline: one phase feeds the next. All results
  from one phase are sent back to the coordinating node, which
  initiates the subsequent phase only after all participating nodes have
  replied.</p>
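<p>The pipeline behavior can be pictured with a toy model like the one below. It is not how Riak is implemented and it ignores distribution entirely; it only shows that each phase consumes the complete output of the phase before it. The <code>runPhases</code> function and the phase objects are invented for this sketch.</p>

```javascript
// Toy model of phase chaining: the next phase starts only once the full
// result list of the previous phase has been collected.
function runPhases(inputs, phases) {
  return phases.reduce(function (results, phase) {
    if (phase.type === "map") {
      // A map phase is applied to each input independently; outputs are concatenated.
      return results.reduce(function (acc, item) {
        return acc.concat(phase.fn(item));
      }, []);
    }
    // A reduce phase sees the whole list at once.
    return phase.fn(results);
  }, inputs);
}

var pipelineResult = runPhases([1, 2, 3], [
  { type: "map", fn: function (n) { return [n * 2]; } },
  { type: "map", fn: function (n) { return [n + 1]; } },
  { type: "reduce", fn: function (list) {
    return [list.reduce(function (a, b) { return a + b; }, 0)];
  } }
]);
```

<p>Here the first map phase yields 2, 4, 6, the second yields 3, 5, 7, and the reduce phase sums them to 15.</p>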

<p><strong>Q: Would it be possible to send a function which acts as
    both a map predicate and an updater?</strong></p>

<p>In general we don't recommend modifying objects as part of a
  MapReduce job because it can add latency to the request.  However,
  you may be able to implement this with a map function in
  Erlang. Erlang MapReduce functions have full access to Riak
  including being able to read and write data.</p>

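<pre>%% Inside your own Erlang module.
%% predicate/1 is your own function that decides whether an object matches.
map_predicate_with_update(Value,_KeyData,_Arg) ->
  case predicate(Value) of
    true -> [update_passed_value(Value)];
    _ -> []
  end.

update_passed_value(Value) ->
  {ok, C} = riak:local_client(),
  %% Placeholder: replace this binding with your own modification of Value,
  %% then store the result with C:put before returning it.
  ModifiedValue = Value,
  ModifiedValue.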
</pre>

<p>This could come in handy for large updates instead of having to
  pull each object, update it and store it.</p>

<p><strong>Q: Are Erlang named functions or JS named functions more
    performant? Which are faster &mdash; JS or Erlang functions?</strong></p>

<p>There is a slight overhead when encoding the Riak object to JSON
  but otherwise the performance is comparable.</p>

<p><strong>Q: Is there a way to use namespacing to define named
    Javascript functions? In other words, if I had a bunch of app-specific
    functions, what's the best way to handle that?</strong></p>

<p>Yes, check out
  the <a href="http://bitbucket.org/basho/riak_kv/src/tip/priv/mapred_builtins.js">built-in
    Javascript MapReduce functions</a> for an example.</p>
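<p>Following the same pattern as the built-in <code>Riak</code> object, app-specific functions could be grouped under a single global object, roughly as sketched here. The <code>MyApp</code> namespace and its function names are hypothetical.</p>

```javascript
// Hypothetical app-specific namespace holding map and reduce functions.
var MyApp = {
  // Map function: emit the length of each stored JSON array.
  mapArrayLength: function (object, keyData, arg) {
    return [JSON.parse(object.values[0].data).length];
  },
  // Reduce function: sum the numbers produced by earlier phases.
  reduceSum: function (values, arg) {
    return [values.reduce(function (a, b) { return a + b; }, 0)];
  }
};

// Phases would then refer to the functions by their namespaced names.
var namespacedQuery = [
  { map: { language: "javascript", name: "MyApp.mapArrayLength" } },
  { reduce: { language: "javascript", name: "MyApp.reduceSum" } }
];
```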

<p><strong>Q: Can you specify how data is distributed among the
    cluster?</strong></p>

<p>In short, no. Riak consistently hashes keys to determine where in the cluster
  data is
  located. <a href="http://wiki.basho.com/display/RIAK/Replication#Replication-Understandingreplicationbyexample">This
    article</a> explains how data is replicated and
  distributed throughout the cluster. In most production situations,
  your data will be evenly distributed.</p>
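<p>To give a feel for why placement is determined by the key rather than by the user, here is an illustrative sketch of hashing a bucket/key pair onto a ring split into equal partitions. It assumes a SHA-1 based ring and 64 partitions, and it does not reproduce Riak's internal encoding of the bucket/key pair, so its output will not match a live cluster. The <code>partitionFor</code> function is invented for this example and runs under Node.js.</p>

```javascript
var crypto = require("crypto");

// Illustrative only: maps a bucket/key pair to one of numPartitions equal
// slices of a 160-bit hash ring.
function partitionFor(bucket, key, numPartitions) {
  var hex = crypto.createHash("sha1").update(bucket + "/" + key).digest("hex");
  var position = BigInt("0x" + hex);           // position on the ring
  var ringSize = BigInt(2) ** BigInt(160);     // size of the SHA-1 hash space
  var sliceSize = ringSize / BigInt(numPartitions);
  return Number(position / sliceSize);         // index of the owning slice
}

// The same bucket/key always lands in the same partition.
var partitionA = partitionFor("users", "alice", 64);
var partitionB = partitionFor("users", "alice", 64);
```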

<p><strong>Q: What is the reason for the nested list of inputs to a
    MapReduce query?</strong></p>

<p>The nested list lets you specify multiple keys as inputs to your
  query, rather than a single bucket name or key. From the Erlang client,
  inputs are expressed as lists of tuples (fixed-length arrays) which
  have length of 2 (for bucket/key) or 3
  (bucket/key/key-specific-data). Since JSON has no tuple type, we
  have to express the inputs as arrays of length 2 or 3 within an array.</p>
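<p>A short sketch of the JSON form described above, with a small check that each entry has the expected length. The bucket and key names and the <code>validInputs</code> helper are made up for illustration.</p>

```javascript
// Inputs as an array of [bucket, key] or [bucket, key, keyData] arrays.
var inputs = [
  ["users", "alice"],
  ["users", "bob", "extra"] // third element is key-specific data
];

// Hypothetical check: every entry must itself be an array of length 2 or 3.
function validInputs(list) {
  return Array.isArray(list) && list.every(function (entry) {
    return Array.isArray(entry) && (entry.length === 2 || entry.length === 3);
  });
}

var inputsOk = validInputs(inputs);
// A flat list is not valid: its entries are strings, not bucket/key arrays.
var flatListOk = validInputs(["users", "alice"]);
```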

<p><strong>Q: Is there a syntax requirement of JSON for
    Riak?</strong></p>

<p>JSON is only required for the MapReduce query when submitted via
  HTTP; the objects you store can be in any format that your application will
  understand. JSON also happens to be a convenient format for MapReduce
  processing because it is accessible to both Erlang and
  Javascript. However, it is fairly common for Erlang-native applications to
  store data in Riak as serialized Erlang datatypes.</p>

<p><strong>Q: Is there any significance to the name of file for how
    Riak finds the saved functions? I assume you can leave other
    languages in the same folder and it would be ignored as long as language is set to
    javascript? Additionally, is it possible/does it make sense to
    combine all your languages into a single folder?</strong></p>

<p>Riak only looks for "*.js" files in the <code>js_source_dir</code>
  folder
  (see <a href="http://wiki.basho.com/display/RIAK/Configuration+Files">Configuration
    Files</a> on the <a href="http://wiki.basho.com/">wiki</a>). Erlang
  modules that contain map and reduce functions need to be on the
  code path, which could be completely separate from where the
  Javascript files are located.</p>

<p><strong>Q: Would you point us to any best practices around matrix
    computations in Riak? I don't see any references to matrix in the
    riak wiki...</strong></p>

<p>We don't have any specific support for matrix computations. We
  encourage you to find an appropriate Javascript or Erlang library to
  support your application.</p>

<p>&mdash; <a href="http://twitter.com/reverri">Dan</a> and <a href="http://twitter.com/seancribbs">Sean</a></p>
]]></content:encoded>
    </item>
  </channel>
</rss>
