Ripple 0.8 is Full of Good Stuff

August 31, 2010 at 11:00 PM | categories: Developers, Riak, Ruby, NoSQL

This is a repost from the blog of Sean Cribbs, our Developer Advocate. You can read the original post there.

It’s been a while since I’ve blogged about a release of Ripple; in fact, it’s been a long time since I’ve released Ripple. So this post is going to dig into Ripple 0.8 (released today, August 31) and catch you up on what has happened since 0.7.1 (and 0.5 if you don’t follow the GitHub project).

The major features, which I’ll describe in more detail below, are:

  • Supports Riak 0.12 features
  • Runs on Rails 3 (non-prerelease)
  • Adds Linked associations
  • Adds session stores for Rack and Rails 3 apps

Riak 0.12 Features

The biggest changes here were some bucket-related features. First of all, you can define default quorum parameters for requests on a per-bucket basis, exposed as bucket properties. Riak 0.12 also allows you to specify “symbolic” quorums, that is, “all” (N replies), “quorum” (N/2 + 1 replies), or “one” (1 reply). Riak::Bucket has support for these new properties and exposes them as attr_accessor-like methods. This is a big time saver if you need to tune your quorums for different use-cases or N-values.
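To make the symbolic quorums concrete, here is a minimal sketch of how each one maps to a number of replies for a given N-value. The `resolve_quorum` helper is hypothetical and is not part of Ripple or Riak; it only illustrates the arithmetic described above, assuming N/2 uses integer division.

```ruby
# Hypothetical helper (not a Ripple or Riak API): resolves a quorum setting,
# symbolic or numeric, to a number of replies for a bucket with the given N-value.
def resolve_quorum(value, n_val)
  case value
  when "all", :all       then n_val
  when "quorum", :quorum then n_val / 2 + 1 # integer division: floor(N/2) + 1
  when "one", :one       then 1
  when Integer
    # A numeric quorum only makes sense between 1 and N.
    raise ArgumentError, "quorum #{value} is outside 1..#{n_val}" unless (1..n_val).cover?(value)
    value
  else
    raise ArgumentError, "unknown quorum value: #{value.inspect}"
  end
end

puts resolve_quorum("quorum", 3) # 2 replies when N=3
puts resolve_quorum("quorum", 5) # 3 replies when N=5
puts resolve_quorum("all", 3)    # 3 replies when N=3
```

Because the symbolic values are resolved against N, the same bucket setting keeps its meaning when the N-value changes.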

Second, keys are not listed by default. There used to be a big flashing warning sign on Riak::Client#bucket that encouraged you to pass :keys => false. In Ripple 0.8 that’s the default, but it’s also explicit so that if you use the latest gem on Riak 0.11 or earlier, you should get the same behavior.

Runs on Rails 3

I’ve been pushing for Rails 3 ever since Ripple was conceived, but now that the actual release of Rails 3 is out, it’s an easier sell. Thanks to all the contributors who helped me keep Ripple up-to-date with the latest prereleases.

Linked associations

These are HOT, and were the missing features that held me back from saying “Yes, you should use Ripple in your app.” The underlying concepts take some time to understand (the upcoming link-walking page to the Fast Track will help), but you actually have a lot more freedom than foreign keys. Here are some examples (with a little detail of how they work):

You’ll notice only one and many in the above examples. From the beginning, I’ve eschewed creating the belongs_to macro because I think it has the wrong semantics for how linked associations work (links are all on the origin side). It’s more like you “point to one of” or “point to many of”. Minor point, but often it’s the language you choose that frames how you think about things.

Session stores

Outside the Ruby-sphere, web session storage is one of Riak’s most popular use-cases. Both Mochi and Wikia are using it for this. Now, it’s really easy to do the same for your Rails or Sinatra app.

For Sinatra, Padrino and other pure Rack apps, use Riak::SessionStore:

For Rails 3, use Ripple::SessionStore:

Roadmap

If you’re curious about what’s coming next in Ripple, or want to provide feedback on the direction, be sure to check out the issue tracker. As always, questions can be sent directly to me (sean AT basho DOT com) or to the riak-users mailing list. Documentation is available on GitHub Pages.

Cheers!



Webinar Recap - Riak in Action: Wriaki

August 20, 2010 at 11:00 AM | categories: Riak, Erlang, Webmachine, NoSQL, Database

Thank you to those who attended our webinar yesterday. Like before, we're recapping the questions below for everyone's sake (in no particular order). If you missed the webinar, want some more information, or want to ask us some more questions, we've prepared a resource page for you. As always, you can also get ahold of us directly.

Q: How would you solve full-text search with the current versions of Riak? One could also take Wriaki as an example, as most wikis have some sort of full-text search functionality.

I recommend using existing fulltext solutions. Solr has matched up well with most of the web applications I have written, and would certainly work for Wriaki as well.

Q: Where in the course of the interaction (shown on slide 18) are you defining the client ID? Don't you need the client ID and vclock to match between updates?

On slide 42, we talk about "actors" which are essentially client IDs. Using the logged-in user as the client ID can help prevent vclock explosion and is a sensible way of structuring your updates.

Bryan



Free Webinar - Riak in Action: Wriaki - August 19 @ 2PM Eastern

August 13, 2010 at 11:00 AM | categories: Riak, Webmachine, NoSQL, Database

Documentation is great, but playing with examples can also be a helpful way to tackle steep learning curves. To help you learn about ways of using Riak, we'd like to present "Wriaki", an example implementation of a wiki that stores its data in Riak.

We invite you to join us for a free webinar on Thursday, August 19 at 2:00PM Eastern Time (UTC-4) about Riak in Action: Wriaki. During the presentation, Bryan Fink will cover:

  • Modeling wiki data in the Riak key/value store
  • Access patterns using both get/put and map/reduce
  • Three strategies that Wriaki uses for dealing with eventual consistency
  • How the user interface changes to accommodate Wriaki's models

The code for Wriaki will be open-source at the time of the presentation. The presentation will last 30 to 45 minutes, with time for questions at the end. Fill in the form below to reserve your seat! Sorry, registration has closed!

If you cannot attend, the video and slides will be made available afterward in the recap post on the blog.



Basho Partners with Joyent to Bring You Hosted Riak

August 08, 2010 at 12:30 AM | categories: Riak, NoSQL, Database

This is a huge day for Basho Technologies, Riak, and our growing community of users.

We are thrilled to announce Basho's partnership with Joyent to bring our community hosted Riak on Joyent's Smart platform. With both open source and enterprise versions available, anyone can quickly spin up a Riak cluster and start building applications.

When we first began talking to Jason and David and the rest of the Joyent team early this year, we realized we shared a common vision for the future of infrastructure. The past several months have been spent finalizing the details, and in just a few weeks you'll be able to go to my.joyent.com and, with a few clicks, purchase and deploy as many nodes of Riak as you want, need, and can handle.

Making pre-configured Riak SmartMachines available in the Joyent cloud will enable developers to combine all the benefits of Riak with the proven, advanced hosting platform that businesses like LinkedIn, Gilt, and Backstage rely on every day.

The team at Joyent has posted more details about the partnership on their blog. Go read it, and mark your calendar, because hosted Riak is here!

Thanks,

Earl


Webinar Recap - Riak with Rails

August 06, 2010 at 10:00 AM | categories: Riak, Ruby, NoSQL, Database

Thank you to those who attended our Rails-oriented webinar yesterday. Like before, we're recapping the questions below for everyone's sake (in no particular order). If you missed the webinar, want some more information, or want to ask us some more questions, we've prepared a resource page for you. As always, you can also get ahold of us directly.

Q: When you have multiple application servers and Riak nodes, how do you handle "replication lag"?

Most web applications have some element of eventual consistency (or potential inconsistency) in them by their nature. Object and view caches sacrifice immediate consistency for gains in throughput and latency, and hopefully provide a better user experience. With Riak, you can achieve acceptable data freshness by "reading your writes". That is, use the same read quorum as your write quorum and make sure that R+W is greater than N. For example, using R=W=DW=2 when N=3 will give a strong assurance of consistency.
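The R+W > N rule can be checked by brute force. The following is a minimal sketch with hypothetical helper names (nothing here is a Riak or Ripple API): it models the N replicas as integers and tests whether every possible set of W written replicas shares at least one member with every possible set of R read replicas.

```ruby
# Hypothetical helper: the arithmetic rule from the text.
def read_your_writes?(n:, r:, w:)
  r + w > n
end

# Hypothetical helper: verifies the rule by enumerating every possible
# write set of size w and read set of size r among n replicas.
def always_overlap?(n:, r:, w:)
  replicas = (0...n).to_a
  replicas.combination(w).all? do |write_set|
    replicas.combination(r).all? { |read_set| !(write_set & read_set).empty? }
  end
end

puts read_your_writes?(n: 3, r: 2, w: 2) # true
puts always_overlap?(n: 3, r: 2, w: 2)   # true
puts read_your_writes?(n: 3, r: 1, w: 1) # false
puts always_overlap?(n: 3, r: 1, w: 1)   # false
```

With N=3 and R=W=2, any two replicas that answer a read must include at least one that acknowledged the write. With R=W=1, a read can land on a replica the write has not reached yet.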

Q: I find myself doing def key; id; end. Is there any easier way to tell Ripple the key?

Currently there is not. However, I've found myself using this pattern frequently when I want a meaningful key that is also an attribute. There's an issue on the tracker just for this feature. In the meantime, you could use two method aliases:

class User
  include Ripple::Document
  property :email, String, :presence => true
  
  # This forces all attribute methods to be defined
  define_attribute_methods
  alias_method :key, :email
  alias_method :key=, :email=
end

As long as your property is a string, this should work just fine.

Q: Any tips on how to handle pagination over MapReduce queries?

The challenge with pagination in Riak is that reduce phases are not guaranteed to run only once, but instead are run in parallel as results from the previous phase come in asynchronously, and then followed by a final reduce. So in a sense, you have to treat all invocations of your reduce function as a "re-reduce". We have plans to allow reduce phases to specify that they should be run only once, but for right now you can get around this limitation.
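To illustrate why every reduce invocation has to be treated as a "re-reduce", here is a small Ruby simulation. It is only a sketch: real Riak reduce functions are not written this way, and `run_reduce`, `reduce_sum`, and `reduce_count_naive` are hypothetical names. The simulation applies the reduce function to each batch of inputs as it arrives, then applies it once more to the collected partial results.

```ruby
# Safe under re-reduce: the output (a list holding one number) has the same
# shape as the input, so feeding outputs back in still gives the right total.
def reduce_sum(values)
  [values.sum]
end

# NOT safe under re-reduce: it counts its inputs, so when it is fed partial
# results it counts the partials instead of the original values.
def reduce_count_naive(values)
  [values.length]
end

# Simulates a reduce phase: reduce each batch, then reduce the partial results.
def run_reduce(batches)
  partials = batches.flat_map { |batch| yield(batch) }
  yield(partials)
end

batches = [[1, 1, 1], [1, 1], [1]] # six inputs arriving in three batches

p run_reduce(batches) { |values| reduce_sum(values) }         # [6]
p run_reduce(batches) { |values| reduce_count_naive(values) } # [3], not 6
```

The naive count reports 3 because the final invocation only sees the three partial results, which is the behavior a pagination phase has to work around.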

Reduce phases are always run on the coordinating node, so if you put a reduce phase before the one where you want to perform pagination, you are pretty much guaranteed that the whole result set is going to be available in a single application of the final reduce. A typical combination would be a "sorting" phase followed by a "pagination" phase. Riak.reduceSort and Riak.reduceSlice are two built-in functions that could help accomplish this task.
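The sort-then-slice combination can be sketched in plain Ruby. These functions are stand-ins written for illustration, not Riak's JavaScript built-ins, and the slice is assumed to take a start index (inclusive) and an end index (exclusive). The `paginate` helper and its `page` and `per_page` parameters are hypothetical.

```ruby
# Stand-in for a "sorting" phase: returns the whole result set in order.
def reduce_sort(values)
  values.sort
end

# Stand-in for a "pagination" phase: keeps only values[start...finish].
# Returns an empty list when the requested window is past the end.
def reduce_slice(values, start, finish)
  values[start...finish] || []
end

# Hypothetical helper that turns a 1-based page number into slice bounds.
def paginate(values, page:, per_page:)
  start = (page - 1) * per_page
  reduce_slice(reduce_sort(values), start, start + per_page)
end

results = [5, 3, 9, 1, 7]
p paginate(results, page: 1, per_page: 2) # [1, 3]
p paginate(results, page: 2, per_page: 2) # [5, 7]
p paginate(results, page: 3, per_page: 2) # [9]
```

The slice is only meaningful because the sort has already seen the complete result set, which is what placing a reduce phase in front of the pagination phase is meant to provide.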

Sean and Grant



Introducing Riak Core

July 30, 2010 at 01:30 PM | categories: Riak, Riak Core, NoSQL, Database

This post was originally published on Kevin Smith's Blog, Hypothetical Labs. If you have questions or comments, please use the original post.

What is riak_core?

riak_core is a single OTP application which provides all the services necessary to write a modern, well-behaved distributed application. riak_core began as part of Riak. Since the code was generally useful in building all kinds of distributed applications we decided to refactor and separate the core bits into their own codebase to make it easier to use.

Distributed systems are complex and some of that complexity shows in the number of features available in riak_core. Rather than dive deeply into code, I’m going to separate the features into broad categories and give an overview of each.

Note: If you’re the impatient type and want to skip ahead and start reading code, you can check out the source to riak_core via hg or git.

Node Liveness & Membership

riak_core_node_watcher is the process responsible for tracking the status of nodes within a riak_core cluster. It uses net_kernel to efficiently monitor many nodes. riak_core_node_watcher also has the capability to take a node out of the cluster programmatically. This is useful in situations where a brief node outage is necessary but you don’t want to stop the server software completely.

riak_core_node_watcher also provides an API for advertising and locating services around the cluster. This is useful in clusters where nodes provide a specialized service, like a CUDA compute node, which is used by other nodes in the cluster.

riak_core_node_watch_events cooperates with riak_core_node_watcher to generate events based on node activity, e.g. joining or leaving the cluster. Interested parties can register callback functions which will be called as events occur.

Partitioning & Distributing Work

riak_core uses a master/worker configuration on each node to manage the execution of work units. Consistent hashing is used to determine which target node(s) to send the request to, and the master process on each node farms out the request to the actual workers. riak_core calls worker processes vnodes. The coordinating process is the vnode_master.
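As a rough illustration of the consistent hashing idea, here is a toy ring in Ruby. It is a simplified sketch and not riak_core's implementation (which is written in Erlang): the `Ring` class, the round-robin ownership, and the default of 8 partitions are all assumptions made for the example. A key is hashed with SHA-1 onto a fixed hash space, the hash space is divided into equal partitions, and each partition is owned by a node.

```ruby
require "digest"

# Toy consistent-hashing ring. Hypothetical sketch, not riak_core's code.
class Ring
  RING_SIZE = 2**160 # size of the SHA-1 output space

  def initialize(nodes, partitions: 8)
    @partitions = partitions
    # Assumption for the example: partitions are handed out round-robin.
    @owners = Array.new(partitions) { |index| nodes[index % nodes.size] }
  end

  # Maps a key to a partition index in 0...partitions.
  def partition_for(key)
    hash = Digest::SHA1.hexdigest(key).to_i(16)
    (hash * @partitions) / RING_SIZE
  end

  # Returns n [partition, node] pairs, starting at the key's partition and
  # walking around the ring through consecutive partitions.
  def preference_list(key, n)
    first = partition_for(key)
    (0...n).map do |offset|
      index = (first + offset) % @partitions
      [index, @owners[index]]
    end
  end
end

ring = Ring.new(%w[node_a node_b node_c])
p ring.partition_for("users/sean")
p ring.preference_list("users/sean", 3)
```

Because the partition count is fixed, adding a node only changes which node owns some partitions; the mapping from key to partition stays the same.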

The partitioning and distribution logic inside riak_core also handles hinted handoff when required. Hinted handoff occurs as a result of a node failure or outage. In order to assure availability, most clustered systems will use operational nodes in place of down nodes. When the down node comes back the cluster needs to migrate the data from its temporary home on the substitute nodes to the data’s permanent home on the restored node. This process is called hinted handoff and is managed by components inside riak_core. riak_core also handles migrating partitions to new nodes when they join the cluster such that all work continues to be evenly partitioned to all cluster members.
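The hinted handoff cycle described above can be modeled with a toy cluster. This is a deliberately simplified sketch, not how riak_core implements handoff: the `ToyCluster` class and its methods are hypothetical, and the fallback is simply the first node that is up. A write aimed at a down node is parked on a fallback together with a hint naming its real home, and is moved back when the home node returns.

```ruby
# Toy model of hinted handoff. Hypothetical sketch, not riak_core's code.
class ToyCluster
  def initialize(nodes)
    @up = nodes.to_h { |node| [node, true] }
    @data = nodes.to_h { |node| [node, {}] }   # node => {key => value} it owns
    @hinted = nodes.to_h { |node| [node, {}] } # fallback => {[home, key] => value}
  end

  def mark_down(node)
    @up[node] = false
  end

  # Stores the value on its home node, or on a fallback (with a hint) if the
  # home node is down. Returns the node that accepted the write.
  def put(home, key, value)
    if @up[home]
      @data[home][key] = value
      return home
    end
    fallback = @up.keys.find { |node| @up[node] }
    raise "no nodes available" if fallback.nil?
    @hinted[fallback][[home, key]] = value
    fallback
  end

  # Brings a node back and hands off everything that was held on its behalf.
  def mark_up(node)
    @up[node] = true
    @hinted.each_value do |held|
      held.delete_if do |(home, key), value|
        next false unless home == node
        @data[node][key] = value
        true
      end
    end
  end

  def data_on(node)
    @data[node].dup
  end

  def hinted_on(node)
    @hinted[node].dup
  end
end

cluster = ToyCluster.new(%w[node_a node_b node_c])
cluster.mark_down("node_a")
puts cluster.put("node_a", "k", "v") # accepted by the fallback, node_b
cluster.mark_up("node_a")
p cluster.data_on("node_a")          # {"k"=>"v"} after handoff
```

The write stays available while node_a is down, and the data ends up in its permanent home once node_a is restored.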

riak_core_vnode_master starts all the worker vnodes on a given node and routes requests to the vnodes as the cluster runs.

riak_core_vnode is an OTP behavior wrapping all the boilerplate logic required to implement a vnode. Application-specific vnodes need to implement a handful of callback functions in order to participate in handoff sessions and receive work units from the master.

Cluster State

A riak_core cluster stores global state in a ring structure. The state information is transferred between nodes in the cluster in a controlled manner to keep all cluster members in sync. This process is referred to as “gossiping”.

riak_core_ring is the module used to create and manipulate the ring state data shared by all nodes in the cluster. Ring state data includes items like partition ownership and cluster-specific ring metadata. Riak KV stores bucket metadata in the ring metadata, for example.

riak_core_ring_manager manages the cluster ring for a node. It is the main entry point for application code accessing the ring, via riak_core_ring_manager:get_my_ring/0, and also keeps a persistent snapshot of the ring in sync with the current ring state.

riak_core_gossip manages the ring gossip process and ensures the ring is generally consistent across the cluster.

What’s the plan?

Over the next several months I’m going to cover the process of building a real application in a series of posts to this blog, where each post covers some aspect of system building with riak_core. All of the source to the application will be published under the Apache 2 license and shared via a public repo on GitHub.

And what type of application will we build? Since the goal of this series is to illustrate how to build distributed systems using riak_core and also to satisfy my own technical curiosity, I’ve decided to build a distributed graph database. A graph database should provide enough use cases to really exercise riak_core while at the same time not obscuring the core learning experience in tons of complexity.

Thanks to Sean Cribbs and Andy Gross for providing helpful review and feedback.



Free Webinar - Riak with Rails - August 5 @ 2PM Eastern

July 29, 2010 at 05:00 PM | categories: Riak, Ruby, NoSQL, Database

Ruby on Rails is a powerful web framework that focuses on developer productivity. Riak is a friendly key-value store that is simple, flexible and scalable. Put them together and you have lots of exciting possibilities!

We invite you to join us for a free webinar on Thursday, August 5 at 2:00PM Eastern Time (UTC-4) to talk about Riak with Rails. In this hands-on webinar, we'll discuss:

  • Setting up a new Rails 3 project for Riak
  • Storing, retrieving, manipulating key-value data from Ruby
  • Issuing map-reduce queries
  • Creating rich document models with Ripple
  • Using Riak as a distributed cache and session store

The presentation will last 30 to 45 minutes, with time for questions at the end. Fill in the form below if you want to get started building Rails applications on top of Riak! Sorry, registration is closed.

The Basho Team



Consistent Smashing

July 28, 2010 at 01:30 PM | categories: Riak, NoSQL, Database

Sometimes you need more than words to illustrate a point. Here is Basho's humble attempt to clarify the difference between "Dynamo-Style" systems (like Riak), which use consistent hashing to achieve fault tolerance and simple scaling and to prevent data loss, and systems that use techniques like sharding.

Enjoy!

Mark

Consistent Smashing from Basho Technologies on Vimeo.


