Tuesday, April 9, 2013

Ruby 1.8.7-p358 on Mountain Lion with RVM

While trying to install Ruby 1.8.7-p358 I ran into a compile error:
/usr/include/tk.h:78:23: error: X11/Xlib.h: No such file or directory
I found the answer to the problem here and an explanation here.

Resolving the problem is easy. Just follow these steps:

  1. Install XQuartz
  2. Tell the compiler to use the correct installation with: export CPPFLAGS=-I/opt/X11/include
  3. Then try the install again with RVM: rvm install 1.8.7-p358

Wednesday, January 16, 2013

Riak: What's faster? 2i or Key Filters

Recently we've experienced a large surge in the size of our data and it's called into question some of our querying approaches and node configuration.

We had been using 2i for most of our querying and a combination of "Data Point Objects" and MapReduce for our more analytical needs. However, when our MapReduce jobs started bombing, we reviewed our querying approach. (For the record, they were bombing due to pref_list exhausted errors, which are currently resolved. More on this in a subsequent post.)

It didn't take long to find posts like this on Google:
Be aware that key filters are just a thin layer on top of full-bucket key listings. You'd be better off storing the field you want to filter in a secondary index, which more efficiently supports range queries (note that only the LevelDB storage engine currently supports secondary indexes). Barring that, you could use the special "$key" index for a range query on the key. - Sean Cribbs (http://riak-users.197444.n3.nabble.com/preflist-exhausted-error-td3935922.html)
Key Filters are the backbone of our querying approach. We were, and still are, under the impression that this is the best way to get a subset of the data for MapReduce. Since most of our queries are over date ranges, all of our keys are prefixed with yyyymmddhhmmss. This allows us to use date-based Key Filters in our MapReduce. According to the post, however, we should be using 2i and the special $key field for performance reasons.
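To illustrate why a yyyymmddhhmmss prefix makes date range queries possible at all, here is a minimal sketch in plain Ruby. It does not talk to Riak; timestamped_key and keys_between are hypothetical helpers that only model the idea that such keys sort lexicographically in date order, which is what both a Key Filter and a $key range query rely on.

```ruby
# Hypothetical helper: build a key whose prefix is a yyyymmddhhmmss timestamp.
def timestamped_key(time, suffix)
  "#{time.strftime('%Y%m%d%H%M%S')}-#{suffix}"
end

# Hypothetical stand-in for a range query: keep the keys that sort at or
# after `from` and strictly before `to` (plain string comparison).
def keys_between(keys, from, to)
  keys.select { |key| key >= from && key < to }
end

keys = [
  timestamped_key(Time.utc(2012, 11, 30, 23, 59, 59), "a"),
  timestamped_key(Time.utc(2012, 12, 1, 0, 0, 0), "b"),
  timestamped_key(Time.utc(2012, 12, 31, 23, 59, 59), "c"),
  timestamped_key(Time.utc(2013, 1, 1, 0, 0, 0), "d")
]

# Everything from December 2012: the bounds are just date prefixes.
december = keys_between(keys, "20121201", "20130101")
puts december.inspect
# prints ["20121201000000-b", "20121231235959-c"]
```

Because the comparison is purely on strings, the same bounds could be handed to either querying approach; the open question in this post is only which one Riak executes faster.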

First of all, what is $key??? Here is what the Riak Handbook has to say about the $key field:
There are special field names at your disposal too, namely the field $key, which automatically indexes the key of the Riak object. Saves you the trouble of specifying it twice. Riak automatically indexes the key as a binary field for your convenience ...
So I put it to the test. For the record, we are using Ruby and Rails with Ripple (ODM) and Ripplr (RiakSearch). I opened a Rails console and ran my Key Filter based MapReduce. It returned 31,830 rows in a little over a minute on average. I then queried the same bucket using 2i and $key with the same parameters and no Reduce phase. It ran for over 3 minutes before I canceled it! As a sanity check I shrank the range to two months that I knew had much less data ... it returned results over a minute later!

The code for our Key Filter:
The code for the comparable 2i range query:
So my question to all of you is why is 2i/$key being recommended over Key Filters? What am I missing here?

As a side note ... Riak handled an extraordinary spike in load like a champ writing and reading data. Our concerns strictly revolve around how best to query that data now that we have it.

Monday, January 14, 2013

Why CodeMash continues to be the awesome!



What is CodeMash? 

"CodeMash is a unique event that will educate developers on current practices, methodologies, and technology trends in a variety of platforms and development languages such as Java, .Net, Ruby, Python and PHP." - Codemash.org 

This was CodeMash's 7th year and my 4th year in attendance. While the CodeMash description is accurate, what it fails to describe is the CodeMash culture. Education does not just happen during speakers' sessions or precompilers. It also occurs while sitting down and having a brew with a signatory of the Agile Manifesto or while having a casual conversation about Rails 4.0 with a core contributor. It even occurs while sitting in the hot tub/swim up bar with other agile coaches talking about real world scenarios. There are no rules at CodeMash, except one ... be yourself. Oh, and maybe lots of crispy bacon.

2013 tech trends at CodeMash

I tend to be a Ruby centric developer these days, but these were the buzzwords I continued to hear while at CodeMash.
  • JavaScript
  • Gamification
  • Single Page Applications
  • Bacon Bar
  • and more JavaScript

Javascripts

It was extremely difficult to attend a talk that did not mention JavaScript. Several frameworks have evolved around JavaScript including AngularJS, Backbone, Node, and Knockout. Testing for JavaScript has also matured significantly with tools like Jasmine and Lineman (see my notes on the Test Double talk).
On a similar note, CoffeeScript is the language of choice for crafting JavaScript, as it compiles down to, and promotes, best practice JavaScript code that is 99.9% guaranteed to work in IE! And you don't need to work in Rails to use CoffeeScript. You can use the coffee command line tool to watch a directory and have it (re)compile your CoffeeScript as you make changes to it. Keep reading for more on testing in JavaScript!

Gamification and Single Page Applications

Dennis and Brian from SRT Solutions have crafted an application for exploring different ways to write single page applications. If you have ever read a choose your own adventure book, you're going to love their application titled Choose your own application. The focus is on building your own single page application with your choice of technologies. The application has been "gamified" and "players" earn badges for each choice they make. This is a great opportunity to explore a new technology in a fun way. Technology choices include Backbone, Knockout.js, .NET, Rails, Node.js, Heroku, CoffeeScript, and Azure.
Rails and Single Page Applications ... with the release of Rails 4.0, Rails will be adding default support for Single Page Applications (Turbolinks). Currently it can be disabled by removing the Turbolinks gem from the Gemfile; otherwise you will need to disable it on a per link basis. DHH has stated he intends to drive Rails in the direction which is best for Basecamp, a single page application, so expect more changes like this in the near future. My $.02: expect a community fork of Rails in the near future.
Brian Prince delivered an excellent talk on Gamification. He discussed several real world examples where Gamification has led to modifying user behavior, including applications that encourage diabetics to test insulin levels regularly and elderly people living at home alone to stay active and engaged. The important thing to remember is to identify the behavior you want to change and then gamify that aspect of your application to encourage that behavior. Adding badges for the sake of adding badges often results in encouraging the wrong behavior.

And Gamification does not just mean badges. Take the bottomless trash can for example. It changes behavior by encouraging people to put their trash into a trash can. When they do, it sounds like their garbage is falling into a deep chasm. It's fun and gets people to do it again. They actually found people looking for trash nearby just to throw in it!

Machine Learning

Seth Juarez delivered two excellent presentations on Machine Learning. Machine learning allows us to find and exploit patterns in data. There are two main classifications of machine learning, supervised and unsupervised. Supervised learning allows us to be predictive while unsupervised learning helps us to understand the structure of the data. For more details, read my notes from Seth's talks (part 1, part 2).
Seth also has a NuGet package that can be imported into Visual Studio. It is called NuML and can be found here. It was demoed during his talk and looks awesome! As the number of Big Data projects grows, this is going to become an increasingly common topic for discussion and application.

Real world Javascript testing

JavaScript testing has really improved since I last looked into it. Jasmine appears to be the front runner and, from what I saw and experienced, is my preferred choice. It looks a lot like RSpec and can use the rspec-given syntax thanks to Justin Searls and Jasmine-Given. Justin demonstrated a combination of tools that makes testing JavaScript extremely easy. Lineman is one of those tools and requires Node.js and NPM in order to install it. Lineman is used to run your Jasmine specs. You can read more about JavaScript testing in my detailed notes on his talk.

Better Metrics for your team

Nayan Hajratwala gave a fantastic demonstration on measuring your team's effectiveness. Traditionally teams have been measured by cyclomatic complexity, velocity, hours in office, etc. However, none of those answer what the customer really wants to know ... What is the team's throughput?
Throughput is the rate at which features are passing through the system. Most often teams try to deliver more by putting more work in progress into the system. This often results in lower quality, bottlenecks, and overall lower throughput.
Cycle Time is the time between two successfully delivered features and can be computed using Little's Law, which is described as:
The average number of work items in a stable system is equal to
their average completion rate, multiplied by their average time in the
system.  
To demonstrate, Nayan created 4 teams and had each team play "The Dot Game". The game has the team divided up into 8 roles and the team measures how fast they can assemble the "product". The demonstration showed that adding more work in progress only resulted in less being delivered. Nayan then changed the rules of the game such that there was less work in progress by requiring each role to only work on one product at a time and repeated the exercise. Each of the 4 teams saw an average of 8x improvement in Cycle Time, a huge improvement in quality, and an increase in the amount of product produced.
The goal should not be 100% utilization of the workforce; it should be maximizing throughput. This demonstration showed that minimizing work in progress and having each role focus on one thing at a time resulted in less than 100% utilization, but also in much higher throughput and higher quality.
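As a rough worked example of Little's Law as quoted above, the relationship can be rearranged to give the average time an item spends in the system: work in progress divided by completion rate. The sketch below is mine, not from Nayan's talk; average_cycle_time and the numbers (features, weeks) are hypothetical and only there to show the arithmetic.

```ruby
# Little's Law: average work in progress =
#   average completion rate * average time in the system.
# Rearranged: average time in the system = work in progress / completion rate.
def average_cycle_time(work_in_progress, throughput)
  raise ArgumentError, "throughput must be positive" unless throughput > 0

  work_in_progress.to_f / throughput
end

# Hypothetical team that completes 2 features per week in both cases.
overloaded = average_cycle_time(16, 2) # 16 features in progress
focused    = average_cycle_time(4, 2)  # 4 features in progress

puts "16 in progress: #{overloaded} weeks per feature"
puts "4 in progress:  #{focused} weeks per feature"
```

With the completion rate held constant, cutting work in progress from 16 to 4 cuts the average time in the system from 8.0 to 2.0 weeks, which is the effect the Dot Game makes visible.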


Bacon Bar!

Several stations were assembled, each with its own mouth-watering trays of bacon and selection of toppings. 350 lbs of bacon were consumed in a very short amount of time and no heart attacks were reported. Thanks to Josh Walsh and Designing Interactive for coming up with this great idea and sponsoring the activity!!


I was, however, surprised that Duct Tape beat Bacon 34-29 in the first round of Manifest's MashMadness. Duct Tape even went on to beat Gandalf the Grey in the championship round. Gonzaga??




Monday, October 1, 2012

Custom RSpec matchers for testing your Ripple Documents

We're currently using Ripple for modeling our objects in Riak. Since we are test first, we needed some matchers to help make specifying our Ripple::Documents easier.

(As a side note, we are also developing a gem to wrap RiakSearch functionality. The gem is called Ripplr and has been pushed to RubyGems.)

These matchers were created by extracting specifications from within Ripple's test suite.

Note: Matchers were updated on 10-17-2012 to support embedded docs and to add failure messages.

Why Nginx kept Riak's secondary indexes(2i) from working

Riak recently introduced Secondary Indexes (a.k.a. 2i). 2i allows objects in Riak to be stored with additional queryable values. At Validas we have leveraged 2i for validating uniqueness instead of relying on MapReduce or RiakSearch to perform the check, among other things. This worked great for us in our local dev environment.

Unfortunately when we deployed to our staging environment any feature that leveraged 2i failed to work. We spent a good amount of time reviewing our specs and feature implementation, but everything checked out fine so we moved on to our Riak app.config, comparing line for line the staging and development configuration. This did not produce any answers either.

Eventually we landed at our load balancer, Nginx. When we bypassed Nginx 2i worked perfectly! We could add an index to an object and then find the object by searching for the indexed value. We were able to determine that Nginx was removing the index header information from our request.

We were not sure why Nginx was squashing a seemingly innocent 2i header and not other Riak specific headers, until we ran across this post ... How to get non standard http headers on Nginx. By default, Nginx removes headers that use underscores.

As you can see here, Riak depends on underscores for managing secondary indexes as it suffixes the index name with '_bin'.
That same post generously linked to Nginx: underscores in headers which shows you how to modify your Nginx config and allow underscores in headers. Once we made that change, all of our Riak 2i features worked flawlessly!
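To make the failure mode concrete, here is a small sketch in plain Ruby. Riak's HTTP API carries secondary indexes in headers named x-riak-index-<field>_bin (or _int), so the underscore is unavoidable. survives_default_nginx? is a hypothetical, simplified model of the default behavior described above (names made of letters, digits, and hyphens are kept), not Nginx's actual parser, and the field name email is made up. The Nginx setting that lifts the restriction is the underscores_in_headers directive set to on.

```ruby
# Build the HTTP header name Riak uses for a secondary index on `field`.
# `type` is "bin" for binary (string) indexes or "int" for integer indexes.
def riak_index_header(field, type)
  "x-riak-index-#{field}_#{type}"
end

# Simplified model (an assumption, not Nginx source): with default settings
# a header name is kept only if it consists of letters, digits and hyphens.
def survives_default_nginx?(header_name)
  header_name.match?(/\A[A-Za-z0-9-]+\z/)
end

headers = [
  "x-riak-vclock",
  "x-riak-meta-owner",
  riak_index_header("email", "bin")
]

dropped = headers.reject { |name| survives_default_nginx?(name) }
puts dropped.inspect
# prints ["x-riak-index-email_bin"]
```

This matches what we saw: other Riak specific headers passed through the load balancer untouched, while only the index header disappeared.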

Hopefully if you have this problem you find this blog post in less time than it took us to troubleshoot!

Tuesday, September 4, 2012

Riak Search with Ripple in 3 easy steps

Modify your Riak app.config

By default, Riak Search is disabled. You can enable it in the app.config on each Riak server by changing the value from false to true. The config section should then look something like this:

%% Riak Search Config
 {riak_search, [
                %% To enable Search functionality set this 'true'.
                {enabled, true}
               ]},

Don't forget to restart Riak afterwards.

Enable the commit hook

From the Rails console you can enable the commit hook for the model you would like to make searchable. Keep in mind anything created prior to doing this will not be searchable. You will need to go back afterwards and essentially touch each of those objects.

MyObject.bucket.enable_index!
Just replace MyObject with the class you wish to make searchable.

Search!

Ripple.client.search MyObject.bucket.name, "some_field:query"

For additional information, refer to these sites:

http://djds4rce.wordpress.com/2012/03/22/riak-fulll-text-search-with-ripple/ 

 

Wednesday, August 15, 2012

How to install Ruby and Rails on Mountain Lion as of now

Since Xcode no longer packages gcc, you need to do some fancy dancing to get your rubies compiled.


Install Xcode 4.4
Install Xcode 4.4 from the App Store
Install Xcode Command Line Tools (Open Xcode, Preferences, Downloads)
sudo xcodebuild -license agree

If you skip that last step, you will see the following while attempting to install apple-gcc42:
You have not agreed to the Xcode license agreements, please run 'xcodebuild -license' 

Install MacPorts
Download the Mountain Lion installer from http://www.macports.org/install.php/ and run it
sudo port selfupdate

Install GCC
sudo port install apple-gcc42
sudo port install gmake
sudo port install gpatch
sudo ln -s /opt/local/bin/gcc-apple-4.2 /opt/local/bin/gcc
sudo ln -s /opt/local/bin/gmake /opt/local/bin/make
sudo ln -s /opt/local/bin/gpatch /opt/local/bin/patch

Install RVM
curl -L https://get.rvm.io | bash -s stable --ruby
source ~/.rvm/scripts/rvm
rvm install 1.9.3-p194
rvm use 1.9.3-p194 --default

Happy Times!