Sorting with Ruby, RubyInline, and Thrust
I decided to see what sort of speed-up I could get just doing a straight brute-force sort. First, I wanted a pure Ruby implementation. Next, I wanted a RubyInline version. Finally, I wanted to use RubyInline with Thrust. Here is the code I used.
Please NOTE: I understand that the appropriate choice of an algorithm is more important than ‘micro’ optimizations. I just thought it would be fun to check out.
# Ruby Sort Neighbors example # Take N random (x,y) points # For each point, sort the neighbors! require 'rubygems' require 'inline' N = 1_000 class Point attr_accessor :x, :y def initialize(x, y) @x = x @y = y end def distance(other) Math.sqrt((other.x - @x)**2 + (other.y - @y)**2) end end def find_nearest_neighbors(point, neighbors) return neighbors.sort { |a,b| point.distance(a) <=> point.distance(b) } end class NearestNeighbor inline(:C) do |builder| builder.add_compile_flags '-x c++', '-lstdc++', '-I/usr/local/include/boost-1_39/' builder.include '<algorithm>' builder.include '<vector>' builder.include '<boost/bind.hpp>' builder.include '<cmath>' builder.prefix ' int distance(VALUE a, VALUE b, VALUE c) { double a_x = NUM2DBL(rb_iv_get(a, "@x")); double a_y = NUM2DBL(rb_iv_get(a, "@y")); double b_x = NUM2DBL(rb_iv_get(b, "@x")); double b_y = NUM2DBL(rb_iv_get(b, "@y")); double c_x = NUM2DBL(rb_iv_get(c, "@x")); double c_y = NUM2DBL(rb_iv_get(c, "@y")); double b_dist = sqrt(pow(b_x-a_x, 2) + pow(b_y-a_y, 2)); double c_dist = sqrt(pow(c_x-a_x, 2) + pow(c_y-a_y, 2)); return (b_dist < c_dist); }' builder.c ' VALUE find_nearest_neighbors(VALUE point, VALUE neighbors) { const VALUE *base = RARRAY(neighbors)->ptr; std::vector<VALUE> sorted_neighbors(base, base + RARRAY(neighbors)->len); std::sort(sorted_neighbors.begin(), sorted_neighbors.end(), boost::bind(&distance, point, _1, _2)); return rb_ary_new4(RARRAY(neighbors)->len, &sorted_neighbors.front()); }' end end class NearestNeighborThrust inline(:C) do |builder| builder.add_compile_flags '-x c++', '-lstdc++', '-I/usr/local/cuda/include' builder.include '<vector>' builder.include '<cmath>' builder.include '<thrust/sort.h>' builder.prefix ' float distance(VALUE a, VALUE b) { double a_x = NUM2DBL(rb_iv_get(a, "@x")); double a_y = NUM2DBL(rb_iv_get(a, "@y")); double b_x = NUM2DBL(rb_iv_get(b, "@x")); double b_y = NUM2DBL(rb_iv_get(b, "@y")); return sqrt(pow(b_x-a_x, 2) + pow(b_y-a_y, 2)); }' builder.c ' VALUE find_nearest_neighbors(VALUE point, VALUE neighbors) { VALUE *base = RARRAY(neighbors)->ptr; const int length = RARRAY(neighbors)->len; std::vector<double> keys(length); for(int i = 0; i < length; i++) keys[i] = distance(point, base[i]); thrust::sort_by_key(&keys.front(), &keys.front()+length, base); return rb_ary_new4(length, base); }' end end # initialize our points points = [] N.times { points << Point.new(rand, rand) } nearest_neighbors = [] start = Time.now points.each_index { |i| nearest_neighbors[i] = find_nearest_neighbors(points[i], points[0...i] + points[i+1..(N-1)]) } finish = Time.now puts "Ruby Time: #{finish-start}" nearest_neighbors_inline = [] nn = NearestNeighbor.new() start = Time.now points.each_index { |i| nearest_neighbors_inline[i] = nn.find_nearest_neighbors(points[i], points[0...i] + points[i+1..(N-1)]) } finish = Time.now puts "Inline Time: #{finish-start}" nearest_neighbors_thrust = [] nn_thrust = NearestNeighborThrust.new() start = Time.now points.each_index { |i| nearest_neighbors_thrust[i] = nn_thrust.find_nearest_neighbors(points[i], points[0...i] + points[i+1..(N-1)]) } finish = Time.now puts "Inline w/ Thrust: #{finish-start}"
Here are the results I got on my new MacBook 13″:
Ruby Time: 46.998292
Inline Time: 5.911666
Inline w/ Thrust: 0.726583
June 18th, 2009 at 9:03 am
Heya, looks great!
Can you please post a link to Thrust? Google and RubyForge didn’t say much on it.
June 18th, 2009 at 9:08 am
Thrust is a C++ library that is used in conjunction with CUDA. You can find it at http://code.google.com/p/thrust/.
November 12th, 2009 at 8:27 am
so thrust is some kind of STL replacement that’s faster?
you could also take a gander at
http://github.com/rdp/crystalizer
though I doubt it’d be faster since it doesn’t know about types as well as the C examples do (“oh I’m sorting floats?”).
Also what about boost?
GL.
-r
November 12th, 2009 at 8:49 am
No, thrust is a STL like wrapper system for NVIDIA CUDA — so the code runs on your GPU.