Sep 18 2009

Monte-Carlo Integration using Thrust

Bored. Wrote a numerical integrator using Thrust for fun.

Solves, numerically, the integral from 0 to 1 for (x^2)*cos(x).

x*x*cos(x)

Analytically, we can solve and get: 2*cos(1) – sin(1) = 0.239133627

So, the code is as follows:

#include <vector_functions.h>
 
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
 
#include <thrust/generate.h>
#include <thrust/transform.h>
#include <thrust/count.h>
 
#include <cstdlib>
#include <math.h>
 
float2 bottomLeft;
float2 topRight;
 
__device__
float f(float x) {
	return (x*x*__cosf(x));
}
 
struct above_below_function {
	__device__ 
	float operator()(float2 p) const {
		return (p.y < f(p.x)) ? 1 : 0;
	}
};
 
float2 make_random_float2(){
	return make_float2( (rand() / (float)RAND_MAX) * (float)(topRight.x - bottomLeft.x) + (float)bottomLeft.x,
						(rand() / (float)RAND_MAX) * (float)(topRight.y - bottomLeft.y) + (float)bottomLeft.y);
}
 
int main() {
	srand(time(0));
 
	bottomLeft.x = 0;
	bottomLeft.y = 0;
 
	topRight.x = 1;
	topRight.y = 1;
 
	int numPoints = 10000000;
 
	// find the area of our bounding square
	float area = (topRight.x - bottomLeft.x) * (topRight.y - bottomLeft.y);
 
	// generate our points
	thrust::host_vector<float2> h_locations(numPoints);		
	thrust::generate(h_locations.begin(), h_locations.end(), make_random_float2);
 
	// create two device vectors -- one for our locations, and another to create a count
	thrust::device_vector<float2> d_locations = h_locations;
	thrust::device_vector<unsigned int> d_boolean(numPoints);
 
	// for each element, if the y is below f(x), then put a 1 in the array.  otherwise, put a 0
	thrust::transform(d_locations.begin(), d_locations.end(), d_boolean.begin(), above_below_function());
 
	int total = thrust::count(d_boolean.begin(), d_boolean.end(), 1);
 
	std::cout << "Area under curve: " << area * (total / (float)numPoints) << std::endl;
 
	return 0;
}

The solutions? Here are a few sample runs…
Area under curve: 0.239177
Area under curve: 0.238916
Area under curve: 0.239181
Area under curve: 0.239129

Pretty accurate, and pretty damn fast too.

  • Share/Bookmark

Jun 15 2009

Ruby and CUDA and Thrust, Oh My!

Thought I would have a bit of fun to see if I could get CUDA and Thrust working with Ruby. Since no CUDA bindings exist, and I didn’t want to give up the power of Thrust, I decided to emply RubyInline to create wrappers. Got it a nice little demo working in under 30 minutes.

vectors.cu

#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
 
#include <iostream>
 
#include "vector_test.h"
 
extern "C" int vector_test() {
   thrust::host_vector H(4);
   H[0] = 14;
   H[1] = 20;
   H[2] = 38;
   H[3] = 46;
 
   std::cout << "H has size "<< H.size() << std::endl;
 
   for(int i = 0; i < H.size(); i++)
      std::cout << "H[" << i << "] = " << H[i] << std::endl;
 
   H.resize(2);
 
   std::cout << "H now has size " << H.size() << std::endl;
 
   thrust::device_vector D = H;
 
   D[0] = 99;
   D[1] = 88;
 
   for(int i = 0; i < D.size(); i++)
      std::cout << "D[" << i << "] = " << D[i] << std::endl;
 
   return 0;
}

vector_test.h

extern "C" int vector_test();

Compile a shared library: nvcc –shared -o libvectors.so vectors.cu -I/usr/local/cuda/include -L/usr/local/cuda/lib -lcudart

cuda_inline.rb

require 'rubygems'
require 'inline'
 
class MyWrapper
   inline(:C) do |builder|
      builder.include '"vector_test.h"'
      builder.add_compile_flags '-x c++', '-lstdc++', '-L/usr/local/cuda/lib', '-L.', '-lvectors', '-I.'
      builder.c '
         void wrapper() {
            vector_test();
         }'
   end
end
t = MyWrapper.new()
t.wrapper

Note here that I have to specify for RubyInline to look in the current directory for headers and libraries!

So, as long as libvectors.so and vector_test.h is in the directory you are running cuda_inline.rb from, everything is gravy! Maybe not the most efficient tool-chain, but it works!

$ ruby cuda_inline.rb
ld warning: in ./libvectors.so, file is not of required architecture
H has size 4
H[0] = 14
H[1] = 20
H[2] = 38
H[3] = 46
H now has size 2
D[0] = 99
D[1] = 88

  • Share/Bookmark

Jun 15 2009

Thrust

Here is a cool tool I found while browsing GPGPU. I finally got a computer that had a video card that could be used with CUDA, so I have been dying to try it out. Thrust makes this transition look painless…

Thrust is a CUDA library of parallel algorithms with an interface resembling the C++ Standard Template Library (STL). Thrust provides a flexible high-level interface for GPU programming that greatly enhances developer productivity. Develop high-performance applications rapidly with Thrust!

Check it out here.

  • Share/Bookmark