tutorial from Nvidia GTC

October 16, 2009

My talk is attached as a PDF. Overall the conference felt like a rock ’n’ roll show; it was my first non-academic meeting of this nature. Thoroughly enjoyable, although it’s a little difficult to tell what people are doing since it’s mostly proprietary work. The announcement of Fermi was certainly exciting: ~8x the double-precision performance, ECC, and many more cores/memory…

Talk outline:

  • Company Background
  • CUDA accelerates Geophysics:
    • Data Processing w/ Linear Algebra
    • Sparse Matrices
    • Seismic Imaging: Kirchhoff Time Migration
      • Frequency Filtering
      • Traveltime/Weight Calculations
      • GPU Migration
  • Summary

nvidiaSummit0909


CUDA + OCTAVE

September 30, 2009

I’ve been experimenting with CUDA and OCTAVE; there is at least one company who have produced GPU-enabled MEX functions. The big difficulty is of course that there is no support for single-precision floats within OCTAVE (afaik), and similarly with Matlab. However, if one can leave the data on the device and work with it there for some time, then only two explicit conversions between float <-> double are needed. Or you could sacrifice some performance in CUDA in return for using doubles. At any rate here’s an example Makefile, happy experimenting. For this example I gutted the matrixMul example from the CUDA SDK; the wrapper *oct source code (*cc, really C++ with Octave extensions) contains an extern "C" section which references the CUDA kernel (*cu). Don’t forget to indent the instructions under ‘all’ with a single tab for make.


#! /usr/bin/env make
#make file for octfile/cuda
#Mac OSX 10.5.8 intel core 2 duo
#cuda include/lib
CUDA_INC_PATH=/usr/local/cuda/include
CUDA_LIB_PATH=/usr/local/cuda/lib
#octfile compiler
CC=mkoctfile
#basic flags
CFLAGS= -I$(CUDA_INC_PATH)
LDFLAGS= -L$(CUDA_LIB_PATH) -lcudart -lcuda

all:
	$(CC) $(CFLAGS) -c cudaMatrixMul.cc -o cudaMatrixMul.o
	nvcc -c matMul_kernel.cu -o matMul_kernel.o -Xcompiler -Wall
	$(CC) $(LDFLAGS) cudaMatrixMul.o matMul_kernel.o -o cudaMatrixMul.oct

#remove intermediate object files
clean:
	rm -f cudaMatrixMul.o matMul_kernel.o
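
To get a feel for what the float <-> double round trip mentioned above costs in accuracy, here is a minimal sketch in Python (it is not part of the oct-file). `to_single` and `max_roundtrip_error` are hypothetical helper names; the rounding itself relies only on the standard `struct` module's IEEE single-precision format.

```python
import struct


def to_single(x):
    """Round a Python float (a double) to single precision and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


def max_roundtrip_error(values):
    """Largest relative error from the double -> float -> double trip."""
    return max(abs(to_single(v) - v) / abs(v) for v in values if v != 0.0)


if __name__ == "__main__":
    print(to_single(0.5) == 0.5)          # exactly representable in single
    print(to_single(0.1) == 0.1)          # not exactly representable
    print(max_roundtrip_error([0.1, 1.0 / 3.0, 123.456]))
```

For values in the normal single-precision range the relative error stays below 2^-24 (about 6e-8), which is the price of each conversion on the way to and from the device.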


MCMC paper finished

September 30, 2009

Abstract:

Many nuclei probed by NMR are relatively insensitive to detection, requiring methods such as the Carr-Purcell Meiboom-Gill (CPMG) pulse sequence. Experiments which follow this general approach are composed of pulse trains, giving rise to characteristic spikelet patterns in the frequency domain. In the presence of multiple underlying chemical sites, each spikelet intensity is a sum of some unknown proportion of contributions from each site. This work outlines a modeling approach based around Markov Chain Monte Carlo (MCMC), which negates the need for intensive simulations using density matrix formalism. In support of this technique, a spikelet pattern is produced using the density matrix formalism for an ensemble of spin 1/2 nuclei, and the underlying chemical shifts and intensities reproduced using the method outlined. Finally, MCMC is used to model the CPMG spectrum of a (3,3,3-trifluoropropyl)dimethylchlorosilane (TFS) treated aluminosilicate, providing evidence in support of a particular model of silanol group surface attachment to the bulk.


first IPA

August 10, 2009

I finally got it together and brewed recently, the results were superb.

I used a fairly traditional IPA recipe, leaving most of the cascade malt intact, i.e., very little steeping was actually done. I also used about 2 pounds of black raspberries, for an overall result that has strong citrus notes and a very sweet and pleasant aftertaste. Next: extra stout in time for fall.
first_ipa


CUDA crash course

August 10, 2009

I’m really excited to be going to the Nvidia Research Summit in my new capacity as Senior Physicist for Stone Ridge Technology. Nvidia provide a remarkable product, with exceptional service and support for the scientist, made all the more possible by the majority of the money coming from gaming. I’d encourage anyone with a view to doing affordable HPC to start at the CUDA Zone, pick up a card from your favorite electronics store (or apply for one via the academic program), and start programming. Below is a short course for a rapid introduction, compiled from notes I made while at PSU; more to come…
cuda crash course


Bachelor City

August 6, 2009

A picture of me, an XE Ford Falcon, and a dishwasher circa 1996, where I lived with four other friends in Brisbane. Incidentally, the tin foil over my housemate’s window was placed there in order to allow him to sleep in. Till noon.
Bill crushing washing machine


Split-Step Algorithm

June 30, 2009

This is a fast method for the solution of parabolic PDEs, relying on the FFT to speed up the Fourier transforms. Here it’s applied to an acoustic/seismic problem, following the development of Kuperman/Jackson in “Imaging of Complex Media with Acoustic and Seismic Waves”. Consider first the Helmholtz equation for a point source at r’,z’:
eq1_ssa
where G is the Green’s function and K the wavenumber (a function of the frequency omega and sound speed c). Assuming azimuthal symmetry, G may be expressed as a product of two functions:
eq2_ssa
and similarly K, now a product of (constant) K0 and index of refraction n. Substituting into the Helmholtz equation gives two PDEs:
eq3_ssa
The first PDE has Bessel functions as solutions; taking the asymptotic outgoing Hankel function solution and substituting it into the second, with the narrow angle approximation (second derivative of psi with respect to r much smaller than the first derivative wrt r), one finds for the second (parabolic) PDE:
eq4_ssa
where chi is the Fourier transform of psi with respect to depth z. Assuming the variation of n is insignificant, in the wave-space domain the PDE and solution are:
eq5_ssa
Finally, the inverse FT gives the field as a function of depth (Delta r = r-r0, r0 the boundary value):
eq6_ssa
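
For reference, the chain of equations described above can be sketched in one standard form that is consistent with the Octave code in this post. The source-term normalization, the symbol S(r) for the range-only factor, and the sign convention of the Fourier transform are assumptions here, not necessarily those of the book.

```latex
% Helmholtz equation, point source at (r', z'), K = \omega / c  (normalization assumed)
\nabla^2 G + K^2 G = -\delta(\mathbf{x} - \mathbf{x}')

% Product form with azimuthal symmetry, and K = K_0 n
G(r,z) = \psi(r,z)\, S(r)

% The two PDEs obtained away from the source
\frac{d^2 S}{dr^2} + \frac{1}{r}\frac{dS}{dr} + K_0^2 S = 0
\frac{\partial^2 \psi}{\partial r^2}
  + \left(\frac{1}{r} + \frac{2}{S}\frac{dS}{dr}\right)\frac{\partial \psi}{\partial r}
  + \frac{\partial^2 \psi}{\partial z^2} + K_0^2 (n^2 - 1)\,\psi = 0

% Asymptotic outgoing Hankel solution, so that 1/r + (2/S) dS/dr = 2 i K_0
S(r) = H_0^{(1)}(K_0 r) \approx \sqrt{\frac{2}{\pi K_0 r}}\; e^{i (K_0 r - \pi/4)}

% Narrow-angle (parabolic) equation
2 i K_0 \frac{\partial \psi}{\partial r} + \frac{\partial^2 \psi}{\partial z^2}
  + K_0^2 (n^2 - 1)\,\psi = 0

% Wave-space form, \chi(r,s) = \int \psi(r,z)\, e^{-i s z}\, dz, with n constant
2 i K_0 \frac{\partial \chi}{\partial r} + \left[K_0^2 (n^2 - 1) - s^2\right]\chi = 0
\chi(r,s) = \chi(r_0,s)\,
  \exp\!\left\{\frac{i\,\Delta r}{2 K_0}\left[K_0^2 (n^2 - 1) - s^2\right]\right\}

% Inverse transform: one marching step of size \Delta r = r - r_0
\psi(r,z) = e^{\,i K_0 (n^2 - 1)\,\Delta r / 2}\;
  \mathcal{F}^{-1}\!\left\{ e^{-i s^2 \Delta r / (2 K_0)}\,
  \mathcal{F}\{\psi(r_0,z)\} \right\}
```

The last line is exactly the update applied once per range step in the loop of the Octave function.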

The following is a little hack in octave, to demonstrate the solution method:

function [Psi,s]=test_ssa(N,Dr,P0,n,K0)

% to test split-step algo on the parabolic PDE:
%
% \frac{\partial^2 \psi}{\partial z^2}
% +2iK0\frac{\partial \psi}{\partial r}
% +K0^2(n^2-1)\psi = 0
%
% K0=(const.) reference wavenumber, n=index of refraction (K=K0*n)
% N=marching steps in field
% P0=boundary field data
% Dr=field step size

% wjb 0609

%initialize
m=length(P0); Psi=zeros(N,m); Psi(1,:)=P0;

%Nyquist thm, s (wave space) range

s=fftshift(-m/2:m/2-1)./Dr;

for k=2:N

Psi(k,:)=exp((i*K0/2)*(n^2-1)*Dr).*ifft(exp(-(i*Dr*s.^2)./(2*K0)).*fft(Psi(k-1,:)));

end

ssa_test
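
As a cross-check of the marching scheme, here is a minimal standard-library Python sketch of the same update. Several things in it are assumptions rather than part of the Octave hack: the naive `dft` helper stands in for fft/ifft, the depth spacing `dz` is an extra parameter, the wavenumber grid uses the conventional 2*pi*j/(m*dz) scaling (which differs from the scaling of `s` above), and only the final field is returned.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(m^2) discrete Fourier transform (stand-in for fft/ifft)."""
    m = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(m):
        acc = sum(v * cmath.exp(sign * 2j * math.pi * j * k / m)
                  for j, v in enumerate(values))
        out.append(acc / m if inverse else acc)
    return out


def split_step(psi0, n_steps, dr, dz, n, k0):
    """March psi0 (sampled every dz in depth) n_steps of size dr in range."""
    m = len(psi0)
    # Angular wavenumbers in FFT order (assumed grid: 2*pi*j/(m*dz)).
    s = [2.0 * math.pi * (j if j < (m + 1) // 2 else j - m) / (m * dz)
         for j in range(m)]
    # Refraction phase (applied in depth space) and diffraction phase
    # (applied in wave space); both are fixed when n is constant.
    refraction = cmath.exp(0.5j * k0 * (n * n - 1.0) * dr)
    diffraction = [cmath.exp(-0.5j * dr * sj * sj / k0) for sj in s]
    psi = list(psi0)
    for _ in range(n_steps):
        chi = [d * c for d, c in zip(diffraction, dft(psi))]
        psi = [refraction * p for p in dft(chi, inverse=True)]
    return psi
```

A single Fourier mode exp(i*s1*z) makes a convenient test case: each step should only multiply it by the phase exp(i*[K0*(n^2-1)/2 - s1^2/(2*K0)]*dr), and since both factors are pure phases the total energy sum(|psi|^2) is unchanged for any input.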


Depeche Mode

June 25, 2009

I may just be hanging on to the last tattered threads of my Gen X youth, but the new album is sublime, particularly tracks 1, 4, 8 and 10.


Custom Electric Potential

June 25, 2009

I would like an electric potential with specific qualities, namely, quadratic in two dimensions. On the one hand, I could try and do a little thought experiment, as to which boundary gives the desired properties. Realistically though on the length scale I need it, the odds of producing the boundary are zilch. Much easier in terms of the multipole expansion. With this in mind, and knowing that copper wire along a cylinder makes for an easy setup, and that at least four terms are required for a quadratic potential, the following should work: V(x=0,z=a)=k, V(x=0,z=-a)=k, V(x=-a,z=0)=-k, V(x=a,z=0)=-k, with each wire along y. Assuming long wires, the potential for all four in the x/z plane is:

eq1_efg

To confirm this has the desired behavior near the origin (and hopefully over some reasonable distance), I turn to the multivariable Taylor expansion. Example partial derivatives:

eq2_efg

Substituting into the expansion:
eq3_efg

does indeed reveal that V ~ (2/a^2)(z^2 - x^2), as desired:

lef
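
The quadratic behavior near the origin can also be checked numerically. The sketch below assumes each long wire contributes a logarithmic potential -q*ln(distance) in units where the prefactor is 1, with q=+1 for the wires at z=±a and q=-1 for those at x=±a; the function names are hypothetical and the normalization is an assumption chosen to match the 2/a^2 coefficient above.

```python
import math


def wire_potential(x, z, a=1.0):
    """Potential in the x/z plane of four long wires running along y."""
    # (x position, z position, sign of the wire); unit normalization assumed.
    wires = [(0.0, a, 1.0), (0.0, -a, 1.0), (a, 0.0, -1.0), (-a, 0.0, -1.0)]
    return -sum(q * math.log(math.hypot(x - xw, z - zw))
                for xw, zw, q in wires)


def quadratic_approx(x, z, a=1.0):
    """Leading term of the Taylor expansion about the origin."""
    return 2.0 * (z * z - x * x) / (a * a)


if __name__ == "__main__":
    for x, z in [(0.1, 0.05), (-0.05, 0.1), (0.2, 0.2)]:
        print(x, z, wire_potential(x, z), quadratic_approx(x, z))
```

With this arrangement the next non-vanishing term in the expansion is sixth order in distance/a, so the quadratic form should hold well over a fair fraction of the wire spacing.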


CUDA/GPU install

April 15, 2009

Assuming you’re installing GPU + CUDA from scratch, here are the steps for Linux i686:

1. Find and download the appropriate driver from Nvidia. With the old GPU card still in place, switch to a terminal (‘ctrl-alt-F1’) and as root stop the X-server by issuing ‘init 3’. Then install via ‘sh NV*run’, answering yes to all the prompts. Issue ‘init 5’ and shut down… when done, replace the video card.

2. Reboot and install the CUDA toolkit; don’t forget to ‘chmod +x cud*run’ before issuing ‘sh cud*run’. After answering yes to all the prompts & installing, add to .bash_profile:

PATH=$PATH:/usr/local/cuda/bin
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib

export PATH
export LD_LIBRARY_PATH

3. Install the CUDA SDK; you should be good to go, but may also need to disable SELinux (e.g., temporarily: ‘echo 0 > /selinux/enforce’).
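
After editing .bash_profile in step 2, it can be handy to confirm that the two variables really contain the CUDA directories. A minimal sketch, assuming the default /usr/local/cuda install root used above; `missing_cuda_paths` is a hypothetical helper, not an Nvidia tool.

```python
import os


def missing_cuda_paths(env, cuda_root="/usr/local/cuda"):
    """Return the variables in env that lack the expected CUDA entry."""
    wanted = {
        "PATH": cuda_root + "/bin",
        "LD_LIBRARY_PATH": cuda_root + "/lib",
    }
    missing = []
    for var, path in wanted.items():
        # Each variable is a list of directories joined by os.pathsep.
        entries = env.get(var, "").split(os.pathsep)
        if path not in entries:
            missing.append(var)
    return missing


if __name__ == "__main__":
    # An empty list means both variables are set as in step 2.
    print(missing_cuda_paths(os.environ))
```

Run it from a fresh login shell so that .bash_profile has actually been sourced.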