Wednesday, 29 April 2009

A neural network on the GPU

In my spare time I'm trying to put a neural
network classifier onto the GPU with the help
of GPULib. It's a lot of work because the training
algorithm, scaled conjugate gradient, is fairly
complicated. I'm converting the original CPU version
that I wrote for my book piece by piece, so as
not to get completely frustrated.

Here is a chunk of code (now working) that
propagates an input vector through the network
to give an output signal that is supposed to reflect
its class membership. (The network is programmed as
an IDL object class, hence all the SELF-references.)

Pro GPUFFNCG::forwardPass
   ;view of N starting after its first np elements, as np x LL
   gpuView,*self.N_gpu,self.np,self.LL*self.np,Nv_gpu
   gpuReform,Nv_gpu,self.np,self.LL
   ;logistic output of hidden layer
   expnt_gpu = gpuMatrix_Multiply(*self.Gs_gpu,$
                                  *self.Wh_gpu,/btranspose)
   ;tmp = 1 + exp(-expnt), then Nv = 1/tmp
   tmp_gpu = gpuExp(1.0,-1.0,expnt_gpu,0.0,1.0)
   gpuPutArr,fltarr(self.np,self.LL)+1.0,onesL_gpu
   gpuDiv,onesL_gpu,tmp_gpu,Nv_gpu
   ;softmax network output
   Io_gpu = gpuMatrix_Multiply(*self.N_gpu,$
                               *self.Wo_gpu,/btranspose)
   ;subtract the maximum over the classes before exponentiating
   maxIo_gpu = gpuMax(Io_gpu,dimension=2)
   for k=0,self.KK-1 do begin
      gpuView,Io_gpu,k*self.np,self.np,Iov_gpu
      gpuSub,Iov_gpu,maxIo_gpu,Iov_gpu
   endfor
   A_gpu = gpuExp(Io_gpu)
   ;normalize so that the class outputs sum to one
   sum_gpu = gpuTotal(A_gpu,2)
   for k=0,self.KK-1 do begin
      gpuView,*self.M_gpu,k*self.np,self.np,Mv_gpu
      gpuView,A_gpu,k*self.np,self.np,Av_gpu
      gpuDiv,Av_gpu,sum_gpu,Mv_gpu
   endfor
   ;cleanup
   gpuFree, [tmp_gpu,expnt_gpu,onesL_gpu,Io_gpu,$
             maxIo_gpu,A_gpu,sum_gpu]
End
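
For comparison, the same computation in plain IDL on the
CPU might look roughly like the sketch below. It is not
the code from the book. The function name ffn_forward_cpu
is made up, and the array shapes are assumptions read off
the GPU version: Gs is [np, NN+1] with ones in Gs[*,0],
Wh is [LL, NN+1] and Wo is [KK, LL+1], with np, LL and KK
all greater than one.

```idl
; CPU reference for the forward pass. A sketch only: the name
; and the array shapes are assumptions, not the original code.
function ffn_forward_cpu, Gs, Wh, Wo
  compile_opt idl2
  np = (size(Gs, /dimensions))[0]   ; number of input vectors
  KK = (size(Wo, /dimensions))[0]   ; number of classes
  ; logistic output of the hidden layer, dimensions [np, LL]
  H = 1.0/(1.0 + exp(-(Gs # transpose(Wh))))
  ; put the bias ones in front, giving dimensions [np, LL+1]
  N = [[fltarr(np) + 1.0], [H]]
  ; softmax network output, dimensions [np, KK]
  Io = N # transpose(Wo)
  ; subtract the per-example maximum so that exp() cannot overflow
  Io = Io - rebin(max(Io, dimension=2), np, KK)
  A = exp(Io)
  ; divide by the sum over classes for each example
  return, A/rebin(total(A, 2), np, KK)
end
```

A quick sanity check is to call it with small random
weights and confirm that total(M, 2) is close to one for
every input vector, where M is the returned array.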

I try to make as much use of GPUVIEW as possible
in order to avoid shifting arrays around, and of
course I try to avoid transfers back to the CPU.

The hardest part is calculating the matrix of second
derivatives of the cost function with respect to the
synaptic weights (the Hessian). This is where I hope
for a big speedup using GPULib. Still working on that.
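
As an aside, conjugate gradient methods typically need the
Hessian only through its product with a search direction,
and that product can be approximated from two gradient
evaluations. The plain-IDL sketch below shows the idea on
a toy quadratic cost. It is not the method used in the
network code: toy_gradient and hessian_times_vector are
made-up names, and the step size eps is an arbitrary choice.

```idl
; Gradient of the toy cost E(w) = 0.5*(w[0]^2 + 3*w[1]^2) + w[0]*w[1].
; Hypothetical stand-in for the gradient of a real cost function.
function toy_gradient, w
  compile_opt idl2
  return, [w[0] + w[1], w[0] + 3.0*w[1]]
end

; Finite-difference approximation of the Hessian times d:
;   H d  ~  (g(w + eps*d) - g(w)) / eps
; grad_name is the name of a function returning the gradient at w.
function hessian_times_vector, grad_name, w, d, eps=eps
  compile_opt idl2
  if n_elements(eps) eq 0 then eps = 1e-3   ; assumed default step
  g0 = call_function(grad_name, w)
  g1 = call_function(grad_name, w + eps*d)
  return, (g1 - g0)/eps
end
```

For this toy cost the Hessian is the constant matrix with
rows [1, 1] and [1, 3], so
hessian_times_vector('toy_gradient', [0.5, -0.2], [1.0, 0.0])
should come out close to [1, 1]. The price is one extra
gradient evaluation per product instead of forming the
full matrix.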

More to come, I hope.
