Dean Shaff Personal Blog: Solving differential equations in Theano

I've spent some time messing around with Theano, building and training neural nets. For my senior thesis, I used my own implementation of a Theano neural net to classify waveforms in the XENON100 dark matter detector. I've even spent some time working on LSTMs for text generation (The latter project is essentially my attempt to implement Karpathy's char-rnn in Theano). My understanding is that Theano was built with training neural networks in mind, hence all the emphasis on automatic differentiation and GPU support. In reality, Theano is a general purpose symbolic math library with lots of convenient neural net functionality, like built in squashing functions, downsampling functions, and even 2D convolution functions. Theano is fast because it compiles code into C code on the fly. We make Theano functions by creating computational graphs using symbolic variables. This means that there is some overhead when running Theano code -- my LSTM might take as much as a minute to compile before it starts actually training.

As Theano promises to be so fast, I've been wondering for a while if I could use it to solve differential equations. Moreover, I've been interested to know if it would be faster than equivalent Numpy code. For the purpose of this post, I chose to solve the Lorentz attractor equations using the 4th order Runge-Kutta method. These equations look as follows (from Wikipedia):

$\frac{dx}{dt} = \sigma(y - x) \\ \frac{dy}{dt} = x(\rho - z) - y \\ \frac{dz}{dt} = xy - \beta z \\$
Here

$\sigma$ ,

$\beta$ and

$\rho$ are system parameters. This is a pretty cool looking set of equations when you visualize each of the variables

$x$ ,

$y$ , and

$z$ in three dimensions. Check it out here. I'm not interested in visualizing the results this time around. I simply want to do a speed comparison between two different implementations of the solution to this problem, while ensuring that I get the same result for both implementations.

Note that I didn't run these in separate scripts, so we're running the Numpy and Theano implementations in the same program. Switching the order of which one went first didn't change the results. I made a github gist with all the code, and you can find it here (or just check it out below). Theano scan is a confusing beast that still doesn't make a whole hell of a lot of sense to me. Another thing to note is that the program spends the vast majority of its time compiling the Theano function that does the calculation. Unless you're doing something that is incredibly computationally expensive, it seems like it would be smart to just stick with good ole' Numpy.

	import theano
	import theano.tensor as T
	import numpy as np
	import h5py
	import os
	import time


	def rungekuttastep(h,y,fprime,*args):
	k1 = hfprime(y,args)
	k2 = hfprime(y + 0.5k1,*args)
	k3 = hfprime(y + 0.5k2,*args)
	k4 = hfprime(y + k3,args)
	y_np1 = y + (1./6.)k1 + (1./3.)k2 + (1./3.)k3 + (1./6.)k4
	return y_np1

	def fprime_lorenz(y,*args):
	sigma, rho, beta = args
	yprime = T.zeros_like(y)
	# yprime = theano.shared(np.zeros(10))
	yprime = T.set_subtensor(yprime[0],sigma*(y[1] - y[0]) )
	yprime = T.set_subtensor(yprime[1], y[0]*(rho - y[2]) - y[1])
	yprime = T.set_subtensor(yprime[2],y[0]y[1] - betay[2] )
	return yprime

	def fprime_lorenz_numpy(y,*args):

	sigma, rho, beta = args
	yprime = np.zeros(y.shape[0])
	yprime[0] = sigma*(y[1] - y[0])
	yprime[1] = y[0]*(rho - y[2]) - y[1]
	yprime[2] = y[0]y[1] - betay[2]
	return yprime

	y = T.vector('y')
	h = T.scalar('h')
	i = T.iscalar('i')
	out = rungekuttastep(h,y,fprime_lorenz,10.,28.,8./3.)

	result, updates = theano.scan(fn=lambda y, h: rungekuttastep(h,y,fprime_lorenz,10.,28.,8./3.) ,
	outputs_info =[{'initial':y}],
	non_sequences=h,
	n_steps=i)
	t0 = time.time()
	f = theano.function([h,y,i],result)
	print("Compilation time: {:.5f} seconds".format(time.time() - t0))

	n_iter = 1000

	t0 = time.time()
	y = np.arange(3)
	for i in np.arange(n_iter):
	y = rungekuttastep(0.001,y,fprime_lorenz_numpy,10.,28.,8./3.)
	print("Time in calculation (numpy): {:.5f}".format(time.time()-t0))

	t0 = time.time()
	result = f(0.001,np.arange(3),n_iter)
	print("Time in calculation (theano): {:.5f}".format(time.time()-t0))

	print("Same array? {}".format(np.allclose(y, result[-1])))

view raw theano_numpy_diffeq.py hosted with ❤ by GitHub

Dean Shaff Personal Blog

Saturday, September 3, 2016

Solving differential equations in Theano

1 comment: