As promised, in this post I'll talk about my implementation of the t-SNE dimensionality reduction algorithm in Python. Sorry in advance for the lack of syntax highlighting -- I haven't figured that out yet. I've made one version that explicitly calculates the gradient for each vector $y_i$ in the reduced dataset, and another that employs Theano's grad function. The Numpy version was a bit tricky to implement. The Theano version, for once, was actually easier than the Numpy version, because all you do is slam in T.grad. I'll start with the Numpy version and then move on to the Theano version. Before we jump into any code, let's state the gradient of the cost function:
$$
\frac{\partial Cost}{\partial y_{i}} = 4\sum_{j} (p_{ij} - q_{ij})(y_i - y_j)(1 + ||y_i - y_j||^2)^{-1}
$$
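Here's roughly what the explicit Numpy version looks like (a minimal sketch rather than exact code; `tsne_grad` is an illustrative name, and `P` is assumed to be the precomputed matrix of high-dimensional affinities $p_{ij}$):

```python
import numpy as np

def tsne_grad(P, Y):
    """Explicit gradient: dC/dy_i = 4 * sum_j (p_ij - q_ij)(y_i - y_j)(1 + ||y_i - y_j||^2)^-1."""
    n = Y.shape[0]
    # pairwise squared distances ||y_i - y_j||^2 in the embedding
    sum_Y = np.sum(np.square(Y), axis=1)
    num = 1.0 / (1.0 + sum_Y[:, None] + sum_Y[None, :] - 2.0 * np.dot(Y, Y.T))
    num[range(n), range(n)] = 0.0   # q_ii is defined as 0
    Q = num / np.sum(num)
    PQ = (P - Q) * num              # (p_ij - q_ij) * (1 + ||y_i - y_j||^2)^-1
    dY = np.zeros_like(Y)
    for i in range(n):
        # gradient for one embedded point y_i
        dY[i] = 4.0 * np.dot(PQ[i], Y[i] - Y)
    return dY
```

Now for the Theano version: you build the same cost symbolically and let T.grad do the calculus for you. A sketch along the same lines (the `eps` smoothing is my assumption to keep the log finite on the zeroed diagonals):

```python
import theano
import theano.tensor as T

P = T.dmatrix('P')   # high-dimensional affinities, assumed precomputed
Y = T.dmatrix('Y')   # low-dimensional embedding

# same Student-t kernel as above, written symbolically
sum_Y = T.sum(T.sqr(Y), axis=1)
num = 1.0 / (1.0 + sum_Y.dimshuffle(0, 'x') + sum_Y.dimshuffle('x', 0) - 2.0 * T.dot(Y, Y.T))
num = num * (1.0 - T.identity_like(num))   # zero the diagonal, since q_ii = 0
Q = num / T.sum(num)

# KL divergence cost; eps avoids log(0) on the zero entries
eps = 1e-12
cost = T.sum(P * T.log((P + eps) / (Q + eps)))

# the two magic lines: symbolic gradient, then compile
dY = T.grad(cost, Y)
tsne_grad_fn = theano.function([P, Y], dY)
```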
Whaaaaaaa??? Take 11 lines of code and turn it into 2? Theano, you've stolen my heart! The crazy part is that it even runs a little faster than the Numpy version. As always, there is some overhead (around 3 seconds on my machine) to compile the function that computes the gradient. When 'training' the t-SNE algorithm, however, the Theano version is about 1.5x faster, so you quickly make that time back.
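To give a feel for usage, here's a bare-bones descent loop on toy affinities (plain gradient descent only; a real run would add momentum, early exaggeration, and a perplexity-calibrated `P`, none of which are shown here):

```python
import numpy as np

rng = np.random.RandomState(0)
n = 100

# toy symmetric affinities with zero diagonal -- a stand-in for the real P
P_toy = rng.rand(n, n)
P_toy = (P_toy + P_toy.T) / 2.0
P_toy[range(n), range(n)] = 0.0
P_toy /= P_toy.sum()

Y_emb = rng.randn(n, 2) * 1e-4   # small random initial embedding
for it in range(1000):
    # swap in tsne_grad_fn(P_toy, Y_emb) for the Theano version
    Y_emb -= 100.0 * tsne_grad(P_toy, Y_emb)
```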