A nice explanation of LSTMs.

Computing the softmax layer over the whole vocabulary to get p(word|context) is expensive. Three solutions that have been shown to work are hierarchical softmax (Goodman 2001), noise contrastive estimation (Gutmann and Hyvärinen 2010), and self-normalizing neural networks (Devlin et al. 2014).
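As a rough illustration of the first idea, here is a minimal two-level (class-based) factorization, p(word|context) = p(class|context) * p(word|class, context). The word classes and the scores are made up for the example; in a real model the scores would come from the network given the context, and this sketch is not taken from any of the cited papers.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def two_level_prob(word, class_of, members, class_scores, word_scores):
    """p(word|context) = p(class|context) * p(word|class, context).

    Only the class scores and the scores of words inside one class are
    normalized, instead of the scores of the entire vocabulary.
    """
    c = class_of[word]
    p_class = softmax(class_scores)[c]
    in_class = members[c]
    p_in_class = softmax([word_scores[w] for w in in_class])[in_class.index(word)]
    return p_class * p_in_class

# Hypothetical toy vocabulary split into two classes.
members = [["cat", "dog"], ["run", "eat", "sleep"]]
class_of = {w: c for c, ws in enumerate(members) for w in ws}
class_scores = [0.5, -0.2]  # assumed network outputs for this context
word_scores = {"cat": 1.0, "dog": 0.3, "run": -0.5, "eat": 0.8, "sleep": 0.1}

# The factorized probabilities still sum to one over the vocabulary.
total = sum(
    two_level_prob(w, class_of, members, class_scores, word_scores)
    for w in class_of
)
```

With roughly sqrt(V) classes of roughly sqrt(V) words each, one probability costs on the order of 2*sqrt(V) exponentials instead of V.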

Chris Dyer's blog.

a paper that compares "off the shelf" dependency parsers.

a tutorial and a blog post on spectral clustering.

Andrew Jones' brief explanation for Xavier Glorot's initialization of neural network parameters.
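For quick reference, a minimal sketch of the uniform variant of this initialization: weights are drawn from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)), which gives each weight a variance of 2 / (fan_in + fan_out). The function name and the list-of-lists layout are just choices made for this example.

```python
import math
import random

def glorot_uniform(fan_in, fan_out, rng=random):
    """Return a fan_in x fan_out weight matrix drawn from U(-limit, limit)."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [
        [rng.uniform(-limit, limit) for _ in range(fan_out)]
        for _ in range(fan_in)
    ]

# Example: a 300 -> 100 layer, seeded for reproducibility.
W = glorot_uniform(300, 100, random.Random(0))
```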

Sergey Sundukovskiy's slides on prototypes, minimum viable products (MVPs), etc.

Brendan's tool for visualizing syntactic trees (parseviz).

A bunch of monolingual corpora.

Compress: tar -cvzf compressed-output.tar.gz uncompressed-input-files.* OR tar -cvf mystuff.tar foo.tex fig1.eps fig2.eps && gzip mystuff.tar

Decompress: tar -xvzf compressed-input.tar.gz [-C uncompressed-output-dir]
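If you would rather do the same from a script, Python's standard tarfile module covers both directions. This is a minimal sketch; the function names are made up here, and unlike the tar command it stores each file under its base name only.

```python
import os
import tarfile
import tempfile

def compress(archive_path, input_paths):
    """Write the given files into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in input_paths:
            # Store only the file name, not the full path (a choice made here).
            tar.add(path, arcname=os.path.basename(path))

def decompress(archive_path, output_dir):
    """Extract a gzip-compressed tar archive into output_dir."""
    os.makedirs(output_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(output_dir)
```

Only extract archives you trust: like tar itself, extractall writes whatever paths the archive contains.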

Boyd and Vandenberghe's book "Convex Optimization"

to initialize submodules after a `git clone', execute `git submodule update --init' at the root directory (reference).

Groups, rings, fields, and vector spaces

Unicode points for Math symbols, Greek letters, Math operators (handy for preparing slides).

Tips and tricks in stochastic gradient descent land.

How to use stochastic gradient descent with L1 regularization? Proximal gradient (prox-grad), dual averaging, FTRL
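The proximal-gradient option is the easiest to show in a few lines: take an ordinary gradient step on the loss, then apply soft-thresholding, which is the proximal operator of the L1 penalty. A minimal sketch with plain lists; the function names and the numbers in the example are made up.

```python
def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def prox_sgd_step(w, grad, lr, l1):
    """One proximal SGD step for loss(w) + l1 * ||w||_1.

    grad is the stochastic gradient of the loss only (no L1 term).
    """
    return [soft_threshold(wi - lr * gi, lr * l1) for wi, gi in zip(w, grad)]

# With a zero gradient, large weights shrink by lr * l1 and small ones
# are set exactly to zero, which is where the sparsity comes from.
w_new = prox_sgd_step([1.0, 0.05, -1.0], [0.0, 0.0, 0.0], lr=0.1, l1=1.0)
```

A plain subgradient step would rarely land exactly on zero; the thresholding is what produces exact zeros.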

installing standard R packages, custom packages in R, and what to do when cpp compilation fails while installing custom R packages

locality sensitive hashing (LSH)
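A minimal sketch of one common LSH family, random hyperplanes for cosine similarity: each bit of the signature records which side of a random hyperplane the vector falls on, so vectors separated by a small angle tend to agree on most bits. The names, vectors, and parameters below are made up for the example.

```python
import random

def random_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian normal vectors of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def signature(vec, planes):
    """One bit per hyperplane: 1 if vec is on its non-negative side."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )

def hamming(sig_a, sig_b):
    """Number of positions where two signatures differ."""
    return sum(a != b for a, b in zip(sig_a, sig_b))

planes = random_hyperplanes(num_bits=16, dim=3)
b = [1.0, 1.0, 0.0]
sig_b = signature(b, planes)
sig_scaled = signature([2.0, 2.0, 0.0], planes)    # same direction as b
sig_opposite = signature([-1.0, -1.0, 0.0], planes)  # opposite direction
```

Signatures depend only on direction, so the scaled vector hashes identically and the opposite vector disagrees on every bit.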

history of deep learning

count-min sketches (a cool data structure that approximates counts of elements in a stream)
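A minimal sketch of the idea: keep a small table of counters with one hash function per row, increment one counter per row on every update, and answer queries with the minimum over rows. Collisions can only inflate a counter, so the estimate never falls below the true count. The class name, the default sizes, and the use of SHA-256 as the hash are choices made for this example.

```python
import hashlib

class CountMinSketch:
    def __init__(self, width=1000, depth=5):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Derive a different hash per row by prefixing the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Every row overestimates (or is exact), so take the smallest.
        return min(
            self.table[row][self._index(item, row)]
            for row in range(self.depth)
        )

cms = CountMinSketch()
for token in ["the", "cat", "the", "dog", "the"]:
    cms.add(token)
```

Memory stays at width * depth counters no matter how many distinct items go through the sketch.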

style guidelines for Python

an introduction to GCC

NLP conferences

simulations of beta (and other) distribution density

evaluating clusterings (a ps version of the paper which I like more)

sequence labeling tutorial

./configure; make; make install

step-by-step example for using GDB within Emacs to debug a C or C++ program. See this for more GDB commands.

gentle tutorial on using valgrind to find memory problems in C++ code

using *screen* to survive dropped ssh connections while running your jobs

productivity tips for using ssh

blacklight frontend machine blacklight.psc.teragrid.org

learning topic models: going beyond SVD. slides, paper

MIT's matrix cookbook, and Tom Minka's awesome writeup on matrix derivatives.

EM tutorial

Why are the objectives of logistic regression and CRF models convex?
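The short version, as a sketch: in both cases the Hessian of the negative log-likelihood is positive semidefinite. The notation below is assumed for this note (labels y_i in {-1, +1} for logistic regression, feature function f and partition function Z for the CRF) and is not taken from the linked explanation.

```latex
% Logistic regression: negative log-likelihood and its Hessian
\ell(w) = \sum_i \log\bigl(1 + e^{-y_i w^\top x_i}\bigr),
\qquad
\nabla^2 \ell(w) = \sum_i \sigma(w^\top x_i)\bigl(1 - \sigma(w^\top x_i)\bigr)\, x_i x_i^\top \succeq 0

% CRF: negative log-likelihood and its Hessian
\ell(w) = \sum_i \Bigl[\log Z(x_i; w) - w^\top f(x_i, y_i)\Bigr],
\qquad
Z(x; w) = \sum_{y'} e^{w^\top f(x, y')}

\nabla^2 \ell(w) = \sum_i \operatorname{Cov}_{y \sim p(\cdot \mid x_i; w)}\bigl[f(x_i, y)\bigr] \succeq 0
```

The logistic Hessian is a nonnegative combination of outer products, and the CRF Hessian is a sum of feature covariance matrices; the linear term in w contributes nothing to the second derivative.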

LaTeX on blogger

git concepts

Eigen: a C++ matrix library