Gibbs sampler for toy topic model example

Several years ago, I implemented a Gibbs sampler in R for the artificial data of Steyvers and Griffiths (2007), “Probabilistic topic models”. I used it for a class demo and have been meaning to post it as a GitHub gist. Here it is:

The artificial problem provides a nice, simple test case for watching Gibbs sampling infer the topic-word and document-topic distributions. The sampling code is shorter than the setup code. The comments in the code should make everything self-explanatory if you have read Steyvers and Griffiths.
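For readers who want a feel for the sampling step before opening the gist, here is a minimal sketch of a collapsed Gibbs sampler on toy data in the same spirit. It is not the gist itself: the data generation, the hyperparameters `alpha` and `beta`, the iteration count, and all variable names are assumptions chosen for illustration.

```r
# Illustrative collapsed Gibbs sampler for a two-topic toy problem.
# All settings below are assumptions, not the values used in the gist.
set.seed(42)

vocab <- c("money", "loan", "bank", "river", "stream")
W <- length(vocab)     # vocabulary size
n_topics <- 2
n_docs <- 16
doc_len <- 16
alpha <- 1             # assumed document-topic smoothing
beta <- 1              # assumed topic-word smoothing

# Topic-word distributions used only to generate data (rows = topics)
phi_true <- rbind(c(1/3, 1/3, 1/3, 0, 0),
                  c(0, 0, 1/3, 1/3, 1/3))
# Weight of topic 1 per document, from all topic 1 to all topic 2
mix <- seq(1, 0, length.out = n_docs)

# Tokens are stored as parallel vectors of document ids and word ids
doc_id <- rep(seq_len(n_docs), each = doc_len)
n_tokens <- length(doc_id)
word_id <- integer(n_tokens)
for (i in seq_len(n_tokens)) {
  k <- if (runif(1) < mix[doc_id[i]]) 1 else 2
  word_id[i] <- sample.int(W, 1, prob = phi_true[k, ])
}

# Random initial topic assignments and the two count matrices
z <- sample.int(n_topics, n_tokens, replace = TRUE)
wt <- matrix(0, W, n_topics)        # word-topic counts
dt <- matrix(0, n_docs, n_topics)   # document-topic counts
for (i in seq_len(n_tokens)) {
  wt[word_id[i], z[i]] <- wt[word_id[i], z[i]] + 1
  dt[doc_id[i], z[i]] <- dt[doc_id[i], z[i]] + 1
}

for (iter in seq_len(200)) {
  for (i in seq_len(n_tokens)) {
    w <- word_id[i]
    d <- doc_id[i]
    # Remove the current token from the counts
    wt[w, z[i]] <- wt[w, z[i]] - 1
    dt[d, z[i]] <- dt[d, z[i]] - 1
    # Unnormalised conditional over topics; the document-side
    # denominator is the same for every topic, so it is dropped
    p <- (wt[w, ] + beta) / (colSums(wt) + W * beta) * (dt[d, ] + alpha)
    z[i] <- sample.int(n_topics, 1, prob = p)
    # Add the token back under its newly sampled topic
    wt[w, z[i]] <- wt[w, z[i]] + 1
    dt[d, z[i]] <- dt[d, z[i]] + 1
  }
}

# Point estimates from the final sample
phi_hat <- sweep(wt + beta, 2, colSums(wt) + W * beta, "/")   # columns sum to 1
theta_hat <- (dt + alpha) / (rowSums(dt) + n_topics * alpha)  # rows sum to 1
rownames(phi_hat) <- vocab
print(round(phi_hat, 2))
print(round(theta_hat, 2))
```

Each column of `phi_hat` is an estimated topic-word distribution and each row of `theta_hat` is an estimated document-topic distribution. Which column ends up as the "money" topic and which as the "river" topic depends on the random initialisation, since the topic labels are interchangeable.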

To run it, you can of course just paste it into an R session. You can also run it from the command line, e.g.:

[sourcecode lang="bash"]
$ R --no-save < topics_gibbs_sg_example.R
[/sourcecode]

If you are interested in other tutorials that discuss Bayesian learning and samplers (with a definite slant toward natural language processing), check these out:

Author: jasonbaldridge

Co-founder of People Pattern and Associate Professor in the Department of Linguistics at the University of Texas at Austin. My primary specialization is computational linguistics and my core research interests are formal and computational models of syntax, probabilistic models of both syntax and discourse structure, and machine learning for natural language tasks in general.
