Markov algorithm
This page is about a property of algorithms. For the string-rewriting formalism, see wikipedia:Markov algorithm.
An algorithm is Markov, has the Markov property, or is memoryless, if its next state never depends on more than a fixed number of its prior states. A Markov chain is a sequence of states produced by a Markov algorithm. Markov models are used in fields ranging from meteorology to natural-language modeling.
For simplicity, imagine a Markov algorithm as maintaining a sliding window of states. To generate a new state, the algorithm combines data from prior states in the window, and then advances the window to include the new state, discarding the oldest state.
(Image: state diagram of a two-state Markov chain: https://upload.wikimedia.org/wikipedia/commons/2/2b/Markovkate_01.svg)
Markov distributions
Every Markov algorithm has an underlying probability distribution for the next state. This distribution is conditional on the prior states but nothing else. Knowledge of this distribution is approximate knowledge of the underlying Markov algorithm.
A Markov distribution can be used to generate Markov chains, even if the original algorithm is not known. Given an initial window of states and a random number generator (RNG), we may sample a Markov chain by repeatedly making random weighted choices of next states and sliding the window:
 states = …
 window = …
 while true:
     # weighted choice of a next state, conditioned on the current window
     window.push_right(sample(states, window))
     window.pop_left()
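As a concrete sketch of this loop, here is a Python version. The two-state distribution, its weights, and the generate function are all invented for illustration; it assumes a window of size one and a distribution stored as a dict from windows to weighted next states:

 import random

 # Hypothetical distribution: each window (a tuple of prior states) maps
 # to a list of (next_state, weight) pairs. The window size here is 1.
 distribution = {
     ("sunny",): [("sunny", 3), ("rainy", 1)],
     ("rainy",): [("sunny", 1), ("rainy", 1)],
 }

 def generate(distribution, window, length, rng=random):
     """Sample `length` further states of a Markov chain from an initial window."""
     window = list(window)
     chain = list(window)
     for _ in range(length):
         nexts = distribution[tuple(window)]
         states = [s for s, _ in nexts]
         weights = [w for _, w in nexts]
         chain.append(rng.choices(states, weights=weights)[0])
         window.append(chain[-1])  # slide the window right...
         window.pop(0)             # ...and discard the oldest state
     return chain

 print(generate(distribution, ("sunny",), 10))

Here random.choices performs the weighted draw, and the two list operations slide the window exactly as in the pseudocode above.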
n-grams
A standard programming exercise is to take a hunk of text and, assuming that it was generated by some Markov algorithm, recover the Markov distribution. An introductory approach approximates the distribution with n-grams, fragments of n consecutive words: scan the text and record all of the (n+1)-grams, then treat the first n words of each (n+1)-gram as the window of prior states and the final word as the next state.
For example, a bigram is a pair of words and a trigram is a triple of words. We may approximate a Markov distribution by recording every trigram, breaking each one into a leading bigram and a next state, and then tallying the next states seen for each bigram. Consider the example sentence:
a black cat and a black dog took a nap
Punctuation has been removed and capitalization normalized for clarity. Only one bigram appears twice, "a black"; it may be followed by "cat" or "dog". So the Markov distribution is mostly deterministic: every other window has exactly one possible next state, while the window ("a", "black") is followed by "cat" with 50% probability and by "dog" with 50% probability.
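Here is a minimal sketch of this counting pass in Python. The name markov_distribution is invented for this example, and the text is assumed to be cleaned and split into words already:

 from collections import defaultdict

 def markov_distribution(words, n=2):
     """Map each n-word window to counts of the words that follow it."""
     counts = defaultdict(lambda: defaultdict(int))
     for i in range(len(words) - n):
         window = tuple(words[i:i + n])      # first n words of the (n+1)-gram
         counts[window][words[i + n]] += 1   # final word is the next state
     return counts

 words = "a black cat and a black dog took a nap".split()
 print(dict(markov_distribution(words)[("a", "black")]))  # {'cat': 1, 'dog': 1}

Running this reproduces the observation above: ("a", "black") is the only window with more than one possible next state.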
Examples
Gambler's ruin
In this classic chain, the state is a gambler's bankroll, which rises or falls by one on each fair bet until it hits a boundary. With states 1 through 4:
- 2 has a 50% chance of going to 3 and a 50% chance of going to 1
- 3 has a 50% chance of going to 2 and a 50% chance of going to 4
- 1 and 4 have a 100% chance of going to themselves
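A minimal simulation sketch in Python; the transitions table and the gamble function are names invented here, and the walk is assumed to start at state 2:

 import random

 # Each state maps to its equally likely successors; 1 and 4 map to themselves.
 transitions = {1: [1], 2: [1, 3], 3: [2, 4], 4: [4]}

 def gamble(state=2, rng=random):
     """Walk the chain from `state` until it reaches an absorbing state."""
     chain = [state]
     while transitions[state] != [state]:  # states 1 and 4 absorb
         state = rng.choice(transitions[state])
         chain.append(state)
     return chain

 print(gamble())  # e.g. [2, 3, 2, 1]

Every run ends at state 1 or state 4, since those states can only transition to themselves.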
External resources
- Markov chain generator
- Wikipedia has related articles: wikipedia:Markov chain, wikipedia:Markov property