Probably useful:

* example featuregazetteer pipeline for filtering stopwords
* example gazetteer pipeline for doing something useful based on a smallish gazetteer (what?)
* example Java regexp pipeline for CoNLL-like tokenisation
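As a starting point for the last item, here is a minimal standalone sketch of regexp-based tokenisation using plain `java.util.regex`. It is not a pipeline definition: the class name `ConllTokenizer`, the token pattern, and the one-token-per-line output are all assumptions made for illustration, and real CoNLL data sets follow their own tokenisation conventions (clitics, abbreviations, and so on), which this pattern does not try to reproduce.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class ConllTokenizer {

    // Assumed token pattern, tried left to right:
    //   1. a run of letters, optionally joined by internal hyphens or apostrophes
    //   2. a number, optionally with internal '.' or ',' separators
    //   3. any other single non-whitespace character (punctuation, symbols)
    private static final Pattern TOKEN = Pattern.compile(
            "\\p{L}+(?:[-']\\p{L}+)*|\\d+(?:[.,]\\d+)*|\\S");

    // Splits one sentence into tokens, in order of appearance.
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(sentence);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // Renders one sentence in a CoNLL-like layout: one token per line,
    // followed by an empty line that marks the sentence boundary.
    static String toConllLines(String sentence) {
        StringBuilder sb = new StringBuilder();
        for (String token : tokenize(sentence)) {
            sb.append(token).append('\n');
        }
        sb.append('\n');
        return sb.toString();
    }
}
```

Calling `ConllTokenizer.toConllLines(...)` on each sentence and concatenating the results gives a single-column file; further columns (POS tags, entity labels) would have to come from other processing steps.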