Extracting n-word phrases from large texts
This is a summary of resources posted on [Corpora-List] in early 2014:

CMU-Cambridge Statistical Language Modeling toolkit: http://mi.eng.cam.ac.uk/~prc14/toolkit.html

Sketch Engine: http://www.sketchengine.co.uk/documentation/wiki/SkE/NGrams

Laurence Anthony's AntConc: http://www.antlab.sci.waseda.ac.jp/software.html

kfNgram: http://www.kwicfinder.com/kfNgram/kfNgramHelp.html

Colibri: Software for the extraction of n-grams as well as patterns that are not consecutive (skipgrams). The software is written in C++ for speed and memory efficiency but comes …
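To make the difference between consecutive n-grams and non-consecutive skipgrams concrete, here is a minimal Python sketch. It is not taken from Colibri or any of the tools listed above; the function names `ngrams` and `skipgrams` are hypothetical, tokenization is assumed to be plain whitespace splitting, and "skipgram" is used in the common "k-skip-n-gram" sense (word order kept, at most `max_skip` tokens skipped in total), which may differ from the definition a particular tool uses.

```python
from collections import Counter
from itertools import combinations


def ngrams(tokens, n):
    """Yield every run of n consecutive tokens as a tuple."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


def skipgrams(tokens, n, max_skip):
    """Yield n-token tuples in original order, skipping at most max_skip tokens in total.

    Note: under this definition the contiguous n-grams are included as well.
    """
    window = n + max_skip
    for i in range(len(tokens)):
        # The first token is fixed at position i; the other n-1 tokens are
        # chosen from the tokens that follow it inside the window.
        tail = tokens[i + 1:i + window]
        for rest in combinations(range(len(tail)), n - 1):
            yield (tokens[i],) + tuple(tail[j] for j in rest)


# Illustrative input only; whitespace splitting is an assumption.
tokens = "to be or not to be".split()
bigram_counts = Counter(ngrams(tokens, 2))
skip_bigram_counts = Counter(skipgrams(tokens, 2, 1))
print(bigram_counts.most_common(3))
print(skip_bigram_counts.most_common(3))
```

For genuinely large texts this in-memory approach may not scale, since every distinct pattern is held in a single `Counter`; the dedicated tools listed above are likely the better choice in that case.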