WordSmith Tools

WordSmith Tools is an integrated suite of programs for looking at how words behave in texts. You can use the tools to find out how words are used in your own texts or in those of others. The WordList tool shows a list of all the words or word clusters in a text, set out in alphabetical or frequency order. The concordancer, Concord, shows any word or phrase in context, so that you can see what sort of company it keeps. With KeyWords you can find the key words in a text. The tools have been used by Oxford University Press for its own lexicographic work in preparing dictionaries, by language teachers and students, and by researchers investigating language patterns in many languages and countries. Several extras are available, such as a BNC word list and a Shakespeare corpus.

http://www.lexically.net/wordsmith/index.html
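
To make the ideas of a frequency-ordered word list and a concordance (keyword in context) more concrete, here is a minimal Python 3 sketch. It is not how WordSmith Tools works internally; the function names, the naive tokenization rule and the sample sentence are all assumptions made for illustration only.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately naive rule)."""
    return re.findall(r"[a-z']+", text.lower())

def word_list(text):
    """Return (word, count) pairs, most frequent first, ties in alphabetical order."""
    counts = Counter(tokenize(text))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def concordance(text, keyword, width=2):
    """Return one line per hit, with up to `width` words of context on each side."""
    tokens = tokenize(text)
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines

# Hypothetical sample text, used only to show the output format.
sample = "The cat sat on the mat. The dog saw the cat."
print(word_list(sample)[:3])
for line in concordance(sample, "cat"):
    print(line)
```

Real concordancers add much more (sorting by left or right context, wildcard searches, large-file handling), so this sketch only shows the basic principle.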

Free alternatives:

AntConc

A freeware concordance program for Windows, Macintosh OS X, and Linux.

You can find a good and detailed description of its functions here (German): http://litre.uni-goettingen.de/index.php/AntConc#Keyword_List_Tool

http://www.antlab.sci.waseda.ac.jp/antconc_index.html

CasualConc

A freeware concordance program that runs natively on Mac OS X. Might be worth a try for Apple users.

https://sites.google.com/site/casualconc/

ConcApp

ConcApp provides concordance searches and word frequency analysis, and includes full editing support and testing activities. It also supports Unicode and can process not only English, French and probably most other European languages, but also Chinese, Japanese, Thai and Russian texts in Unicode.

http://www.edict.com.hk/pub/concapp/

QUITA: Quantitative Text Analyzer

QUITA is a freeware tool for easy calculation of some basic quantitative indices of a corpus (e.g. type-token ratio, distance between verbs, etc.) and some more advanced indices (h-point, entropy, …). It supports automatic tokenization (whitespace or NLTK), lemmatization (NLTK) and POS tagging (NLTK Treebank tagger), and can output n-grams.

https://code.google.com/p/oltk/

Please note that QUITA requires Python 2.X, NLTK and numpy to be installed. For detailed instructions, see How to install QUITA (free registration is required to access the knowledge base).
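
As a rough illustration of the kind of indices mentioned above, the following Python 3 sketch computes a type-token ratio, the Shannon entropy of the word distribution, and n-grams for a whitespace-tokenized text. It is not QUITA's code and does not use NLTK; the function names and the sample sentence are assumptions for illustration, and indices such as the h-point are not covered.

```python
import math
from collections import Counter

def type_token_ratio(tokens):
    """Number of distinct words (types) divided by number of running words (tokens).

    Assumes a non-empty token list.
    """
    return len(set(tokens)) / len(tokens)

def entropy(tokens):
    """Shannon entropy (in bits) of the relative word frequencies.

    Assumes a non-empty token list.
    """
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical sample, tokenized on whitespace only.
tokens = "to be or not to be".split()
print(type_token_ratio(tokens))   # 4 types over 6 tokens
print(entropy(tokens))
print(ngrams(tokens, 2))
```

Note that the type-token ratio depends strongly on text length, so values are only comparable between texts of similar size.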


ShinyConc

ShinyConc is a framework for generating custom web-based concordancers. (Christoph Wolk, http://shinyconc.de/index.html)





Last modified: Thursday, 7 December 2017, 12:32 PM