A Markov chain analysis of the network generated by a matrix of lexical distances makes it possible to represent the complex relationships among the languages of a language family geometrically, in terms of distances and angles. The fully automated method for constructing a language taxonomy is tested on a sample of fifty languages of the Indo-European language group and then applied to a sample of fifty languages of the Austronesian language group. The Anatolian and Kurgan hypotheses of Indo-European origin and the 'express train' model of Polynesian origin are discussed in detail. © 2010 Elsevier Ltd. All rights reserved.
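As a rough illustration of the idea in the abstract, the following sketch builds a random walk from a small distance matrix and reads off geometric coordinates from its spectrum. It is not the authors' procedure: the exponential kernel, the `scale` parameter, the retained self-loops, the function names, and the four-language toy matrix are all assumptions made for this example.

```python
import numpy as np

def random_walk_embedding(dist, n_dims=2, scale=1.0):
    """Return (transition matrix, spectral coordinates) for a distance matrix.

    Assumption: distances become edge weights through exp(-d / scale),
    so lexically closer languages are linked more strongly. Self-loops
    are kept (weight 1 on the diagonal); this is a choice made for the
    sketch, not something stated in the abstract.
    """
    dist = np.asarray(dist, dtype=float)
    affinity = np.exp(-dist / scale)
    degree = affinity.sum(axis=1)

    # Row-stochastic transition matrix of the random walk.
    transition = affinity / degree[:, None]

    # Symmetric matrix with the same eigenvalues as the transition matrix.
    sym = affinity / np.sqrt(np.outer(degree, degree))
    eigvals, eigvecs = np.linalg.eigh(sym)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Right eigenvectors of the transition matrix; the first one
    # (eigenvalue 1) is constant and carries no information, so skip it.
    right = eigvecs / np.sqrt(degree)[:, None]
    coords = right[:, 1:n_dims + 1] * eigvals[1:n_dims + 1]
    return transition, coords

def angle_between(u, v):
    """Angle in radians between two coordinate vectors."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

# Hypothetical distances among four languages forming two close pairs.
toy_dist = np.array([
    [0.0, 0.2, 0.8, 0.8],
    [0.2, 0.0, 0.8, 0.8],
    [0.8, 0.8, 0.0, 0.2],
    [0.8, 0.8, 0.2, 0.0],
])

transition, coords = random_walk_embedding(toy_dist, n_dims=2)
within = np.linalg.norm(coords[0] - coords[1])    # same pair
between = np.linalg.norm(coords[0] - coords[2])   # different pairs
```

In this toy case the two languages of a pair land closer together in the embedding than languages from different pairs, which is the kind of geometric reading (distances and angles between language vectors) the abstract refers to.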
|Journal|Computer Speech and Language|
|State|Published - Jul 1 2011|