Data interpretation always rests on the implicit introduction of equivalence relations on the set of walks over the database. Every equivalence relation on the set of walks specifies a Markov chain describing the transitions of a discrete-time random walk. To geometrize and interpret the data, we propose a new distance between data units defined as a "Feynman path integral", in which all possible paths between any two nodes in a graph model of the data are taken into account, although some paths are preferable to others. This path integral distance approach to the analysis of databases has proven effective, especially on multivariate strongly correlated data where other methods fail to detect structural components (urban planning, historical language phylogenies, music, analysis of street fashion traits, etc.). We believe that it can become an invaluable tool for intelligent complexity reduction and big data interpretation.
- Data interpretation
- Markov chains
- Multivariate strongly correlated data
- Path integral distance
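
As a rough illustration of a distance that accounts for every walk between two nodes, the sketch below computes the commute time of a random walk on a small weighted graph. The abstract does not give the formula for the path integral distance, so the commute time is used here only as a well-known stand-in with the same flavor; the function names (`transition_matrix`, `hitting_times`, `commute_time_matrix`) and the three-node example graph are hypothetical and chosen for illustration.

```python
def transition_matrix(adj):
    """Row-normalise a weighted adjacency matrix into random-walk probabilities."""
    P = []
    for row in adj:
        total = sum(row)
        if total == 0:
            raise ValueError("isolated node: the random walk is undefined")
        P.append([w / total for w in row])
    return P


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system; is the graph connected?")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def hitting_times(P, target):
    """Expected number of steps to first reach `target` from every node."""
    n = len(P)
    others = [i for i in range(n) if i != target]
    # Solve (I - Q) h = 1, where Q is P restricted to the non-target nodes.
    A = [[(1.0 if i == k else 0.0) - P[i][k] for k in others] for i in others]
    h = solve(A, [1.0] * len(others))
    result = [0.0] * n
    for idx, i in enumerate(others):
        result[i] = h[idx]
    return result


def commute_time_matrix(adj):
    """Commute time i -> j -> i for all pairs; an assumed stand-in distance."""
    P = transition_matrix(adj)
    n = len(P)
    H = [hitting_times(P, j) for j in range(n)]  # H[j][i]: expected steps i -> j
    return [[H[j][i] + H[i][j] for j in range(n)] for i in range(n)]


# Hypothetical data: a path graph 0 - 1 - 2 with unit edge weights.
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
distances = commute_time_matrix(adjacency)
```

Because the expected hitting time averages over all walks weighted by their probabilities, short and heavily weighted routes contribute more than long detours, which mirrors the idea that some paths are preferable to others.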