Pattern Statistics in Permutations and Words
Patterns in a large combinatorial structure, such as a graph, a word, or a permutation, are the induced substructures on small subsets of vertices or entries. If a permutation describes the relation between the y-ranks and x-ranks of n points in the xy-plane, then every subset of k points induces one of k! possible patterns. Similarly, in an n-letter word over a finite alphabet, every k positions induce a k-letter subword. Some important questions are: How frequently does each pattern occur in typical settings? How efficiently can we count these occurrences? What do they tell us about the global properties of the structure?
Consider the k!-dimensional vector of pattern statistics in a uniformly random permutation. I will present a decomposition of its distribution into k asymptotically uncorrelated components of different orders in n. This extends work by Janson, Nakamura, and Zeilberger (2015) who characterized the first and last components. I will also discuss new algorithms for counting permutation patterns (SODA 2021). Our methods lead to near linear time computation of consistent variants of Hoeffding’s independence test, improving upon the so far most efficient quadratic time algorithms. Finally, I will discuss the full spectral decomposition of the distribution of subword statistics under different random models, with relations to several questions in statistics and combinatorics and potential applications to machine learning.
Joint work with Calvin Leng, and with Ran Tessler and Tsviqa Lakrec.
תאריך עדכון אחרון : 23/11/2020