My research interests lie at the intersection of numerical linear algebra, algebraic graph theory, and data science. In particular, my work has made notable contributions to the areas of matrix polynomials, the numerical solution of polynomial equations, the rankability of data. More recently, I have worked on a combinatorial optimization approach to measuring the rankability of data, and I have developed a novel approach to characterizing digraphs via the numerical range.
Spectral graph theory has a long and interesting history which intersects the seminal works on the algebraic connectivity, i.e., the second smallest eigenvalue of the graph Laplacian. Furthermore, the eigenvalues of the graph Laplacian have been used to characterize graphs and have applications to data mining, image processing, mixing of Markov chains, chromatic numbers, and much more. However, there has been far less success in the study of the spectra of directed graphs, which is mainly due to the asymmetry of the associated matrices.
Most of the results on the spectra of digraphs are specific to the adjacency matrix or the (symmetric) Laplacian matrix for strongly connected digraphs, see Brualdi and the references therein. A few notable exceptions, which apply to the (asymmetric) Laplacian matrix for digraphs, include the results of Bauer and our recent work on characterizing acyclic tournaments in CamLanSmi2020. Even more recently, we have developed a novel method which uses the "restricted numerical range" of the Laplacian matrix to characterize digraphs, see CamRobWie2020
Recently, the rankability problem was proposed, which seeks to quantify a dataset's inherent ability to produce a meaningful ranking of its items. For instance, in the context of a round-robin tournament, a perfectly rankable season could be modeled as an acyclic tournament digraph. In CamLanSmi2020, we give a spectral-degree characterization of such digraphs and use that characterization to formulate a measure of rankability.
Below is our rankability measure for each year in the Big East NCAA College Football conference from 1995 to 2012. In addition, we provide a ranking fit measure, which is equal to the percentage of games correctly predicted by a ranking obtained from the linear ordering problem (LOP).
Our rankability measure is effective for data that can be modeled as a digraph with an edge between every pair of vertices. However, there are many datasets that cannot be modeled this way, e.g., the network of all college football teams is very sparse since many teams never play each other. This observation motivates the following problems.
- Can we develop a rankability measure that is based on properties of the LOP, e.g., the optimal value and the set of optimal solutions, rather than the acyclic tournament characterization?
- What other graph properties (if any) will help us determine when the LOP has a unique optimal solution, not necessarily without violations?
- Can we develop a discrete optimization model that will solve for two extremal solutions of an LOP?
- What other factors should we consider to determine which of the optimal solutions of an LOP is "best"? Theoretically, we might consider the optimal solution that minimizes the sum of the Kendall tau distances to the other optimal solutions. In applications, there may be other factors to consider which will favor one optimal solution over the others.
In CamRobWie2020, we develop the restricted numerical range, which is a convex set in the complex plan that is produced via the Laplacian matrix of a digraph. We show that the minimum real part of the restricted numerical range is equal to the algebraic connectivity of the digraph as defined by Wu. In fact, the algebraic connectivity is an eigenvalue of of the symmetric part of the restricted Laplacian. This observation motivates the following problem.
- Can we use an eigenvector of the symmetric part of the restricted Laplacian associated with the algebraic connectivity to perform a clustering of the digraph? In Wu's paper, bounds related to the maximum directed cut, isoperimetric number, and the algebraic connectivity are proved, which suggests that the corresponding eigenvector may be used to bi-partition the vertices of the digraph.
Moreover, we use the restricted numerical range to characterize digraphs. For instance, a digraph is a cycle on n vertices if and only if its restricted numerical range is a complex polygon with vertices equal to 1 minus the roots of unity, not including zero. In fact, these vertices (including zero) make up the eigenvalues of the Laplacian, which also characterize the cycle. What makes the restricted numerical range so powerful is that it is capable of characterizing digraphs that are not characterized by their Laplacian spectrum.
Consider the k-imploding star defined as the directed join of the empty digraph on (n-k) vertices and the completely connected digraph on k vertices. Examples of the 1-Imploding and 2-Imploding stars are shown below.
- Are there digraphs that are characterized by their Laplacian spectrum, but are not characterized by their restricted numerical range?
Beyond the digraphs with a real restricted numerical range, we have characterized cycles and regular tournaments, both of which have a restricted numerical range that is a complex polygon. This observation motivates the following problem.
- What other digraphs have a restricted numerical range that is a complex polygon?
- What digraphs have non-negative algebraic connectivity, i.e., what digraphs have a restricted numerical range that does not intersect the left half of the complex plane?