
We introduce a technique for identifying the most salient participants in a discussion. Our method, MavenRank, is based on lexical centrality: a random walk is performed on a graph in which each node is a participant in the discussion and an edge links two participants who use similar rhetoric. As a test, we applied MavenRank to data from the US Congressional Record to identify the most influential members of the US Senate, and evaluated the output against committee ranking. Our results show that MavenRank scores are largely driven by committee status in most topics, but can capture speaker centrality in topics where speeches are used to indicate ideological position rather than to influence legislation.
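To make the idea of a random walk on a lexical similarity graph concrete, the following is a minimal sketch in Python and not the MavenRank implementation itself. Several choices are assumptions made for illustration only: raw term counts with cosine similarity as the measure of "similar rhetoric", the edge cutoff `threshold`, the PageRank-style `damping` factor, and the hypothetical function names `cosine` and `centrality_scores`.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centrality_scores(speeches, threshold=0.1, damping=0.85, iters=100):
    """Score speakers by a damped random walk on a lexical similarity graph.

    speeches: dict mapping speaker name -> concatenated speech text.
    threshold and damping are illustrative assumptions, not values
    taken from the paper.
    """
    speakers = sorted(speeches)
    vecs = {s: Counter(speeches[s].lower().split()) for s in speakers}

    # Build a weighted, undirected graph: an edge links two speakers
    # whose term vectors are at least `threshold` similar.
    weights = {s: {} for s in speakers}
    for i, s in enumerate(speakers):
        for t in speakers[i + 1:]:
            sim = cosine(vecs[s], vecs[t])
            if sim >= threshold:
                weights[s][t] = sim
                weights[t][s] = sim

    # Power iteration: repeatedly move probability mass along edges,
    # in proportion to edge weight, until the scores settle.
    n = len(speakers)
    score = {s: 1.0 / n for s in speakers}
    for _ in range(iters):
        new = {s: (1.0 - damping) / n for s in speakers}
        for s in speakers:
            total = sum(weights[s].values())
            if total == 0:
                # A speaker with no neighbours spreads mass uniformly.
                for t in speakers:
                    new[t] += damping * score[s] / n
            else:
                for t, sim in weights[s].items():
                    new[t] += damping * score[s] * sim / total
        score = new
    return score


if __name__ == "__main__":
    # Toy input, invented purely for illustration.
    toy = {
        "A": "tax cut budget",
        "B": "tax cut health care",
        "C": "health care reform",
        "D": "fishing treaty",
    }
    for name, value in sorted(centrality_scores(toy).items()):
        print(name, round(value, 3))
```

In this toy input, speaker B shares vocabulary with both A and C, so the walk visits B most often, while D, who shares no terms with anyone, receives the lowest score. The scores form a probability distribution over speakers.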
We are currently working on a dynamic extension of MavenRank that identifies influential speakers at a given time.
The input data consists of Senate speeches from the 105th through 108th Congresses (1997-2004). The basic units of the electronic version of the Congressional Record are "html documents", which correspond roughly to titled subsections in the printed Record. Each document can contain zero, one, or several speakers, discussing one or more items or topics. There are 71,181 documents from 1997-2004.
This work is supported by the National Science Foundation under Grant No. 0527513, "DHB: The dynamics of Political Representation and Political Rhetoric". Any opinions, findings, and conclusions or recommendations expressed in this work are those of the authors and do not necessarily reflect the views of the National Science Foundation.