Efficiently approximating the expected number of infections in disease spreading models
Frank Bauer and Joseph Lizier
Contact the author: Please use for correspondence this email.
Submission date: 05. Jan. 2012
published in: epl, 99 (2012) 6, art-no. 68007
DOI number (of the published article): 10.1209/0295-5075/99/68007
with the following different title: Identifying influential spreaders and efficiently estimating infection numbers in epidemic models : a walk counting approach
The problem of identifying the most effective nodes in a network for seeding disease spreading (or other types of information spreading such as rumours) has gained much recent attention. Exact solutions to this problem are possible but computationally prohibitive. Several approximation methods have been proposed. Ideally, such a method should accurately infer the number of resulting infections from each initially infected node, therefore also accurately order nodes in terms of their spreading effectiveness, and be computationally efficient. The previously proposed methods (based on degree, k-shell decomposition analysis and different centrality measures) are not usually based on the underlying disease spreading problem, rather measures of network topology. As such, they typically infer only the ordering of effectiveness of nodes but do not estimate number of infections, and indeed there is potential for the ordering they infer to be improved. We introduce a new method to approximate the number of infections resulting from a given initially infected node, based on counting the number of paths of various lengths (and in particular, self-avoiding walks for SIR models) to each other node in the network. This provides an advance over existing inference methods by taking the underlying problem into account and therefore providing estimation of actual numbers of infections. Additionally, in simulating infections on various real-world networks with the SIR model, we show that our self-avoiding walks based method improves the inference of effectiveness of nodes over a wide range of infection rates compared to existing methods. We also analyse the trade-off between estimate accuracy and computational cost of our method, showing that good accuracy can be obtained at reasonable cost.