Community Detection and Profiling
Human activities on social media embed actionable signals to reflect the behavior and interaction of humans and provide indication of events in physical world. However, discovering these signals from the large amount of data generated in social media platform and utilizing them to inform policy decision are challenging problems. In this project, we propose to unveil the signals from the perspective of latent communities. Detecting and subsequently profiling communities is critical to gaining an in-depth understanding of the network structure and dynamics and discovering hidden node or edge based features. More importantly, the discovery and classification of communities can facilitate solving pressing social and health problems. We propose new community detections algorithms tailed for domain specific considerations and profile the community with topics discovered with topic models and incorporate community detection algorithms to build impactful applications. This project has been published in IEEE Access 2018 and Wireless Algorithms, Systems, and Applications 2017.
Twitter-Based Real Time Flu Surveillance System
Despite increased advancements of medical technology and availability of vaccines, emerging and re-emerging epidemics such as SARS, influenza A(H1N1), avian flu, Ebola and Zika, continue to pose tremendous threats. Early detection and immediate response are essential to avoid the repercussions of an epidemic. However, in many cases current methods and algorithms for epidemic detection are no longer able to keep up with the new demands. The availability of unprecedented amount of social media data has provided an opportunity to develop new surveillance systems. In this project we propose a novel integrated framework of network theory, data mining and partial differential equation for early detection of epidemic outbreaks based on real-time Geo-tagged data in Twitter. The project develops new algorithms of community detection and topic analysis of Geo-tagged data in Twitter, new effective distance metric for epidemic spreading and new mathematical theorems for faster, near real-time, and localized detection of epidemic outbreaks. This project is funded by NSF and has been published in Scientific Report 2018 and Journal of Medical System 2016.
Dynamic Mathematical Modeling of Information Diffusion in OSN
Understanding information diffusion process in online social networks is a challenging problem due to the intricacy of human dynamics and social interactions, the vast scale of users and information, and the heterogeneity and diversity of online social networks. In this project, we proposed a Partial Differential Equation (PDE), specifically, a Diffusive Logistic (DL) equation to model the temporal and spatial characteristics of information diffusion. We present the temporal and spatial patterns in a real dataset collected from a social news aggregation site, Digg, and validate the proposed DL equation in terms of predicting the information diffusion process. Our experiment results show that the DL model is able to characterize the process of information propagation in online social networks. For example, for the most popular news with 24,099 votes in Digg, the average prediction accuracy of DL model over all distances during the first 6 hours is 92.08%. To the best of our knowledge, this paper is the first attempt to use PDE-based model to study the information diffusion process in both temporal and spatial dimensions in online social networks. This work is funded by NSF and has been published in ICDCS 2013 and International Workshop on Hot Topics in Peer-to-Peer Computing and Online Social Networking (HotPOST), 2012.
Positive Influence Dominating Set in Online Social Networks
This project studies the problem of how to effectively utilize online social networks to spread ideas and information. We introduced Positive Influence Dominating Set problem that is to choose a subset of users such that all users in the network have at least half of its neighbors in the chosen subset. Finding PIDS has important applications in social networks. For example, for college drinking problem, an intervention program needs to be launched to convert the drinker to abstainer. Due to the limitation of education resources, instead of choosing every binge student to participate in the intervention program, choose a PIDS of all the binge student to participate. Since each binge student will have more abstainer friends than binge friends, the binge student has a better chance to convert to abstainer. We studied the complexity of the PIDS problem and proposed a greedy approximation algorithm. We further analyzed the performance of proposed algorithm in both general random graph and power law graph. This work is support by ASU New College SRCA and has been published in Theoretical Computer Science, 2011 and Social Network Analysis and Mining, 2012.
Research Projects - Internet Measurement and Network Security
Internet Measurement and Monitoring
Internet Measurement and Monitoring: Over the last two decades, the Internet has become a critical communication infrastructure and has changed our lives with innovative services. However, the Internet is also under increasing attacks with unwanted traffic in the form of distributed denial of service attacks, virus and worms. Network traffic profiling has become increasingly imperative and urgent due to these wide spread cyber-attacks that bring down valuable Internet services. Our research project in network traffic profiling has established behavior-oriented framework for extracting and understanding the inherent structures and communication patterns of Internet traffic. Understanding behavioral patterns of Internet traffic is fundamental in managing and securing today's Internet. This project has produced two journal articles published in IEEE/ACM Transactions on Networking, 2014 and Journal of Network and Systems Management. 2012, and two conference papers published in IEEE INFOCOM, 2011 and IEEE GLOBEOM, 2011.
Securing Home Networks
The last decade has witnessed the rapid growth of residential broadband connections and home networks. The availability of home networks brings new application opportunities such as video streaming and remote health care, and has changed the landscape of Internet traffic. However, managing and securing home networks poses a serious challenge, because many home users have little technical expertise to manage, secure and troubleshoot their networks and connected devices. Most prior research on home networks have focused on usability and performance measurements, while few attempt has been made to study traffic patterns and security threats of home networks. Our research efforts focus on understanding traffic behavior in home networks for improved network management and security, and developing a built-in behavior-oriented traffic monitoring system on programmable home routers to collect, analyze and make sense of incoming and outgoing traffic for all Internet-connected devices in home networks. The behavior monitoring system not only provides an in-depth understanding of behavior patterns of home devices, but also detects unwanted traffic towards vulnerable home devices as well as malicious traffic originating from compromised devices in home networks. This project has resulted in two journal articles published in Security and Communication Networks, 2016 and Journal of Personal and Ubiquitous Computing, 2014, and three conference papers published in IEEE GLOBEOM, 2015, International Conference on Wireless Algorithms, Systems, and Applications, 2014, Best Paper Award, and IEEE GLOBEOM, 2013.
Research Projects - Wireless Networks
Fault Tolerant Virtual Backbone
Virtual backbone has been proposed as the routing infrastructure to alleviate the broadcasting storm problem in ad hoc networks. Since the nodes in the virtual backbone need to carry other node's traffic, and node and link failure are inherent in wireless networks, it is desirable that the virtual backbone is fault tolerant. We proposed a new algorithm called Connecting Dominating Set Augmentation (CDSA) to construct a 2-connected virtual backbone which can resist the failure of one wireless node. We proved that CDSA has constant approximation ratio. We also demonstrated the node size of the 2-connected virtual backbone constructed by CDSA is only 5% larger than the size of a 1-connected backbone. This work has been published in IEEE Transactions on Wireless Communications, 2009.
Fault Tolerant Topology Control
Fault tolerant topology control, that is, how to provide fault tolerance while minimizing total node power consumption in wireless network is a very challenging problem. We introduced and studied the problem of fault tolerant topology control for all-to-one and one-to-all communication model in static wireless networks. We modeled the networks with asymmetric wireless links with directed graph and investigated two approaches, namely minimum weight based approach and nearest neighbor augmentation approach. We proved that the minimum weight based approach has a k-approximation algorithm for all-to-one fault tolerant topology control where k is the number of disjoint paths. When k = 1, this approach gave a polynomial time solution for the minimum power sink tree problem. We also studied the one-to-all fault tolerant topology control problem in symmetric wireless networks which can be modelled using an undirected graph. We prove that minimum weight based approach is a 4k approximation and the nearest neighbor augmentation approach is a (k+4) approximation. This work has been published in IEEE Transactions on Mobile Computing 2008.
Connected Dominating Set for Different Transmission Ranges
Connected Dominating Set has been studied extensively in Unit Disk Graphs (UDG), in which each node has the same transmission range. However, in practice, the transmission ranges of all nodes are not necessary equal. We modeled a network as a disk graph and introduce the CDS problem in disk graphs. We presented three constant approximation algorithms to obtain a CDS of a given network and also gave their distributed versions. In addition, we studied the relationship between the size of an independent set and a CDS as well as a bound of the maximum number of independent neighbors of a node in disk graphs. This work has been published in IEEE Transactions on Mobile Computing, 2007.
Broadcast Tree Construction
Energy conservation is an important concern in wireless networks. Many algorithms for constructing a broadcast tree with minimum energy consumption and other goals have been developed. However, no previous research work considers the total energy consumption and transmission delays of the broadcast tree simultaneously. In this work, based on (\alpha,\beta)-tree, a novel concept to wireless networks, we defined a new Strongly connected Broadcast Arborescence with bounded Transmission delay (SBAT) problem and design the Strongly connected Broadcast Arborescence (SBA) algorithm with linear running time to construct a strongly connected broadcast tree with bounded total power, while satisfying the constraint that the transmission delays between the source and the other hosts are also bounded. We also proposed the distributed version and gave theoretical performance ratio analysis. This work has been published in IEEE Transactions on Mobile Computing, 2006.