My research focuses on the interplay between large-scale stochastic networks and big-data analytics. My group is interested in a broad range of fundamental problems arising across the entire lifespan of big-data analytics, including:

Design of private data marketplaces for big-data analytics: We are developing a market model for private data analytics in which private data are treated as a commodity and traded in the market. We have used a game-theoretical approach to develop an innovative incentive mechanism to pay (or reward) individuals for reporting informative data. This market approach allows data subjects to take full control of their own privacy and, at the same time, guarantees that the data collector receives informative data.

Efficient network resource allocation of centralized and distributed big-data systems: Asymptotic analysis is commonly used to study complex stochastic systems when exact analysis is difficult. Traditional heavy-traffic analysis and mean-field analysis are two popular approaches that concern two different asymptotic regimes: small systems with heavy load and large systems with light load. We are currently working on establishing a universal analytical framework for the design and operation of large-scale stochastic systems related to big-data analytics (such as cloud computing systems, ride-sharing systems, crowdsourcing, and the Internet of Things) under a broad range of operating conditions, one that includes both the traditional heavy-traffic regime and the mean-field regime as special cases.

Diffusion, detection and intervention of misinformation (colloquially known as "fake news") in complex social networks: The proliferation of misinformation on online social networks has become one of the greatest threats to our national security. Our work on diffusion source localization addresses the problem of locating the source of misinformation in large networks. A fundamental question in combating misinformation is how to identify news as misinformation in real time and how to effectively counter its spread once it has been identified. We are currently developing a joint data-driven (based on data mining) and model-driven (based on statistical machine learning) approach for real-time detection of misinformation, where the goal is to identify misinformation at an early stage of its spread. We are also interested in developing intervention algorithms based on optimal control and game theory for limiting the spread of misinformation.

From a theory perspective, addressing the problems above will lead to new analytical methods for understanding large-scale stochastic systems with strategic entities that interact with online platforms and are driven by big-data applications. We are particularly interested in using tools from probability, stochastic networks, game theory, and optimization to derive models that are computationally tractable and asymptotically accurate, and to develop resource allocation mechanisms that are simple, effective and provably optimal. From an application perspective, this line of research has broad applications in many different domains. Because of its fundamental importance and practical relevance, our research has received diverse support from NSF, ONR, ARO, DTRA, and NASA. These projects cover a wide range of topics, from communication networks to cloud computing systems, complex social networks, and multilayered critical infrastructure networks. Our research contributions have also been recognized as best papers at conferences across different disciplines, including communication networks (INFOCOM and WiOpt), computer systems (SIGMETRICS) and data mining (KDD).