Introduction
The blogosphere is growing extremely fast, offering substantial commercial value and new business opportunities in areas such as advertising, opinion extraction, and marketing. However, spam blogs (splogs) have become a major problem in the blogosphere: they degrade the quality of information retrieval results and waste network resources. We propose a solution for detecting splogs in the blogosphere. The main contributions of our work are:
- Understanding the splog problem: Unlike web or email spam, a splog as a whole is not static. We treat a blog as a unit whose content changes dynamically over time, a view that matches the real-world splog problem.
- Evaluation: Splogs need to be combated as quickly as possible, before they waste substantial network resources. We propose a time-sensitive evaluation framework that measures splog detection performance by how quickly a correct decision is made.
- Detection: Splogs have properties that differ from those of web or email spam. Our detection method identifies distinctive features that are useful for detecting splogs.