SCuBA: Subspace Clustering Based Analysis

Arizona State University, Computer Science and Engineering, Data Mining and Machine Learning


Researchers spend considerable time searching for papers relevant to their current topic of interest. Often, despite having similar interests, researchers in the same lab do not find it convenient to share the results of bibliographic searches and thus conduct independent, time-consuming searches. A research paper recommender system can help researchers avoid this duplicated effort by letting each one automatically take advantage of previous searches performed by others in the lab. Existing recommender systems were developed for commercial domains, where they steer users toward products that match their interests. Unlike those domains, the research paper domain has relatively few users compared with the huge number of research papers. Here we present a novel system that recommends relevant research papers to a user based on the user's recent querying and browsing habits. The core of the system is a scalable subspace clustering algorithm (SCuBA) that performs well on the sparse, high-dimensional data collected in this domain.

About SCuBA Software

This software package is written in Java. It is provided free of charge to the research community as an academic software package, with no commitment to support or maintenance. Please read the README.TXT file in the package for further help in executing the software. Please refer to the paper below for data preparation and experimentation.


The README provides details on how to run the software, how to prepare the data, and how to read the results.



Created on 02/04/2007