Exploiting Linked Data in DBpedia to Reduce Prediction Error in Matrix Factorization Recommenders
Recommender Systems, Matrix Factorization, Linked Open Data, Prediction Error
Recommender Systems provide suggestions for items that are most likely to interest users. Providing personalized recommendations is a challenge that can be addressed by filtering algorithms, among which Collaborative Filtering (CF) has shown considerable progress in recent years. Using Matrix Factorization (MF) techniques, CF methods reduce prediction error by means of optimization algorithms. However, they usually still face problems such as data sparsity and prediction error. Studies point to the data available in the Semantic Web as a way to improve recommender systems and address the challenges of CF techniques. Motivated by these premises, the present work developed a data pipeline, along with an algorithm that processes the Ratings Matrix by combining semantic similarities computed from Linked Open Data (LOD) and estimates missing ratings. The experiments use subsets of three datasets (Movielens, LastFM and LibraryThing), two semantic similarity metrics (Linked Data Similarity Distance (LDSD) and Resource Similarity (RESIM)), and three MF-based algorithms (SVD, SVD++ and NMF). Our experiments reduced sparsity by more than 75% in the Movielens subset and by 28% in the LastFM subset. Prediction error was reduced in all subsets with statistical confidence, using a parametric one-way ANOVA followed by Tukey’s multiple comparison test.
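To make the idea of estimating missing ratings from semantic similarities concrete, the following is a minimal sketch and not the algorithm developed in this work. It assumes that an item-item similarity matrix is already available (the values below are hand-written placeholders standing in for LDSD- or RESIM-derived scores) and that a missing rating is estimated as the similarity-weighted average of the same user's known ratings on sufficiently similar items. The function names, the threshold, and the toy data are all hypothetical.

```python
from typing import List, Optional

Matrix = List[List[Optional[float]]]


def sparsity(ratings: Matrix) -> float:
    """Fraction of cells in the ratings matrix that have no rating."""
    total = sum(len(row) for row in ratings)
    missing = sum(1 for row in ratings for r in row if r is None)
    return missing / total


def fill_with_similarity(
    ratings: Matrix, sim: List[List[float]], threshold: float = 0.5
) -> Matrix:
    """Estimate missing ratings from semantically similar items.

    Assumption (not taken from the paper): a missing rating for item i is
    the similarity-weighted mean of the user's known ratings on items j
    with sim[i][j] >= threshold. Cells with no such neighbour stay None.
    """
    filled = [row[:] for row in ratings]  # leave the input untouched
    for u, row in enumerate(ratings):
        for i, r in enumerate(row):
            if r is not None:
                continue
            num, den = 0.0, 0.0
            for j, r_j in enumerate(row):
                if j == i or r_j is None:
                    continue
                if sim[i][j] >= threshold:
                    num += sim[i][j] * r_j
                    den += sim[i][j]
            if den > 0:
                filled[u][i] = num / den
    return filled


# Toy data: 3 users x 3 items, None marks a missing rating.
ratings: Matrix = [
    [5.0, None, 1.0],
    [None, 4.0, None],
    [2.0, None, None],
]
# Placeholder item-item similarities (symmetric, 1.0 on the diagonal).
sim = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]

filled = fill_with_similarity(ratings, sim)
print(f"sparsity before: {sparsity(ratings):.2f}, after: {sparsity(filled):.2f}")
```

In this sketch the densified matrix would then be passed to an MF-based algorithm in place of the original one. Only cells with at least one sufficiently similar rated item are filled, so the sparsity reduction depends on the chosen threshold and on how well the items are covered by the similarity metric.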