Francesco Ricci · Lior Rokach · Bracha Shapira · Paul B. Kantor
Editors
Recommender Systems Handbook
Lior Rokach
Ben-Gurion University of the Negev
Dept. Information Systems Engineering
84105 Beer-Sheva Israel
[email protected]
Paul B. Kantor
Rutgers University
School of Communication, Information & Library Studies
SCILS Bldg.
Huntington Street 4
08901-1071 New Brunswick New Jersey
USA
ISBN 978-0-387-85819-7 e-ISBN 978-0-387-85820-3 DOI 10.1007/978-0-387-85820-3
Springer New York Dordrecht Heidelberg London
© Springer Science+Business Media, LLC 2011
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
Printed on acid-free paper
Library of Congress Control Number: 2010937590
Springer is part of Springer Science+Business Media (www.springer.com)
Francesco Ricci
Free University of Bozen-Bolzano Faculty of Computer Science Piazza Domenicani 3 39100 Bolzano Italy
Bracha Shapira
Ben-Gurion University of the Negev
Dept. Information Systems Engineering
Beer-Sheva Israel
Recommender Systems are software tools and techniques providing suggestions for items to be of use to a user. The suggestions provided are aimed at supporting their users in various decision-making processes, such as what items to buy, what music to listen to, or what news to read. Recommender systems have proven to be a valuable means for online users to cope with information overload and have become one of the most powerful and popular tools in electronic commerce. Correspondingly, various techniques for recommendation generation have been proposed, and during the last decade many of them have also been successfully deployed in commercial environments.
Development of recommender systems is a multi-disciplinary effort which involves experts from various fields such as Artificial Intelligence, Human Computer Interaction, Information Technology, Data Mining, Statistics, Adaptive User Interfaces, Decision Support Systems, Marketing, or Consumer Behavior. Recommender Systems Handbook: A Complete Guide for Research Scientists and Practitioners aims to impose a degree of order upon this diversity by presenting a coherent and unified repository of recommender systems’ major concepts, theories, methodologies, trends, challenges and applications. This is the first comprehensive book which is dedicated entirely to the field of recommender systems and covers several aspects of the major techniques. Its informative, factual pages will provide researchers, students and practitioners in industry with a comprehensive, yet concise and convenient reference source to recommender systems. The book describes in detail the classical methods, as well as extensions and novel approaches that were recently introduced. The book consists of five parts: techniques, applications and evaluation of recommender systems, interacting with recommender systems, recommender systems and communities, and advanced algorithms. The first part presents the most popular and fundamental techniques used nowadays for building recommender systems, such as collaborative filtering, content-based filtering, data mining methods and context-aware methods. The second part starts by surveying techniques and approaches that have been used to evaluate the quality of the recommendations. It then deals with the practical aspects of designing recommender systems: it describes design and implementation considerations and sets guidelines for the selection of the more suitable algorithms. The section continues by considering aspects that may affect the design and, finally, discusses methods, challenges and measures to be applied for the evaluation of the developed systems. The third part includes papers dealing with a number of issues related to the presentation, browsing, explanation and visualization of the recommendations, and techniques that make the recommendation process more structured and conversational.
The fourth part is fully dedicated to a rather new topic, which is however rooted in the core idea of a collaborative recommender, i.e., exploiting user generated content of various types to build new types and more credible recommendations. Finally, the last section collects a few papers on some advanced topics, such as the exploitation of active learning principles to guide the acquisition of new knowledge, techniques suitable for making a recommender system robust against attacks of malicious users, and recommender systems that aggregate multiple types of user feedback and preferences to build more reliable recommendations.
We would like to thank all authors for their valuable contributions. We would like to express gratitude to all reviewers who generously gave comments on drafts or counsel otherwise. We would like to express our special thanks to Susan Lagerstrom-Fife and staff members of Springer for their kind cooperation throughout the production of this book. Finally, we hope this handbook will contribute to the growth of this subject; we wish novices a fruitful learning path, and the more expert a compelling application of the ideas discussed in this handbook and a fruitful development of this challenging research area.
May 2010
Francesco Ricci
Lior Rokach
Bracha Shapira
Paul B. Kantor
1 Introduction to Recommender Systems Handbook . . . 1
Francesco Ricci, Lior Rokach and Bracha Shapira
1.1 Introduction . . . 1
1.2 Recommender Systems Function . . . 4
1.3 Data and Knowledge Sources . . . 7
1.4 Recommendation Techniques . . . 10
1.5 Application and Evaluation . . . 14
1.6 Recommender Systems and Human Computer Interaction . . . 17
1.6.1 Trust, Explanations and Persuasiveness . . . 18
1.6.2 Conversational Systems . . . 19
1.6.3 Visualization . . . 21
1.7 Recommender Systems as a Multi-Disciplinary Field . . . 21
1.8 Emerging Topics and Challenges . . . 23
1.8.1 Emerging Topics Discussed in the Handbook . . . 23
1.8.2 Challenges . . . 26
References . . . 29
Part I Basic Techniques
2 Data Mining Methods for Recommender Systems . . . 39
Xavier Amatriain, Alejandro Jaimes, Nuria Oliver, and Josep M. Pujol
2.1 Introduction . . . 39
2.2 Data Preprocessing . . . 40
2.2.1 Similarity Measures . . . 41
2.2.2 Sampling . . . 42
2.2.3 Reducing Dimensionality . . . 44
2.2.4 Denoising . . . 47
2.3 Classification . . . 48
2.3.1 Nearest Neighbors . . . 48
2.3.2 Decision Trees . . . 50
2.3.3 Rule-based Classifiers . . . 51
2.3.4 Bayesian Classifiers . . . 52
2.3.5 Artificial Neural Networks . . . 54
2.3.6 Support Vector Machines . . . 56
2.3.7 Ensembles of Classifiers . . . 58
2.3.8 Evaluating Classifiers . . . 59
2.4 Cluster Analysis . . . 61
2.4.1 k-Means . . . 62
2.4.2 Alternatives to k-means . . . 63
2.5 Association Rule Mining . . . 64
2.6 Conclusions . . . 66
References . . . 67
3 Content-based Recommender Systems: State of the Art and Trends . . . 73
Pasquale Lops, Marco de Gemmis and Giovanni Semeraro
3.1 Introduction . . . 74
3.2 Basics of Content-based Recommender Systems . . . 75
3.2.1 A High Level Architecture of Content-based Systems . . . 75
3.2.2 Advantages and Drawbacks of Content-based Filtering . . 78
3.3 State of the Art of Content-based Recommender Systems . . . 79
3.3.1 Item Representation . . . 80
3.3.2 Methods for Learning User Profiles . . . 90
3.4 Trends and Future Research . . . 94
3.4.1 The Role of User Generated Content in the Recommendation Process . . . 94
3.4.2 Beyond Over-specialization: Serendipity . . . 96
3.5 Conclusions . . . 99
References . . . 100
4 A Comprehensive Survey of Neighborhood-based Recommendation Methods . . . 107
Christian Desrosiers and George Karypis
4.1 Introduction . . . 107
4.1.1 Formal Definition of the Problem . . . 108
4.1.2 Overview of Recommendation Approaches . . . 110
4.1.3 Advantages of Neighborhood Approaches . . . 112
4.1.4 Objectives and Outline . . . 113
4.2 Neighborhood-based Recommendation . . . 114
4.2.1 User-based Rating Prediction . . . 115
4.2.2 User-based Classification . . . 116
4.2.3 Regression VS Classification . . . 117
4.2.4 Item-based Recommendation . . . 117
4.2.5 User-based VS Item-based Recommendation . . . 118
4.3 Components of Neighborhood Methods . . . 120
4.3.1 Rating Normalization . . . 121
4.3.2 Similarity Weight Computation . . . 124
4.3.3 Neighborhood Selection . . . 129
4.4 Advanced Techniques . . . 131
4.4.1 Dimensionality Reduction Methods . . . 132
4.4.2 Graph-based Methods . . . 135
4.5 Conclusion . . . 139
References . . . 140
5 Advances in Collaborative Filtering . . . 145
Yehuda Koren and Robert Bell
5.1 Introduction . . . 145
5.2 Preliminaries . . . 147
5.2.1 Baseline predictors . . . 148
5.2.2 The Netflix data . . . 149
5.2.3 Implicit feedback . . . 150
5.3 Matrix factorization models . . . 151
5.3.1 SVD . . . 151
5.3.2 SVD++ . . . 153
5.3.3 Time-aware factor model . . . 154
5.3.4 Comparison . . . 159
5.3.5 Summary . . . 160
5.4 Neighborhood models . . . 161
5.4.1 Similarity measures . . . 162
5.4.2 Similarity-based interpolation . . . 163
5.4.3 Jointly derived interpolation weights . . . 165
5.4.4 Summary . . . 168
5.5 Enriching neighborhood models . . . 168
5.5.1 A global neighborhood model . . . 169
5.5.2 A factorized neighborhood model . . . 173
5.5.3 Temporal dynamics at neighborhood models . . . 180
5.5.4 Summary . . . 182
5.6 Between neighborhood and factorization . . . 182
References . . . 184
6 Developing Constraint-based Recommenders . . . 187
Alexander Felfernig, Gerhard Friedrich, Dietmar Jannach and Markus Zanker
6.1 Introduction . . . 187
6.2 Development of Recommender Knowledge Bases . . . 191
6.3 User Guidance in Recommendation Processes . . . 194
6.4 Calculating Recommendations . . . 203
6.5 Experiences from Projects and Case Studies . . . 205
6.6 Future Research Issues . . . 207
6.7 Summary . . . 212
References . . . 212
7 Context-Aware Recommender Systems . . . 217
Gediminas Adomavicius and Alexander Tuzhilin
7.1 Introduction and Motivation . . . 218
7.2 Context in Recommender Systems . . . 219
7.2.1 What is Context? . . . 219
7.2.2 Modeling Contextual Information in Recommender Systems . . . 223
7.2.3 Obtaining Contextual Information . . . 228
7.3 Paradigms for Incorporating Context in Recommender Systems . . 230
7.3.1 Contextual Pre-Filtering . . . 233
7.3.2 Contextual Post-Filtering . . . 237
7.3.3 Contextual Modeling . . . 238
7.4 Combining Multiple Approaches . . . 243
7.4.1 Case Study of Combining Multiple Pre-Filters: Algorithms . . . 244
7.4.2 Case Study of Combining Multiple Pre-Filters: Experimental Results . . . 245
7.5 Additional Issues in Context-Aware Recommender Systems . . . 247
7.6 Conclusions . . . 249
References . . . 250
Part II Applications and Evaluation of RSs
8 Evaluating Recommendation Systems . . . 257
Guy Shani and Asela Gunawardana
8.1 Introduction . . . 258
8.2 Experimental Settings . . . 260
8.2.1 Offline Experiments . . . 261
8.2.2 User Studies . . . 263
8.2.3 Online Evaluation . . . 266
8.2.4 Drawing Reliable Conclusions . . . 267
8.3 Recommendation System Properties . . . 271
8.3.1 User Preference . . . 272
8.3.2 Prediction Accuracy . . . 273
8.3.3 Coverage . . . 281
8.3.4 Confidence . . . 283
8.3.5 Trust . . . 285
8.3.6 Novelty . . . 285
8.3.7 Serendipity . . . 286
8.3.8 Diversity . . . 288
8.3.9 Utility . . . 289
8.3.10 Risk . . . 290
8.3.11 Robustness . . . 290
8.3.12 Privacy . . . 291
8.3.13 Adaptivity . . . 292
8.3.14 Scalability . . . 293
8.4 Conclusion . . . 293
References . . . 294
9 A Recommender System for an IPTV Service Provider: a Real Large-Scale Production Environment . . . 299
Riccardo Bambini, Paolo Cremonesi and Roberto Turrin
9.1 Introduction . . . 299
9.2 IPTV Architecture . . . 301
9.2.1 IPTV Search Problems . . . 302
9.3 Recommender System Architecture . . . 303
9.3.1 Data Collection . . . 304
9.3.2 Batch and Real-Time Stages . . . 306
9.4 Recommender Algorithms . . . 308
9.4.1 Overview of Recommender Algorithms . . . 308
9.4.2 LSA Content-Based Algorithm . . . 311
9.4.3 Item-based Collaborative Algorithm . . . 314
9.4.4 Dimensionality-Reduction-Based Collaborative Algorithm . . . 316
9.5 Recommender Services . . . 318
9.6 System Evaluation . . . 319
9.6.1 Off-Line Analysis . . . 321
9.6.2 On-line Analysis . . . 325
9.7 Conclusions . . . 329
References . . . 329
10 How to Get the Recommender Out of the Lab? . . . 333
Jérôme Picault, Myriam Ribière, David Bonnefoy and Kevin Mercer
10.1 Introduction . . . 334
10.2 Designing Real-World Recommender Systems . . . 334
10.3 Understanding the Recommender Environment . . . 335
10.3.1 Application Model . . . 335
10.3.2 User Model . . . 340
10.3.3 Data Model . . . 344
10.3.4 A Method for Using Environment Models . . . 349
10.4 Understanding the Recommender Validation Steps in an Iterative Design Process . . . 350
10.4.1 Validation of the Algorithms . . . 350
10.4.2 Validation of the Recommendations . . . 351
10.5 Use Case: a Semantic News Recommendation System . . . 355
10.5.1 Context: MESH Project . . . 356
10.5.2 Environmental Models in MESH . . . 357
10.5.3 In Practice: Iterative Instantiations of Models . . . 361
10.6 Conclusion . . . 362
References . . . 362
11 Matching Recommendation Technologies and Domains . . . 367
Robin Burke and Maryam Ramezani
11.1 Introduction . . . 367
11.2 Related Work . . . 368
11.3 Knowledge Sources . . . 368
11.3.1 Recommendation types . . . 370
11.4 Domain . . . 372
11.4.1 Heterogeneity . . . 372
11.4.2 Risk . . . 373
11.4.3 Churn . . . 373
11.4.4 Interaction Style . . . 374
11.4.5 Preference stability . . . 374
11.4.6 Scrutability . . . 375
11.5 Knowledge Sources . . . 375
11.5.1 Social Knowledge . . . 375
11.5.2 Individual . . . 376
11.5.3 Content . . . 377
11.6 Mapping Domains to Technologies . . . 378
11.6.1 Algorithms . . . 380
11.6.2 Sample Recommendation Domains . . . 381
11.7 Conclusion . . . 382
References . . . 382
12 Recommender Systems in Technology Enhanced Learning . . . 387
Nikos Manouselis, Hendrik Drachsler, Riina Vuorikari, Hans Hummel and Rob Koper
12.1 Introduction . . . 388
12.2 Background . . . 389
12.3 Related Work . . . 392
12.4 Survey of TEL Recommender Systems . . . 399
12.5 Evaluation of TEL Recommenders . . . 404
12.6 Conclusions and further work . . . 408
References . . . 409
Part III Interacting with Recommender Systems
13 On the Evolution of Critiquing Recommenders . . . 419
Lorraine McGinty and James Reilly
13.1 Introduction . . . 419
13.2 The Early Days: Critiquing Systems/Recognised Benefits . . . 420
13.3 Representation & Retrieval Challenges for Critiquing Systems . . . 422
13.3.1 Approaches to Critique Representation . . . 422
13.3.2 Retrieval Challenges in Critique-Based Recommenders . . 430
13.4 Interfacing Considerations Across Critiquing Platforms . . . 438
13.4.1 Scaling to Alternate Critiquing Platforms . . . 438
13.4.2 Direct Manipulation Interfaces vs Restricted User Control . . . 440
13.4.3 Supporting Explanation, Confidence & Trust . . . 441
13.4.4 Visualisation, Adaptivity, and Partitioned Dynamicity . . . 443
13.4.5 Respecting Multi-cultural Usability Differences . . . 445
13.5 Evaluating Critiquing: Resources, Methodologies and Criteria . . . . 445
13.5.1 Resources & Methodologies . . . 446
13.5.2 Evaluation Criteria . . . 446
13.6 Conclusion / Open Challenges & Opportunities . . . 448
References . . . 449
14 Creating More Credible and Persuasive Recommender Systems: The Influence of Source Characteristics on Recommender System Evaluations . . . 455
Kyung-Hyan Yoo and Ulrike Gretzel
14.1 Introduction . . . 455
14.2 Recommender Systems as Social Actors . . . 456
14.3 Source Credibility . . . 457
14.3.1 Trustworthiness . . . 458
14.3.2 Expertise . . . 458
14.3.3 Influences on Source Credibility . . . 458
14.4 Source Characteristics Studied in Human-Human Interactions . . . 459
14.4.1 Similarity . . . 459
14.4.2 Likeability . . . 460
14.4.3 Symbols of Authority . . . 460
14.4.4 Styles of Speech . . . 461
14.4.5 Physical Attractiveness . . . 461
14.4.6 Humor . . . 461
14.5 Source Characteristics in Human-Computer Interactions . . . 462
14.6 Source Characteristics in Human-Recommender System Interactions . . . 463
14.6.1 Recommender system type . . . 463
14.6.2 Input characteristics . . . 464
14.6.3 Process characteristics . . . 465
14.6.4 Output characteristics . . . 465
14.6.5 Characteristics of embodied agents . . . 467
14.7 Discussion . . . 468
14.8 Implications . . . 468
14.9 Directions for future research . . . 470
References . . . 471
15 Designing and Evaluating Explanations for Recommender Systems . . . 479
Nava Tintarev and Judith Masthoff
15.1 Introduction . . . 479
15.2 Guidelines . . . 481
15.3 Explanations in Expert Systems . . . 481
15.4 Defining Goals . . . 482
15.4.1 Explain How the System Works: Transparency . . . 483
15.4.2 Allow Users to Tell the System it is Wrong: Scrutability . . . 485
15.4.3 Increase Users’ Confidence in the System: Trust . . . 485
15.4.4 Convince Users to Try or Buy: Persuasiveness . . . 487
15.4.5 Help Users Make Good Decisions: Effectiveness . . . 488
15.4.6 Help Users Make Decisions Faster: Efficiency . . . 490
15.4.7 Make the use of the system enjoyable: Satisfaction . . . 491
15.5 Evaluating the Impact of Explanations on the Recommender System . . . 492
15.5.1 Accuracy Metrics . . . 493
15.5.2 Learning Rate . . . 493
15.5.3 Coverage . . . 494
15.5.4 Acceptance . . . 494
15.6 Designing the Presentation and Interaction with Recommendations . . . 495
15.6.1 Presenting Recommendations . . . 495
15.6.2 Interacting with the Recommender System . . . 496
15.7 Explanation Styles . . . 497
15.7.1 Collaborative-Based Style Explanations . . . 500
15.7.2 Content-Based Style Explanation . . . 501
15.7.3 Case-Based Reasoning (CBR) Style Explanations . . . 503
15.7.4 Knowledge and Utility-Based Style Explanations . . . 504
15.7.5 Demographic Style Explanations . . . 505
15.8 Summary and future directions . . . 505
References . . . 507
16 Usability Guidelines for Product Recommenders Based on Example Critiquing Research . . . 511
Pearl Pu, Boi Faltings, Li Chen, Jiyong Zhang and Paolo Viappiani
16.1 Introduction . . . 512
16.2 Preliminaries . . . 513
16.2.1 Interaction Model . . . 513
16.2.2 Utility-Based Recommenders . . . 515
16.2.3 The Accuracy, Confidence, Effort Framework . . . 517
16.2.4 Organization of this Chapter . . . 518
16.3 Related Work . . . 518
16.3.1 Types of Recommenders . . . 518
16.3.2 Rating-based Systems . . . 519
16.3.3 Case-based Systems . . . 519
16.3.4 Utility-based Systems . . . 519
16.3.5 Critiquing-based Systems . . . 520
16.3.6 Other Design Guidelines . . . 520
16.4 Initial Preference Elicitation . . . 521
16.5 Stimulating Preference Expression with Examples . . . 525
16.5.1 How Many Examples to Show . . . 527
16.5.2 What Examples to Show . . . 527
16.6 Preference Revision . . . 530
16.6.1 Preference Conflicts and Partial Satisfaction . . . 531
16.6.2 Tradeoff Assistance . . . 532
16.7 Display Strategies . . . 534
16.7.1 Recommending One Item at a Time . . . 534
16.7.2 Recommending K best Items . . . 535
16.7.3 Explanation Interfaces . . . 536
16.8 A Model for Rationalizing the Guidelines . . . 537
16.9 Conclusion . . . 541
References . . . 541
17 Map Based Visualization of Product Catalogs . . . 547
Martijn Kagie, Michiel van Wezel and Patrick J.F. Groenen
17.1 Introduction . . . 547
17.2 Methods for Map Based Visualization . . . 549
17.2.1 Self-Organizing Maps . . . 550
17.2.2 Treemaps . . . 551
17.2.3 Multidimensional Scaling . . . 553
17.2.4 Nonlinear Principal Components Analysis . . . 553
17.3 Product Catalog Maps . . . 554
17.3.1 Multidimensional Scaling . . . 555
17.3.2 Nonlinear Principal Components Analysis . . . 558
17.4 Determining Attribute Weights using Clickstream Analysis . . . 559
17.4.1 Poisson Regression Model . . . 560
17.4.2 Handling Missing Values . . . 560
17.4.3 Choosing Weights Using Poisson Regression . . . 561
17.4.4 Stepwise Poisson Regression Model . . . 562
17.5 Graphical Shopping Interface . . . 562
17.6 E-Commerce Applications . . . 563
17.6.1 MDS Based Product Catalog Map Using Attribute Weights . . . 564
17.6.2 NL-PCA Based Product Catalog Map . . . 568
17.6.3 Graphical Shopping Interface . . . 570
17.7 Conclusions and Outlook . . . 573
References . . . 574
Part IV Recommender Systems and Communities
18 Communities, Collaboration, and Recommender Systems in Personalized Web Search . . . 579
Barry Smyth, Maurice Coyle and Peter Briggs
18.1 Introduction . . . 579
18.2 A Brief History of Web Search . . . 581
18.3 The Future of Web Search . . . 583
18.3.1 Personalized Web Search . . . 584
18.3.2 Collaborative Information Retrieval . . . 588
18.3.3 Towards Social Search . . . 590
18.4 Case-Study 1 - Community-Based Web Search . . . 591
18.4.1 Repetition and Regularity in Search Communities . . . 592
18.4.2 The Collaborative Web Search System . . . 593
18.4.3 Evaluation . . . 596
18.4.4 Discussion . . . 598
18.5 Case-Study 2 - Web Search. Shared. . . 598
18.5.1 The HeyStaks System . . . 599
18.5.2 The HeyStaks Recommendation Engine . . . 602
18.5.3 Evaluation . . . 604
18.5.4 Discussion . . . 607
18.6 Conclusions . . . 607
References . . . 609
19 Social Tagging Recommender Systems . . . 615
Leandro Balby Marinho, Alexandros Nanopoulos, Lars Schmidt-Thieme, Robert Jäschke, Andreas Hotho, Gerd Stumme and Panagiotis Symeonidis
19.1 Introduction . . . 616
19.2 Social Tagging Recommender Systems . . . 617
19.2.1 Folksonomy . . . 618
19.2.2 The Traditional Recommender Systems Paradigm . . . 619
19.2.3 Multi-mode Recommendations . . . 620
19.3 Real World Social Tagging Recommender Systems . . . 621
19.3.1 What are the Challenges? . . . 621
19.3.2 BibSonomy as Study Case . . . 622
19.3.3 Tag Acquisition . . . 624
19.4 Recommendation Algorithms for Social Tagging Systems . . . 626
19.4.1 Collaborative Filtering . . . 626
19.4.2 Recommendation based on Ranking . . . 630
19.4.3 Content-Based Social Tagging RS . . . 634
19.4.4 Evaluation Protocols and Metrics . . . 637
19.5 Comparison of Algorithms . . . 639
19.6 Conclusions and Research Directions . . . 640
References . . . 642
20 Trust and Recommendations . . . 645
Patricia Victor, Martine De Cock, and Chris Cornelis
20.1 Introduction . . . 645
20.2 Computational Trust . . . 647
20.2.1 Trust Representation . . . 648
20.2.2 Trust Computation . . . 650
20.3 Trust-Enhanced Recommender Systems . . . 655
20.3.1 Motivation . . . 656
20.3.2 State of the Art . . . 658
20.3.3 Empirical Comparison . . . 664
20.4 Recent Developments and Open Challenges . . . 670
20.5 Conclusions . . . 672
References . . . 672
21 Group Recommender Systems: Combining Individual Models . . . 677
Judith Masthoff
21.1 Introduction . . . 677
21.2 Usage Scenarios and Classification of Group Recommenders . . . 679
21.2.1 Interactive Television . . . 679
21.2.2 Ambient Intelligence . . . 679
21.2.3 Scenarios Underlying Related Work . . . 680
21.2.4 A Classification of Group Recommenders . . . 681
21.3 Aggregation Strategies . . . 682
21.3.1 Overview of Aggregation Strategies . . . 682
21.3.2 Aggregation Strategies Used in Related Work . . . 683
21.3.3 Which Strategy Performs Best . . . 685
21.4 Impact of Sequence Order . . . 686
21.5 Modelling Affective State . . . 688
21.5.1 Modelling an Individual’s Satisfaction on its Own . . . 689
21.5.2 Effects of the Group on an Individual’s Satisfaction . . . 690
21.6 Using Affective State inside Aggregation Strategies . . . 691
21.7 Applying Group Recommendation to Individual Users . . . 693
21.7.1 Multiple Criteria . . . 693
21.7.2 Cold-Start Problem . . . 695
21.7.3 Virtual Group Members . . . 697
21.8 Conclusions and Challenges . . . 697
21.8.1 Main Issues Raised . . . 697
21.8.2 Caveat: Group Modelling . . . 698
21.8.3 Challenges . . . 698
References . . . 701
Part V Advanced Algorithms
22 Aggregation of Preferences in Recommender Systems . . . 705
Gleb Beliakov, Tomasa Calvo and Simon James
22.1 Introduction . . . 705
22.2 Types of Aggregation in Recommender Systems . . . 706
22.2.1 Aggregation of Preferences in CF . . . 708
22.2.2 Aggregation of Features in CB and UB Recommendation . . . 708
22.2.3 Profile Construction for CB, UB . . . 709
22.2.4 Item and User Similarity and Neighborhood Formation . . 709
22.2.5 Connectives in Case-Based Reasoning for RS . . . 711
22.2.6 Weighted Hybrid Systems . . . 711
22.3 Review of Aggregation Functions . . . 712
22.3.1 Definitions and Properties . . . 712
22.3.2 Aggregation Families . . . 716
22.4 Construction of Aggregation Functions . . . 722
22.4.1 Data Collection and Preprocessing . . . 722
22.4.2 Desired Properties, Semantics and Interpretation . . . 724
22.4.3 Complexity and the Understanding of Function Behavior . . . 725
22.4.4 Weight and Parameter Determination . . . 726
22.5 Sophisticated Aggregation Procedures in Recommender Systems: Tailoring for Specific Applications . . . 726
22.6 Conclusions . . . 731
22.7 Further Reading . . . 732
References . . . 733
23 Active Learning in Recommender Systems . . . 735
Neil Rubens, Dain Kaplan, and Masashi Sugiyama
23.1 Introduction . . . 735
23.1.1 Objectives of Active Learning in Recommender Systems . . . 737
23.1.2 An Illustrative Example . . . 738
23.1.3 Types of Active Learning . . . 739
23.1.3 Types of Active Learning . . . 739
23.2 Properties of Data Points . . . 740
23.2.1 Other Considerations . . . 741
23.3 Active Learning in Recommender Systems . . . 742
23.3.1 Method Summary Matrix . . . 742
23.4 Active Learning Formulation . . . 742
23.5 Uncertainty-based Active Learning . . . 746
23.5.1 Output Uncertainty . . . 746
23.5.2 Decision Boundary Uncertainty . . . 748
23.5.3 Model Uncertainty . . . 749
23.6 Error-based Active Learning . . . 751
23.6.1 Instance-based Methods . . . 752
23.6.2 Model-based . . . 754
23.7 Ensemble-based Active Learning . . . 756
23.7.1 Models-based . . . 756
23.7.2 Candidates-based . . . 757
23.8 Conversation-based Active Learning . . . 760
23.8.1 Case-based Critique . . . 761
23.8.2 Diversity-based . . . 761
23.8.3 Query Editing-based . . . 762
23.9 Computational Considerations . . . 762
23.10 Discussion . . . 763
References . . . 764
24 Multi-Criteria Recommender Systems . . . 769
Gediminas Adomavicius, Nikos Manouselis and YoungOk Kwon
24.1 Introduction . . . 769
24.2 Recommendation as a Multi-Criteria Decision Making Problem . . . 771
24.2.1 Object of Decision . . . 772
24.2.2 Family of Criteria . . . 773
24.2.3 Global Preference Model . . . 774
24.2.4 Decision Support Process . . . 775
24.3 MCDM Framework for Recommender Systems: Lessons Learned . . . 776
24.4 Multi-Criteria Rating Recommendation . . . 780
24.4.1 Traditional single-rating recommendation problem . . . 781
24.4.2 Extending traditional recommender systems to include multi-criteria ratings . . . 782
24.5 Survey of Algorithms for Multi-Criteria Rating Recommenders . . . 783
24.5.1 Engaging Multi-Criteria Ratings during Prediction . . . 784
24.5.2 Engaging Multi-Criteria Ratings during Recommendation . . . 791
24.6 Discussion and Future Work . . . 795
24.7 Conclusions . . . 797
References . . . 798
25 Robust Collaborative Recommendation . . . 805
Robin Burke, Michael P. O’Mahony and Neil J. Hurley
25.1 Introduction . . . 805
25.2 Defining the Problem . . . 807
25.2.1 An Example Attack . . . 809
25.3 Characterising Attacks . . . 810
25.3.1 Basic Attacks . . . 810
25.3.2 Low-knowledge attacks . . . 811
25.3.3 Nuke Attack Models . . . 812
25.3.4 Informed Attack Models . . . 813
25.4 Measuring Robustness . . . 814
25.4.1 Evaluation Metrics . . . 815
25.4.2 Push Attacks . . . 816
25.4.3 Nuke Attacks . . . 818
25.4.4 Informed Attacks . . . 819
25.4.5 Attack impact . . . 820
25.5 Attack Detection . . . 820
25.5.1 Evaluation Metrics . . . 821
25.5.2 Single Profile Detection . . . 822
25.5.3 Group Profile Detection . . . 824
25.5.4 Detection findings . . . 827
25.6 Robust Algorithms . . . 828
25.6.1 Model-based Recommendation . . . 828
25.6.2 Robust Matrix Factorisation (RMF) . . . 829
25.6.3 Other Robust Recommendation Algorithms . . . 830
25.6.4 The Influence Limiter and Trust-based Recommendation . . . 831
25.7 Conclusion . . . 832
References . . . 833
Index . . . 837
Gediminas Adomavicius
Department of Information and Decision Sciences
Carlson School of Management, University of Minnesota, Minneapolis, MN 55455, USA
e-mail: [email protected] Xavier Amatriain
Telefonica Research, Via Augusta, 122, Barcelona 08021, Spain e-mail: [email protected]
Riccardo Bambini
Fastweb, via Francesco Caracciolo 51, Milano, Italy e-mail: [email protected]
Gleb Beliakov
School of Information Technology, Deakin University, 221 Burwood Hwy, Burwood 3125, Australia,
e-mail: [email protected] Robert Bell
AT&T Labs – Research e-mail: [email protected] David Bonnefoy
Pearltrees,
e-mail: [email protected] Peter Briggs
CLARITY: Centre for Sensor Web Technologies, School of Computer Science &
Informatics, University College Dublin, Ireland, e-mail: [email protected]
Robin Burke
Center for Web Intelligence, School of Computer Science, Telecommunication and
xxiii
Information Systems, DePaul University, Chicago, Illinois, USA e-mail: [email protected]
Tomasa Calvo
Departamento de Ciencias de la Computaci´on, Universidad de Alcal´a 28871-Alcal´a de Henares (Madrid), Spain.
e-mail: [email protected] Li Chen
Human Computer Interaction Group, School of Computer and Communication Sciences,
Swiss Federal Institute of Technology in Lausanne (EPFL), CH-1015, Lausanne, Switzerland
e-mail: [email protected] Martine De Cock
Institute of Technology, University of Washington Tacoma, 1900 Pacific Ave, Tacoma, WA, USA (on leave from Ghent University)
e-mail: [email protected] Chris Cornelis
Dept. of Applied Mathematics and Computer Science, Ghent University, Krijgslaan 281 (S9), 9000 Gent, Belgium
e-mail: [email protected] Maurice Coyle
CLARITY: Centre for Sensor Web Technologies, School of Computer Science &
Informatics, University College Dublin, Ireland, e-mail: [email protected]
Paolo Cremonesi
Politecnico di Milano, p.zza Leonardo da Vinci 32, Milano, Italy Neptuny, via Durando 10, Milano, Italy
e-mail: [email protected] Christian Desrosiers
Department of oftware Engineering and I ,T Ecole de Technologie Superieure,´ ´ Montreal,
e-mail: [email protected] Hendrik Drachsler
Centre for Learning Sciences and Technologies (CELSTEC), Open Universiteit Nederland
e-mail: [email protected] Boi Faltings
Artificial Intelligence Laboratory, School of Computer and Communication Sciences
Swiss Federal Institute of Technology in Lausanne (EPFL), CH-1015, Lausanne, Switzerland
S Canada
e-mail: [email protected]
Alexander Felfernig
Graz University of Technology
e-mail: [email protected] Gerhard Friedrich
University Klagenfurt
e-mail: [email protected] Marco de Gemmis
Department of Computer Science, University of Bari “Aldo Moro”, Via E. Orabona, 4, Bari (Italy)
e-mail: [email protected] Ulrike Gretzel
Texas A&M University, 2261 TAMU, College Station, TX, USA, e-mail: [email protected]
Patrick J.F. Groenen
Econometric Institute, Erasmus University Rotterdam, The Netherlands, e-mail: [email protected]
Asela Gunawardana
Microsoft Research, One Microsoft Way, Redmond, WA, e-mail: [email protected]
Andreas Hotho
Knowledge & Data Engineering Group (KDE), University of Kassel, Wilhelmsh¨o, her Allee 73, 34121 Kassel, Germany,
e-mail: [email protected] Hans Hummel
Centre for Learning Sciences and Technologies (CELSTEC), Open Universiteit Nederland
e-mail: [email protected] Neil J. Hurley
School of Computer Science and Informatics, University College Dublin, Ireland e-mail: [email protected]
Robert Jäschke
Knowledge & Data Engineering Group (KDE), University of Kassel, Wilhelmshöher Allee 73, 34121 Kassel, Germany,
e-mail: [email protected] Alejandro Jaimes
Yahoo! Research, Av.Diagonal, 177, Barcelona 08018, Spain e-mail: [email protected]
School of Information Technology, Deakin University, 221 Burwood Hwy, Burwood 3125, Australia,
e-mail: [email protected] Dietmar Jannach
TU Dortmund
e-mail: [email protected] Martijn Kagie
Econometric Institute, Erasmus University Rotterdam, The Netherlands, e-mail: [email protected]
Dain Kaplan
Tokyo Institute of Technology, Tokyo, Japan e-mail: [email protected]
Minneapolis, USA
e-mail: [email protected] Rob Koper
Centre for Learning Sciences and Technologies (CELSTEC), Open Universiteit Nederland
e-mail: [email protected] Yehuda Koren
Yahoo! Research,
e-mail: [email protected] YoungOk Kwon
Department of Information and Decision Sciences
Carlson School of Management, University of Minnesota, Minneapolis, MN 55455, USA
e-mail: [email protected] Pasquale Lops
Department of Computer Science, University of Bari “Aldo Moro”, Via E. Orabona, 4, Bari (Italy)
e-mail: [email protected] Nikos Manouselis
Greek Research and Technology Network (GRNET S.A.) 56 Messogeion Av., 115 27, Athens, Greece
e-mail: [email protected] Leandro Balby Marinho
Information Systems and Machine Learning Lab (ISMLL), University of Hildesheim, Marienburger Platz 22, 31141 Hildesheim, Germany,
e-mail: [email protected] George Karypis
Computer Science & Engineering, University of Minnesota, Simon James
Department of
Judith Masthoff
University of Aberdeen, AB24 3UE Aberdeen UK, e-mail: [email protected]
Lorraine McGinty
UCD School of Computer Science and Informatics, University College Dublin, Dublin 4, Ireland.
e-mail: [email protected] Kevin Mercer
Loughborough University, e-mail: [email protected] Alexandros Nanopoulos
Information Systems and Machine Learning Lab (ISMLL), University of Hildesheim, Marienburger Platz 22, 31141 Hildesheim, Germany,
e-mail: [email protected] Michael P. O’Mahony
CLARITY: Centre for Sensor Web Technologies, School of Computer Science and Informatics, University College Dublin, Ireland
e-mail: [email protected] Nuria Oliver
Telefonica Research, Via Augusta, 122, Barcelona 08021, Spain e-mail: [email protected]
Jérôme Picault
Alcatel-Lucent Bell Labs,
e-mail: [email protected] Pearl Pu
Human Computer Interaction Group, School of Computer and Communication Sciences,
Swiss Federal Institute of Technology in Lausanne (EPFL), CH-1015, Lausanne, Switzerland
e-mail: pearl.pu, li.chen, [email protected] Josep M. Pujol
Telefonica Research, Via Augusta, 122, Barcelona 08021, Spain e-mail: [email protected]
Maryam Ramezani
Center for Web Intelligence, College of Computing and Digital Media, 243 S.
Wabash Ave., DePaul University, Chicago, Illinois, USA e-mail: [email protected]
James Reilly
Google Inc., 5 Cambridge Center, Cambridge, MA 02142, United States.
e-mail: [email protected]
Myriam Ribière
Alcatel-Lucent Bell Labs,
e-mail: [email protected] Francesco Ricci
Faculty of Computer Science, Free University of Bozen-Bolzano, Italy e-mail: [email protected]
Lior Rokach
Department of Information Systems Engineering, Ben-Gurion University of the Negev, Israel
e-mail: [email protected] Neil Rubens
University of Electro-Communications, Tokyo, Japan, e-mail: [email protected]
Lars Schmidt-Thieme
Information Systems and Machine Learning Lab (ISMLL), University of Hildesheim, Marienburger Platz 22, 31141 Hildesheim, Germany,
e-mail: [email protected] Giovanni Semeraro
Department of Computer Science, University of Bari “Aldo Moro”, Via E. Orabona, 4, Bari (Italy)
e-mail: [email protected] Guy Shani
Bracha Shapira
Department of Information Systems Engineering, Ben-Gurion University of the Negev, Israel
e-mail: [email protected] Barry Smyth
CLARITY: Centre for Sensor Web Technologies, School of Computer Science &
Informatics, University College Dublin, Ireland, e-mail: [email protected]
Gerd Stumme
Knowledge & Data Engineering Group (KDE), University of Kassel, Wilhelmshöher Allee 73, 34121 Kassel, Germany,
e-mail: [email protected] Masashi Sugiyama
Tokyo Institute of Technology, Tokyo, Japan e-mail: [email protected]
Department of Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
e-mail: [email protected]
Department of Informatics, Aristotle University, 54124 Thessaloniki, Greece, e-mail: [email protected]
Nava Tintarev
University of Aberdeen, Aberdeen, U.K, e-mail: [email protected]
Roberto Turrin
Politecnico di Milano, p.zza Leonardo da Vinci 32, Milano, Italy Neptuny, via Durando 10, Milano, Italy
e-mail: [email protected] Alexander Tuzhilin
Department of Information, Operations and Management Sciences Stern School of Business, New York University
e-mail: [email protected] Paolo Viappiani
Department of Computer Science, University of Toronto, 6 King’s College Road, M5S3G4, Toronto, ON, CANADA
e-mail: [email protected] Patricia Victor
Dept. of Applied Mathematics and Computer Science, Ghent University, Krijgslaan 281 (S9), 9000 Gent, Belgium
e-mail: [email protected] Riina Vuorikari
European Schoolnet (EUN), 24, Rue Paul Emile Janson, 1050 Brussels, Belgium e-mail: [email protected]
Michiel van Wezel
Econometric Institute, Erasmus University Rotterdam, The Netherlands, e-mail: [email protected]
Kyung-Hyan Yoo
William Paterson University, Communication Department, 300 Pompton Road, Wayne, NJ, USA,
e-mail: [email protected] Markus Zanker
University Klagenfurt
e-mail: [email protected] Jiyong Zhang
Human Computer Interaction Group, School of Computer and Communication Sciences,
Swiss Federal Institute of Technology in Lausanne (EPFL), CH-1015, Lausanne, Switzerland
e-mail: [email protected] Panagiotis Symeonidis
Introduction to Recommender Systems Handbook
Francesco Ricci, Lior Rokach and Bracha Shapira
Abstract Recommender Systems (RSs) are software tools and techniques providing suggestions for items to be of use to a user. In this introductory chapter we briefly discuss basic RS ideas and concepts. Our main goal is to delineate, in a coherent and structured way, the chapters included in this handbook and to help the reader navigate the extremely rich and detailed content that the handbook offers.
1.1 Introduction
Recommender Systems (RSs) are software tools and techniques providing suggestions for items to be of use to a user [60, 85, 25]. The suggestions relate to various decision-making processes, such as what items to buy, what music to listen to, or what online news to read.
“Item” is the general term used to denote what the system recommends to users.
A RS normally focuses on a specific type of item (e.g., CDs, or news) and accordingly its design, its graphical user interface, and the core recommendation technique used to generate the recommendations are all customized to provide useful and effective suggestions for that specific type of item.
RSs are primarily directed towards individuals who lack sufficient personal experience or competence to evaluate the potentially overwhelming number of alternative items that a Web site, for example, may offer [85]. A case in point is a book recommender system that assists users in selecting a book to read. The popular Web site Amazon.com, for example, employs a RS to personalize the online store for each customer [47]. Since recommendations are usually personalized, different users or user groups receive diverse suggestions. In addition there are also non-personalized recommendations. These are much simpler to generate and are normally featured in magazines or newspapers. Typical examples include the top ten selections of books, CDs etc. While they may be useful and effective in certain situations, these types of non-personalized recommendations are not typically addressed by RS research.
Francesco Ricci
Faculty of Computer Science, Free University of Bozen-Bolzano, Italy e-mail: fricci@unibz.it
Lior Rokach
Department of Information Systems Engineering, Ben-Gurion University of the Negev, Israel e-mail: [email protected]
Bracha Shapira
Department of Information Systems Engineering, Ben-Gurion University of the Negev, Israel e-mail: [email protected]
F. Ricci et al. (eds.), Recommender Systems Handbook, DOI 10.1007/978-0-387-85820-3_1, © Springer Science+Business Media, LLC 2011
In their simplest form, personalized recommendations are offered as ranked lists of items. In performing this ranking, RSs try to predict what the most suitable products or services are, based on the user’s preferences and constraints. In order to complete such a computational task, RSs collect from users their preferences, which are either explicitly expressed, e.g., as ratings for products, or are inferred by interpreting user actions. For instance, a RS may consider the navigation to a particular product page as an implicit sign of preference for the items shown on that page.
The development of RSs originated from a rather simple observation: individuals often rely on recommendations provided by others in making routine, daily decisions [60, 70]. For example, it is common to rely on what one’s peers recommend when selecting a book to read; employers count on recommendation letters in their recruiting decisions; and when selecting a movie to watch, individuals tend to read and rely on the movie reviews that a film critic has written and which appear in the newspaper they read.
In seeking to mimic this behavior, the first RSs applied algorithms to leverage recommendations produced by a community of users to deliver recommendations to an active user, i.e., a user looking for suggestions. The recommendations were for items that similar users (those with similar tastes) had liked. This approach is termed collaborative-filtering and its rationale is that if the active user agreed in the past with some users, then the other recommendations coming from these similar users should be relevant as well and of interest to the active user.
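The rationale just described can be illustrated with a minimal sketch of user-based collaborative filtering. It is only an illustration under stated assumptions, not the algorithm of any system discussed in this handbook: the toy `ratings` data, the function names `cosine_similarity` and `recommend`, and the choice of cosine similarity over co-rated items are all hypothetical.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"book_a": 5, "book_b": 4, "book_c": 1},
    "bob":   {"book_a": 5, "book_b": 5, "book_d": 5, "book_e": 2},
    "carol": {"book_a": 1, "book_c": 5, "book_d": 1, "book_e": 5},
}

def cosine_similarity(r1, r2):
    """Cosine similarity computed over the items both users have rated."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in common)
    norm1 = sqrt(sum(r1[i] ** 2 for i in common))
    norm2 = sqrt(sum(r2[i] ** 2 for i in common))
    return dot / (norm1 * norm2)

def recommend(active_user, ratings, top_n=3):
    """Rank items the active user has not rated by the
    similarity-weighted average rating of the other users."""
    active = ratings[active_user]
    weighted_sum, weight_total = {}, {}
    for other, other_ratings in ratings.items():
        if other == active_user:
            continue
        sim = cosine_similarity(active, other_ratings)
        if sim <= 0:
            continue  # users with no overlap contribute nothing
        for item, rating in other_ratings.items():
            if item in active:
                continue  # only suggest not-yet-rated items
            weighted_sum[item] = weighted_sum.get(item, 0.0) + sim * rating
            weight_total[item] = weight_total.get(item, 0.0) + sim
    scores = {i: weighted_sum[i] / weight_total[i] for i in weighted_sum}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("alice", ratings))
```

In this toy data the active user "alice" agreed in the past with "bob" far more than with "carol", so the item that "bob" liked is ranked ahead of the item that "carol" liked.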
As e-commerce Web sites began to develop, a pressing need emerged for providing recommendations derived from filtering the whole range of available alternatives. Users were finding it very difficult to arrive at the most appropriate choices from the immense variety of items (products and services) that these Web sites were offering.
The explosive growth and variety of information available on the Web and the rapid introduction of new e-business services (buying products, product comparison, auction, etc.) frequently overwhelmed users, leading them to make poor decisions. The availability of choices, instead of producing a benefit, started to decrease users’ well-being. It was understood that while choice is good, more choice is not always better. Indeed, choice, with its implications of freedom, autonomy, and self-determination can become excessive, creating a sense that freedom may come to be regarded as a kind of misery-inducing tyranny [96].
RSs have proved in recent years to be a valuable means for coping with the information overload problem. Ultimately a RS addresses this phenomenon by pointing a user towards new, not-yet-experienced items that may be relevant to the user’s current task. Upon a user’s request, which can be articulated, depending on the recommendation approach, by the user’s context and need, RSs generate recommendations using various types of knowledge and data about users, the available items, and previous transactions stored in customized databases. The user can then browse the recommendations. She may accept them or not and may provide, immediately or at a later stage, implicit or explicit feedback. All these user actions and feedback can be stored in the recommender database and may be used for generating new recommendations in subsequent user-system interactions.
As noted above, the study of recommender systems is relatively new compared to research into other classical information system tools and techniques (e.g., databases or search engines). Recommender systems emerged as an independent research area in the mid-1990s [35, 60, 70, 7]. In recent years, the interest in recommender systems has dramatically increased, as the following facts indicate:
1. Recommender systems play an important role in such highly rated Internet sites as Amazon.com, YouTube, Netflix, Yahoo, Tripadvisor, Last.fm, and IMDb. Moreover many media companies are now developing and deploying RSs as part of the services they provide to their subscribers. For example Netflix, the online movie rental service, awarded a million dollar prize to the team that first succeeded in improving substantially the performance of its recommender system [54].
2. There are dedicated conferences and workshops related to the field. We refer specifically to ACM Recommender Systems (RecSys), established in 2007 and now the premier annual event in recommender technology research and applications. In addition, sessions dedicated to RSs are frequently included in the more traditional conferences in the area of databases, information systems and adaptive systems. Among these conferences, worth mentioning are ACM’s Special Interest Group on Information Retrieval (SIGIR), User Modeling, Adaptation and Personalization (UMAP), and ACM’s Special Interest Group on Management Of Data (SIGMOD).
3. At institutions of higher education around the world, undergraduate and graduate courses are now dedicated entirely to RSs; tutorials on RSs are very popular at computer science conferences; and recently a book introducing RS techniques was published [48].
4. There have been several special issues in academic journals covering research and developments in the RS field. Among the journals that have dedicated issues to RS are: AI Communications (2008); IEEE Intelligent Systems (2007); International Journal of Electronic Commerce (2006); International Journal of Computer Science and Applications (2006); ACM Transactions on Computer-Human Interaction (2005); and ACM Transactions on Information Systems (2004).
In this introductory chapter we briefly discuss basic RS ideas and concepts. Our main goal is not so much to present a self-contained comprehensive introduction or survey on RSs but rather to delineate, in a coherent and structured way, the chapters included in this handbook and to help the reader navigate the extremely rich and detailed content that the handbook offers.
The handbook is divided into five sections: techniques; applications and evaluation of RSs; interacting with RSs; RSs and communities; and advanced algorithms.
The first section presents the techniques most popularly used today for building RSs, such as collaborative filtering, content-based methods, data mining methods, and context-aware methods.
The second section surveys techniques and approaches that have been utilized to evaluate the quality of the recommendations. It also deals with the practical aspects of designing recommender systems; describes design and implementation considerations; and sets guidelines for selecting the more suitable algorithms. The section also considers aspects that may affect RS design (domain, device, users, etc.). Finally, it discusses methods, challenges and measures to be applied in evaluating the developed systems.
The third section includes papers dealing with a number of issues related to how recommendations are presented, browsed, explained and visualized. The techniques that make the recommendation process more structured and conversational are discussed here.
The fourth section is fully dedicated to a rather new topic, exploiting user-generated content (UGC) of various types (tags, search queries, trust evaluations, etc.) to generate innovative types of recommendations and more credible ones. Despite its relative newness, this topic is essentially rooted in the core idea of a collaborative recommender.
The last section presents papers on various advanced topics, such as: the exploitation of active learning principles to guide the acquisition of new knowledge; suitable techniques for protecting a recommender system against attacks of malicious users; and RSs that aggregate multiple types of user feedback and preferences to build more reliable recommendations.
1.2 Recommender Systems Function
In the previous section we defined RSs as software tools and techniques providing users with suggestions for items a user may wish to utilize. Now we want to refine this definition by illustrating a range of possible roles that a RS can play. First of all, we must distinguish between the role played by the RS on behalf of the service provider and the role it plays for the user of the RS. For instance, a travel recommender system is typically introduced by a travel intermediary (e.g., Expedia.com) or a destination management organization (e.g., Visitfinland.com) to increase its turnover (Expedia), i.e., sell more hotel rooms, or to increase the number of tourists to the destination [86]. The user’s primary motivation for accessing the two systems, by contrast, is to find a suitable hotel and interesting events/attractions when visiting a destination.
In fact, there are various reasons as to why service providers may want to exploit this technology:
• Increase the number of items sold. This is probably the most important function for a commercial RS, i.e., to be able to sell an additional set of items compared to those usually sold without any kind of recommendation. This goal is achieved because the recommended items are likely to suit the user’s needs and wants. Presumably the user will recognize this after having tried several recommendations¹. Non-commercial applications have similar goals, even if there is no cost for the user that is associated with selecting an item. For instance, a content network aims at increasing the number of news items read on its site.
In general, we can say that from the service provider’s point of view, the primary goal for introducing a RS is to increase the conversion rate, i.e., the number of users that accept the recommendation and consume an item, compared to the number of simple visitors that just browse through the information.
• Sell more diverse items. Another major function of a RS is to enable the user to select items that might be hard to find without a precise recommendation. For instance, in a movie RS such as Netflix, the service provider is interested in renting all the DVDs in the catalogue, not just the most popular ones. This could be difficult without a RS since the service provider cannot afford the risk of advertising movies that are not likely to suit a particular user’s taste. Therefore, a RS suggests or advertises unpopular movies to the right users.
• Increase the user satisfaction. A well designed RS can also improve the experience of the user with the site or the application. The user will find the recommendations interesting, relevant and, with a properly designed human-computer interaction, she will also enjoy using the system. The combination of effective, i.e., accurate, recommendations and a usable interface will increase the user’s subjective evaluation of the system. This in turn will increase system usage and the likelihood that the recommendations will be accepted.
• Increase user fidelity. A user should be loyal to a Web site which, when visited, recognizes the old customer and treats him as a valuable visitor. This is a normal feature of a RS since many RSs compute recommendations, leveraging the information acquired from the user in previous interactions, e.g., her ratings of items. Consequently, the longer the user interacts with the site, the more refined her user model becomes, i.e., the system representation of the user’s preferences, and the more the recommender output can be effectively customized to match the user’s preferences.
• Better understand what the user wants. Another important function of a RS, which can be leveraged to many other applications, is the description of the user’s preferences, either collected explicitly or predicted by the system. The service provider may then decide to re-use this knowledge for a number of other goals such as improving the management of the item’s stock or production. For instance, in the travel domain, destination management organizations can decide to advertise a specific region to new customer sectors or advertise a particular type of promotional message derived by analyzing the data collected by the RS (transactions of the users).
¹ This issue, convincing the user to accept a recommendation, is discussed again when we explain the difference between predicting the user interest in an item and the likelihood that the user will select the recommended item.
We mentioned above some important motivations as to why e-service providers introduce RSs. But users also may want a RS, if it will effectively support their tasks or goals. Consequently a RS must balance the needs of these two players and offer a service that is valuable to both.
Herlocker et al. [25], in a paper that has become a classical reference in this field, define eleven popular tasks that a RS can assist in implementing. Some may be considered as the main or core tasks that are normally associated with a RS, i.e., to offer suggestions for items that may be useful to a user. Others might be considered as more “opportunistic” ways to exploit a RS. As a matter of fact, this task differentiation is very similar to what happens with a search engine. Its primary function is to locate documents that are relevant to the user’s information need, but it can also be used to check the importance of a Web page (looking at the position of the page in the result list of a query) or to discover the various usages of a word in a collection of documents.
• Find some good items: Recommend to a user some items as a ranked list along with predictions of how much the user would like them (e.g., on a one- to five-star scale). This is the main recommendation task that many commercial systems address (see, for instance, Chapter 9). Some systems do not show the predicted rating.
• Find all good items: Recommend all the items that can satisfy some user needs. In such cases it is insufficient to just find some good items. This is especially true when the number of items is relatively small or when the RS is mission-critical, such as in medical or financial applications. In these situations, in addition to the benefit derived from carefully examining all the possibilities, the user may also benefit from the RS ranking of these items or from additional explanations that the RS generates.
• Annotation in context: Given an existing context, e.g., a list of items, emphasize some of them depending on the user’s long-term preferences. For example, a TV recommender system might annotate which TV shows displayed in the electronic program guide (EPG) are worth watching (Chapter 18 provides interesting examples of this task).
• Recommend a sequence: Instead of focusing on the generation of a single recommendation, the idea is to recommend a sequence of items that is pleasing as a whole. Typical examples include recommending a TV series; a book on RSs after having recommended a book on data mining; or a compilation of musical tracks [99], [39].
• Recommend a bundle: Suggest a group of items that fits well together. For instance a travel plan may be composed of various attractions, destinations, and accommodation services that are located in a delimited area. From the point of view of the user these various alternatives can be considered and selected as a single travel destination [87].
• Just browsing: In this task, the user browses the catalog without any imminent intention of purchasing an item. The task of the recommender is to help the user to browse the items that are more likely to fall within the scope of the user’s interests for that specific browsing session. This is a task that has been also supported by adaptive hypermedia techniques [23].
• Find credible recommender: Some users do not trust recommender systems, so they play with them to see how good they are at making recommendations. Hence, some systems may also offer specific functions that let users test their behavior, in addition to those required just for obtaining recommendations.
• Improve the profile: This relates to the capability of the user to provide (input) information to the recommender system about what he likes and dislikes. This is a fundamental task that is strictly necessary to provide personalized recommendations. If the system has no specific knowledge about the active user then it can only provide him with the same recommendations that would be delivered to an “average” user.
• Express self: Some users may not care about the recommendations at all. Rather, what is important to them is that they be allowed to contribute their ratings and express their opinions and beliefs. The user satisfaction derived from that activity can still act as leverage for holding the user tightly to the application (as we mentioned above in discussing the service provider’s motivations).
• Help others: Some users are happy to contribute information, e.g., their evaluation of items (ratings), because they believe that the community benefits from their contribution. This could be a major motivation for entering information into a recommender system that is not used routinely. For instance, with a car RS, a user who has already bought her new car is aware that the rating entered in the system is more likely to be useful for other users than for the next time she buys a car.
• Influence others: In Web-based RSs, there are users whose main goal is to explicitly influence other users into purchasing particular products. As a matter of fact, there are also some malicious users that may use the system just to promote or penalize certain items (see Chapter 25).
As these various points indicate, the role of a RS within an information system can be quite diverse. This diversity calls for the exploitation of a range of different knowledge sources and techniques and in the next two sections we discuss the data a RS manages and the core technique used to identify the right recommendations.
1.3 Data and Knowledge Sources
RSs are information processing systems that actively gather various kinds of data in order to build their recommendations. Data is primarily about the items to suggest and the users who will receive these recommendations. But, since the data and knowledge sources available for recommender systems can be very diverse, ultimately, whether they can be exploited or not depends on the recommendation technique (see also section 1.4). This will become clearer in the various chapters included in this handbook (see in particular Chapter 11).
In general, there are recommendation techniques that are knowledge poor, i.e., they use very simple and basic data, such as user ratings/evaluations for items (Chapters 5, 4). Other techniques are much more knowledge dependent, e.g., using ontological descriptions of the users or the items (Chapter 3), or constraints (Chapter 6), or social relations and activities of the users (Chapter 19). In any case, as a general classification, data used by RSs refers to three kinds of objects: items, users, and transactions, i.e., relations between users and items.
Items. Items are the objects that are recommended. Items may be characterized by their complexity and their value or utility. The value of an item may be positive if the item is useful for the user, or negative if the item is not appropriate and the user made a wrong decision when selecting it. We note that when a user is acquiring an item she will always incur a cost, which includes the cognitive cost of searching for the item and the real monetary cost eventually paid for the item.
For instance, the designer of a news RS must take into account the complexity of a news item, i.e., its structure, the textual representation, and the time-dependent importance of any news item. But, at the same time, the RS designer must understand that even if the user is not paying for reading news, there is always a cognitive cost associated with searching and reading news items. If a selected item is relevant for the user this cost is outweighed by the benefit of having acquired useful information, whereas if the item is not relevant the net value of that item for the user, and its recommendation, is negative. In other domains, e.g., cars, or financial investments, the true monetary cost of the items becomes an important element to consider when selecting the most appropriate recommendation approach.
Items with low complexity and value are: news, Web pages, books, CDs, movies.
Items with larger complexity and value are: digital cameras, mobile phones, PCs, etc. The most complex items that have been considered are insurance policies, financial investments, travels, jobs [72].
RSs, according to their core technology, can use a range of properties and features of the items. For example in a movie recommender system, the genre (such as comedy, thriller, etc.), as well as the director, and actors can be used to describe a movie and to learn how the utility of an item depends on its features. Items can be represented using various information and representation approaches, e.g., in a minimalist way as a single id code, or in a richer form, as a set of attributes, but even as a concept in an ontological representation of the domain (Chapter 3).
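As a rough illustration of the two simpler representations mentioned above (a single id code and a set of attributes), consider the following sketch. The class names `Item` and `Movie`, the field names, and the helper `features` are hypothetical and chosen only for this example; an ontological representation is not covered.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    """Minimalist representation: the item is just an id code."""
    item_id: str

@dataclass(frozen=True)
class Movie(Item):
    """Richer representation: the same id plus descriptive attributes."""
    genres: frozenset = frozenset()
    director: str = ""
    actors: frozenset = frozenset()

def features(movie):
    """Flatten a movie's attributes into a set of (attribute, value) pairs,
    a form that a feature-based technique could learn from."""
    pairs = {("genre", g) for g in movie.genres}
    pairs |= {("actor", a) for a in movie.actors}
    if movie.director:
        pairs.add(("director", movie.director))
    return pairs

# Hypothetical example values.
movie = Movie("m1", genres=frozenset({"comedy"}),
              director="Director X", actors=frozenset({"Actor Y"}))
print(sorted(features(movie)))
```

A technique that only needs the id (for example, one based purely on ratings) can treat a `Movie` as a plain `Item`, whereas a technique that reasons about item features can use the attribute pairs.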
Users. Users of a RS, as mentioned above, may have very diverse goals and characteristics. In order to personalize the recommendations and the human-computer interaction, RSs exploit a range of information about the users. This information can be structured in various ways and again the selection of what information to model depends on the recommendation technique.
For instance, in collaborative filtering, users are modeled as a simple list containing the ratings provided by the user for some items. In a demographic RS, socio-demographic attributes such as age, gender, profession, and education, are used. User data is said to constitute the user model [21, 32]. The user model profiles the user, i.e., encodes her preferences and needs. Various user modeling approaches have been used and, in a certain sense, a RS can be viewed as a tool that generates recommendations by building and exploiting user models [19, 20]. Since no personalization is possible without a convenient user model, unless the recommendation is non-personalized, as in the top-10 selection, the user model will always play a central role. For instance, considering again a collaborative filtering approach, the user is either profiled directly by her ratings of items or, using these ratings, the system derives a vector of factor values, where users differ in how much weight each factor has in their model (Chapters 5 and 4).
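The user models described above can be sketched as a simple data structure. This is an illustrative assumption rather than a design taken from the handbook: the class `UserModel`, its field names, and the function `predict_from_factors` are hypothetical, and the factor values are assumed to have been learned elsewhere (the sketch only shows how such a vector might be used once it exists).

```python
from dataclasses import dataclass, field

@dataclass
class UserModel:
    user_id: str
    # Collaborative filtering view: item id -> rating given by the user.
    ratings: dict = field(default_factory=dict)
    # Demographic view: attributes such as age, gender, profession, education.
    demographics: dict = field(default_factory=dict)
    # Factor view: one weight per latent factor (assumed already learned).
    factors: list = field(default_factory=list)

def predict_from_factors(user, item_factors):
    """Score an item as the dot product of user and item factor vectors."""
    if len(user.factors) != len(item_factors):
        raise ValueError("user and item factor vectors differ in length")
    return sum(u * i for u, i in zip(user.factors, item_factors))

# Hypothetical example values.
user = UserModel("u1", ratings={"m1": 5}, demographics={"age": 34},
                 factors=[0.5, 2.0])
print(predict_from_factors(user, [4.0, 1.0]))
```

Which of the three views is populated depends on the recommendation technique; a system need not maintain all of them.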
Users can also be described by their behavior pattern data, for example, site browsing patterns (in a Web-based recommender system) [107], or travel search patterns (in a travel recommender system) [60]. Moreover, user data may include relations between users such as the trust level of these relations between users (Chapter 20). A RS might utilize this information to recommend items to users that were preferred by similar or trusted users.
Transactions. We generically refer to a transaction as a recorded interaction between a user and the RS. Transactions are log-like data that store important information generated during the human-computer interaction and which are useful for the recommendation generation algorithm that the system is using. For instance, a transaction log may contain a reference to the item selected by the user and a description of the context (e.g., the user goal/query) for that particular recommendation. If available, that transaction may also include explicit feedback the user has provided, such as the rating for the selected item.
In fact, ratings are the most popular form of transaction data that a RS collects.
These ratings may be collected explicitly or implicitly. In the explicit collection of ratings, the user is asked to provide her opinion about an item on a rating scale.
According to [93], ratings can take on a variety of forms:
• Numerical ratings such as the 1-5 stars provided in the book recommender associated with Amazon.com.
• Ordinal ratings, such as “strongly agree, agree, neutral, disagree, strongly disagree” where the user is asked to select the term that best indicates her opinion regarding an item (usually via questionnaire).
• Binary ratings that model choices in which the user is simply asked to decide if a certain item is good or bad.
• Unary ratings can indicate that a user has observed or purchased an item, or otherwise rated the item positively. In such cases, the absence of a rating indicates that we have no information relating the user to the item (perhaps she purchased the item somewhere else).
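The four rating forms listed above, together with the transaction record described earlier, might be represented as in the following sketch. All names (`RatingKind`, `Rating`, `Transaction`, `ORDINAL_SCALE`) are hypothetical, and the validity rules (a 1-5 integer scale for numerical ratings, a boolean for binary ratings) are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class RatingKind(Enum):
    NUMERICAL = "numerical"  # e.g. 1-5 stars
    ORDINAL = "ordinal"      # a term chosen from an ordered scale
    BINARY = "binary"        # good or bad
    UNARY = "unary"          # observed/purchased; absence means "unknown"

ORDINAL_SCALE = ["strongly disagree", "disagree", "neutral",
                 "agree", "strongly agree"]

@dataclass
class Rating:
    kind: RatingKind
    value: object

    def is_valid(self):
        """Check the value against the (assumed) rules for its kind."""
        if self.kind is RatingKind.NUMERICAL:
            return isinstance(self.value, int) and 1 <= self.value <= 5
        if self.kind is RatingKind.ORDINAL:
            return self.value in ORDINAL_SCALE
        if self.kind is RatingKind.BINARY:
            return isinstance(self.value, bool)
        return self.value is True  # UNARY: only a positive signal exists

@dataclass
class Transaction:
    user_id: str
    item_id: str
    context: str = ""                # e.g. the user goal or query
    rating: Optional[Rating] = None  # explicit feedback, if available

# Hypothetical example: a purchase logged without an explicit rating.
purchase = Transaction("u1", "book_a", context="query: yoga books")
print(purchase.rating is None)
```

Note that a unary rating has no negative value: a missing `Rating` means that nothing is known about the user's relation to the item, not that she dislikes it.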
Another form of user evaluation consists of tags associated by the user with the items the system presents. For instance, in the MovieLens RS (http://movielens.umn.edu) tags represent how MovieLens users feel about a movie, e.g.: “too long”, or “acting”. Chapter 19 focuses on these types of transactions.
In transactions collecting implicit ratings, the system aims to infer the user’s opinion based on the user’s actions. For example, if a user enters the keyword “Yoga” at