# FedRec++: Lossless Federated Recommendation with Explicit Feedback

Feng Liang, Weike Pan*, Zhong Ming*

National Engineering Laboratory for Big Data System Computing Technology, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China

liangfeng2018@email.szu.edu.cn, panweike@szu.edu.cn, mingz@szu.edu.cn

With the marriage of federated machine learning and recommender systems for privacy-aware preference modeling and personalization, there comes a new research branch called federated recommender systems, which aims to build a recommendation model in a distributed way, i.e., each user is represented as a distributed client whose original rating data are not shared with the server or the other clients. Notice that, besides the sensitive information of a specific rating score assigned to a certain item by a user, the information of a user's rated set of items shall also be well protected. Some very recent works propose to randomly sample some unrated items for each user and then assign some virtual ratings, so that the server cannot easily identify the scores and the set of rated items during the server-client interactions. However, the virtual ratings assigned to the randomly sampled items will inevitably introduce some noise to the model training process, which will then cause loss in recommendation performance. In this paper, we propose a novel lossless federated recommendation method (FedRec++) by allocating some denoising clients (i.e., users) to eliminate the noise in a privacy-aware manner. We further analyse our FedRec++ in terms of security and losslessness, and discuss its generality in the context of existing works. Extensive empirical studies clearly show the effectiveness of our FedRec++ in providing accurate and privacy-aware recommendation without much additional communication cost.
Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

## Introduction

In the era of information overload, it is often difficult for people to find what they like among a huge number of items. Recommender systems address this problem by exploiting users' historical data to recommend items that the users may like. Traditional collaborative filtering (CF) algorithms (Mnih and Salakhutdinov 2007; Koren 2008; Rendle 2012) need to collect all the users' rating data in one central place, e.g., the server, for model training. With the increasing awareness of privacy and the publication of related privacy protection laws such as GDPR (EU 2016), collecting users' data may not be feasible in many cases.

Recently, federated machine learning (McMahan et al. 2017; Yang et al. 2019; Kairouz et al. 2019) has been proposed for protecting users' privacy in machine learning algorithms and systems (including recommender systems). Some recent works (Ammad-ud-din et al. 2019; Chai et al. 2020) revisit some recommendation algorithms in the new federated learning paradigm. Specifically, the original rating data are always kept locally in each client (i.e., user) during the whole process of model training, and each client only uploads the corresponding model parameters to the server in order to update the model jointly. For example, federated collaborative filtering (FCF) (Ammad-ud-din et al. 2019) focuses on item ranking with implicit feedback, and treats all the unrated items as negative ones, which may cause bias in model training and also high communication cost during the server-client interactions. FedMF (Chai et al. 2020) uses homomorphic encryption to encrypt the items' gradients before uploading them to the server in order to protect the users' privacy. A federated meta learning work (Jalalirad et al.
2019) combines a meta learning method called REPTILE (Nichol, Achiam, and Schulman 2018) with federated learning for rating prediction with explicit feedback, which is able to fine-tune the model parameters for each user. Federated multi-view matrix factorization (FED-MVMF) (Flanagan et al. 2020) combines multi-view matrix factorization with federated learning in order to protect the users' original rating data when modeling multi-party data. However, like the aforementioned methods, it will leak the users' rating behaviors (i.e., the set of items rated by a user). We can see that most existing federated recommendation methods either bias the model training or fail to protect the users' rating behaviors well.

Recently, SDCF (Jiang, Li, and Lin 2019) proposes a two-stage randomized response algorithm to perturb the rated and unrated items of each user, and then calculates and uploads their gradients to the server. In this way, the server cannot easily identify the set of items rated by the users, which thus protects the users' rating behaviors. FedRec (Lin et al. 2020) also uploads the gradients of both the rated items and some randomly sampled unrated items of the users. Notice that it uses a hybrid filling strategy to assign some virtual ratings to the unrated items. However, both methods will introduce some noise to the gradients, which will cause loss in the recommendation performance.

The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

In order to eliminate the gradient noise, we propose to allocate some denoising users (i.e., clients) to eliminate the noise caused by the randomly sampled items and their assigned virtual ratings. Our denoising strategy is secure in terms of protecting users' privacy even if the server colludes with the denoising clients. We then incorporate the denoising component into the very recent method FedRec (Lin et al. 2020) and obtain our solution called FedRec++.
As far as we know, we are the first to study gradient noise elimination in federated recommendation.

We summarize our main contributions as follows. (i) We propose a novel lossless federated recommendation method (FedRec++) for modeling users' explicit feedback, which is able to completely eliminate the gradient noise brought by the virtual ratings assigned to the randomly sampled unrated items. (ii) We discuss the relationships of our FedRec++ with existing works, e.g., our FedRec++ reduces to FedRec (Lin et al. 2020) when the component of noise elimination is removed, and also analyse its security in protecting users' privacy. (iii) We conduct extensive empirical studies on three public datasets, and find that our FedRec++ is able to protect users' privacy without sacrificing the recommendation performance.

## Related Work

### Probabilistic Matrix Factorization

In probabilistic matrix factorization (PMF) (Mnih and Salakhutdinov 2007), the rating of a user $u$ to an item $i$ is calculated via the inner product of their latent feature vectors, i.e., $\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^T$, where $U_{u\cdot}, V_{i\cdot} \in \mathbb{R}^{1 \times d}$.

### Federated Recommendation with Explicit Feedback

In federated recommendation with explicit feedback (FedRec) (Lin et al. 2020), in order to protect a user's rating behaviors, i.e., the set of items $I_u$ rated by a user $u$, the authors design an effective hybrid filling (HF) strategy to randomly sample some unrated items. Firstly, it randomly samples $|I'_u|$ unrated items of user $u$ from $I \setminus I_u$, where $|I'_u| = \rho |I_u|$ with $\rho \in \{1, 2, 3\}$. Secondly, it uses the average rating or predicted rating of user $u$ to a sampled item $i$ as a virtual rating $r'_{ui}$. Thirdly, it calculates the gradients of user $u$ with respect to the rated items and the sampled unrated items, i.e., $\nabla V_{i\cdot}$, $i \in I_u \cup I'_u$, and then uploads these gradients to the server. In this way, FedRec with the HF strategy protects both the user's original rating records and the rating behaviors in the preference modeling process.
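The three steps of the hybrid filling strategy can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`predict`, `hybrid_fill`, `item_gradient`), the dictionary-based data layout, and the regularization weight `lam` are hypothetical, and the gradient assumes a regularized squared-error loss, which is the usual choice for PMF but is not spelled out in the text above.

```python
import random


def predict(U_u, V_i):
    """PMF prediction: inner product of the user and item latent vectors."""
    return sum(a * b for a, b in zip(U_u, V_i))


def hybrid_fill(rated, all_items, rho, rng, predict_fn=None):
    """Return {item: rating} over I_u plus rho * |I_u| sampled unrated items.

    rated      : dict item -> observed rating (I_u with its scores)
    all_items  : iterable of all item ids (the set I)
    rho        : sampling ratio (rho in {1, 2, 3} in the text)
    rng        : random.Random instance used for the sampling
    predict_fn : optional callable item -> predicted rating; when None,
                 the user's average rating serves as the virtual rating
    """
    unrated = sorted(set(all_items) - set(rated))
    # Assumption: cap the sample size when too few unrated items remain.
    n_sample = min(rho * len(rated), len(unrated))
    sampled = rng.sample(unrated, n_sample)
    avg = sum(rated.values()) / len(rated)
    filled = dict(rated)
    for i in sampled:
        filled[i] = predict_fn(i) if predict_fn is not None else avg
    return filled


def item_gradient(U_u, V_i, r, lam=0.01):
    """Gradient of 0.5*(r - U_u V_i^T)^2 + 0.5*lam*||V_i||^2 w.r.t. V_i
    (assumed loss; `lam` is a hypothetical regularization weight)."""
    err = r - predict(U_u, V_i)
    return [-err * u + lam * v for u, v in zip(U_u, V_i)]


# Usage: one client with two real ratings among ten items, rho = 2.
rng = random.Random(0)
rated = {0: 5.0, 3: 3.0}
filled = hybrid_fill(rated, range(10), rho=2, rng=rng)
U_u = [0.1, 0.2]
V = {i: [0.3, 0.4] for i in range(10)}
# The client would upload one gradient per item in I_u and I'_u, so the
# server cannot tell real ratings from virtual ones by the item ids alone.
upload = {i: item_gradient(U_u, V[i], r) for i, r in filled.items()}
```

In this sketch the upload contains gradients for six items, of which only two carry real ratings; the four virtual ratings are the source of the gradient noise that FedRec++ sets out to remove.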
In particular, the virtual rating $r'_{ui}$ is as follows, $\frac{\sum_{k=1}^{m} y_{uk} r_{uk}}{\sum_{k=1}^{m} y_{uk}}$, t