Computer and Information Sciences

Metaheuristics for Collaborative Filtering in Recommender Systems

(University of Hyderabad, 2021-06-29) Ayangleima Laishram ; Vineet Padmanabhan

In the current internet-based era, it has become imperative to advance technology so that the preferences of individual users can be learned from existing data and recommendations can be made on unseen data, such that the user is satisfied with the recommended items to a large extent. Recommender systems technology has been put forward with this idea in mind, and several multinationals make use of this paradigm to expand their business initiatives. In this thesis we focus mainly on devising methods that can improve recommendation as well as prediction accuracy in collaborative filtering (CF) based recommender systems. To this end we propose a variety of algorithms in which metaheuristic techniques are combined with matrix factorisation methods, and the combined framework is tested on the two main approaches to collaborative filtering in recommender systems, namely model-based and neighbourhood-based collaborative filtering. In the case of model-based collaborative filtering we demonstrate how metaheuristic techniques like Particle Swarm Optimisation (PSO) and Genetic Algorithm (GA) can be combined with matrix factorisation techniques like Maximum Margin Matrix Factorisation (MMMF). Metaheuristic algorithms such as PSO and GA are exploratory in nature, and they complement the exploitative nature of the gradient descent used in traditional model-based collaborative filtering techniques like MMMF. Gradient descent may get trapped in local optima, which is why we employ metaheuristic techniques. Our algorithm starts from multiple initial points and uses both gradient information and swarm search as the search progresses. We show that this process yields an efficient search scheme for reaching a near-optimal point for maximum margin matrix factorisation.
Our experimental results on benchmark datasets demonstrate that when the exploration capability of population-based search algorithms is combined with the gradient search direction of MMMF, the proposed models achieve better accuracy, as evidenced by the derived RMSE and MAE values. We extended the neighbourhood-based collaborative filtering technique by adopting the concept of discovering highly correlated user-item subgroups. User-item-subgroup based collaborative filtering can be constructed in two ways, namely through a two-step approach and a fuzzy c-means clustering approach. In the two-step process, we proposed different algorithms to identify the highly correlated user-item subgroups. Then, we used the least squares method to predict the missing ratings from the rating information of the highly correlated user-item subgroups. In the fuzzy c-means clustering based user-item subgroup algorithm, the highly correlated user-item subgroups are discovered in one step. We optimized the initialization of centroids in fuzzy c-means by using particle swarm optimization to accurately discover highly correlated user-item subgroups in CF. We observed that our proposed algorithms in the two-step approach outperform all the CF models under comparison on all benchmark datasets. Our findings in terms of MAP suggest that the correlation of the subgroups discovered by the GA that evaluates fitness by calculating mean squared residue and row variance is significant, though the effectiveness is lower for smaller datasets. In the case of the fuzzy c-means approach, the metaheuristic optimization algorithm acts as a booster that improves fuzzy c-means clustering in discovering highly correlated user-item subgroups by initializing the cluster centroids to near-optimal solutions.
Our experimental results have shown a promising way of making use of user-item subgroups in helping to capture highly similar user preferences on a subset of items.
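
The idea of starting from multiple initial points and combining gradient information with swarm search can be illustrated with a minimal sketch. This is not the thesis's algorithm: it swaps MMMF's margin-based loss for a plain regularised squared error to keep the example short, and every name and hyperparameter here (`hybrid_pso_mf`, `loss_and_grad`, `w`, `c1`, `c2`, `lr`, `reg`) is a hypothetical choice made for illustration.

```python
import random

def loss_and_grad(params, ratings, n_users, k, reg):
    """Regularised squared error on observed ratings, plus its gradient.
    params is a flat list: user factors first, then item factors."""
    loss = reg * sum(p * p for p in params)
    grad = [2.0 * reg * p for p in params]
    for u, i, r in ratings:
        uo, io = u * k, (n_users + i) * k
        err = sum(params[uo + f] * params[io + f] for f in range(k)) - r
        loss += err * err
        for f in range(k):
            grad[uo + f] += 2.0 * err * params[io + f]
            grad[io + f] += 2.0 * err * params[uo + f]
    return loss, grad

def hybrid_pso_mf(ratings, n_users, n_items, k=2, n_particles=8, iters=200,
                  w=0.5, c1=1.0, c2=1.0, lr=0.005, reg=0.01, seed=0):
    """PSO over factor matrices with a gradient term added to each velocity
    (assumed hybridisation, chosen for illustration only)."""
    rng = random.Random(seed)
    dim = (n_users + n_items) * k
    # Multiple random starting points: one candidate factorisation per particle.
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_loss = [loss_and_grad(p, ratings, n_users, k, reg)[0] for p in pos]
    g = min(range(n_particles), key=pbest_loss.__getitem__)
    gbest, gbest_loss = pbest[g][:], pbest_loss[g]
    history = [gbest_loss]
    for _ in range(iters):
        for p in range(n_particles):
            _, grad = loss_and_grad(pos[p], ratings, n_users, k, reg)
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[p][d] = (w * vel[p][d]
                             + c1 * r1 * (pbest[p][d] - pos[p][d])  # own best
                             + c2 * r2 * (gbest[d] - pos[p][d])     # swarm best
                             - lr * grad[d])                        # gradient step
                pos[p][d] += vel[p][d]
            cur = loss_and_grad(pos[p], ratings, n_users, k, reg)[0]
            if cur < pbest_loss[p]:
                pbest[p], pbest_loss[p] = pos[p][:], cur
            if cur < gbest_loss:
                gbest, gbest_loss = pos[p][:], cur
        history.append(gbest_loss)
    return gbest, history

# Toy data: (user, item, rating) triples for 3 users and 3 items.
toy_ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
               (2, 1, 2), (2, 2, 5), (0, 2, 1)]
best, history = hybrid_pso_mf(toy_ratings, n_users=3, n_items=3)
print(f"best loss: {history[0]:.3f} -> {history[-1]:.3f}")
```

The swarm terms pull each particle toward the best points found so far, while the gradient term refines it locally, which is the exploration/exploitation split the abstract describes.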

Reduction Strategies to Tackle Class Imbalance in Datasets

(University of Hyderabad, 2021-07-28) Krishnaveni, C.V. ; Sobha Rani, T

Banking, retail, finance, science, telecommunications and various other sectors have all been using data mining technologies to process massive amounts of data, measured in zettabytes. While this massive amount of data is useful, datasets have to be processed effectively to perform predictive and inferential forecasts for a target population. Class imbalance, where a dataset contains fewer instances of one class than of the other class or classes, poses challenges to traditional classifiers. Traditional classifiers fail to handle imbalanced datasets because of inherent assumptions made in their design. The distribution of classes within the dataset has a direct impact on classifier/model performance. One of the proven practices for addressing this problem is to balance the classes in the training datasets. The main goals of balancing are increasing sensitivity, selecting representative samples from the majority class, and maintaining a trade-off between majority-class and minority-class prediction rates.
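
As a point of reference for what balancing by reduction means, the following is a minimal sketch of random undersampling, the simplest baseline of this kind. It is not one of the strategies proposed in the thesis, which aim to select representative majority-class samples rather than random ones; the function name `random_undersample` and its parameters are hypothetical.

```python
import random
from collections import defaultdict

def random_undersample(samples, labels, seed=0):
    """Randomly drop samples from larger classes until every class has
    as many samples as the smallest one (baseline, for illustration)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    target = min(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Keep minority classes whole; sample without replacement otherwise.
        kept = xs if len(xs) == target else rng.sample(xs, target)
        out_x.extend(kept)
        out_y.extend([y] * len(kept))
    return out_x, out_y

# Toy data: ten majority-class samples and two minority-class samples.
xs = list(range(12))
ys = ["majority"] * 10 + ["minority"] * 2
bx, by = random_undersample(xs, ys)
print(sorted(zip(by, bx)))
```

Random selection discards information from the majority class indiscriminately, which is the limitation that motivates choosing representative samples instead.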

DNSSEC : verification, validation and proposal for enhancement

(University of Hyderabad, 2018-12-30) Ramesh Babu, Kollapalli ; Vineet C.P. Nair

Energy efficient data center management strategies

(University of Hyderabad, 2018-05-30) Dinesh Reddy, Vemula ; Gangadharan, G.R.

Video and camera tamper detection techniques for securing video surveillance systems

(University of Hyderabad, 2018-06-30) Sitara, K. ; Mehtre, B.M.

Computer and Information Sciences - Theses
