Papers

The Shapley Value in Machine Learning

Published in IJCAI 2022, 2022

Over the last few years, the Shapley value, a solution concept from cooperative game theory, has found numerous applications in machine learning. In this paper, we first discuss fundamental concepts of cooperative game theory and axiomatic properties of the Shapley value. Then we give an overview of the most important applications of the Shapley value in machine learning: feature selection, explainability, multi-agent reinforcement learning, ensemble pruning, and data valuation. We examine the most crucial limitations of the Shapley value and point out directions for future research.

Recommended citation: Benedek Rozemberczki, Lauren Watson, Péter Bayer, Hao-Tsung Yang, Olivér Kiss, Sebastian Nilsson, Rik Sarkar. The Shapley Value in Machine Learning. Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence Survey Track. Pages 5572-5579 https://doi.org/10.24963/ijcai.2022/778
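
For readers unfamiliar with the solution concept surveyed in the paper above, the following is a minimal sketch of an exact Shapley value computation, not code from the paper. The function name `shapley_values` and the three-player toy game `toy_value` are hypothetical and chosen only for illustration. The sketch enumerates every coalition, so it is practical only for a handful of players.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating all coalitions.

    players: list of hashable player identifiers.
    value: function mapping a frozenset of players to the coalition's worth.
    """
    n = len(players)
    result = {}
    for player in players:
        others = [q for q in players if q != player]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                members = frozenset(coalition)
                # Marginal contribution of the player to this coalition.
                total += weight * (value(members | {player}) - value(members))
        result[player] = total
    return result


def toy_value(coalition):
    # Hypothetical game: a coalition is worth 1 only if it contains
    # player "a" together with at least one of "b" or "c".
    return 1.0 if "a" in coalition and len(coalition) >= 2 else 0.0


values = shapley_values(["a", "b", "c"], toy_value)
print(values)
```

In this toy game the payoffs sum to the worth of the grand coalition (the efficiency axiom), and the interchangeable players "b" and "c" receive equal shares (the symmetry axiom).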

PyTorch Geometric Temporal: Spatiotemporal Signal Processing with Neural Machine Learning Models

Published in CIKM 2021, 2021

We present PyTorch Geometric Temporal, a deep learning framework combining state-of-the-art machine learning algorithms for neural spatiotemporal signal processing. The main goal of the library is to make temporal geometric deep learning available for researchers and machine learning practitioners in a unified easy-to-use framework. PyTorch Geometric Temporal was created with foundations on existing libraries in the PyTorch ecosystem, streamlined neural network layer definitions, temporal snapshot generators for batching, and integrated benchmark datasets. These features are illustrated with a tutorial-like case study. Experiments demonstrate the predictive performance of the models implemented in the library on real-world problems such as epidemiological forecasting, ride-hail demand prediction, and web traffic management. Our sensitivity analysis of runtime shows that the framework can potentially operate on web-scale datasets with rich temporal features and spatial structure.

Recommended citation: Benedek Rozemberczki, Paul Scherer, Yixuan He, George Panagopoulos, Alexander Riedel, Maria Astefanoaei, Oliver Kiss, Ferenc Beres, Guzmán López, Nicolas Collignon and Rik Sarkar. PyTorch Geometric Temporal: Spatiotemporal Signal Processing with Neural Machine Learning Models, Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pp 4564–4573, 2021 https://dl.acm.org/doi/abs/10.1145/3459637.3482014

Little Ball of Fur: A Python Library for Graph Sampling

Published in CIKM 2020, 2020

Sampling graphs is an important task in data mining. In this paper, we describe Little Ball of Fur, a Python library that includes more than twenty graph sampling algorithms. Our goal is to make node, edge, and exploration-based network sampling techniques accessible to a large number of professionals, researchers, and students in a single streamlined framework. We created this framework with a focus on a coherent application public interface which has a convenient design, generic input data requirements, and reasonable baseline settings of algorithms. Here we give a detailed overview of these design foundations with illustrative code snippets. We show the practical usability of the library by estimating various global statistics of social networks and web graphs. Experiments demonstrate that Little Ball of Fur can speed up node and whole graph embedding techniques considerably while only mildly deteriorating the predictive value of distilled features.

Recommended citation: B. Rozemberczki, O. Kiss and R. Sarkar. Little Ball of Fur: A Python Library for Graph Sampling, Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp 3133–3140, 2020 https://dl.acm.org/doi/abs/10.1145/3340531.3412758
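
To illustrate what an exploration-based sampler of the kind mentioned in the abstract above does, here is a minimal standard-library sketch of random-walk sampling. It is not the library's implementation or API; the function `random_walk_sample`, the adjacency-dictionary input format, and the ring graph are assumptions made for this example, and the graph is assumed to be connected with no isolated nodes.

```python
import random


def random_walk_sample(adjacency, number_of_nodes, seed=42):
    """Sample an induced subgraph from the nodes visited by a random walk.

    adjacency: dict mapping each node to a list of its neighbours
               (assumed undirected, connected, no isolated nodes).
    number_of_nodes: how many distinct nodes to collect.
    """
    if number_of_nodes > len(adjacency):
        raise ValueError("Cannot sample more nodes than the graph contains.")
    rng = random.Random(seed)
    current = rng.choice(sorted(adjacency))
    visited = {current}
    # Walk until enough distinct nodes have been seen.
    while len(visited) < number_of_nodes:
        current = rng.choice(adjacency[current])
        visited.add(current)
    # Keep only the edges whose endpoints were both visited.
    return {
        node: [nb for nb in adjacency[node] if nb in visited]
        for node in visited
    }


# Hypothetical toy graph: a ring of ten nodes.
ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
sample = random_walk_sample(ring, number_of_nodes=5)
print(sorted(sample))
```

Because the walk only moves along edges, the sampled nodes form a connected piece of the original graph, which is the main difference from sampling nodes uniformly at random.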

Karate Club: An API Oriented Open-Source Python Framework for Unsupervised Learning on Graphs

Published in CIKM 2020, 2020

Karate Club consists of state-of-the-art methods for unsupervised learning on graph-structured data. To put it simply, it is a Swiss Army knife for small-scale graph mining research. First, it provides network embedding techniques at the node and graph level. Second, it includes a variety of overlapping and non-overlapping community detection methods. The implemented methods come from a wide range of network science (NetSci, Complenet), data mining (ICDM, CIKM, KDD), artificial intelligence (AAAI, IJCAI) and machine learning (NeurIPS, ICML, ICLR) conferences and workshops, as well as from prominent journals.

Recommended citation: B. Rozemberczki, O. Kiss and R. Sarkar. Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs, Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp 3125–3132, 2020 https://dl.acm.org/doi/10.1145/3340531.3412757

The Black and White Score Gap After the No Child Left Behind Act

Published in BCE SZD, 2016

The study investigates the black and white test score gap among elementary school students in the United States after 2010. A core finding of the research is that black students score lower than white students in mathematics even when environmental variables such as socioeconomic status are controlled for. Intriguingly, even after propensity score matching, the gap starts to widen, which might indicate the effects of racial homophily or the consequences of premarket discrimination against black students.

Recommended citation: B. Rozemberczki and O. Kiss. The black and white score gap after the no child left behind act, BCE 2016. http://szd.lib.uni-corvinus.hu/10244/

Efficient Regulation of Externalities in Bilateral Monopoly Markets (Externáliák hatékony szabályozása kétoldalú monopol piacokon) [Only available in Hungarian]

Published in BCE SZD, 2013

The primary goal of this thesis is to examine the options for regulating externalities in the case of a bilateral monopoly. Focusing on negative externalities, I examine how the socially optimal regulation of externalities affects the welfare of market participants. For the analysis, I build a theoretical model suited to properly demonstrating the effects of externalities.

Recommended citation: O. Kiss. Externáliák hatékony szabályozása kétoldalú monopol piacokon, BCE 2013. http://szd.lib.uni-corvinus.hu/6188/