Adel Bibi is a senior researcher in machine learning and computer vision in the Department of Engineering Science at the University of Oxford, working with Philip H.S. Torr. He is a Junior Research Fellow (JRF) of Kellogg College and a member of the ELLIS Society. Prior to that, Bibi was a postdoctoral research assistant and a senior research associate in the same department. He received his MSc and PhD degrees from King Abdullah University of Science & Technology (KAUST) in 2016 and 2020, respectively, working with Bernard Ghanem. In 2018, Bibi was a visiting PhD intern for 6 months at Intel Labs in Munich, working with Vladlen Koltun. Bibi received an Amazon Research Award in Fall 2021 and has contributed more than 30 papers published in top machine learning and computer vision venues such as CVPR, ICCV, ECCV, ICLR, NeurIPS, TPAMI, AAAI, and UAI. Bibi has also served as an Area Chair for NeurIPS 2023, AAAI 2023, and IJCAI 2023. He has received outstanding reviewer awards at CVPR 2018, CVPR 2019, ICCV 2019, and ICLR 2022.
Currently, Bibi is interested in large-scale offline and online continual learning that is both robust and private. Robustness, in both its empirical and provably certifiable aspects, here refers to the behavior of deep models under $\ell_p$-bounded additive and geometric attacks. Continual learning refers to learning from a stream of data under stringent memory and computational constraints.
Download my resume
PhD in Electrical Engineering (4.0/4.0); Machine Learning and Optimization Track, 2020
King Abdullah University of Science and Technology (KAUST)
MSc in Electrical Engineering (4.0/4.0); Computer Vision Track, 2016
King Abdullah University of Science and Technology (KAUST)
BSc in Electrical Engineering (3.99/4.0), 2014
Kuwait University
Recent language models have shown impressive multilingual performance, even when not explicitly trained for it. Despite this, concerns have been raised about the quality of their outputs across different languages. In this paper, we show how disparity in the treatment of different languages arises at the tokenization stage, well before a model is even invoked. The same text translated into different languages can have drastically different tokenization lengths, with differences of up to 15 times in some cases. These disparities persist across the 17 tokenizers we evaluate, even those intentionally trained for multilingual support. Character-level and byte-level models also exhibit over 4 times the difference in encoding length for some language pairs. This induces unfair treatment for some language communities with regard to the cost of accessing commercial language services, the processing time and latency, and the amount of content that can be provided as context to the models. Therefore, we make the case that we should train future language models using multilingually fair tokenizers.
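A minimal sketch of how such tokenization-length disparities can be measured, assuming the Hugging Face transformers library and a multilingual tokenizer such as xlm-roberta-base; the model name and the parallel sentences below are illustrative placeholders, not the paper's evaluation setup:

```python
# Illustrative sketch: compare tokenization lengths of parallel translations.
# Assumes the `transformers` package is installed; tokenizer choice and the
# example sentences are assumptions for illustration only.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

# Hypothetical parallel translations of the same sentence.
parallel_texts = {
    "English": "The weather is nice today.",
    "German": "Das Wetter ist heute schön.",
    "Japanese": "今日は天気がいいです。",
}

lengths = {lang: len(tokenizer.encode(text, add_special_tokens=False))
           for lang, text in parallel_texts.items()}

baseline = lengths["English"]
for lang, n in lengths.items():
    print(f"{lang}: {n} tokens ({n / baseline:.2f}x English)")
```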
Despite clear computational advantages in building robust neural networks, adversarial training (AT) using single-step methods is unstable, as it suffers from catastrophic overfitting (CO): networks gain non-trivial robustness during the first stages of adversarial training, but suddenly reach a breaking point where they quickly lose all robustness in just a few iterations. Although some works have succeeded at preventing CO, the different mechanisms that lead to this remarkable failure mode are still poorly understood. In this work, we find that the interplay between the structure of the data and the dynamics of AT plays a fundamental role in CO. Specifically, through active interventions on typical datasets of natural images, we establish a causal link between the structure of the data and the onset of CO in single-step AT methods. This new perspective provides important insights into the mechanisms that lead to CO and paves the way towards a better understanding of the general dynamics of robust model construction.
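For context, a minimal PyTorch-style sketch of the single-step (FGSM) adversarial training loop in which catastrophic overfitting is typically observed; the model, data loader, epsilon, and the assumption of inputs in [0, 1] are illustrative placeholders, not the paper's experimental setup:

```python
# Single-step (FGSM) adversarial training sketch.
# `model`, `loader`, `optimizer`, and `epsilon` are hypothetical placeholders.
import torch
import torch.nn.functional as F

def fgsm_adversarial_training_epoch(model, loader, optimizer, epsilon, device="cuda"):
    model.train()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        # Single-step attack: one signed-gradient step of size epsilon.
        x.requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        grad, = torch.autograd.grad(loss, x)
        x_adv = (x + epsilon * grad.sign()).clamp(0.0, 1.0).detach()  # assumes inputs in [0, 1]
        # Train on the adversarial examples only.
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()
```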
We revisit the common practice of evaluating adaptation of Online Continual Learning (OCL) algorithms through the metric of online accuracy, which measures the accuracy of the model on the immediate next few samples. However, we show that this metric is unreliable, as even vacuous blind classifiers, which do not use input images for prediction, can achieve unrealistically high online accuracy by exploiting spurious label correlations in the data stream. Our study reveals that existing OCL algorithms can also achieve high online accuracy, but perform poorly in retaining useful information, suggesting that they unintentionally learn spurious label correlations. To address this issue, we propose a novel metric for measuring adaptation based on the accuracy on near-future samples, where spurious correlations are removed. We benchmark existing OCL approaches using our proposed metric on large-scale datasets under various computational budgets and find that better generalization can be achieved by retaining and reusing previously seen information. We believe that our proposed metric can aid in the development of truly adaptive OCL methods.
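A simplified sketch contrasting the two evaluation schemes, assuming a stream of (inputs, labels) batches with label arrays that support element-wise comparison; the `predict` and `update` methods and the gap size are hypothetical placeholders, not the paper's exact protocol:

```python
# Sketch: "online accuracy" evaluates on the immediate next batch, while a
# near-future evaluation looks far enough ahead that local, spurious label
# correlations have decayed. `model.predict`/`model.update` are assumed APIs.

def online_accuracy(model, stream):
    correct, total = 0, 0
    for t in range(len(stream) - 1):
        x_next, y_next = stream[t + 1]                    # immediate next batch
        correct += (model.predict(x_next) == y_next).sum()
        total += len(y_next)
        model.update(*stream[t])                          # then train on the current batch
    return correct / total

def near_future_accuracy(model, stream, gap=50):
    correct, total = 0, 0
    for t in range(len(stream) - gap):
        x_fut, y_fut = stream[t + gap]                    # batch `gap` steps ahead
        correct += (model.predict(x_fut) == y_fut).sum()
        total += len(y_fut)
        model.update(*stream[t])
    return correct / total
```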