
UCL Centre for Artificial Intelligence


LIST 2023 - Posters

List of posters for LIST 2023.

 

24th May 2023

Author:  Haotian Wu


Title: Features-over-the-Air: Contrastive Learning Enabled Cooperative Edge Inference


Abstract: We study the collaborative image retrieval problem at the wireless edge, where multiple edge devices capture images of the same object and communicate over a shared multiple access channel so that the edge server can perform joint image retrieval. We propose a semantic non-orthogonal multiple access (NOMA) communication scheme, in which the features extracted at each device are mapped directly to channel inputs and are then added over-the-air; this scheme is shown to outperform both the single-source method and separation-based digital schemes. We then propose a novel contrastive learning (CL)-based semantic communication (CL-SC) paradigm, aiming to exploit signal correlations to maximize retrieval accuracy under a total bandwidth constraint. Specifically, we treat noisy correlated signals as different augmentations of a common identity and propose a cross-view CL algorithm that optimizes the correlated signals in a coarse-to-fine fashion to improve retrieval accuracy. Extensive numerical experiments verify that our method achieves state-of-the-art performance and significantly improves retrieval accuracy, with particularly large gains in the low signal-to-noise ratio (SNR) and limited-bandwidth regimes.
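As a toy illustration of the over-the-air aggregation step described above (not the authors' implementation; the feature dimensions, power normalization, and noise model are all our own assumptions), the following sketch shows features from several devices being superposed on a shared channel and corrupted by Gaussian noise:

```python
import numpy as np

def over_the_air_fusion(features, snr_db=10.0, rng=None):
    """Toy over-the-air superposition: each device's feature vector is
    power-normalized, all vectors are added on the channel, and the
    receiver observes the sum plus Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.stack([f / np.linalg.norm(f) for f in features])  # unit-power inputs
    superposed = x.sum(axis=0)                               # addition "in the air"
    noise_power = len(features) / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power / x.shape[1]), superposed.shape)
    return superposed + noise

# Example: four devices each send a 128-dimensional feature vector.
rng = np.random.default_rng(0)
feats = [rng.normal(size=128) for _ in range(4)]
received = over_the_air_fusion(feats, snr_db=5.0, rng=rng)
```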


Author: Marcello Bullo


Abstract: As the Internet of Things (IoT) continues to advance, Deep Learning (DL) models have emerged as a promising solution for edge computing applications. However, the limited battery capacity and computational resources of edge devices, coupled with the high computational complexity of DL models, pose a significant challenge for their on-device deployment. The inference task alone can rapidly deplete the battery, compromising the device's performance and longevity. Although researchers have explored efficient deep neural network (DNN) models to reduce storage and computational requirements, the energy requirements of inference tasks remain a dominant concern for the future of edge intelligence. To address this concern, we propose an approach that leverages renewable energy sources, such as energy harvesting (EH), to provide self-sustainability and perpetual operation for IoT devices. However, EH-driven operation poses a critical challenge due to the stochastic availability of ambient energy sources, requiring devices to dynamically adjust their behavior to adapt to the energy supply. Our proposed solution is an energy-adaptive dynamic early exiting (EE) technique that enables efficient and accurate inference in an EH edge intelligence system. Specifically, our approach considers a DNN model that performs anytime resource-adaptive and input-adaptive inference and adopts EE as an adaptive energy-aware inference strategy. The derived policy determines the optimal amount of computational processing on a per-sample basis, which is critical for efficient and judicious use of available resources. Our numerical results demonstrate that implementing EE with an energy-aware exiting policy is essential when energy resources are scarce or intermittent.
This approach has broad implications for a range of applications, including smart homes, healthcare and environmental monitoring, and smart agriculture, among others.
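The per-sample early-exit decision can be pictured with the following sketch (an illustrative policy under our own assumptions, not the poster's derived policy): a network with several exit heads stops computing as soon as its prediction is confident enough, or when the next block would exceed the per-sample energy budget.

```python
import numpy as np

def energy_adaptive_inference(exit_probs, exit_costs, energy_budget, conf_thresh=0.9):
    """Toy energy-aware early exiting: evaluate exit heads in order and stop
    as soon as the prediction is confident enough, or when reaching the next
    head would exceed the per-sample energy budget.

    exit_probs : one class-probability vector per exit head (shallow to deep)
    exit_costs : cumulative energy cost of computing up to each head
    """
    decision = None
    for probs, cost in zip(exit_probs, exit_costs):
        if cost > energy_budget:          # cannot afford this head: stop here
            break
        decision = int(np.argmax(probs))
        if np.max(probs) >= conf_thresh:  # confident enough: exit early
            break
    return decision  # None means even the first head was unaffordable

# Example with three exit heads and a tight energy budget.
heads = [np.array([0.5, 0.5]), np.array([0.8, 0.2]), np.array([0.97, 0.03])]
costs = [1.0, 2.5, 4.0]
print(energy_adaptive_inference(heads, costs, energy_budget=3.0))
```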

Author: Burak Hasircioglu


Title: Over-the-Air Ensemble Inference with Model Privacy


Abstract: We consider distributed inference at the wireless edge, where multiple clients with an ensemble of models, each trained independently on a local dataset, are queried in parallel to make an accurate decision on a new sample. In addition to maximizing inference accuracy, we also want to maximize the privacy of the local models. We exploit the superposition property of the wireless medium to implement bandwidth-efficient ensemble inference methods. We introduce different over-the-air ensemble methods and show that these schemes perform significantly better than their orthogonal counterparts, while using fewer resources and providing privacy guarantees. We also provide experimental results verifying the benefits of the proposed over-the-air inference approach.
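A minimal numerical sketch of the over-the-air ensemble idea (our illustrative assumption, not the poster's exact scheme; the privacy mechanism is omitted): each client transmits its model's class scores simultaneously, the channel sums them, and the server decides from the noisy aggregate, using one channel use per class rather than one per client.

```python
import numpy as np

def ota_ensemble_decision(client_scores, snr_db=10.0, rng=None):
    """Toy over-the-air ensemble inference: all clients transmit their class
    scores at once, the channel adds them, and the server classifies from
    the noisy sum."""
    rng = np.random.default_rng() if rng is None else rng
    aggregate = np.sum(client_scores, axis=0)  # superposition in the channel
    sig_power = np.mean(aggregate ** 2)
    noise = rng.normal(0.0, np.sqrt(sig_power / 10 ** (snr_db / 10)), aggregate.shape)
    return int(np.argmax(aggregate + noise))

# Example: 10 clients, 5 classes; most clients favour class 3.
rng = np.random.default_rng(1)
scores = rng.dirichlet(np.ones(5), size=10)
scores[:, 3] += 0.5
print(ota_ensemble_decision(scores, snr_db=5.0, rng=rng))
```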


Author: Szymon Kobus


Abstract: Consider an agent/decoder placed at a fixed vertex of a directed weighted graph. The controller/encoder observes the location of a target placed at another random vertex of the graph, and its goal is to help the agent reach this target with the minimal total cost, dictated by the weights of the edges traversed along the way. The encoder can transmit only a limited number of bits to the decoder at each step of the algorithm. Our goal is to identify the optimal trade-off between the available communication budget and the average total cost. We formulate this problem as a goal-oriented compression problem with decoder constraints, which generalizes classical lossless compression problems. We show that this problem is in general NP-complete, and construct several suboptimal heuristic algorithms for solving it in polynomial time, with bounds on their suboptimality gap. We also show a lower bound on the expected cost for any coding scheme, assuming unit cost for each transition.


Author: Li Qiao 


Title: Unsourced Massive Access-Based Digital Over-the-Air Computation for Efficient Federated Edge Learning


Abstract: Over-the-air computation (OAC) is a promising technique to achieve fast model aggregation across multiple devices in federated edge learning (FEEL). In addition to the analog schemes, the one-bit digital aggregation (OBDA) scheme was proposed to adapt OAC to modern digital wireless systems. However, one-bit quantization in OBDA can result in serious information loss and slower convergence of FEEL. To overcome this limitation, this paper proposes an unsourced massive access (UMA)-based generalized digital OAC (GD-OAC) scheme. Specifically, at the transmitter, all the devices share the same non-orthogonal UMA codebook for uplink transmission. The local model update of each device is quantized based on the same quantization codebook. Then, each device transmits a sequence selected from the UMA codebook based on the quantized elements of its model update. At the receiver, we propose an approximate message passing-based algorithm for efficient UMA detection and model aggregation. Simulation results show that the proposed GD-OAC scheme significantly accelerates FEEL convergence compared with the state-of-the-art OBDA scheme, while using the same uplink communication resources.
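To make the codebook-based transmission step concrete, here is a toy sketch (the codebook size, Gaussian codewords, and quantization grid are our own assumptions, and a naive correlation decoder stands in for the paper's AMP-based detector): each device quantizes its update, sends the codeword indexed by the quantized value, and the server observes the noisy sum of all codewords.

```python
import numpy as np

rng = np.random.default_rng(0)
Q, L = 16, 64                                     # quantization levels, codeword length
codebook = rng.normal(size=(Q, L)) / np.sqrt(L)   # shared non-orthogonal codebook
levels = np.linspace(-1.0, 1.0, Q)                # shared quantization codebook

def transmit(update):
    """Quantize a scalar model update and send the matching codeword."""
    idx = int(np.argmin(np.abs(levels - update)))
    return codebook[idx]

# Three devices transmit over the same channel resource (superposition).
updates = [0.31, 0.29, -0.12]
received = sum(transmit(u) for u in updates)
received += rng.normal(0.0, 0.05, L)              # channel noise

# Naive detection: correlate with every codeword (stand-in for AMP).
scores = codebook @ received
print("most active codewords:", np.argsort(scores)[-3:])
```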


Author: Amirmohammad Farzaneh


Title: A practical random tree generator, its entropy, and compression


Abstract: Random trees play a significant role in modelling many real-world phenomena. Prominent examples are routing tables in network routing protocols and random forests in machine learning. However, there is a significant lack of random tree generators in the literature. In this project, we have designed a novel random tree generator that simulates how these trees are generated in practice. In most scenarios, there is an underlying graph that governs the way the random trees can be generated. For instance, a routing table is always a subtree of the network topology graph. Based on this, we introduce the Random Spanning Tree model. In this model, the underlying graph is first generated randomly, and then one of its spanning trees is chosen randomly as the output of the random tree generator. After introducing the random model, we focus on its entropy and try to quantify its information-theoretic properties. As the trees get bigger, the need to compress them efficiently grows. For this reason, after calculating the entropy of the model as the lower bound for compression, we look at optimal compression algorithms for this model. The proposed compression algorithm is shown to be asymptotically optimal as the trees grow larger. Finally, application scenarios in which tree compression can be used are discussed.
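The two-stage generator can be sketched in a few lines (our illustration; the Erdős–Rényi underlying graph and the Aldous–Broder random walk are assumed ingredients that the abstract does not specify):

```python
import random

def erdos_renyi(n, p, rng):
    """Step 1: random underlying graph as an adjacency list."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def connected(adj):
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)

def aldous_broder_spanning_tree(adj, rng):
    """Step 2: uniform random spanning tree via the Aldous-Broder walk:
    record the edge used on the first visit to each vertex."""
    current = rng.choice(list(adj))
    visited, tree = {current}, []
    while len(visited) < len(adj):
        nxt = rng.choice(sorted(adj[current]))
        if nxt not in visited:
            visited.add(nxt)
            tree.append((current, nxt))
        current = nxt
    return tree

rng = random.Random(42)
graph = erdos_renyi(12, 0.4, rng)
while not connected(graph):          # toy safeguard: the walk needs connectivity
    graph = erdos_renyi(12, 0.4, rng)
print(aldous_broder_spanning_tree(graph, rng))
```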


Author: Jiechen Chen


Abstract: Neuromorphic computing is an emerging technology that supports event-driven data processing for applications requiring efficient online inference and/or control. Recent work has introduced the concept of neuromorphic communications, whereby neuromorphic computing is integrated with impulse radio (IR) transmission to implement low-energy and low-latency remote inference in wireless IoT networks. In this paper, we introduce neuromorphic integrated sensing and communications (N-ISAC), a novel solution that enables efficient online data decoding and radar sensing. N-ISAC leverages a common IR waveform for the dual purpose of conveying digital information and detecting the presence or absence of a radar target. A spiking neural network (SNN) is deployed at the receiver to decode digital data and detect the radar target directly from the received signal. The SNN operation is optimized by balancing performance metrics for data communications and radar sensing, highlighting synergies and trade-offs between the two applications.


Title: Guaranteed Dynamic Scheduling of Ultra-Reliable Low-Latency Traffic via Conformal Prediction
Author: Kfir M. Cohen


Abstract: The dynamic scheduling of ultra-reliable and low-latency traffic (URLLC) in the uplink can significantly enhance the efficiency of coexisting services, such as enhanced mobile broadband (eMBB) devices, by only allocating resources when necessary. The main challenge is posed by the uncertainty in the process of URLLC packet generation, which mandates the use of predictors for URLLC traffic in the coming frames. In practice, such prediction may overestimate or underestimate the amount of URLLC data to be generated, yielding either an excessive or an insufficient amount of resources to be pre-emptively allocated for URLLC packets. In this paper, we introduce a novel scheduler for URLLC packets that provides formal guarantees on reliability and latency irrespective of the quality of the URLLC traffic predictor. The proposed method leverages recent advances in online conformal prediction (CP), and follows the principle of dynamically adjusting the amount of allocated resources so as to meet reliability and latency requirements set by the designer.
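One way to picture the principle of dynamically adjusting allocated resources is the standard online conformal prediction update (a generic sketch, not the paper's exact scheduler; the margin parameterization and learning rate are our assumptions): the allocation margin is raised after every under-allocation and lowered slowly otherwise, so the long-run violation rate converges to the target regardless of predictor quality.

```python
import numpy as np

def online_cp_allocation(predicted, actual, target_rate=0.01, lr=0.1):
    """Generic online conformal update: maintain a margin on top of the
    traffic predictor so that the fraction of frames with insufficient
    resources converges to target_rate."""
    margin = 0.0
    for pred, true in zip(predicted, actual):
        allocated = pred + margin
        violation = true > allocated  # URLLC demand exceeded the allocation
        margin += lr * ((1.0 if violation else 0.0) - target_rate)
    return margin

# Example: an imperfect predictor of Poisson URLLC traffic.
rng = np.random.default_rng(0)
true_demand = rng.poisson(5, size=1000)
noisy_pred = true_demand + rng.normal(0, 2, size=1000)
print(online_cp_allocation(noisy_pred, true_demand))
```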


Author: Martin Ferianc 
Title: MIMMO: Multi-Input Massive Multi-Output Neural Network
Abstract: Neural networks (NNs) have achieved superhuman accuracy in multiple tasks, but the certainty of their predictions is often debatable, especially when they are confronted with data from outside the training distribution. Averaging the predictions of an ensemble of NNs can recalibrate the certainty of the predictions, but an ensemble is computationally expensive to deploy in practice. Recently, a new hardware-efficient multi-input multi-output (MIMO) NN was proposed to fit an ensemble of independent NNs into a single NN. In this work, we propose adding early exits to the MIMO architecture, with inferred depth-wise weightings, to produce multiple predictions for the same input, giving a more diverse ensemble. We denote this combination as MIMMO: a multi-input, massive multi-output NN, and we show that it can achieve better accuracy and calibration than the MIMO NN, simultaneously fit more NNs, and remain similarly hardware-efficient as MIMO or the early-exit ensemble.
 


Author: Liu Ziang
 
Abstract: The massive multiple-input multiple-output (M-MIMO) architecture is the workhorse of modern communication systems. Currently, two fundamental bottlenecks, namely power consumption and receiver saturation, prevent this technology from achieving its full potential. These bottlenecks are intricately linked with the analog-to-digital converter (ADC) used in each radio frequency (RF) chain. The power consumption in M-MIMO systems grows exponentially with the ADC's bit budget, while ADC saturation causes permanent loss of information. This motivates the need for a solution that can simultaneously tackle the above-mentioned bottlenecks while offering advantages over existing alternatives such as low-resolution ADCs. Taking a radically different approach to this problem, we propose the λ-MIMO architecture, which uses modulo ADCs (Mλ-ADC) instead of conventional ADCs. Our work is inspired by the Unlimited Sampling Framework. The Mλ-ADC in the RF chain folds high dynamic range signals into low dynamic range modulo samples, thus alleviating the ADC saturation problem. At the same time, digitization of the modulo signal results in high-resolution quantization. In the novel λ-MIMO context, we discuss baseband signal reconstruction, detection, and uplink achievable sum-rate performance. The key takeaways of our work include: (a) a higher signal-to-quantization-noise ratio (SQNR); (b) detection and average uplink sum-rate performance comparable to a conventional, infinite-resolution ADC when using a 1-2 bit Mλ-ADC, enabling higher-order modulation schemes, e.g., 1024-QAM, that previously seemed impossible; (c) a superior trade-off between energy efficiency and bit budget, resulting in higher power efficiency. Numerical simulations and modulo ADC-based hardware experiments corroborate our theory and reinforce the clear benefits of the λ-MIMO approach.
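The folding operation at the heart of the Mλ-ADC can be written in a few lines (a conceptual sketch of modulo sampling with our own parameter choices, not the hardware implementation): the input is wrapped into the range [-λ, λ) before quantization, so large-amplitude signals never saturate the converter and the low-resolution quantizer resolves the folded signal finely relative to λ.

```python
import numpy as np

def modulo_adc(x, lam=1.0, bits=2):
    """Toy modulo ADC: fold the signal into [-lam, lam), then quantize
    with a uniform low-resolution quantizer over the folded range."""
    folded = np.mod(x + lam, 2 * lam) - lam  # centered modulo fold
    q = 2 ** bits
    step = 2 * lam / q
    return np.clip(np.floor((folded + lam) / step), 0, q - 1) * step - lam + step / 2

# A high-dynamic-range sine never saturates: it is folded, then resolved
# relative to lam rather than relative to its much larger amplitude.
t = np.linspace(0, 1, 8)
x = 5.0 * np.sin(2 * np.pi * t)
print(modulo_adc(x, lam=1.0, bits=2))
```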


Author: Lucas Theis


Title: Lossy Compression with Gaussian Diffusion


Abstract: We consider a novel lossy compression approach based on unconditional diffusion generative models, which we call DiffC. Unlike modern compression schemes, which rely on transform coding and quantization to restrict the transmitted information, DiffC relies on the efficient communication of pixels corrupted by Gaussian noise. We implement a proof of concept and find that it works surprisingly well despite the lack of an encoder transform, outperforming the state-of-the-art generative compression method HiFiC on ImageNet 64x64. DiffC uses only a single model to encode and denoise corrupted pixels at arbitrary bitrates. The approach also supports progressive coding, that is, decoding from partial bit streams. We perform a rate-distortion analysis to gain a deeper understanding of its performance, providing analytical results for multivariate Gaussian data as well as theoretical bounds for general distributions. Furthermore, we prove that a flow-based reconstruction achieves a 3 dB gain over ancestral sampling at high bitrates.


Author: Amirreza Zamanisiboni


Abstract: A private cache-aided compression problem is studied, where a server has access to a database of $N$ files, $(Y_1,...,Y_N)$, each of size $F$ bits, and is connected through a shared link to $K$ users, each equipped with a local cache of size $MF$ bits. In the placement phase, the server fills the users' caches without knowing their demands, while the delivery phase takes place after the users send their demands to the server. We assume that each file $Y_i$ is arbitrarily correlated with a private attribute $X$, and an adversary is assumed to have access to the shared link. The users and the server have access to a shared key $W$. The goal is to design the cache contents and the delivered message $\mathcal{C}$ such that the average length of $\mathcal{C}$ is minimized, while satisfying: i. the response $\mathcal{C}$ does not reveal any information about $X$, i.e., $X$ and $\mathcal{C}$ are independent, which corresponds to the perfect privacy constraint; ii. user $i$ is able to decode its demand, $Y_{d_i}$, by using $\mathcal{C}$, its local cache $Z_i$, and the shared key $W$. Since the database is correlated with $X$, existing codes for cache-aided delivery do not satisfy the perfect privacy condition. Instead, we propose a variable-length coding scheme that combines privacy-aware compression with coded caching techniques. In particular, we use a two-part code construction and the Functional Representation Lemma. Finally, we extend the results to the case where $X$ and $\mathcal{C}$ can be correlated, i.e., non-zero leakage is allowed.


Author: Selim F Yilmaz


Title: Distributed Deep Joint Source-Channel Coding over a Multiple Access Channel


Abstract: We consider distributed image transmission over a noisy multiple access channel (MAC) using deep joint source-channel coding (DeepJSCC). It is known that Shannon's separation theorem holds when transmitting independent sources over a MAC in the asymptotic infinite block length regime. However, we are interested in the practical finite block length regime, in which case separate source and channel coding is known to be suboptimal. We introduce a novel joint image compression and transmission scheme, where the devices send their compressed image representations in a non-orthogonal manner. While non-orthogonal multiple access (NOMA) is known to achieve the capacity region, to the best of our knowledge, a non-orthogonal joint source-channel coding (JSCC) scheme for practical systems has not been studied before. Through extensive experiments, we show significant improvements in the quality of the reconstructed images compared to orthogonal transmission employing current DeepJSCC approaches, particularly for low bandwidth ratios. We publicly share our source code to facilitate further research and reproducibility.
 

 

25th May 2023

Author: James Henderson


Title: Deciphering the molecular determinants of T-cell specificity using relevancy, redundancy and synergy


Abstract: T-cells can coordinate a targeted immune response by binding to specific snippets of biological material taken from tumours or invading pathogens. This ability is governed by their heterodimeric surface receptors (T-cell receptors/TCRs), whose function is largely determined by a sequence of hyper-variable amino acids at the end of the receptor structure. Given the large number of possible TCR-target combinations, a key challenge in modern immunology is predicting the biological function of a T-cell purely from its TCR sequence. Solving this problem requires a method for connecting the molecular code of the TCR to its observed immunological behaviour. We present an analysis framework based on second-order Rényi entropy in which we break the TCR sequence into its component features and assess the mutual information between each of these features and immune specificity. We show how informative features identified in this way may be used to construct machine learning-based models that predict the specificity of TCRs in a reliable and interpretable manner.
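The feature-relevance step can be illustrated with a tiny plug-in mutual-information estimator on synthetic data (an illustrative sketch using Shannon mutual information; the poster's framework is based on second-order Rényi entropy, and the data here are made up):

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in estimate of I(X; Y) in bits for two discrete sequences."""
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            pxy = np.mean((x == a) & (y == b))
            if pxy > 0:
                mi += pxy * np.log2(pxy / (np.mean(x == a) * np.mean(y == b)))
    return mi

# Toy example: one binary sequence feature is informative about the
# specificity label; another is pure noise.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=2000)
informative = np.where(rng.random(2000) < 0.8, labels, 1 - labels)
noise = rng.integers(0, 2, size=2000)
print(mutual_information(informative, labels))  # about 1 - H(0.8) = 0.28 bits
print(mutual_information(noise, labels))        # close to 0
```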


Author: Sangwoo Park


Title: Quantum Conformal Prediction for Reliable Uncertainty Quantification in Quantum Machine Learning
 
Abstract: Quantum machine learning is a promising programming paradigm for the optimization of quantum algorithms in the current era of noisy intermediate-scale quantum (NISQ) computers. A fundamental challenge in quantum machine learning is generalization, as the designer targets performance under testing conditions, while having access only to limited training data. Existing generalization analyses, while identifying important general trends and scaling laws, cannot be used to assign reliable and informative “error bars” to the decisions made by quantum models. In this article, we propose a general methodology that can reliably quantify the uncertainty of quantum models, irrespective of the amount of training data, of the number of shots, of the ansatz, of the training algorithm, and of the presence of quantum hardware noise. The approach, which builds on probabilistic conformal prediction, turns an arbitrary, possibly small, number of shots from a pre-trained quantum model into a set prediction, e.g., an interval, that provably contains the true target with any desired coverage level. Experimental results confirm the theoretical calibration guarantees of the proposed framework, referred to as quantum conformal prediction.
 


Author: Matteo Zecchin 


Title: Robust PACm: Training Ensemble Models Under Misspecification and Outliers 


Abstract: Bayesian learning has been shown to have limited generalization capabilities under misspecification and in the presence of outliers. In this poster we introduce (m,t)-robust Bayesian learning, a novel robust free-energy training criterion that combines the generalized logarithm scoring function with PACm ensemble bounds to counteract the detrimental effects of misspecification, with respect to both the likelihood and the prior distribution, and of outliers. We theoretically study the merits of (m,t)-robust Bayesian learning and provide experiments that highlight its enhanced generalization capabilities and calibration performance under model misspecification, prior misspecification, and data sets corrupted by outliers.


Author: Yunchuan Zhang

Title: Bayesian and Multi-Armed Contextual Meta-Optimization for Efficient Wireless Radio Resource Management


Abstract: Optimal resource allocation in modern communication networks calls for the optimization of objective functions that are only accessible via costly separate evaluations for each candidate solution. The conventional approach carries out the optimization of resource-allocation parameters for each system configuration, characterized, e.g., by topology and traffic statistics, using global search methods such as Bayesian optimization (BO). These methods tend to require a large number of iterations, and hence a large number of key performance indicator (KPI) evaluations.  In this paper, we  propose the use of meta-learning to transfer knowledge from data collected from related, but distinct, configurations in order to speed up optimization on new network configurations. Specifically, we combine meta-learning with BO, as well as with multi-armed bandit (MAB) optimization, with the latter having the potential advantage of operating directly on a discrete search space. Furthermore, we introduce novel contextual meta-BO and meta-MAB algorithms, in which transfer of knowledge across configurations occurs at the level of a mapping from graph-based contextual information to resource-allocation parameters. Experiments for the problem of open loop power control (OLPC) parameter optimization for the uplink of multi-cell multi-antenna systems provide insights into the potential benefits of meta-learning and contextual optimization.


Author: Fredrik Hellström
 
Title: Information-Theoretic Generalisation Bounds for Neural Networks and Meta Learning
 
Abstract: Information-theoretic generalisation bounds, due to their data- and algorithm-dependence, are promising tools for understanding machine learning performance when classical, complexity-based bounds are insufficient. In particular, the evaluated conditional mutual information (e-CMI) framework of Steinke and Zakynthinou has enabled numerically accurate bounds for deep neural networks, while being expressive enough to capture some classical bounds. Through the use of convex comparator functions, we derive a family of bounds in this framework which improve upon previous results for low training losses, as is common in deep learning. For multiclass classification, we demonstrate that they recover bounds in terms of the Natarajan dimension. By extending our techniques to meta-learning, we obtain an improved dependence on the number of samples per task, with rates that match known classical bounds for representation learning.


Author: Clement Ruah
 
Abstract: Commonly adopted in the manufacturing and aerospace sectors, digital twin (DT) platforms are increasingly seen as a promising paradigm to control, monitor, and analyze software-based, “open”, communication systems that are expected to dominate 6G deployments. Notably, DT platforms provide a sandbox in which to test artificial intelligence (AI) solutions for communication systems, potentially reducing the need to collect data and test algorithms in the field, i.e., on the physical twin (PT). A key challenge in the deployment of DT systems is to ensure that virtual control optimization, monitoring, and analysis at the DT are safe and reliable, avoiding incorrect decisions caused by “model exploitation”. To address this challenge, we present a general Bayesian framework with the aim of quantifying and accounting for model uncertainty at the DT that is caused by limitations in the amount and quality of data available at the DT from the PT. In the proposed framework, the DT builds a Bayesian model of the communication system, which is leveraged to enable core DT functionalities such as control via multi-agent reinforcement learning (MARL), monitoring of the PT for anomaly detection, prediction, data-collection optimization, and counterfactual analysis.


Author: Matias Altamirano


Abstract: Our work proposes an online, provably robust, and scalable Bayesian approach for changepoint detection. The resulting algorithm has key advantages over previous work: it provides provable robustness by leveraging the generalised Bayesian perspective, and also addresses the scalability issues of previous attempts. Specifically, the proposed generalised Bayesian formalism leads to conjugate posteriors whose parameters are available in closed form by leveraging diffusion score matching. The resulting algorithm is exact, can be updated through simple algebra, and is more than 10 times faster than its closest competitor.

Author: Shyam Ramesh


Abstract: Contextual Bayesian optimization (CBO) is a powerful framework for sequential decision-making given side information, with important applications, e.g., in wind energy systems. In this setting, the learner receives context (e.g., weather conditions) at each round, and has to choose an action (e.g., turbine parameters). Standard algorithms assume no cost for switching their decisions at every round. However, in many practical applications, there is a cost associated with such changes, which should be minimized. We introduce the episodic CBO with movement costs problem and, based on the online learning approach for metrical task systems of Coester and Lee (2019), propose a novel randomized mirror descent algorithm that makes use of Gaussian Process confidence bounds. We compare its performance with the offline optimal sequence for each episode and provide rigorous regret guarantees. We further demonstrate our approach on the important real-world application of altitude optimization for Airborne Wind Energy Systems. In the presence of substantial movement costs, our algorithm consistently outperforms standard CBO algorithms.


Author: Romain Chor

Title: More Communication Does Not Result in Smaller Generalization Error in Federated Learning

Abstract: We study the generalization error of statistical learning models in a Federated Learning (FL) setting. Specifically, there are K devices or clients, each holding its own independent dataset of size n. Individual models, learned locally via Stochastic Gradient Descent, are aggregated (averaged) by a central server into a global model and then sent back to the devices. We consider multiple (say R ∈ N∗) rounds of model aggregation and study the effect of R on the generalization error of the final aggregated model. We establish an upper bound on the generalization error that accounts explicitly for the effect of R (in addition to the number of participating devices K and the dataset size n). It is observed that, for fixed (n, K), the bound increases with R, suggesting that the generalization of such learning algorithms is negatively affected by too frequent communication with the parameter server. Combined with the fact that the empirical risk generally decreases for larger values of R, this indicates that R might be a parameter to optimize in order to reduce the population risk of FL algorithms. The presented results are also illustrated through numerical examples.
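For readers unfamiliar with the setup, a minimal federated averaging loop with R aggregation rounds looks like the following (a generic FedAvg sketch on a toy least-squares problem under our own assumptions; it illustrates the roles of R, n and K, not the paper's bound):

```python
import numpy as np

def fedavg(X_clients, y_clients, R=5, local_steps=20, lr=0.05):
    """Generic FedAvg: R rounds of local SGD followed by server averaging.
    Each client k holds its own dataset (X_clients[k], y_clients[k])."""
    d = X_clients[0].shape[1]
    w_global = np.zeros(d)
    for _ in range(R):                            # R rounds of aggregation
        local_models = []
        for X, y in zip(X_clients, y_clients):
            w = w_global.copy()
            for _ in range(local_steps):          # local GD on squared loss
                grad = X.T @ (X @ w - y) / len(y)
                w -= lr * grad
            local_models.append(w)
        w_global = np.mean(local_models, axis=0)  # server-side averaging
    return w_global

# K = 4 clients, n = 50 samples each, shared linear ground truth.
rng = np.random.default_rng(0)
w_true = rng.normal(size=3)
Xs = [rng.normal(size=(50, 3)) for _ in range(4)]
ys = [X @ w_true + 0.1 * rng.normal(size=50) for X in Xs]
print(fedavg(Xs, ys, R=10))
```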


Author: Sun Zhuo 

Title: Meta-learning Control Variates: Variance Reduction with Limited Data


Abstract: Control variates can be a powerful tool to reduce the variance of Monte Carlo estimators, but constructing effective control variates can be challenging when the number of samples is small. In this paper, we show that when a large number of related integrals need to be computed, it is possible to leverage the similarity between these integration tasks to improve performance even when the number of samples per task is very small. Our approach, called meta-learning CVs (Meta-CVs), can be used for up to hundreds or thousands of tasks. Our empirical assessment indicates that Meta-CVs can lead to significant variance reduction in such settings, and our theoretical analysis establishes general conditions under which Meta-CVs can be successfully trained.
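As background on the mechanism the poster builds on, here is the textbook control-variate estimator (a generic sketch, not Meta-CVs itself; the integrand and control variate are made-up examples): variance is reduced by subtracting a function g with known mean from the integrand.

```python
import numpy as np

def cv_estimate(f, g, g_mean, samples):
    """Control-variate Monte Carlo: estimate E[f(X)] using g with known
    mean E[g(X)] = g_mean. The coefficient is fitted from the samples."""
    fx, gx = f(samples), g(samples)
    beta = np.cov(fx, gx)[0, 1] / np.var(gx)  # minimizes estimator variance
    return np.mean(fx - beta * (gx - g_mean))

rng = np.random.default_rng(0)
x = rng.normal(size=200)            # few samples, as in the paper's setting
f = lambda t: np.exp(0.5 * t)       # integrand of interest
g = lambda t: t                     # control variate with E[g] = 0 under N(0,1)
plain = np.mean(f(x))
with_cv = cv_estimate(f, g, g_mean=0.0, samples=x)
print(plain, with_cv)               # both estimate E[exp(X/2)] = e^{1/8}
```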

Author: Kaiyu Li

Title: Multilevel Bayesian Quadrature
 
Abstract: Multilevel Monte Carlo is a key tool for approximating integrals involving expensive scientific models. The idea is to use approximations of the integrand to construct an estimator with improved accuracy over classical Monte Carlo. We propose to further enhance multilevel Monte Carlo through Bayesian surrogate models of the integrand, focusing on Gaussian process models and the associated Bayesian quadrature estimators. We show, using both theory and numerical experiments, that our approach can lead to significant improvements in accuracy when the integrand is expensive and smooth, and when the dimensionality is small or moderate. We conclude the paper with a case study illustrating the potential impact of our method in landslide-generated tsunami modelling, where the cost of each integrand evaluation is typically too large for operational settings.
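The multilevel idea underlying the method can be shown in a few lines (a plain multilevel Monte Carlo sketch with made-up level costs and biases; the poster replaces the sample means below with Bayesian quadrature estimates):

```python
import numpy as np

def mlmc(sample_level, n_per_level):
    """Plain multilevel Monte Carlo: E[f_L] = E[f_0] + sum_l E[f_l - f_{l-1}].
    sample_level(l, n) returns n paired evaluations (f_l, f_{l-1})."""
    total = 0.0
    for level, n in enumerate(n_per_level):
        fine, coarse = sample_level(level, n)
        total += np.mean(fine - coarse)  # telescoping correction term
    return total

# Toy model: f_l(X) = X**2 computed with a level-dependent bias 2**-l,
# so cheaper (coarser) levels are more biased.
rng = np.random.default_rng(0)

def sample_level(level, n):
    x = rng.normal(size=n)
    f = lambda l: x**2 + 2.0**-l
    return f(level), (f(level - 1) if level > 0 else np.zeros(n))

# Many cheap coarse samples, few expensive fine ones.
print(mlmc(sample_level, n_per_level=[4000, 1000, 250]))  # approx E[X^2] + 2^-2
```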


Author: Haiyun He

Title: How Does Pseudo-Labeling Affect the Generalization Error of the Semi-Supervised  Gibbs Algorithm?

Abstract: We provide an exact characterization of the expected generalization error (gen-error) for semi-supervised learning (SSL) with pseudo-labeling via the Gibbs algorithm. The gen-error is expressed in terms of the symmetrized KL information between the output hypothesis, the pseudo-labeled dataset, and the labeled dataset. Distribution-free upper and lower bounds on the gen-error can also be obtained. Our findings offer the new insight that the generalization performance of SSL with pseudo-labeling is affected not only by the information between the output hypothesis and the input training data, but also by the information shared between the labeled and pseudo-labeled data samples. This serves as a guideline for choosing an appropriate pseudo-labeling method from a given family of methods. To deepen our understanding, we further explore two examples: mean estimation and logistic regression. In particular, we analyze how the ratio $\lambda$ of the number of unlabeled to labeled data affects the gen-error under both scenarios. As $\lambda$ increases, the gen-error for mean estimation decreases and then saturates at a value larger than when all the samples are labeled; the gap can be quantified exactly with our analysis and depends on the cross-covariance between the labeled and pseudo-labeled data samples. For logistic regression, the gen-error and the variance component of the excess risk also decrease as $\lambda$ increases.