Dr Ahsan Adeel
Director, Conscious Multisensory Integration (CMI) Lab
Co-PI/ Deputy Director (Research & Innovation), UKRI EPSRC £4 million Healthcare Technologies 2050 Programme
Fellow, Howard Brain Sciences Foundation
Fellow, MIT Synthetic Intelligence Lab and Oxford Computational Neuroscience Lab
Reader/ Associate Professor in Artificial Intelligence, University of Stirling
Adeel is a leading expert on the use of 21st-century neurobiology for next-generation biologically plausible multisensory AI, which he refers to as "The Beginning of Real Understanding". His work constitutes a significant contribution to the revolution occurring in the sciences of brain, mind, and AI. Though not yet widely accepted, it has started to influence prominent neuroscientists, psychologists, and AI experts, marking it as an emerging trend worth noting. See Demo: http://cmilab.org/research/
Check out the latest multi-scale perspective from the flagship Human Brain Project (HBP); his work stands out as a notable highlight (Link: HBP, 2023). Also see the latest review paper by P. Poirazi, which recognises his work as of "outstanding interest" in next-generation neuromorphic computing (Link). His work is a major highlight in the first-ever book on two-point neurons by Prof. W.A. Phillips, published by the Oxford University Press in March 2023, with strong endorsements from leading Neuroscientists, Psychologists, Pathologists, and Philosophers.
Adeel's laboratory is primarily interested in understanding and mimicking the structure and function of two-point layer 5 pyramidal cells (L5PCs) to develop biologically plausible multisensory (MS) deep neural nets, trustworthy AI tools, neuromorphic chips, and intelligent nano-electronic devices.
His pioneering work into cooperative context-sensitive neural information processing (2023, 2022a, 2022b, 2018) offers fundamental implications for the interpretation of the L5PC mechanism and unlocks, for the first time, a transformative computational potential of context-sensitive L5PCs to effectively and efficiently process large amounts of heterogeneous real-world MS data. His work supports the hypothesis that the processing and learning capabilities of L5PCs may be fundamental to the abilities of the mammalian neocortex and could circumvent the computational limitations of deep learning (DL).
Adeel's contribution to this rapidly growing fundamental advance in cellular neurobiology is encouraging AI experts to exploit L5PCs in state-of-the-art AI/DL models for applications where speed and the efficient use of energy are crucial. It is also encouraging neurobiologists to search for the essentials (fine tunings) necessary to make this neurobiological mechanism solve complex real-world problems.
His work has brought him into various academic and government projects of significant magnitude, addressing the pressing need for secure, environmentally viable, economical, and resilient AI solutions for the success of emerging technologies in health care, space, underwater, robotics, autonomous cars, and manufacturing.
He is involved in several large-scale national and international projects, including the world’s first brain-inspired multisensory hearing aid that will improve the quality of life and mental health of millions of people with hearing problems; new AI chips for future Mars rovers to go farther, faster, and do more science; truly human-like robots for effective therapeutic interventions and assistive technologies; and biologically plausible models to understand the cellular foundations of consciousness, anaesthesia, dreaming, hallucinations, autism, and other neurological and developmental disorders.
Adeel pioneered conscious multisensory integration (CMI) theory, which sheds light on how our brain contextually integrates multisensory information at the cellular level. It is believed that if the contextual fields he introduced in L5PCs exist and behave in the way described in his work, this could be a major contribution to our understanding of the intracellular mechanisms responsible for producing coherent thoughts, percepts, and actions that are well-adapted to different situations and long-term goals.
He is an electrical engineer and a cognitive scientist. He holds B. Eng. (Electrical), MSc (Electronics) and PhD (Cognitive Computing) degrees. He is a visiting EPSRC/MRC Research Fellow at the University of Stirling, and also a Fellow at MIT Synthetic Intelligence Lab, Oxford Computational Neuroscience Lab, and Howard Brain Sciences Foundation.
Cognitively-inspired multimodal (MM) hearing-aid:
Developing the world’s first biologically plausible multisensory hearing aid that uses video information from lip movements to selectively amplify speech signals heard in noisy environments. It has been shown to be able to remove background noise so well that it can generate speech output in a noisy environment that is as clear as in a noiseless environment. Thus, it is now possible to offer people with impaired hearing intelligent lip-reading hearing aids that will make it far easier for them to perceive speech.
Conscious multisensory integration:
Developing a novel theory of conscious multisensory integration (CMI), which, as opposed to the unconditional excitatory and inhibitory activity in existing deep neural networks (DNNs), supports conditional amplification/suppression of feedforward signals with respect to the external environment.
The theory sheds light on some crucial neuroscience questions, including: How does the brain integrate the incoming multisensory signals with respect to different external environments? How are the roles of these multisensory signals defined to adhere to the anticipated behavioural-constraint of the environment?
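The contrast between unconditional and conditional processing can be made concrete with a small sketch. The transfer function below is an illustrative assumption, not the published CMI model: the names additive_unit and modulated_unit, and the particular gain formula, are hypothetical. It only shows the general idea that context can amplify or suppress a feedforward signal without being able to drive the output by itself.

```python
import math


def additive_unit(r, c):
    """Conventional unit: context c is simply summed with the feedforward input r."""
    return math.tanh(r + c)


def modulated_unit(r, c):
    """Hypothetical context-sensitive unit (illustrative assumption only).

    The context c sets a gain in (0, 2) on the feedforward input r:
    the gain is above 1 when r and c agree in sign (amplification),
    below 1 when they disagree (suppression), and exactly 1 when c == 0.
    With r == 0 the output is 0 whatever the context says.
    """
    gain = 2.0 / (1.0 + math.exp(-r * c))
    return math.tanh(r * gain)


if __name__ == "__main__":
    for r, c in [(1.0, 2.0), (1.0, 0.0), (1.0, -2.0), (0.0, 5.0)]:
        print(f"r={r:+.1f} c={c:+.1f} "
              f"additive={additive_unit(r, c):+.3f} "
              f"modulated={modulated_unit(r, c):+.3f}")
```

In the additive unit, a strong context produces output even when there is no feedforward signal; in the modulated unit it cannot, which is the sense of "conditional" used here.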
Understanding information decomposition in conscious multisensory integration:
This work aims to further understand information decomposition in conscious multisensory integration. Specifically, we are quantifying the suppression and attenuation of multisensory (AV) signals in terms of the four basic arithmetic operators (addition, subtraction, multiplication and division) and their various forms. The aim is to analyze how, in a CMI model, the information is decomposed into components unique to each signal and components carrying multiway mutual/shared information.
Computational modelling of biological audio-visual processing in Alzheimer's and Parkinson's diseases using conscious multisensory integration: Sensory impairments have an enormous impact on our lives and are closely linked to cognitive functioning. Neurodegenerative processes in AD and PD affect the structure and functioning of neurons, resulting in altered neuronal activity. For example, patients with AD suffer from sensory impairment and lack the ability to channelize awareness. However, the cellular and neuronal circuit mechanisms underlying this disruption are elusive. Therefore, it is important to understand how multisensory integration changes in AD/PD, and why patients fail to guide their actions. This project aims to further extend the existing preliminary CMI research to understand how the roles of audio and visual cues change with respect to the outside world in patients with neurodegenerative diseases (e.g. AD/PD).
Explainable artificial intelligence:
Undoubtedly, existing AI and deep learning
systems exhibit impressive performance and carry out tasks that are normally
performed by humans. Yet, these end-to-end multimodal AI models operate at the
network level and fail to justify their reasoning, with limited generalization and real-time
analytics, thereby restricting their application in areas where outcomes have an
impact on humans. Humans, on the other hand, can extrapolate from a small number
of examples, and are quick to learn and to generalize lessons learned in one situation
to instances that occur in different contexts. In this work, we are using CMI and advances in information decomposition to address these problems
and develop XAI algorithms.
Low-power neuromorphic chips:
This research work aims to develop energy-efficient
(low-power) neuromorphic chips and IoT sensors by exploiting the controlled
firing property of the CMI theory. The CMI model inherently leverages the complementary strengths of incoming multisensory signals
with respect to the outside environment and anticipated behaviour.
EPSRC funded project: Towards flexible electronic hearing aid (HA) implementations
In collaboration with the University of Manchester, we are creating an audio-visual (AV) HA
platform based upon flexible electronics, which are now being made as “temporary tattoos” for improved discreetness and social acceptability.
EPSRC funded project: On-chip big data processing
In collaboration with the University of Manchester, Alpha Data, and ENU, we are implementing deep cognitive neural network (DCNN) features for autonomous, privacy-preserving transfer learning (TL). Preliminary work has demonstrated that such DCNN architectures are capable of highly energy-efficient, on-chip implementations, with fast decision-making, excellent generalization, and large gains per-operation scaling with deep structures for large-scale processing. Very large-scale simulations, comprising 1M neurons and 2.5B synapses, have demonstrated up to 300X faster decision-making compared to DNNs.
EPSRC funded project: Privacy-preserving, multimodal (MM) lip-reading (LR)
In collaboration with the University of Glasgow, we are exploring the groundbreaking technology of ambient radio frequency (RF) sensing for lip-reading.
EPSRC funded project: deep transfer learning (TL)
In collaboration with the University of Edinburgh and ENU, we are developing deep TL-based generalized audio-visual (AV) speech enhancement (SE) algorithms. We are further developing our innovative, context-aware DNN-based AV mask estimation and SE filtering models, including through top-down models of speech inspired by human cognition and evolution.
Hearing Loss Testing:
In collaboration with Princeton University, New
Jersey and the National University of Computer and Emerging Sciences, we are developing an automated cost-effective pre-screening test to
predict hearing loss at an early stage. The device
can potentially offer a second opinion to audiologists and can also be utilized in
developing countries or rural areas where trained
audiologists are scarce.
Dementia Sensitive Personalized Environment Planner App:
With the support of
Dementia Services Development Centre at the University of Stirling, we are
empowering people with cognitive impairments (e.g. dementia, autism, major
depressive disorder) to proactively choose their personalized surrounding
environment using a 5G small cell technology driven proactive environment planner
app.
Embedded Security for IoT:
Developing a pioneering technology that is capable of providing on-chip, low-power intrusion detection and encryption in embedded and multi-core
computing systems. It represents a cost-effective alternative, and a
comparatively superior approach, to state-of-the-art ARM processors (Arm Cortex-A, Cortex-M23, and Cortex-M33) with TrustZone
(http://arm.com/products/processors/technologies/trustzone) and Intel's work
(https://software.intel.com/en-us/articles/intel-virtualization-technology-for-directed-io-vt-d-enhancing-intel-platforms-for-efficient-virtualization-of-io-devices).
IoT sensors: In collaboration with the National University of Science and Technology, we are developing a new IoT standard, DeepNode. DeepNode stands as a major enabler for future smart cities, healthcare and industrial monitoring, and environmental/earth (remote) sensing. The DeepNodeWAN is capable of processing a large amount of sensitive data quickly with low power consumption and high throughput, complying with the intelligent secure RRM and diverse communication requirements in massive real-time communication domains.
AV Ear Defenders - SE Application in Navy and Military:
Collaboration with the University of Texas at
Dallas to explore and exploit the potential of our developed AV speech enhancement
technology in the US Navy (for people controlling aircraft carrier deck operations),
US military (for officers not wearing earplugs), air traffic control towers (to improve
communication and reduce the risk of accidents), and cargo trains (to address driver
distraction).
Disaster Management:
Collaboration with the Tianjin University of Technology to
explore the application of our disruptive multimodal speech processing technology in
extremely noisy environments e.g. in situations where ear defenders are worn, such
as emergency and disaster response and battlefield environments.
Asthma:
Collaboration with the Edinburgh Medical School to understand the role of
exogenous sex steroid hormones in female patients with asthma. Specifically,
we are investigating the correlation between the use of hormonal contraceptives and
asthma exacerbations in females of reproductive age.
Multiphase Flow Meter Calibration:
A novel deep learning driven time-series
predictive and optimization model for uncertainty growth prediction and calibration
intervals optimization. The technology addresses the limitations of state-of-the-art
mathematical/statistical uncertainty growth and calibration intervals predictive
methods such as limited modelling assumptions, limited learning, lack of ability to
deal with non-linear complex behaviours, and poor scalability. State-of-the-art
literature reveals that it is difficult to solve the calibration optimization equation in
closed form.
Collision Free Wi-Fi routing:
A novel deep learning driven collision-free Wi-Fi
routing algorithm to enable a larger number of nodes in smart cities. The social and digital
infrastructure of an IoT-based smart city could be boosted by deploying high-density
public Wi-Fi. Indeed, Wi-Fi is key to smart cities. Existing Wi-Fi devices operate
following the 802.11 standards, with the aim of fairly using the channel that the devices
share. However, the throughput performance of existing Wi-Fi networks suffers
from high packet loss, and they support only a very limited number of nodes at low data rates.
Acute general hospital admission: In collaboration with the Dementia Services Development Centre at the University of Stirling, we are helping policymakers to explore predictors of good/bad outcomes following acute general hospital admission for people with cognitive impairment.
Keynotes/Research Visits (2017 onward)
Local organizing committee chair, IEEE World Congress on Computational Intelligence (IEEE WCCI) 2020, 19 - 24th July, 2020, Glasgow (UK)
Keynote speaker at the National Pattern Recognition Laboratory, Chinese Academy of Sciences, Beijing, Oct 19th 2019. Talk on Conscious Multisensory Integration
Invited talk on AV speech processing, School of Computing Sciences, University of East Anglia, January, 2019
Invited talk on multisensory integration and its application to low-power neuromorphic chips, School of Computer Science, University of Manchester, Feb 2019
Invited talk on Accurate Model Of The Retinal Response, at the Computational Neuroscience and Cognitive Robotics Centre, Nottingham Trent University, March 2019
Invited visit to UTD for exploitation of our developed AV speech enhancement technology in the US Navy, Jan 2018
Invited visit to MIT for the development of a novel highly energy efficient, Deep Cognitive Neural Network (DCNN) for cognitive IoT devices and neuromorphic chips, Dec 2017
Invited visit to Harvard for possible collaboration on skin based flexible electronics development, Dec 2017
Invited speaker at SICSA Conference on Big Data Science Innovations: Prospects in Smart Cities, Media and Governance, Nov, 2016
Invited talk on contextual audio-visual processing, Computing Science and Maths Seminars, University of Stirling, January, 2019
Keynote on AI application to geological disaster management, Harbin Institute of Technology, Harbin, China, British Council-China initiative, supported by Newton Fund Researcher Links, April 2017
Keynote speaker, Suzhou University of Science and Technology, China, April 2018
Keynote invitation, Fifth International Conference on Biosignals, Images and Instrumentation (ICBSII 2019), SSN College of Engineering, Chennai, Tamil Nadu, 14-15th March 2019
Keynote invitation, 2nd World Congress on Mechanical and Mechatronics Engineering (WCMME-2019), April 15-16, 2019 at Dubai, UAE
GCU guest speaker, RiSE 2nd Conference, School of Engineering and Built Environment, Glasgow Caledonian University, June 2018
Keynote, Workshop on Big Data-driven Condition-monitoring and Signal-processing with Applications to the Oil & Gas Industry, Glasgow Caledonian University, June 2017
Talk at the Medical Research Council Network Meeting for Hearing-Impaired Listeners, Stirling, May, 2018
Invited visit to Edinburgh School of Art, ESRC Charter house Project, Oct 2017
Invited visit to Edinburgh medical school, meeting on OPCRD Sex Hormones & Asthma, Nov, 2017
Invited talk, lip-reading driven hearing-aid technology, Stockholm University, Sweden, Sept, 2017
Invited talk at the University of Oxford, AI based automated liver cancer diagnosis, July, 2017
Talk at the Medical Research Council Network Meeting for Hearing-Impaired Listeners, MRC Cardiff, July 2017
Invited visit to the Scottish Dementia Research Consortium Event, Dundee, 20th April, 2017
Workshop chair, IEEE Symposium Series on Computational Intelligence (SSCI 2017), Dec 2017
Invited visit, Dementia Design App, University of Stirling, External Advisory Board Meeting, Sept 2017
Invited visit, GCRF, SDG6 (Water and Sanitation) Stirling Meeting, September 2017
Noisy situations
cause huge problems for sufferers of hearing loss, as hearing aids often make
the signal more audible but do not always restore intelligibility. In noisy
settings, humans routinely exploit the audio-visual (AV) nature of speech
to selectively suppress the background noise and to focus on the target
speaker. In this paper, we present a causal, language-, noise- and
speaker-independent AV deep neural network (DNN) architecture for speech
enhancement (SE). The model exploits the noisy acoustic cues and
noise-robust visual cues to focus on the desired speaker and improve speech
intelligibility. To evaluate the proposed SE framework, a first-of-its-kind AV
binaural speech corpus, called ASPIRE, was recorded in real noisy
environments, including a cafeteria and a restaurant. We demonstrate superior
performance of our approach in terms of objective measures and subjective
listening tests over state-of-the-art SE approaches as well as recent
DNN-based SE models. In addition, our work challenges a popular belief that the
scarcity of multi-language, large-vocabulary AV corpora and of a wide variety of
noises is a major bottleneck to building robust language-, speaker- and
noise-independent SE systems. We show that a model trained on a synthetic mixture
of the Grid corpus (with 33 speakers and a small English vocabulary) and CHiME-3
noises (consisting of only bus, pedestrian, cafeteria, and street noises)
generalises well not only to large-vocabulary corpora but also to completely
unrelated languages (such as Mandarin) and a wide variety of speakers and noises.
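A synthetic mixture of clean speech and recorded noise is typically built by scaling the noise to a chosen signal-to-noise ratio (SNR) before adding it to the speech. The sketch below illustrates that step only; the function name mix_at_snr, the SNR value, and the toy sine and Gaussian signals are assumptions for illustration and are not taken from the paper, which does not state its mixing procedure here.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that speech power / noise power matches snr_db, then add it."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0.0:
        raise ValueError("noise segment is silent")
    # Solve p_speech / (scale**2 * p_noise) = 10 ** (snr_db / 10) for scale.
    scale = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + scale * n for s, n in zip(speech, noise)]


# Toy stand-ins for a clean utterance and a noise recording (1 s at 16 kHz).
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
mixture = mix_at_snr(speech, noise, snr_db=0.0)
```

In practice the speech would come from a corpus such as Grid and the noise from a noise set such as CHiME-3, with the SNR drawn from a range so that the model sees both easy and hard conditions.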
This new
publicly available dataset is based on the benchmark audio-visual GRID
corpus, which was originally developed by our project partners at Sheffield for
speech perception and automatic speech recognition. The new dataset
contains a range of joint audiovisual vectors, in the form of 2D-DCT visual
features, and the equivalent audio log-filterbank vector. All visual vectors were
extracted by tracking and cropping the lip region of a range of Grid videos
(1000 videos from each of five speakers, giving a total of 5000 videos), and then
transforming the region with 2D-DCT. The audio vector was extracted by
windowing the audio signal, and transforming each frame into a log-filterbank
vector. The visual signal was then interpolated to match the audio, and a
number of large datasets were created, with the frames shuffled randomly to
prevent bias, and with different pairings, including multiple visual frames to
estimate a single audio frame (from one visual to one audio pairings, to 28
visual to one audio pairings). This dataset will enable researchers to evaluate
how well audio speech can be estimated using visual information only.
Specifically, novel speech enhancement algorithms
(including those based on advanced machine learning) can be applied to
evaluate the potential of exploiting visual cues for speech enhancement.
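The interpolation and pairing steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the dataset's actual tooling: the function names interpolate_frames and pair_frames are hypothetical, linear interpolation is assumed, and the choice of a trailing window (the k most recent visual frames for each audio frame) is an assumption, since the description does not say how the window is aligned.

```python
def interpolate_frames(frames, target_len):
    """Linearly interpolate a sequence of feature vectors to target_len frames."""
    if not frames or target_len < 1:
        raise ValueError("need at least one frame and a positive target length")
    if len(frames) == 1 or target_len == 1:
        return [list(frames[0]) for _ in range(target_len)]
    step = (len(frames) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * step
        lo = min(int(pos), len(frames) - 2)  # index of the left neighbour
        frac = pos - lo                      # weight of the right neighbour
        out.append([(1.0 - frac) * a + frac * b
                    for a, b in zip(frames[lo], frames[lo + 1])])
    return out


def pair_frames(visual, audio, k):
    """Pair the k most recent visual frames (flattened) with each audio frame."""
    if len(visual) != len(audio):
        raise ValueError("visual and audio must be aligned frame by frame")
    pairs = []
    for t in range(k - 1, len(audio)):
        window = [x for frame in visual[t - k + 1:t + 1] for x in frame]
        pairs.append((window, audio[t]))
    return pairs


# Toy example: 3 visual frames stretched to match 5 audio frames, then k = 2.
visual = [[0.0], [1.0], [2.0]]
audio = [[float(i)] for i in range(5)]
upsampled = interpolate_frames(visual, len(audio))
pairs = pair_frames(upsampled, audio, k=2)
```

Setting k = 1 gives the one-visual-to-one-audio pairing, and larger k values give the multi-frame pairings; the resulting pairs can then be shuffled before training.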
ASPIRE is a first-of-its-kind
audiovisual speech corpus recorded in real noisy environments (such as
cafes and restaurants), which can be used to support reliable evaluation of
multi-modal speech filtering technologies. This dataset follows the same sentence
format as the audiovisual Grid corpus.
A detailed
description of the AV challenge, a novel real noisy AV corpus (ASPIRE),
benchmark speech enhancement task, and baseline performance results are
outlined in [Link]. The latter are based on training a deep neural architecture
on a synthetic mixture of the Grid corpus and CHiME-3 noises (consisting of bus,
pedestrian, cafe, and street noises) and testing on the ASPIRE corpus.
Subjective evaluations of five different speech enhancement algorithms
(including SEGAN, spectral subtraction (SS), log-minimum mean-square
error (LMMSE), audio-only CochleaNet, and AV CochleaNet) are presented
as baseline results. The aim of the multi-modal challenge is to provide a
timely opportunity for comprehensive evaluation of novel AV speech
enhancement algorithms, using our new benchmark, real-noisy AV corpus
and specified performance metrics. This will promote AV speech processing
research globally, stimulate new ground-breaking multi-modal approaches,
and attract interest from companies, academics and researchers working in
AV speech technologies and applications. We encourage participation
(through a challenge website sign-up) from both the speech and hearing
research communities, so that the challenge can benefit from their complementary approaches to AV
speech-in-noise processing.
Selected Publications (2017-2020)