Dr Ahsan Adeel
Theme Lead and Director, Conscious Multisensory Integration (CMI) Lab
Lead CI/ Wolv PI, EPSRC £4 million transformative healthcare technologies 2050 programme grant
Fellow, Howard Brain Sciences Foundation
Fellow, MIT Synthetic Intelligence Lab and Oxford Computational Neuroscience Lab
Visiting EPSRC/MRC Fellow, University of Stirling
Reader in Artificial Intelligence, University of Wolverhampton
Dr Adeel is an internationally recognised expert in brain-inspired multimodal information processing. He is the Lead CI of the prestigious EPSRC £4 million transformative healthcare technologies (2050) programme grant, along with Prof. Hussain (PI). His proposed multisensory 5G IoT-enabled audio-visual (AV) hearing aid (HA) project is ranked second in the EPSRC’s healthcare technologies grand challenge of frontiers of physical intervention. EPSRC believes that the project vision represents a step-change in how healthcare will be delivered in future. As a PI, he is leading vision 2050, which aims to leverage the complementary strengths of both the cloud-based system and internet-independent 'brain-like' on-chip multimodal (MM) big data processing, to enable a contextual switching option in HAs by 2050. Adeel is a pioneer of conscious multisensory integration (CMI) theory, which sheds light on how our brain contextually and selectively integrates incoming multisensory information at the cellular level. It is believed that if the universal contextual field (UCF) he introduced in pyramidal cells exists, and behaves in the way he described in his work, this could be a major contribution to our understanding of the intracellular mechanisms responsible for producing coherent thoughts, percepts, and actions, which are well-adapted to different situations and long-term goals. In collaboration with Qualcomm (San Diego), Alpha Data, Oxford Computational Neuroscience, and Plymouth BRIC, he is exploring and exploiting the potential of CMI theory to realize his grand 2050 vision of future brain-inspired technologies, including energy-efficient internet-independent on-chip big data processing, novel AI tools for diagnosing neurological disorders, collaborative and trustworthy partnerships between humans and machines, and novel therapeutic interventions and assistive technologies. He is an electrical engineer and a cognitive scientist. He holds BEng (Electrical), MSc (Electronics) and PhD (Cognitive Computing) degrees. He is a visiting EPSRC/MRC Research Fellow at the University of Stirling, and also a Fellow at MIT Synthetic Intelligence Lab, Oxford Computational Neuroscience Lab, and Howard Brain Sciences Foundation.
Cognitively-inspired multimodal (MM) hearing-aid:
Developing the world's first MM hearing aid (that can see) in collaboration with the University of Sheffield, the Medical Research Council (MRC), and Sonova Switzerland (a leading hearing aid manufacturer). The listening device extracts speech from noise by using a camera to see what the speaker is saying, filtering out the competing sound. This ability is well beyond that of current audio-only HA technology and has the potential to improve the quality of life of the millions of people suffering from hearing loss.
Conscious multisensory integration:
Going beyond what W.A. Phillips (Phillips et al., 2011) and G. Hinton (Lillicrap et al., 2020) have proposed, I have recently developed a novel theory of conscious multisensory integration (CMI). In contrast to the unconditional excitatory and inhibitory activity in existing deep neural networks (DNNs), the theory supports conditional amplification/suppression of feedforward signals with respect to the external environment. The theory sheds light on some crucial neuroscience questions, including: How does the brain integrate the incoming multisensory signals with respect to different external environments? How are the roles of these multisensory signals defined to adhere to the anticipated behavioural constraints of the environment?
Understanding information decomposition in conscious multisensory integration:
This work aims to further understand information decomposition in conscious multisensory integration. Specifically, we are quantifying the suppression and attenuation of multisensory (AV) signals in terms of the four basic arithmetic operators (addition, subtraction, multiplication and division) and their various forms. The aim is to analyze how the information in a CMI model is decomposed into components that are unique to each signal and components that carry multiway mutual (shared) information.
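As a rough illustration of what quantifying shared information under different arithmetic operators can look like, the sketch below computes the mutual information between a binary "audio" input and the output of two toy combinations (addition and multiplication) of a binary audio and a binary visual input. It is a minimal example under stated assumptions, not the CMI model: the binary inputs, the uniform input distribution, and the names `mutual_information`, `mi_add` and `mi_mul` are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import product
from math import log2


def mutual_information(pairs):
    """I(X;Y) in bits, treating `pairs` as equally likely (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


# Assumption: two independent, uniformly distributed binary inputs.
inputs = list(product([0, 1], repeat=2))  # all (audio, visual) bit pairs

# Information that the combined output carries about the audio input alone.
mi_add = mutual_information([(a, a + v) for a, v in inputs])
mi_mul = mutual_information([(a, a * v) for a, v in inputs])

print(f"addition:       {mi_add:.3f} bits")
print(f"multiplication: {mi_mul:.3f} bits")
```

In this toy setting the additive output retains more information about the audio input than the multiplicative one, because multiplication by a zero visual input removes the audio input from the output entirely. A full decomposition into unique and shared components would need a partial information decomposition measure, which this sketch does not attempt.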
Computational modelling of biological audio-visual processing in Alzheimer's and Parkinson's diseases using conscious multisensory integration:
Sensory impairments have an enormous impact on our lives and are closely linked to cognitive functioning. Neurodegenerative processes in Alzheimer's disease (AD) and Parkinson's disease (PD) affect the structure and functioning of neurons, resulting in altered neuronal activity. For example, patients with AD suffer from sensory impairment and lack the ability to channelize awareness. However, the cellular and neuronal circuit mechanisms underlying this disruption are elusive. Therefore, it is important to understand how multisensory integration changes in AD/PD, and why patients fail to guide their actions. This project aims to further extend the existing preliminary CMI research to understand how the roles of audio and visual cues change with respect to the outside world in patients with neurodegenerative diseases (e.g. AD/PD).
Explainable artificial intelligence:
Undoubtedly, existing AI and deep learning systems exhibit impressive performance and carry out tasks that are normally performed by humans. Yet these end-to-end multimodal AI models operate at the network level, fail to justify their reasoning, and offer limited generalization and real-time analytics, thereby restricting their application in areas where outcomes have an impact on humans. Humans, on the other hand, can extrapolate from a small number of examples, and are quick to learn and to generalize lessons learned in one situation to instances that occur in different contexts. In this work, we are using CMI and advances in information decomposition to address these problems and develop explainable AI (XAI) algorithms.
Low-power neuromorphic chips:
This research work aims to develop energy efficient (low-power) neuromorphic chips and IoT sensors by exploiting the controlled firing property of the CMI theory. The CMI model inherently leverages the complementary strengths of incoming multisensory signals with respect to the outside environment and anticipated behaviour.
EPSRC funded project: Towards flexible electronic hearing aid (HA) implementations
In collaboration with the University of Manchester, we are creating an audio-visual (AV) HA platform based upon flexible electronics, which are now being made as “temporary tattoos” for improved discreetness and social acceptability.
EPSRC funded project: On-chip big data processing
In collaboration with the University of Manchester, Alpha Data, and ENU, we are implementing deep cognitive neural network (DCNN) features for autonomous, privacy-preserving transfer learning (TL). Preliminary work has demonstrated that such DCNN architectures are capable of highly energy-efficient, on-chip implementations, with fast decision-making, excellent generalization, and large gains in per-operation scaling with deep structures for large-scale processing. Very large-scale simulations, comprising 1M neurons and 2.5B synapses, have demonstrated up to 300X faster decision-making compared to DNNs.
EPSRC funded project: Privacy-preserving, multimodal (MM) lip-reading (LR)
In collaboration with the University of Glasgow, we are exploring the groundbreaking technology of ambient radio frequency (RF) for lip-reading.
EPSRC funded project: deep transfer learning (TL)
In collaboration with the University of Edinburgh and ENU, we are developing deep TL based generalized audio-visual (AV) speech enhancement (SE) algorithms. We are further building our innovative, context-aware DNN based AV mask estimation and SE filtering models, including through top-down models of speech, inspired by human cognition and evolution.
Hearing Loss Testing:
In collaboration with Princeton University, New Jersey and the National University of Computer and Emerging Sciences, we are developing an automated cost-effective pre-screening test to predict hearing loss at an early stage. The device can potentially offer a second opinion to audiologists and can also be utilized in developing countries or rural areas where there is a lack of well-educated audiologists.
Dementia Sensitive Personalized Environment Planner App:
With the support of Dementia Services Development Centre at the University of Stirling, we are empowering people with cognitive impairments (e.g. dementia, autism, major depressive disorder) to proactively choose their personalized surrounding environment using a 5G small cell technology driven proactive environment planner app.
Embedded Security for IoT:
Developing a pioneering technology that is capable of providing on-chip low-power intrusion detection and encryption in embedded and multi-core computing systems. These represent a cost-effective alternative, and a comparatively superior approach, to state-of-the-art ARM (Arm Cortex-A, Cortex-M23, and Cortex-M33) processors - TrustZone (http://arm.com/products/processors/technologies/trustzone) and Intel's work (https://software.intel.com/en-us/articles/intel-virtualization-technology-for-directed-io-vt-d-enhancing-intel-platforms-for-efficient-virtualization-of-io-devices).
IoT sensors:
In collaboration with the National University of Science and Technology, we are developing a new IoT standard, DeepNode. DeepNode stands as a major enabler for future smart cities, healthcare and industrial monitoring, and environmental/earth (remote) sensing. The DeepNodeWAN is capable of processing a large amount of sensitive data quickly with low power consumption and high throughput, complying with the intelligent secure RRM and diverse communication requirements in massive real-time communication domains.
AV Ear Defenders - SE Application in Navy and Military:
Collaboration with the University of Texas at Dallas to explore and exploit the potential of our developed AV speech enhancement technology in the US Navy (for people controlling aircraft carrier deck operations), US military (for officers not wearing earplugs), air traffic control towers (to improve communication and reduce the risk of accidents), and cargo trains (to address driver distraction).
Collaboration with the Tianjin University of Technology to explore the application of our disruptive multimodal speech processing technology in extremely noisy environments e.g. in situations where ear defenders are worn, such as emergency and disaster response and battlefield environments.
Collaboration with the Edinburgh Medical School to understand the role of exogenous sex steroid hormones in female patients with asthma. Specifically, we are finding the correlation between the use of hormonal contraceptives and asthma exacerbations in reproductive age females.
Multiphase Flow Meter Calibration:
A novel deep learning driven time-series predictive and optimization model for uncertainty growth prediction and calibration intervals optimization. The technology addresses the limitations of state-of-the-art mathematical/statistical uncertainty growth and calibration intervals predictive methods such as limited modelling assumptions, limited learning, lack of ability to deal with non-linear complex behaviours, and poor scalability. State-of-the-art literature reveals that it is difficult to solve the calibration optimization equation in closed form.
Collision Free Wi-Fi routing:
A novel deep learning driven collision-free Wi-Fi routing algorithm to enable a larger number of nodes in smart cities. The social and digital infrastructure of an IoT-based smart city could be boosted by deploying high-density public Wi-Fi. Indeed, Wi-Fi is key to smart cities. Existing Wi-Fi devices follow the 802.11 standards, which aim to share the channel fairly among devices. However, the throughput of existing Wi-Fi networks suffers from high packet loss, and the networks support only a very limited number of nodes at low data rates.
Acute general hospital admission:
In collaboration with the Dementia Services Development Centre at the University of Stirling, we are helping policymakers to explore predictors of good/bad outcomes following acute general hospital admission for people with cognitive impairment.
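To make the scaling problem concrete, the sketch below estimates how often contending nodes collide when each one independently picks a random backoff slot, which is the basic idea behind contention-based channel sharing. It is a deliberately simplified model under stated assumptions, not the 802.11 DCF and not the routing algorithm described above: the contention window is fixed (no exponential backoff), every node always has a packet to send, and the function name `collision_probability` and the default `window=16` are hypothetical.

```python
def collision_probability(n_nodes, window=16):
    """Probability that the earliest chosen backoff slot is shared.

    Assumption: each of `n_nodes` independently picks a slot uniformly
    from range(window). A transmission succeeds only when exactly one
    node holds the smallest slot; otherwise the attempt collides.
    """
    success = sum(
        # One node picks slot s, all the others pick a strictly later slot.
        n_nodes * (1 / window) * ((window - 1 - s) / window) ** (n_nodes - 1)
        for s in range(window)
    )
    return 1.0 - success


for n in (2, 10, 50):
    print(f"{n:>3} nodes: collision probability {collision_probability(n):.3f}")
```

Even in this toy model the collision probability climbs steeply as nodes are added to a fixed window, which is the behaviour that limits node density in shared-channel networks.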
Keynotes/Research Visits (2017 onward)
Local organizing committee chair, IEEE World Congress on Computational Intelligence (IEEE WCCI) 2020, 19 - 24th July, 2020, Glasgow (UK)
Keynote speaker at the National Pattern Recognition Laboratory, Chinese Academy of Sciences, Beijing, Oct 19th 2019. Talk on Conscious Multisensory Integration
Invited talk on AV speech processing, School of Computing Sciences, University of East Anglia, January, 2019
Invited talk on multisensory integration and its application to low-power neuromorphic chips, School of Computer Science, University of Manchester, Feb 2019
Invited talk on Accurate Model Of The Retinal Response, at the Computational Neuroscience and Cognitive Robotics Centre, Nottingham Trent University, March 2019
Invited visit to UTD for exploitation of our developed AV speech enhancement technology in the US Navy, Jan 2018
Invited visit to MIT for the development of a novel highly energy efficient, Deep Cognitive Neural Network (DCNN) for cognitive IoT devices and neuromorphic chips, Dec 2017
Invited visit to Harvard for possible collaboration on skin based flexible electronics development, Dec 2017
Invited speaker at SICSA Conference on Big Data Science Innovations: Prospects in Smart Cities, Media and Governance, Nov, 2016
Invited talk on contextual audio-visual processing, Computing Science and Maths Seminars, University of Stirling, January, 2019
Keynote on AI application to geological disaster management, Harbin Institute of Technology, Harbin, China, British Council-China initiative, supported by Newton Fund Researcher Links, April 2017
Keynote speaker, Suzhou University of Science and Technology, China, April 2018
Keynote invitation, Fifth International Conference on Biosignals, Images and Instrumentation (ICBSII 2019), SSN College of Engineering, Chennai, Tamil Nadu, 14-15th March 2019
Keynote invitation, 2nd World Congress on Mechanical and Mechatronics Engineering (WCMME-2019), April 15-16, 2019 at Dubai, UAE
GCU guest speaker, RiSE 2nd Conference, School of Engineering and Built Environment, Glasgow Caledonian University, June 2018
Keynote, Workshop on Big Data-driven Condition-monitoring and Signal-processing with Applications to the Oil & Gas Industry, Glasgow Caledonian University, June 2017
Talk at the Medical Research Council Network Meeting for Hearing-Impaired Listeners, Stirling, May, 2018
Invited visit to Edinburgh School of Art, ESRC Charter house Project, Oct 2017
Invited visit to Edinburgh medical school, meeting on OPCRD Sex Hormones & Asthma, Nov, 2017
Invited talk, lip-reading driven hearing-aid technology, Stockholm University, Sweden, Sept, 2017
Invited talk at the University of Oxford, AI based automated liver cancer diagnosis, July, 2017
Talk at the Medical Research Council Network Meeting for Hearing-Impaired Listeners, MRC Cardiff, July 2017
Invited visit to the Scottish Dementia Research Consortium Event, Dundee, 20th April, 2017
Workshop chair, IEEE Symposium Series on Computational Intelligence (SSCI 2017), Dec 2017
Invited visit, Dementia Design App, University of Stirling, External Advisory Board Meeting, Sept 2017
Invited visit, GCRF, SDG6 (Water and Sanitation) Stirling Meeting, September 2017
Noisy situations cause huge problems for sufferers of hearing loss, as hearing aids often make the signal more audible but do not always restore intelligibility. In noisy settings, humans routinely exploit the audio-visual (AV) nature of speech to selectively suppress the background noise and to focus on the target speaker. In this paper, we present a causal, language, noise and speaker independent AV deep neural network (DNN) architecture for speech enhancement (SE). The model exploits the noisy acoustic cues and noise-robust visual cues to focus on the desired speaker and improve speech intelligibility. To evaluate the proposed SE framework, a first-of-its-kind AV binaural speech corpus, called ASPIRE, is recorded in real noisy environments including a cafeteria and a restaurant. We demonstrate superior performance of our approach in terms of objective measures and subjective listening tests over the state-of-the-art SE approaches as well as recent DNN based SE models. In addition, our work challenges a popular belief that the scarcity of multi-language, large-vocabulary AV corpora and of a wide variety of noises is a major bottleneck to building robust language, speaker and noise independent SE systems. We show that a model trained on a synthetic mixture of the Grid corpus (with 33 speakers and a small English vocabulary) and CHiME-3 noises (consisting of only bus, pedestrian, cafeteria, and street noises) generalises well not only to large-vocabulary corpora but also to completely unrelated languages (such as Mandarin) and a wide variety of speakers and noises.
This new publicly available dataset is based on the benchmark audio-visual GRID corpus, which was originally developed by our project partners at Sheffield for speech perception and automatic speech recognition. The new dataset contains a range of joint audio-visual vectors, in the form of 2D-DCT visual features and the equivalent audio log-filterbank vectors. All visual vectors were extracted by tracking and cropping the lip region in a range of Grid videos (1000 videos from each of five speakers, giving a total of 5000 videos), and then transforming the region with a 2D-DCT. The audio vectors were extracted by windowing the audio signal and transforming each frame into a log-filterbank vector. The visual signal was then interpolated to match the audio, and a number of large datasets were created, with the frames shuffled randomly to prevent bias, and with different pairings, including multiple visual frames to estimate a single audio frame (from one-visual-to-one-audio pairings up to 28-visual-to-one-audio pairings). This dataset will enable researchers to evaluate how well audio speech can be estimated using visual information only. Specifically, novel speech enhancement algorithms (including those based on advanced machine learning) can be applied to it to evaluate the potential of exploiting visual cues for speech enhancement.
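The visual feature extraction and frame pairing described above can be sketched as follows. This is a dependency-free illustration under stated assumptions rather than the pipeline used to build the dataset: the lip region is assumed to be already tracked and cropped into a small 2-D grid of pixel values, the visual vectors are assumed to be already interpolated to the audio frame rate, and the names `visual_vector` and `pair_frames`, the `keep` parameter, and the choice to align each window with the audio frame of its last visual frame are all hypothetical.

```python
from math import cos, pi, sqrt


def dct_1d(values):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(values)
    out = []
    for k in range(n):
        scale = sqrt(1.0 / n) if k == 0 else sqrt(2.0 / n)
        out.append(scale * sum(
            v * cos(pi * (2 * i + 1) * k / (2 * n)) for i, v in enumerate(values)
        ))
    return out


def dct_2d(region):
    """2D-DCT: transform every row, then every column of the result."""
    rows = [dct_1d(row) for row in region]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def visual_vector(lip_region, keep=4):
    """Flatten the low-frequency keep x keep corner of the 2D-DCT.

    `keep` is a hypothetical choice; the source does not state how many
    coefficients are retained.
    """
    coeffs = dct_2d(lip_region)
    return [coeffs[r][c] for r in range(keep) for c in range(keep)]


def pair_frames(visual, audio, context=1):
    """Pair `context` consecutive visual vectors with one audio vector.

    Assumption: `visual` and `audio` have the same number of frames, and
    each window is matched to the audio frame of its last visual frame.
    """
    pairs = []
    for t in range(context - 1, len(audio)):
        window = [x for frame in visual[t - context + 1 : t + 1] for x in frame]
        pairs.append((window, audio[t]))
    return pairs
```

Setting `context=1` gives the one-visual-to-one-audio pairing, and larger values give the many-to-one pairings; shuffling the resulting list before training would correspond to the random frame shuffling mentioned in the description.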
ASPIRE is a first-of-its-kind audio-visual speech corpus recorded in real noisy environments (such as cafes and restaurants), which can be used to support reliable evaluation of multi-modal speech filtering technologies. This dataset follows the same sentence format as the audio-visual Grid corpus.
A detailed description of the AV challenge, a novel real noisy AV corpus (ASPIRE), the benchmark speech enhancement task, and baseline performance results are outlined in [Link]. The baseline results are based on training a deep neural architecture on a synthetic mixture of the Grid corpus and CHiME-3 noises (consisting of bus, pedestrian, cafe, and street noises) and testing on the ASPIRE corpus. Subjective evaluations of five different speech enhancement algorithms (SEGAN, spectral subtraction (SS), log-minimum mean-square error (LMMSE), audio-only CochleaNet, and AV CochleaNet) are presented as baseline results. The aim of the multi-modal challenge is to provide a timely opportunity for comprehensive evaluation of novel AV speech enhancement algorithms, using our new benchmark, real-noisy AV corpus and specified performance metrics. This will promote AV speech processing research globally, stimulate new ground-breaking multi-modal approaches, and attract interest from companies, academics and researchers working in AV speech technologies and applications. We encourage participants (through a challenge website sign-up) from both the speech and hearing research communities, to benefit from their complementary approaches to AV speech-in-noise processing.
Selected Publications (2017-2019)