Pulse Wave Dataset of 30,618 Virtual Subjects (Ages 25-75) with Circle of Willis Variations
The full database is available for download here, along with the MATLAB and Python scripts used to create it and to analyse the pulse waves, including machine learning models that classify circle of Willis (CoW) topology from pulse waves.

Using Nektar1D, we created a database of 30,618 virtual subjects (ages 25-75) with complete and incomplete CoW topologies. Each subject's data includes arterial pressure (P), flow rate (Q), flow velocity (U), luminal area (A), and photoplethysmogram (PPG) pulse waves at various sites, alongside simulation parameters (e.g., vessel geometries, CoW topology, cardiac output, arterial stiffness). The dataset was validated against in vivo data.
Ahmet Sen, Miquel Aguirre, Peter H Charlton, Laurent Navarro, Stephane Avril and Jordi Alastruey. Machine learning-based pulse wave analysis for classification of circle of Willis topology: An in silico study with 30,618 virtual subjects. Biomedical Signal Processing and Control 100:106999, 2024.
Background and Objective: The topology of the circle of Willis (CoW) is crucial in cerebral circulation and significantly impacts patient management. Incomplete CoW structures increase stroke risk and post-stroke damage. Current detection methods using computed tomography and magnetic resonance scans are often invasive, time-consuming, and costly. This study investigated the use of machine learning (ML) to classify CoW topology through arterial blood flow velocity pulse waves (PWs), which can be noninvasively measured with Doppler ultrasound.
Methods: A database of in silico PWs from 30,618 virtual subjects, aged 25 to 75 years, with complete and incomplete CoW topologies was created and validated against in vivo data. Seven ML architectures were trained and tested using 45 combinations of carotid, vertebral and brachial artery PWs, with varying levels of artificial noise to mimic real-world measurement errors. SHapley Additive exPlanations (SHAP) were used to interpret the predictions made by the artificial neural network (ANN) models.
Results: A convolutional neural network achieved the highest accuracy (98%) for CoW topology classification using a combination of one vertebral and one common carotid velocity PW without noise. Under a 20% noise-to-signal ratio, a multi-layer perceptron model had the highest prediction rate (79%). All ML models performed best for topologies lacking posterior communication arteries. Mean and peak systolic velocities were identified as key features influencing ANN predictions.
Conclusions: ML-based PW analysis shows significant potential for efficient, noninvasive CoW topology detection via Doppler ultrasound. The dataset, post-processing tools, and ML code are freely available to support further research.
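The artificial noise mentioned in the Methods can be illustrated with a minimal Python sketch. It is not the code distributed with the database: the function name `add_noise`, the synthetic waveform, and in particular the definition of the noise-to-signal ratio (noise standard deviation divided by the root mean square of the clean waveform) are assumptions made for illustration, and the study may define the ratio differently.

```python
import numpy as np


def add_noise(waveform, noise_to_signal=0.2, rng=None):
    """Return the waveform with zero-mean Gaussian noise added.

    Assumption: the noise standard deviation equals `noise_to_signal`
    times the RMS of the clean waveform.
    """
    rng = np.random.default_rng() if rng is None else rng
    waveform = np.asarray(waveform, dtype=float)
    signal_rms = np.sqrt(np.mean(waveform ** 2))
    noise = rng.normal(0.0, noise_to_signal * signal_rms, size=waveform.shape)
    return waveform + noise


# Synthetic stand-in for one cardiac cycle of flow velocity (m/s).
# It is NOT taken from the database; it only gives the function an input.
t = np.linspace(0.0, 1.0, 500, endpoint=False)
clean = 0.3 + 0.2 * np.sin(2 * np.pi * t) ** 2

# A seeded generator makes the noisy waveform reproducible.
noisy = add_noise(clean, noise_to_signal=0.2, rng=np.random.default_rng(0))
```

Passing a seeded generator keeps training and test sets reproducible, and calling the function with `noise_to_signal=0.0` returns the clean waveform unchanged.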


Incomplete CoW topologies increase stroke risk and post-stroke permanent damage in subjects with severe stenosis in the afferent arteries and are linked to various cerebrovascular conditions, including atherosclerosis, post-surgery intracerebral haemorrhage, intracranial aneurysms, white matter disease, and brain ageing.

The database was used to provide an in silico proof of concept for the feasibility of using Doppler ultrasound and artificial neural networks (ANNs) to detect CoW topology noninvasively by analysing carotid and vertebral artery flow velocity waveforms. This approach offers a potentially cost-effective alternative to angiography, aiding in disease risk assessment and preoperative decision-making.
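The two features reported as most influential for the ANN predictions, mean and peak systolic velocity, are simple to compute from a velocity waveform. The sketch below is a hypothetical helper, not part of the released scripts: the name `velocity_features` is invented here, the input is assumed to be a single cardiac cycle sampled uniformly in time, and peak systolic velocity is taken to be the maximum of that cycle.

```python
import numpy as np


def velocity_features(velocity):
    """Return mean and peak systolic velocity of one cardiac cycle.

    Assumptions: `velocity` holds exactly one uniformly sampled cycle,
    and the peak systolic velocity is the maximum over that cycle.
    """
    velocity = np.asarray(velocity, dtype=float)
    return {
        "mean_velocity": float(velocity.mean()),
        "peak_systolic_velocity": float(velocity.max()),
    }


# Toy samples in m/s, chosen only to exercise the function.
features = velocity_features([0.2, 0.6, 0.4, 0.2])
```

Features computed this way for each measurement site (for example one common carotid and one vertebral artery) could be concatenated into a single input vector for a classifier.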
Funders