List of our publicly released datasets. Please cite the paper if you use any of these datasets.

Physiological dataset

Authors: Nilesh Kumar Sahu, Snehil Gupta, Haroon R Lone.

The dataset captures physiological and self-reported responses from participants during a baseline condition and three anxiety-inducing activities (Speech, Group Discussion, Interview). Each activity consists of three phases. Anticipation data is also included, covering participants’ preparation time before the Speech, Group Discussion, and Interview activities. The dataset was collected in controlled settings at IISER Bhopal using Shimmer ECG and Shimmer GSR kits at a sampling rate of 1024 Hz. The dataset accompanies the IMWUT paper titled “MAD: A Multimodal Physiological and Self-Reported Dataset for Anxiety Research from a Low-to-Middle-Income Country.”

Night-time traffic signboard dataset

Authors: Aditya Mishra, Akshay Aggarwal, Haroon R Lone.

The traffic signboard dataset consists of 6004 night-time images with 14044 signboard instances spanning a carefully curated set of 41 classes. The dataset covers six districts, includes both rural and urban signboards, and features varied lighting conditions that depend on the availability of street lights and the time of night. The dataset also includes four additional classes that are not present in existing datasets.

Knowledge (Question-Answer) dataset

Authors: Vimaleswar A, Prabhu Nandan Sahu, Nilesh Kumar Sahu, Haroon R Lone.

The knowledge dataset, in question-answer (Seeker-Supporter) form, can be used for fine-tuning. It is derived exclusively from openly licensed, publicly available textbooks. The dataset accompanies the paper titled “An Offline Mobile Conversational Agent for Mental Health Support: Learning from Emotional Dialogues and Psychological Texts with Student-Centered Evaluation.”

Thermal images dataset

Authors: Arijit Samal, Haroon R Lone.

The dataset contains thermal images of study participants collected in dense and sparse settings. Dense settings correspond to classrooms, and sparse settings correspond to discussion/research rooms. Please read the paper’s abstract for details.

Speech audio dataset

Authors: Nilesh K Sahu, Manjeet Yadav, Haroon R Lone.

The dataset contains audio recordings and self-reported anxiety scores from 105 participants, recorded while they delivered anxiety-inducing speeches.

PPG dataset

Authors: Pranay Jaiswal, Nilesh K Sahu, Haroon R Lone.

The dataset comprises photoplethysmography (PPG) data from 32 participants, obtained using both a Samsung smartwatch and a Shimmer sensor. In addition to PPG data, the dataset also includes accelerometer data. Participants engaged in various activities such as sitting, standing, and walking during the data collection process.

INFRAred Dataset for Occupancy Estimation and Localization (INFRADEL)

Authors: Soumya Ranjan Sahoo, Haroon R Lone.

This dataset contains thermal images, including 2-channel gray-scale thermal images and images converted to RGB format. It captures a diverse range of scenarios, including both static and dynamic settings with varying levels of occupancy, making it well suited to developing and testing occupancy estimation algorithms. The dataset was collected in the classrooms of IISER Bhopal.

Traffic signs dataset

Authors: Rishabh Uikey, Haroon R Lone, Akshay Agarwal.

The dataset contains Indian traffic signboards capturing signs such as left turn, right turn, speed limit, stop, U-turn, no-parking, etc. Annotations (height, width, depth, label, and bounding-box coordinates) are recorded in a separate file for each signboard. The dataset contains around 8512 signboards.

Rash driving patterns dataset (IMU)

Authors: Durgesh Mishra, Manoj Gulati, Haroon R Lone.

This dataset contains Inertial Measurement Unit (IMU) data collected from a smartphone in a running car. The IMU data includes acceleration along the X and Y axes and angular velocity around the Z axis of the car. The dataset captures signatures of five distinct rash driving patterns: Lane Weaving, Lane Swerving, Hard Braking, Hard Cornering, and Quick U-turn. The data spans five hours and is sampled at a rate of 100 Hz.
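
As a rough sanity check when loading the data, the stated duration and sampling rate imply how many samples to expect per channel. The sketch below is illustrative only: the function names and the 2-second window length are assumptions and are not taken from the dataset's documentation, and the actual recordings may not total exactly five hours.

```python
def expected_samples(duration_hours: float, rate_hz: int) -> int:
    """Number of samples per channel for a recording of the given length."""
    return int(duration_hours * 3600 * rate_hz)


def window_count(n_samples: int, rate_hz: int, window_seconds: float) -> int:
    """Number of complete, non-overlapping windows that fit in n_samples."""
    window_len = int(window_seconds * rate_hz)
    return n_samples // window_len


total = expected_samples(5, 100)        # five hours at 100 Hz
windows = window_count(total, 100, 2)   # hypothetical 2-second windows
print(total, windows)
```

A file whose per-channel length differs greatly from this estimate may have been loaded with the wrong delimiter or may cover only part of the recording.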

Cough dataset

Authors: Pranay Jaiswal, Haroon R Lone.

The dataset contains cough samples collected via a Samsung smartwatch under controlled settings while participants performed different activities. Please read the paper for further details.

I-BLEND energy dataset

Authors: Haroon R Lone, Pushpendra Singh, Amarjeet Singh.

I-BLEND consists of 52 months’ worth of electrical energy data, sampled at one-minute intervals, gathered from both commercial and residential buildings within an academic institute campus in India. It also includes occupancy data for each building on the campus, sampled every 10 minutes.
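
Because the energy and occupancy streams are sampled at different intervals, one common way to compare them is to aggregate the one-minute readings into 10-minute bins. The following is a minimal sketch under stated assumptions: the input is a plain list of numeric readings (the dataset's actual file layout is not described here), the function name is hypothetical, and averaging is assumed to be appropriate (summing may suit cumulative energy values better).

```python
from statistics import mean


def to_ten_minute_means(minute_readings, factor=10):
    """Average consecutive groups of `factor` one-minute readings.

    A trailing group with fewer than `factor` readings is dropped.
    """
    usable = len(minute_readings) - len(minute_readings) % factor
    return [
        mean(minute_readings[i:i + factor])
        for i in range(0, usable, factor)
    ]


# Synthetic stand-in for 25 one-minute readings (not real I-BLEND data).
readings = [float(v) for v in range(25)]
aligned = to_ten_minute_means(readings)
print(aligned)
```

Each value in `aligned` then corresponds to one 10-minute occupancy sample, provided both series start at the same timestamp.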