
This project is designed to support mixed-ability collaboration between blind and low-vision (BLV) and sighted pairs by conveying where a collaborator is referencing on a shared screen. Pointing gestures or phrases like “it’s next to the red box” do not help people who access information non-visually. To address this, we designed a system that tracks the gaze and pointing gestures of a sighted collaborator, identifies the referenced object (e.g., a paragraph or button), and relays this information to the BLV person’s screen reader. Our goal is to improve collaboration between people of all abilities by improving their communication and reducing their task burden. We also hope to conduct a user study to test the efficacy of the system and to add voice recognition.
This repository contains code to collect and filter gaze data and send it to a website. The functions in tobiiLive.py are the most recent versions; tobiiTest was used to develop the eye-tracking functions. The code covers data collection, filtering/processing, and visualization for the Tobii Pro Fusion eye tracker. It can calculate centroids live, includes a custom calibration function, and can write data to a CSV file afterwards. It also contains a Python implementation of the Tobii I-VT Fixation Filter, a fixation classification algorithm. A sample webpage, web.html, includes JavaScript for receiving live data from a Flask server. For more information, see the full description at the top of tobiiLive.py.
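As a rough illustration of how live gaze data can be handed to the webpage, the sketch below shows a minimal Flask endpoint that serves the most recent centroid as JSON. The route name and payload fields are placeholders for this example, not the exact interface used by tobiiLive.py and web.html.

```python
# Minimal sketch: expose the latest fixation centroid to a webpage over Flask.
# The route name and payload keys are illustrative placeholders.
from flask import Flask, jsonify

app = Flask(__name__)

# Updated by the eye-tracking loop; kept as a plain dict for simplicity.
latest_centroid = {"x": None, "y": None, "timestamp": None}

@app.route("/gaze")
def gaze():
    # The JavaScript in the webpage can poll this endpoint to get the
    # current centroid as JSON and highlight the referenced region.
    return jsonify(latest_centroid)

if __name__ == "__main__":
    app.run(port=5000)
```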
More information and instructions regarding the hand tracking part of this project can be found in the README file in the hand_tracking folder.
This project is built and tested on a Tobii Pro Fusion eye tracker.
- Install the Tobii Pro SDK for Python (tested with v1.11): https://developer.tobiipro.com/python/python-sdk-reference-guide.html
- Install the necessary packages: matplotlib, numpy, Flask (for the server), and pygame (for the custom calibration). A minimal connectivity check is sketched below.
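As a quick sanity check under the setup above, the sketch below uses the Tobii Pro SDK for Python (tobii_research) to find the connected tracker and print a few gaze samples. It is only a connectivity test, not a replacement for the collection code in tobiiLive.py.

```python
# Connectivity check: find the Tobii tracker and print ~2 seconds of gaze data.
import time
import tobii_research as tr

def gaze_callback(gaze_data):
    # gaze_data is a dict because as_dictionary=True is passed to subscribe_to.
    print(gaze_data["device_time_stamp"],
          gaze_data["left_gaze_point_on_display_area"],
          gaze_data["right_gaze_point_on_display_area"])

trackers = tr.find_all_eyetrackers()
if not trackers:
    raise RuntimeError("No Tobii eye tracker found")
tracker = trackers[0]
print("Connected to", tracker.model, tracker.serial_number)

tracker.subscribe_to(tr.EYETRACKER_GAZE_DATA, gaze_callback, as_dictionary=True)
time.sleep(2)  # let a couple of seconds of samples stream in
tracker.unsubscribe_from(tr.EYETRACKER_GAZE_DATA, gaze_callback)
```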
This image shows our setup, which includes a 24" monitor, an Azure Kinect camera above the monitor and a Tobii Pro Fusion eye tracker attached to the bottom bezel of the monitor.
This image shows where the sensors are collecting data using dashed lines (pink coming from the eye tracker to the eye and cyan from the Azure Kinect to the pointing finger). There is a red line representing gaze to the estimated position on the screen and a yellow line representing pointing to the screen. The estimated area of interest is highlighted on the screen in a red box.
In the raw data plotted above, red and blue show the calculated positions for each eye, and green shows data interpolated from the user's dominant eye (the left in this example). From this data, centroids were calculated based on the time and position of the gaze points. Each centroid is averaged from all the points displayed in the convex hull graph.
This project produces two key CSV files from the gaze data collected with the Tobii Pro Fusion eye tracker.
This file, output.csv, contains raw gaze data and additional computed features for each gaze sample.
| Column Name | Description |
|---|---|
| `new_timestamps` | System time (UTC) when the gaze sample was recorded. |
| `device_time_stamp` | Time (in microseconds) from the eye tracker’s internal clock (since start). |
| `left_gaze_point_on_display_area` | [x, y] coordinates (in pixels) of the left-eye gaze on the screen. |
| `right_gaze_point_on_display_area` | [x, y] coordinates (in pixels) of the right-eye gaze on the screen. |
| `inter_gaze_point_on_display_area` | [x, y] interpolated gaze point used when valid data was missing. |
| `selected_eye` | The eye used for this data point (`left`, `right`, `inter`, or `none`). |
| `index` | Sequential index of the gaze sample. |
| `angular_distance` | Angular distance (degrees) between gaze points in the velocity calculation window. |
| `velocity` | Gaze velocity (degrees/second), computed from angular distance over time. |
| `window_1` / `window_2` | Indices of the first and last points in the velocity window for the gaze sample. |
| `gaze_origin_in_user_coordinate_system` | [x, y, z] position of the user's head relative to the tracker (in millimeters). |
| validity columns | Values such as `left_gaze_origin_validity` or `inter_gaze_origin_validity` (1 = valid, 0 = invalid). |
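A quick way to inspect this file is to load it and summarize validity and velocity. The sketch below assumes pandas (not a project dependency, used here only for illustration) and the column names listed above.

```python
# Sketch: summarize the raw gaze CSV (output.csv) using the columns above.
import pandas as pd

df = pd.read_csv("output.csv")

# Fraction of samples where the left eye produced a valid gaze origin.
valid_frac = (df["left_gaze_origin_validity"] == 1).mean()
print(f"Left-eye validity: {valid_frac:.1%} of {len(df)} samples")

# Which eye was selected for each sample (left, right, inter, or none).
print(df["selected_eye"].value_counts())

# Gaze velocity (degrees/second) is precomputed by the pipeline.
print("Median velocity:", df["velocity"].median(), "deg/s")
```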
This file contains fixation (centroid) data computed from stable gaze clusters using the I-VT fixation filter.
| Column Name | Description |
|---|---|
| `id` | List of indices from output.csv that form this fixation. |
| `start` | Start time of the fixation (microseconds, device timestamp). |
| `end` | End time of the fixation (microseconds, device timestamp). |
| `x_avg` | Average x-coordinate (pixels) of the fixation center. |
| `y_avg` | Average y-coordinate (pixels) of the fixation center. |
| `x_list` | All x-coordinates (pixels) of gaze points in the fixation. |
| `y_list` | All y-coordinates (pixels) of gaze points in the fixation. |
| `origin` | [x, y, z] head position (User Coordinate System) during the fixation. |
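As an illustration, the sketch below computes per-fixation durations from these columns. The filename centroids.csv is a placeholder (substitute whatever name tobiiLive.py writes), and pandas is again assumed only for the example.

```python
# Sketch: compute fixation durations from the centroid CSV.
# "centroids.csv" is a placeholder filename for this example.
import pandas as pd

fix = pd.read_csv("centroids.csv")

# start/end are device timestamps in microseconds, so divide by 1000 for ms.
fix["duration_ms"] = (fix["end"] - fix["start"]) / 1000.0

print(fix[["x_avg", "y_avg", "duration_ms"]].head())
print("Mean fixation duration:", round(fix["duration_ms"].mean(), 1), "ms")
```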
- Coordinate systems: Screen coordinates are scaled to the monitor (e.g., 1920x1200 pixels).
- Device time: All timestamps are in microseconds and reset when the device restarts.
- Dominant eye: The dominant eye is used for primary data collection. This can be set in the code (default is `left`).
- Interpolation: Missing gaze data is filled via linear interpolation if the gap is short enough, improving fixation detection.
- Fixations (centroids): Grouped gaze points where the gaze was relatively stable, indicating attention on a specific area of the screen.
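To make the fixation note concrete, the following is a simplified sketch of the velocity-threshold idea behind an I-VT filter: consecutive samples whose angular velocity stays below a threshold are grouped and averaged into a centroid. It illustrates the general technique rather than the exact implementation in tobiiLive.py; the 30 deg/s threshold is a commonly used default, and the sample format here is hypothetical.

```python
# Simplified I-VT-style grouping: samples below the velocity threshold are
# treated as part of a fixation; a velocity spike ends the current group.
VELOCITY_THRESHOLD = 30.0  # degrees per second (commonly used default)

def classify_fixations(samples, threshold=VELOCITY_THRESHOLD):
    """samples: list of dicts with 'x', 'y', and a precomputed 'velocity'."""
    groups, current = [], []
    for s in samples:
        if s["velocity"] < threshold:
            current.append(s)        # still fixating: extend the group
        elif current:
            groups.append(current)   # saccade detected: close the group
            current = []
    if current:
        groups.append(current)
    # Average each group into a single centroid (x_avg, y_avg).
    return [
        (sum(p["x"] for p in g) / len(g), sum(p["y"] for p in g) / len(g))
        for g in groups
    ]
```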
Juno Bartsch - junobartsch@gmail.com
Project Link: https://github.com/juno-b/mixed-ability-collab
This project would not have been possible without the support and contributions of Juno Bartsch, Veronica Lin, Joon Jang, and Andrew Begel.
Created at the Carnegie Mellon University VariAbility Lab during the Summer 2023 REUSE program.