This repository aims to be a multi-feature tool for locally manipulating Strava's bulk export archive file. The main features are:
- Unzip compressed (.gz) activity files.
- Remove leading blank spaces from the first line of .tcx activity files so that they can be imported properly (feature not yet available directly in the sweatpy package, see here).
- Import multiple .fit/.gpx/.tcx activity files at once (without the need for conversion) and create local, highly customizable heatmaps with different colors by activity type using the folium library.
- Enrich your Strava activities metadata by adding country, state, city, postal code, latitude and longitude geolocation information for each activity, based on the first recorded point (first non-missing latitude/longitude).
Additionally, a series of filters can be applied to select the desired activities before creating the heatmap, such as activities that started inside a bounding box (defined by 4 corner latitude/longitude points) or activities recorded in specific countries or states.
Although similar projects already exist (see here), some of the features implemented in this project were only partially available or missing in them.
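The first two features can be illustrated with the standard library alone. The following is a minimal sketch, not the package's own implementation: the function names `gz_extract_sketch` and `tcx_lstrip_sketch` are hypothetical, and it assumes all activity files sit directly inside one directory.

```python
import gzip
import shutil
from pathlib import Path


def gz_extract_sketch(directory: Path) -> list[Path]:
    """Decompress every .gz file in `directory`, writing the result next to it."""
    extracted = []
    for gz_path in sorted(directory.glob('*.gz')):
        target = gz_path.with_suffix('')  # 'ride.tcx.gz' -> 'ride.tcx'
        with gzip.open(gz_path, 'rb') as source, open(target, 'wb') as destination:
            shutil.copyfileobj(source, destination)
        extracted.append(target)
    return extracted


def tcx_lstrip_sketch(directory: Path) -> int:
    """Strip leading whitespace from .tcx files; return the number of files changed."""
    changed = 0
    for tcx_path in sorted(directory.glob('*.tcx')):
        content = tcx_path.read_bytes()
        stripped = content.lstrip()  # removes leading ASCII whitespace only
        if stripped != content:
            tcx_path.write_bytes(stripped)
            changed += 1
    return changed
```

Stripping is done on bytes so that the file encoding does not need to be known in advance.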
Munich Heatmap (rides in orange; runs in blue)
Vienna Heatmap (rides in orange; runs in blue)
Map interaction (option to navigate through the map, click on a line and get an activity summary pop-up)
Strava's bulk export process documentation can be found here.
Note: Please keep in mind that Strava's bulk export is language sensitive, i.e. the activity column labels depend on the user's language preference. This project assumes that your bulk export was requested in English (US). To change the language, log in to Strava and, in the bottom right-hand corner of any page, select English (US) from the drop-down menu (more on this here).
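Before importing, it can help to confirm that the export really uses English column labels. The sketch below reads only the header row of activities.csv. The label set is an assumption about what an English (US) export contains, and `missing_english_labels` is a hypothetical helper, not part of this package.

```python
import csv
from pathlib import Path

# Assumption: a few column labels expected in an English (US) export.
EXPECTED_ENGLISH_LABELS = {'Activity ID', 'Activity Date', 'Activity Name', 'Activity Type'}


def missing_english_labels(activities_file: Path) -> set[str]:
    """Return the expected English labels that are absent from the CSV header."""
    with open(activities_file, newline='', encoding='utf-8') as handle:
        header = next(csv.reader(handle), [])  # empty file -> empty header
    return EXPECTED_ENGLISH_LABELS - set(header)
```

An empty result suggests the export language is suitable; a non-empty result lists the labels that were not found.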
In essence, the process is as follows:
- Log in to Strava.
- Open the Account Download and Deletion page. Then press the Request Your Archive button (Important: Don't press anything else on that page, particularly not the Request Account Deletion button).
- Wait until Strava notifies you by email that your archive is ready. Download the archive file and unzip it to the Downloads/Strava folder (or alternatively set a different working directory in the strava-local-heatmap-tool.py code).
python -m pip install "git+https://github.com/roboes/strava-local-heatmap-tool.git@main"
activities_import(activities_directory, activities_file, skip_geolocation)
- Imports Strava activities.csv into a DataFrame and enriches it with geolocation data from .fit/.gpx/.tcx activity files by using the initial recorded coordinates (first non-missing latitude/longitude).
- activities_directory: path object. Strava activities directory from the Strava data bulk export.
- activities_file: path object. Strava activities.csv file from the Strava data bulk export.
- skip_geolocation: bool, default: True. If True, skips geolocation retrieval for .fit/.gpx/.tcx activity files, which is otherwise based on the initial recorded coordinates (first non-missing latitude/longitude). Note that geolocation retrieval relies on the public Nominatim instance (nominatim.openstreetmap.org), which allows "an absolute maximum of 1 request per second", so the import may be slow for exports containing a large number of activities.
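To make the "first non-missing latitude/longitude" rule and the 1 request per second limit concrete, here is a minimal sketch using only the standard library. It does not call Nominatim and is not how this package implements geolocation. `first_valid_point` and `throttled` are hypothetical names, and the lookup function to be wrapped is left to the caller.

```python
import time
from typing import Callable, Iterable, Optional, Tuple

Point = Tuple[Optional[float], Optional[float]]


def first_valid_point(points: Iterable[Point]) -> Optional[Tuple[float, float]]:
    """Return the first (latitude, longitude) pair in which both values are present."""
    for latitude, longitude in points:
        if latitude is not None and longitude is not None:
            return latitude, longitude
    return None


def throttled(lookup: Callable, min_interval: float = 1.0,
              clock: Callable[[], float] = time.monotonic,
              sleep: Callable[[float], None] = time.sleep) -> Callable:
    """Wrap `lookup` so that consecutive calls start at least `min_interval` seconds apart."""
    last_call = None

    def wrapper(*args, **kwargs):
        nonlocal last_call
        if last_call is not None:
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)  # pause until the minimum interval has elapsed
        last_call = clock()
        return lookup(*args, **kwargs)

    return wrapper
```

With one lookup per activity, an export of a few thousand activities takes on the order of an hour at this rate, which is why skipping geolocation is the default.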
activities_filter(activities_df, activity_type=None, activity_location_state=None, bounding_box={'latitude_top_right': None, 'longitude_top_right': None, 'latitude_top_left': None, 'longitude_top_left': None, 'latitude_bottom_left': None, 'longitude_bottom_left': None, 'latitude_bottom_right': None, 'longitude_bottom_right': None})
- Filter Strava activities DataFrame.
- activities_df: Strava activities DataFrame. Imported from the activities_import() function.
- activity_type: str list. If None, no activity type filter will be applied.
- activity_location_state: str list. If None, no state location filter will be applied.
- bounding_box: dict. If None, no bounding box will be applied.
Examples of bounding_box:
# Munich
bounding_box={
'latitude_top_right': 48.2316, 'longitude_top_right': 11.7170, # Top right boundary
'latitude_top_left': 48.2261, 'longitude_top_left': 11.4521, # Top left boundary
'latitude_bottom_left': 48.0851, 'longitude_bottom_left': 11.4022, # Bottom left boundary
'latitude_bottom_right': 48.0696, 'longitude_bottom_right': 11.7688 # Bottom right boundary
}
# Greater Munich
bounding_box={
'latitude_top_right': 48.4032, 'longitude_top_right': 11.8255, # Top right boundary
'latitude_top_left': 48.3924, 'longitude_top_left': 11.3082, # Top left boundary
'latitude_bottom_left': 47.9008, 'longitude_bottom_left': 11.0703, # Bottom left boundary
'latitude_bottom_right': 47.8609, 'longitude_bottom_right': 12.1105, # Bottom right boundary
}
# Southern Bavaria
bounding_box={
'latitude_top_right': 47.7900, 'longitude_top_right': 12.2692, # Top right boundary
'latitude_top_left': 47.7948, 'longitude_top_left': 10.9203, # Top left boundary
'latitude_bottom_left': 47.4023, 'longitude_bottom_left': 10.9779, # Bottom left boundary
'latitude_bottom_right': 47.4391, 'longitude_bottom_right': 12.3187, # Bottom right boundary
}
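Because the four corners do not have to form an axis-aligned rectangle, testing whether an activity's start point lies inside the box is a point-in-polygon problem. The sketch below uses the ray casting method on plain longitude/latitude values (a planar approximation, reasonable for city-sized areas). It is an illustration under those assumptions, not necessarily how activities_filter() evaluates the box, and both function names are hypothetical.

```python
def bounding_box_polygon(bounding_box: dict) -> list:
    """Return the four corners as (longitude, latitude) pairs in ring order."""
    corners = ['top_left', 'top_right', 'bottom_right', 'bottom_left']
    return [(bounding_box[f'longitude_{c}'], bounding_box[f'latitude_{c}']) for c in corners]


def point_in_polygon(longitude: float, latitude: float, polygon: list) -> bool:
    """Ray casting: count how many edges a ray going east from the point crosses."""
    inside = False
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[i - 1]  # previous vertex; index -1 wraps around
        if (y1 > latitude) != (y2 > latitude):  # edge spans the point's latitude
            x_cross = x1 + (latitude - y1) * (x2 - x1) / (y2 - y1)
            if longitude < x_cross:
                inside = not inside
    return inside


# Usage with the Munich example above
munich = {
    'latitude_top_right': 48.2316, 'longitude_top_right': 11.7170,
    'latitude_top_left': 48.2261, 'longitude_top_left': 11.4521,
    'latitude_bottom_left': 48.0851, 'longitude_bottom_left': 11.4022,
    'latitude_bottom_right': 48.0696, 'longitude_bottom_right': 11.7688,
}
polygon = bounding_box_polygon(munich)
starts_in_munich = point_in_polygon(11.5755, 48.1374, polygon)  # a point in central Munich
```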
strava_activities_heatmap(activities_df, activities_coordinates_df=activities_coordinates, activity_colors={'Hike': '#00AD43', 'Ride': '#FF5800', 'Run': '#00A6FC'}, map_tile='dark_all', map_zoom_start=12, line_weight=1.0, line_opacity=0.6, line_smooth_factor=1.0)
- Creates a heatmap based on the given activities DataFrame.
- activities_df: Strava activities DataFrame, default: activities. Imported from the activities_import() function.
- activities_coordinates_df: Strava activities coordinates DataFrame, default: activities_coordinates. Imported from the activities_import() function.
- strava_activities_heatmap_output_path: path object. Path where the Strava activity heatmap will be saved.
- activity_colors: dict, default: {'Hike': '#00AD43', 'Ride': '#FF5800', 'Run': '#00A6FC'}. Depending on how many distinct activity_type values are contained in the activities DataFrame, more key-value pairs need to be added.
- map_tile: str, options: 'dark_all', 'dark_nolabels', 'light_all', 'light_nolabels', 'terrain_background', 'toner_lite' and 'ocean_basemap', default: 'dark_all'.
- map_zoom_start: int, default: 12. Initial zoom level for the map (for more details, check the zoom_start parameter in the folium.folium.Map documentation).
- line_weight: float, default: 1.0. Stroke width in pixels (for more details, check the weight parameter of folium.vector_layers.PolyLine).
- line_opacity: float, default: 0.6. Stroke opacity (for more details, check the opacity parameter of folium.vector_layers.PolyLine).
- line_smooth_factor: float, default: 1.0. How much to simplify the polyline on each zoom level. More means better performance and a smoother look, and less means a more accurate representation (for more details, check the smooth_factor parameter of folium.vector_layers.PolyLine).
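The relationship between these parameters and the lines drawn on the map can be sketched without folium itself. The hypothetical helper below groups coordinates by activity and produces one dictionary of polyline settings per activity. It assumes rows arrive as (activity_id, latitude, longitude) tuples in recorded order, which is an illustrative input format rather than the package's actual DataFrame layout.

```python
from collections import defaultdict


def build_polylines(rows, activity_types, activity_colors,
                    line_weight=1.0, line_opacity=0.6, line_smooth_factor=1.0):
    """Group coordinates by activity and attach the per-type color and line style."""
    tracks = defaultdict(list)
    for activity_id, latitude, longitude in rows:
        if latitude is not None and longitude is not None:  # skip missing points
            tracks[activity_id].append((latitude, longitude))

    polylines = []
    for activity_id, locations in tracks.items():
        activity_type = activity_types[activity_id]
        if activity_type not in activity_colors:
            # Every distinct activity type needs its own entry in activity_colors
            raise KeyError(f'No color defined for activity type {activity_type!r}')
        polylines.append({
            'locations': locations,
            'color': activity_colors[activity_type],
            'weight': line_weight,
            'opacity': line_opacity,
            'smooth_factor': line_smooth_factor,
        })
    return polylines
```

With folium installed, each dictionary could then be drawn with something like `folium.PolyLine(**spec).add_to(m)` on a `folium.Map` instance `m`. The keys were chosen to match the PolyLine options referenced in the parameter list above.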
copy_activities(activities_directory, activities_files=activities['filename'])
- Copies a given list of .fit/.gpx/.tcx files to the 'output/activities' folder.
- activities_directory: path object. Strava activities directory from the Strava data bulk export.
- activities_files: list, default: activities['filename'].
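A copy step of this kind could look roughly as follows. This is a sketch under assumptions, not the package's code: `copy_activities_sketch` is a hypothetical name, the output directory is passed explicitly, and file names that carry a leading folder (such as 'activities/123.gpx') are reduced to their base name before lookup.

```python
import shutil
from pathlib import Path


def copy_activities_sketch(activities_directory: Path, filenames, output_directory: Path) -> list:
    """Copy the listed activity files that exist; return the names actually copied."""
    output_directory.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in filenames:
        source = activities_directory / Path(name).name  # drop any leading folder
        if source.is_file():
            shutil.copy2(source, output_directory / source.name)  # keeps timestamps
            copied.append(source.name)
    return copied
```

Files missing from the source directory are skipped silently here; raising an error instead may be preferable when a complete copy is required.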
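# Import packages
import os  # used below to build the export paths
import pandas as pd  # used below in the activity_name/activity_description checks
from strava_local_heatmap_tool.strava_local_heatmap_tool import activities_filter, strava_activities_heatmap, activities_import, gz_extract, tcx_lstrip
from plotnine import aes, geom_line, ggplot, labs, scale_color_brewer, theme_minimal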
# Extract .gz files
gz_extract(activities_directory=os.path.join(os.path.expanduser('~'), 'Downloads', 'Strava Export', 'activities'))
# Remove leading first line blank spaces of .tcx activity files
tcx_lstrip(activities_directory=os.path.join(os.path.expanduser('~'), 'Downloads', 'Strava Export', 'activities'))
# Import Strava activities to DataFrame
activities_df, activities_coordinates_df = activities_import(
activities_directory=os.path.join(os.path.expanduser('~'), 'Downloads', 'Strava Export', 'activities'),
activities_file=os.path.join(os.path.expanduser('~'), 'Downloads', 'Strava Export', 'activities.csv'),
skip_geolocation=True,
)
# Tests
## Check for activities without activity_gear
print(activities_df.query(expr='activity_gear.isna()').groupby(by=['activity_type'], level=None, as_index=False, sort=True, dropna=True).agg(count=('activity_id', 'nunique')))
## Check for activity_name inconsistencies
print(activities_df.query(expr='activity_name.str.contains(r"^ | {2}| $")'))  # leading, double or trailing spaces
print(activities_df.query(expr='activity_name.str.contains(r"[^\\s]-|-[^\\s]")'))
## Check for distinct values for activity_name separated by a hyphen
print(
pd.DataFrame(data=(activities_df.query(expr='activity_type == "Ride"')['activity_name'].str.split(pat=' - ', expand=True).stack().unique()), index=None, columns=['activity_name'], dtype=None).sort_values(
by=['activity_name'],
ignore_index=True,
),
)
## Check for distinct values for activity_description
print(
pd.DataFrame(
data=(
activities_df.query(expr='activity_type == "Weight Training" and activity_description.notna()')['activity_description']
.replace(to_replace=r'; | and ', value=r', ', regex=True)
.str.lower()
.str.split(pat=',', expand=True)
.stack()
.unique()
),
index=None,
columns=['activity_description'],
dtype=None,
).sort_values(by=['activity_description'], ignore_index=True),
)
# Summary
## Count of activities by type
print(activities_df.groupby(by=['activity_type'], level=None, as_index=False, sort=True, dropna=True).agg(count=('activity_id', 'nunique')))
## Runs overview per year-month (distance in km)
print(
activities_df.query(expr='activity_type == "Run"')
.assign(activity_month=lambda row: row['activity_date'].dt.strftime(date_format='%Y-%m'))
.groupby(by=['activity_month'], level=None, as_index=False, sort=True, dropna=True)
.agg(count=('activity_id', 'nunique'), distance=('distance', lambda x: x.sum() / 1000)),
)
## Strava yearly overview cumulative (Plot)
strava_yearly_overview = (
activities_df.query(expr='activity_type == "Ride"')
.query(expr='activity_date >= "2017-01-01" and activity_date < "2023-01-01"')
.assign(distance=lambda row: row['distance'] / 1000, year=lambda row: row['activity_date'].dt.strftime(date_format='%Y'), day_of_year=lambda row: row['activity_date'].dt.dayofyear)
.assign(distance_cumulative=lambda row: row.groupby(by=['year'], level=None, as_index=False, sort=True, dropna=True)['distance'].transform('cumsum'))
.filter(
items=[
'activity_date',
'year',
'day_of_year',
'distance',
'distance_cumulative',
],
)
)
(
ggplot(strava_yearly_overview, aes(x='day_of_year', y='distance_cumulative', group='year', color='factor(year)'))
+ geom_line()
+ scale_color_brewer(palette=1)
+ theme_minimal()
+ labs(title='Cumulative Distance (KM)', y='Distance (KM)', x='Day of Year', color='Year')
)
## Delete objects
del strava_yearly_overview
# Filter Strava activities
activities_df = activities_filter(
activities_df=activities_df,
activity_type=['Hike', 'Ride', 'Run'],
activity_location_state=None,
bounding_box={
'latitude_top_right': None,
'longitude_top_right': None,
'latitude_top_left': None,
'longitude_top_left': None,
'latitude_bottom_left': None,
'longitude_bottom_left': None,
'latitude_bottom_right': None,
'longitude_bottom_right': None,
},
)
# Create heatmap
strava_activities_heatmap(
activities_df=activities_df,
activities_coordinates_df=activities_coordinates_df,
strava_activities_heatmap_output_path=os.path.join(os.path.expanduser('~'), 'Downloads', 'strava-activities-heatmap.html'),
activity_colors={'Hike': '#FF0000', 'Ride': '#00A3E0', 'Run': '#FF0000'},
map_tile='dark_all',
map_zoom_start=12,
line_weight=1.0,
line_opacity=0.6,
line_smooth_factor=1.0,
)
# Copy activities files to 'output/activities' folder
# copy_activities(activities_directory=os.path.join(os.path.expanduser('~'), 'Downloads', 'Strava Export', 'activities'), activities_files=activities_df['filename'])
# Import .fit/.gpx/.tcx activity files into a DataFrame
# activities_coordinates_df = activities_coordinates_import(activities_directory=os.path.join(os.path.expanduser('~'), 'Downloads', 'Strava Export', 'activities'))
# Get geolocation for .fit/.gpx/.tcx activity files given the start recorded coordinates (first non-missing latitude/longitude)
# activities_geolocation = activities_geolocator(activities_coordinates_df=activities_coordinates_df, skip_geolocation=True)
# activities_file_rename(os.path.join(os.path.expanduser('~'), 'Downloads', 'Strava Export', 'activities'), activities_geolocation_df=activities_geolocation)
Unfortunately, Folium does not natively export a rendered map to .png.
A workaround is to open the rendered .html Folium map in Chrome, open Chrome's Inspector, change the width and height dimensions to 3500 x 3500 px, set the zoom to 22% and the DPR to 3.0, and then capture a full size screenshot.
The canvas.xcf file is a GIMP template for printing a 30 x 30 cm canvas. Its design is similar to this Reddit discussion.
The statistics shown in the lower right corner are printed once the strava_activities_heatmap() function is executed.
Strava API v3: Definition of activities variables.
These repositories have a similar or additional purpose to this project:
Strava local heatmap browser: Code to reproduce the Strava Global Heatmap with local .gpx files (Python).
Visualization of activities from Garmin Connect: Code for processing activities with .gpx files from Garmin Connect (Python).
Create artistic visualisations with your Strava exercise data: Code for creating artistic visualizations with your Strava exercise data (Python; an R version is available here).
strava-offline: Tool to keep a local mirror of Strava activities for further analysis/processing.
dérive - Generate a heatmap from GPS tracks: Generate a heatmap by dragging and dropping one or more .gpx/.tcx/.fit/.igc/.skiz file(s) (JavaScript, HTML).
Data Science For Cycling - How to Visualize GPX Strava Routes With Python and Folium (GitHub).
Build Interactive GPS activity maps from GPX files using Folium (GitHub).
StatsHunters: Connect your Strava account and show all your sport activities and added photos on one map.
Recommended settings:
- Receive monthly statistics by email
- Hide my data in club heatmaps
Cultureplot Custom Strava Heatmap Generator: Connect to Strava to see your activity heatmap. Includes the possibility to filter the activities (by date, time and type) and to customize the map (map type, background color, line color (also by activity), thickness and opacity).