HACO is a framework for dense hand contact estimation that addresses the class and spatial imbalance issues of training on large-scale datasets. Building on 14 datasets that span hand-object, hand-hand, hand-scene, and hand-body interactions, we train a powerful model that learns dense hand contact in diverse scenarios.
- We recommend using an Anaconda virtual environment with Python >= 3.8.0 and PyTorch >= 1.11.0. Our latest HACO model is tested with Python 3.8.20, PyTorch 1.11.0, and CUDA 11.3.
- Set up the environment:

        # Initialize the conda environment
        conda create -n haco python=3.8 -y
        conda activate haco

        # Install PyTorch
        conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch

        # Install all remaining packages
        pip install -r requirements.txt
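After installation, a quick check can confirm that the environment matches the tested configuration above (Python 3.8.x, PyTorch 1.11.0, CUDA 11.3):

```python
# Sanity check: versions should match the tested configuration above.
import sys
import torch

print(f"Python : {sys.version.split()[0]}")    # expected: 3.8.x
print(f"PyTorch: {torch.__version__}")         # expected: 1.11.0
print(f"CUDA   : {torch.version.cuda}")        # expected: 11.3
print(f"GPU available: {torch.cuda.is_available()}")
```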
You need to follow our directory structure for the data:
- For quick demo: see `docs/data_demo.md`.
- For evaluation: see `docs/data_eval.md`.
- For training: see `docs/data_train.md`.
Then, download the official checkpoints from HuggingFace and place them in the `release_checkpoint` directory by running (if this does not work, try OneDrive):

    bash scripts/download_haco_checkpoints.sh
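Once the script finishes, you can verify that the checkpoint files referenced by the commands below are in place. A minimal sketch (trim the list to the backbones you actually need):

```python
# Verify that downloaded checkpoints exist; the file names are taken from
# the demo and evaluation commands in this README.
from pathlib import Path

expected = [
    "haco_final_hamer_checkpoint.ckpt",
    "haco_final_vit_b_checkpoint.ckpt",
]
for name in expected:
    path = Path("release_checkpoint") / name
    print(f"{'OK' if path.is_file() else 'MISSING':7s} {path}")
```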
To run HACO on demo images using the WiLoR or MediaPipe hand detector, please run:

    python demo.py --backbone {BACKBONE_TYPE} --detector {DETECTOR_TYPE} --checkpoint {CKPT_PATH} --input_path {INPUT_PATH}
For example:

    # ViT-H (Default, HaMeR initialized) backbone
    python demo.py --backbone hamer --detector wilor --checkpoint release_checkpoint/haco_final_hamer_checkpoint.ckpt --input_path asset/example_images

    # ViT-B (ImageNet initialized) backbone
    python demo.py --backbone vit-b-16 --detector wilor --checkpoint release_checkpoint/haco_final_vit_b_checkpoint.ckpt --input_path asset/example_images
Note: The demo includes post-processing to reduce noise in small or sparse contact areas.
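The exact post-processing lives in the demo code; as a rough illustration of the idea, small connected regions of contacted vertices can be discarded as noise. A minimal sketch, assuming per-vertex contact probabilities on the hand mesh and a vertex adjacency list (both hypothetical inputs here, not the demo's actual interface):

```python
# Illustrative spatial denoising: drop connected "contact" regions smaller
# than min_size vertices. Not the demo's actual implementation.
from collections import deque
import numpy as np

def suppress_small_regions(contact_prob, adjacency, thresh=0.5, min_size=10):
    """contact_prob: (V,) probabilities; adjacency: list of neighbor lists."""
    contact = contact_prob > thresh
    keep = np.zeros_like(contact)
    visited = np.zeros_like(contact)
    for seed in np.flatnonzero(contact):
        if visited[seed]:
            continue
        region, queue = [], deque([seed])  # BFS over one contacted region
        visited[seed] = True
        while queue:
            v = queue.popleft()
            region.append(v)
            for u in adjacency[v]:
                if contact[u] and not visited[u]:
                    visited[u] = True
                    queue.append(u)
        if len(region) >= min_size:  # keep only sufficiently large regions
            keep[region] = True
    return keep
```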
Before running the video demo, please download the example videos from HuggingFace and save them in `asset/example_videos` by running (if this does not work, try OneDrive):

    bash scripts/download_demo_example_videos.sh
To run HACO on demo videos using the WiLoR or MediaPipe hand detector, please run:

    python demo_video.py --backbone {BACKBONE_TYPE} --checkpoint {CKPT_PATH} --input_path {INPUT_PATH}
For example:

    # ViT-H (Default, HaMeR initialized) backbone
    python demo_video.py --backbone hamer --checkpoint release_checkpoint/haco_final_hamer_checkpoint.ckpt --input_path asset/example_videos

    # ViT-B (ImageNet initialized) backbone
    python demo_video.py --backbone vit-b-16 --checkpoint release_checkpoint/haco_final_vit_b_checkpoint.ckpt --input_path asset/example_videos
Note: The demo includes post-processing for both spatial smoothing of small contact areas and temporal smoothing across frames to ensure stable contact predictions and hand detections.
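The temporal part can be pictured as a per-vertex low-pass filter over frames. A minimal sketch using an exponential moving average (the released demo may use a different filter):

```python
# Illustrative temporal smoothing of per-vertex contact probabilities.
import numpy as np

def smooth_contact_over_time(per_frame_probs, alpha=0.7):
    """per_frame_probs: (T, V) contact probabilities for T frames."""
    smoothed = np.empty_like(per_frame_probs)
    smoothed[0] = per_frame_probs[0]
    for t in range(1, len(per_frame_probs)):
        # Blend the running estimate with the current frame's prediction.
        smoothed[t] = alpha * smoothed[t - 1] + (1 - alpha) * per_frame_probs[t]
    return smoothed
```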
To train HACO, please run:

    python train.py --backbone {BACKBONE_TYPE}
For example:

    # ViT-H (Default, HaMeR initialized) backbone
    python train.py --backbone hamer

    # ViT-B (ImageNet initialized) backbone
    python train.py --backbone vit-b-16
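Training tackles the class imbalance mentioned at the top; HACO's VCB Loss (see the acknowledgement of CB Loss below) builds on effective-number class reweighting. As a rough, hypothetical illustration of that reweighting idea applied to per-vertex binary contact (not the exact VCB formulation; see the paper for details):

```python
# Class-balanced BCE in the spirit of CB Loss (Cui et al.); NOT the exact
# VCB Loss used by HACO. Weights follow the effective number of samples.
import torch
import torch.nn.functional as F

def class_balanced_bce(logits, targets, n_pos, n_neg, beta=0.9999):
    """logits, targets: (B, V); n_pos, n_neg: (V,) float label counts."""
    # Effective-number weight per class: (1 - beta) / (1 - beta**n).
    w_pos = (1.0 - beta) / (1.0 - beta ** n_pos.clamp(min=1.0))
    w_neg = (1.0 - beta) / (1.0 - beta ** n_neg.clamp(min=1.0))
    weights = targets * w_pos + (1.0 - targets) * w_neg
    return F.binary_cross_entropy_with_logits(logits, targets, weight=weights)
```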
To evaluate HACO on the MOW dataset, please run:

    python test.py --backbone {BACKBONE_TYPE} --checkpoint {CKPT_PATH}
For example:

    # ViT-H (Default, HaMeR initialized) backbone
    python test.py --backbone hamer --checkpoint release_checkpoint/haco_final_hamer_checkpoint.ckpt

    # ViT-L (ImageNet initialized) backbone
    python test.py --backbone vit-l-16 --checkpoint release_checkpoint/haco_final_vit_l_checkpoint.ckpt

    # ViT-B (ImageNet initialized) backbone
    python test.py --backbone vit-b-16 --checkpoint release_checkpoint/haco_final_vit_b_checkpoint.ckpt

    # ViT-S (ImageNet initialized) backbone
    python test.py --backbone vit-s-16 --checkpoint release_checkpoint/haco_final_vit_s_checkpoint.ckpt

    # FPN (HandOccNet initialized) backbone
    python test.py --backbone handoccnet --checkpoint release_checkpoint/haco_final_handoccnet_checkpoint.ckpt

    # HRNet-W48 (ImageNet initialized) backbone
    python test.py --backbone hrnet-w48 --checkpoint release_checkpoint/haco_final_hrnet_w48_checkpoint.ckpt

    # HRNet-W32 (ImageNet initialized) backbone
    python test.py --backbone hrnet-w32 --checkpoint release_checkpoint/haco_final_hrnet_w32_checkpoint.ckpt

    # ResNet-152 (ImageNet initialized) backbone
    python test.py --backbone resnet-152 --checkpoint release_checkpoint/haco_final_resnet_152_checkpoint.ckpt

    # ResNet-101 (ImageNet initialized) backbone
    python test.py --backbone resnet-101 --checkpoint release_checkpoint/haco_final_resnet_101_checkpoint.ckpt

    # ResNet-50 (ImageNet initialized) backbone
    python test.py --backbone resnet-50 --checkpoint release_checkpoint/haco_final_resnet_50_checkpoint.ckpt

    # ResNet-34 (ImageNet initialized) backbone
    python test.py --backbone resnet-34 --checkpoint release_checkpoint/haco_final_resnet_34_checkpoint.ckpt

    # ResNet-18 (ImageNet initialized) backbone
    python test.py --backbone resnet-18 --checkpoint release_checkpoint/haco_final_resnet_18_checkpoint.ckpt
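Dense contact estimation is commonly scored with per-vertex precision, recall, and F1 against binarized ground truth. A minimal sketch of such metrics (the authoritative protocol is whatever `test.py` implements):

```python
# Common dense-contact metrics over binary per-vertex labels; illustrative
# only, not necessarily the exact protocol used by test.py.
import numpy as np

def contact_prf1(pred_prob, gt, thresh=0.5, eps=1e-8):
    """pred_prob, gt: (N, V) arrays; gt holds binary {0, 1} labels."""
    pred = (pred_prob > thresh).astype(np.float64)
    tp = (pred * gt).sum()
    precision = tp / (pred.sum() + eps)
    recall = tp / (gt.sum() + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return precision, recall, f1
```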
- `ImportError: cannot import name 'bool' from 'numpy'`: Please comment out the line `from numpy import bool, int, float, complex, object, unicode, str, nan, inf`.
- `np.int was a deprecated alias for the builtin int`: Use `int` by itself to avoid this error; doing so does not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. For details, please refer to here.
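For the second error, the replacement is mechanical; for example:

```python
import numpy as np

x = 3.7
# Before (fails on NumPy >= 1.24):
# i = np.int(x)
# After: use the builtin, or pick an explicit precision if it matters.
i = int(x)          # same behavior as the removed alias
i64 = np.int64(x)   # explicit 64-bit integer
```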
We thank:
- DECO for human-scene contact estimation.
- CB Loss for inspiration on VCB Loss.
- HaMeR for Transformer-based regression architecture.
    @article{jung2025haco,
      title   = {Learning Dense Hand Contact Estimation from Imbalanced Data},
      author  = {Jung, Daniel Sungho and Lee, Kyoung Mu},
      journal = {arXiv preprint arXiv:2505.11152},
      year    = {2025}
    }