lihongcs/Awesome-3D-Assets


🔥Awesome-3D-Assets

A repository for recent 3D Assets.

🫨 Working hard on collecting ...

An incomplete collection of 3D tasks can be found here.

3D Datasets

Contents

Real Data

Object

| Time | Paper | Sources | Data Scale | Modality | Task |
| --- | --- | --- | --- | --- | --- |
| CVPR 2024 | ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding | Objaverse, ShapeNet | (800K real-world 3D shapes), (52.5K 3D shapes with 55 annotated categories) | 3D point clouds, images, and language | zero-shot 3D classification, standard 3D classification with fine-tuning, and 3D captioning (3D-to-language generation) |
| CVPR 2024 Workshop | 3DCoMPaT Challenge | 3DCoMPaT dataset++ |  | 3D objects, 3D renderings | recognize and ground compositions of materials on parts of 3D objects |
| CVPR 2024 | LASO: Language-guided Affordance Segmentation on 3D Object | 3D-AffordanceNet | 19,751 point-question pairs, covering 8,434 object shapes and 870 expert-crafted questions | Point cloud, text | language-guided affordance segmentation on 3D objects |
| NeurIPS 2023 | OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding | ShapeNetCore, 3D-FUTURE, ABO, Objaverse | 876K | Text-image-3D point cloud | point cloud captioning, point-cloud conditioned image generation |
| NeurIPS 2023 | Real3D-AD: A Dataset of Point Cloud Anomaly Detection | Raw | 1,254 high-resolution 3D items (from forty thousand to millions of points for each item) | Point-cloud | 3D industrial anomaly detection |
| CVPR 2023 | RealImpact: A Dataset of Impact Sound Fields for Real Objects | Raw | 150,000 recordings of impact sounds of 50 everyday objects, 5 distinct impact positions | impact locations, microphone locations, contact force profiles, material labels, and RGBD images | listener location classification and visual acoustic matching |
| CVPR 2023 | OMNI3D: A Large Benchmark and Model for 3D Object Detection in the Wild | New annotation (SUN RGB-D, ARKitScenes, Hypersim, Objectron, KITTI and nuScenes) | 234k images annotated with more than 3 million instances (3D boxes) and 98 categories | single-image, 3D cuboids | 3D object detection |
| CVPR 2023 | OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation | Raw | 6,000 scanned objects (190 daily categories) | Mesh, Point-cloud, multi-view, videos | 3D perception, novel-view synthesis, neural surface reconstruction, 3D object generation |
| CVPR 2022 | Self-supervised Neural Articulated Shape and Appearance Models | No dataset contribution | - | image, 3D shape | few-shot reconstruction, generation of novel articulations, and novel-view synthesis |
| CVPR 2022 | ABO: Dataset and Benchmarks for Real-World 3D Object Understanding | Raw | 7,953 3D meshes; 8,222 multi-view images | Mesh, Multi-view; attribute | single-view 3D reconstruction, material estimation, and cross-domain multi-view object retrieval |
| NeurIPS 2022 Datasets and Benchmarks | MBW: Multi-view Bootstrapping in the Wild | Raw Dataset | Multi-view (2~4 cameras); tigers, fish, colobus monkeys, gorillas, chimpanzees, and flamingos from a zoo dataset, each with 2 synchronized videos | Multi-view, 2D landmarks of articulated objects | labeling articulated objects |
| CVPR 2021 | 3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding | PartNet | 23K with 18 affordance classes | 3D point cloud with affordance annotations | affordance reasoning |
| ICCV 2021 | Common Objects in 3D: Large-Scale Learning and Evaluation of Real-life 3D Category Reconstruction | Raw (Annotated on MS-COCO) | 1.5M multi-view (19K objects) <=> (annotation) cameras and 3D point clouds | Multi-view; Point-cloud | new-view synthesis and category-centric 3D reconstruction |
| ECCV 2020 | ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes | ScanNet | two datasets (synthetic dataset of referential utterances (Sr3D) and natural referential utterances (Nr3D)): 5878 (Scene target, distractors) tuples | scene, target objects, relationships between target objects and surrounding object (anchor) | language-assisted 3D point clouds |
| ECCV 2016 | ObjectNet3D: A Large Scale Database for 3D Object Recognition | Raw | 100 categories, 90,127 images, 201,888 objects, 44,147 3D shapes | Images, 3D shape (not mesh or pc) | 3D pose / 3D shape recognition |
| CVPR 2015 | 3D ShapeNets: A Deep Representation for Volumetric Shapes | Raw (download 3D CAD models (3D Warehouse, Yobi3D)) | 151,128 3D CAD models <=> 660 unique object categories | 3D CAD, categories | object recognition and shape completion |

Hands

| Time | Paper | Sources | Data Scale | Modality | Task |
| --- | --- | --- | --- | --- | --- |
| NeurIPS 2023 | A Dataset of Relighted 3D Interacting Hands | Raw | 1.5M Images, 10 subobject | image; MANO & Mask | track two-hand 3D poses |

Ego

| Time | Paper | Sources | Data Scale | Modality | Task |
| --- | --- | --- | --- | --- | --- |
| WACV 2024 | IKEA Ego 3D Dataset: Understanding furniture assembly actions from ego-view 3D Point Clouds | Raw | 493k frames | Point cloud, ego RGBD | action recognition |
| T-RO 2024 | Towards Robust Robot 3D Perception in Urban Environments: The UT Campus Object Dataset | Raw | 58min (1.3M 3D bounding box); 53 categories | Point-cloud, RGB-D, 9 DoF inertial measurements | 3D object detection |

Scene

| Time | Paper | Sources | Data Scale | Modality | Task |
| --- | --- | --- | --- | --- | --- |
| CVPR 2024 | Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships | 3DSSG | 1482 scene graphs with 48k object nodes and 544k edges; 93 different attributes on 21k object instances; relationship and affordance w/o exact number | RGB-D | 3D object classification and inter-object relationship prediction |
| CVPR 2024 | Multi-Attribute Interactions Matter for 3D Visual Grounding | ScanRefer, ReferIt3D (Sr3D/Nr3D) | (51K descriptions; 11K objects; 800 ScanNet scenes), (41K human-annotated descriptions / 83K simple machine-generated descriptions; 707 scenes with object mask) | RGB-D, language | 3D visual grounding |
| CVPR 2024 | LASA: Instance Reconstruction from Real Scans using A Large-scale Aligned Shape Annotation Dataset | ARKitScenes | 10,412 CAD models aligned with 920 scenes across 17 categories, scanned from ARKitScenes | Point cloud, Multi-view | indoor instance-level scene reconstruction |
| CVPR 2024 | DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision | Raw | 10K videos, 51M frames, with POI annotation | Multi-view RGB | novel view synthesis |
| ICLR 2023 | SQA3D: Situated Question Answering in 3D Scenes | ScanNet (New Situation Question Answer) | 6.8K Situation <=> 20.4K description <=> 33.4K Reasoning Answer | 3D scan, egocentric video, bird-eye view <=> situation <=> question | 3D Situation Question Answer |
| ICCV 2023 | ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes | Raw | 460 scenes, 280,000 DSLR images, 3.7M iPhone RGBD | Point cloud, Mesh, RGBD | novel view synthesis and 3D semantic scene understanding |
| ICCV 2023 | Multi3DRefer: Grounding Text Description to Multiple 3D Objects | ScanRefer | 61926 descriptions of 11609 objects |  | 3D Visual Grounding |
| CVPR 2022 | ScanQA: 3D Question Answering for Spatial Scene Understanding | ScanNet (New Question-Answer Pairs) | 41K question-answer pairs (800 indoor scenes) | RGB-D | 3D Question-Answer (spatial understanding) |
| ECCV 2022 | Language-Grounded Indoor 3D Semantic Segmentation in the Wild (ScanNet200) | ScanNet | 200 categories in ScanNet | - | 3D instance segmentation |
| CVPR 2021 | Scan2Cap: Context-aware Dense Captioning in RGB-D Scans | ScanRefer (New Task) | 51,583 descriptions for 11,046 objects in 800 ScanNet scenes | RGB-D <=> bounding box <=> descriptions | dense captioning in 3D scenes |
| ECCV 2020 | ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language | ScanNet (New Task) | 51,583 descriptions <=> 11,046 objects | RGB-D 3D scenes; textual | object localization with text descriptions |
| ICCV 2019 | RIO: 3D Object Instance Re-Localization in Changing Indoor Environments | Raw Data | 1482 RGB-D scans of 478 environments | RGB-D <=> object instance <=> 6 DoF mapping among scenes | 3D object instance re-localization (RIO) |
| ArXiv 2019 | The Replica Dataset: A Digital Replica of Indoor Spaces | Raw | 18 highly photo-realistic 3D indoor scenes | dense mesh; high-resolution high-dynamic range textures <=> semantic class and instance information <=> planar mirror, glass reflectors | 3D instance segmentation; navigation; question-answering; instruction following |
| CVPR 2017 | ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes | Raw Data | 2.5M views in 1513 annotated scenes | RGB-D <=> 3D camera pose <=> surface reconstruction <=> semantic segmentation | 3D object classification, semantic voxel labeling, and CAD model retrieval |

Synthetic

Object

| Time | Paper | Sources | Data Scale | Modality | Task |
| --- | --- | --- | --- | --- | --- |
| CVPR 2024 | Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering |  |  |  |  |
| CVPR 2023 | GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts | Raw: GAPartNet | 8489 part instances on 1166 objects | Point-cloud | part segmentation, part pose estimation, and part-based object manipulation |
| NeurIPS 2023 | 3D-Aware Visual Question Answering about Parts, Poses and Occlusions | Super-CLEVR-3D | 5 categories with sub-types and attributes (12; color, material, size) | RGB-D | part questions, 3D pose questions, and occlusion questions |
| ArXiv 2022.12 | GeoCode: Interpretable Shape Programs | - | train: 9,570 chairs, 9,330 vases, and 6,270 tables; validation and test: 957 chairs, 933 vases, and 627 tables | Mesh, Point-Cloud, sketch | 3D geometry editing |
| NeurIPS 2022 Datasets and Benchmarks | Breaking Bad: A Dataset for Geometric Fracture and Reassembly | Thingi10K, PartNet | 10,474 shapes, 1,047,400 breakdown patterns | Point cloud | geometry measurements; shape assembly |
| CVPR 2022 | Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction | Raw Data: poorly-designed 3D physical objects (point videos of 3D objects) with choices to fix them | 5K | Point cloud | fixing 3D object shapes based on functionality |
| ACCV 2022 | The Eyecandies Dataset for Unsupervised Multimodal Anomaly Detection and Localization |  |  |  |  |

Manipulation

| Time | Paper | Sources | Data Scale | Modality | Task |
| --- | --- | --- | --- | --- | --- |
| CoRL 2022 | Leveraging Language for Accelerated Learning of Tool Manipulation | - | 36 objects | images | tool use |

Scene

| Time | Paper | Sources | Data Scale | Modality | Task |
| --- | --- | --- | --- | --- | --- |
| 2024 | 3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination | Raw | 40,087 household scenes paired with 6.2 million densely-grounded scene-language instructions | Point-cloud, text | 3D semantic segmentation |

Object and Scene

| Time | Paper | Sources | Data Scale | Modality | Task |
| --- | --- | --- | --- | --- | --- |
| NeurIPS 2022 | PeRFception: Perception using Radiance Fields | CO3D, ScanNet | CO3D (18,669 annotated videos with a total of 1.5 million camera-annotated frames), ScanNet (1.5K indoor scenes with commercial RGB-D sensors) | Multi-view, reconstructed Point-cloud | 2D image classification, 3D object classification, 3D semantic segmentation |

Real and Synthetic

Object

| Time | Paper | Sources | Data Scale | Modality | Task |
| --- | --- | --- | --- | --- | --- |
| 2023.8 | HANDAL: A Dataset of Real-World Manipulable Object Categories with Pose Annotations, Affordances, and Reconstructions | videos | 308k annotated image frames from 2.2k videos of 212 real-world objects in 17 categories | 3D reconstruction, reconstructed mesh | pose estimation and affordance prediction |

Scene

| Time | Paper | Sources | Data Scale | Modality | Task |
| --- | --- | --- | --- | --- | --- |
| CVPR 2024 | SceneFun3D: Fine-Grained Functionality and Affordance Understanding in 3D Scenes |  |  |  |  |
| 2024.1 | SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding | ScanNet, ARKitScenes, HM3D, 3RScan, MultiScan, Structured3D, ProcTHOR | 68K scenes and 2.5M scene-language pairs | Point cloud, scan | 3D QA |
| ECCV 2022 | OPD: Single-view 3D Openable Part Detection |  |  |  |  |

Scene Datasets

| Time | Paper | Size | Modality | Type | Source |
| --- | --- | --- | --- | --- | --- |
| 2025 | TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes | 850 scenes | pts, Multi-view RGB | outdoor | nuScenes |
| 2024 | SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding | 68K indoor scenes | pts | indoor | ScanNet, ARKitScenes, HM3D, 3RScan, MultiScan, RIO |
| 2024 | EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI | 5k ego-centric RGB-D scenes | RGB-D | indoor | 3RScan, ScanNet, Matterport3D |
| 2024 | RELLIS-3D Dataset: Data, Benchmarks and Analysis | 13,556 LiDAR scans and 6,235 images | RGB, pts | outdoor | - |
| 2024 | SceneFun3D: Fine-Grained Functionality and Affordance Understanding in 3D Scenes | 710 high-fidelity 3D scenes | RGB-D (reconstructed), pts | indoor | - |
| 2023 | Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and Reasoning | 116 scenes? | pts, RGB-D | outdoor | - |
| 2023 | ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes | 460 scenes | RGB-D, pts | indoor | - |
| 2023 | DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision | 140+ RGB-D scenes; 10K+ video | Multi-view RGB; video | indoor & outdoor | - |
| 2021 | Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting | 20,000 sequences of unlabeled lidar point clouds and map-aligned pose | pts | outdoor | - |
| 2021 | ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data | 5,048 RGB-D sequences | RGB-D, pts | indoor | - |
| 2020 | Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions | 1482 scenes | RGB-D | indoor | - |
| 2020 | ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language | 800 ScanNet scenes | RGB-D | indoor | - |
| 2020 | ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes | 707 scenes with object mask | pts | indoor | - |
| 2020 | nuScenes: A multimodal dataset for autonomous driving | 1000 scenes | RGB, pts | outdoor | - |
| 2020 | Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene | 937.1M points | pts | outdoor | - |
| 2012 | Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite | 389 stereo and optical flow image pairs, stereo visual odometry sequences | pts | outdoor | - |

Statistics

! Pending

| Type \ Modality | Mesh | Point-Cloud | Multi-view | Scene-Graph |
| --- | --- | --- | --- | --- |
| Real-Object |  |  |  |  |
| Real-Scene |  |  |  |  |
| Synthetic-Object |  |  |  |  |
| Synthetic-Scene |  |  |  |  |

Generative Models or Tools

Object

Image 2 Multi-view

  • (ICLR 2024) SyncDreamer: Generating Multiview-consistent Images from a Single-view Image [Paper]

Text 2 3D

  • (ECCV 2024) DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation [Paper] [Project], coarse-to-fine scheme
  • (2022.12) Point·E: A System for Generating 3D Point Clouds from Complex Prompts [Paper]
  • (2022.10) CommonSim-1: Generating 3D Worlds [Project], text-to-3D dynamic environment.

Single-View 2 3D

  • (CVPR 2024) Wonder3D: Single Image to 3D using Cross-Domain Diffusion [Paper]
  • (2024.9) MVLLaVA: An Intelligent Agent for Unified and Flexible Novel View Synthesis [Paper], leveraging the power of MLLM.
  • (CVPR 2024) Splatter Image: Ultra-Fast Single-View 3D Reconstruction [Paper]
  • (ACM MM 2024) Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models [Paper]
  • (RSS 2024 Workshop) Single-View 3D Reconstruction via SO(2)-Equivariant Gaussian Sculpting Networks [Paper]
  • (NeurIPS 2023) One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization [Paper]
  • (CVPR 2023) PC^2: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction [Paper]
  • (2023.4) Anything-3D: Towards Single-view Anything Reconstruction in the Wild [Paper]
  • (ICCV 2023) Zero-1-to-3: Zero-shot One Image to 3D Object [Paper]

PairedImg 2 3D

  • (CVPR 2024) pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction [Paper]

Multi-View 2 3D

  • (ICCV 2023) NeuS2: Fast Learning of Neural Implicit Surfaces for Multi-view Reconstruction [Paper]

Text 2 4D

  • (CVPR 2024) Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models [Paper]

3D part segmentation

  • (CVPR 2023) Self-positioning Point-based Transformer for Point Cloud Understanding [Paper]
  • (CVPR 2017) PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation [Paper] [Code]
  • (Proc. SGP 2023) Cross-Shape Attention for Part Segmentation of 3D Point Clouds [Paper] [Code]

3D general model

  • (ICLR 2024 Spotlight) Uni3D: Exploring Unified 3D Representation at Scale [Paper]

Scene

  • (2024.9) GigaGS: Scaling up Planar-Based 3D Gaussians for Large Scene Surface Reconstruction [Paper] [Project]

Images

RGB 2 Depth

  • (ICCV 2021) Vision Transformers for Dense Prediction [Paper]
