Jyun-Ting Song

I am an incoming PhD student at the Robotics Institute, Carnegie Mellon University, where I also completed my MS in Robotics, advised by Prof. Kris Kitani. During my MS, I was a Research Scientist Intern at Meta FAIR (now the Super Intelligence Lab). I received my BS in Electrical Engineering from National Taiwan Normal University, where I was advised by Prof. Jacky Baltes.

I am currently looking for a summer internship in 2026.

Email  /  CV  /  Scholar  /  GitHub  /  LinkedIn

Research Interests


My research focuses on advancing human-centric AI. I am currently interested in:
Multi-view capture systems for in-the-wild human reconstruction under interaction scenarios
Physically plausible human mesh recovery and optimization
Physics-based humanoid control with contact-aware estimation

News


[2025/12] I passed my MSR thesis defense at Carnegie Mellon University with the thesis "Multi View 4D Human Reconstruction under Interaction Scenarios".
[2025/11] SAM3D Body was released as part of SAM3D, together with SAM3. I was part of the SAM3D Body team and contributed to model development and training.
[2025/11] Contact4D and BodyContact4D were accepted to 3DV 2026.
[2025/06] I attended CVPR 2025 in Nashville, United States.
[2025/06] I started a new position as a Research Scientist Intern at Meta FAIR, working on a promptable human mesh recovery model.
[2024/12] I attended NeurIPS 2024 in Vancouver, Canada.

Affiliation


NTNU
B.S. in EE
Advisor: Jacky Baltes
Sept 2017 ~ June 2021

CMU
M.S. in Robotics
Robotics Institute
Advisor: Kris Kitani
Sept 2023 ~ Present

Meta
Research Scientist Intern, FAIR
PI: Xitong Yang
June 2025 ~ Sept 2025

Publications


SAM 3D Body

SAM 3D Body: Robust Full-Body Human Mesh Recovery
SAM3D Body Team at Meta
Technical Report, 2025

A promptable 3D human mesh recovery model.

Contact4D

Contact4D: A Video Dataset for Whole-body Human Motion and Finger Contact in Dexterous Operations
Jyun-Ting Song, Jungeun Kim, Jinkun Cao, Yu Lei, Takuma Yagi, Kris Kitani
3D Vision (3DV), 2026

A large-scale whole-body human dataset for dexterous operations with finger contact annotations.

BodyContact4D

BodyContact4D: A Multi-view Video Dataset for Understanding Human and Environment Interactions
Soyong Shin, Chaeeun Lee, Holly Chen, Jyun-Ting Song, Eni Halilaj, Kris Kitani
3D Vision (3DV), 2026

A large-scale human dataset for body part contact estimation.

Harmony4D

Harmony4D: A Video Dataset for In-The-Wild Close Human Interactions
Rawal Khirodkar*, Jyun-Ting Song*, Jinkun Cao, Zhengyi Luo, Kris Kitani
Neural Information Processing Systems (NeurIPS), 2024

A large-scale multi-human dataset for close human interactions captured in in-the-wild environments.

Balance Board

Reinforcement Learning and Action Space Shaping for a Humanoid Agent in a Highly Dynamic Environment
Jyun-Ting Song, Guilherme Christmann, Jaesik Jeong, Jacky Baltes
Springer's Studies in Computational Intelligence, 2023

Reinforcement learning framework for training a humanoid agent to balance on a dynamic board via contact-rich control.

CORSMAL

The CORSMAL benchmark for the prediction of the properties of containers
Alessio Xompero, et al.
IEEE Access, 2022

Benchmark for estimating container properties such as mass, type, and fill level from multimodal audio-visual data.


adapted from Jon Barron's awesome webpage