face_pose_mlp
feat.utils.face_pose_mlp
Landmark-only pose MLP inference.
The sole 6DoF-pose backend for non-img2pose face_model paths. The loaded MLP takes 68 face landmarks (normalized by their own centroid and inter-eye distance, not by the face bbox; see pose_from_landmarks_mlp) and emits a 6DoF head pose calibrated to img2pose's coordinate frame.
Training: distilled from img2pose on CelebV-HQ. v2 (the default) was trained on 2.78M frames from 35K clips with a 512→256→128 hidden stack using LayerNorm, GELU, and Dropout; v1 was the smaller 256→128→64 ReLU baseline trained on ~570K frames. v2 validation MAE on held-out CelebV-HQ: pitch 2.66°, roll 2.34°, yaw 1.58°. For rough context, img2pose reports ~4° average MAE on BIWI; that is a different dataset, so the figures are not directly comparable (lower is better).
Weights: models/pose_mlp_v2.safetensors locally; py-feat/pose_mlp_v2 on HuggingFace.
PoseMLP
Bases: Module
Mirror of the architecture in scripts/train_pose_mlp.py.
v2 architecture: Linear → LayerNorm → GELU → Dropout per hidden block, with wider hidden layers (default 512/256/128). v1 used a bare Linear→ReLU→Dropout stack (256/128/64); we keep backward compatibility by inferring the architecture from the checkpoint when loading.
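The block layout described above can be sketched in PyTorch as follows. This is a minimal illustration, not the py-feat implementation: the class name `PoseMLPSketch`, the helpers `infer_hidden_sizes` and `has_layernorm`, the flattened 136-value input (68 landmarks × 2 coordinates), and the dropout rate are all assumptions. The helpers show one plausible way to infer the architecture from a checkpoint, by reading layer widths off the 2-D `Linear` weights and checking for 1-D `LayerNorm` weights.

```python
import torch
from torch import nn


class PoseMLPSketch(nn.Module):
    """Illustrative stand-in for PoseMLP (hypothetical, not the library class)."""

    def __init__(self, in_dim=136, hidden=(512, 256, 128), out_dim=6,
                 dropout=0.1, v2=True):
        # in_dim=136 and dropout=0.1 are assumptions made for this sketch.
        super().__init__()
        layers = []
        prev = in_dim
        for width in hidden:
            layers.append(nn.Linear(prev, width))
            if v2:
                # v2 block: Linear -> LayerNorm -> GELU -> Dropout
                layers += [nn.LayerNorm(width), nn.GELU()]
            else:
                # v1 block: Linear -> ReLU -> Dropout
                layers.append(nn.ReLU())
            layers.append(nn.Dropout(dropout))
            prev = width
        layers.append(nn.Linear(prev, out_dim))  # 6DoF output head
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


def infer_hidden_sizes(state_dict):
    """Read hidden widths from the 2-D (Linear) weights, in layer order."""
    widths = [w.shape[0] for k, w in state_dict.items()
              if k.endswith("weight") and w.ndim == 2]
    return tuple(widths[:-1])  # the last Linear is the output head


def has_layernorm(state_dict):
    """LayerNorm weights are 1-D; Linear weights are 2-D."""
    return any(k.endswith("weight") and w.ndim == 1
               for k, w in state_dict.items())


model = PoseMLPSketch().eval()
with torch.no_grad():
    pose = model(torch.zeros(4, 136))  # shape (4, 6)
```

Inferring the layout from tensor shapes avoids storing a separate version flag next to the weights, at the cost of relying on the checkpoint's key order matching the layer order.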
Source code in feat/utils/face_pose_mlp.py
pose_from_landmarks_mlp(landmarks_2d, bboxes=None)
Estimate 6DoF pose from 68 2D landmarks.
Bbox-free: normalizes landmarks by their own centroid + inter-eye distance, so the MLP is decoupled from upstream face-detector bbox conventions (img2pose loose vs retinaface tight).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| landmarks_2d | Tensor | | required |
| bboxes | Tensor \| None | Ignored (kept in signature for backward compatibility). | None |
Returns:
| Type | Description |
|---|---|
| Tensor \| None | |
| Tensor \| None | |
| Tensor \| None | Returns |
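The bbox-free normalization described for pose_from_landmarks_mlp can be sketched as below. This is an illustration under stated assumptions, not the library's code: the helper name `normalize_landmarks`, the `(N, 68, 2)` input layout, the flattened `(N, 136)` output, and the use of eye centers (indices 36-41 and 42-47 in the common iBUG 68-point scheme) for the inter-eye distance are all hypothetical choices.

```python
import torch


def normalize_landmarks(landmarks_2d, eps=1e-6):
    """Center on the landmark centroid and scale by inter-eye distance.

    landmarks_2d: (N, 68, 2) pixel coordinates (assumed layout).
    Returns: (N, 136) flattened, translation- and scale-invariant features.
    """
    centroid = landmarks_2d.mean(dim=1, keepdim=True)        # (N, 1, 2)
    # Eye centers from the iBUG 68-point indices (an assumption here).
    eye_a = landmarks_2d[:, 36:42].mean(dim=1)               # (N, 2)
    eye_b = landmarks_2d[:, 42:48].mean(dim=1)               # (N, 2)
    inter_eye = (eye_a - eye_b).norm(dim=1).clamp_min(eps)   # (N,)
    normed = (landmarks_2d - centroid) / inter_eye[:, None, None]
    return normed.flatten(start_dim=1)                       # (N, 136)


# Shifting or uniformly rescaling the landmarks leaves the features unchanged,
# which is why no detector bbox is needed.
torch.manual_seed(0)
pts = torch.rand(2, 68, 2, dtype=torch.float64) * 100
feats = normalize_landmarks(pts)
feats_moved = normalize_landmarks(pts * 3 + 50)
```

Because both the centroid and the inter-eye distance are computed from the landmarks themselves, a loose or tight detector box has no effect on the features the MLP sees.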