face_gaze
feat.utils.face_gaze
Pure-PyTorch gaze estimation from MediaPipe Face Mesh landmarks.
The MediaPipe mesh emits 478 3D landmarks per face: 468 face vertices plus 5 iris landmarks per eye (indices 468-472 left, 473-477 right). The gaze direction for each eye is approximated by
gaze_eye = normalize(iris_center - eye_center)
where the centers are computed from the iris-ring landmarks and a small ring
of eye-perimeter landmarks respectively. The original implementation in
MPDetector.estimate_gaze_direction does this computation in the camera frame,
which makes a turned head look like averted gaze.
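The per-eye formula above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the module's implementation: the iris index ranges come from the text, but the eye-perimeter indices (`LEFT_EYE_RING`, `RIGHT_EYE_RING`), the helper name `eye_gaze`, and the choice to average all five iris landmarks are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

# Iris index ranges as stated in the text: 468-472 (left), 473-477 (right).
LEFT_IRIS = torch.arange(468, 473)
RIGHT_IRIS = torch.arange(473, 478)

# ASSUMPTION: placeholder eye-perimeter rings (eye corners and lid midpoints).
# The indices the module actually uses are not documented here.
LEFT_EYE_RING = torch.tensor([33, 133, 159, 145])
RIGHT_EYE_RING = torch.tensor([362, 263, 386, 374])


def eye_gaze(landmarks_3d, iris_idx, ring_idx):
    """Return normalize(iris_center - eye_center) for a [B, 478, 3] batch."""
    # ASSUMPTION: each center is the plain mean of its landmarks.
    iris_center = landmarks_3d[:, iris_idx].mean(dim=1)   # [B, 3]
    eye_center = landmarks_3d[:, ring_idx].mean(dim=1)    # [B, 3]
    return F.normalize(iris_center - eye_center, dim=-1)  # unit vectors, [B, 3]
```

Because the inputs here are raw landmarks, the resulting vector is in whatever frame the landmarks are in, which is why a camera-frame computation mixes head rotation into the gaze estimate.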
This module computes gaze in the head frame by rotating the relevant
landmarks using the head pose recovered by feat.utils.face_pose. The
returned gaze vector and (pitch, yaw) angles describe gaze relative to a
neutral, camera-facing head, which is what most downstream analyses want.
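One way to picture the head-frame step is as an inverse rotation applied to the landmarks before the gaze vector is formed. The sketch below assumes that `R` maps head-frame coordinates to camera-frame coordinates (`p_cam = R @ p_head`); the actual convention used by feat.utils.face_pose may differ, and the helper name `to_head_frame` is hypothetical.

```python
import torch


def to_head_frame(points_cam, R):
    """Express camera-frame points [B, N, 3] in the head frame.

    ASSUMPTION: R [B, 3, 3] maps head-frame coordinates to camera-frame
    coordinates (p_cam = R @ p_head), so its inverse is its transpose.
    """
    # For points stored as row vectors, p_head = R^T p_cam becomes points_cam @ R.
    return torch.matmul(points_cam, R)
```

Since the gaze vector is a difference of two points, any head translation cancels out, so only the rotation matters for this step.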
estimate_gaze(landmarks_3d, R=None)
Estimate per-eye and combined gaze from a batch of face-mesh landmarks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| landmarks_3d | | Tensor of shape [B, 478, 3], the full MediaPipe output (face + iris). 468-only input is rejected because gaze needs iris. | required |
| R | | Optional [B, 3, 3] head rotation from | None |
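The shape requirement on landmarks_3d can be checked up front. The following is a hedged sketch of such a check; the helper name `check_landmarks` and the use of `ValueError` are assumptions, since the text only says that 468-only input is rejected.

```python
import torch


def check_landmarks(landmarks_3d):
    """Raise if the input is not a [B, 478, 3] tensor (face + iris)."""
    # ASSUMPTION: ValueError is used here; the module's actual error type
    # is not documented in this section.
    if landmarks_3d.ndim != 3 or tuple(landmarks_3d.shape[1:]) != (478, 3):
        raise ValueError(
            f"expected landmarks of shape [B, 478, 3], got {list(landmarks_3d.shape)}"
        )
    return landmarks_3d
```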
Returns:
| Type | Description |
|---|---|
| dict | Per-eye and combined gaze, with the keys listed in the next table. |

| Key | Shape | Description |
|---|---|---|
| "left_vector" | [B, 3] | Unit gaze vector for the left eye. |
| "right_vector" | [B, 3] | Unit gaze vector for the right eye. |
| "combined_vector" | [B, 3] | Unit gaze vector (mean of L and R). |
| "left_pitch_yaw" | [B, 2] | (pitch, yaw) in radians for the left eye. |
| "right_pitch_yaw" | [B, 2] | (pitch, yaw) in radians for the right eye. |
| "combined_pitch_yaw" | [B, 2] | (pitch, yaw) for the combined gaze. |

Pitch is positive when looking up; yaw is positive when looking to the
subject's left (i.e., the camera's right). Angles are 0 when the gaze
is along the head's forward axis.
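The sign conventions for pitch and yaw, and the combined vector, can be illustrated as follows. This is a sketch under an assumed axis layout (x toward the camera's right, y down, head forward along -z), which this section does not specify; the helper names `vector_to_pitch_yaw` and `combine` are hypothetical, as is renormalizing the mean so that the combined vector stays unit length.

```python
import torch
import torch.nn.functional as F


def vector_to_pitch_yaw(v):
    """Convert unit gaze vectors [B, 3] to (pitch, yaw) in radians, [B, 2].

    ASSUMPTION: head-frame axes are x toward the camera's right, y down,
    and the head's forward axis is -z.
    """
    x, y, z = v.unbind(dim=-1)
    pitch = torch.asin((-y).clamp(-1.0, 1.0))  # looking up (negative y) is positive
    yaw = torch.atan2(x, -z)                   # toward the camera's right is positive
    return torch.stack([pitch, yaw], dim=-1)


def combine(left, right):
    """Mean of the per-eye unit vectors, renormalized to unit length."""
    return F.normalize((left + right) / 2, dim=-1)
```

Under these assumptions, a gaze vector along the forward axis gives (0, 0), matching the statement that angles are 0 when the gaze is along the head's forward axis.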