Summary: Lesson 1 - Camera Systems and Computer Vision

Module: Module 2 - Sensors and Perception for Humanoid Robots
Lesson: 01-camera-systems.md
Target Audience: CS students with Python + Module 1 (ROS2) knowledge
Estimated Time: 30-45 minutes
Difficulty: Beginner

Learning Outcomes

By the end of this lesson, students will be able to:

Understand the five foundational camera parameters (pixels, resolution, frame rate, field of view, image encoding)
Differentiate between monocular, stereo, and RGB-D cameras based on trade-offs
Apply ROS2 sensor_msgs/Image and CameraInfo message structures to process camera data
Analyze camera placement strategies (head, wrist, chest) and their impact on humanoid capabilities
Evaluate trade-offs in multi-camera system design for competing requirements

Key Concepts Covered

Camera Types (Section 3.1)

Monocular: Simple, low-cost, no depth (2D only)
Stereo: Passive depth via triangulation, 0.5-10m range
RGB-D: Active depth (IR/ToF), 0.3-10m range, limited outdoor use
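
The triangulation behind passive stereo depth can be sketched with the standard pinhole relation Z = f · B / d, where f is the focal length in pixels, B is the baseline between the two cameras, and d is the disparity in pixels. This is a minimal illustration, not the lesson's own code: the function name and the sample numbers (500 px focal length, 10 cm baseline) are assumptions chosen for the example.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Return depth in metres for a rectified stereo pair (pinhole model)."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity; negative is invalid.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical camera: 500 px focal length, 0.10 m baseline.
    for disparity in (100.0, 10.0, 5.0):
        depth = depth_from_disparity(500.0, 0.10, disparity)
        print(f"disparity {disparity:5.1f} px -> depth {depth:.2f} m")
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more at long range than at short range, which is one reason stereo systems quote a bounded working range.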

Camera Parameters (Section 3.2)

Resolution: 640×480 (VGA) to 4K; higher resolution costs more computation per frame
Field of View: 30-180°; inversely related to focal length (a longer focal length gives a narrower view)
Frame Rate: 10-60+ FPS; higher rates consume more bandwidth
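
To make the resolution and frame-rate trade-offs concrete, the sketch below estimates the bandwidth of an uncompressed image stream as width × height × bytes per pixel × frames per second. The function name is hypothetical, and the 3 bytes per pixel assumes an 8-bit, 3-channel encoding such as rgb8; compressed transports would behave differently.

```python
def raw_bandwidth_mb_per_s(width: int, height: int, bytes_per_pixel: int, fps: float) -> float:
    """Uncompressed stream bandwidth in megabytes per second (1 MB = 10**6 bytes)."""
    return width * height * bytes_per_pixel * fps / 1e6


if __name__ == "__main__":
    # Assumes 3 bytes per pixel (e.g. rgb8) and no compression.
    vga = raw_bandwidth_mb_per_s(640, 480, 3, 30)
    uhd = raw_bandwidth_mb_per_s(3840, 2160, 3, 30)
    print(f"VGA @ 30 FPS:    {vga:.1f} MB/s")
    print(f"4K UHD @ 30 FPS: {uhd:.1f} MB/s")
    print(f"ratio: {uhd / vga:.0f}x")
```

At the same frame rate, a 3840×2160 stream carries 27 times as many pixels as a 640×480 stream, so both bandwidth and per-frame processing cost scale accordingly.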

Camera Placement (Section 3.3)

Head-mounted: Navigation, situational awareness, pan/tilt capability
Wrist-mounted: Manipulation, visual servoing, eye-in-hand
Chest-mounted: Stable SLAM reference, compromise viewpoint

ROS2 Integration (Sections 3.4-3.5)

sensor_msgs/Image: header, dimensions, encoding, step, raw data
sensor_msgs/CameraInfo: K matrix (intrinsics), distortion coefficients
Publisher-Subscriber Pattern: Camera driver → multiple vision nodes
QoS: Best-effort reliability with a moderate queue depth, so real-time consumers drop stale frames instead of waiting on retransmissions
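
The relationship between the sensor_msgs/Image fields (height, width, encoding, step, data) can be sketched as a pixel lookup into the raw byte buffer. This is an illustrative helper, not part of the lesson's code: `pixel_at` and the `BYTES_PER_PIXEL` table are hypothetical names, and a `SimpleNamespace` stands in for a real message so the example runs without ROS2 installed.

```python
from types import SimpleNamespace

# Bytes per pixel for a few common sensor_msgs/Image encodings.
BYTES_PER_PIXEL = {"mono8": 1, "rgb8": 3, "bgr8": 3, "rgba8": 4, "bgra8": 4}


def pixel_at(msg, row: int, col: int) -> tuple[int, ...]:
    """Return the channel values of one pixel from an Image-like message."""
    bpp = BYTES_PER_PIXEL[msg.encoding]
    if not (0 <= row < msg.height and 0 <= col < msg.width):
        raise IndexError("pixel outside image bounds")
    # step is the full row length in bytes and may include padding,
    # so it is used instead of width * bpp when skipping rows.
    offset = row * msg.step + col * bpp
    return tuple(msg.data[offset:offset + bpp])


if __name__ == "__main__":
    # Stand-in for a 2x2 rgb8 message; a real one would arrive in a subscriber callback.
    fake = SimpleNamespace(
        height=2, width=2, encoding="rgb8", step=6,
        data=bytes([255, 0, 0,   0, 255, 0,
                    0, 0, 255,   9, 9, 9]),
    )
    print(pixel_at(fake, 1, 0))
```

In a ROS2 node the same function could be called on the message passed to the subscription callback, since it only relies on the fields listed above.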

Real-World Examples

Boston Dynamics Atlas

Stereo cameras in head (752×480 @ 30fps, 10cm baseline)
RGB cameras in wrists (1280×720 @ 10-15fps)
Task specialization: depth for locomotion, color for manipulation

Tesla Optimus

Eight monocular cameras (1280×960 @ 36fps)
Vision-only approach (no lidar)
Neural network depth estimation from monocular cues

Code Examples

CameraSubscriber: Subscribe to /camera/image_raw and log metadata
CameraInfoSubscriber: Extract K matrix and calculate FOV from focal length

Both examples demonstrate:

Type hints on all function signatures
ROS2 node inheritance pattern
Callback-based message processing
Proper use of sensor_msgs types
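
The FOV calculation mentioned for CameraInfoSubscriber can be sketched as follows, using the pinhole relation FOV = 2 · atan(size / (2 · f)). This is a minimal sketch rather than the lesson's actual node: `fov_degrees` is a hypothetical name, a `SimpleNamespace` stands in for a sensor_msgs/CameraInfo message, and the intrinsics are made-up values. It reads the row-major 3×3 intrinsic matrix from the `k` field, as ROS2 names it, and ignores lens distortion.

```python
import math
from types import SimpleNamespace


def fov_degrees(camera_info) -> tuple[float, float]:
    """Horizontal and vertical FOV in degrees from a CameraInfo-like message."""
    # K is stored row-major: [fx, 0, cx, 0, fy, cy, 0, 0, 1]
    fx, fy = camera_info.k[0], camera_info.k[4]
    horizontal = 2.0 * math.degrees(math.atan(camera_info.width / (2.0 * fx)))
    vertical = 2.0 * math.degrees(math.atan(camera_info.height / (2.0 * fy)))
    return horizontal, vertical


if __name__ == "__main__":
    # Hypothetical intrinsics for a 640x480 camera with fx = fy = 320 px.
    info = SimpleNamespace(
        width=640, height=480,
        k=[320.0, 0.0, 320.0,
           0.0, 320.0, 240.0,
           0.0, 0.0, 1.0],
    )
    h, v = fov_degrees(info)
    print(f"horizontal FOV: {h:.1f} deg, vertical FOV: {v:.1f} deg")
```

Doubling the focal length in this example narrows the horizontal FOV from 90° to about 53°, which illustrates the inverse relationship between focal length and field of view.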

Practice Exercises

Multi-Camera System Design: Design a vision system for a delivery robot (navigation + object recognition + human interaction)
Live Topic Inspection: Use ros2 CLI tools to inspect camera topics and extract parameters
AI Colearning Prompts:
- FOV/focal length relationship analogy
- Stereo synchronization requirements

Common Pitfalls (Expert Insights)

Resolution Sweet Spot: 640×480 @ 15-30fps is often a better choice than 4K for real-time humanoid applications
RGB vs BGR: OpenCV uses BGR by default; rgb8 messages need conversion, otherwise the red and blue channels appear swapped
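
The RGB/BGR pitfall comes down to channel order within each pixel. In a ROS2 node this conversion is usually delegated to cv_bridge or OpenCV; the pure-Python sketch below only illustrates what the swap does. The function name is hypothetical, and it assumes a tightly packed rgb8 buffer with no row padding (step equal to width × 3).

```python
def rgb8_to_bgr8(data: bytes) -> bytes:
    """Swap the red and blue channels of a tightly packed 8-bit, 3-channel buffer."""
    if len(data) % 3 != 0:
        raise ValueError("buffer length must be a multiple of 3")
    out = bytearray(data)
    out[0::3] = data[2::3]  # blue moves into the first channel slot
    out[2::3] = data[0::3]  # red moves into the third channel slot
    return bytes(out)


if __name__ == "__main__":
    # One pure-red pixel followed by one pure-green pixel, in RGB order.
    rgb = bytes([255, 0, 0, 0, 255, 0])
    print(list(rgb8_to_bgr8(rgb)))
```

Green stays in place, so an image with swapped channels often looks plausible at first glance; red objects rendered as blue are the usual giveaway.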

Assessment Criteria

Students demonstrate mastery when they can:

Explain camera type differences and justify selection for specific humanoid tasks
Subscribe to ROS2 camera topics and process image/calibration data
Design multi-camera configurations with trade-off justification
Calculate FOV from intrinsic matrix parameters
Identify appropriate camera placements for manipulation vs navigation

Prerequisites

Module 1: ROS2 Basics (nodes, topics, publishers, subscribers, message types)
Python 3.11+ with type hints
Basic linear algebra (vectors, matrices for K matrix understanding)

Next Steps

Lesson 2: Depth Sensing Technologies (LiDAR, structured light, ToF)
Connection: Cameras provide 2D semantic info; depth sensors add 3D spatial awareness

Metadata

Generated by: Agent Pipeline (9-agent system)
Created: 2025-12-08
Tags: ros2, sensors, camera, computer-vision, humanoid-robotics
Cognitive Load: Moderate (6 new concepts, builds on Module 1)
Word Count: ~3,200 words (including code + callouts)
Sections: 7 (What Is, Why Matters, Key Principles, Callouts, Code Examples, Summary, Next Steps)

Validation Status

✅ Technical Review: PASS WITH MINOR REVISIONS (all fixed)
✅ Structure & Style: PASS (all 7 sections, proper callouts)
✅ Frontmatter: COMPLETE (13 fields generated)
✅ Code Quality: PASS (type hints, docstrings, ROS2 patterns)
✅ Case Studies: 2 detailed examples (Atlas, Optimus)
✅ Callouts: 2 AI Colearning, 2 Expert Insights, 2 Practice Exercises
