Summary: Lesson 1 - Camera Systems and Computer Vision
Module: Module 2 - Sensors and Perception for Humanoid Robots
Lesson: 01-camera-systems.md
Target Audience: CS students with Python + Module 1 (ROS2) knowledge
Estimated Time: 30-45 minutes
Difficulty: Beginner
Learning Outcomes
By the end of this lesson, students will be able to:
- Understand the five foundational camera parameters (pixels, resolution, frame rate, field of view, image encoding)
- Differentiate between monocular, stereo, and RGB-D cameras based on trade-offs
- Apply ROS2 sensor_msgs/Image and CameraInfo message structures to process camera data
- Analyze camera placement strategies (head, wrist, chest) and their impact on humanoid capabilities
- Evaluate trade-offs in multi-camera system design for competing requirements
Key Concepts Covered
Camera Types (Section 3.1)
- Monocular: Simple, low-cost, no depth (2D only)
- Stereo: Passive depth via triangulation, 0.5-10m range
- RGB-D: Active depth (projected IR pattern or time-of-flight), 0.3-10m range, limited outdoor use because sunlight washes out the emitted IR
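The triangulation behind stereo depth can be sketched in a few lines. This is a minimal illustration, assuming a rectified stereo pair where depth follows Z = f * B / d (focal length in pixels, baseline in metres, disparity in pixels). The function name and the 400 px focal length are hypothetical values chosen for the example, not figures from the lesson.

```python
def stereo_depth_m(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point from a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        # Zero disparity means the point is at infinity (or the match failed).
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Assumed values: 400 px focal length, 0.10 m baseline.
for disparity in (80.0, 8.0, 4.0):
    depth = stereo_depth_m(400.0, 0.10, disparity)
    print(f"disparity {disparity:5.1f} px -> depth {depth:.2f} m")
```

Because depth is inversely proportional to disparity, distant points fall on only a few pixels of disparity, which is one reason passive stereo has a limited useful range.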
Camera Parameters (Section 3.2)
- Resolution: 640×480 (VGA) to 4K, trade-off with computation
- Field of View: 30-180°, inverse relationship with focal length
- Frame Rate: 10-60+ FPS, trade-off with bandwidth
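The resolution and frame-rate trade-offs above become concrete with a quick bandwidth estimate. The sketch below assumes uncompressed frames at 3 bytes per pixel (as in rgb8) and takes "4K" to mean 3840×2160; the function name is illustrative only.

```python
def raw_bandwidth_bytes_per_s(
    width: int, height: int, fps: float, bytes_per_pixel: int = 3
) -> float:
    """Uncompressed image stream bandwidth in bytes per second."""
    return width * height * bytes_per_pixel * fps


vga = raw_bandwidth_bytes_per_s(640, 480, 30)
uhd = raw_bandwidth_bytes_per_s(3840, 2160, 30)  # assumed 4K UHD dimensions

print(f"VGA @ 30 FPS: {vga / 1e6:.1f} MB/s")   # 27.6 MB/s
print(f"4K  @ 30 FPS: {uhd / 1e6:.1f} MB/s")   # 746.5 MB/s
print(f"ratio: {uhd / vga:.0f}x")              # 27x
```

Under these assumptions a single 4K stream carries 27 times the data of a VGA stream at the same frame rate, before any vision processing runs on it.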
Camera Placement (Section 3.3)
- Head-mounted: Navigation, situational awareness, pan/tilt capability
- Wrist-mounted: Manipulation, visual servoing, eye-in-hand
- Chest-mounted: Stable SLAM reference, compromise viewpoint
ROS2 Integration (Sections 3.4-3.5)
- sensor_msgs/Image: header, dimensions, encoding, step, raw data
- sensor_msgs/CameraInfo: K matrix (intrinsics), distortion coefficients
- Publisher-Subscriber Pattern: Camera driver → multiple vision nodes
- QoS: Best-effort reliability with a moderate queue depth for real-time use, so late frames are dropped rather than retransmitted
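To illustrate how the Image fields fit together, the sketch below reads one pixel out of the raw data buffer. It uses a plain dataclass as a hypothetical stand-in for sensor_msgs/Image so it runs without ROS2 installed; the field names mirror the real message, but `FakeImage`, `BYTES_PER_PIXEL`, and `pixel_at` are names invented for this example. The key point is that `step` is the length of one row in bytes, which may exceed width × bytes-per-pixel when rows are padded.

```python
from dataclasses import dataclass


@dataclass
class FakeImage:
    """Hypothetical stand-in for sensor_msgs/Image (subset of fields)."""
    height: int
    width: int
    encoding: str
    step: int      # full row length in bytes, including any padding
    data: bytes    # raw pixel buffer, row-major


# Bytes per pixel for a few common 8-bit encodings.
BYTES_PER_PIXEL = {"mono8": 1, "rgb8": 3, "bgr8": 3, "rgba8": 4, "bgra8": 4}


def pixel_at(msg: FakeImage, row: int, col: int) -> tuple[int, ...]:
    """Return the channel values of the pixel at (row, col)."""
    if not (0 <= row < msg.height and 0 <= col < msg.width):
        raise IndexError("pixel coordinates out of range")
    bpp = BYTES_PER_PIXEL[msg.encoding]
    start = row * msg.step + col * bpp  # use step, not width * bpp
    return tuple(msg.data[start:start + bpp])


# 2x2 rgb8 image whose rows are padded from 6 to 8 bytes.
msg = FakeImage(
    height=2, width=2, encoding="rgb8", step=8,
    data=bytes([1, 2, 3, 4, 5, 6, 0, 0,
                7, 8, 9, 10, 11, 12, 0, 0]),
)
print(pixel_at(msg, 1, 1))  # (10, 11, 12)
```

Indexing with `width * bpp` instead of `step` would read the padding bytes on the second row, which is why the message carries `step` explicitly.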
Real-World Examples
Boston Dynamics Atlas
- Stereo cameras in head (752×480 @ 30fps, 10cm baseline)
- RGB cameras in wrists (1280×720 @ 10-15fps)
- Task specialization: depth for locomotion, color for manipulation
Tesla Optimus
- Eight monocular cameras (1280×960 @ 36fps)
- Vision-only approach (no lidar)
- Neural network depth estimation from monocular cues
Code Examples
- CameraSubscriber: Subscribe to /camera/image_raw and log metadata
- CameraInfoSubscriber: Extract K matrix and calculate FOV from focal length
Both examples demonstrate:
- Type hints on all function signatures
- ROS2 node inheritance pattern
- Callback-based message processing
- Proper use of sensor_msgs types
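The FOV calculation mentioned for CameraInfoSubscriber can be sketched without ROS2. This is an illustrative version, not the lesson's own code: it assumes the K matrix arrives as a flat, row-major list of nine values (fx at index 0, fy at index 4), an undistorted pinhole model, and FOV = 2 * atan(size / (2 * f)). The intrinsics used are made-up example values.

```python
import math
from typing import Sequence


def fov_from_k(k: Sequence[float], width: int, height: int) -> tuple[float, float]:
    """Horizontal and vertical field of view in degrees from a row-major 3x3 K."""
    fx, fy = k[0], k[4]
    fov_h = 2.0 * math.atan(width / (2.0 * fx))
    fov_v = 2.0 * math.atan(height / (2.0 * fy))
    return math.degrees(fov_h), math.degrees(fov_v)


# Assumed intrinsics for a 640x480 camera: fx = fy = 320, cx = 320, cy = 240.
K = [320.0, 0.0, 320.0,
     0.0, 320.0, 240.0,
     0.0, 0.0, 1.0]

fov_h, fov_v = fov_from_k(K, 640, 480)
print(f"horizontal FOV: {fov_h:.1f} deg, vertical FOV: {fov_v:.1f} deg")
```

Doubling the focal length in this formula narrows the field of view, which is the inverse relationship listed under Camera Parameters.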
Practice Exercises
- Multi-Camera System Design: Design a vision system for a delivery robot (navigation + object recognition + human interaction)
- Live Topic Inspection: Use ros2 CLI tools to inspect camera topics and extract parameters
- AI Colearning Prompts: FOV/focal length relationship analogy; stereo synchronization requirements
Common Pitfalls (Expert Insights)
- Resolution Sweet Spot: 640×480 @ 15-30fps is often a better choice than 4K for real-time humanoid applications, since lower resolution reduces computation and bandwidth
- RGB vs BGR: OpenCV uses BGR channel order by default; rgb8 messages need conversion, otherwise red and blue appear swapped
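The channel swap behind the RGB vs BGR pitfall is small enough to show directly. The sketch below works on a raw byte buffer in pure Python and assumes tightly packed rgb8 data with no row padding; `rgb8_to_bgr8` is a name made up for this example. In a real node the same swap is usually delegated to a library call such as OpenCV's `cv2.cvtColor(img, cv2.COLOR_RGB2BGR)`.

```python
def rgb8_to_bgr8(data: bytes) -> bytes:
    """Swap the R and B channels of a tightly packed rgb8 buffer."""
    if len(data) % 3 != 0:
        raise ValueError("buffer length must be a multiple of 3")
    out = bytearray(data)
    out[0::3] = data[2::3]  # blue moves into the first channel slot
    out[2::3] = data[0::3]  # red moves into the third channel slot
    return bytes(out)


# One pure-red pixel followed by one pixel with distinct channel values.
rgb = bytes([255, 0, 0, 1, 2, 3])
print(list(rgb8_to_bgr8(rgb)))  # [0, 0, 255, 3, 2, 1]
```

Skipping this step does not raise an error; the image simply displays with red and blue exchanged, which makes the bug easy to miss on scenes without strong colors.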
Assessment Criteria
Students demonstrate mastery when they can:
- Explain camera type differences and justify selection for specific humanoid tasks
- Subscribe to ROS2 camera topics and process image/calibration data
- Design multi-camera configurations with trade-off justification
- Calculate FOV from intrinsic matrix parameters
- Identify appropriate camera placements for manipulation vs navigation
Prerequisites
- Module 1: ROS2 Basics (nodes, topics, publishers, subscribers, message types)
- Python 3.11+ with type hints
- Basic linear algebra (vectors, matrices for K matrix understanding)
Next Steps
- Lesson 2: Depth Sensing Technologies (LiDAR, structured light, ToF)
- Connection: Cameras provide 2D semantic info; depth sensors add 3D spatial awareness
Metadata
- Generated by: Agent Pipeline (9-agent system)
- Created: 2025-12-08
- Tags: ros2, sensors, camera, computer-vision, humanoid-robotics
- Cognitive Load: Moderate (6 new concepts, builds on Module 1)
- Word Count: ~3,200 words (including code + callouts)
- Sections: 7 (What Is, Why Matters, Key Principles, Callouts, Code Examples, Summary, Next Steps)
Validation Status
- ✅ Technical Review: PASS WITH MINOR REVISIONS (all fixed)
- ✅ Structure & Style: PASS (all 7 sections, proper callouts)
- ✅ Frontmatter: COMPLETE (13 fields generated)
- ✅ Code Quality: PASS (type hints, docstrings, ROS2 patterns)
- ✅ Case Studies: 2 detailed examples (Atlas, Optimus)
- ✅ Callouts: 2 AI Colearning, 2 Expert Insights, 2 Practice Exercises