🎥
Video Model ML Project
Task 1: Enhanced Pose Control (Priority)
Objective: Implement pose estimation and tracking with Meta's Sapiens model, improving on the current DWPose preprocessor.
Implementation Steps:
Sapiens Integration:
Deploy Sapiens model as custom ComfyUI nodes
Ensure compatibility with DWPose / OpenPose rig format
Replace current DWPose preprocessor
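As an illustration of the rig-compatibility step, the sketch below converts a 17-keypoint COCO-ordered body pose into the 18-keypoint OpenPose body ordering, synthesizing the neck as the midpoint of the shoulders. It assumes Sapiens is run with a COCO-17 output; other Sapiens keypoint sets would need a different index table. The function names and the (x, y, confidence) tuple layout are hypothetical choices for this example, not part of any existing node.

```python
from typing import List, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, confidence), assumed layout

# OpenPose-18 slot -> COCO-17 index. Slot 1 (neck) has no COCO counterpart.
OPENPOSE_FROM_COCO = [0, None, 6, 8, 10, 5, 7, 9, 12, 14, 16, 11, 13, 15, 2, 1, 4, 3]


def coco17_to_openpose18(kps: List[Keypoint]) -> List[Keypoint]:
    """Reorder COCO-17 keypoints into OpenPose-18 order, adding a neck point."""
    if len(kps) != 17:
        raise ValueError("expected 17 COCO keypoints")
    l_sh, r_sh = kps[5], kps[6]
    # Neck = shoulder midpoint; confidence is the weaker of the two shoulders.
    neck = ((l_sh[0] + r_sh[0]) / 2, (l_sh[1] + r_sh[1]) / 2, min(l_sh[2], r_sh[2]))
    return [neck if i is None else kps[i] for i in OPENPOSE_FROM_COCO]


def to_pose_keypoints_2d(kps: List[Keypoint]) -> List[float]:
    """Flatten to the [x0, y0, c0, x1, y1, c1, ...] list used in OpenPose JSON."""
    flat: List[float] = []
    for x, y, c in kps:
        flat.extend([x, y, c])
    return flat
```

Hands and face keypoints are left out here; a full rig would need similar index tables for those parts.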
Performance Remapping:
Handle dimensional mismatches between reference image and base video
Map key points between reference and target performances
Ensure pose consistency across different aspect ratios/scales
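One possible way to handle the dimensional mismatch is an aspect-preserving fit: scale the reference keypoints uniformly so the reference frame fits inside the target frame, then center it. The sketch below shows that mapping. The centering policy, the function name, and the (x, y, confidence) pixel layout are assumptions made for the example; the brief does not specify how the remapping should behave.

```python
from typing import List, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, confidence) in pixels, assumed layout


def remap_keypoints(
    keypoints: List[Keypoint],
    src_size: Tuple[int, int],
    dst_size: Tuple[int, int],
) -> List[Keypoint]:
    """Map keypoints from a (width, height) source frame into a target frame.

    Uses one uniform scale so body proportions are not stretched, and centers
    the scaled source inside the target (letterbox-style padding).
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    scale = min(dst_w / src_w, dst_h / src_h)
    pad_x = (dst_w - src_w * scale) / 2.0
    pad_y = (dst_h - src_h * scale) / 2.0
    return [(x * scale + pad_x, y * scale + pad_y, c) for x, y, c in keypoints]


# Example: a 512x512 reference image mapped into a 1280x720 base video.
mapped = remap_keypoints([(256.0, 256.0, 0.9)], (512, 512), (1280, 720))
```

A uniform scale keeps limb ratios intact across aspect ratios; matching the performer's position and size in the target video would need an extra per-subject offset and scale on top of this.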
Task 2: Trajectory Control System
Objective: Track and control significant objects/elements throughout video sequences.
Implementation Approach:
Use SAM2 for bounding box detection and tracking
Identify significant objects in initial frames
Maintain consistent tracking across entire video sequence
Python + ComfyUI prototype development
Deliverables:
Bounding box detection system
Format bounding box information as a VACE preprocessor
Multi-object tracking capability
Integration with existing video pipeline
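The multi-object tracking and formatting deliverables could be prototyped along these lines. The sketch assumes the tracker's output has already been reduced to one box per object per frame, with None where the object was missed. It fills short gaps by linear interpolation and emits normalized per-frame records. The record layout is hypothetical and is not VACE's actual input format, which would need to be checked against the VACE preprocessing code.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def fill_gaps(track: List[Optional[Box]]) -> List[Optional[Box]]:
    """Linearly interpolate missing boxes that lie between two known frames."""
    filled: List[Optional[Box]] = list(track)
    known = [i for i, box in enumerate(track) if box is not None]
    for start, end in zip(known, known[1:]):
        a, b = track[start], track[end]
        for i in range(start + 1, end):
            t = (i - start) / (end - start)
            filled[i] = tuple(a[k] + t * (b[k] - a[k]) for k in range(4))
    return filled


def to_frame_records(
    tracks: Dict[str, List[Optional[Box]]], width: int, height: int
) -> List[dict]:
    """Build one record per frame with boxes normalized to the 0..1 range."""
    filled = {obj_id: fill_gaps(track) for obj_id, track in tracks.items()}
    n_frames = max((len(t) for t in filled.values()), default=0)
    records = []
    for frame in range(n_frames):
        boxes = {}
        for obj_id, track in filled.items():
            box = track[frame] if frame < len(track) else None
            if box is not None:
                x0, y0, x1, y1 = box
                boxes[obj_id] = [x0 / width, y0 / height, x1 / width, y1 / height]
        records.append({"frame": frame, "boxes": boxes})
    return records
```

Frames before an object's first detection or after its last are left without a box rather than extrapolated, since guessing motion outside the observed range is riskier than interpolating inside it.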
Task 3: Camera Path Control
Objective: Enable precise camera movement control through 3D scene understanding.
Technical Approach:
Create 3D bounding boxes of scene elements
Track camera position relative to 3D scene
Combine depth and trajectory tracking data
Use SAM2 for 3D bounding box generation
Format 3D bounding box information as a VACE preprocessor
Integration: Can be combined with the trajectory control brief (Task 2) for efficiency.
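To make the "combine depth and trajectory tracking data" step concrete, here is a minimal sketch that lifts a tracked 2D box into an axis-aligned 3D box in camera coordinates using a pinhole camera model. It assumes a depth range for the object is already available from some depth estimator and that the camera intrinsics (fx, fy, cx, cy) are known or estimated; neither is specified in the brief, and the function names are hypothetical.

```python
from itertools import product
from typing import Tuple

Point3 = Tuple[float, float, float]


def backproject(u: float, v: float, z: float,
                fx: float, fy: float, cx: float, cy: float) -> Point3:
    """Pinhole back-projection of pixel (u, v) at depth z into camera space."""
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)


def box_2d_to_3d(
    box: Tuple[float, float, float, float],
    z_near: float,
    z_far: float,
    fx: float, fy: float, cx: float, cy: float,
) -> Tuple[Point3, Point3]:
    """Return (min_corner, max_corner) of the 3D box enclosing the 2D box
    swept between depths z_near and z_far."""
    x0, y0, x1, y1 = box
    corners = [
        backproject(u, v, z, fx, fy, cx, cy)
        for u, v, z in product((x0, x1), (y0, y1), (z_near, z_far))
    ]
    mins = tuple(min(c[i] for c in corners) for i in range(3))
    maxs = tuple(max(c[i] for c in corners) for i in range(3))
    return mins, maxs
```

The result is expressed relative to the camera for each frame. Recovering the camera path itself would additionally require registering these per-frame boxes against a fixed scene frame, which this sketch does not attempt.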
Task 4: Enhanced Edge Map Processing
Objective: Replace Canny edge detection with diffusion-based edge mapping for improved accuracy.
Implementation:
Integrate FramePath photo-to-line SafeTensors model
Replace current Canny-based edge detection
Implement diffusion-based edge map generation
Ensure improved accuracy over time, and optimize performance for real-time use
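One simple way to check "accuracy over time" when comparing the Canny baseline with a diffusion-based edge model is a frame-to-frame flicker score: the mean absolute per-pixel change between consecutive edge maps. The sketch below is an assumed evaluation aid, not part of the brief. It takes each edge map as a flattened sequence of values in the 0..1 range and uses plain Python so it has no dependencies; a real pipeline would more likely operate on arrays or tensors.

```python
from typing import Sequence


def temporal_flicker(frames: Sequence[Sequence[float]]) -> float:
    """Mean absolute per-pixel change between consecutive edge maps.

    Returns 0.0 for fewer than two frames or for a perfectly static sequence.
    """
    if len(frames) < 2:
        return 0.0
    total = 0.0
    count = 0
    for prev, curr in zip(frames, frames[1:]):
        if len(prev) != len(curr):
            raise ValueError("all frames must have the same number of pixels")
        total += sum(abs(a - b) for a, b in zip(prev, curr))
        count += len(curr)
    return total / count if count else 0.0
```

Real motion in the scene also raises this score, so it is only meaningful when both edge detectors are run on the same clip and their scores are compared against each other.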
Price
We are offering ~$2k per task for this project. If you’re interested, please contact rishi@glyf.space or shaheel@glyf.space.