Screen Capture

macOS screenshot, webcam, and screen recording — full screen, region, window, clipboard, multi-display, video, and webcam capture using native macOS tools.

Default Active Utility

Built-in skill: screen-capture is active by default in every CLI-JAW session. No installation required. It uses macOS native screencapture for screenshots and video, plus the optional imagesnap for webcam capture. For browser-specific screenshots, prefer cli-jaw browser screenshot which captures via CDP.

Quick Reference

Skill namescreen-capture
CategoryUtility
Default activeYes — built-in, always available
SKILL.md path~/.cli-jaw/skills/screen-capture/SKILL.md
Primary toolscreencapture (macOS built-in)
Webcam toolimagesnap (optional, via Homebrew)
PlatformmacOS only
Output formatsPNG (default), JPEG, PDF, TIFF, MOV (video)
Permission requiredSystem Settings > Privacy > Screen Recording
Related skillsvision-click, browser, desktop-control

Quick Start

Three commands to get started immediately:

# Full screen capture (silent, no shutter sound)
screencapture -x ~/screenshot.png

# Interactive region selection
screencapture -i ~/selection.png

# Capture directly to clipboard
screencapture -c

Commands

Full Screen

Capture the entire display. Use -x for silent mode (suppresses the shutter sound) and -T for a countdown delay.

# Silent full screen
screencapture -x ~/screenshot.png

# Full screen with shutter sound
screencapture ~/screenshot.png

# 3-second delay before capture
screencapture -T 3 ~/screenshot.png

Region and Window

Select a specific area interactively or by exact coordinates.

# Interactive: drag to select a region
screencapture -i ~/region.png

# Interactive: click to select a window
screencapture -iW ~/window.png

# Specific coordinates (x, y, width, height)
screencapture -R 0,0,1280,720 ~/region.png

Specific Application Window

Capture a named application window by resolving its window ID via AppleScript. This is useful for automation pipelines where you need to capture a specific app without user interaction.

# Capture Google Chrome window
screencapture -l$(osascript -e 'tell app "Google Chrome" to id of window 1') ~/chrome.png

# Capture Safari window
screencapture -l$(osascript -e 'tell app "Safari" to id of window 1') ~/safari.png

# Capture the frontmost window of whatever app is active
screencapture -l$(osascript -e 'tell app "System Events" to id of first window of (first process whose frontmost is true)') ~/front.png
Window ID resolution: The -l flag takes a CGWindowID. The AppleScript expression tell app "AppName" to id of window 1 returns the correct ID. If the app has multiple windows, change window 1 to window 2, etc.

Clipboard

Capture to the system clipboard instead of a file. Useful for quick paste-into-chat or paste-into-another-app workflows.

# Full screen to clipboard
screencapture -c

# Region select to clipboard
screencapture -ic

Multiple Displays

Handle multi-monitor setups by specifying display numbers or capturing all displays at once.

# One file per display (auto-creates screen1.png, screen2.png, ...)
screencapture ~/screen1.png ~/screen2.png

# Main display only
screencapture -D 1 ~/main.png

# Secondary display
screencapture -D 2 ~/secondary.png

Video Recording

Record the screen as a MOV video. Use -V with a duration (in seconds) for timed recordings, or -v for manual-stop recording.

# Record screen video (stop with Ctrl+C)
screencapture -v ~/recording.mov

# Record exactly 10 seconds
screencapture -V 10 ~/recording.mov

Webcam Capture

Webcam capture requires imagesnap, an optional Homebrew dependency. It captures a still image from the FaceTime camera or any connected USB camera.

# Install imagesnap
brew install imagesnap

# Take a photo from the default camera
imagesnap ~/camera.png

# 2-second warmup for better image quality (recommended)
imagesnap -w 2 ~/camera.png

# Use a specific camera by name
imagesnap -d "FaceTime HD Camera" ~/camera.png

# List all available cameras
imagesnap -l
Warmup tip: Webcams need a brief warmup period to adjust exposure and white balance. Always use -w 2 (or higher) for production-quality captures. Without warmup, the first frame is often dark or color-shifted.

Output Formats

Use the -t flag to specify the image format. The default is PNG.

# PNG (default, lossless)
screencapture -t png ~/screenshot.png

# JPEG (smaller file size, lossy)
screencapture -t jpg ~/screenshot.jpg

# PDF (vector-friendly, useful for documents)
screencapture -t pdf ~/screenshot.pdf

# TIFF (high quality, large file size)
screencapture -t tiff ~/screenshot.tiff

Flag Reference

Complete reference of all screencapture flags used by this skill.

FlagDescriptionExample
-xSilent mode (no shutter sound)screencapture -x out.png
-iInteractive selection mode (drag region or press Space for window)screencapture -i out.png
-cCopy capture to clipboard instead of filescreencapture -c
-R x,y,w,hCapture a specific rectangle by coordinatesscreencapture -R 0,0,1280,720 out.png
-T secondsDelay N seconds before capturescreencapture -T 3 out.png
-t formatOutput format: png, jpg, pdf, or tiffscreencapture -t jpg out.jpg
-l windowIDCapture a specific window by its CGWindowIDscreencapture -l12345 out.png
-D displayNumCapture a specific display (1 = main)screencapture -D 2 out.png
-vRecord screen video (MOV)screencapture -v out.mov
-V secondsRecord video for exactly N secondsscreencapture -V 10 out.mov
-WStart in window-selection mode (combine with -i)screencapture -iW out.png

Common Workflows

Capture a Specific App State

Bring an application to the foreground via AppleScript, wait for it to render, then capture.

# Activate Finder, wait, then capture
osascript -e 'tell app "Finder" to activate'
sleep 0.5
screencapture -x ~/finder.png

Capture and Analyze

Take a screenshot and immediately pass it to the agent's vision model for analysis. This is the pattern used by vision-click and desktop-control.

# In a CLI-JAW session, just say:
"화면 캡처해서 뭐가 보이는지 설명해줘"

# The agent runs:
screencapture -x /tmp/screen.png
# Then reads the image and describes its contents

Timed Capture for UI States

Use the delay flag to give yourself time to set up a specific UI state (e.g., open a dropdown, hover over a tooltip) before the capture fires.

# 5-second delay: enough time to open a menu
screencapture -T 5 -x ~/menu-state.png

Multi-Display Documentation

Capture all monitors at once for workspace documentation or bug reports.

# Captures one PNG per connected display
screencapture ~/display-main.png ~/display-secondary.png

Dependencies

The core screencapture command is built into macOS — no installation needed. Webcam capture requires one optional Homebrew formula.

screencapture built-in (macOS)
Full screen, region, window, clipboard, multi-display, and video capture. No installation required.
imagesnap brew install imagesnap
Optional. Webcam still capture from FaceTime or USB cameras. Required only for camera-related tasks.

Install commands

# Webcam capture (optional)
brew install imagesnap

# Verify installation
imagesnap -l    # Lists available cameras
which screencapture    # Should print /usr/sbin/screencapture

Rules and Behavior

When the screen-capture skill is active, the agent follows these rules:

RuleDetail
Default capture methodUse screencapture as the default when tool-specific capture (Figma, Playwright, CDP) is unavailable.
Browser screenshotsPrefer cli-jaw browser screenshot for web pages (captures via Chrome DevTools Protocol for pixel-perfect fidelity).
Silent modeAlways use -x flag for programmatic captures to suppress the shutter sound.
Webcam warmupAlways use -w 2 or higher with imagesnap for better image quality.
Privacy permissionScreen Recording permission is required (System Settings > Privacy > Screen Recording). The agent will warn if permission is missing.
Temp file cleanupScreenshots taken for analysis (not explicitly saved) should be cleaned up after use.

"~해줌" Usage Examples

Real-world examples of how to trigger screen capture tasks in natural language. Korean and English both work.

"지금 화면 스크린샷 찍어줌"
Captures the full screen silently with screencapture -x and saves it. The agent reads the resulting image to confirm it was captured successfully. The simplest and most common usage.
"크롬 창 캠처해서 로그인 페이지 제대로 나오는지 확인해줌"
Resolves Chrome's window ID via AppleScript, captures that specific window with screencapture -l, then analyzes the image with the vision model to verify the login page renders correctly. Combines capture + visual QA.
"웹캠 사진 찍어줌 — 얼굴 인식용"
Uses imagesnap -w 2 to capture a webcam photo with proper warmup time. The resulting image can be passed to other skills (e.g., vision analysis) for face detection or identity verification workflows.
"화면 록화 10초만 해줌 — 버그 재현 과정 기록용"
Records a 10-second screen video with screencapture -V 10 ~/recording.mov. Useful for documenting bug reproduction steps, UI animation reviews, or creating quick demo clips.
"두번째 모니터 스크린샷 찍어서 클립보드에 복사해줌"
Targets the secondary display with screencapture -D 2 -c, sending the capture directly to the clipboard for immediate pasting into a chat, document, or image editor.

Comparison with Related Skills

CLI-JAW has several skills that involve visual capture. Choose the right tool for the job:

screen-capturebrowserdesktop-controlvision-click
FocusNative macOS screenshots and recordingsCDP-based browser automationFull desktop + browser automationVision-guided click targeting
Best forQuick captures, multi-display, video, webcamWeb page screenshots, DOM interactionCross-app workflows, hybrid targetsClicking UI elements by visual recognition
Capture methodscreencapture CLIChrome DevTools ProtocolCDP + Computer UseScreenshot + coordinate mapping
Requires browserNoYes (cli-jaw server)Depends on targetNo
When to preferNo browser needed, system-level capture, video, webcamWeb pages (pixel-perfect)Complex multi-app automationClicking on-screen elements
Rule of thumb: Use screen-capture for anything outside the browser or when you need video/webcam. Use browser screenshot for web pages. Use desktop-control when you need to both capture and interact with desktop apps.

Troubleshooting

Screen Recording permission denied

If screencapture produces a blank or all-black image, the terminal app (Terminal.app, iTerm2, or the CLI-JAW Electron app) likely lacks Screen Recording permission.

# Fix: Grant permission in System Settings
# System Settings > Privacy & Security > Screen Recording
# Enable your terminal application
# Restart the terminal after granting permission

imagesnap not found

Webcam capture commands will fail if imagesnap is not installed. It is an optional dependency.

# Install via Homebrew
brew install imagesnap

# Verify
imagesnap -l
# Expected: lists available cameras (e.g., "FaceTime HD Camera")

Webcam image is dark or blurry

The camera sensor needs time to adjust. Always include a warmup delay:

# Bad: no warmup, likely dark/blurry
imagesnap ~/photo.png

# Good: 2-second warmup for proper exposure
imagesnap -w 2 ~/photo.png

# Better: 3-second warmup for low-light conditions
imagesnap -w 3 ~/photo.png

Multi-display capture saves only one file

When capturing multiple displays, you must provide one output filename per display:

# Wrong: only captures the main display
screencapture ~/screenshot.png

# Right: one filename per connected display
screencapture ~/display1.png ~/display2.png

Video recording does not stop

When using screencapture -v (no duration limit), the recording runs until interrupted. Use -V seconds for timed recordings, or send Ctrl+C to stop manual recordings.

Capabilities Summary

CapabilityToolNotes
Full screen capturescreencaptureSilent with -x, delayed with -T
Region selectionscreencapture -iInteractive drag or coordinates with -R
Window capturescreencapture -lBy CGWindowID via AppleScript
Clipboard capturescreencapture -cCombine with -i for region-to-clipboard
Multi-displayscreencapture -DTarget specific display or capture all
Video recordingscreencapture -v/-VMOV format, timed or manual stop
Webcam photoimagesnapOptional dependency, use -w 2 for warmup
Format controlscreencapture -tPNG (default), JPEG, PDF, TIFF