"I want to know everything about everything." — Samantha, Her (2013)
Eye-Claw is a $20 DIY hardware extension for the OpenClaw agent. It upgrades your AI from a text-based observer to a proactive multimodal agent that perceives, remembers, and understands your physical world in real time.
🌟 Why Eye-Claw?
Traditional AI agents exist in chat windows. Eye-Claw exists in your life. By combining low-power "heartbeat" mechanisms with edge-based visual perception, it builds a digital twin of your life. For the price of a pizza ($20), you can build your own AI glasses. It doesn't wait for your questions. It identifies your habits, recognizes your items, and quietly provides useful information before you realize you need it.
```
┌─────────────────────────────────────────────────────────┐
│              Glasses (Perception Layer)                 │
│              Seeed XIAO ESP32-S3 Sense                  │
│  - Camera capture (VGA 640x480 JPEG)                    │
│  - dHash smart deduplication (64-bit perceptual hash)   │
│  - BLE transmission (ClawLink-Lite protocol)            │
└─────────────────────────┬───────────────────────────────┘
                          │ BLE (MTU=512)
┌─────────────────────────▼───────────────────────────────┐
│                 Mobile (Relay Layer)                    │
│                    Flutter App                          │
│  - Bluetooth management (flutter_blue_plus)             │
│  - Image reception and reassembly, local storage (SQLite)│
│  - Upload to OpenClaw cloud                             │
└─────────────────────────┬───────────────────────────────┘
                          │ HTTPS + multipart/form-data
┌─────────────────────────▼───────────────────────────────┐
│                OpenClaw (Brain Layer)                   │
│  ┌─────────────────────────────────────────────────┐    │
│  │               Eye-Claw Plugin                   │    │
│  │  ┌─────────────┐  ┌──────────────────────────┐  │    │
│  │  │ Vision Skill│  │ Image Analysis Service   │  │    │
│  │  │ - Visual    │  │ - VLM Visual             │  │    │
│  │  │   Memory    │  │   Understanding          │  │    │
│  │  │   Archive   │  │ - Object Recognition     │  │    │
│  │  │ - Time      │  │ - Scene Description      │  │    │
│  │  │   Index     │  │                          │  │    │
│  │  └─────────────┘  └──────────────────────────┘  │    │
│  └─────────────────────────────────────────────────┘    │
│                                                         │
│ Underlying: OpenClaw Core + VLM Service (Ollama/LocalAI)│
└─────────────────────────────────────────────────────────┘
```
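The dedup idea in the perception layer — hash each frame, compare with the previous frame's hash, and skip near-duplicates — can be sketched in pure Python. The firmware implements this in C++ against camera frames, so treat this as an illustration of the technique, not the shipped code:

```python
def dhash(gray, hash_size=8):
    """64-bit difference hash of a 2-D grayscale image (rows of 0-255 ints).

    Downsample to (hash_size+1) x hash_size, then emit one bit per
    horizontally adjacent pixel pair: 1 if brightness increases.
    """
    h, w = len(gray), len(gray[0])
    cols = hash_size + 1
    # Nearest-neighbour resample (good enough for a perceptual hash).
    small = [[gray[r * h // hash_size][c * w // cols] for c in range(cols)]
             for r in range(hash_size)]
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left < right else 0)
    return bits


def hamming(a, b):
    """Bit distance between two hashes; small distance => near-duplicate."""
    return bin(a ^ b).count("1")
```

A frame would be skipped when `hamming(new_hash, last_hash)` falls under some threshold; the firmware's exact threshold is not documented here.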
```
her/
├── firmware/                 # ESP32-S3 firmware code
│   ├── src/
│   │   ├── camera.h/cpp      # Camera driver
│   │   ├── ble.h/cpp         # BLE communication
│   │   ├── button.h/cpp      # Physical trigger
│   │   └── debug_log.h       # Debug logging
│   ├── firmware.ino          # Main entry point
│   └── platformio.ini        # PlatformIO configuration
├── mobile/                   # Flutter mobile application
│   ├── lib/
│   │   ├── main.dart         # Application entry
│   │   ├── config.dart       # Environment configuration
│   │   ├── screens/
│   │   │   ├── home_screen.dart            # Main screen
│   │   │   ├── chat_screen.dart            # Chat interface
│   │   │   ├── gallery_screen.dart         # Gallery
│   │   │   ├── photo_detail_screen.dart    # Photo detail
│   │   │   ├── camera_settings_screen.dart # Camera settings
│   │   │   └── debug_console_screen.dart   # Debug console
│   │   ├── services/
│   │   │   ├── ble_service.dart            # Bluetooth service
│   │   │   ├── api_service.dart            # API service
│   │   │   ├── storage_service.dart        # Local storage
│   │   │   └── image_quality_analyzer.dart # Image quality
│   │   ├── providers/
│   │   │   └── camera_settings_provider.dart # Camera settings
│   │   └── models/
│   │       ├── photo_model.dart            # Photo data model
│   │       └── chat_message.dart           # Chat message model
│   ├── test/                 # Test files
│   ├── pubspec.yaml          # Dependencies
│   ├── .env.example          # Environment template
│   └── android/ios/          # Platform-specific configs
├── openclaw/                 # OpenClaw cloud plugin
│   ├── index.ts              # Plugin entry
│   ├── package.json          # Plugin configuration
│   ├── config/               # Config files
│   │   ├── eye-claw.json
│   │   └── image_processing.json
│   ├── services/             # Image analysis service
│   │   └── analyze_image.py
│   ├── skills/               # Visual memory skills
│   │   └── vision/
│   │       └── SKILL.md
│   └── README.md             # Plugin documentation
├── docs/                     # Documentation resources
│   ├── README_zh.md          # Chinese README
│   └── images/               # README images
│       ├── hardware.jpg      # Hardware photo
│       ├── app-screenshot.png # App screenshot
│       └── architecture.png  # Architecture diagram
├── prd.md                    # Product requirements document
└── README.md                 # This file
```
Download and install Arduino IDE (recommended 2.x version)
1. File -> Preferences -> Additional Board Manager URLs, add:
   `https://raw.githubusercontent.com/espressif/arduino-esp32/gh-pages/package_esp32_index.json`
2. Tools -> Board -> Board Manager, search "ESP32" and install "ESP32"
Sketch -> Include Library -> Manage Libraries..., then search for and install:
- NimBLE-Arduino by h2zero (version ^1.4.1)
- esp32-camera by espressif (version ^2.0.0)
Tools -> Board -> ESP32 Arduino -> XIAO_ESP32S3

Key settings:
- Board: XIAO_ESP32S3
- USB CDC On Boot: Enabled
- CPU Frequency: 160MHz (power saving)
- Flash Mode: QIO 80MHz
- Partition Scheme: Huge APP (3MB No OTA/1MB SPIFFS)
- PSRAM: Enabled
File -> Open -> select firmware/firmware.ino

# Connect the XIAO ESP32-S3 Sense via a USB data cable
# Ensure the correct port is selected (Tools -> Port -> corresponding COM port or /dev/ttyACM*)
# Click the upload button (→) or press Ctrl+U
Note: If upload fails, try holding BOOT button while inserting USB, then click upload
Tools -> Serial Monitor (Ctrl+Shift+M), then set the baud rate to 115200
After successful flashing, serial logs should show:
```
[Eye-Claw] Starting...
[Eye-Claw] PSRAM Total: 8388608 bytes, Free: 8388608 bytes
[Eye-Claw] Camera ready
[Eye-Claw] BLE ready
[Eye-Claw] Button ready
[Eye-Claw] Waiting for phone connection...
[Eye-Claw] Setup complete, entering loop...
```
LED Status Indicators:
# Copy plugin to OpenClaw plugins directory
cp -r openclaw /path/to/openclaw/plugins/
# Or use symlink (recommended for development)
ln -s $(pwd)/openclaw /path/to/openclaw/plugins/eye-claw
# Required: VLM service endpoint
export EYE_CLAW_VLM_URL=http://localhost:11434/v1/chat/completions
# Optional: Storage paths
export EYE_CLAW_STORAGE_PATH=./uploads
export EYE_CLAW_MEMORY_PATH=./memory
cd openclaw
# Create virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate # Linux/Mac
# or venv\Scripts\activate # Windows
# Install dependencies
pip install requests Pillow
Supports any vision language model service compatible with OpenAI API:
Ollama Example:
# Install llava model
ollama pull llava
# Start service
ollama serve
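Any OpenAI-compatible vision endpoint accepts a chat-completions payload with the image inlined as a base64 data URL. A minimal sketch of such a request (the `llava` model name comes from the step above; the prompt text and helper name are illustrative — the plugin's actual request lives in `services/analyze_image.py`):

```python
import base64


def build_vlm_request(image_bytes, prompt="Describe this scene.",
                      model="llava"):
    """Build an OpenAI-style chat payload with an inline base64 JPEG.

    POST the returned dict as JSON to $EYE_CLAW_VLM_URL
    (e.g. http://localhost:11434/v1/chat/completions for Ollama).
    """
    b64 = base64.b64encode(image_bytes).decode()
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    }
```

Sending it is then a single `requests.post(os.environ["EYE_CLAW_VLM_URL"], json=payload)`.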
Configuration Verification:
# Test image analysis
cd openclaw
python services/analyze_image.py /path/to/test/image.jpg
# Start OpenClaw (plugin will load automatically)
openclaw
# Verify plugin loaded successfully
# Console should show: [eye-claw] Plugin loaded
Note the service endpoint: After startup, a service address will be displayed (e.g., http://localhost:18888). This address needs to be configured in the mobile app's .env file.
Your OpenClaw now has visual memory capabilities, but you need to tell it what you want to do with this memory.
Modify the soul.md file to configure AI behavior:
cat >> soul.md << 'EOF'
You are Eye-Claw, an AI assistant with visual memory capabilities.
## Your Capabilities
You have smart glasses that periodically capture what the user sees and store it in memory.
Each photo has a timestamp and scene description, archived by date in `memory/YYYY-MM-DD.md` files.
## How You Should Use This Memory
### 1. Active Recall
When users ask "What did I see yesterday?" or "What does that place look like?", actively search the visual memory archive.
### 2. Context Enhancement
Use visual memory to understand the user's environment and experiences, providing more targeted suggestions.
### 3. Life Assistant
- Help users find lost items ("Where did I leave my keys?")
- Remind users of places they've been ("What was the name of that cafe we went to yesterday?")
- Record important information ("What's the license plate number of that car?")
### 4. Privacy and Boundaries
- Visual memory is the user's private data; don't proactively share or reference it unless asked
- If users request deletion of certain memories, respect their wishes
## Usage Suggestions
When users mention past events or ask visual questions:
1. First search relevant memory files
2. Use keywords like time, location, objects to locate specific memories
3. Respond naturally, don't mechanically recite raw records
4. If memories are fuzzy or missing, honestly inform the user
EOF
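The "active recall" flow described in soul.md — look up recent `memory/YYYY-MM-DD.md` archives by date and keyword — can be sketched as follows. Only the archive layout comes from the docs; the matching logic here is illustrative:

```python
import re
from datetime import date, timedelta
from pathlib import Path


def search_memory(keyword, days=7, memory_dir="memory"):
    """Scan the last `days` daily archives for lines matching `keyword`.

    Returns (date_string, matching_line) pairs, newest day first.
    """
    hits = []
    today = date.today()
    for offset in range(days):
        day = today - timedelta(days=offset)
        path = Path(memory_dir) / f"{day.isoformat()}.md"
        if not path.exists():
            continue  # No captures archived for that day
        for line in path.read_text(encoding="utf-8").splitlines():
            if re.search(keyword, line, re.IGNORECASE):
                hits.append((day.isoformat(), line.strip()))
    return hits
```

The agent would then paraphrase the matched lines rather than recite them, per the soul.md guidance above.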
Custom Soul:
You can modify soul.md according to your needs, for example:
If you want to view uploaded images and AI analysis results in a browser, let OpenClaw create a simple web interface for you.
Tell OpenClaw your needs:
Please help me create an Eye-Claw image viewing website: 1. List all images in the uploads/ directory 2. Show AI analysis results for each image (from memory/vision_memory.jsonl) 3. Sort by time in descending order 4. Support image preview and detail view
OpenClaw will generate for you:
A web-gallery/ directory containing:
```
web-gallery/
├── index.html   # Main page
├── style.css    # Styles
├── app.js       # Frontend logic
└── server.py    # Simple HTTP server
```
Start Web Gallery:
cd openclaw/web-gallery
# Start local server
python3 server.py
# Access http://localhost:8080
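The gallery's data-loading step — read `memory/vision_memory.jsonl` (one JSON object per line) and sort newest first — might look like the sketch below. The field names `timestamp` and `description` are assumptions; match them to whatever `analyze_image.py` actually writes:

```python
import json
from pathlib import Path


def load_gallery_entries(jsonl_path="memory/vision_memory.jsonl"):
    """Parse the analysis log and return entries sorted newest-first."""
    entries = []
    for line in Path(jsonl_path).read_text(encoding="utf-8").splitlines():
        if line.strip():
            entries.append(json.loads(line))
    # ISO-8601 timestamps sort correctly as strings.
    return sorted(entries, key=lambda e: e.get("timestamp", ""), reverse=True)
```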
Features:
Customization Suggestions: You can let OpenClaw extend features based on your needs:

# Ensure Flutter is installed
flutter --version
# Check Flutter environment
flutter doctor
Prerequisite: Complete the OpenClaw Plugin Deployment steps above and note the service endpoint address.
cd mobile
# Copy environment template
cp .env.example .env
# Edit .env file, fill in OpenClaw service address
# Example .env content:
# API_BASE_URL=http://your-server:18888 # OpenClaw service address
# API_USERNAME=your_username
# API_TOKEN=your_api_token
# Install Flutter dependencies
cd mobile
flutter pub get
# iOS specific (macOS only)
cd ios
pod install
cd ..
# Connect phone or start emulator
# List available devices
flutter devices
# Run app
flutter run
# Or specify device
flutter run -d <device-id>
# Run all tests
flutter test
# Run single test file
flutter test test/widget_test.dart
# Run specific test by name
flutter test --name="Counter increments smoke test"
# Static code analysis
flutter analyze
# Format code
dart format lib/ test/   # 'flutter format' on older SDKs
# Android APK
flutter build apk
# Android App Bundle
flutter build appbundle
# iOS (macOS + Xcode only)
flutter build ios
Firmware:
cd firmware
pio run -t upload
pio device monitor
Confirm the serial output shows `[Eye-Claw] BLE ready` and the LED is blinking slowly
Mobile:
cd mobile
flutter pub get
flutter run
Connection Test:
The `Eye-Claw` device should appear in the app's scan list.

Photo Capture Test:
Auto-Capture Test:
Sleep/Wake Test:
OpenClaw Cloud Test:
- Verify the VLM endpoint responds: `curl $EYE_CLAW_VLM_URL`
- Confirm new entries appear in `memory/YYYY-MM-DD.md`

Custom BLE GATT transmission protocol:
| UUID | Type | Description |
|---|---|---|
| `4fafc201-1fb5-459e-8fcc-c5c9c331914b` | Service | Eye-Claw Main Service |
| `beb5483e-36e1-4688-b7f5-ea07361b26a8` | Notify | Image data reception |
| `beb5483e-36e1-4688-b7f5-ea07361b26a9` | Write | Command sending |
| `beb5483e-36e1-4688-b7f5-ea07361b26ab` | Notify | Debug logs |
Data transmitted in fragments via Notify characteristic:
[Seq High (1B)][Seq Low (1B)][Payload (N Bytes)]
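The receiver side of this framing (done by the Flutter app's BLE service, shown here in Python for brevity) collects fragments keyed by sequence number until the end marker arrives; reordering by sequence number also covers notifications delivered out of order. Timeout and duplicate handling in the real app are omitted:

```python
END_OF_IMAGE = 0xFFFF  # sent on the wire as [0xFF, 0xFF]


def reassemble(fragments):
    """Reassemble an image from notify fragments: [seq_hi][seq_lo][payload]."""
    chunks = {}
    for frag in fragments:
        seq = (frag[0] << 8) | frag[1]
        if seq == END_OF_IMAGE:
            break  # Transfer complete; no payload follows the marker
        chunks[seq] = bytes(frag[2:])
    # Concatenate payloads in sequence order.
    return b"".join(chunks[seq] for seq in sorted(chunks))
```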
`[0xFF, 0xFF]` indicates transfer complete.

| Phase | Goal | Status |
|---|---|---|
| Phase 1 | Core link established (E2E closed loop) | ✅ Done |
| Phase 2 | Transmission and reliability optimization (fragment reassembly + dHash) | ✅ Done |
| Phase 3 | Multimodal interaction enhancement (voice wake-up + RAG archive) | 🔄 Partial |
| Phase 4 | Power and automation strategy (Deep Sleep + GPS policy) | 🔄 Partial |
| Phase 5 | Hardware productization (3D printed case + battery optimization) | 📋 Pending |
dependencies:
flutter_blue_plus: ^1.31.0 # BLE communication
http: ^1.2.0 # HTTP requests
provider: ^6.1.0 # State management
flutter_dotenv: ^5.1.0 # Environment variables
sqflite: ^2.4.2 # SQLite local storage
path_provider: ^2.1.5 # File paths
permission_handler: ^11.0.0 # Permission management
geolocator: ^14.0.2 # GPS
image_picker: ^1.2.1 # Image selection
image: ^4.1.0 # Image processing
gal: ^2.3.2 # Gallery save
connectivity_plus: ^7.0.0 # Network status
google_fonts: ^8.0.1 # Google fonts
flutter_animate: ^4.5.2 # Animations
flutter_markdown: ^0.7.7+1 # Markdown rendering
intl: ^0.20.2 # Internationalization
uuid: ^4.5.2 # UUID generation
dev_dependencies:
flutter_lints: ^3.0.0 # Code standards
lib_deps:
h2zero/NimBLE-Arduino@^1.4.1 # BLE stack
espressif/esp32-camera@^2.0.0 # Camera driver
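For reference, a minimal `platformio.ini` consistent with the board settings and dependencies above; the board ID and `BOARD_HAS_PSRAM` flag match those referenced in Troubleshooting, but check it against the `firmware/platformio.ini` shipped with the repo:

```ini
[env:seeed_xiao_esp32s3_sense]
platform = espressif32
board = seeed_xiao_esp32s3_sense
framework = arduino
monitor_speed = 115200
build_flags =
    -DBOARD_HAS_PSRAM
lib_deps =
    h2zero/NimBLE-Arduino@^1.4.1
    espressif/esp32-camera@^2.0.0
```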
| Mode | Avg. current | Runtime |
|---|---|---|
| Deep Sleep (max power saving) | ~1.0 mA | 10-11 days |
| Light Sleep (real-time response) | ~4.3 mA | ~2.5 days |
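These figures follow from a simple capacity-over-draw estimate. A back-of-the-envelope check, assuming a ~250 mAh battery (the capacity is not stated in this README, but that value reproduces both rows):

```python
def runtime_days(capacity_mah, avg_current_ma):
    """Estimated runtime: battery capacity / average draw, in days."""
    return capacity_mah / avg_current_ma / 24.0

# deep sleep:  250 mAh / 1.0 mA / 24 ≈ 10.4 days
# light sleep: 250 mAh / 4.3 mA / 24 ≈ 2.4 days
```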
This project is licensed under the MIT License.
Issue: Arduino IDE upload fails, device not found
Issue: Camera initialization fails
- Ensure PSRAM is enabled (`BOARD_HAS_PSRAM` in `platformio.ini`)
- Ensure the board is set to `seeed_xiao_esp32s3_sense`

Issue: BLE cannot connect
Issue: flutter run fails, device not found
- Run `flutter devices` to check if the device is recognized

Issue: API connection failed
- Check that the `.env` file is configured correctly

Issue: Bluetooth scan can't find device
| Symptom | Possible Cause | Solution |
|---|---|---|
| LED not lit | Not powered or program error | Check USB connection, re-flash |
| LED fast blink then solid | Photo transmission in progress | Normal, wait for completion |
| LED completely off | Deep sleep or crash | Short press BOOT to wake or reboot |
| App crashes | Permissions not granted | Check Bluetooth/Location/Storage permissions |
| Photo reception failed | BLE MTU mismatch | Wait for auto-negotiation or reconnect |
Thanks to the following projects: OpenGlass, OpenClaw