src — desktop-automation

Module: src-desktop-automation Cohesion: 0.80 Members: 0

The src/desktop-automation module provides a cross-platform interface for interacting with the desktop environment. It hides the differences between operating systems and underlying automation tools behind a unified API for controlling the mouse, keyboard, windows, applications, screen, and clipboard.

Inspired by projects like Native Engine, this module also integrates higher-level desktop capabilities such as permission management, system control, smart UI element snapshots, and screen recording.

1. Module Overview

The primary goal of this module is to enable programmatic control over a desktop environment, making it suitable for building automation scripts, testing tools, or intelligent agents that interact with graphical user interfaces. It achieves this through a provider-based architecture in which different backends (native OS tools, third-party libraries, or mocks) can be swapped without changing calling code.

Key Features:

2. Architecture

The module employs a Provider Pattern to achieve platform independence and extensibility. The central DesktopAutomationManager class acts as a facade, delegating all actual desktop interaction logic to an underlying IAutomationProvider implementation.

graph TD
    A[DesktopAutomationManager] --> B{IAutomationProvider};
    B --> C[MockAutomationProvider];
    B --> D[NutJsProvider];
    B --> E[BaseNativeProvider];
    E --> F[LinuxNativeProvider];
    E --> G[MacOSNativeProvider];
    E --> H[WindowsNativeProvider];

  1. IAutomationProvider: An interface defining all the core desktop automation methods (mouse, keyboard, windows, apps, screen, clipboard).
  2. DesktopAutomationManager: The main entry point for consumers. It selects and manages the active IAutomationProvider, applies global configuration (like safety settings), and emits events.
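  3. Concrete Providers: Implement IAutomationProvider using different technologies. As the diagram shows, these are MockAutomationProvider, NutJsProvider, and the native providers (LinuxNativeProvider, MacOSNativeProvider, WindowsNativeProvider), which extend BaseNativeProvider.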
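
The facade-and-provider relationship described above can be sketched in a few lines. This is a minimal illustration, not the module's actual code: MiniProvider, InMemoryProvider, and MiniManager are hypothetical names, and the real IAutomationProvider defines many more methods than the three shown here.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface MousePosition {
  x: number;
  y: number;
}

interface MiniProvider {
  readonly name: string;
  isAvailable(): Promise<boolean>;
  moveMouse(x: number, y: number): Promise<void>;
  getMousePosition(): Promise<MousePosition>;
}

// In-memory backend, similar in spirit to a mock provider.
class InMemoryProvider implements MiniProvider {
  readonly name = 'in-memory';
  private position: MousePosition = { x: 0, y: 0 };

  async isAvailable(): Promise<boolean> {
    return true;
  }

  async moveMouse(x: number, y: number): Promise<void> {
    this.position = { x, y };
  }

  async getMousePosition(): Promise<MousePosition> {
    return { ...this.position };
  }
}

// Facade: selects the first available provider and forwards calls to it.
class MiniManager {
  private active: MiniProvider | null = null;
  private readonly providers: MiniProvider[];

  constructor(providers: MiniProvider[]) {
    this.providers = providers;
  }

  async initialize(): Promise<void> {
    for (const provider of this.providers) {
      if (await provider.isAvailable()) {
        this.active = provider;
        return;
      }
    }
    throw new Error('No automation provider is available');
  }

  private requireProvider(): MiniProvider {
    if (!this.active) throw new Error('Call initialize() first');
    return this.active;
  }

  async moveMouse(x: number, y: number): Promise<void> {
    await this.requireProvider().moveMouse(x, y);
  }

  async getMousePosition(): Promise<MousePosition> {
    return this.requireProvider().getMousePosition();
  }
}
```

Because callers only ever see the facade, replacing InMemoryProvider with a native or library-backed implementation would not require changes on the calling side.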

3. Key Components

3.1. DesktopAutomationManager

This is the primary class for interacting with the desktop automation features. It's an EventEmitter and manages the lifecycle and configuration of the underlying automation provider.

Core Responsibilities:

Key Methods:

3.2. IAutomationProvider and Concrete Implementations

The IAutomationProvider interface defines the contract for all automation backends. Each concrete provider implements these methods using platform-specific APIs or libraries.

MockAutomationProvider

BaseNativeProvider

LinuxNativeProvider

MacOSNativeProvider

NutJsProvider

3.3. Enterprise-grade Modules

These modules extend the desktop automation capabilities beyond basic input/output, providing higher-level services. They are exported alongside the core DesktopAutomationManager but operate independently.

PermissionManager

SystemControl

SmartSnapshotManager

ScreenRecorder

4. Usage

To use the desktop automation features, obtain an instance of DesktopAutomationManager via the singleton helper:

import { getDesktopAutomation, KeyCode, MouseButton } from './desktop-automation/index.js';

async function runAutomation() {
  const automation = getDesktopAutomation();

  try {
    await automation.initialize();
    console.log('Automation manager initialized with provider:', automation.getProvider()?.name);

    // Mouse actions
    await automation.moveMouse(100, 100);
    await automation.click();
    await automation.rightClick(150, 150);

    // Keyboard actions
    await automation.type('Hello, world!');
    await automation.keyPress(KeyCode.Enter);
    await automation.hotkey(KeyCode.Control, KeyCode.C); // Ctrl+C

    // Window actions
    const activeWindow = await automation.getActiveWindow();
    if (activeWindow) {
      console.log('Active window:', activeWindow.title);
      await automation.minimizeWindow(activeWindow.handle);
      await automation.delay(1000);
      await automation.restoreWindow(activeWindow.handle);
    }

    // Application actions
    const app = await automation.launchApp('/Applications/Calculator.app'); // macOS example
    console.log('Launched app:', app.name, 'PID:', app.pid);
    await automation.delay(2000);
    if (app.pid) {
      await automation.closeApp(app.pid);
    }

    // Clipboard
    await automation.copyText('This is copied text.');
    const clipboardText = await automation.getClipboardText();
    console.log('Clipboard content:', clipboardText);

  } catch (error) {
    console.error('Automation failed:', error);
  } finally {
    await automation.shutdown();
  }
}

runAutomation();

5. Extensibility

5.1. Adding a New Automation Provider

To add support for a new automation backend (e.g., a different library or a new native toolset):

  1. Create a new class that implements the IAutomationProvider interface.
  2. Implement all methods defined in IAutomationProvider, translating them to the new backend's API.
  3. Register the provider with the DesktopAutomationManager using registerProvider().

import { getDesktopAutomation } from './desktop-automation/index.js';
import { IAutomationProvider, DesktopAutomationManager } from './desktop-automation/automation-manager.js';
import { AutomationProvider, ProviderCapabilities, MousePosition } from './desktop-automation/types.js';

class MyCustomProvider implements IAutomationProvider {
  readonly name: AutomationProvider = 'my-custom-provider';
  readonly capabilities: ProviderCapabilities = { /* ... */ };

  async initialize(): Promise<void> { /* ... */ }
  async shutdown(): Promise<void> { /* ... */ }
  async isAvailable(): Promise<boolean> { return true; }
  async getMousePosition(): Promise<MousePosition> { return { x: 0, y: 0 }; }
  // ... implement all other IAutomationProvider methods
}

// Obtain the shared manager instance and register the new provider
const automation = getDesktopAutomation();
automation.registerProvider(new MyCustomProvider());
// Configure the manager to prefer this provider
automation.updateConfig({ provider: 'my-custom-provider' });

5.2. Extending Native Providers

For platform-native providers, extend BaseNativeProvider to leverage its exec, execSync, and checkTool utilities.
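
As a rough sketch of what such a subclass might look like, the example below uses a stand-in base class, because the actual signatures of exec and checkTool are not documented here. NativeProviderSketch, XdotoolMouseSketch, and moveArgs are hypothetical names, the exec(command, args) shape is an assumption, and xdotool serves only as an example of an external tool, not as a statement about what LinuxNativeProvider uses.

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Stand-in for BaseNativeProvider; these signatures are assumptions.
abstract class NativeProviderSketch {
  // Runs a command without a shell and returns its trimmed stdout.
  protected async exec(command: string, args: string[]): Promise<string> {
    const { stdout } = await execFileAsync(command, args);
    return stdout.trim();
  }

  // Reports whether a tool can be found on the PATH.
  protected async checkTool(tool: string): Promise<boolean> {
    const locator = process.platform === 'win32' ? 'where' : 'which';
    try {
      await this.exec(locator, [tool]);
      return true;
    } catch {
      return false;
    }
  }
}

// Hypothetical subclass that moves the pointer through an external tool.
class XdotoolMouseSketch extends NativeProviderSketch {
  // Builds the argument list separately so it can be checked without running anything.
  static moveArgs(x: number, y: number): string[] {
    return ['mousemove', String(Math.round(x)), String(Math.round(y))];
  }

  async isAvailable(): Promise<boolean> {
    return this.checkTool('xdotool');
  }

  async moveMouse(x: number, y: number): Promise<void> {
    await this.exec('xdotool', XdotoolMouseSketch.moveArgs(x, y));
  }
}
```

Passing arguments as an array to execFile, rather than interpolating them into a shell string, avoids quoting problems when coordinates or window titles come from untrusted input.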

6. Safety Features

The DesktopAutomationManager includes built-in safety mechanisms to prevent unintended or runaway automation:

These features are applied automatically by the DesktopAutomationManager before delegating actions to the underlying provider.
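
The check-then-delegate flow can be illustrated with a small guard. The specific mechanisms are not listed in this section, so both checks below (a fail-safe pointer position and a per-second action limit) are assumptions chosen for illustration, and SafetySketchConfig, SafetyGuardSketch, and guarded are hypothetical names.

```typescript
interface Point {
  x: number;
  y: number;
}

// Hypothetical safety settings; the real configuration keys may differ.
interface SafetySketchConfig {
  maxActionsPerSecond: number;
  failSafeCorner: Point; // a pointer at this position aborts automation
}

class SafetyGuardSketch {
  private recent: number[] = [];
  private readonly config: SafetySketchConfig;
  private readonly now: () => number;

  // The clock is injectable so the guard can be exercised deterministically.
  constructor(config: SafetySketchConfig, now: () => number = Date.now) {
    this.config = config;
    this.now = now;
  }

  // Throws if the next action should not be forwarded to the provider.
  check(pointer: Point): void {
    const { failSafeCorner, maxActionsPerSecond } = this.config;
    if (pointer.x === failSafeCorner.x && pointer.y === failSafeCorner.y) {
      throw new Error('Fail-safe triggered: pointer is in the abort corner');
    }
    const t = this.now();
    // Keep only timestamps from the last second, then apply the limit.
    this.recent = this.recent.filter((stamp) => t - stamp < 1000);
    if (this.recent.length >= maxActionsPerSecond) {
      throw new Error('Rate limit exceeded');
    }
    this.recent.push(t);
  }
}

// A facade method would run the guard first and delegate only on success.
async function guarded<T>(
  guard: SafetyGuardSketch,
  pointer: Point,
  action: () => Promise<T>,
): Promise<T> {
  guard.check(pointer);
  return action();
}
```

Keeping the checks in one place means every provider benefits from them without having to reimplement them.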

7. Eventing

DesktopAutomationManager extends EventEmitter and emits events for significant automation actions. Developers can subscribe to these events to monitor or react to automation progress.

Example Events:

automation.on('mouse-move', (pos) => console.log('Mouse moved to', pos));
automation.on('fail-safe', () => console.warn('Fail-safe activated!'));
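
Because the manager is a standard Node.js EventEmitter, listeners can also be removed once they are no longer needed. The sketch below uses a hypothetical stand-in class (EmittingManager) rather than the real manager, and assumes the 'mouse-move' payload is an object with x and y fields, as the handler above suggests.

```typescript
import { EventEmitter } from 'node:events';

interface MousePosition {
  x: number;
  y: number;
}

// Hypothetical stand-in for the manager: emits 'mouse-move' after each move.
class EmittingManager extends EventEmitter {
  async moveMouse(x: number, y: number): Promise<void> {
    // A real manager would delegate to its provider before emitting.
    this.emit('mouse-move', { x, y });
  }
}

const manager = new EmittingManager();
const trail: MousePosition[] = [];

// Keep a reference to the listener so it can be removed later.
const record = (pos: MousePosition): void => {
  trail.push(pos);
};
manager.on('mouse-move', record);

async function demo(): Promise<MousePosition[]> {
  await manager.moveMouse(10, 10);
  await manager.moveMouse(20, 20);
  manager.off('mouse-move', record); // stop listening
  await manager.moveMouse(30, 30); // no longer recorded
  return trail;
}
```

Removing listeners this way is useful in long-running agents, where handlers that are never detached would otherwise accumulate.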

8. Integration Points

The desktop-automation module is a foundational component, integrated across various parts of the codebase:

9. Contributing

When contributing to the desktop-automation module: