Krafman/androidagent-experiment

androidagent

Control Android devices with an LLM-powered agent.

LLM-Powered Android Test Controller (POC)

This repository contains the proof of concept for an LLM-based controller designed for continuous testing of Android applications on the digital.ai device farm.

For a visual overview of how a natural language request is processed and executed on a device, see docs/request_flow.md.

  1. Prerequisites

Before you begin, ensure you have the following software installed and configured on your system:

  • Python: Version 3.10 or higher.
  • Android SDK Platform-Tools: Provides the Android Debug Bridge (adb) command-line tool. You can download this from the official Android developer website. Ensure the directory containing adb is added to your system's PATH environment variable.
  • An Android Device or Emulator:
    • Physical Device: An Android device with developer options enabled and USB Debugging turned ON.
    • Emulator: An Android Virtual Device (AVD) set up via Android Studio.

  2. Initial Setup

Follow these steps to set up your local development environment.

Step 2.1: Clone the Repository

Clone this repository to your local machine:

Bash

    git clone <repository-url>
    cd llm-android-controller

Step 2.2: Set Up Python Virtual Environment

Using a virtual environment to manage project dependencies is highly recommended.

Bash

    # Create a virtual environment
    python -m venv venv

    # Activate the virtual environment
    # On Windows:
    venv\Scripts\activate

    # On macOS/Linux:
    source venv/bin/activate

Step 2.3: Install Dependencies

Install the required Python packages using the requirements.txt file.

Bash

    pip install -r requirements.txt

(Note: The initial requirements.txt will contain uiautomator2.)
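With the environment set up, the prerequisites from section 1 can be checked in one go. The following is a minimal sketch, not part of the repository: the function name `check_prerequisites` is hypothetical, and it only verifies that the Python version is new enough, that adb is reachable through PATH, and that uiautomator2 is importable.

```python
import importlib.util
import shutil
import sys


def check_prerequisites(min_python=(3, 10)):
    """Return a dict mapping each prerequisite name to True or False."""
    return {
        # Python 3.10 or higher, as required in section 1
        "python": sys.version_info >= min_python,
        # adb must be reachable through the PATH environment variable
        "adb": shutil.which("adb") is not None,
        # uiautomator2 is expected to be installed from requirements.txt
        "uiautomator2": importlib.util.find_spec("uiautomator2") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

Any entry reported as MISSING points back to the corresponding step above.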

  3. Device Setup and Verification

Step 3.1: Verify ADB Connection

Connect your Android device to your computer via USB, or ensure your emulator is running.

Open your terminal and run the following command to verify that your device is recognized by ADB:

Bash

    adb devices

You should see output similar to this, with your device's serial number listed:

    List of devices attached
    <your_device_serial>    device

If the list is empty or shows "unauthorized", troubleshoot your ADB connection and make sure you have accepted the USB debugging prompt on your device.
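The same check can be scripted. The sketch below parses the text printed by `adb devices` into a dictionary of serial numbers and states, so that an empty list or an "unauthorized" device can be detected programmatically. The function names and the serial numbers in the sample are hypothetical; only the `adb devices` command itself comes from the steps above.

```python
import subprocess


def parse_adb_devices(output):
    """Parse the text printed by `adb devices` into a {serial: state} dict."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the header, and adb daemon start-up messages
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def list_devices():
    """Run `adb devices` (adb must be on PATH) and return the parsed result."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)


# Sample output with made-up serial numbers, so the parser runs without a device
sample = "List of devices attached\nemulator-5554\tdevice\nR58M12345\tunauthorized\n"
print(parse_adb_devices(sample))
```

A device is ready for the next step only when its state is `device`; calling `list_devices()` applies the same parsing to a live ADB connection.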

Step 3.2: Initialize uiautomator2 on Your Device

uiautomator2 needs to install a small agent on the device to work. Run the following command to initialize it for your connected device. This is typically a one-time step per device and may take a minute as it downloads and installs the necessary APKs.

Bash

    python -m uiautomator2 init

You should see a success message indicating that the required agents have been installed on your device.

Step 3.3: Run a Verification Script

You can run the following simple Python script to confirm that you can successfully connect to your device and retrieve basic information.

Create a file named verify_connection.py:

Python

    import pprint

    import uiautomator2 as u2

    if __name__ == "__main__":
        try:
            # Connect to the first device detected by ADB
            print("Attempting to connect to device...")
            d = u2.connect()

            # Get device information
            device_info = d.device_info

            print("\nSuccessfully connected to device!")
            print("---------------------------------")
            pprint.pprint(device_info)
            print("---------------------------------")
            print("\nSetup is complete and working correctly.")

        except Exception as e:
            # Any failure here usually points to an ADB or authorization problem
            print(f"\nAn error occurred: {e}")
            print("Please check your ADB connection and ensure the device is authorized.")

Run the script from your terminal:

Bash

    python verify_connection.py

A successful run will print "Successfully connected to device!" followed by a dictionary of your device's information.

  4. LLM Configuration

API keys for LLM providers (OpenAI, Gemini, Together.ai, Mistral) are configured in src/llm_controller/llm_interface.py. The system prioritizes loading API keys from environment variables:
  • OPENAI_API_KEY for OpenAI
  • GEMINI_API_KEY for Gemini
  • TOGETHER_API_KEY for Together.ai
  • MISTRAL_API_KEY for Mistral
  • ANTHROPIC_API_KEY for Anthropic (if ever fully implemented)

If an environment variable is not set, the system falls back to the value in config/llm_config.yml. Using environment variables for API keys is strongly recommended in production or shared environments. Refer to config/llm_config.yml for placeholder examples and other LLM settings.
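The lookup order described above (environment variable first, then the config file) can be sketched as follows. This is an illustration only: the actual logic lives in src/llm_controller/llm_interface.py and may differ. The function name `resolve_api_key`, the provider keys in `ENV_VARS`, and the assumed config layout (`{"openai": {"api_key": "..."}}`, passed in as an already-loaded dict) are all hypothetical.

```python
import os

# Environment variable names as listed above; the dict keys are hypothetical
ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "together": "TOGETHER_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def resolve_api_key(provider, file_config, environ=None):
    """Return the API key for `provider`, or None if none is configured."""
    environ = os.environ if environ is None else environ
    key = environ.get(ENV_VARS[provider])
    if key:
        # An environment variable takes priority over the config file
        return key
    # Fallback: assumed layout of the parsed config/llm_config.yml
    return file_config.get(provider, {}).get("api_key")


file_config = {"openai": {"api_key": "from-file"}}
print(resolve_api_key("openai", file_config, environ={"OPENAI_API_KEY": "from-env"}))
print(resolve_api_key("openai", file_config, environ={}))
```

Passing `environ` explicitly keeps the example independent of the keys set on the machine running it; omitting it reads from the real process environment.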

  5. Interaction Methods and Android Helper App Setup

The system supports two primary methods for interacting with the Android device:
  • uiautomator2 (Default): This method leverages the uiautomator2 library, which directly interacts with UI elements on the device. It's generally easier to set up initially.
  • adb_helper: This method requires a companion Android application (android_helper_app.apk) to be installed and running as an Accessibility Service on the target device. It can offer more robust UI element detection and action execution in some scenarios.

The choice of interaction method (uiautomator2 or adb_helper) is typically configured when initializing the MainController or through a script parameter that sets this for UIParser and ActionExecutor instances.
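One way to handle such a setting is to validate the method name once and pass a typed value onward. The sketch below is hypothetical and does not reflect the actual constructors of MainController, UIParser, or ActionExecutor, whose signatures are not shown here. It only illustrates mapping a script parameter to one of the two supported values, with uiautomator2 as the default.

```python
from enum import Enum


class InteractionMethod(Enum):
    """The two interaction methods named above (hypothetical type)."""

    UIAUTOMATOR2 = "uiautomator2"
    ADB_HELPER = "adb_helper"


def parse_interaction_method(value=None):
    """Map a config or CLI string to an InteractionMethod; default is uiautomator2."""
    if value is None:
        return InteractionMethod.UIAUTOMATOR2
    try:
        return InteractionMethod(value.strip().lower())
    except ValueError:
        valid = ", ".join(m.value for m in InteractionMethod)
        raise ValueError(
            f"Unknown interaction method {value!r}; expected one of: {valid}"
        ) from None


print(parse_interaction_method())
print(parse_interaction_method("adb_helper"))
```

Rejecting unknown names early avoids a parser and an executor silently running with different methods.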

Android Helper App (adb_helper method) Setup

This method requires the android_helper_app.apk.

  1. Installation: Once android_helper_app.apk is available (e.g., built from the android_helper_app/ directory or provided), install it using:

    adb install path/to/android_helper_app.apk
  2. Enable Accessibility Service:

    • Go to Android Settings on your device/emulator.
    • Navigate to Accessibility.
    • Find and select Installed apps or Downloaded services (this menu name can vary by Android version and OEM).
    • Locate the helper app (e.g., "LLM Android Helper" - the exact name depends on the app's manifest) and tap on it.
    • Enable the service by toggling the switch to On.
    • Confirm any permission dialogs required by the Accessibility Service. The helper app needs permissions to view and control the screen.

    For technical details on the helper app's interface (broadcast actions, data format), refer to its dedicated README: android_helper_app/README.md.
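After enabling the service, its state can also be checked from the host. The sketch below reads the `enabled_accessibility_services` secure setting through adb and looks for the helper's package. The package name `com.example.llmhelper` is a placeholder, not the helper app's real identifier, and the function names are hypothetical.

```python
import subprocess


def is_service_enabled(settings_value, package):
    """Check whether any enabled accessibility service belongs to `package`."""
    value = settings_value.strip()
    # The setting is empty or the literal string "null" when nothing is enabled
    if not value or value == "null":
        return False
    # Entries look like "<package>/<service class>" and are separated by ":"
    return any(entry.split("/", 1)[0] == package for entry in value.split(":"))


def helper_enabled_on_device(package):
    """Query the connected device through adb (adb must be on PATH)."""
    result = subprocess.run(
        ["adb", "shell", "settings", "get", "secure", "enabled_accessibility_services"],
        capture_output=True,
        text=True,
        check=True,
    )
    return is_service_enabled(result.stdout, package)


# Sample setting value with a placeholder package name
sample_value = "com.example.llmhelper/.HelperService:com.example.other/.OtherService\n"
print(is_service_enabled(sample_value, "com.example.llmhelper"))
```

If the check returns False after the steps above, revisit the Accessibility settings screen and confirm that the toggle for the helper app is still on.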
