Reboot_Health_Check: Add run script, service file & README #73

Open
wants to merge 1 commit into
base: main
Original file line number Diff line number Diff line change
@@ -1,139 +1,91 @@
Overview
# Reboot Health Check Test

This script automates a full reboot validation and health check for any embedded Linux system.
It ensures that after each reboot, the system:
This test automates a full reboot validation and health check for an embedded Linux system. After a reboot, it verifies the following:

Boots correctly to shell

Key directories (/proc, /sys, /tmp, /dev) are available

Kernel version is accessible

Networking stack is functional


It supports auto-retry on failures, with configurable maximum retries.

No dependency on cron, systemd, Yocto specifics — purely portable.
- Boots into a stable shell
- Key filesystems (`/proc`, `/sys`, `/tmp`, `/dev`) are accessible
- Kernel version is accessible
- Networking stack is functional

This script is useful for validating device boot health as part of CI, flashing, or kernel testing workflows.

---

Features

Automatic setup of a temporary boot hook
## Overview

Reboot and post-boot health validations
The test script performs the following functional checks:

Detailed logs with PASS/FAIL results
1. **Boot Validation**
- Ensures the system boots into a stable shell.

Auto-retry mechanism up to a configurable limit

Safe cleanup of temp files and hooks after success or failure

Color-coded outputs for easy reading

Lightweight and BusyBox compatible
2. **Filesystem Accessibility**
- Confirms that key filesystems (`/proc`, `/sys`, `/tmp`, `/dev`) are accessible.

3. **Kernel Version Check**
- Verifies that the kernel version is accessible.

4. **Networking Stack Verification**
- Checks that the networking stack is functional.
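
The checks listed above could be written roughly as follows. This is a minimal sketch, not the code from this PR: the function names `check_dirs`, `check_kernel_version`, and `check_network` are hypothetical, and the loopback ping is assumed to be an acceptable proxy for "networking stack is functional".

```sh
#!/bin/sh
# Hypothetical helper names; they are not taken from run.sh in this PR.

# Return 0 if every directory passed as an argument exists.
check_dirs() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "[PASS] Directory $dir exists."
        else
            echo "[FAIL] Directory $dir is missing."
            return 1
        fi
    done
    return 0
}

# Return 0 if the kernel release string can be read.
check_kernel_version() {
    version=$(uname -r) || return 1
    [ -n "$version" ] || return 1
    echo "[PASS] Kernel version: $version"
}

# Return 0 if the loopback address answers a single ping (assumed criterion).
check_network() {
    if ping -c 1 127.0.0.1 >/dev/null 2>&1; then
        echo "[PASS] Network stack active (loopback ping succeeded)."
    else
        echo "[FAIL] Loopback ping failed."
        return 1
    fi
}
```

A caller could chain them as `check_dirs /proc /sys /tmp /dev && check_kernel_version && check_network` and treat any non-zero status as an overall failure.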

---

Usage

Step 1: Copy the script to your device

scp reboot_health_check_autoretry.sh root@<device_ip>:/tmp/

Step 2: Make it executable

chmod +x /tmp/reboot_health_check_autoretry.sh

Step 3: Run the script

/tmp/reboot_health_check_autoretry.sh

The script will automatically:

Create a flag and self-copy to survive reboot

Setup a temporary /etc/init.d/ hook

Force reboot

On reboot, validate the system

Retry if needed


## Files Used

| File / Path | Description |
|--------------------------------------------------|-----------------------------------------------------------------------------|
| `run.sh` | Main script to execute the reboot validation test |
| `/var/reboot_health/` | Directory to store log and retry-related files |
| `/var/reboot_health/reboot_test.log` | Persistent log file for all test outputs |
| `/var/reboot_health/reboot_retry_count` | File storing number of reboot retries (used internally) |
| `/var/reboot_marker` | Temporary marker to differentiate pre- and post-reboot states |
| `/etc/systemd/system/reboot-health.service` | systemd service file to autostart reboot health check after boot |
| `/var/common/reboot_health_check.sh` | Actual reboot validation script that is called on system boot |
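
To illustrate how a marker file can separate the pre-reboot and post-reboot passes, here is a small dry-run sketch. `MARKER` and `REBOOT_CMD` are illustrative, overridable names, and the default marker path under `/tmp` is only safe for a dry run: on a real device the marker must live on storage that persists across reboots, such as the `/var/reboot_marker` path listed in the table.

```sh
#!/bin/sh
# Hypothetical sketch: MARKER and REBOOT_CMD are illustrative names.
MARKER="${MARKER:-/tmp/reboot_marker_demo}"
REBOOT_CMD="${REBOOT_CMD:-echo would-reboot}"   # use "reboot" on a real target

reboot_phase() {
    if [ ! -f "$MARKER" ]; then
        # First pass: leave a marker so the next boot knows a reboot was requested.
        touch "$MARKER" || return 1
        echo "pre-reboot"
        $REBOOT_CMD
    else
        # Second pass: the marker is present, so clear it and report.
        rm -f "$MARKER"
        echo "post-reboot"
    fi
}
```

Calling `reboot_phase` twice in a row with the default `REBOOT_CMD` prints `pre-reboot` and `would-reboot` on the first call and `post-reboot` on the second, which mirrors the two passes of the real test without rebooting anything.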

---

Log File

All outputs are stored in /tmp/reboot_test.log

Summarizes all individual tests and overall result



---

Configuration

Modify these inside the script if needed:


## Service setup:
1. Copy the `reboot-health.service` file to:
2. Enable the service:
`systemctl enable reboot-health.service`
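
The service setup could be scripted along these lines. This is a sketch under assumptions: the destination `/etc/systemd/system` is taken from the file table above rather than from the setup step itself, and `RUN` is a hypothetical dry-run switch that prints each command instead of executing it.

```sh
#!/bin/sh
# Assumptions: UNIT_DIR comes from the README's file table; RUN is a
# hypothetical dry-run switch and defaults to printing the commands.
UNIT_SRC="${UNIT_SRC:-reboot-health.service}"
UNIT_DIR="${UNIT_DIR:-/etc/systemd/system}"
RUN="${RUN-echo}"   # export RUN="" to execute the commands for real

install_service() {
    $RUN cp "$UNIT_SRC" "$UNIT_DIR/" &&
    $RUN systemctl daemon-reload &&
    $RUN systemctl enable reboot-health.service
}
```

With the default settings, `install_service` only prints the three commands, so the steps can be reviewed before running them as root on the target.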
---

Pass/Fail Criteria


---
## Manual Run Instructions:

Limitations

Requires basic /bin/sh shell (ash, bash, dash supported)

Needs writable /tmp/ and /etc/init.d/

Does not rely on systemd, cron, or external daemons


1. **Make the script executable:**
`chmod +x run.sh`

2. **Run the test using:**
`./run-test.sh Reboot_health_check`
---

Cleanup

Script automatically:

Removes temporary boot hook

Deletes self-copy after successful completion

Cleans retry counters


You don't need to manually intervene.


---

Example Run Output

2025-04-26 19:45:20 [START] Reboot Health Test Started
2025-04-26 19:45:21 [STEP] Preparing system for reboot test...
2025-04-26 19:45:23 [INFO] System will reboot now to perform validation.
(reboots)

2025-04-26 19:46:10 [STEP] Starting post-reboot validation...
2025-04-26 19:46:11 [PASS] Boot flag detected. System reboot successful.
2025-04-26 19:46:12 [PASS] Shell is responsive.
2025-04-26 19:46:12 [PASS] Directory /proc exists.
2025-04-26 19:46:12 [PASS] Directory /sys exists.
2025-04-26 19:46:12 [PASS] Directory /tmp exists.
2025-04-26 19:46:12 [PASS] Directory /dev exists.
2025-04-26 19:46:12 [PASS] Kernel version: 6.6.65
2025-04-26 19:46:13 [PASS] Network stack active (ping localhost successful).
2025-04-26 19:46:13 [OVERALL PASS] Reboot + Health Check successful!
## Sample output:
```text
[2025-05-22 18:11:00] [START] Reboot Health Test Started
[2025-05-22 18:11:00] [INFO] Reboot marker not found. Rebooting now...
Rebooting...

...system reboots...

[2025-05-22 18:11:10] [START] Reboot Health Test Started
[2025-05-22 18:11:10] [PASS] System booted successfully and root shell obtained.
[2025-05-22 18:11:10] [OVERALL PASS] Reboot + Health Check successful!
```
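
Because the log accumulates output from every run, a CI job would probably want the verdict of the most recent run only. The helper below is a hedged sketch: `last_overall_result` is a hypothetical name, and the `[OVERALL FAIL]` tag is an assumption, since the sample output only shows `[OVERALL PASS]`.

```sh
#!/bin/sh
# Hypothetical helper: report the verdict of the most recent run in a log file.
last_overall_result() {
    log_file=$1
    [ -r "$log_file" ] || { echo "UNKNOWN"; return 2; }
    # The log accumulates across runs, so only the last OVERALL line counts.
    last=$(grep 'OVERALL' "$log_file" | tail -n 1)
    case $last in
        *"[OVERALL PASS]"*) echo "PASS" ;;
        *"[OVERALL FAIL]"*) echo "FAIL"; return 1 ;;   # assumed tag
        *) echo "UNKNOWN"; return 2 ;;
    esac
}
```

For example, `last_overall_result /var/reboot_health/reboot_test.log` prints `PASS`, `FAIL`, or `UNKNOWN` and returns a matching exit status.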
---
---
## Notes:
```text
The device takes approximately 10 seconds to reach shell after reboot.
Log file is persistent and accumulates output from all runs.
You can manually clear logs using:
`rm -f /var/reboot_health/reboot_test.log`
```
---
## License:
```text
SPDX-License-Identifier: BSD-3-Clause-Clear
(C) Qualcomm Technologies, Inc. and/or its subsidiaries.
```

@@ -0,0 +1,14 @@
[Unit]
Description=Reboot Health Check Service
After=default.target

[Service]
Type=simple
ExecStart=/var/common/reboot_health_check
StandardOutput=tty
StandardError=tty
Restart=no


[Install]
WantedBy=multi-user.target
@@ -0,0 +1,28 @@
#!/bin/sh

# Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
# SPDX-License-Identifier: BSD-3-Clause-Clear

LOG_FILE="/var/reboot_test.log"
MARKER="/var/reboot_marker"

# Start a fresh log for this boot
echo "[START] Reboot Health Test Started" > "$LOG_FILE"

if [ "$(whoami)" = "root" ]; then
    if [ ! -f "$MARKER" ]; then
        # First pass: record that a reboot was requested, then reboot
        echo "[INFO] Reboot marker not found. Rebooting now..." >> "$LOG_FILE"
        touch "$MARKER"
        sleep 2
        reboot -f
    else
        # Second pass: the marker is present, so the reboot completed
        echo "[PASS] System booted successfully and root shell obtained." >> "$LOG_FILE"
        echo "[OVERALL PASS] Reboot + Health Check successful!" >> "$LOG_FILE"
        rm -f "$MARKER"
    fi
    cat "$LOG_FILE"
    exit 0
else
    echo "[FAIL] Root shell not available!" >> "$LOG_FILE"
    cat "$LOG_FILE"
    exit 1
fi
@@ -25,10 +25,16 @@ if [ -z "$__INIT_ENV_LOADED" ]; then
# shellcheck disable=SC1090
. "$INIT_ENV"
fi

# Always source functestlib.sh, using $TOOLS exported by init_env
# shellcheck disable=SC1090,SC1091
. "$TOOLS/functestlib.sh"

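# Define log_info only if functestlib.sh did not already provide it
if ! command -v log_info >/dev/null 2>&1; then
    log_info() {
        # LOG_FILE is set later in this script; fall back to /dev/null until then
        echo "[INFO] $(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "${LOG_FILE:-/dev/null}"
    }
fi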

TESTNAME="Reboot_health_check"
test_path=$(find_test_case_by_name "$TESTNAME")
cd "$test_path" || exit 1
@@ -41,7 +47,9 @@ log_info "=== Test Initialization ==="

# Directory for health check files
HEALTH_DIR="/var/reboot_health"
Contributor review comment: remove var

Contributor review comment: please remove /var/

LOG_FILE="$HEALTH_DIR/reboot_test.log"
RETRY_FILE="$HEALTH_DIR/reboot_retry_count"
MARKER_FILE="/var/reboot_marker"
Contributor review comment: Please remove /var/ and handle using current directory

MAX_RETRIES=3

# Make sure health directory exists
@@ -52,35 +60,20 @@ if [ ! -f "$RETRY_FILE" ]; then
echo "0" > "$RETRY_FILE"
fi

# Read current retry count
RETRY_COUNT=$(cat "$RETRY_FILE")

log_info "--------------------------------------------"
log_info "Boot Health Check Started - $(date)"
log_info "Current Retry Count: $RETRY_COUNT"
log_info "[START] Reboot Health Test Started"

# Health Check: You can expand this check
if [ "$(whoami)" = "root" ]; then
log_pass "System booted successfully and root shell obtained."
log_info "Test Completed Successfully after $RETRY_COUNT retries."

# Optional: clean retry counter after success
echo "0" > "$RETRY_FILE"

# Reboot logic
if [ ! -f "$MARKER_FILE" ]; then
log_info "Reboot marker not found. Rebooting now..."
log_info "Rebooting"
touch "$MARKER_FILE"
reboot
exit 0
else
log_fail "Root shell not available!"

RETRY_COUNT=$((RETRY_COUNT + 1))
echo "$RETRY_COUNT" > "$RETRY_FILE"

if [ "$RETRY_COUNT" -ge "$MAX_RETRIES" ]; then
log_error "[ERROR] Maximum retries ($MAX_RETRIES) reached. Stopping test."
exit 1
else
log_info "Rebooting system for retry #$RETRY_COUNT..."
sync
sleep 2
reboot -f
fi
# Post-reboot actions
rm -f "$MARKER_FILE"
log_info "[PASS] System booted successfully and root shell obtained."
log_info "[OVERALL PASS] Reboot + Health Check successful!"
fi

log_info "-------------------Completed $TESTNAME Testcase----------------------------"
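
For reference, a bounded retry counter kept in a file could look like the sketch below. It is not the code from this PR: `bump_retry` is a hypothetical name, and the default `RETRY_FILE` in the current directory is an assumption that follows the review request to avoid `/var/`.

```sh
#!/bin/sh
# Hypothetical sketch of a bounded retry counter stored in a file.
RETRY_FILE="${RETRY_FILE:-./reboot_retry_count}"
MAX_RETRIES="${MAX_RETRIES:-3}"

# Increment the stored count; return 1 once the limit is reached.
bump_retry() {
    count=$(cat "$RETRY_FILE" 2>/dev/null) || count=""
    case $count in
        ''|*[!0-9]*) count=0 ;;   # missing or corrupt file: start from zero
    esac
    count=$((count + 1))
    echo "$count" > "$RETRY_FILE"
    if [ "$count" -ge "$MAX_RETRIES" ]; then
        echo "Maximum retries ($MAX_RETRIES) reached."
        return 1
    fi
    echo "Retry #$count"
}
```

A caller would reboot only while `bump_retry` returns 0 and reset the file to `0` after a successful health check.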