Commit 66016ca

hawkw and smklein authored
[nexus] SP ereport ingestion (#8296)
This branch adds a Nexus background task for ingesting ereports from service processors via MGS, using the MGS API endpoint added in #7903. These APIs in turn expose the MGS/SP ereport ingestion protocol added in oxidecomputer/management-gateway-service#370.

For more information on the protocol itself, refer to the following RFDs:

- [RFD 520 Control Plane Fault Ingestion and Data Model][RFD 520]
- [RFD 544 Embedded E-Report Formats][RFD 544]
- [RFD 545 Firmware E-Report Aggregation and Evacuation][RFD 545]

In addition to the ereport ingester background task, this branch also adds database tables for storing ereports from SPs, which are necessary to implement the ingestion task. I've also added a table for storing ereports from the sled host OS, which will eventually be ingested via sled-agent. While there isn't currently anything that populates that table, I wanted to begin sketching out how we would represent the two categories of ereports we expect to deal with, and how we would query both tables for ereports.

Finally, this branch also adds OMDB commands for querying the ereports stored in the database. These OMDB commands may be useful both for debugging the ereport ingestion subsystem itself *and* for diagnosing issues once the SP firmware actually emits ereports. At present, the higher-level components of the fault-management subsystem, which will process ereports, diagnose faults, and generate alerts, have yet to be implemented. Therefore, the OMDB ereport commands serve as an interim solution for accessing the lower-level data, which may be useful for debugging such faults until the higher-level FMA components exist.

[RFD 520]: https://rfd.shared.oxide.computer/rfd/0520
[RFD 544]: https://rfd.shared.oxide.computer/rfd/0544
[RFD 545]: https://rfd.shared.oxide.computer/rfd/0545

---------

Co-authored-by: Sean Klein <sean@oxide.computer>
1 parent ad1e837 commit 66016ca

47 files changed: +2876 -24 lines changed

Cargo.lock

Lines changed: 6 additions & 0 deletions

clients/gateway-client/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ workspace = true
 base64.workspace = true
 chrono.workspace = true
 daft.workspace = true
+ereport-types.workspace = true
 gateway-messages.workspace = true
 gateway-types.workspace = true
 omicron-uuid-kinds.workspace = true

clients/gateway-client/src/lib.rs

Lines changed: 4 additions & 1 deletion
@@ -70,12 +70,15 @@ progenitor::generate_api!(
         SpIgnition = { derives = [PartialEq, Eq, PartialOrd, Ord] },
         SpIgnitionSystemType = { derives = [Copy, PartialEq, Eq, PartialOrd, Ord] },
         SpState = { derives = [PartialEq, Eq, PartialOrd, Ord] },
-        SpType = { derives = [daft::Diffable] },
+        SpType = { derives = [daft::Diffable, PartialEq, Eq, PartialOrd, Ord] },
         SpUpdateStatus = { derives = [PartialEq, Hash, Eq] },
         UpdatePreparationProgress = { derives = [PartialEq, Hash, Eq] },
     },
     replace = {
         RotSlot = gateway_types::rot::RotSlot,
+        Ena = ereport_types::Ena,
+        Ereport = ereport_types::Ereport,
+        TypedUuidForEreporterRestartKind = omicron_uuid_kinds::EreporterRestartUuid,
         TypedUuidForMupdateKind = omicron_uuid_kinds::MupdateUuid,
     },
 );
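
One practical effect of adding `PartialEq, Eq, PartialOrd, Ord` to the generated `SpType` is that values of that type can be compared, sorted, and used as ordered-map keys. The following is a minimal, self-contained sketch of that idea only. The local `SpType` enum, its variant names, and the `count_by_type` helper are hypothetical stand-ins for illustration, not the generated client types or any code from this commit.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for the generated `SpType`. The variant names are
// assumptions for illustration; only the derive list mirrors the diff above
// (minus `daft::Diffable`, which needs an external crate).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum SpType {
    Sled,
    Power,
    Switch,
}

// Hypothetical helper: count SPs per type. `BTreeMap` requires `Ord` on its
// key type, which is what the added derives provide.
fn count_by_type(sps: &[(SpType, u16)]) -> BTreeMap<SpType, usize> {
    let mut counts = BTreeMap::new();
    for (ty, _slot) in sps {
        *counts.entry(*ty).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let sps = [
        (SpType::Switch, 0),
        (SpType::Sled, 7),
        (SpType::Power, 1),
        (SpType::Sled, 3),
    ];
    // Iteration follows the derived ordering, i.e. variant declaration order.
    for (ty, n) in &count_by_type(&sps) {
        println!("{ty:?}: {n}");
    }
}
```

Whether the commit actually uses `SpType` as a map key or only for sorting is not visible in this part of the diff; the sketch shows why the extra derives would be needed in either case.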

dev-tools/omdb/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ csv.workspace = true
 diesel.workspace = true
 dropshot.workspace = true
 dyn-clone.workspace = true
+ereport-types.workspace = true
 futures.workspace = true
 gateway-client.workspace = true
 gateway-messages.workspace = true

dev-tools/omdb/src/bin/omdb/db.rs

Lines changed: 7 additions & 0 deletions
@@ -19,6 +19,7 @@
 
 use crate::Omdb;
 use crate::check_allow_destructive::DestructiveOperationToken;
+use crate::db::ereport::cmd_db_ereport;
 use crate::helpers::CONNECTION_OPTIONS_HEADING;
 use crate::helpers::DATABASE_OPTIONS_HEADING;
 use crate::helpers::const_max_len;
@@ -180,6 +181,7 @@ use tabled::Tabled;
 use uuid::Uuid;
 
 mod alert;
+mod ereport;
 mod saga;
 
 const NO_ACTIVE_PROPOLIS_MSG: &str = "<no active Propolis>";
@@ -356,6 +358,8 @@ enum DbCommands {
     Disks(DiskArgs),
     /// Print information about internal and external DNS
     Dns(DnsArgs),
+    /// Query and display error reports
+    Ereport(ereport::EreportArgs),
     /// Print information about collected hardware/software inventory
     Inventory(InventoryArgs),
     /// Print information about physical disks
@@ -1494,6 +1498,9 @@ impl DbArgs {
                     &args,
                     token,
                 ).await
+            },
+            DbCommands::Ereport(args) => {
+                cmd_db_ereport(&datastore, &fetch_opts, &args).await
             }
         }
     }
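
The hunks above wire the new subcommand in two places: a variant on `DbCommands` that carries the argument struct, and a `match` arm that forwards those arguments to `cmd_db_ereport`. A stripped-down sketch of that dispatch pattern, using only the standard library, might look as follows. Everything here is a simplified assumption: the real code is async, takes a datastore and fetch options, and uses a derive-based argument parser; the `limit` field, the `run` function, and the returned strings are hypothetical.

```rust
// Hypothetical, simplified stand-in for `ereport::EreportArgs`; the `limit`
// field is an assumption made up for this sketch.
#[derive(Debug)]
struct EreportArgs {
    limit: usize,
}

// Simplified stand-in for `DbCommands`: one variant per subcommand, with the
// new `Ereport` variant carrying its own argument struct.
#[derive(Debug)]
enum DbCommands {
    Dns,
    Ereport(EreportArgs),
    Inventory,
}

// Stand-in for `cmd_db_ereport`. The real function also receives the
// datastore and fetch options and is awaited; this one only formats a string.
fn cmd_db_ereport(args: &EreportArgs) -> Result<String, String> {
    Ok(format!("would list up to {} ereports", args.limit))
}

// Hypothetical dispatcher: each subcommand maps to exactly one handler, so
// adding a subcommand means adding one variant and one match arm.
fn run(cmd: &DbCommands) -> Result<String, String> {
    match cmd {
        DbCommands::Dns => Ok("dns".to_string()),
        DbCommands::Ereport(args) => cmd_db_ereport(args),
        DbCommands::Inventory => Ok("inventory".to_string()),
    }
}

fn main() {
    let cmds = [
        DbCommands::Dns,
        DbCommands::Ereport(EreportArgs { limit: 10 }),
        DbCommands::Inventory,
    ];
    for cmd in &cmds {
        match run(cmd) {
            Ok(out) => println!("{out}"),
            Err(e) => eprintln!("error: {e}"),
        }
    }
}
```

Because the `match` is exhaustive, the compiler rejects a new `DbCommands` variant that lacks a corresponding arm, which is presumably why the enum change and the dispatch change land together in this commit.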
