Commit c1e89ae

Tomer Tayar authored and ogabbay committed
accel/habanalabs/gaudi2: check extended errors according to PCIe addr_dec interrupt info
The FW interrupt info for a PCIe addr_dec event is set correctly, so check for either global errors or razwi according to the indications there.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
1 parent 7159813 commit c1e89ae
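
The per-cause check described in the commit message can be sketched in standalone C. This is only an illustration, not the driver code: the mask values, `NUM_CAUSES`, and the helper names `check_glbl_errors` and `print_razwi_info` are hypothetical stand-ins for the register-header masks and driver helpers, and the bit positions are assumed.

```c
#include <stdint.h>

/* Hypothetical stand-ins; the real masks come from the Gaudi2 register headers. */
#define NUM_CAUSES      2
#define LBW_ERR_MASK    (1ULL << 0) /* assumed bit position */
#define BAD_ACCESS_MASK (1ULL << 1) /* assumed bit position */

/* Counters record which extended check ran, in place of real logging. */
static int glbl_checks;
static int razwi_checks;

static void check_glbl_errors(void) { glbl_checks++; }
static void print_razwi_info(void) { razwi_checks++; }

/* Walk the cause bits and run only the extended check that each set bit calls for. */
static int handle_addr_dec(uint64_t intr_cause_data)
{
	int error_count = 0;
	int i;

	for (i = 0; i < NUM_CAUSES; i++) {
		uint64_t bit = 1ULL << i;

		if (!(intr_cause_data & bit))
			continue;

		error_count++;

		switch (bit) {
		case LBW_ERR_MASK:
			check_glbl_errors();
			break;
		case BAD_ACCESS_MASK:
			print_razwi_info();
			break;
		}
	}

	return error_count;
}
```

Calling `handle_addr_dec(LBW_ERR_MASK)` in this sketch runs only the global-error check, whereas the code before this commit ran both extended checks regardless of which cause bits were set.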

1 file changed

+8 -10 lines changed

drivers/accel/habanalabs/gaudi2/gaudi2.c

Lines changed: 8 additions & 10 deletions
@@ -8983,9 +8983,6 @@ static int gaudi2_print_pcie_addr_dec_info(struct hl_device *hdev, u16 event_typ
 	u32 error_count = 0;
 	int i;
 
-	gaudi2_print_event(hdev, event_type, true,
-		"intr_cause_data: %#llx", intr_cause_data);
-
 	for (i = 0 ; i < GAUDI2_NUM_OF_PCIE_ADDR_DEC_ERR_CAUSE ; i++) {
 		if (!(intr_cause_data & BIT_ULL(i)))
 			continue;
@@ -8994,15 +8991,16 @@ static int gaudi2_print_pcie_addr_dec_info(struct hl_device *hdev, u16 event_typ
 			"err cause: %s", gaudi2_pcie_addr_dec_error_cause[i]);
 		error_count++;
 
-		/*
-		 * Always check for LBW and HBW additional info as the indication itself is
-		 * sometimes missing
-		 */
+		switch (intr_cause_data & BIT_ULL(i)) {
+		case PCIE_WRAP_PCIE_IC_SEI_INTR_IND_AXI_LBW_ERR_INTR_MASK:
+			hl_check_for_glbl_errors(hdev);
+			break;
+		case PCIE_WRAP_PCIE_IC_SEI_INTR_IND_BAD_ACCESS_INTR_MASK:
+			gaudi2_print_pcie_mstr_rr_mstr_if_razwi_info(hdev, event_mask);
+			break;
+		}
 	}
 
-	hl_check_for_glbl_errors(hdev);
-	gaudi2_print_pcie_mstr_rr_mstr_if_razwi_info(hdev, event_mask);
-
 	return error_count;
 }
 