Multiple SHIR containers with cascading failures - 0x80010002 (RPC_E_CALL_CANCELED) #7

@nickcva

We are running several Windows SHIR containers on the same physical machines. All containers use the same network and the default NAT Docker switch. Once one container becomes unhealthy, the failure slowly cascades to the rest of the SHIR containers. We do not use a proxy, and no network issues are occurring between the on-prem environment and the Azure ADF/Synapse instance.

Are there any known issues with running multiple SHIR containers on the same host when each one connects to a different Azure ADF/Synapse instance? We need to scale this out to hundreds of SHIR containers.

Windows Server 2019 Standard, version 1809, build 17763.3406

The Dockerfile is the latest version, with this addition:
RUN MD C:\Download
ADD https://github.com/adoptium/temurin8-binaries/releases/download/jdk8u345-b01/OpenJDK8U-jdk_x64_windows_hotspot_8u345b01.zip C:/Download
RUN MD "C:\Program Files\Eclipse Adoptium\jdk8u345-b01"
RUN tar -xf C:/Download/OpenJDK8U-jdk_x64_windows_hotspot_8u345b01.zip -C "C:\Program Files\Eclipse Adoptium"
RUN SETX PATH "%PATH%;C:\Program Files\Eclipse Adoptium\jdk8u345-b01\bin;C:\Program Files\Eclipse Adoptium\jdk8u345-b01\jre\bin\server" /m
RUN SETX JAVA_HOME "C:\Program Files\Eclipse Adoptium\jdk8u345-b01\" /m

The only Docker warning logged on the host server is:
Health check for container 39fbbf4f690da051145d18f9d4df16b6666108c76dd39cf73d177179bf961f60 error: context deadline exceeded
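
For anyone triaging the same warning, the health record that Docker keeps for a container can be read with `docker inspect`. The following is only a minimal sketch and was not part of the original setup: the helper names `parse_health` and `fetch_health` are hypothetical, and it assumes the `docker` CLI is available on the host's PATH.

```python
import json
import subprocess


def parse_health(raw: str) -> dict:
    """Summarize the JSON printed by `docker inspect --format "{{json .State.Health}}"`."""
    health = json.loads(raw)
    if health is None:
        # Containers without a HEALTHCHECK report null here.
        return {"status": "none", "failing_streak": 0, "last_output": ""}
    log = health.get("Log") or []
    last_output = log[-1].get("Output", "") if log else ""
    return {
        "status": health.get("Status", "unknown"),
        "failing_streak": health.get("FailingStreak", 0),
        "last_output": last_output.strip(),
    }


def fetch_health(container_id: str) -> dict:
    """Query the local Docker CLI for one container (assumes `docker` is on PATH)."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{json .State.Health}}", container_id],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_health(result.stdout)
```

Calling `fetch_health` with each container ID in turn would show which containers are currently `unhealthy` and how long their failing streak is, which may help establish the order in which the failure spreads.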

This shows up in the logs of all the unhealthy containers:
`[09/22/2022 12:23:08] Registering SHIR node with the node key: redacted@ServiceEndpoint=usgovva.frontend.datamovement.azure.us@Vredacted
[09/22/2022 12:23:09] Registering SHIR node with the node name: redacted
[09/22/2022 12:23:09] Registering SHIR node with the enable high availability flag: true
[09/22/2022 12:23:09] Registering SHIR node with the tcp port: 8060
[09/22/2022 12:25:54] Start registering a new SHIR node
[09/22/2022 12:25:54] Enable High Availability
[09/22/2022 12:25:54] Remote Access Port: 8060
[09/22/2022 12:31:59] Waiting 60 seconds for connecting
Get-WmiObject : Call was canceled by the message filter. (Exception from HRESULT: 0x80010002 (RPC_E_CALL_CANCELED))
At C:\SHIR\setup.ps1:17 char:22
+ ... essResult = Get-WmiObject Win32_Process -Filter "name = 'diahost.exe' ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Get-WmiObject], COMException
    + FullyQualifiedErrorId : GetWMICOMException,Microsoft.PowerShell.Commands.GetWmiObjectCommand
[09/22/2022 12:34:02] diahost.exe is not running
Get-WmiObject : Call was canceled by the message filter. (Exception from HRESULT: 0x80010002 (RPC_E_CALL_CANCELED))
At C:\SHIR\setup.ps1:17 char:22
+ ... essResult = Get-WmiObject Win32_Process -Filter "name = 'diahost.exe' ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Get-WmiObject], COMException
    + FullyQualifiedErrorId : GetWMICOMException,Microsoft.PowerShell.Commands.GetWmiObjectCommand
[09/22/2022 12:36:06] diahost.exe is not running
Get-WmiObject : Call was canceled by the message filter. (Exception from HRESULT: 0x80010002 (RPC_E_CALL_CANCELED))
At C:\SHIR\setup.ps1:17 char:22
+ ... essResult = Get-WmiObject Win32_Process -Filter "name = 'diahost.exe' ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Get-WmiObject], COMException
    + FullyQualifiedErrorId : GetWMICOMException,Microsoft.PowerShell.Commands.GetWmiObjectCommand
[09/22/2022 12:38:09] diahost.exe is not running
Get-WmiObject : Call was canceled by the message filter. (Exception from HRESULT: 0x80010002 (RPC_E_CALL_CANCELED))
At C:\SHIR\setup.ps1:17 char:22
+ ... essResult = Get-WmiObject Win32_Process -Filter "name = 'diahost.exe' ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Get-WmiObject], COMException
    + FullyQualifiedErrorId : GetWMICOMException,Microsoft.PowerShell.Commands.GetWmiObjectCommand
[09/22/2022 12:40:11] diahost.exe is not running
Get-WmiObject : Call was canceled by the message filter. (Exception from HRESULT: 0x80010002 (RPC_E_CALL_CANCELED))
At C:\SHIR\setup.ps1:17 char:22
+ ... essResult = Get-WmiObject Win32_Process -Filter "name = 'diahost.exe' ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Get-WmiObject], COMException
    + FullyQualifiedErrorId : GetWMICOMException,Microsoft.PowerShell.Commands.GetWmiObjectCommand
[09/22/2022 12:42:12] diahost.exe is not running
Get-WmiObject : Call was canceled by the message filter. (Exception from HRESULT: 0x80010002 (RPC_E_CALL_CANCELED))
At C:\SHIR\setup.ps1:17 char:22
+ ... essResult = Get-WmiObject Win32_Process -Filter "name = 'diahost.exe' ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Get-WmiObject], COMException
    + FullyQualifiedErrorId : GetWMICOMException,Microsoft.PowerShell.Commands.GetWmiObjectCommand
[09/22/2022 12:44:12] diahost.exe is not running`
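
To make the retry pattern in a log like this easier to see, the excerpt can be summarized with a few lines of Python. This is only a sketch for triage: the function name `summarize` is hypothetical, and it assumes the `[MM/DD/YYYY HH:MM:SS]` timestamp prefix and the exact message text shown in the log above.

```python
import re
from datetime import datetime

# Matches lines such as "[09/22/2022 12:34:02] diahost.exe is not running".
TIMESTAMPED = re.compile(r"^\[(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\] (.*)$")


def summarize(log_text: str) -> dict:
    """Count WMI cancellations and measure the gaps between 'not running' reports."""
    not_running = []
    for line in log_text.splitlines():
        match = TIMESTAMPED.match(line.strip())
        if match and match.group(2) == "diahost.exe is not running":
            stamp = datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S")
            not_running.append(stamp)
    # Seconds between each consecutive pair of "not running" reports.
    gaps = [
        int((later - earlier).total_seconds())
        for earlier, later in zip(not_running, not_running[1:])
    ]
    return {
        "rpc_cancellations": log_text.count("0x80010002"),
        "not_running_reports": len(not_running),
        "gaps_seconds": gaps,
    }
```

In the excerpt above, the six "diahost.exe is not running" reports are between 120 and 124 seconds apart, and each one is accompanied by a `Get-WmiObject` call that was canceled with 0x80010002.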
