Memory Leak still seems to be present when running in Python #603

@lcurtis-datto

Based on #560, the memory leak appeared to be resolved. However, after running a large number of scans, memory usage increases continually until the Python script is killed by the oom-killer. I have tried running the sslyze scan in a child process with multiprocessing, but I keep running into deadlocks when scanning larger lists of hosts.

To Reproduce
The condition is reproducible by running the example code from #560:

import os
import gc
import psutil
from sslyze import Scanner, ServerScanRequest, ServerNetworkLocation


def print_memory_used(msg):
    object_count = len(gc.get_objects())
    process = psutil.Process(os.getpid())
    memory_used = process.memory_info().rss
    print(f"{msg}: object count: {object_count}, memory used: {memory_used / 1024**2} MB")


def sslyze_scan(hostname, port):
    # Queue a single scan and drain the results generator so the scan completes.
    request = ServerScanRequest(ServerNetworkLocation(hostname=hostname, port=port))
    scanner = Scanner()
    scanner.queue_scans([request])
    results = list(scanner.get_results())


# Run the same scan four times and print memory usage around each run.
for i in range(1, 5):
    print_memory_used(f"before run {i}")
    sslyze_scan("mozilla.com", 443)
    print_memory_used(f"after run {i}")

Expected behavior
Memory usage should return to its baseline after each run, but the output shows it growing:

before run 1: object count: 35980, memory used: 37.78515625 MB
after run 1: object count: 45321, memory used: 59.171875 MB
before run 2: object count: 45321, memory used: 59.171875 MB
after run 2: object count: 52064, memory used: 69.640625 MB
before run 3: object count: 52064, memory used: 69.640625 MB
after run 3: object count: 52101, memory used: 79.33984375 MB
before run 4: object count: 52101, memory used: 79.85546875 MB
after run 4: object count: 45080, memory used: 86.80078125 MB
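
One possible way to narrow this down, sketched with the standard library only: compare tracemalloc snapshots taken before and after several runs to see which source lines hold more memory afterwards. The helper name `top_growth` and its parameters are made up for illustration, and `workload` stands for any zero-argument callable. Note that tracemalloc only tracks memory allocated through Python's allocators, so growth inside native extension code would not show up here.

```python
import gc
import tracemalloc


def top_growth(workload, runs=3, limit=5):
    """Call workload() `runs` times and return the allocation sites that grew most.

    Hypothetical helper for illustration; not part of sslyze.
    """
    tracemalloc.start()
    try:
        gc.collect()  # drop collectable garbage so it does not skew the baseline
        before = tracemalloc.take_snapshot()
        for _ in range(runs):
            workload()
        gc.collect()  # only memory that survives a full collection counts as growth
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # StatisticDiff entries, sorted by absolute size difference (largest first)
    return after.compare_to(before, "lineno")[:limit]
```

With the reproduction script above, this could be called as `for stat in top_growth(lambda: sslyze_scan("mozilla.com", 443)): print(stat)`. If RSS keeps rising while these diffs stay small, the growth is probably outside the Python heap.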

When the same scan runs in a child process via multiprocessing, the memory is released:

before run 1: object count: 36584, memory used: 23.13671875 MB
after run 1: object count: 45749, memory used: 51.92578125 MB
before run 2: object count: 36597, memory used: 23.13671875 MB
after run 2: object count: 45759, memory used: 51.9296875 MB
before run 3: object count: 36609, memory used: 23.13671875 MB
after run 3: object count: 45768, memory used: 52.19140625 MB
before run 4: object count: 36621, memory used: 23.13671875 MB
after run 4: object count: 45777, memory used: 52.00390625 MB
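
For reference, a minimal sketch of this kind of isolation (not the code used for the numbers above): each call runs in a short-lived child process, so whatever memory it accumulates is returned to the OS when the child exits. The names `run_isolated` and `_child_main`, the timeout value, and the `start_method` parameter are assumptions for illustration. The default `"fork"` matches the Linux default on Python 3.8. `"spawn"` starts a fresh interpreter instead, but needs an importable target function and an `if __name__ == "__main__":` guard. The timeout only bounds a hung child; it does not explain or fix the deadlock.

```python
import multiprocessing as mp


def _child_main(conn, func, args):
    """Run func(*args) in the child and send back ("ok", value) or ("error", text)."""
    try:
        conn.send(("ok", func(*args)))
    except Exception as exc:  # report failures instead of exiting silently
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def run_isolated(func, *args, timeout=120.0, start_method="fork"):
    """Run func(*args) in a short-lived child process and return its result.

    Hypothetical helper for illustration. The return value must be picklable.
    Raises TimeoutError if the child does not answer within `timeout` seconds.
    """
    ctx = mp.get_context(start_method)
    parent_conn, child_conn = ctx.Pipe(duplex=False)  # (receive end, send end)
    proc = ctx.Process(target=_child_main, args=(child_conn, func, args))
    proc.start()
    child_conn.close()  # the parent keeps only the receiving end
    try:
        if not parent_conn.poll(timeout):
            raise TimeoutError(f"child did not finish within {timeout} s")
        status, payload = parent_conn.recv()
    finally:
        proc.join(timeout=1.0)  # give a finished child a moment to exit
        if proc.is_alive():  # still running: treat it as hung and kill it
            proc.terminate()
            proc.join()
        parent_conn.close()
    if status == "error":
        raise RuntimeError(payload)
    return payload
```

With the reproduction script, `run_isolated(sslyze_scan, "mozilla.com", 443)` would run one scan per child. To get data back to the parent, the scan function should return plain, picklable values such as a summary dict; whether sslyze's result objects pickle cleanly is not confirmed here.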

Python environment (please complete the following information):

  • OS: Ubuntu 20.04
  • Python version: Python 3.8.10
  • SSLyze version: 5.1.3
  • nassl version: 5.0.1

Additional context
I have tried using multiprocessing in our main internal code, but it deadlocks when scanning larger groups of hosts, with the hung process waiting on a futex:

FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY

I am continuing to investigate other alternatives to avoid increasing memory usage on subsequent scans. Any assistance or advice is greatly appreciated.
