GitHub sync crashing with 502 at repo stage #1824

@achantavy

Description:

What issue is being seen? Describe what should be happening instead of the bug, for example: Cartography should not crash, the expected value isn't returned, the data schema is wrong, etc.

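The GitHub sync crashes at the repo stage: the GraphQL request for `repositories` fails with a 502 Bad Gateway from https://api.github.com/graphql, and the exception is raised after 5 retries. This is not transient.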

To Reproduce:

Steps to reproduce the behavior. Provide all data and inputs required to reproduce the issue.

Unclear. Possibly: run the GitHub sync against a large enough environment.

Logs:

If applicable, copy and paste your console log with the failing stack trace.

[2025-08-19T17:00:10Z] INFO cartography.graph.job: Finished job GitHubUser
[2025-08-19T17:00:10Z] INFO cartography.intel.github.repos: Syncing GitHub repos
[2025-08-19T17:01:32Z] ERROR cartography.intel.github.util: GitHub: Could not retrieve page of resource `repositories` due to HTTP error after 5 retries. Raising exception.
NoneType: None
[2025-08-19T17:01:32Z] ERROR cartography.sync: Unhandled exception during sync stage 'github'
Traceback (most recent call last):
  File "/app/.venv/lib/python3.13/site-packages/cartography/sync.py", line 144, in run
    stage_func(neo4j_session, config)
    ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.13/site-packages/cartography/util.py", line 204, in timed
    return method(*args, **kwargs)
  File "/app/.venv/lib/python3.13/site-packages/cartography/intel/github/__init__.py", line 43, in start_github_ingestion
    cartography.intel.github.repos.sync(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        neo4j_session,
        ^^^^^^^^^^^^^^
    ...<3 lines>...
        auth_data["name"],
        ^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/app/.venv/lib/python3.13/site-packages/cartography/intel/github/repos.py", line 1166, in sync
    repos_json = get(github_api_key, github_url, organization)
  File "/app/.venv/lib/python3.13/site-packages/cartography/util.py", line 204, in timed
    return method(*args, **kwargs)
  File "/app/.venv/lib/python3.13/site-packages/cartography/intel/github/repos.py", line 287, in get
    repos, _ = fetch_all(
               ~~~~~~~~~^
        token,
        ^^^^^^
    ...<3 lines>...
        "repositories",
        ^^^^^^^^^^^^^^^
    )
    ^
  File "/app/.venv/lib/python3.13/site-packages/cartography/intel/github/util.py", line 172, in fetch_all
    raise exc
  File "/app/.venv/lib/python3.13/site-packages/cartography/intel/github/util.py", line 154, in fetch_all
    resp = fetch_page(token, api_url, organization, query, cursor, **kwargs)
  File "/app/.venv/lib/python3.13/site-packages/cartography/intel/github/util.py", line 110, in fetch_page
    response = call_github_api(query, gql_vars_json, token, api_url)
  File "/app/.venv/lib/python3.13/site-packages/cartography/intel/github/util.py", line 75, in call_github_api
    response.raise_for_status()
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/app/.venv/lib/python3.13/site-packages/requests/models.py", line 1026, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 502 Server Error: Bad Gateway for url: https://api.github.com/graphql
[2025-08-19T17:01:32Z] ERROR sync.1755622800165-311f7fd0: Error during Cartography sync: Traceback (most recent call last):
  File "/app/app/sync.py", line 762, in run_sync
    run_with_config(sync, config)
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.13/site-packages/cartography/sync.py", line 283, in run_with_config
    return sync.run(neo4j_driver, config)
           ~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.13/site-packages/cartography/sync.py", line 144, in run
    stage_func(neo4j_session, config)
    ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.13/site-packages/cartography/util.py", line 204, in timed
    return method(*args, **kwargs)
  File "/app/.venv/lib/python3.13/site-packages/cartography/intel/github/__init__.py", line 43, in start_github_ingestion
    cartography.intel.github.repos.sync(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        neo4j_session,
        ^^^^^^^^^^^^^^
    ...<3 lines>...
        auth_data["name"],
        ^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/app/.venv/lib/python3.13/site-packages/cartography/intel/github/repos.py", line 1166, in sync
    repos_json = get(github_api_key, github_url, organization)
  File "/app/.venv/lib/python3.13/site-packages/cartography/util.py", line 204, in timed
    return method(*args, **kwargs)
  File "/app/.venv/lib/python3.13/site-packages/cartography/intel/github/repos.py", line 287, in get
    repos, _ = fetch_all(
               ~~~~~~~~~^
        token,
        ^^^^^^
    ...<3 lines>...
        "repositories",
        ^^^^^^^^^^^^^^^
    )
    ^
  File "/app/.venv/lib/python3.13/site-packages/cartography/intel/github/util.py", line 172, in fetch_all
    raise exc
  File "/app/.venv/lib/python3.13/site-packages/cartography/intel/github/util.py", line 154, in fetch_all
    resp = fetch_page(token, api_url, organization, query, cursor, **kwargs)
  File "/app/.venv/lib/python3.13/site-packages/cartography/intel/github/util.py", line 110, in fetch_page
    response = call_github_api(query, gql_vars_json, token, api_url)
  File "/app/.venv/lib/python3.13/site-packages/cartography/intel/github/util.py", line 75, in call_github_api
    response.raise_for_status()
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/app/.venv/lib/python3.13/site-packages/requests/models.py", line 1026, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 502 Server Error: Bad Gateway for url: https://api.github.com/graphql
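The log shows the same `repositories` page failing with a 502 on every one of the 5 retries. One possible mitigation, assuming the 502s come from the query being too expensive per page (the issue does not confirm the cause), is to shrink the page size on each retry in addition to backing off. The sketch below is not Cartography's implementation: `TransientHTTPError`, `fetch_page_with_backoff`, and the `fetch` callable are hypothetical names introduced for illustration, and the default sizes and delays are arbitrary.

```python
import time
from typing import Any, Callable


class TransientHTTPError(Exception):
    """Hypothetical stand-in for an HTTP error such as requests.exceptions.HTTPError."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def fetch_page_with_backoff(
    fetch: Callable[[int], Any],
    page_size: int = 50,
    min_page_size: int = 5,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch(page_size); after each 5xx, wait and retry with half the page size.

    `fetch` is a hypothetical callable that requests one page of results with
    the given page size and raises TransientHTTPError on an HTTP failure.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    size = page_size
    for attempt in range(max_attempts):
        try:
            return fetch(size)
        except TransientHTTPError as exc:
            # Re-raise client errors immediately and server errors once
            # the retry budget is exhausted.
            if exc.status < 500 or attempt == max_attempts - 1:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
            # Ask for fewer items next time, but never fewer than min_page_size.
            size = max(min_page_size, size // 2)
    raise AssertionError("unreachable: the loop always returns or raises")
```

With the defaults, a page that keeps failing is retried at sizes 50, 25, 12, 6, and 5 before the last error is re-raised, so a persistent 502 still surfaces instead of being swallowed.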

Screenshots:

If applicable, add screenshots to help explain your problem.

Please complete the following information:

  • Cartography release version or commit hash [e.g. 0.12.0 or 95e8e11]

0.110.0


Labels:

module:GitHub (Related to GitHub intel module)
