Create shared schemas collector for DBM integrations #21720

Open

sethsamuel wants to merge 16 commits into master from seth.samuel/DBMON-5799-create-shared-schema-collector

+328 −0

Contributor

sethsamuel commented Oct 21, 2025

What does this PR do?

Adds a shared schema collector for the DBM integrations (Postgres, MySQL, SQLServer).

Motivation

This class centralizes shared logic around iteration, buffering, submission, etc. Individual integrations will implement subclasses that handle actual data retrieval and mapping. See #21501 for the Postgres implementation.
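
As a rough illustration of that split, the sketch below shows a base class that owns iteration, buffering, and submission while a subclass only supplies and maps the data. Every name here (`SchemaCollectorSketch`, `_get_rows`, `_map_row`, `_submit`, `batch_size`) is hypothetical and chosen for the example; it is not the code in this PR.

```python
from abc import ABC, abstractmethod
from typing import Any, Iterable


class SchemaCollectorSketch(ABC):
    """Hypothetical base class: owns iteration, buffering, and submission."""

    def __init__(self, batch_size: int = 2) -> None:
        self.batch_size = batch_size
        self._buffer = []
        self.submitted = []

    @abstractmethod
    def _get_rows(self) -> Iterable[Any]:
        """Subclass hook: retrieve raw rows from the database."""

    @abstractmethod
    def _map_row(self, row: Any) -> dict:
        """Subclass hook: map a raw row to a payload entry."""

    def _submit(self, batch: list) -> None:
        # Stand-in for sending a payload; here the batch is only recorded.
        self.submitted.append(batch)

    def _flush(self) -> None:
        if self._buffer:
            self._submit(self._buffer)
            self._buffer = []

    def collect(self) -> None:
        for row in self._get_rows():
            self._buffer.append(self._map_row(row))
            if len(self._buffer) >= self.batch_size:
                self._flush()
        self._flush()  # send whatever is left in the buffer


class FakePostgresCollector(SchemaCollectorSketch):
    """Hypothetical subclass: only knows how to fetch and map rows."""

    def _get_rows(self):
        return [("public", "users"), ("public", "orders"), ("audit", "events")]

    def _map_row(self, row):
        schema, table = row
        return {"schema": schema, "table": table}


collector = FakePostgresCollector(batch_size=2)
collector.collect()
```

With three rows and a batch size of two, the base class submits one full batch and then one partial batch, without the subclass containing any buffering logic.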

Review checklist (to be filled by reviewers)

Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged


          Create shared schemas collector for DBM integrations

6b80e5a

temporal-github-worker-1 bot added agent/review-requested ecosystems/review-requested product/review-requested labels

datadog-agent-integrations-bot bot added the base_package label

Contributor

datadog-official bot commented Oct 21, 2025

⚠️ Tests

⚠️ Warnings

❄️ 1 New flaky test detected

test_statement_metrics_and_plans[master-EXEC multiQueryProc-expected_queries_patterns1-param_groups1-5-False-True-True-False-multiqueryproc] from test_statements.py (Datadog)
missing expected matching rows
assert 1 == 2
 +  where 1 = len([{'dd_commands': ['SELECT'], 'dd_comments': [], 'dd_tables': ['ϑings'], 'execution_count': 5, ...}])
 +  and   2 = len(["select @total = @total \\+ count\\(\\*\\) from sys\\.databases where name like '%_'", "select @total = @total \\+ count\\(\\*\\) from sys\\.sysobjects where type = 'U'"])

ℹ️ Info

🧪 All tests passed

This comment will be updated automatically if new data arrives.

🔗 Commit SHA: 2a65b0a

codecov bot commented Oct 21, 2025

Codecov Report

❌ Patch coverage is 89.36170% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.12%. Comparing base (74c7edc) to head (2a65b0a).
⚠️ Report is 2 commits behind head on master.

sethsamuel added 3 commits

October 21, 2025 14:36

WIP

96e5260

WIP

04f8163


          Changelog

4624b88

sethsamuel commented

datadog_checks_base/tests/base/utils/test_persistent_cache.py (resolved)

sethsamuel added 3 commits

October 21, 2025 14:49


          Warning

a68f875


          Remove unused

aa0e0dd


          Lint

3c64896

sethsamuel marked this pull request as ready for review

October 23, 2025 13:35

sethsamuel requested review from a team as code owners

October 23, 2025 13:35

datadog-agent-integrations-bot bot added team/agent-integrations team/database-monitoring-agent labels

chatgpt-codex-connector bot reviewed

chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

datadog_checks_base/datadog_checks/base/utils/db/schemas.py (resolved)

datadog_checks_base/datadog_checks/base/utils/db/schemas.py (resolved)

sethsamuel added 2 commits

October 24, 2025 09:04


          AI Fixes

da84647

Fix

9c2daa0

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/checks/db.py (outdated, resolved)

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py (outdated, resolved)

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py (outdated, resolved)

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py (outdated, resolved)

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py (resolved)

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py (resolved)

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py (outdated)

Comment on lines 88 to 91

    with self._get_cursor(database_name) as cursor:
        # Get the next row from the cursor
        next = self._get_next(cursor)
        while next:

Contributor

eric-weaver Oct 27, 2025

PyMySQL and psycopg cursors are iterable, and I suspect the same is true of the SQL Server drivers. We should be able to iterate the cursor directly and avoid the need for _get_next(). Did you try something like the following?

Suggested change

      
    - with self._get_cursor(database_name) as cursor:
    -     # Get the next row from the cursor
    -     next = self._get_next(cursor)
    -     while next:
    + with self._get_cursor(database_name) as cursor:
    +     for next in cursor:
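
To illustrate the direct-iteration pattern, here is a minimal sketch that uses the standard-library `sqlite3` driver as a stand-in. That choice is an assumption made for the example, since the PR targets Postgres, MySQL, and SQL Server drivers. A cursor that implements the iterator protocol yields one row per loop step, so no explicit fetch helper is needed. The table and column names are made up.

```python
import sqlite3
from contextlib import closing

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tables_info (schema_name TEXT, table_name TEXT)")
conn.executemany(
    "INSERT INTO tables_info VALUES (?, ?)",
    [("public", "users"), ("public", "orders")],
)

rows = []
# sqlite3 cursors are not context managers, so closing() supplies the `with` support.
with closing(conn.cursor()) as cursor:
    cursor.execute(
        "SELECT schema_name, table_name FROM tables_info ORDER BY table_name"
    )
    # Iterating the cursor fetches rows one at a time until the result set is exhausted.
    for row in cursor:
        rows.append(row)

conn.close()
```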

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py

                    self.maybe_flush(is_last_payload)
    except Exception as e:
        status = "error"
        self._log.error("Error collecting schema: %s", e)

Contributor

eric-weaver Oct 27, 2025

Worth including some stats in here? Such as how many databases/tables collected / time passed?
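
One way such stats could be gathered is sketched below: count databases and tables as they are processed, measure elapsed time with a monotonic clock, and log a single summary line at the end. The function name `collect_with_stats`, the shape of the `databases` argument, and the log message are all hypothetical and not taken from the PR.

```python
import logging
import time

log = logging.getLogger("schema_collector_sketch")


def collect_with_stats(databases):
    """Hypothetical helper: `databases` maps a database name to its table names."""
    start = time.monotonic()
    status = "success"
    database_count = 0
    table_count = 0
    try:
        for _name, tables in databases.items():
            database_count += 1
            table_count += len(tables)
    except Exception as e:
        status = "error"
        log.error("Error collecting schema: %s", e)
    finally:
        # Runs on both the success and error paths, so the summary is always logged.
        elapsed_ms = (time.monotonic() - start) * 1000
        log.info(
            "Schema collection %s: %d databases, %d tables in %.1f ms",
            status,
            database_count,
            table_count,
            elapsed_ms,
        )
    return {
        "status": status,
        "databases": database_count,
        "tables": table_count,
        "elapsed_ms": elapsed_ms,
    }


stats = collect_with_stats({"app": ["users", "orders"], "audit": ["events"]})
```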


          Feedback

b487d83

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py (outdated, resolved)

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py (outdated, resolved)

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py (resolved)

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py

    self._log.debug("Starting collection of schemas for database %s", database['name'])
    database_name = database['name']
    if not database_name:
        self._log.warning("database has no name %v", database)

Contributor

eric-weaver Oct 27, 2025

Suggested change

      
    - self._log.warning("database has no name %v", database)
    + self._log.warning("database has no name %s", database)

Strings should use %s

Contributor Author

sethsamuel Oct 27, 2025

This is the database object
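
For reference, Python's printf-style formatting has no `%v` conversion (that verb comes from Go's `fmt` package), while `%s` calls `str()` on any object, so it also works when the argument is a dict rather than a string. A small sketch, using a made-up `database` value:

```python
database = {"name": "", "owner": "dd"}

# %s converts any object with str(), so a dict argument formats without issue.
message = "database has no name %s" % (database,)

# %v is not a valid conversion in Python and raises ValueError when formatted.
try:
    "database has no name %v" % (database,)
    raised = False
except ValueError:
    raised = True
```

The `logging` module defers formatting until a handler emits the record, so with the standard handlers a bad conversion would be reported by the handler rather than raised at the call site.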

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py

        raise NotImplementedError("Subclasses must implement _get_databases")

    @abstractmethod
    def _get_cursor(self, database):

Contributor

eric-weaver Oct 27, 2025

Suggested change

      
    - def _get_cursor(self, database):
    + def _get_cursor(self, database) -> AbstractContextManager[Any]:

I think we can type this as requiring a context manager using from contextlib import AbstractContextManager
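
A minimal sketch of that annotation, assuming Python 3.9+ so that `AbstractContextManager[Any]` can be subscripted at runtime. The class names and the fake cursor are hypothetical. A generator wrapped in `contextlib.contextmanager` is used here because the object it returns is an `AbstractContextManager`.

```python
from abc import ABC, abstractmethod
from contextlib import AbstractContextManager, contextmanager
from typing import Any


class CollectorSketch(ABC):
    @abstractmethod
    def _get_cursor(self, database: str) -> AbstractContextManager[Any]:
        """Return a context manager that yields a cursor for `database`."""


class FakeCollector(CollectorSketch):
    def __init__(self) -> None:
        self.events = []

    @contextmanager
    def _cursor_for(self, database: str):
        self.events.append("open " + database)
        try:
            # Stand-in for a real driver cursor: any iterable of rows.
            yield iter([("public", "users")])
        finally:
            self.events.append("close " + database)

    def _get_cursor(self, database: str) -> AbstractContextManager[Any]:
        return self._cursor_for(database)


collector = FakeCollector()
manager = collector._get_cursor("app")
with manager as cursor:
    rows = list(cursor)
```

Typing the return value this way lets a type checker flag a subclass that returns a bare cursor where the shared code expects to use it in a `with` statement.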

sethsamuel added 3 commits

October 27, 2025 12:34


          Feedback


          Lint

3d7e7d2


          Refactor health

9abc674

datadog-agent-integrations-bot bot added the integration/postgres label

sethsamuel added 3 commits

October 27, 2025 13:02

Fix

de655f6


          Merge branch 'master' into seth.samuel/DBMON-5799-create-shared-schema-collector

cf247b5


          Merge branch 'master' into seth.samuel/DBMON-5799-create-shared-schema-collector

2a65b0a

Labels

agent/review-requested base_package ecosystems/review-requested integration/postgres product/review-requested team/agent-integrations team/database-monitoring-agent