The ISE group has three servers with GPU facilities that we use for teaching and research. There is talk of moving to a hosted shared facility, and to estimate capacity requirements we would like to monitor current usage. As a first step, we want to log the usage of the GPU instances on these servers to an SQLite database.
We would like to build a tool that periodically reads the output of the nvidia-smi tool and records it in a log. Ideally, the PIDs reported by nvidia-smi should also be used to look up information about the corresponding running processes (see: https://github.com/giampaolo/psutil) and record additional details that could be useful.
For example:
```
nvidia-smi --query-compute-apps=pid,used_memory --format=csv
pid, used_gpu_memory [MiB]
/usr/local/bin/python -m ipykernel_launcher -f /root/.local/share/jupyter/runtime/kernel-17748908-32ab-4310-9149-6f75784a799d.json, 1711 MiB
```
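
To make the intended tool more concrete, here is a rough sketch of the polling and enrichment step in Python. It only covers reading nvidia-smi and looking each PID up with psutil; writing to SQLite is left out because the schema is one of the open questions below, and the 60-second interval, the `noheader,nounits` format options and the field selection are assumptions rather than decisions.

```python
#!/usr/bin/env python3
"""Sketch: poll nvidia-smi for compute processes and enrich each PID via psutil."""
import csv
import io
import subprocess
import time

import psutil

POLL_INTERVAL = 60  # seconds; the actual frequency is still an open question


def sample():
    """Run nvidia-smi once and return one record per GPU compute process."""
    out = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid,used_memory",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    records = []
    for row in csv.reader(io.StringIO(out)):
        if not row:
            continue
        pid, used_mib = int(row[0]), int(row[1])
        try:
            proc = psutil.Process(pid)          # look up the owning process
            user = proc.username()
            cmdline = " ".join(proc.cmdline())
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            user, cmdline = None, None          # process exited or not readable
        records.append({
            "ts": time.strftime("%Y-%m-%dT%H:%M:%S"),
            "pid": pid,
            "used_mib": used_mib,
            "user": user,
            "cmdline": cmdline,
        })
    return records


if __name__ == "__main__":
    while True:
        for rec in sample():
            print(rec)                          # placeholder for the SQLite insert
        time.sleep(POLL_INTERVAL)
```

Requesting `nounits` makes the memory column a plain integer, which is easier to store and aggregate later.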
Open Questions

- What frequency to do the queries at?
- Which fields to query? (depends on what is available)
- What does the log database schema look like? (bonus points for doing it in RDF… ;-) A possible plain-SQLite starting point is sketched after this list.
- Which format to use for the queries from the nvidia-smi tool: CSV or XML?
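
On the schema question, one possible (non-RDF) starting point is sketched below: a table for the relatively static process information from psutil and a table for the per-poll readings from nvidia-smi. The table and column names are placeholders, and the host/GPU columns assume we extend the query above to also report which server and GPU each sample came from.

```python
import sqlite3

# One possible layout: a 'process' table for the (mostly static) psutil data
# and a 'sample' table for the per-poll nvidia-smi readings. Names and columns
# are placeholders; the real schema depends on which fields we decide to query.
SCHEMA = """
CREATE TABLE IF NOT EXISTS process (
    pid        INTEGER,
    started    TEXT,        -- process creation time, to disambiguate reused PIDs
    username   TEXT,
    cmdline    TEXT,
    PRIMARY KEY (pid, started)
);
CREATE TABLE IF NOT EXISTS sample (
    ts         TEXT,        -- time the nvidia-smi query was run
    hostname   TEXT,        -- which of the three servers
    gpu_uuid   TEXT,        -- which GPU on that server
    pid        INTEGER,
    started    TEXT,
    used_mib   INTEGER,     -- GPU memory used by the process
    FOREIGN KEY (pid, started) REFERENCES process (pid, started)
);
"""

conn = sqlite3.connect("gpu_usage.sqlite")
conn.executescript(SCHEMA)
conn.commit()
```

Keying the process table on (pid, started) rather than the PID alone guards against PID reuse between samples.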