This repository was archived by the owner on Sep 9, 2024. It is now read-only.

Run script problem #1

@sdpkjc

The run command given in the README is:

cd reincarnating_rl
python -um reincarnating_rl.train \
  --agent qdagger_rainbow \
  --gin_files reincarnating_rl/configs/qdagger_rainbow.gin
  --base_dir /tmp/qdagger_rainbow \
  --teacher_checkpoint_dir $TEACHER_CKPT_DIR/Breakout/1 \
  --teacher_checkpoint_number 399
  --run_number=1 \
  --atari_roms_path=/tmp/atari_roms \
  --alsologtostderr

This command has two problems:

  • The --atari_roms_path flag is not found.
  • The lines --gin_files reincarnating_rl/configs/qdagger_rainbow.gin and --teacher_checkpoint_number 399 are missing the trailing line-continuation backslash (\).
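
The missing backslashes matter because the shell ends a command at an unescaped newline, so every option after the break is treated as a separate command. As a rough illustration, the sketch below flags such lines in a multi-line command. The helper name `missing_continuations` and its heuristic (a line followed by a line starting with `--` should end in a backslash) are assumptions made for this example, not part of the repository.

```python
def missing_continuations(command: str) -> list[int]:
    """Return 1-based numbers of lines that lack a trailing backslash
    even though the next line looks like another option (starts with '--')."""
    lines = command.splitlines()
    missing = []
    for i, line in enumerate(lines[:-1]):
        next_line = lines[i + 1].strip()
        if next_line.startswith("--") and not line.rstrip().endswith("\\"):
            missing.append(i + 1)
    return missing


# The command as it appears in the README (backslashes doubled for Python).
README_COMMAND = """\
python -um reincarnating_rl.train \\
  --agent qdagger_rainbow \\
  --gin_files reincarnating_rl/configs/qdagger_rainbow.gin
  --base_dir /tmp/qdagger_rainbow \\
  --teacher_checkpoint_dir $TEACHER_CKPT_DIR/Breakout/1 \\
  --teacher_checkpoint_number 399
  --run_number=1 \\
  --atari_roms_path=/tmp/atari_roms \\
  --alsologtostderr
"""

print(missing_continuations(README_COMMAND))  # -> [3, 6]
```

Lines 3 and 6 of the command are the --gin_files and --teacher_checkpoint_number lines listed above.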

I initially modified the command as follows:

cd reincarnating_rl
python -um reincarnating_rl.train \
  --agent qdagger_rainbow \
  --gin_files reincarnating_rl/configs/qdagger_rainbow.gin \
  --base_dir /tmp/qdagger_rainbow \
  --teacher_checkpoint_dir $TEACHER_CKPT_DIR/Breakout/1 \
  --teacher_checkpoint_number 399 \
  --run_number=1 \
  --alsologtostderr

However, the modified command still fails with the error below. I would appreciate your help. 🙏

2023-03-07 14:27:23.087431: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-03-07 14:27:23.178229: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/cv2/../../lib64:
2023-03-07 14:27:23.178255: I tensorflow/compiler/xla/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/tensorflow/python/framework/dtypes.py:246: DeprecationWarning: `np.bool8` is a deprecated alias for `np.bool_`.  (Deprecated NumPy 1.24)
  np.bool8: (False, True),
2023-03-07 14:27:23.629774: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/cv2/../../lib64:
2023-03-07 14:27:23.629827: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/cv2/../../lib64:
2023-03-07 14:27:23.629835: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/tf/__init__.py:48: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
  if (distutils.version.LooseVersion(tf.__version__) <
/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/jax/_src/deprecations.py:51: DeprecationWarning: jax.interpreters.xla.DeviceArray is deprecated. Use jax.Array instead.
  warnings.warn(message, DeprecationWarning)
/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/jax/_src/api_util.py:240: SyntaxWarning: Jitted function has invalid argnames {'distill_temperature', 'distill_type', 'distill_loss_coefficient', 'distill_best_action_only'} in static_argnames. Function does not take these args.This warning will be replaced by an error after 2022-08-20 at the earliest.
  warnings.warn(f"Jitted function has invalid argnames {invalid_argnames} "
2023-03-07 14:27:24.751736: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/cv2/../../lib64:
2023-03-07 14:27:24.751770: W tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:265] failed call to cuInit: UNKNOWN ERROR (303)
2023-03-07 14:27:24.751785: I tensorflow/compiler/xla/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (sdpkjc008-B460MAORUSPRO): /proc/driver/nvidia/version does not exist
I0307 14:27:24.751885 140636716074048 train.py:126] Setting random seed: 1
Traceback (most recent call last):
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/sdpkjc008/dev/reincarnating_rl/reincarnating_rl/train.py", line 174, in <module>
    app.run(main)
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/home/sdpkjc008/dev/reincarnating_rl/reincarnating_rl/train.py", line 151, in main
    base_run_experiment.load_gin_configs(gin_files, gin_bindings)
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/dopamine/discrete_domains/run_experiment.py", line 55, in load_gin_configs
    gin.parse_config_files_and_bindings(gin_files,
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/config.py", line 2497, in parse_config_files_and_bindings
    includes_and_imports = parse_config_file(config_file, skip_unknown)
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/config.py", line 2450, in parse_config_file
    includes, imports = parse_config(f, skip_unknown=skip_unknown)
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/config.py", line 2322, in parse_config
    for statement in parser:
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/config_parser.py", line 211, in __next__
    statement = self.parse_statement()
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/config_parser.py", line 240, in parse_statement
    value = self.parse_value()
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/config_parser.py", line 280, in parse_value
    success, value = parser()
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/config_parser.py", line 558, in _maybe_parse_configurable_reference
    reference = self._delegate.configurable_reference(scoped_name, evaluate)
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/utils.py", line 60, in try_with_location
    augment_exception_message_and_reraise(exception, _format_location(location))
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
    raise proxy.with_traceback(exception.__traceback__) from None
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/utils.py", line 56, in try_with_location
    yield
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/config_parser.py", line 558, in _maybe_parse_configurable_reference
    reference = self._delegate.configurable_reference(scoped_name, evaluate)
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/config.py", line 844, in configurable_reference
    return ConfigurableReference(scoped_selector, evaluate)
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/config.py", line 691, in __init__
    self.initialize()
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/config.py", line 697, in initialize
    _raise_unknown_reference_error(self)
  File "/home/sdpkjc008/anaconda3/envs/rrl/lib/python3.9/site-packages/gin/config.py", line 644, in _raise_unknown_reference_error
    raise ValueError(err_str.format(ref.selector, maybe_parens, additional_msg))
ValueError: No configurable matching reference '@loss_helpers.persistence_linearly_decaying_epsilon'.
  In file "reincarnating_rl/configs/qdagger_rainbow.gin", line 27
    JaxFullRainbowAgent.epsilon_fn = @loss_helpers.persistence_linearly_decaying_epsilon
                                     ^
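
Gin raises this ValueError when a config file contains an `@...` reference to a configurable that has not been registered by the time the file is parsed. To see which references a config expects, a rough stdlib-only sketch like the following can list them. The helper name `list_gin_references` and the regular expression are assumptions for illustration and are not part of gin. The sample line is the one quoted in the error above.

```python
import re

# Rough pattern for gin references such as "@module.fn" or "@scope/module.fn()".
# This is an approximation, not gin's real grammar.
_REFERENCE = re.compile(r"@([\w./]+)")


def list_gin_references(config_text: str) -> list[str]:
    """Return the distinct '@' references found in gin config text, in order."""
    refs = []
    for line in config_text.splitlines():
        # Drop trailing comments (naive: does not handle '#' inside strings).
        code = line.split("#", 1)[0]
        for name in _REFERENCE.findall(code):
            if name not in refs:
                refs.append(name)
    return refs


SAMPLE = (
    "JaxFullRainbowAgent.epsilon_fn = "
    "@loss_helpers.persistence_linearly_decaying_epsilon\n"
)

print(list_gin_references(SAMPLE))
# -> ['loss_helpers.persistence_linearly_decaying_epsilon']
```

Each listed name can then be compared against the functions the code actually registers with gin. Whether the reference here is misnamed or its module is simply not imported is not something this sketch can determine.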

My system info:

  • Ubuntu 22.04
  • No GPU, no CUDA
  • Python 3.9.16
  • Dependencies from requirements.txt installed
  • All checkpoint files have been downloaded and placed in a directory named DQN_400 at the same level as the README file
