Description
While working on #9573 we found that FileTypeRouter raises a FileNotFoundError only when a non-existent file is passed with the meta parameter. Without meta, the component does not raise an error and still classifies the file based on its extension. This behavior is inconsistent.
The cause lies in the implementation: when metadata is provided, we internally convert file paths to ByteStream objects so that the metadata can be attached to them, and that conversion reads the file. When there is no metadata, file paths are not converted, so the file is never accessed.
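The asymmetry can be illustrated with a minimal, self-contained sketch. This is not the Haystack implementation: `classify_sources` is a hypothetical stand-in, a plain dict stands in for a ByteStream, and the standard-library `mimetypes` module is assumed for the extension-based lookup.

```python
import mimetypes
from pathlib import Path
from typing import Any, Dict, List, Optional, Union


def classify_sources(
    sources: List[Union[str, Path]],
    meta: Optional[Dict[str, Any]] = None,
) -> Dict[str, list]:
    """Hypothetical stand-in for the routing logic, not the actual Haystack code."""
    routed: Dict[str, list] = {}
    for source in sources:
        path = Path(source)
        # The MIME type is guessed from the extension alone; the file is never opened.
        mime_type = mimetypes.guess_type(path.name)[0] or "unclassified"
        if meta is not None:
            # Attaching metadata requires reading the file's bytes,
            # so a missing file raises FileNotFoundError here.
            item: Any = {"data": path.read_bytes(), "meta": dict(meta)}
        else:
            # Without metadata the path is passed through untouched,
            # so a missing file goes unnoticed.
            item = path
        routed.setdefault(mime_type, []).append(item)
    return routed
```

Only the branch that reads the file can notice that it is missing, which is why the error depends on whether `meta` is passed.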
To handle this consistently, one option is to raise a FileNotFoundError when there is no metadata as well. That is a breaking change. We could additionally introduce a raise_on_failure=False parameter. It would still be a breaking change for the case without metadata, but users could at least opt in to the previous behavior.
I suggest adding a deprecation warning now, waiting for the next release, and then enforcing consistent behavior in the release after that.
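One possible shape for the deprecation step is sketched below. The helper name `check_sources_exist`, its return value, and the warning text are assumptions made for illustration and are not part of Haystack. The sketch also assumes `raise_on_failure` defaults to `False` during the deprecation period and handles only path-like sources.

```python
import warnings
from pathlib import Path
from typing import List, Union


def check_sources_exist(
    sources: List[Union[str, Path]],
    raise_on_failure: bool = False,
) -> List[Path]:
    """Hypothetical helper: warn about (or reject) paths that do not exist."""
    missing = [Path(s) for s in sources if not Path(s).is_file()]
    if missing and raise_on_failure:
        # Target behavior: fail regardless of whether metadata was passed.
        raise FileNotFoundError(f"No such file or directory: '{missing[0]}'")
    if missing:
        # Transitional behavior: keep routing by extension, but announce the change.
        warnings.warn(
            f"Sources not found: {[str(p) for p in missing]}. "
            "A future release will raise FileNotFoundError in this case.",
            DeprecationWarning,
            stacklevel=2,
        )
    return missing
```

Calling such a check at the start of `run`, before the metadata branch, would make both code paths behave the same way.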
from haystack.components.routers import FileTypeRouter
router = FileTypeRouter(mime_types=[r'text/plain'])
# No meta - does not raise error
router.run(sources=["non_existent.txt"])
# → {'text/plain': [PosixPath('non_existent.txt')]}
# With meta - raises FileNotFoundError
router.run(sources=["non_existent.txt"], meta={"spam": "eggs"})
# → FileNotFoundError: [Errno 2] No such file or directory: 'non_existent.txt'