Don't rely on file endings to deduce file content types #12

@kschulst

Currently, when deducing file types, the following is used:
FileTypes.probeContentType

This in turn delegates to the JDK's standard Files content type probing, which is fairly primitive. For example, it relies on files having a suffix.

Instead, we should use the TikaLiteFileTypeDetector util to determine the file content type, which should be more capable.
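
To illustrate the difference between suffix-based and content-based detection, here is a minimal sketch in plain Java. It is not the TikaLiteFileTypeDetector implementation and does not use Tika. The class name `ContentSniffSketch`, the `sniff` method and the small signature table are hypothetical and only show the idea of inspecting a file's leading bytes instead of its name. The value returned by `Files.probeContentType` is platform-dependent, so the sketch prints it without assuming a particular result.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class ContentSniffSketch {

    // Leading bytes ("magic numbers") of a few well-known formats.
    // A real detector such as Tika knows far more signatures than this.
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] PDF = {'%', 'P', 'D', 'F', '-'};
    private static final byte[] GZIP = {0x1F, (byte) 0x8B};

    // Hypothetical helper: guess a media type from the first bytes of the file,
    // ignoring the file name entirely.
    static String sniff(Path file) throws IOException {
        byte[] head = new byte[8];
        int read;
        try (InputStream in = Files.newInputStream(file)) {
            read = in.readNBytes(head, 0, head.length);
        }
        if (startsWith(head, read, PNG)) return "image/png";
        if (startsWith(head, read, PDF)) return "application/pdf";
        if (startsWith(head, read, GZIP)) return "application/gzip";
        return "application/octet-stream"; // unknown content
    }

    // True if the first `length` valid bytes of `data` begin with `prefix`.
    private static boolean startsWith(byte[] data, int length, byte[] prefix) {
        if (length < prefix.length) return false;
        return Arrays.equals(data, 0, prefix.length, prefix, 0, prefix.length);
    }

    public static void main(String[] args) throws IOException {
        // An empty suffix yields a temp file name without any extension.
        Path file = Files.createTempFile("upload", "");
        try {
            Files.write(file, PNG);
            // Name-based probing: result depends on the platform and may be null.
            System.out.println("probeContentType: " + Files.probeContentType(file));
            // Content-based sniffing: independent of the file name.
            System.out.println("sniff:            " + sniff(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The point of the sketch is only that a content-based detector still works when a file has no suffix or a misleading one, which is the situation this issue is about.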

While we're at it, we should upgrade Tika to the latest major version. We're on 1.x, and the latest is 2.x (which has some breaking changes).
