
ingest-file

ingest-file extracts useful information from documents of different types into a structured, standard format. It retains folder structures across directories, compressed archives and emails. The extracted data is formatted as Follow the Money (FtM) entities, ready for import into OpenAleph or for processing as an object graph.

Documentation

https://openaleph.org/docs/lib/ingest-file

Development environment

For local development, use poetry:

poetry install --with dev --all-extras

pre-commit

pre-commit install

Release procedure

# on main branch
git pull --rebase
make build
make test
poetry run bump2version {patch,minor,major} # pick the appropriate one
git push
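
The steps above can be wrapped in a small shell function so that a mistyped version part is caught before anything is pulled or built. This is only a sketch: the names `valid_bump_part` and `release` are hypothetical and not part of the repository, and the function simply chains the documented commands, stopping at the first failure.

```shell
#!/bin/sh
# Hypothetical helper (not part of ingest-file): succeeds only for the three
# version parts named in the release procedure.
valid_bump_part() {
    case "$1" in
        patch|minor|major) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper around the documented release steps.
# Assumes it is run on the main branch from the repository root.
release() {
    part="$1"
    if ! valid_bump_part "$part"; then
        echo "usage: release {patch|minor|major}" >&2
        return 2
    fi
    # Each step runs only if the previous one succeeded.
    git pull --rebase &&
        make build &&
        make test &&
        poetry run bump2version "$part" &&
        git push
}
```

After sourcing the file, `release patch` would run the full sequence, while `release foo` returns early with a usage message.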

Usage

Ingestors are usually called in the context of Aleph. To run them stand-alone, you can use the supplied docker compose environment. To enter a working container, run:

make build
make shell

Inside the shell, you will find the ingestors command-line tool. During development, it is convenient to call its debug mode using files present in the user's home directory, which is mounted at /host:

ingestors debug /host/Documents/sample.xlsx
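
Because the home directory appears under /host inside the container, a file path on the host has to be rewritten before it is passed to the debug command. The helper below is a minimal sketch of that translation, meant to be run on the host. The function name `to_container_path` is hypothetical, and it assumes the mount maps `$HOME` directly onto `/host` with no extra path components.

```shell
#!/bin/sh
# Hypothetical helper (not part of ingest-file): translate a path under $HOME
# on the host into the matching path under /host inside the container.
# Assumes $HOME is mounted directly at /host.
to_container_path() {
    case "$1" in
        "$HOME"/*)
            # Strip the "$HOME/" prefix and re-root the remainder at /host.
            rel=${1#"$HOME"/}
            printf '/host/%s\n' "$rel"
            ;;
        *)
            echo "not under \$HOME: $1" >&2
            return 1
            ;;
    esac
}

# Example (on the host):
#   to_container_path "$HOME/Documents/sample.xlsx"
# The printed path can then be passed to `ingestors debug` inside the container.
```

Paths outside the home directory are rejected, since under the stated assumption they are not visible in the container.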

License

As of release version 3.18.4, ingest-file is licensed under the AGPLv3 or later. Previous versions were released under the MIT license.
