clamsproject/app-heuristic-chyron-understanding

Heuristic Chyron Understanding

Description

Prototype app that converts chyron text from the MMIF output of docTR, Tesseract, or LLaVA into a name and a list of attributes.
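To make the name/attributes split concrete, here is a minimal sketch of one possible heuristic. It assumes the first non-empty line of the chyron text is the name and that every following line is an attribute; this rule and the `split_chyron` helper are hypothetical and are not taken from the app's actual code. Only the `name-as-written` and `attributes` field names come from this README.

```python
from typing import Dict, List, Union


def split_chyron(ocr_text: str) -> Dict[str, Union[str, List[str]]]:
    """Split raw chyron text into a name and a list of attributes.

    Assumption (illustrative only): the first non-empty line is the name,
    and each remaining non-empty line is one attribute.
    """
    # Drop blank lines and surrounding whitespace left over from OCR.
    lines = [line.strip() for line in ocr_text.splitlines() if line.strip()]
    if not lines:
        return {"name-as-written": "", "attributes": []}
    return {"name-as-written": lines[0], "attributes": lines[1:]}


# Hypothetical chyron text, not real app input.
result = split_chyron("JANE DOE\n\nSenator\n  State of Example ")
```

Real chyrons vary in layout, so the app's own heuristics may differ substantially from this first-line rule.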

User instructions

General user instructions for CLAMS apps are available at CLAMS Apps documentation.

System requirements

This app requires Python 3.8 or higher. The Python modules needed for a local installation are listed in requirements.txt.

Configurable runtime parameters

For the full list of parameters, please refer to the app metadata from the CLAMS App Directory or the metadata.py file in this repository.

Input and output details

The app takes a MMIF file with TextDocument annotations generated by an OCR app such as doctr-wrapper or Tesseract.

For each input annotation, the app outputs a corresponding TextDocument annotation. Its text is an escaped JSON string that splits the original text into name-as-written and attributes fields; optionally, it may also include a name-normalized field. Each output annotation also contains identifying information for the new annotation, the source annotation, and the source VideoDocument. For more details, see the output section of the app metadata.
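Because the structured result is stored as a JSON string inside the annotation text, a consumer has to decode it a second time after reading the MMIF file. The sketch below shows that step using only the standard library. The `parse_chyron_text` helper and the sample values are hypothetical, and locating the text value within the MMIF structure is not shown because this README does not describe that layout; only the three field names come from the description above.

```python
import json
from typing import Any, Dict


def parse_chyron_text(text_value: str) -> Dict[str, Any]:
    """Decode the JSON string carried in an output TextDocument's text.

    Assumption: text_value is the already-extracted text of one output
    annotation. name-normalized is optional, so it defaults to None.
    """
    record = json.loads(text_value)
    return {
        "name_as_written": record.get("name-as-written", ""),
        "name_normalized": record.get("name-normalized"),
        "attributes": record.get("attributes", []),
    }


# Hypothetical sample value standing in for one annotation's text.
sample_text = json.dumps(
    {"name-as-written": "JANE DOE", "attributes": ["Senator", "State of Example"]}
)
parsed = parse_chyron_text(sample_text)
```

Using `.get` with defaults keeps the parser tolerant of outputs that omit the optional name-normalized field.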
