Balance sheet data is collected from public sources like Google and Zalo. The CTGAN model generates synthetic tabular images to enhance dataset diversity. This enriched data trains the TATR model for table detection and structure recognition.

Bang-tv259/Demo-TableExtraction

Demo Table Extraction

Balance sheet data from various companies is first collected from publicly available sources such as Google and Zalo. To diversify the dataset, a CTGAN model generates synthetic tabular images that replicate the structure of the original data. The enriched dataset is then used to train the TATR (Table Transformer) model, which performs two tasks on input images: table detection and table structure recognition.

1. Table Generation

Technology: Python, CTGAN

GenTable
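
The generation step can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it assumes the open-source `ctgan` package and `pandas` are installed, and the sample rows, column names, and the helpers `infer_discrete_columns` and `sample_synthetic_rows` are hypothetical. CTGAN needs to be told which columns are categorical, so the first helper infers them from the data.

```python
from numbers import Number

# Hypothetical sample rows; the column names are illustrative, not from the repository.
ROWS = [
    {"company": "A", "item": "Cash", "amount": 1200.0},
    {"company": "A", "item": "Inventory", "amount": 830.5},
    {"company": "B", "item": "Cash", "amount": 410.0},
    {"company": "B", "item": "Receivables", "amount": 95.25},
]


def infer_discrete_columns(rows):
    """Return the columns whose values are not all numeric (treated as categorical)."""
    columns = list(rows[0])
    return [
        c for c in columns
        if not all(
            isinstance(r[c], Number) and not isinstance(r[c], bool) for r in rows
        )
    ]


def sample_synthetic_rows(rows, n, epochs=10):
    """Fit CTGAN on `rows` and return `n` synthetic rows as a pandas DataFrame."""
    # Imported lazily so the helper above works without these packages installed.
    import pandas as pd
    from ctgan import CTGAN

    data = pd.DataFrame(rows)
    model = CTGAN(epochs=epochs)
    model.fit(data, discrete_columns=infer_discrete_columns(rows))
    return model.sample(n)
```

Calling `sample_synthetic_rows(ROWS, 100)` would return 100 synthetic rows with the same columns. A real run would need far more training rows than this toy sample, and rendering the sampled rows into table images is a separate step not shown here.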

2. Table Extraction

Technology: Python, TATR

ExtractionTable
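
For structure recognition, TATR predicts bounding boxes for table rows and columns, and a common post-processing step is to intersect them into a grid of cells. The sketch below shows only that step under simplifying assumptions: boxes are `(x0, y0, x1, y1)` tuples in pixel coordinates, there are no spanning cells, and the function name and example boxes are hypothetical rather than taken from the repository.

```python
def cells_from_rows_and_columns(row_boxes, column_boxes):
    """Intersect row and column boxes (x0, y0, x1, y1) into a grid of cell boxes."""
    rows = sorted(row_boxes, key=lambda b: b[1])     # order rows top to bottom
    cols = sorted(column_boxes, key=lambda b: b[0])  # order columns left to right
    # Each cell takes its horizontal extent from the column
    # and its vertical extent from the row.
    return [[(c[0], r[1], c[2], r[3]) for c in cols] for r in rows]


# Hypothetical detections, deliberately given out of order.
example_rows = [(0, 40, 300, 80), (0, 0, 300, 40)]
example_cols = [(200, 0, 300, 80), (0, 0, 200, 80)]
grid = cells_from_rows_and_columns(example_rows, example_cols)
```

Here `grid[i][j]` is the box of the cell in row `i` and column `j`, which can then be cropped and passed to an OCR step to read the cell text.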
