This repository was archived by the owner on Jan 24, 2025. It is now read-only.

MongoDB Document Structure

astreylabs edited this page Jun 28, 2014 · 2 revisions


  • path - full path to the processed file
  • filename - name of the processed file, without its directory
  • text - text extracted from the file
  • mime_type - not implemented yet, so it does not appear in stored documents
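
As a rough illustration of the fields listed above, the following Python sketch assembles a document of that shape. The function name build_document is hypothetical and not taken from the project; it assumes the caller already has the full path and the extracted text, and it leaves out mime_type because that field is not implemented yet.

```python
import os


def build_document(path, text):
    """Return a dict shaped like the document structure described above.

    Hypothetical helper for illustration only; `path` is assumed to
    already be the full path to the processed file.
    """
    return {
        "path": path,                          # full path to the processed file
        "filename": os.path.basename(path),    # file name without its directory
        "text": text,                          # text extracted from the file
        # "mime_type" is omitted: not implemented yet
    }


doc = build_document("/home/user/data-dir/Stat-Syllabus.pdf", "Professor: Chad Sparber")
```

MongoDB adds the _id field on insert when a document does not supply one, which is why it is not set here.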

MongoDB Document Sample

{ "_id" : ObjectId("53aedfcbce499d425a272fe0"), "path" : "/home/user/data-dir/Stat-Syllabus.pdf", "filename" : "Stat-Syllabus.pdf", "text" : "Professor: Chad SparberOffice:233 Persson HallOffice Hours:Thursday: 4:00 - 4:50Friday: 8:00 - 9:50Office Phone:(315) 228..." }
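
A document like the sample above could be looked up by its filename field. The sketch below is a minimal example under assumptions: find_by_filename is a hypothetical helper, and the database and collection names in the comment are placeholders, not the project's real names. It relies only on the find_one method that pymongo collections provide.

```python
def find_by_filename(collection, filename):
    """Return the first stored document whose filename matches, or None.

    `collection` is expected to be a pymongo Collection (or any object
    with a compatible find_one method).
    """
    return collection.find_one({"filename": filename})


# Possible usage with pymongo (database and collection names are assumptions):
#
#   from pymongo import MongoClient
#   collection = MongoClient()["docs_db"]["documents"]
#   doc = find_by_filename(collection, "Stat-Syllabus.pdf")
#   if doc is not None:
#       print(doc["path"])
```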
