Metadata Extractors Breakdown

Alex Birkett edited this page Oct 15, 2024 · 7 revisions

This document describes the metadata extractors used to handle different file types in archaeological data management. For each extractor it gives the purpose, how it works, and the fields collected.


Table of Contents

  1. Project Metadata Extractor
  2. Folder Level Metadata Extractor
  3. Other Metadata Extractor
  4. Geophysics Metadata Extractor
  5. Geospatial Metadata Extractor
  6. Image Metadata Extractor
  7. Control Point Metadata Extractor
  8. Folder Tree Creation

1. Project Metadata Extractor

  • Purpose: To gather and structure project-level metadata.

How It Works:

  1. Prompt User for Input:

    • Opens a Tkinter GUI window.
    • Displays fields such as Title, Description, Subject, etc.
    • Each field corresponds to a user input box for text entry.
  2. Collect Input:

    • On clicking 'OK', collects data from each text box and stores it in a dictionary called project_metadata.
  3. Return Metadata:

    • Constructs an XML element <Project_Level> containing sub-elements for each field with corresponding user input as text.
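
Steps 2 and 3 can be sketched as follows. This is a minimal illustration rather than the extractor's actual code: the Tkinter window is left out, a plain dictionary stands in for the values read from the text boxes, and the names `PROJECT_FIELDS` and `build_project_element` are hypothetical. The field names are the ones listed under "Fields to Collect" in this section.

```python
import xml.etree.ElementTree as ET

# Field names as listed in this section's "Fields to Collect" table.
PROJECT_FIELDS = [
    "PROJECT_TITLE", "PROJECT_DESCRIPTION", "PROJECT_SUBJECT",
    "PROJECT_COVERAGE", "PROJECT_PCS", "PROJECT_GCS",
    "PROJECT_CREATORS", "PROJECT_PUBLISHER", "PROJECT_CONTRIBUTORS",
    "PROJECT_PROJECTID", "PROJECT_DATES", "PROJECT_COPYRIGHT",
]


def build_project_element(project_metadata):
    """Return a <Project_Level> element with one sub-element per field."""
    root = ET.Element("Project_Level")
    for field in PROJECT_FIELDS:
        child = ET.SubElement(root, field)
        # Fields the user left blank become empty elements (an assumption).
        child.text = project_metadata.get(field, "")
    return root


# Stand-in for the values collected from the GUI text boxes.
project_metadata = {
    "PROJECT_TITLE": "Example Site 2024",
    "PROJECT_PCS": "EPSG:27700",
}
element = build_project_element(project_metadata)
print(ET.tostring(element, encoding="unicode"))
```

Keeping the XML construction separate from the GUI means it can be exercised without opening a window.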

Fields to Collect:

| Metadata Element | Field Name | General Description |
| --- | --- | --- |
| Project Title | PROJECT_TITLE | The title (and any alternatives such as site codes) for the dataset. |
| Description | PROJECT_DESCRIPTION | A brief summary of the research project's main aims and objectives and of the dataset content. |
| Subject | PROJECT_SUBJECT | Keywords for the subject content of the dataset, using controlled terms from FISH. |
| Coverage | PROJECT_COVERAGE | Spatial and temporal coverage details, including site coordinates and postal codes. |
| Projection System | PROJECT_PCS | The Projected Coordinate System (PCS) used. |
| Coordinate System | PROJECT_GCS | The Geographic Coordinate System (GCS) used. |
| Creators | PROJECT_CREATORS | Details of the creator(s) and organisations responsible for data collection. |
| Publisher | PROJECT_PUBLISHER | Information about any organisation that published the data. |
| Contributors | PROJECT_CONTRIBUTORS | Other individuals or organisations contributing to the resource. |
| Identifiers | PROJECT_PROJECTID | Project or reference numbers identifying the dataset. |
| Dates | PROJECT_DATES | Dates related to dataset creation and archaeological project activities. |
| Copyright | PROJECT_COPYRIGHT | Name of the copyright holder for the dataset. |

2. Folder Level Metadata Extractor

  • Purpose: To extract metadata at the folder level, particularly for the "3D_Recording" folder and its subfolders.

How It Works:

  1. Initialize Metadata Dictionary:

    • Creates an empty dictionary called metadata to store folder-level details.
    • Extracts the folder name using os.path.basename(folder_path).
  2. Populate Metadata:

    • Updates the dictionary with the following keys, set as empty strings (manual entry required) unless a default is noted:
      • FOLDER_SUBJECT: Placeholder for keywords related to the subject of the folder.
      • FOLDER_ACCURACY: Placeholder for intended accuracy information.
      • FOLDER_COVERAGE: Placeholder for coverage details of the folder contents.
      • FOLDER_PCS: Placeholder for the Projected Coordinate System (PCS).
      • FOLDER_RELATIONS: Placeholder for folder-level relations or references.
      • FOLDER_LANGUAGE: Defaults to 'English'.
      • FOLDER_TYPE: Placeholder for resource type (e.g., primary data, processed data).
      • FOLDER_FORMAT: Placeholder for the format of files within the folder (e.g., AutoCAD, 3D Model).
  3. Process Folder Metadata:

    • Iterates over all subfolders within the specified starting directory.
    • Specifically looks for the "3D_Recording" folder and processes its subfolders to extract and populate folder-level metadata.
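
The steps above can be sketched roughly as below. The function names are hypothetical, the keys use the FOLDER_ prefix from this section's "Fields Extracted" table, and the sketch assumes that only the immediate subfolders of each "3D_Recording" folder are processed; the real extractor may descend further.

```python
import os


def init_folder_metadata(folder_path):
    """Return a folder-level dictionary with blank placeholders."""
    folder_path = os.path.normpath(folder_path)
    return {
        "FOLDER_NAME": os.path.basename(folder_path),
        "FOLDER_PATH": folder_path,
        "FOLDER_SUBJECT": "",        # manual entry required
        "FOLDER_ACCURACY": "",       # manual entry required
        "FOLDER_COVERAGE": "",       # manual entry required
        "FOLDER_PCS": "",            # manual entry required
        "FOLDER_RELATIONS": "",      # manual entry required
        "FOLDER_LANGUAGE": "English",
        "FOLDER_TYPE": "",           # manual entry required
        "FOLDER_FORMAT": "",         # manual entry required
    }


def process_3d_recording(start_dir, target="3D_Recording"):
    """Collect metadata for the subfolders of every folder named `target`."""
    results = []
    for dirpath, dirnames, _ in os.walk(start_dir):
        if os.path.basename(dirpath) == target:
            for name in sorted(dirnames):
                results.append(init_folder_metadata(os.path.join(dirpath, name)))
            # Assumption: do not descend below the immediate subfolders.
            dirnames[:] = []
    return results
```

Calling `process_3d_recording("/path/to/project")` returns one dictionary per subfolder, ready for the manual fields to be filled in.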

Fields Extracted:

| Metadata Element | Field Name | General Description |
| --- | --- | --- |
| FOLDER_PROJECTID | FOLDER_PROJECTID | Placeholder for project ID (manual entry required). |
| FOLDER_NAME | FOLDER_NAME | Name of the folder without the extension. |
| FOLDER_PATH | FOLDER_PATH | Directory path of the folder. |
| FOLDER_DESCRIPTION | FOLDER_DESCRIPTION | Description of the folder contents. |
| FOLDER_SUBJECT | FOLDER_SUBJECT | Placeholder for keywords related to the subject of the folder. |
| FOLDER_ACCURACY | FOLDER_ACCURACY | Placeholder for intended accuracy information. |
| FOLDER_COVERAGE | FOLDER_COVERAGE | Placeholder for coverage details of the folder contents. |
| FOLDER_PCS | FOLDER_PCS | Placeholder for the Projected Coordinate System (PCS). |
| FOLDER_GCS | FOLDER_GCS | Placeholder for the Geographic Coordinate System (GCS). |
| FOLDER_CREATORS | FOLDER_CREATORS | Details of the creator(s) responsible for the folder's contents. |
| FOLDER_PUBLISHER | FOLDER_PUBLISHER | Information about any organisation that published the folder's contents. |
| FOLDER_CONTRIBUTORS | FOLDER_CONTRIBUTORS | Other individuals or organisations contributing to the folder's contents. |
| FOLDER_FOLDERID | FOLDER_FOLDERID | Project or reference numbers identifying the folder. |
| FOLDER_DATES | FOLDER_DATES | Dates indicating when the folder was created or modified. |
| FOLDER_COPYRIGHT | FOLDER_COPYRIGHT | The name of the copyright holder for the folder's contents. |
| FOLDER_SIZE | FOLDER_SIZE | Size of the folder in MB. |
| FOLDER_COUNT | FOLDER_COUNT | Count of files in the folder. |
| FOLDER_RELATIONS | FOLDER_RELATIONS | Placeholder for folder-level relations or references. |
| FOLDER_LANGUAGE | FOLDER_LANGUAGE | Defaults to 'English'. |
| FOLDER_TYPE | FOLDER_TYPE | Placeholder for resource type (e.g., primary data, processed data). |
| FOLDER_FORMAT | FOLDER_FORMAT | Placeholder for the format of files within the folder (e.g., AutoCAD, 3D Model). |

3. Other Metadata Extractor

  • Purpose: To extract metadata from files that do not fall into the image category (e.g., CSV files).

How It Works:

  1. File Stats:

    • Uses os.stat(file_path) to retrieve file statistics, including creation and modification times.
  2. Calculate File Size:

    • Converts file size from bytes to megabytes.
  3. Construct Metadata Dictionary:

    • Populates a dictionary with keys for file details.
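
A minimal sketch of these three steps is shown below. The function name is hypothetical, and both the 1024 × 1024 divisor and the two-decimal "MB" string are assumptions about how the size is formatted. Only the automatically derived fields are shown.

```python
import os
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def extract_other_metadata(file_path):
    """Build a metadata dictionary from file statistics."""
    stats = os.stat(file_path)
    size_mb = stats.st_size / (1024 * 1024)  # bytes to megabytes
    name, extension = os.path.splitext(os.path.basename(file_path))
    return {
        "FILE_NAME": name,
        "FILE_PATH": file_path,
        "FILE_EXTENSION": extension,
        "FILE_SIZE": f"{size_mb:.2f} MB",
        # st_ctime is the creation time on Windows, but the time of the
        # last metadata change on most Unix systems.
        "FILE_CREATED": datetime.fromtimestamp(stats.st_ctime).strftime(TIME_FORMAT),
        "FILE_UPDATED": datetime.fromtimestamp(stats.st_mtime).strftime(TIME_FORMAT),
    }
```

The remaining fields in the table below cannot be read from `os.stat()` and would be added as placeholders.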

Fields Extracted:

| Metadata Element | Field Name | General Description |
| --- | --- | --- |
| FILE_NAME | FILE_NAME | Name of the file without extension. |
| FILE_PATH | FILE_PATH | Full path of the file. |
| FILE_EXTENSION | FILE_EXTENSION | File extension. |
| FILE_SIZE | FILE_SIZE | Formatted string for file size in MB. |
| FILE_CREATED | FILE_CREATED | Creation date formatted as '%Y-%m-%d %H:%M:%S'. |
| FILE_UPDATED | FILE_UPDATED | Modification date in the same format. |
| FILE_SOFTWARE | FILE_SOFTWARE | Software used to create the file. |
| FILE_HARDWARE | FILE_HARDWARE | Hardware used to create the file. |
| FILE_OPSYS | FILE_OPSYS | Operating system used. |
| FILE_KEYWORDS | FILE_KEYWORDS | Relevant keywords for the file. |
| FILE_DATES | FILE_DATES | Dates indicating dataset creation. |
| FILE_PROJECTID | FILE_PROJECTID | Project or reference numbers identifying the dataset. |
| FILE_LINKED | FILE_LINKED | Relationships between files. |
| FILE_IDENTIFIER | FILE_IDENTIFIER | Source file or derived status. |
| FILE_COPYRIGHT | FILE_COPYRIGHT | Copyright or rights holder details. |
| FILE_GCS | FILE_GCS | Geographic Coordinate System used. |
| FILE_PCS | FILE_PCS | Projected Coordinate System used. |

4. Geophysics Metadata Extractor

  • Purpose: To extract metadata specifically from geophysics files (e.g., .xcp, .xgd).

How It Works:

  1. File Name Check:

    • Checks if the filename (converted to uppercase) contains the keyword defined by GEOPHYSICS_COMP_CONDITION.
    • Additionally, checks if the file extension is in the list of allowed geophysics file types (e.g., .xcp, .xgd).
  2. Calculate File Size:

    • Retrieves the file size in bytes and converts it to megabytes.
  3. Construct Metadata Dictionary:

    • If the checks pass, creates a dictionary with the following keys:
      • FILE_PATH: Path of the file.
      • FILE_NAME: Name of the file.
      • FILE_DESCRIPTION: Initially set as an empty string (manual entry required).
      • FILE_INSTRUMENT: Initially an empty string (manual entry required).
      • FILE_UNITS: Initially an empty string (manual entry required).
      • FILE_CENTRAL_COORDINATE: Initially an empty string (manual entry required).
      • FILE_NW_COORDINATE: Initially an empty string (manual entry required).
      • FILE_SE_COORDINATE: Initially an empty string (manual entry required).
      • FILE_COMMENTS: Initially an empty string (manual entry required).
      • FILE_1ST_TRAVERSE_DIRECTION: Initially an empty string (manual entry required).
      • FILE_METHOD: Initially an empty string (manual entry required).
      • FILE_SENSORS: Initially an empty string (manual entry required).
      • FILE_DUMMY_VALUE: Initially an empty string (manual entry required).
      • FILE_COMPOSITE_SIZE: Initially an empty string (manual entry required).
      • FILE_SURVEY_SIZE: Initially an empty string (manual entry required).
      • FILE_GRID_SIZE: Initially an empty string (manual entry required).
      • FILE_X_INTERVAL: Initially an empty string (manual entry required).
      • FILE_Y_INTERVAL: Initially an empty string (manual entry required).
      • FILE_SIZE: Formatted string indicating the size in MB.
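
The checks and the resulting dictionary might look like the sketch below. The value `"COMP"` for `GEOPHYSICS_COMP_CONDITION` is a placeholder, since the actual keyword is not given here, and the function name and size formatting are likewise assumptions.

```python
import os

GEOPHYSICS_COMP_CONDITION = "COMP"        # hypothetical keyword
GEOPHYSICS_EXTENSIONS = {".xcp", ".xgd"}  # allowed geophysics file types

# Fields that start as empty strings and need manual entry.
MANUAL_FIELDS = [
    "FILE_DESCRIPTION", "FILE_INSTRUMENT", "FILE_UNITS",
    "FILE_CENTRAL_COORDINATE", "FILE_NW_COORDINATE", "FILE_SE_COORDINATE",
    "FILE_COMMENTS", "FILE_1ST_TRAVERSE_DIRECTION", "FILE_METHOD",
    "FILE_SENSORS", "FILE_DUMMY_VALUE", "FILE_COMPOSITE_SIZE",
    "FILE_SURVEY_SIZE", "FILE_GRID_SIZE", "FILE_X_INTERVAL",
    "FILE_Y_INTERVAL",
]


def extract_geophysics_metadata(file_path):
    """Return a metadata dictionary, or None if the file fails the checks."""
    file_name = os.path.basename(file_path)
    extension = os.path.splitext(file_name)[1].lower()
    if GEOPHYSICS_COMP_CONDITION not in file_name.upper():
        return None
    if extension not in GEOPHYSICS_EXTENSIONS:
        return None
    size_mb = os.path.getsize(file_path) / (1024 * 1024)
    metadata = {"FILE_PATH": file_path, "FILE_NAME": file_name}
    metadata.update(dict.fromkeys(MANUAL_FIELDS, ""))
    metadata["FILE_SIZE"] = f"{size_mb:.2f} MB"
    return metadata
```

Returning `None` for files that fail either check lets the caller pass the file on to another extractor.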

Fields Extracted:

| Metadata Element | Field Name | General Description |
| --- | --- | --- |
| FILE_PATH | FILE_PATH | Path of the file. |
| FILE_NAME | FILE_NAME | Name of the file. |
| FILE_DESCRIPTION | FILE_DESCRIPTION | Initially set as an empty string (manual entry required). |
| FILE_INSTRUMENT | FILE_INSTRUMENT | Initially an empty string (manual entry required). |
| FILE_UNITS | FILE_UNITS | Initially an empty string (manual entry required). |
| FILE_CENTRAL_COORDINATE | FILE_CENTRAL_COORDINATE | Initially an empty string (manual entry required). |
| FILE_NW_COORDINATE | FILE_NW_COORDINATE | Initially an empty string (manual entry required). |
| FILE_SE_COORDINATE | FILE_SE_COORDINATE | Initially an empty string (manual entry required). |
| FILE_COMMENTS | FILE_COMMENTS | Initially an empty string (manual entry required). |
| FILE_1ST_TRAVERSE_DIRECTION | FILE_1ST_TRAVERSE_DIRECTION | Initially an empty string (manual entry required). |
| FILE_METHOD | FILE_METHOD | Initially an empty string (manual entry required). |
| FILE_SENSORS | FILE_SENSORS | Initially an empty string (manual entry required). |
| FILE_DUMMY_VALUE | FILE_DUMMY_VALUE | Initially an empty string (manual entry required). |
| FILE_COMPOSITE_SIZE | FILE_COMPOSITE_SIZE | Initially an empty string (manual entry required). |
| FILE_SURVEY_SIZE | FILE_SURVEY_SIZE | Initially an empty string (manual entry required). |
| FILE_GRID_SIZE | FILE_GRID_SIZE | Initially an empty string (manual entry required). |
| FILE_X_INTERVAL | FILE_X_INTERVAL | Initially an empty string (manual entry required). |
| FILE_Y_INTERVAL | FILE_Y_INTERVAL | Initially an empty string (manual entry required). |
| FILE_SIZE | FILE_SIZE | Formatted string indicating the size in MB. |

5. Geospatial Metadata Extractor

  • Purpose: To extract metadata from geospatial files such as GeoTIFFs and shapefiles.

How It Works:

  1. File Extension Check:

    • The extractor checks whether the file extension is .tif or .tiff (GeoTIFF) or .shp (shapefile).
  2. GeoTIFF Metadata Extraction:

    • If the file is identified as a GeoTIFF:
      • Open File: Uses rasterio.open(file_path) to open the raster file.
      • Retrieve Tags: Calls src.tags() to get metadata tags associated with the raster file.
      • File Size Calculation: Retrieves the file size using os.path.getsize(file_path) and converts it from bytes to megabytes.
      • Get Associated Files: Calls the method get_associated_files(file_path) to find related files based on the main raster file's name (e.g., .tfw, .prj).
      • Construct Metadata Dictionary: Populates a dictionary with the following keys:
        • FILE_TITLE: Gets the value from tags; defaults to 'Unknown' if not present.
        • FILE_NAME: Name of the file without the extension.
        • FILE_PATH: Full path to the file.
        • FILE_EXTENSION: File extension (e.g., .tif).
        • FILE_DESCRIPTION: Description tag from the metadata; defaults to 'Unknown' if not available.
        • FILE_KEYWORDS: Keywords from the metadata; defaults to 'Unknown'.
        • FILE_VERSION: Version information from the metadata; if not present, defaults to the driver used to read the file.
        • FILE_SIZE: Size of the file formatted as a string in MB.
        • FILE_BANDS: Number of bands in the raster (obtained using src.count).
        • FILE_CELL_SIZE: Spatial resolution of the raster (obtained using src.res).
        • FILE_COVERAGE: Geographical bounds of the raster (obtained using src.bounds).
        • FILE_PCS: Projected Coordinate System as a string (using src.crs.to_string()).
        • FILE_GCS: Geographic Coordinate System as EPSG code (using src.crs.to_epsg()).
        • FILE_ASSOCIATED: Comma-separated string of associated files obtained from get_associated_files().
  3. Shapefile Metadata Extraction:

    • If the file is identified as a shapefile:
      • Open File: Reads the file using gpd.read_file(file_path).
      • Geometry Type: Gets the geometry type of the shapefile (e.g., Point, Polygon) using gdf.geometry.geom_type.unique()[0].
      • Feature Count: Counts the number of features in the shapefile using len(gdf).
      • File Creation Date: Obtains the creation date of the file using os.path.getctime(file_path) and formats it as a string.
      • Coordinate Systems: Extracts the Projected Coordinate System (PCS) and Geographic Coordinate System (GCS) using gdf.crs.to_string() and gdf.crs.geodetic_crs.to_string(), respectively.
      • Associated Files: Searches for associated files using the name of the shapefile (excluding the extension) to find files that begin with that name in the same directory.
      • Construct Metadata Dictionary: Populates a dictionary with keys:
        • FILE_PROJECTID: Placeholder for project ID (manual entry required).
        • FILE_NAME: Name of the shapefile without the extension.
        • FILE_PATH: Directory path of the shapefile.
        • FILE_EXTENSION: The file extension (e.g., .shp).
        • FILE_SIZE: Size of the file in MB.
        • FILE_DESCRIPTION: Placeholder for a description (manual entry required).
        • FILE_GEOMTYPE: Geometry type derived from the shapefile.
        • FILE_FEATURE_COUNT: Number of features present in the shapefile.
        • FILE_METHOD: Placeholder for method information (manual entry required).
        • FILE_DATES: Creation date of the shapefile.
        • FILE_COVERAGE: Placeholder for coverage description (manual entry required).
        • FILE_PCS: Projected Coordinate System as a string.
        • FILE_GCS: Geographic Coordinate System as a string.
        • FILE_SCALE: Placeholder for scale (manual entry required).
        • FILE_ASSOCIATED: Comma-separated list of associated files.
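
Both branches rely on finding sidecar files that share the main file's name. The sketch below shows one way `get_associated_files()` could work using only the standard library. The body is an assumption based on the description above (files in the same directory whose names begin with the main file's name), and `format_associated` is a hypothetical helper.

```python
import os


def get_associated_files(file_path):
    """List files in the same directory whose names start with the
    main file's name (without extension), excluding the file itself."""
    directory = os.path.dirname(os.path.abspath(file_path))
    file_name = os.path.basename(file_path)
    stem = os.path.splitext(file_name)[0]
    return sorted(
        entry
        for entry in os.listdir(directory)
        if entry.startswith(stem)
        and entry != file_name
        and os.path.isfile(os.path.join(directory, entry))
    )


def format_associated(file_path):
    """Return the associated files as a comma-separated string."""
    return ", ".join(get_associated_files(file_path))
```

A plain prefix match also picks up unrelated files such as `survey2.tif` when the main file is `survey.tif`; comparing the full stem would be stricter, at the cost of missing names like `survey.shp.xml`.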

Fields Extracted:

  • For GeoTIFF:
| Metadata Element | Field Name | General Description |
| --- | --- | --- |
| FILE_TITLE | FILE_TITLE | The title of the raster file or a suitable caption. |
| FILE_NAME | FILE_NAME | The name of the raster file (without extension). |
| FILE_PATH | FILE_PATH | The full path to the raster file. |
| FILE_EXTENSION | FILE_EXTENSION | The file format extension, e.g., .tif, .tiff. |
| FILE_SIZE | FILE_SIZE | The size of the file in MB. |
| FILE_DESCRIPTION | FILE_DESCRIPTION | A description of the raster data. |
| FILE_KEYWORDS | FILE_KEYWORDS | Keywords for the raster file (e.g., period, site, or feature terms). |
| FILE_VERSION | FILE_VERSION | The version or driver of the file, e.g., TIFF 6.0. |
| FILE_BANDS | FILE_BANDS | The number of raster bands in the file. |
| FILE_CELL_SIZE | FILE_CELL_SIZE | The size of the raster cells (resolution). |
| FILE_COVERAGE | FILE_COVERAGE | The bounding coordinates of the dataset. |
| FILE_PCS | FILE_PCS | The Projected Coordinate System used. |
| FILE_GCS | FILE_GCS | The Geographic Coordinate System used. |
| FILE_ASSOCIATED | FILE_ASSOCIATED | List of associated files for the raster file (e.g., .tfw, .prj). |
  • For Shapefile:
| Metadata Element | Field Name | General Description |
| --- | --- | --- |
| FILE_PROJECTID | FILE_PROJECTID | Placeholder for project ID (manual entry required). |
| FILE_NAME | FILE_NAME | Name of the shapefile without the extension. |
| FILE_PATH | FILE_PATH | Directory path of the shapefile. |
| FILE_EXTENSION | FILE_EXTENSION | The file extension (e.g., .shp). |
| FILE_SIZE | FILE_SIZE | Size of the file in MB. |
| FILE_DESCRIPTION | FILE_DESCRIPTION | Placeholder for a description (manual entry required). |
| FILE_GEOMTYPE | FILE_GEOMTYPE | Geometry type derived from the shapefile. |
| FILE_FEATURE_COUNT | FILE_FEATURE_COUNT | Number of features present in the shapefile. |
| FILE_METHOD | FILE_METHOD | Placeholder for method information (manual entry required). |
| FILE_DATES | FILE_DATES | Creation date of the shapefile. |
| FILE_COVERAGE | FILE_COVERAGE | Placeholder for coverage description (manual entry required). |
| FILE_PCS | FILE_PCS | Projected Coordinate System as a string. |
| FILE_GCS | FILE_GCS | Geographic Coordinate System as a string. |
| FILE_SCALE | FILE_SCALE | Placeholder for scale (manual entry required). |
| FILE_ASSOCIATED | FILE_ASSOCIATED | Comma-separated list of associated files. |

6. Image Metadata Extractor

  • Purpose: To extract metadata from image files using the Pillow library.

How It Works:

  1. File Size Calculation:

    • Uses os.path.getsize(file_path) to retrieve the file size in bytes.
    • Converts bytes to megabytes.
  2. Open Image:

    • Attempts to open the image file with Image.open(file_path).
    • If successful, extracts the image's width and height.
  3. Determine Bit Depth:

    • Checks the image mode and sets bit_depth accordingly.
  4. Construct Metadata Dictionary:

    • Populates a dictionary with metadata keys and their corresponding values.
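
The sketch below illustrates these steps. The mode-to-bit-depth mapping is an assumption based on Pillow's documented image modes, not a copy of the extractor's own table, and the function names are hypothetical. Pillow is imported inside the function so that the mapping can be used on its own.

```python
import os

# Assumed mapping from Pillow mode strings to total bits per pixel.
MODE_BIT_DEPTH = {
    "1": 1, "L": 8, "P": 8, "RGB": 24, "RGBA": 32,
    "CMYK": 32, "YCbCr": 24, "I": 32, "F": 32,
}


def bit_depth_for_mode(mode):
    """Return the bit depth for a Pillow mode, or '' if unknown."""
    return MODE_BIT_DEPTH.get(mode, "")


def extract_image_metadata(file_path):
    """Collect size, dimensions, colour mode and bit depth for an image."""
    from PIL import Image  # requires the Pillow package

    size_mb = os.path.getsize(file_path) / (1024 * 1024)
    width = height = mode = ""
    try:
        with Image.open(file_path) as img:
            width, height = img.size
            mode = img.mode
    except OSError:
        pass  # unreadable image: leave the image fields blank
    return {
        "FILE_PATH": file_path,
        "FILE_SIZE": f"{size_mb:.2f} MB",
        "FILE_DIMENSIONS": f"{width} x {height} px" if mode else "",
        "FILE_COLOUR": mode,
        "FILE_BITDEPTH": bit_depth_for_mode(mode),
    }
```

Modes missing from the mapping yield an empty string, which matches the manual-entry convention used elsewhere on this page.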

Fields Extracted:

| Metadata Element | Field Name | General Description |
| --- | --- | --- |
| FILE_TITLE | FILE_TITLE | Title of the image or suitable caption. |
| FILE_PATH | FILE_PATH | Full path of the image file. |
| FILE_DESCRIPTION | FILE_DESCRIPTION | Description of the image. |
| FILE_COVERAGE | FILE_COVERAGE | Site location details and relevant period terms. |
| FILE_PCS | FILE_PCS | Projected Coordinate System used. |
| FILE_GCS | FILE_GCS | Geographic Coordinate System used. |
| FILE_KEYWORDS | FILE_KEYWORDS | Keywords related to the image. |
| FILE_VERSION | FILE_VERSION | Image format version. |
| FILE_SIZE | FILE_SIZE | Size of the file in MB. |
| FILE_RESOLUTION | FILE_RESOLUTION | Resolution of the image in pixels per inch (ppi). |
| FILE_DIMENSIONS | FILE_DIMENSIONS | Dimensions of the image in pixels. |
| FILE_COLOUR | FILE_COLOUR | Colour space used in the image. |
| FILE_BITDEPTH | FILE_BITDEPTH | Bit depth of the image. |

7. Control Point Metadata Extractor

  • Purpose: To extract metadata from control point files, specifically looking at shapefiles and CSV files.

How It Works:

  1. Metadata Initialization:

    • Creates an empty dictionary called metadata for storing control point information.
    • Extracts the base name of the file using os.path.splitext(os.path.basename(file_path))[0].
  2. Retrieve Associated Files:

    • Gathers associated files in the same directory by checking if they start with the same name as the main file.
    • Joins associated file names into a single string called linked_files.
  3. File Type Handling:

    • If the file is a shapefile (.shp):
      • Uses gpd.read_file(file_path) to load the shapefile data.
      • Extracts the first point's coordinates (X, Y, and optional Z) from the geometry.
      • Populates the metadata dictionary with:
        • CONTL_X: X coordinate of the first control point.
        • CONTL_Y: Y coordinate of the first control point.
        • CONTL_Z: Z coordinate if available; otherwise, an empty string.
    • If the file is a CSV (.csv):
      • Opens the file using csv.DictReader().
      • Reads through each row to extract X, Y, and Z coordinates.
      • Populates metadata with:
        • CONTL_X, CONTL_Y, CONTL_Z as before.
  4. Additional Metadata Fields:

    • Adds various placeholders for additional details:
      • CONTL_CX, CONTL_CY, CONTL_CZ: Placeholders for covariance values.
      • CONTL_LOCATION: Placeholder for a textual description of the location.
      • FILE_DATES: Placeholder for date information.
      • FILE_PROJECTID: Placeholder for the project ID.
      • FILE_COVERAGE: Placeholder for coverage details.
      • FILE_PCS: Placeholder for Projected Coordinate System.
      • FILE_GCS: Placeholder for Geographic Coordinate System.
      • FILE_LINKED: Contains the list of associated files.
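
The CSV branch can be sketched with the standard library alone. Several details here are assumptions: the column headers are taken to be `X`, `Y` and `Z`, only the first row is kept (mirroring the shapefile branch, which uses the first point), and linked file names are joined with a comma and space. The function name is hypothetical.

```python
import csv
import os


def extract_control_point_csv(file_path):
    """Build control point metadata from the first row of a CSV file."""
    file_name = os.path.basename(file_path)
    base_name = os.path.splitext(file_name)[0]
    directory = os.path.dirname(os.path.abspath(file_path))

    # Associated files: same directory, same starting name.
    linked_files = ", ".join(sorted(
        name for name in os.listdir(directory)
        if name.startswith(base_name) and name != file_name
    ))

    metadata = {"CONTL_X": "", "CONTL_Y": "", "CONTL_Z": ""}
    with open(file_path, newline="") as handle:
        for row in csv.DictReader(handle):
            # Assumed column names; a missing value becomes ''.
            metadata["CONTL_X"] = row.get("X") or ""
            metadata["CONTL_Y"] = row.get("Y") or ""
            metadata["CONTL_Z"] = row.get("Z") or ""
            break  # keep the first control point only

    # Placeholders for manual entry, plus the linked files.
    for key in ("CONTL_CX", "CONTL_CY", "CONTL_CZ", "CONTL_LOCATION",
                "FILE_DATES", "FILE_PROJECTID", "FILE_COVERAGE",
                "FILE_PCS", "FILE_GCS"):
        metadata[key] = ""
    metadata["FILE_LINKED"] = linked_files
    return metadata
```

Coordinates are kept as strings, since they are written straight into XML text.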

Fields Extracted:

| Metadata Element | Field Name | General Description |
| --- | --- | --- |
| FILE_TITLE | FILE_TITLE | The title of the control point file or a suitable caption. |
| FILE_NAME | FILE_NAME | The name of the control point file (without extension). |
| FILE_PATH | FILE_PATH | The full path to the control point file. |
| FILE_EXTENSION | FILE_EXTENSION | The file format extension, e.g., .shp or .csv. |
| FILE_SIZE | FILE_SIZE | The size of the file in MB. |
| FILE_DESCRIPTION | FILE_DESCRIPTION | A description of the control point data. |
| CONTL_X | CONTL_X | X coordinate of the first control point. |
| CONTL_Y | CONTL_Y | Y coordinate of the first control point. |
| CONTL_Z | CONTL_Z | Z coordinate if available; otherwise, an empty string. |
| CONTL_CX | CONTL_CX | Placeholder for Covariance X (manual entry required). |
| CONTL_CY | CONTL_CY | Placeholder for Covariance Y (manual entry required). |
| CONTL_CZ | CONTL_CZ | Placeholder for Covariance Z (manual entry required). |
| CONTL_LOCATION | CONTL_LOCATION | Textual description of location. |
| FILE_DATES | FILE_DATES | Dates indicating when the dataset was created. |
| FILE_PROJECTID | FILE_PROJECTID | Project or reference numbers used to identify the dataset. |
| FILE_COVERAGE | FILE_COVERAGE | Placeholder for coverage details. |
| FILE_PCS | FILE_PCS | Projected Coordinate System used. |
| FILE_GCS | FILE_GCS | Geographic Coordinate System used. |
| FILE_LINKED | FILE_LINKED | Contains the list of associated files. |

8. Folder Tree Creation

  • Purpose: To create an XML representation of the folder structure within a specified directory, including details about each folder and its contained files.

How It Works:

  1. Count Total Folders:

    • The function count_total_folders(folder_path) iterates through the directory using os.walk().
    • It counts all subfolders while applying exclusions based on predefined suffixes in EXCLUDED_DIRECTORY_SUFFIXES. This ensures that specific directories (like .files, .gdb, .Overviews) are not counted or included in the XML.
  2. Gather Folder Size and File Count:

    • The function get_folder_size_and_file_count(folder_path) calculates the total size of files within a folder.
    • It also counts the number of files present:
      • It uses os.path.getsize(file_path) to sum the sizes of non-symlink files.
      • The total size is converted from bytes to megabytes for easier readability.
  3. Create XML Elements:

    • The function create_folder_element(folder_path, parent_xml_element) creates XML sub-elements for each folder:
      • It generates a <FOLDER> element for the current folder, including attributes for the folder's name, size in MB, and file count.
      • For each item in the current folder, it checks if the item is a directory or a file:
        • If it’s a directory, it recursively calls create_folder_element() to process that subfolder, adding it to the current folder’s XML element.
        • If it’s a file, it adds a <FILE> element to the current folder's XML with the name of the file.
  4. Construct the Complete Folder Tree:

    • The function create_folder_tree_xml(start_dir) serves as the main entry point for building the entire folder tree:
      • It first counts the total number of folders and displays a message box to indicate that processing is underway.
      • It creates a root XML element <Folder_Tree> and initiates the recursive creation of folder elements using create_folder_element().
      • After constructing the XML tree, it returns the root element for further processing (e.g., saving to a file).
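
The four functions described above could be sketched as follows. This is an illustration under stated assumptions, not the script itself: the message box is omitted, folder size and file count are taken recursively, and the attribute names `Name`, `Size_MB` and `FileCount` follow the example XML on this page. The helper `is_excluded` is hypothetical.

```python
import os
import xml.etree.ElementTree as ET

EXCLUDED_DIRECTORY_SUFFIXES = (".files", ".gdb", ".Overviews")


def is_excluded(name):
    """True if a directory name ends with an excluded suffix."""
    return name.endswith(EXCLUDED_DIRECTORY_SUFFIXES)


def count_total_folders(folder_path):
    """Count subfolders, skipping excluded directories entirely."""
    total = 0
    for _, dirnames, _ in os.walk(folder_path):
        # Pruning in place stops os.walk from entering excluded folders.
        dirnames[:] = [d for d in dirnames if not is_excluded(d)]
        total += len(dirnames)
    return total


def get_folder_size_and_file_count(folder_path):
    """Return (size in MB, file count) for non-symlink files."""
    total_bytes = 0
    file_count = 0
    for dirpath, _, filenames in os.walk(folder_path):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            if not os.path.islink(file_path):
                total_bytes += os.path.getsize(file_path)
                file_count += 1
    return total_bytes / (1024 * 1024), file_count


def create_folder_element(folder_path, parent_xml_element):
    """Add a <FOLDER> element for folder_path and recurse into it."""
    size_mb, file_count = get_folder_size_and_file_count(folder_path)
    folder_el = ET.SubElement(parent_xml_element, "FOLDER", {
        "Name": os.path.basename(os.path.normpath(folder_path)),
        "Size_MB": f"{size_mb:.2f}",
        "FileCount": str(file_count),
    })
    for item in sorted(os.listdir(folder_path)):
        item_path = os.path.join(folder_path, item)
        if os.path.isdir(item_path):
            if not is_excluded(item):
                create_folder_element(item_path, folder_el)
        else:
            ET.SubElement(folder_el, "FILE", {"Name": item})
    return folder_el


def create_folder_tree_xml(start_dir):
    """Build and return the <Folder_Tree> root element."""
    root = ET.Element("Folder_Tree")
    create_folder_element(start_dir, root)
    return root
```

The returned root can be written out with `ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)`.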

Key Considerations:

  • Exclusions: The script is designed to skip over certain directories to avoid unnecessary clutter in the XML output.
  • File Handling: Only non-symlink files are considered to prevent counting linked files that may not occupy space in the folder.

Example XML Structure:

The generated XML structure for the folder tree would look like this:

```xml
<Folder_Tree>
    <FOLDER Name="MainFolder" Size_MB="2.5" FileCount="10">
        <FOLDER_PROJECTID>Project_001</FOLDER_PROJECTID>
        <FOLDER_NAME>MainFolder</FOLDER_NAME>
        <FOLDER_PATH>/path/to/MainFolder</FOLDER_PATH>
        <FOLDER_DESCRIPTION>Main project folder containing all data.</FOLDER_DESCRIPTION>
        <FOLDER_SUBJECT>Archaeology</FOLDER_SUBJECT>
        <FOLDER_ACCURACY>High</FOLDER_ACCURACY>
        <FOLDER_COVERAGE>OSGB 123456, 654321</FOLDER_COVERAGE>
        <FOLDER_PCS>EPSG:27700</FOLDER_PCS>
        <FOLDER_GCS>EPSG:4326</FOLDER_GCS>
        <FOLDER_CREATORS>Jane Doe, Archaeological Institute</FOLDER_CREATORS>
        <FOLDER_PUBLISHER>Archaeological Institute</FOLDER_PUBLISHER>
        <FOLDER_CONTRIBUTORS>John Smith</FOLDER_CONTRIBUTORS>
        <FOLDER_FOLDERID>Folder_001</FOLDER_FOLDERID>
        <FOLDER_DATES>2024-10-15</FOLDER_DATES>
        <FOLDER_COPYRIGHT>© 2024 Archaeological Institute</FOLDER_COPYRIGHT>
        <FOLDER_SIZE>2.5MB</FOLDER_SIZE>
        <FOLDER_COUNT>10</FOLDER_COUNT>
        <FOLDER_RELATIONS>Related Project Materials</FOLDER_RELATIONS>
        <FOLDER_LANGUAGE>English</FOLDER_LANGUAGE>
        <FOLDER_TYPE>Primary Data</FOLDER_TYPE>
        <FOLDER_FORMAT>Various formats including .png, .pdf</FOLDER_FORMAT>

        <FOLDER Name="SubFolder1" Size_MB="1.2" FileCount="5">
            <FILE Name="file2.png">
                <FILE_TITLE>Site Image</FILE_TITLE>
                <FILE_PATH>/path/to/MainFolder/SubFolder1/file2.png</FILE_PATH>
                <FILE_DESCRIPTION>Image of the archaeological site.</FILE_DESCRIPTION>
                <FILE_COVERAGE>Coordinates: OSGB 123456, 654321</FILE_COVERAGE>
                <FILE_PCS>EPSG:27700</FILE_PCS>
                <FILE_GCS>EPSG:4326</FILE_GCS>
                <FILE_KEYWORDS>archaeology, site</FILE_KEYWORDS>
                <FILE_VERSION>1.0</FILE_VERSION>
                <FILE_SIZE>1.2MB</FILE_SIZE>
                <FILE_RESOLUTION>300 dpi</FILE_RESOLUTION>
                <FILE_DIMENSIONS>1920 x 1080 px</FILE_DIMENSIONS>
                <FILE_COLOUR>RGB</FILE_COLOUR>
                <FILE_BITDEPTH>24</FILE_BITDEPTH>
            </FILE>
        </FOLDER>
        <FOLDER Name="SubFolder2" Size_MB="1.3" FileCount="3">
            <FILE Name="file4.pdf">
                <FILE_NAME>findings_report</FILE_NAME>
                <FILE_PATH>/path/to/MainFolder/SubFolder2/file4.pdf</FILE_PATH>
                <FILE_EXTENSION>pdf</FILE_EXTENSION>
                <FILE_SIZE>0.8MB</FILE_SIZE>
                <FILE_CREATED>2024-10-15 10:00:00</FILE_CREATED>
                <FILE_UPDATED>2024-10-15 10:05:00</FILE_UPDATED>
                <FILE_SOFTWARE>Adobe Acrobat 2024</FILE_SOFTWARE>
                <FILE_HARDWARE>HP Laptop</FILE_HARDWARE>
                <FILE_OPSYS>Windows 10</FILE_OPSYS>
                <FILE_KEYWORDS>report, findings, archaeology</FILE_KEYWORDS>
                <FILE_DATES>2024-10-15</FILE_DATES>
                <FILE_PROJECTID>Project_001</FILE_PROJECTID>
                <FILE_LINKED>Related_Image.png</FILE_LINKED>
                <FILE_IDENTIFIER>Source Document</FILE_IDENTIFIER>
                <FILE_COPYRIGHT>© 2024 Archaeological Institute</FILE_COPYRIGHT>
                <FILE_GCS>EPSG:4326</FILE_GCS>
                <FILE_PCS>EPSG:27700</FILE_PCS>
                <FILE_COVERAGE>Coordinates: OSGB 123456, 654321</FILE_COVERAGE>
            </FILE>
        </FOLDER>
    </FOLDER>
</Folder_Tree>
```