How to report multiple documents on extract

I am implementing a dataset-level metadata extractor, and I think I need to be able to report multiple individual metadata records. In principle, one should be able to build these records so that they can be reported in a nested fashion (thereby reporting just a single object). However, in my case I have no control over the nature of these documents, and they might be linked (or not) in different ways.

What is a desirable approach here?

- an arbitrary top-level key that maps onto an array?
- a JSON-LD style `@graph` top-level key (as a realization of the above)?
- something else?
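To make the second option concrete, here is a minimal sketch of wrapping independent records under a single `@graph` key so that only one object is reported. The helper name `wrap_as_graph` and the example records are hypothetical and are not part of any extractor API; the only assumption about the records is that each one is a JSON-serializable dict.

```python
import json
from typing import Any, Dict, Iterable, Optional

Record = Dict[str, Any]


def wrap_as_graph(records: Iterable[Record], context: Optional[Any] = None) -> Dict[str, Any]:
    """Collect standalone records under a JSON-LD style ``@graph`` key.

    Hypothetical helper: the records are kept flat and unmodified, so any
    links between them (or the absence of links) stay exactly as delivered.
    """
    document: Dict[str, Any] = {}
    if context is not None:
        document["@context"] = context
    document["@graph"] = list(records)
    return document


# Made-up example records: two are linked via "isPartOf", one is unrelated.
records = [
    {"@id": "doc-1", "name": "first record"},
    {"@id": "doc-2", "name": "second record", "isPartOf": {"@id": "doc-1"}},
    {"@id": "doc-3", "name": "unlinked record"},
]

document = wrap_as_graph(records)
serialized = json.dumps(document)
```

The records stay side by side in one array, so no nesting has to be imposed on documents whose relationships are not under the extractor's control. The cost is that the whole array must be held in memory before it can be reported.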

Related: We might be talking about a lot of data to return. If I see things correctly, I need to load many standalone records into memory and report them via `immediate_data` as a single dict, so that they can be written out as JSON (again). I have yet to understand why `meta-extract` turns a single return value of type `ExtractorResult` into a result record, rather than dealing with result records directly. Dealing with result records directly would make the standard machinery of seamlessly switching between return values and generator yields applicable to metadata extractors too.
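The following sketch illustrates the kind of return-or-yield switching meant here. It is not how `meta-extract` works, and none of the names (`extract_one`, `extract_many`, `iter_results`, `write_json_lines`) exist in any library; they are placeholders. It only shows that a consumer can accept either a single returned record or a generator of records and write them out one at a time, without collecting everything in a single dict first.

```python
import io
import json
from types import GeneratorType
from typing import Any, Dict, Iterator, TextIO, Union

Record = Dict[str, Any]


def extract_one() -> Record:
    """Hypothetical extractor that returns a single record."""
    return {"@id": "doc-1"}


def extract_many() -> Iterator[Record]:
    """Hypothetical extractor that yields records one by one."""
    for i in range(1, 4):
        yield {"@id": f"doc-{i}"}


def iter_results(result: Union[Record, Iterator[Record]]) -> Iterator[Record]:
    """Normalize a plain return value or a generator into a stream of records."""
    if isinstance(result, GeneratorType):
        yield from result
    else:
        yield result


def write_json_lines(result: Union[Record, Iterator[Record]], stream: TextIO) -> int:
    """Write each record as one JSON line; only one record is serialized at a time."""
    count = 0
    for record in iter_results(result):
        stream.write(json.dumps(record) + "\n")
        count += 1
    return count


buffer = io.StringIO()
n_single = write_json_lines(extract_one(), buffer)
n_many = write_json_lines(extract_many(), buffer)
lines = buffer.getvalue().splitlines()
```

With this shape, an extractor that has many records to report can yield them as they are produced, and the writer never needs the full set in memory.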


How to report multiple documents on extract #391
