Quote ingest using apache stack: arrow / parquet #536

@goodboy

As a follow-up to #486, it'd sure be nice to be able to move away
from our current multiprocessing.shared_memory approach for
real-time quote/tick ingest and possibly leverage an apache
standard format such as arrow and parquet.

As part of improving the .parquet file based tsdb IO from #486,
it'd obviously be ideal to support df appends instead of only full
overwrites 😂.


ToDo content from #486

Pertaining to the StorageClient.write_ohlcv() write on backfills and
rt ingest: rn the write is mostly masked out bc there are some
details to work out on when/how frequently the writes to parquet
files should happen, particularly whether to "append" to parquet
files. Turns out there are options for appending (faster than
overwriting i guess?) to parquet, particularly using fastparquet;
see the below resources:

Metadata

Assignees

No one assigned

    Labels

    data-layer: real-time and historical data processing and storage
    dependencies: we are the dependent, or are you?
    fsp: financial signal processing
    integration: external stack and/or lib augmentations
    perf: efficiency and latency optimization
    research: probably just a link dump..
