-
Good question! Let's break it down. Here's the pipeline:
1. Receive a live feed of frames from the camera, as in-memory images.
2. Feed each image into MLKit for text recognition.
3. Augment the camera preview with the results.
Seems like https://github.com/tensorflow/tfjs/blob/d8a8afeeb9218e39655712c3d8d26977371054bf/tfjs-react-native/src/camera/camera_stream.tsx#L369 would be a good reference for 1 and 3. I would start with 1 and 3, i.e. make sure that I am able to receive a feed of images and augment them (with static text). Step 1 would produce a stream of in-memory, platform-specific abstractions of an "image". We'll need to feed those images into MLKit. This library needs an enhancement: the ability to receive the "image" object instead of a file URL. That is already supported in the native packages; we'll just need to expose those methods to RN. A sketch of what that could look like is below.
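To make the proposed enhancement concrete, here is a minimal TypeScript sketch of what the JS-side surface might look like. The module and method names (`MlkitOcr`, `detectFromUri`, `detectFromFrame`) and the `TextBlock` shape are assumptions for illustration, not this package's actual API.

```ts
import { NativeModules } from 'react-native';

// Assumed result shape; the real package may differ.
type TextBlock = {
  text: string;
  bounding: { left: number; top: number; width: number; height: number };
};

interface MlkitOcrModule {
  // File-based entry point (the style this library supports today).
  detectFromUri(uri: string): Promise<TextBlock[]>;
  // Proposed addition: accept a handle to an in-memory, platform-specific
  // image (e.g. a CMSampleBuffer on iOS, a media.Image on Android) so no
  // file round-trip is needed per frame.
  detectFromFrame(frameRef: number): Promise<TextBlock[]>;
}

// Hypothetical native module name.
const MlkitOcr = NativeModules.MlkitOcr as MlkitOcrModule;
export default MlkitOcr;
```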
-
Thanks for outlining the steps! I don't have time to do this now, but I'm definitely interested in tackling this and making the enhancement to this library in a month or so. A few questions:
Thanks!
-
Thanks all for sharing this. It's complicated; I was also trying a frame processor but was having a tough time.
-
I am not too familiar with Tensors and the related terminology. The referenced file https://github.com/tensorflow/tfjs/blob/d8a8afeeb9218e39655712c3d8d26977371054bf/tfjs-react-native/src/camera/camera_stream.tsx#L291 integrates with the Expo Camera package. It "renders" two React elements:
The first one "starts" the Expo camera, and the second one shows the camera's output. The output is captured in a frame loop, and that loop seems to be where all the magic happens. Replacing the Tensor logic there with MLKit logic would produce the desired effect, I think.
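For illustration, here is a minimal sketch of that loop, modeled on the linked camera_stream.tsx. The MLKit call is a placeholder (`runMlkitOnFrame` does not exist in this package yet), and the texture/resize numbers are illustrative only.

```tsx
import React from 'react';
import { Camera } from 'expo-camera';
import * as tf from '@tensorflow/tfjs';
import { cameraWithTensors } from '@tensorflow/tfjs-react-native';

// Wraps expo-camera so that onReady yields frames as in-memory tensors.
const TensorCamera = cameraWithTensors(Camera);

function handleCameraStream(images: IterableIterator<tf.Tensor3D>) {
  const loop = () => {
    const frame = images.next().value; // in-memory frame, no file I/O
    if (frame) {
      // This is where the Tensor logic lives in camera_stream.tsx;
      // an MLKit binding would replace it, e.g.:
      // await runMlkitOnFrame(frame); // hypothetical
      tf.dispose(frame);
    }
    requestAnimationFrame(loop);
  };
  loop();
}

export function LiveOcrCamera() {
  return (
    <TensorCamera
      style={{ flex: 1 }}
      // Illustrative values; see camera_stream.tsx for how these are used.
      useCustomShadersToResize={false}
      cameraTextureWidth={1080}
      cameraTextureHeight={1920}
      resizeWidth={152}
      resizeHeight={200}
      resizeDepth={3}
      autorender={true}
      onReady={handleCameraStream}
    />
  );
}
```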
Just follow the official RN native modules guide; the documentation is quite detailed, and this package is very "usual" in terms of implementation: it's just one class for each platform.
-
Really more of a question (or a feature request), but is it possible to perform OCR on the live frame data from the Camera?
In my app I was trying to get around this by repeatedly taking pictures at regular intervals and scanning them to show red bounding boxes for the text (video below). This would've worked fine, except that iOS makes a loud shutter noise when you take a picture (and a shutter noise every 500 ms is really, really annoying).
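For context, here is a minimal sketch of that interval workaround, assuming expo-camera's `takePictureAsync`; the `detectFromUri` call is an assumed name for this package's file-based OCR entry point.

```tsx
import { useEffect, useRef } from 'react';
import { Camera } from 'expo-camera';

// Hook that snaps a photo every `intervalMs` and runs OCR on the saved file.
export function useIntervalOcr(intervalMs = 500) {
  const cameraRef = useRef<Camera>(null);

  useEffect(() => {
    let busy = false;
    const id = setInterval(async () => {
      if (busy || !cameraRef.current) return;
      busy = true;
      try {
        // iOS plays the shutter sound on every capture, which is what
        // makes this approach so annoying at a 500 ms cadence.
        const photo = await cameraRef.current.takePictureAsync({
          skipProcessing: true,
        });
        // Assumed API: run OCR on the file and draw boxes from the result.
        // const blocks = await MlkitOcr.detectFromUri(photo.uri);
        console.log('captured', photo.uri);
      } finally {
        busy = false;
      }
    }, intervalMs);
    return () => clearInterval(id);
  }, [intervalMs]);

  return cameraRef; // attach via <Camera ref={cameraRef} ... />
}
```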
It looks like expo-camera (which is what I'm using) doesn't expose the live frame data. There are some other options out there that do, but I don't know how to get them to work with MLKit:
Thanks! Appreciate this package a ton and understand what I'm asking here might be difficult to do. If so, I'm also curious to learn more and help out.
RPReplay_Final1632580413.mov