Watson-Visual-Recognition (WVR) Summary

Bigger picture

Watson offers a set of services that you combine to solve the user’s problem. Watson does not simply know things; it has to be taught. Cognitive systems are trained rather than programmed. There are five key Watson patterns: Engagement, Discovery, Decision, Policy, and Exploration.

Discover - Vision - Visual Recognition

Let us look further into the Watson visual recognition API. WVR has 6 basic models, as shown below.

Working of various models

Consider a tyre image from the demo here and go through the classification results as shown below

From curl or swagger or postman, submit the images as input from

Get the API key credentials for the Visual Recognition service from your IBM Cloud account.
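As a rough Python counterpart to the curl calls used in this README, the sketch below only describes a classify request; it does not send it. The endpoint, version date, images_file field, and classifier_ids field are taken from the curl commands in this document. The threshold form field, the WVR_APIKEY environment variable, and the build_classify_request helper are assumptions made for illustration.

```python
import os
from urllib.parse import urlencode

# Endpoint and version date as they appear in the curl commands of this README.
BASE_URL = "https://gateway.watsonplatform.net/visual-recognition/api/v3"
API_VERSION = "2018-03-19"


def build_classify_request(image_path, api_key, classifier_ids=None, threshold=None):
    """Describe a classify call as a plain dict, without sending it.

    Hypothetical helper: 'threshold' as a form field is an assumption.
    """
    url = f"{BASE_URL}/classify?{urlencode({'version': API_VERSION})}"
    form = {}
    if classifier_ids:
        form["classifier_ids"] = ",".join(classifier_ids)
    if threshold is not None:
        form["threshold"] = str(threshold)
    return {
        "url": url,
        "auth": ("apikey", api_key),   # same as curl -u "apikey:<key>"
        "file_field": "images_file",   # same as curl --form "images_file=@..."
        "file_path": image_path,
        "form": form,
    }


# WVR_APIKEY is a hypothetical variable name; keep real keys out of source files.
request = build_classify_request(
    "fruitbowl.jpg",
    os.environ.get("WVR_APIKEY", "<your-api-key>"),
    classifier_ids=["food"],
)
print(request["url"])
```

The returned dict can be handed to any HTTP client that supports multipart form uploads and basic authentication.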

Image on left as input - Json response on right as output

WVR Food model

WVR Face model

WVR General model

WVR Custom model

We created a custom model that classifies dogs and supplied cat images as negative samples. The JSON response below shows the training phase of the custom classifier dogs_2025763446.

Once training is done, make a GET request and confirm that the status is ready before passing test samples.
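The status check can be sketched as a small polling loop. This is a minimal sketch, assuming the classifier details come back as JSON with a "status" field whose value becomes "ready" when training completes. The fetch_status callable is hypothetical and stands in for whatever performs the GET request; the attempt count and delay are arbitrary.

```python
import time


def classifier_is_ready(classifier_info):
    """True when the classifier JSON reports status 'ready' (assumed field name)."""
    return classifier_info.get("status") == "ready"


def wait_until_ready(fetch_status, attempts=10, delay_seconds=30.0, sleep=time.sleep):
    """Call fetch_status() until the classifier is ready or attempts run out.

    fetch_status is a hypothetical callable returning the parsed JSON of the
    GET request for the classifier.
    """
    for attempt in range(attempts):
        if classifier_is_ready(fetch_status()):
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

Passing sleep as a parameter keeps the loop easy to exercise without real waiting.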

Below are the JSON responses for one positive sample (a golden retriever) and one negative sample (apples) passed to the custom classifier dogs_2025763446.

The documentation specifies the WVR custom model limitations.

Issues / Limitations

Assertion 1: The documentation says that the form parameter images_file can be a single image file or a zip file with a maximum of 20 images, and that such a zip file can be at most 100MB. This is not ideal for real-time video classification at more than 20 fps.
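One way to work within the 20-image limit is to split a larger set of files into batches before zipping and submitting them. The sketch below shows only the batching step; the batch_images helper and the file names are hypothetical.

```python
MAX_IMAGES_PER_REQUEST = 20  # limit quoted from the documentation in Assertion 1


def batch_images(paths, batch_size=MAX_IMAGES_PER_REQUEST):
    """Split a list of image paths into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]


# Hypothetical file names, for illustration only.
frames = [f"frame_{n:03d}.jpg" for n in range(22)]
batches = batch_images(frames)
print([len(batch) for batch in batches])
```

Each batch would then be zipped and sent as its own request, and the 100MB size limit still has to be checked per zip file.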

Assertion 2: When using the general model, the JSON response does not list every object in the image, such as an apple.

curl -X POST -u "apikey:m2SyTztvn6aR1PFI0i7Lyf9er4Jh8fANO6E0btcYWrAL" --form "images_file=@/Users/krishna/Desktop/img_db/fruitbowl.jpg" "https://gateway.watsonplatform.net/visual-recognition/api/v3/classify?version=2018-03-19"

Assertion 3: Because the general model in Assertion 2 does not return all objects in the food image, we passed classifier_ids=food instead. Even then, with the default threshold, not all fruits (oranges, for example) are classified, as shown below.

Test with image sample of fruitbowl.jpg

curl -X POST -u "apikey:m2SyTztvn6aR1PFI0i7Lyf9er4Jh8fANO6E0btcYWrAL" --form "images_file=@/Users/krishna/Desktop/img_db/fruitbowl.jpg"  -F "classifier_ids=food" "https://gateway.watsonplatform.net/visual-recognition/api/v3/classify?version=2018-03-19"

Threshold of 0 and 0.5(default)

Threshold of 0.6, 0.9

Note:

In fruitbowl.jpg (640 × 426 pixels), apples and bananas are not recognized once the threshold is above 0.6. At the default threshold of 0.5 and anything below it, both apple and banana are recognized. Raising the threshold may improve the quality of the predictions that remain, but some objects drop out of the predictions completely.
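The effect of the threshold can also be explored on the client side: request once with a low threshold, then filter the returned classes locally. The sketch below assumes the classify response nests results as images, then classifiers, then classes, with "class" and "score" keys. The sample response and its scores are invented for illustration and are not measured results.

```python
def classes_above(response, threshold):
    """Return (class, score) pairs whose score is at least the threshold."""
    kept = []
    for image in response.get("images", []):
        for classifier in image.get("classifiers", []):
            for item in classifier.get("classes", []):
                if item["score"] >= threshold:
                    kept.append((item["class"], item["score"]))
    return kept


# Hypothetical response with made-up scores, shaped like the assumed JSON.
sample = {
    "images": [{
        "image": "fruitbowl.jpg",
        "classifiers": [{
            "classifier_id": "food",
            "classes": [
                {"class": "apple", "score": 0.55},
                {"class": "banana", "score": 0.58},
                {"class": "fruit", "score": 0.91},
            ],
        }],
    }]
}

for threshold in (0.5, 0.6, 0.9):
    print(threshold, classes_above(sample, threshold))
```

Filtering locally avoids one request per threshold value when comparing settings.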

Test with another image sample of Apples_green_red.jpg

Default threshold of 0.5

Threshold of 0.7, 0.8

Note:

In Apples_green_red.jpg (342 × 147 pixels), none of the objects are recognized when the threshold is increased to 0.8. This image has a lower resolution than fruitbowl.jpg above (640 × 426 pixels).
Instead of changing the threshold to improve the prediction results, we can keep the threshold at 0.5 (default) and submit higher-resolution images for better predictions.
On a broader note, the usable threshold appears to depend on image quality: the higher the picture quality, the higher the threshold at which objects are still recognized; the lower the picture quality, the lower the threshold needed to recognize them. Some tips on choosing the right threshold value for custom classifiers are shown here: 3rd point in Questions
Also, the documentation mentions that images in the training and testing sets should resemble each other. Significant visual differences between the training and testing groups will result in poor performance. A number of additional factors beyond image resolution affect the quality of training: lighting, angle, focus, color, shape, distance from subject, and the presence of other objects in the image.
So far, we have tested images with pre-trained classifiers (built-in models), where we have no control over the training images. Custom classifiers give much more control over training and test samples, so accuracy can be improved by taking the points above into account.

Assertion 3: Faces are detected in the food image by the following command.

curl -X POST -u "apikey:m2SyTztvn6aR1PFI0i7Lyf9er4Jh8fANO6E0btcYWrAL" --form "images_file=@/Users/krishna/Desktop/img_db/fruitbowl.jpg" "https://gateway.watsonplatform.net/visual-recognition/api/v3/detect_faces?version=2018-03-19"

Assertion 4: The documentation says that, for a given image, age and gender are classified using the general model. However, the JSON responses to curl requests against the general model for the Ginni / Trump images above do not show such a classification.

Assertion 5: The general model does not detect multiple faces within a single image, as shown below.

General model response for face detection

For such classification to happen, we have to call the detect_faces method explicitly when submitting the image through a curl request, as shown below. This means we have to know whether the image contains faces, objects, or food before submitting it.

detect_faces method used in curl request

curl -X POST -u "apikey:m2SyTztvn6aR1PFI0i7Lyf9er4Jh8fANO6E0btcYWrAL" --form "images_file=@/Users/krishna/Desktop/img_db/6_faces_in_single_image.jpg.jpg" "https://gateway.watsonplatform.net/visual-recognition/api/v3/detect_faces?version=2018-03-19"
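When the content of an image is not known in advance, one workaround is to call both classify and detect_faces and combine the results. The sketch below builds both URLs from the curl commands in this README and merges two already-parsed responses. The response shape (an "images" list with "classifiers" or "faces" entries) is an assumption, and the sample responses are invented for illustration.

```python
BASE_URL = "https://gateway.watsonplatform.net/visual-recognition/api/v3"
API_VERSION = "2018-03-19"


def endpoint_urls():
    """URLs for the two methods used in this README."""
    return {
        method: f"{BASE_URL}/{method}?version={API_VERSION}"
        for method in ("classify", "detect_faces")
    }


def summarize(classify_response, faces_response):
    """Count returned classes and faces per image name across both responses."""
    summary = {}
    for image in classify_response.get("images", []):
        entry = summary.setdefault(image.get("image"), {"classes": 0, "faces": 0})
        for classifier in image.get("classifiers", []):
            entry["classes"] += len(classifier.get("classes", []))
    for image in faces_response.get("images", []):
        entry = summary.setdefault(image.get("image"), {"classes": 0, "faces": 0})
        entry["faces"] += len(image.get("faces", []))
    return summary


# Hypothetical, hand-written responses in the assumed shape.
classify_sample = {"images": [{"image": "a.jpg", "classifiers": [
    {"classifier_id": "default", "classes": [{"class": "person", "score": 0.8}]}]}]}
faces_sample = {"images": [{"image": "a.jpg", "faces": [{}, {}]}]}

print(endpoint_urls()["detect_faces"])
print(summarize(classify_sample, faces_sample))
```

Calling both methods doubles the number of requests per image, which matters given the response times noted in this document.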

Assertion 6: Responses are slower as the number of image files grows. The single image below takes less than 1 second, while the next two zipped folders, with 5 and 22 images, take 2.5 seconds and more than 8 seconds respectively.

Time = < 1sec for 1 file

Time = 2.5sec for 5 files

Time = >8 sec for 22 files
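To relate these timings to the real-time video case in Assertion 1, the arithmetic can be sketched as below. The timed helper is a hypothetical wrapper for measuring a single call; the figures passed to per_image_seconds are the ones reported above, where 8 seconds is only a lower bound for the 22-file case.

```python
import time


def timed(call):
    """Run call() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = call()
    return result, time.perf_counter() - start


def per_image_seconds(total_seconds, image_count):
    """Average time spent per image in one request."""
    return total_seconds / image_count


def frame_budget_seconds(fps):
    """Time available per frame if every frame must be classified in real time."""
    return 1.0 / fps


print(per_image_seconds(2.5, 5))    # 5-file zip from the timings above
print(per_image_seconds(8.0, 22))   # 22-file zip, lower bound only
print(frame_budget_seconds(20))     # budget per frame at 20 fps
```

Comparing the per-image average with the per-frame budget gives a quick sense of how far batch requests are from real-time use.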

Detailed JSON response for 20 files - Note that only 20 files are processed as specified in the documentation

Assertion 7: The detailed JSON response above shows that images containing faces carry no age or gender information in the JSON response. We also passed images with combinations such as face and food, food and text, and food and hands. In such cases, the JSON response is restricted to a single category.

Assertion 8: The current UI does not show a train button for uploading images when creating a custom model. Hence we trained our custom models by passing training datasets through curl requests. Check the demo below for further details.
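The shape of such a training request can be sketched in Python without sending anything. This assumes the create-classifier call is a POST to the classifiers path with a name field, one zip per class in a field named after the class plus "_positive_examples", and an optional negative_examples zip. The build_training_request helper, the class name, and the zip file names are hypothetical.

```python
BASE_URL = "https://gateway.watsonplatform.net/visual-recognition/api/v3"
API_VERSION = "2018-03-19"


def build_training_request(name, positive_zips, negative_zip=None):
    """Describe a create-classifier call as a plain dict, without sending it.

    positive_zips maps a class name to the path of a zip of example images.
    Field names are assumptions about the v3 API, not confirmed by this README.
    """
    if not positive_zips:
        raise ValueError("at least one class with positive examples is required")
    files = {
        f"{class_name}_positive_examples": path
        for class_name, path in positive_zips.items()
    }
    if negative_zip is not None:
        files["negative_examples"] = negative_zip
    return {
        "url": f"{BASE_URL}/classifiers?version={API_VERSION}",
        "form": {"name": name},
        "files": files,
    }


# Hypothetical file names mirroring the dogs classifier with cat negatives.
training = build_training_request("dogs", {"dog": "dogs.zip"}, negative_zip="cats.zip")
print(sorted(training["files"]))
```

The resulting dict lists every multipart field a client would need to attach, which makes it easy to review a training set before uploading it.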

Relevant studies

After reviewing Lighthouse and other IBM internal assets, we found a close resemblance between Watson Natural Language Classifier (NLC) and the WVR Text model. NLC is used for text classification (https://www.ibm.com/watson/services/natural-language-classifier/). The WVR Text model is used to identify natural language in the uploaded image.
Watson Personality Insights (Watson PI) https://w3-03.ibm.com/services/lighthouse/documents/61268 uses linguistic analytics to extract a spectrum of cognitive and social characteristics from the text data that a person generates through blogs, tweets, forum posts, and more.
What these two APIs have in common with the WVR Text model is that they identify and classify text. In the past, APIs such as Alchemy Vision https://www.ibm.com/blogs/watson/2016/05/visual-recognition-update/ were deprecated in favor of Visual Recognition, and the Q & A service was deprecated in favor of Engagement - Conversation, because of their similarity in function. However, whether the two APIs above could be merged with Visual Recognition, or vice versa, needs further investigation.
Projects built with Watson APIs, either standalone or in combination with Visual Recognition, NLC, etc., are listed below (study in progress):
- Watson text classification - https://developer.ibm.com/patterns/extend-watson-text-classification/
- Ticket categorization - https://developer.ibm.com/patterns/watson-studio-nlc-technical-support-ticket-categorization/
- Customer communications Insights analyzer - https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/Wfdeebd09058d_481a_945f_bf89b0d58d08/page/Customer%20Communication%20Insights%20Analyzer
- Text and email analyzer - https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/Wfdeebd09058d_481a_945f_bf89b0d58d08/page/Text%20and%20email%20analyzer
- Watson Health depression pre-screening - https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/Wfdeebd09058d_481a_945f_bf89b0d58d08/page/Watson%20Health%20DEP%20Depression%20Pre-Screening
- Optimized dictionary of German street names - https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/Wfdeebd09058d_481a_945f_bf89b0d58d08/page/Optimized%20Dictionary%20of%20Street%20Names%20in%20German
- Troll patrol - https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/Wfdeebd09058d_481a_945f_bf89b0d58d08/page/Troll%20Patrol
- GDPR Email Triage - https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/Wfdeebd09058d_481a_945f_bf89b0d58d08/page/GDPR%20Email%20Triage
- Email routing accelerator - https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/Wfdeebd09058d_481a_945f_bf89b0d58d08/page/Email%20Routing%20Accelerator

Onpremise offering

We can run IBM Watson services on any cloud. As the Visual Recognition service is part of Watson, we can host WVR on premises.
Through the integration with IBM Cloud Private for Data (ICP for Data), Watson and Watson OpenScale can now be run in any environment (on premises, or on any private, public, or hybrid multi-cloud), enabling businesses to apply AI to data wherever it is hosted.
ICP for data purchase details
Transform the Enterprise with IBM Cloud Private on OpenShift


i-krishna/Watson-VisualRecognition
