Watson offers various services that you weave together to solve the user's problem. Watson does not just know; it has to be taught. Cognitive systems are not programmed, they are trained. There are five key Watson patterns: Engagement, Discovery, Decision, Policy, and Exploration.
Let us look further into one Watson API learning model, visual recognition. Watson Visual Recognition (WVR) has six basic models, as shown below.
Consider a tyre image from the demo here and go through the classification results shown below.
Get the API key credentials for the visual recognition service from your IBM Cloud account.
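One way to verify the credentials work is to list the classifiers on the account; a minimal sketch, with {your_apikey} standing in for your own key:
curl -u "apikey:{your_apikey}" "https://gateway.watsonplatform.net/visual-recognition/api/v3/classifiers?version=2018-03-19"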
We created a custom model that classifies dogs, supplying a negative sample of cats. The JSON response below shows the training phase of the custom classifier dogs_2025763446.
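A training request of roughly this shape produces that response; a sketch, assuming the positive and negative samples are zipped as dogs.zip and cats.zip (file paths hypothetical):
curl -X POST -u "apikey:{your_apikey}" --form "dogs_positive_examples=@/Users/krishna/Desktop/img_db/dogs.zip" --form "negative_examples=@/Users/krishna/Desktop/img_db/cats.zip" --form "name=dogs" "https://gateway.watsonplatform.net/visual-recognition/api/v3/classifiers?version=2018-03-19"
The class name ("dogs") is taken from the prefix of the {classname}_positive_examples form parameter.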
Once training starts, make a GET request and wait for the status to become ready before passing test samples.
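A status check of this form, using the classifier id returned at creation time:
curl -u "apikey:{your_apikey}" "https://gateway.watsonplatform.net/visual-recognition/api/v3/classifiers/dogs_2025763446?version=2018-03-19"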
Below you find one positive (golden retriever dog) and one negative (apples) JSON response when the test images are passed to the custom classifier dogs_2025763446.
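The test requests target the custom model by passing its classifier id; a sketch, with hypothetical image paths:
curl -X POST -u "apikey:{your_apikey}" --form "images_file=@/Users/krishna/Desktop/img_db/golden_retriever.jpg" --form "classifier_ids=dogs_2025763446" "https://gateway.watsonplatform.net/visual-recognition/api/v3/classify?version=2018-03-19"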
The documentation specifies the WVR custom model limitations.
Assertion 1: The documentation says that the form parameter images_file can be a single file or a zip file with a maximum of 20 images; the maximum size of such a zip file is 100 MB. This is not ideal for real-time video classification running at more than 20 fps, since a single API call can cover at most 20 frames.
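One way to stay under those limits is to sample frames and batch them before upload; a rough sketch, assuming ffmpeg and zip are available (video and batch file names hypothetical):
mkdir -p frames
# sample 1 frame per second; anything near the native fps would exceed the 20-image cap per call
ffmpeg -i sample_video.mp4 -vf fps=1 frames/frame_%03d.jpg
# pack at most 20 frames per request, keeping the archive under the 100 MB zip limit
zip batch_01.zip $(ls frames/*.jpg | head -20)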
Assertion 2: When using the general model, the JSON response does not show all the objects, such as an apple, within an image.
curl -X POST -u "apikey:m2SyTztvn6aR1PFI0i7Lyf9er4Jh8fANO6E0btcYWrAL" --form "images_file=@/Users/krishna/Desktop/img_db/fruitbowl.jpg" "https://gateway.watsonplatform.net/visual-recognition/api/v3/classify?version=2018-03-19"
Assertion 3: Since assertion 2 uses the general model and not all objects within the food image are shown, we now pass the classifier id food. Even then, not all fruits (e.g., oranges) are classified at the default threshold, as shown below.
Test with image sample of fruitbowl.jpg
curl -X POST -u "apikey:m2SyTztvn6aR1PFI0i7Lyf9er4Jh8fANO6E0btcYWrAL" --form "images_file=@/Users/krishna/Desktop/img_db/fruitbowl.jpg" --form "classifier_ids=food" "https://gateway.watsonplatform.net/visual-recognition/api/v3/classify?version=2018-03-19"
Thresholds of 0 and 0.5 (default)
Thresholds of 0.6 and 0.9
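The threshold values above are passed through the threshold form parameter; a sketch at 0.6, using the same image and food classifier as before:
curl -X POST -u "apikey:{your_apikey}" --form "images_file=@/Users/krishna/Desktop/img_db/fruitbowl.jpg" --form "classifier_ids=food" --form "threshold=0.6" "https://gateway.watsonplatform.net/visual-recognition/api/v3/classify?version=2018-03-19"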
Note:
- In fruitbowl.jpg (640 × 426 pixel image resolution), the apples and banana are not recognized when the threshold is above 0.6; the default threshold of 0.5, and anything below it, recognizes both fruits. Raising the threshold can improve the quality of the predictions, but sometimes objects drop out of the predictions entirely.
Test with another image sample of Apples_green_red.jpg
Default threshold of 0.5
Thresholds of 0.7 and 0.8
- In Apples_green_red.jpg (342 × 147 pixel image resolution), none of the objects are recognized when the threshold is raised to 0.8. Its resolution is lower than that of fruitbowl.jpg above (640 × 426 pixels).
- Instead of changing the threshold to improve the prediction results, we can keep the threshold at the 0.5 default and submit higher-resolution images for better prediction results.
- On a broader note, the usable threshold tracks image quality: the higher the picture quality, the higher the threshold at which objects can still be recognized; the lower the picture quality, the lower the threshold must be for objects to be recognized at all. Some tips on choosing the right threshold value for custom classifiers are given in the 3rd point in Questions.
- The documentation also mentions that images in the training and testing sets should resemble each other; significant visual differences between the training and testing groups will result in poor performance. A number of additional factors beyond image resolution will impact the quality of your training: lighting, angle, focus, color, shape, distance from the subject, and the presence of other objects in the image.
- So far, we tested images against the pre-trained, built-in models, where we have no control over the training images. Custom classifiers give us much more control over the training and test samples, which can improve accuracy in view of the points above.
Assertion 4: Faces are detected in the food image with the following command.
curl -X POST -u "apikey:m2SyTztvn6aR1PFI0i7Lyf9er4Jh8fANO6E0btcYWrAL" --form "images_file=@/Users/krishna/Desktop/img_db/fruitbowl.jpg" "https://gateway.watsonplatform.net/visual-recognition/api/v3/detect_faces?version=2018-03-19"
Assertion 5: The documentation says that age and gender are classified for a given image using the general model. However, the JSON responses for curl requests using the general model on the Ginni / Trump images above do not show such classification.
Assertion 6: The general model does not detect multiple faces within a single image, as shown below.
General model response for face detection
For such classification to happen, we have to call the detect_faces endpoint explicitly when submitting the image through a curl request, as shown below. This means we have to know whether we are passing a face, object, or food image before submitting it.
detect_faces endpoint used in the curl request
curl -X POST -u "apikey:m2SyTztvn6aR1PFI0i7Lyf9er4Jh8fANO6E0btcYWrAL" --form "images_file=@/Users/krishna/Desktop/img_db/6_faces_in_single_image.jpg.jpg" "https://gateway.watsonplatform.net/visual-recognition/api/v3/detect_faces?version=2018-03-19"
Assertion 7: Responses are delayed as the number of image files grows. The first image below takes <1 second, while the next two zipped folders, with 5 and 22 images, take 2.5 seconds and more than 8 seconds respectively.
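These timings can be reproduced by wrapping the request in time; a sketch, with the zip file name hypothetical:
time curl -X POST -u "apikey:{your_apikey}" --form "images_file=@/Users/krishna/Desktop/img_db/batch_22.zip" "https://gateway.watsonplatform.net/visual-recognition/api/v3/classify?version=2018-03-19"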
Detailed JSON response for 20 files - note that only 20 of the images are processed, in line with the limit specified in the documentation.
Assertion 8: From the detailed JSON response above, we can observe that images containing faces carry no age/gender information in the response. We also passed images with combinations such as face and food, food and text, and food and hands; in such cases, the JSON responses are restricted to a single category.
Assertion 9: The current UI does not show a train button for uploading images during custom model creation, so we trained our custom models by passing the training datasets through curl requests. Check the demo below for further details.
After reviewing Lighthouse and other IBM internal assets, we found a close resemblance between the Watson Natural Language Classifier (NLC) and the WVR text model. NLC is used for text classification (https://www.ibm.com/watson/services/natural-language-classifier/), while the WVR text model is used to identify the natural language in an uploaded image.
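For reference, NLC classifies free text against a trained classifier; a sketch, with the apikey and classifier id as placeholders:
curl -X POST -u "apikey:{your_apikey}" -H "Content-Type: application/json" -d '{"text": "How hot will it be today?"}' "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/{classifier_id}/classify"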
Watson Personality Insights (Watson PI, https://w3-03.ibm.com/services/lighthouse/documents/61268) uses linguistic analytics to extract a spectrum of cognitive and social characteristics from the text data that a person generates through blogs, tweets, forum posts, and more.
The common point between the above two APIs and the WVR text model is identifying and classifying text. There have been precedents for merging similar services: AlchemyVision (https://www.ibm.com/blogs/watson/2016/05/visual-recognition-update/) was deprecated in favor of visual recognition, and the Q&A service was deprecated in favor of Engagement - Conversation. However, whether the two APIs above could similarly merge into visual recognition, or vice versa, needs further investigation.
Projects built with Watson APIs, whether standalone or in combination with visual recognition, NLC, etc., are listed below: (Study in progress...)
- Watson text classification - https://developer.ibm.com/patterns/extend-watson-text-classification/
- Ticket categorization - https://developer.ibm.com/patterns/watson-studio-nlc-technical-support-ticket-categorization/
- Customer communication insights analyzer - https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/Wfdeebd09058d_481a_945f_bf89b0d58d08/page/Customer%20Communication%20Insights%20Analyzer
- Text and email analyzer - https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/Wfdeebd09058d_481a_945f_bf89b0d58d08/page/Text%20and%20email%20analyzer
- Watson Health depression pre-screening - https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/Wfdeebd09058d_481a_945f_bf89b0d58d08/page/Watson%20Health%20DEP%20Depression%20Pre-Screening
- Optimized dictionary of German street names - https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/Wfdeebd09058d_481a_945f_bf89b0d58d08/page/Optimized%20Dictionary%20of%20Street%20Names%20in%20German
- Troll patrol - https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/Wfdeebd09058d_481a_945f_bf89b0d58d08/page/Troll%20Patrol
- GDPR email triage - https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/Wfdeebd09058d_481a_945f_bf89b0d58d08/page/GDPR%20Email%20Triage
- Email routing accelerator - https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/Wfdeebd09058d_481a_945f_bf89b0d58d08/page/Email%20Routing%20Accelerator
We can run IBM Watson services on any cloud. As the visual recognition service is part of Watson, we can host WVR on premises.
Through the integration with IBM Cloud Private for Data (ICP for Data), Watson and Watson OpenScale can now run in any environment (on premises, or on any private, public, or hybrid multicloud), enabling businesses to apply AI to data wherever it is hosted.
Transform the Enterprise with IBM Cloud Private on OpenShift
- https://sourcedexter.com/ibm-visual-recognition-api-part-1/
- WVR as a service in IBM Cloud
- Visual recognition overview - Data platform
- Visual recognition overview
- Visual recognition getting-started-tutorial
- Visual recognition Bluemix docs
- Visual recognition - customizing guidelines
- Cognitive analytics community material
- IBM Cloud Private for Data - tl;dr
- IBM Cloud Private for Data - in detail
- Visual recognition Blog
- Visual recognition - custom model
- Visual recognition - custom model video1
- Visual recognition - custom model video2
- Visual recognition - Redbook