Allow "pepper" or "vegetable" #72

ccreutzi · 2024-08-06T10:17:54Z

The moondream model is unreliable in reporting peppers, or even in reporting anything at all. Notice the empty responses in several of these calls:

>> disp(generate(chat,messages))

 The image features a bunch of fruits and vegetables, including several bell peppers and chilies in various colors such as red, green, yellow, purple, and white. There is also an onion and some garlic on the table along with other food items that could be related to a meal or ingredient preparation.
>> disp(generate(chat,messages))

 The image displays a group of fresh vegetables on top of purple fabric. Among the vegetables are several peppers and garlic, some lying flat while others stand tall or in various positions.
>> disp(generate(chat,messages))
urn of different colored vegetables such as peppers and onions are placed on a table.
>> disp(generate(chat,messages))

 The image shows an assortment of various colorful vegetables, primarily focused on the peppers. There is a wide range of vegetables displayed in different shapes and sizes, creating a visually appealing arrangement that highlights their unique characteristics.
>> disp(generate(chat,messages))
>> disp(generate(chat,messages))

 The image displays a table with an arrangement of colorful, whole vegetables such as peppers and onions. These ingredients are spread across the table, showcasing a variety of colors including reds, oranges, greens, yellows, and purples. There is also an onion among these vegetables, further adding to their vibrant appearance. Overall, the scene captures a visually appealing display of fresh produce.
>> disp(generate(chat,messages))
>> disp(generate(chat,messages))
>> disp(generate(chat,messages))
 The image shows a variety of brightly colored vegetables sitting on a table, including red bell pepper and green bell pepper. The vibrant hues and diverse shapes of the peppers create an eye-catching display against the purple background.
>> disp(generate(chat,messages))
>> disp(generate(chat,messages))
xtracted image of many different kinds of vegetables including red bell peppers, onions and garlic, as well as carrots.
>> disp(generate(chat,messages))
>> disp(generate(chat,messages))

The image features a variety of colorful vegetables, including bell peppers and garlic, all stacked up in an appealing manner. There are also several carrots mixed throughout the pile, contributing to the vibrant display.
>> disp(generate(chat,messages))
 The image features a pile of fresh vegetables, including numerous bell peppers in various shades. This vibrant collection includes both red and green varieties that create a visually appealing display on the table or surface they are placed on.
>> disp(generate(chat,messages))

 The image displays a variety of fresh vegetables, including bell peppers and carrots.

There have also been multiple CI failures like this one:

      Actual Value:
          "urn of vegetables is on display"
      Expected Substring:
          "pepper"

We are not interested in testing the model itself, but we do want to run an end-to-end test to ensure that we pass images in the data format required by Ollama. The test therefore generates multiple responses and checks that at least one of them mentions "pepper" or "vegetable".
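The check described above could look roughly like the following sketch. It is not the code from this PR: the helper name `anyResponseMentions`, its `numTries` parameter, and the stand-in generator are hypothetical and used only for illustration. In the real test, the function handle would presumably wrap the call shown in the transcript, `@() generate(chat,messages)`.

```matlab
% Stand-in generator (hypothetical): always returns one of the truncated
% responses seen in CI, so the sketch runs without Ollama installed.
fakeGenerate = @() "urn of vegetables is on display";

% Passes because "vegetable" is accepted in addition to "pepper".
found = anyResponseMentions(fakeGenerate, ["pepper","vegetable"], 5);
assert(found)

function tf = anyResponseMentions(generateFcn, keywords, numTries)
% Call generateFcn up to numTries times and report whether any response
% contains at least one of the keywords (case-insensitive).
% Empty responses simply count as "no match" and trigger another try.
tf = false;
for k = 1:numTries
    response = string(generateFcn());
    if any(contains(response, keywords, IgnoreCase=true))
        tf = true;
        return
    end
end
end
```

Stopping at the first match keeps the number of model calls low when the model answers sensibly, while still tolerating the empty or truncated responses shown above.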


codecov-commenter · 2024-08-06T10:19:37Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.11%. Comparing base (fce8bb8) to head (b5ad5d5).
Report is 2 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main      #72   +/-   ##
=======================================
  Coverage   97.10%   97.11%           
=======================================
  Files          40       41    +1     
  Lines        1349     1350    +1     
=======================================
+ Hits         1310     1311    +1     
  Misses         39       39


vpapanasta

I am happy with this change. However, if we see that moondream makes the tests flaky, we could try out llava:7b which is a bit larger and might provide more robust answers for the test purposes.

ccreutzi · 2024-08-06T11:28:02Z

> I am happy with this change. However, if we see that moondream makes the tests flaky, we could try out llava:7b which is a bit larger and might provide more robust answers for the test purposes.

We just moved away from bakllava:7b, which is just as big as llava:7b, because of the download size.

This is the only test point using a vision model in Ollama, at least so far.

ccreutzi requested review from adulai, debymf, MiriamScharnke and vpapanasta as code owners August 6, 2024 10:17

vpapanasta approved these changes Aug 6, 2024

MiriamScharnke approved these changes Aug 6, 2024

ccreutzi merged commit 150d9c1 into main Aug 6, 2024
1 check passed

ccreutzi deleted the img-descriptions branch August 6, 2024 12:35
