And henceforth, sound!
Scheme for my 21M.380 Final Project: Musical Pictures
Tools to generate a 3:30 minute composition given a photograph
The composition and the image will be divided into n sections to encourage diversity throughout the piece. To create a MusicalPicture, just construct a MusicalPicture object with two parameters:
MusicalPicture(".png/.jpg/.bmp", number_of_sections)
These slices will be vertical (they will divide the width of the image), and are intended to be performed from left to right, although they can be performed in any order.
Each section of the piece will pull from its own pool of pitches, note durations, and velocities, and will have its own synthesizer instrument.
The information for each section can be written to a .txt file that is formatted as an odot bundle in order to easily be copied into Max/MSP.
A one-dimensional row of information, or a 1D signal, can be generated in many ways from an image. For example, any cross section of the image can be viewed as a 1D signal:
Examples of waveforms taken from the 50th and 1000th rows of an image, respectively (notice how different they can be)
For the composition, I plan to average the pixels across the columns of a section (converted to grayscale) in order to create a 1D signal. There is a tool included, MusicalPicture.generate_single_wav, that will create one period of a signal from the section and save it as a .wav file. This file can be imported into Max/MSP and used as a custom waveform in the cycle~ object.
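As a rough illustration of what that tool does, here is a minimal sketch in Python, assuming Pillow, NumPy, and SciPy; the function name and parameters are illustrative, not the actual MusicalPicture API:

import numpy as np
from PIL import Image
from scipy.io import wavfile

def section_to_wave(image_path, section_index, num_sections, out_path, sample_rate=44100):
    # Load the image as grayscale so each pixel is a single value.
    gray = np.asarray(Image.open(image_path).convert("L"), dtype=float)
    height, width = gray.shape
    # Slice out one vertical strip of the image.
    start = section_index * width // num_sections
    stop = (section_index + 1) * width // num_sections
    section = gray[:, start:stop]
    # Average across the columns so each row collapses to one value (a 1D signal).
    profile = section.mean(axis=1)
    # Center and normalize to [-1, 1] so it can serve as one period of a waveform.
    profile = profile - profile.mean()
    peak = np.abs(profile).max()
    if peak > 0:
        profile = profile / peak
    wavfile.write(out_path, sample_rate, profile.astype(np.float32))

section_to_wave("photo.jpg", 0, 5, "section0_wave.wav")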
Each section will pull from a default of 7 pitches representing the 7 most dominant hues in the section of the image. I found the most dominant colors in the HSV colorspace by using K-Means Clustering.
Example of the most dominant colors from each section of an image
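For reference, here is one way that clustering step could look in Python. I am assuming scikit-learn and Pillow here; the actual tool may use a different k-means implementation or HSV conversion:

import numpy as np
from sklearn.cluster import KMeans

def dominant_hsv_colors(section_image, k=7):
    # section_image is assumed to be an RGB PIL Image of one vertical strip.
    # Convert to HSV so the clusters group by hue, saturation, and value.
    hsv = np.asarray(section_image.convert("HSV"), dtype=float)
    pixels = hsv.reshape(-1, 3)
    km = KMeans(n_clusters=k, n_init=10).fit(pixels)
    # Count how many pixels fall into each cluster to measure dominance.
    counts = np.bincount(km.labels_, minlength=k)
    order = np.argsort(counts)[::-1]
    return km.cluster_centers_[order], counts[order] / counts.sum()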
Now that the colors are extracted, they need to be converted into pitches (in Hz). Drawing inspiration from this article, the tool finds the closest western chromatic pitch associated with the hue of the color, and shifts it up an octave depending on how saturated the color is.
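The exact mapping lives inside the tool, but the general idea can be sketched like this; the hue-to-light-frequency mapping and the saturation threshold below are my own assumptions, not necessarily the numbers the article or the tool uses:

import math

def hue_to_pitch_hz(hue_deg, saturation, a4=440.0):
    # Treat the hue as a frequency of visible light (roughly 400-750 THz)...
    light_hz = 400e12 + (hue_deg / 360.0) * 350e12
    # ...and drop it about 40 octaves so it lands in the audible range.
    audio_hz = light_hz / 2 ** 40
    # Snap to the nearest pitch on the 12-tone equal-tempered chromatic scale.
    semitones = round(12 * math.log2(audio_hz / a4))
    pitch = a4 * 2 ** (semitones / 12)
    # More saturated colors get bumped up an octave (threshold is an assumption).
    if saturation > 0.5:
        pitch *= 2
    return pitch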
The tool will also record how frequently each color appears; these weights are fed into o.random.weighted so that pitches from more dominant colors are picked more often.
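On the Max side the weighted choice happens in o.random.weighted; for testing outside Max, the same idea can be sketched in Python with random.choices (the pitch and weight values below are made up for illustration):

import random

pitches = [370.0, 415.3, 466.2, 523.3, 554.4, 622.3, 740.0]   # Hz, example pool
weights = [0.31, 0.22, 0.15, 0.12, 0.09, 0.07, 0.04]          # how dominant each color is

# Pitches tied to more dominant colors are drawn more often.
next_pitch = random.choices(pitches, weights=weights, k=1)[0]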
For my final project, I decided to explore the pure sonification of a photograph. While sound is a one-dimensional signal, pictures can be represented in two or three dimensions, depending on the type of color data available. Working with a colored image, I had many degrees of freedom in how I chose to sonify each component. The further I went, the more I realized that a "pure" sonification does not really exist when manipulating data in this way. However, I did try to limit my creative decisions to ones focused on the image data.
The general scheme for the project changed over time, but eventually settled on a few aspects of the image. First, the image is divided into 5 vertical strips (this can be any number, but I decided on 5). Each section of the image represents a section of the piece, read from left to right. To compose a 3.5 minute piece, I specified each section to last 42 seconds. There are two main areas of data I extracted from each section:
The main synthesizer I used was built around a custom waveform in the cycle~ object. This custom waveform was generated directly from the shape of the image data. Using Python, I converted the image sections to grayscale, meaning each pixel has only one value. I then averaged the grayscale values along the rows, so the result was a 1D array of numbers. These numbers represented a wave shape that I could save as a .wav file and import into Max. The differences in the timbre of the synthesizers from section to section are subtle, but can definitely be heard, which was exciting.
Once the instruments were designed, I decided to structure the melody of the piece by choosing from a random pool of pitches and durations. The pool was chosen from the image segments by looking at the color information in each one. I started by finding the 7 most frequently appearing colors in the segment, using k-means clustering in Python. Instead of leaving the colors encoded as RGB, I transformed them into the Hue Saturation Value (HSV) colorspace (displayed below), which I thought would be more useful. I wanted the hue to inspire the pitches that were used, so I converted each hue to the note on the western chromatic scale that holds the same frequency as the color, but roughly 40 octaves down! It was really interesting researching various ways to associate colors and sound, but this way seemed the most straightforward.
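As a quick sanity check on that figure: visible light sits roughly between 4 x 10^14 and 7.5 x 10^14 Hz, and dividing by 2^40 (about 1.1 x 10^12) brings that band down to roughly 360-680 Hz, comfortably inside the audible range, so a drop of about 40 octaves really is what it takes to turn a color's frequency into a pitch.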
I used the "value" information (or basically, how dark the colors were) to generate the durations. This meant that brighter sections would have more quickly changing notes, which you can notice around the middle of the piece, where there is the bright sun in the center. The notes move faster, and I also increased the volume when the overall brightness of the section was higher.
The "saturation" information (or how grey a color is) was only used to increase pitches by an octave if the colors were more saturated. This was simply to add a bit more range to the pool of pitches. All of the color data I exported to a text file I could copy and paste into an odot bundle, and I did most of the conversions using odot scripting.
The second major topic from the class I decided to implement was randomness. Once the pool of notes was generated, I used the frequency with which the colors appear in a weighted random selector to choose which note would play and for how long. For example, if there is a lot of red in one segment of the image, you hear the pitch F more frequently than the other notes in the pool; if it is a darker red, the note durations are often much longer.
In order to add more texture, I cycled through the pool of pitches in each segment, dropped them down by 2 octaves, and fed them into a droning synthesizer that played underneath the melody. This made the piece much easier to listen to, and I ended up liking the result enough to keep it in the piece.
When I started this project, I had no idea what the final result would sound like, and I let the image shape the result for the most part. I did not expect the result to resemble an "aesthetic" representation of the image. In fact, the piece sounds almost sinister and creepy, whereas the picture looks quite happy and hopeful! What I would try to demonstrate with this piece is that data does not always represent the beauty in something. However, there are infinitely many other ways I could have sonified the image that might have produced a less dissonant result; this is just how my manipulation of the parameters worked out.
Had I had more time, I would have tried to find a way to make the sections of the image less discrete and more continuous. Then the piece would progress without the listener really noticing the changes happening until perhaps the end, when the tone becomes quite different. I would also have liked to tie the note envelopes to the photo information, so that there is more variety in the note shapes. Finally, I would love to run my code on lots of other images, especially ones with a wider variety of colors. It would be really interesting to see what sorts of musical moods could be created from different types of images, and what patterns emerge.