User Story -> Kaggle Competition #1116

bw4sz · 2025-04-14T21:53:22Z

bw4sz
Apr 14, 2025

Hi Kitzes lab, ben weinstein here. I thought i'd narrate a somewhat unorganized unboxing story as I play with opensoundscape for the first time. These are just thoughts for user design and experience. I would love if anyone wanted to do the same for DeepForest.

User

Ben Weinstein

Experience Level

10+ years developing machine learning models and open source packages. No experience with audio data. Working on a workshop in Colombia with Santiago.

User goal

To load the kaggle competition data https://www.kaggle.com/competitions/birdclef-2025
Perform basic audio manipulation
Download and run the pretrained models
Extract the model backbone and underlying pytorch to attach my own cometml logger
Use as a teaching device for Colombian participants of our workshop -> https://weecology.github.io/AI_for_ecology_workshop. Just education, no intent on submitting competitive notebooks.

Getting Started

Tutorials and getting started docs are clean and compelling. They gave me confidence and made sense.

Sample data

https://drive.google.com/file/d/11BT4trlQsUsSfrRm79rsn3pnSl9zNQ9m/view?usp=sharing

Installation

installation process with multiple packages, models and dependencies tripped me up. Easy to solve for me, probably not for others.
There are lots of models to choose from, as a novice user I would have liked the experts (you) to pick a horse in the race. A default model of course makes no sense for all the problems, but just gets me off the ground. Probably having one model in opensoundscape as a default with all dependencies would be a good compromise with the rest of the models in the zoo. I am biased towards pytorch over tensorflow, all the tools are there.

Load the audio

audio = Audio.from_file("../birdclef-2025/train_audio/21038/iNat65519.ogg")
audio

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File ~/miniconda3/envs/megadetector/lib/python3.11/site-packages/IPython/core/formatters.py:406, in BaseFormatter.__call__(self, obj)
    404     method = get_real_method(obj, self.print_method)
    405     if method is not None:
--> 406         return method()
    407     return None
    408 else:

File ~/miniconda3/envs/megadetector/lib/python3.11/site-packages/opensoundscape/audio.py:364, in Audio._repr_html_(self)
    357 def _repr_html_(self):
    358     """create interactive audio widget
    359 
    360     This method is used by Jupyter Notebook if the object is returned from a cell.
    361     It uses the IPython.display.Audio class to create an interactive Audio widget.
    362 
    363     """
--> 364     return self._to_ipdisplay_audio()._repr_html_()

File ~/miniconda3/envs/megadetector/lib/python3.11/site-packages/opensoundscape/audio.py:350, in Audio._to_ipdisplay_audio(self, normalize, autoplay)
    348 def _to_ipdisplay_audio(self, normalize=False, autoplay=False):
    349     """create interactive IPython display audio object"""
--> 350     return IPython.display.Audio(
    351         data=self.samples,
    352         rate=self.sample_rate,
    353         normalize=normalize,
    354         autoplay=autoplay,
    355     )

File ~/miniconda3/envs/megadetector/lib/python3.11/site-packages/IPython/lib/display.py:130, in Audio.__init__(self, data, filename, url, embed, rate, autoplay, normalize, element_id)
    128 if rate is None:
    129     raise ValueError("rate must be specified when data is a numpy array or list of audio samples.")
--> 130 self.data = Audio._make_wav(data, rate, normalize)

File ~/miniconda3/envs/megadetector/lib/python3.11/site-packages/IPython/lib/display.py:152, in Audio._make_wav(data, rate, normalize)
    149 import wave
    151 try:
--> 152     scaled, nchan = Audio._validate_and_normalize_with_numpy(data, normalize)
    153 except ImportError:
    154     scaled, nchan = Audio._validate_and_normalize_without_numpy(data, normalize)

File ~/miniconda3/envs/megadetector/lib/python3.11/site-packages/IPython/lib/display.py:186, in Audio._validate_and_normalize_with_numpy(data, normalize)
    183     raise ValueError('Array audio input must be a 1D or 2D array')
    185 max_abs_value = np.max(np.abs(data))
--> 186 normalization_factor = Audio._get_normalization_factor(max_abs_value, normalize)
    187 scaled = data / normalization_factor * 32767
    188 return scaled.astype("<h").tobytes(), nchan

File ~/miniconda3/envs/megadetector/lib/python3.11/site-packages/IPython/lib/display.py:213, in Audio._get_normalization_factor(max_abs_value, normalize)
    210 @staticmethod
    211 def _get_normalization_factor(max_abs_value, normalize):
    212     if not normalize and max_abs_value > 1:
--> 213         raise ValueError('Audio data must be between -1 and 1 when normalize=False.')
    214     return max_abs_value if normalize else 1

ValueError: Audio data must be between -1 and 1 when normalize=False.

that error made me think I had to normalize, or that normalize would be an argument.

audio = Audio.from_file("../birdclef-2025/train_audio/21038/iNat65519.ogg", normalize=True)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[13], line 2
      1 # Load the audio
----> 2 audio = Audio.from_file("../birdclef-2025/train_audio/21038/iNat65519.ogg", normalize=True)
      3 audio

TypeError: Audio.from_file() got an unexpected keyword argument 'normalize'

Then I realized this is just connecting with jupyter notebooks.

Predicting with a prebuilt model

At first it felt like the Audio class is the centerpiece of the package, so I thought the predict function would operate on that class.

file_path = "../birdclef-2025/train_audio/21038/iNat65519.ogg"
audio = Audio.from_file(file_path)
m.predict(audio)

AssertionError                            Traceback (most recent call last)
Cell In[21], line 1
----> 1 m.predict(audio)
      2 m.predict(file_path)    

File ~/miniconda3/envs/megadetector/lib/python3.11/site-packages/opensoundscape/ml/cnn.py:1072, in SpectrogramClassifier.predict(self, samples, batch_size, num_workers, activation_layer, split_files_into_clips, clip_overlap, clip_overlap_fraction, clip_step, overlap_fraction, final_clip, bypass_augmentations, invalid_samples_log, raise_errors, wandb_session, return_invalid_samples, progress_bar, **kwargs)
    999 """Generate predictions on a set of samples
   1000 
   1001 Return dataframe of model output scores for each sample.
   (...)   1069 
   1070 """
   1071 # create dataloader to generate batches of AudioSamples
-> 1072 dataloader = self.predict_dataloader(
   1073     samples,
   1074     bypass_augmentations=bypass_augmentations,
   1075     split_files_into_clips=split_files_into_clips,
   1076     overlap_fraction=overlap_fraction,
   1077     clip_overlap=clip_overlap,
   1078     clip_overlap_fraction=clip_overlap_fraction,
   1079     clip_step=clip_step,
   1080     final_clip=final_clip,
   1081     batch_size=batch_size,
   1082     num_workers=num_workers,
   1083     raise_errors=raise_errors,
...
     76     if isinstance(samples, CategoricalLabels):
     77         # extract sparse multihot label df
     78         samples = samples.mutihot_df_sparse

AssertionError: `samples` must be either: (a) list or np.array of files, or DataFrame with (b) file as Index, (c) (file,start_time,end_time) as MultiIndex, or (d) CategoricalLabels object
Output is truncated. View as a [scrollable element](command:cellOutput.enableScrolling?680e5cb0-c09b-4670-a5da-46a58d94d64b) or open in a [text editor](command:workbench.action.openLargeOutput?680e5cb0-c09b-4670-a5da-46a58d94d64b). Adjust cell output [settings](command:workbench.action.openSettings?%5B%22%40tag%3AnotebookOutputLayout%22%5D)...

but that's not right, the prediction happens at the file level?

m.predict(file_path)

works. but now i'm confused, if i perform functions with the Audio class, lots of cool functionality there, do I then save that file to predict it?

Analyzing results

The function returns the embeddings, but most users won't know what that is, it feels like a similiar theory as above, I expected the softmax function to be the default, and the embeddings be an optional argument, not the reverse. Perhaps this is the difference in DeepForest versus opensoundscape users, we try to always prioritize the more novice user, since they are more likely to get discouraged earlier, anyone know who what an embedding is, will know to look for an argument for it.

Other assorted notes

I am a huge fan of pytorch lightning. Hard to tell from https://opensoundscape.org/en/latest/tutorials/training_with_lightning.html#Create-Lightning-copmatible-model, whether all models, like the one above, are compatible, and what pieces I can abstract here (like the pytorch lightning comet logger).
Was expecting, and its surely there, to be able to tell what species are in each model

BirdSetEfficientNetB1(
  (network): EfficientNetLogits(
    (efficientnet): EfficientNetModel(
      (embeddings): EfficientNetEmbeddings(
        (padding): ZeroPad2d((0, 1, 0, 1))
        (convolution): Conv2d(1, 32, kernel_size=(3, 3), stride=(2, 2), padding=valid, bias=False)
        (batchnorm): BatchNorm2d(32, eps=0.001, momentum=0.99, affine=True, track_running_stats=True)
        (activation): SiLU()
      )
      (encoder): EfficientNetEncoder(
        (blocks): ModuleList(
          (0): EfficientNetBlock(
            (depthwise_conv): EfficientNetDepthwiseLayer(
              (depthwise_conv_pad): ZeroPad2d((0, 1, 0, 1))
              (depthwise_conv): EfficientNetDepthwiseConv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
              (depthwise_norm): BatchNorm2d(32, eps=0.001, momentum=0.99, affine=True, track_running_stats=True)
              (depthwise_act): SiLU()
            )
            (squeeze_excite): EfficientNetSqueezeExciteLayer(
              (squeeze): AdaptiveAvgPool2d(output_size=1)
              (reduce): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1), padding=same)
              (expand): Conv2d(8, 32, kernel_size=(1, 1), stride=(1, 1), padding=same)
              (act_reduce): SiLU()
              (act_expand): Sigmoid()
            )
...
    (dropout): Dropout(p=0.2, inplace=False)
    (classifier): Linear(in_features=1280, out_features=9736, bias=True)
  )
  (loss_fn): BCEWithLogitsLoss_hot()
)

or maybe m.labels. Clicking into the model link got me the repo, but no closer to accessing which species labels, you can see them after prediction. Give the brutal nature of taxonomy, especially in birds, connecting to some kind of taxize type thing feels key. Or atleast trying to document (for a default model) what taxonomy was used.

Fine-tuning

The annotation data structure isn't clear to me, most/all of the tutorial examples use 'BoxedAnnotations', which looking at source, assumes that you are subsetting longer recordings. What if the data you have is already clipped?

from opensoundscape.annotations import BoxedAnnotations

# Create a dataframe of annotations
annotations = BoxedAnnotations.from_raven_files(
    raven_files=selection_files, audio_files=audio_files, annotation_column="Species"
)

My data looks like

audio.metadata
{'title': 'Yellow Oriole (Icterus nigrogularis nigrogularis)', 'artist': 'Michel Giraud-Audine', 'album': 'xeno-canto', 'genre': 'Icteridae', 'samplerate': 32000, 'format': 'OGG', 'frames': 1422733, 'sections': 1, 'subtype': 'VORBIS', 'duration': 44.46040625, 'channels': 1}

I'll update here as I continue. Thanks for your interest and amazing work. I know how hard it is.

sammlapp · 2025-04-15T12:31:13Z

sammlapp
Apr 15, 2025
Maintainer

Thanks Ben, this is extremely useful and I appreciate you taking the time to make these notes!

0 replies

bw4sz · 2025-04-15T20:34:43Z

bw4sz
Apr 15, 2025
Author

Continue day 2.

I was expecting to find a 'trim' frequency clip, since after my band passes, data looks like

reminder that I have no real audio experience, so maybe thats a bad idea.

1 reply

sammlapp Apr 15, 2025
Maintainer

the Spectrogram.bandpass() method is what you're looking for here, it crops / trims the Spectrogram frequencies (whereas methods of the Audio signal apply frequency filters to the time signal).
eg

a=Audio.noise(5,32000,'blue')
Spectrogram.from_audio(a).bandpass(2000,5000).plot()

due to the nature of audio signals and fourier transforms, the spectrogram will always have frequencies spanning 0 to sample_rate/2, and "cropping" or "trimming" frequencies is only possible on a spectrum or spectrogram, rather than on the audio waveform.

bw4sz · 2025-04-15T21:24:41Z

bw4sz
Apr 15, 2025
Author

Still working on the general flow of fine-tuning, whether to use the https://opensoundscape.org/en/latest/source/opensoundscape.ml.html#opensoundscape.ml.datasets.AudioFileDataset or create my own dataset and dataloader classes and use pytorch lightning. I cannot follow the docs in terms of how the dataframes should be formatted.

but looking at the API source, the key sentence is that the filename is the index, and the rest is one hot encoded. Most of the examples come out of raven annotations and the clip_labels functions, making it hard to match from what I've got. I succeeded with

1 reply

bw4sz Apr 15, 2025
Author

stopping here for the day, will run on multi-gpu tomorrow.

https://github.com/weecology/AI_for_ecology_workshop/blob/main/collboration_challenge/benweinstein/OpenSoundScape.ipynb

bw4sz · 2025-04-16T22:59:24Z

bw4sz
Apr 16, 2025
Author

I think this is related to the high level concept of how annotations are structured, all the examples comes from the boxed annotations concept. When you have full recordings, its difficult to translate that to the documents. For example, in this Kaggle competition, you get a minute long recording of an Oriole, should I mock a dataframe in 5 second increments that label each 5 as 'oriole'?

the ghost of these choices is clear in the embedding structure.

For example, here is my input sample.

formatted_train_df.head()
                                                    22973  22976  24272  24292  ...  yelori1  yeofly1  yercac1  ywcpar
full_path                                                                       ...                                   
collaboration_challenge/birdclef-2025/train_aud...      1      0      0      0  ...        0        0        0       0
collaboration_challenge/birdclef-2025/train_aud...      0      1      0      0  ...        0        0        0       0
collaboration_challenge/birdclef-2025/train_aud...      0      0      1      0  ...        0        0        0       0
collaboration_challenge/birdclef-2025/train_aud...      0      0      0      1  ...        0        0        0       0
collaboration_challenge/birdclef-2025/train_aud...      0      0      0      0  ...        0        0        0       0

[5 rows x 192 columns]

The index is the path to a sample, and one hot encoded for the entire frame. I grabbed one sample per class.

formatted_train_df.shape
(192, 192)

but when you embed, somewhere the 5 second idea re-emerges.

embeddings = m.embed(formatted_train_df, batch_size=128, num_workers=2)
embeddings.head()
                                                                            0         1     ...      1278      1279
file                                               start_time end_time                      ...                    
collaboration_challenge/birdclef-2025/train_aud... 0.0        5.0      -0.154844 -0.088747  ... -0.073930 -0.138794
                                                   5.0        10.0     -0.019289  0.012054  ... -0.077652 -0.124102
                                                   10.0       15.0     -0.143674 -0.121990  ... -0.044354 -0.144992
                                                   15.0       20.0     -0.104029 -0.109827  ...  0.188335 -0.137870
                                                   20.0       25.0     -0.039814 -0.045654  ... -0.051586 -0.147670

embeddings.shape
(1315, 1280)

In general, once an annotation concept is bypassed, you might expect the rest of the functions to operate on those structures, provided it is a well formatted. Instead, in this case, the model reaches back and forces an annotation structure. In this case, because the input labels are per file, shape 192 and the embeddings have been subsetted, shape 1315, you get a shape error when fitting the classifier, understandably.

emb_train, label_train, emb_val, label_val = fit_classifier_on_embeddings(
    embedding_model=m,
    classifier_model=clf,
    train_df=formatted_train_df,
    validation_df=formatted_test_df,
    steps=3,
    embedding_batch_size=128,
    embedding_num_workers=5,
    device=torch.device("cuda" if torch.cuda.is_available() else "cpu"),
)

yields

Target size (torch.Size([192, 192])) must be the same as input size (torch.Size([1315, 192]))

stepping into this, we already knew the answer -> the train features and labels don't have the same shape.

   def forward(self, input: Tensor, target: Tensor) -> Tensor:
        return F.binary_cross_entropy_with_logits(
            input,
            target,
            self.weight,
            pos_weight=self.pos_weight,
            reduction=self.reduction,
        )

Potential workarounds

I haven't done this yet, i'll need to check the docs (feedback welcome).

Create dummy labels at the 5 second intervals to make the shapes match per file.
Look into docs/source of the embed structure to stop the time slicing and operate on the full input (shape errors too? different recording lengths?)

2 replies

sammlapp Apr 17, 2025
Maintainer

in short I think 1 is the correct approach.

The ML models are always looking at "clips" rather than entire audio files, eg the input sample is a spectrogram from 5 seconds of audio, and therefore each 5 second spectrogram has an associated set of labels. If you have an entire 1 minute recording of "oriole", we consider this a "weak label" because you don't actually know if each 5s clip contains oriole song. You might choose to annotate, pseudo-label based on energy or something else, or just assume that every clip from the file has oriole, but I think its a good thing that you have to explicitly make a decision about turning your weak 1-minute file label into strong 5-s clip labels.

although the API still supports the dataframe format in which "file" is the only index, we've moved towards always using "file, start_time, end_time" as the index and might remove support for just "file" as the index since it causes confusions like this one.

bw4sz Apr 17, 2025
Author

definitely worth dropping into docs.

bw4sz · 2025-04-18T22:48:21Z

bw4sz
Apr 18, 2025
Author

I've been trying in vain to silence all these userwarnings. They really fill up the stdout. i've never truthfully understood this area of python and how it relates to user code versus external libraries

warnings.filterwarnings("ignore")

clf = MLPClassifier(
    input_size=1280, output_size=formatted_train_df.shape[1], hidden_layer_sizes=()
)

emb_train, label_train, emb_val, label_val = fit_classifier_on_embeddings(
    embedding_model=m,
    classifier_model=clf,
    train_df=formatted_train_df,
    validation_df=formatted_test_df,
    steps=1000,
    embedding_batch_size=1,
    embedding_num_workers=0
)

preds = clf(emb_val.to(torch.device("cuda"))).detach().numpy()
# evaluate with threshold agnostic metrics: MAP and ROC AUC
print(
    f"average precision score: {average_precision_score(label_val,preds,average=None)}"
)
print(f"area under ROC: {roc_auc_score(label_val,preds,average=None)}")

/usr/local/lib/python3.11/dist-packages/opensoundscape/audio.py:1753: UserWarning: Audio object is shorter than requested duration: 13.296 sec instead of 30 sec
  warnings.warn(error_msg)

https://www.google.com/search?q=warnings.filterwarnings(%22ignore%22)+still+raises+warnings&oq=warnings.filterwarnings(%22ignore%22)+still+raises+warnings&gs_lcrp=EgZjaHJvbWUyBggAEEUYOdIBCDMwNDZqMGo3qAIAsAIA&sourceid=chrome&ie=UTF-8

1 reply

sammlapp Apr 21, 2025
Maintainer

oof me either, but in this particular case Audio.from_file has a flag for "out_of_bounds_mode" (default 'warn') that can be set to 'ignore'. m.preprocessor.pipeline.load_audio.set(out_of_bounds_mode= 'ignore')

bw4sz · 2025-04-21T18:57:09Z

bw4sz
Apr 21, 2025
Author

@sammlapp , I know this is upstream, but I'm really struggling to get opensoundscape installed within kaggle servers, its due to grad-cam

https://www.kaggle.com/code/benweinstein/medell-n-workshop/log?scriptVersionId=235231319

Any suggestiions?

7 replies

sammlapp Apr 21, 2025
Maintainer

I can't visit that link, seems to be private

sammlapp Apr 21, 2025
Maintainer

hm, I haven't used kaggle notebooks but I just made a blank notebook, ran !pip install opensoundscape==0.12.0, and it installed and is working properly. Is your workflow different?

https://www.kaggle.com/code/sammlapp/notebookf77c8a7759

When install completes, I do get some warnings about conflicting packages vs apparently some version requirements of kaggle notebooks, but they don't seem to be causing issues with using opensoundscape in my limited test

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-cloud-translate 3.12.1 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 5.29.4 which is incompatible.
google-api-core 1.34.1 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<4.0.0dev,>=3.19.5, but you have protobuf 5.29.4 which is incompatible.
ydata-profiling 4.16.1 requires matplotlib<=3.10,>=3.5, but you have matplotlib 3.10.1 which is incompatible.
google-spark-connect 0.5.2 requires google-api-core>=2.19.1, but you have google-api-core 1.34.1 which is incompatible.
pandas-gbq 0.26.1 requires google-api-core<3.0.0dev,>=2.10.2, but you have google-api-core 1.34.1 which is incompatible.
bigframes 1.36.0 requires rich<14,>=12.4.4, but you have rich 14.0.0 which is incompatible.
pylibcugraph-cu12 24.12.0 requires pylibraft-cu12==24.12.*, but you have pylibraft-cu12 25.2.0 which is incompatible.
pylibcugraph-cu12 24.12.0 requires rmm-cu12==24.12.*, but you have rmm-cu12 25.2.0 which is incompatible.
google-cloud-bigtable 2.28.1 requires google-api-core[grpc]<3.0.0dev,>=2.16.0, but you have google-api-core 1.34.1 which is incompatible.
mlxtend 0.23.4 requires scikit-learn>=1.3.1, but you have scikit-learn 1.2.2 which is incompatible.
Successfully installed appdirs-1.4.4 aru-metadata-parser-0.1.0 birdsong-recognition-dataset-0.3.2.post1 crowsetta-5.0.1 docopt-0.6.2 evfuncs-0.3.5.post1 grad-cam-1.5.5 lightning-2.5.1 matplotlib-3.10.1 multimethod-1.10 noisereduce-3.0.3 nvidia-cublas-cu12-12.4.5.8 nvidia-cudnn-cu12-9.1.0.70 nvidia-cufft-cu12-11.2.1.3 nvidia-curand-cu12-10.3.5.147 nvidia-cusolver-cu12-11.6.1.9 nvidia-cusparse-cu12-12.3.1.170 nvidia-nvjitlink-cu12-12.4.127 opensoundscape-0.12.0 pandera-0.19.3 protobuf-5.29.4 schema-0.7.7 ttach-0.0.3

bw4sz Apr 21, 2025
Author

I agree, it works when you have access to the internet, I tried that too. When you 'build dependencies' it fails

okay, just checking if you had seen it. Clearly a weird edge case.

sammlapp Apr 21, 2025
Maintainer

ah, I'm not familiar with the "build dependencies" functionality. The grad-cam repo is pretty active so if the issue is with that package specifically they may be able to help out. or could try installing from the github perhaps?

bw4sz Apr 21, 2025
Author

thanks, no worries, not on you.

bw4sz · 2025-04-21T19:08:35Z

bw4sz
Apr 21, 2025
Author


3.4s | 1 | Looking in links: /kaggle/input/pm-81124002-at-04-21-2025-19-01-39
-- | -- | --
3.4s | 2 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/poetry-2.1.2-py3-none-any.whl
3.5s | 3 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/build-1.2.2.post1-py3-none-any.whl (from poetry)
3.5s | 4 | Requirement already satisfied: cachecontrol<0.15.0,>=0.14.0 in /usr/local/lib/python3.11/dist-packages (from cachecontrol[filecache]<0.15.0,>=0.14.0->poetry) (0.14.2)
3.5s | 5 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/cleo-2.1.0-py3-none-any.whl (from poetry)
3.5s | 6 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/dulwich-0.22.8-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from poetry)
3.5s | 7 | Requirement already satisfied: fastjsonschema<3.0.0,>=2.18.0 in /usr/local/lib/python3.11/dist-packages (from poetry) (2.21.1)
3.5s | 8 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/findpython-0.6.3-py3-none-any.whl (from poetry)
3.6s | 9 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/installer-0.7.0-py3-none-any.whl (from poetry)
3.6s | 10 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/keyring-25.6.0-py3-none-any.whl (from poetry)
3.6s | 11 | Requirement already satisfied: packaging>=24.0 in /usr/local/lib/python3.11/dist-packages (from poetry) (24.2)
3.6s | 12 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/pbs_installer-2025.4.9-py3-none-any.whl (from pbs-installer[download,install]<2026.0.0,>=2025.1.6->poetry)
3.6s | 13 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/pkginfo-1.12.1.2-py3-none-any.whl (from poetry)
3.6s | 14 | Requirement already satisfied: platformdirs<5,>=3.0.0 in /usr/local/lib/python3.11/dist-packages (from poetry) (4.3.7)
3.6s | 15 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/poetry_core-2.1.2-py3-none-any.whl (from poetry)
3.7s | 16 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/pyproject_hooks-1.2.0-py3-none-any.whl (from poetry)
3.7s | 17 | Requirement already satisfied: requests<3.0,>=2.26 in /usr/local/lib/python3.11/dist-packages (from poetry) (2.32.3)
3.7s | 18 | Requirement already satisfied: requests-toolbelt<2.0.0,>=1.0.0 in /usr/local/lib/python3.11/dist-packages (from poetry) (1.0.0)
3.7s | 19 | Requirement already satisfied: shellingham<2.0,>=1.5 in /usr/local/lib/python3.11/dist-packages (from poetry) (1.5.4)
3.7s | 20 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/tomlkit-0.13.2-py3-none-any.whl (from poetry)
3.7s | 21 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/trove_classifiers-2025.4.11.15-py3-none-any.whl (from poetry)
3.7s | 22 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/virtualenv-20.30.0-py3-none-any.whl (from poetry)
3.8s | 23 | Requirement already satisfied: msgpack<2.0.0,>=0.5.2 in /usr/local/lib/python3.11/dist-packages (from cachecontrol<0.15.0,>=0.14.0->cachecontrol[filecache]<0.15.0,>=0.14.0->poetry) (1.1.0)
3.8s | 24 | Requirement already satisfied: filelock>=3.8.0 in /usr/local/lib/python3.11/dist-packages (from cachecontrol[filecache]<0.15.0,>=0.14.0->poetry) (3.18.0)
3.8s | 25 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/crashtest-0.4.1-py3-none-any.whl (from cleo<3.0.0,>=2.1.0->poetry)
3.8s | 26 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/rapidfuzz-3.13.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from cleo<3.0.0,>=2.1.0->poetry)
3.9s | 27 | Requirement already satisfied: urllib3>=1.25 in /usr/local/lib/python3.11/dist-packages (from dulwich<0.23.0,>=0.22.6->poetry) (2.3.0)
3.9s | 28 | Requirement already satisfied: SecretStorage>=3.2 in /usr/lib/python3/dist-packages (from keyring<26.0.0,>=25.1.0->poetry) (3.3.1)
3.9s | 29 | Requirement already satisfied: jeepney>=0.4.2 in /usr/lib/python3/dist-packages (from keyring<26.0.0,>=25.1.0->poetry) (0.7.1)
3.9s | 30 | Requirement already satisfied: importlib_metadata>=4.11.4 in /usr/local/lib/python3.11/dist-packages (from keyring<26.0.0,>=25.1.0->poetry) (8.6.1)
3.9s | 31 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/jaraco.classes-3.4.0-py3-none-any.whl (from keyring<26.0.0,>=25.1.0->poetry)
3.9s | 32 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/jaraco.functools-4.1.0-py3-none-any.whl (from keyring<26.0.0,>=25.1.0->poetry)
3.9s | 33 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/jaraco.context-6.0.1-py3-none-any.whl (from keyring<26.0.0,>=25.1.0->poetry)
4.0s | 34 | Requirement already satisfied: httpx<1,>=0.27.0 in /usr/local/lib/python3.11/dist-packages (from pbs-installer[download,install]<2026.0.0,>=2025.1.6->poetry) (0.28.1)
4.0s | 35 | Requirement already satisfied: zstandard>=0.21.0 in /usr/local/lib/python3.11/dist-packages (from pbs-installer[download,install]<2026.0.0,>=2025.1.6->poetry) (0.23.0)
4.0s | 36 | Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.11/dist-packages (from requests<3.0,>=2.26->poetry) (3.4.1)
4.0s | 37 | Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.11/dist-packages (from requests<3.0,>=2.26->poetry) (3.10)
4.0s | 38 | Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.11/dist-packages (from requests<3.0,>=2.26->poetry) (2025.1.31)
4.0s | 39 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/distlib-0.3.9-py2.py3-none-any.whl (from virtualenv<21.0.0,>=20.26.6->poetry)
4.0s | 40 | Requirement already satisfied: anyio in /usr/local/lib/python3.11/dist-packages (from httpx<1,>=0.27.0->pbs-installer[download,install]<2026.0.0,>=2025.1.6->poetry) (3.7.1)
4.0s | 41 | Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.11/dist-packages (from httpx<1,>=0.27.0->pbs-installer[download,install]<2026.0.0,>=2025.1.6->poetry) (1.0.7)
4.0s | 42 | Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.11/dist-packages (from httpcore==1.*->httpx<1,>=0.27.0->pbs-installer[download,install]<2026.0.0,>=2025.1.6->poetry) (0.14.0)
4.0s | 43 | Requirement already satisfied: zipp>=3.20 in /usr/local/lib/python3.11/dist-packages (from importlib_metadata>=4.11.4->keyring<26.0.0,>=25.1.0->poetry) (3.21.0)
4.0s | 44 | Requirement already satisfied: more-itertools in /usr/local/lib/python3.11/dist-packages (from jaraco.classes->keyring<26.0.0,>=25.1.0->poetry) (10.6.0)
4.0s | 45 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/backports.tarfile-1.2.0-py3-none-any.whl (from jaraco.context->keyring<26.0.0,>=25.1.0->poetry)
4.1s | 46 | Requirement already satisfied: sniffio>=1.1 in /usr/local/lib/python3.11/dist-packages (from anyio->httpx<1,>=0.27.0->pbs-installer[download,install]<2026.0.0,>=2025.1.6->poetry) (1.3.1)
5.6s | 47 | Installing collected packages: trove-classifiers, distlib, virtualenv, tomlkit, rapidfuzz, pyproject-hooks, poetry-core, pkginfo, pbs-installer, jaraco.functools, jaraco.classes, installer, findpython, dulwich, crashtest, backports.tarfile, jaraco.context, cleo, build, keyring, poetry
6.8s | 48 | Attempting uninstall: keyring
7.1s | 49 | Found existing installation: keyring 23.5.0
7.1s | 50 | Uninstalling keyring-23.5.0:
7.2s | 51 | Successfully uninstalled keyring-23.5.0
7.5s | 52 | Successfully installed backports.tarfile-1.2.0 build-1.2.2.post1 cleo-2.1.0 crashtest-0.4.1 distlib-0.3.9 dulwich-0.22.8 findpython-0.6.3 installer-0.7.0 jaraco.classes-3.4.0 jaraco.context-6.0.1 jaraco.functools-4.1.0 keyring-25.6.0 pbs-installer-2025.4.9 pkginfo-1.12.1.2 poetry-2.1.2 poetry-core-2.1.2 pyproject-hooks-1.2.0 rapidfuzz-3.13.0 tomlkit-0.13.2 trove-classifiers-2025.4.11.15 virtualenv-20.30.0
8.8s | 53 | Looking in links: /kaggle/input/pm-81124002-at-04-21-2025-19-01-39
8.8s | 54 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/grad-cam-1.5.5.tar.gz
9.1s | 55 | Installing build dependencies: started
10.1s | 56 | error: subprocess-exited-with-error
10.1s | 57 |  
10.1s | 58 | × pip subprocess to install build dependencies did not run successfully.
10.1s | 59 | │ exit code: 1
10.1s | 60 | ╰─> See above for output.
10.1s | 61 |  
10.1s | 62 | Installing build dependencies: finished with status 'error'
10.1s | 63 | note: This error originates from a subprocess, and is likely not a problem with pip.
10.1s | 64 | error: subprocess-exited-with-error
10.1s | 65 |  
10.1s | 66 | × pip subprocess to install build dependencies did not run successfully.
10.1s | 67 | │ exit code: 1
10.1s | 68 | ╰─> See above for output.
10.1s | 69 |  
10.1s | 70 | note: This error originates from a subprocess, and is likely not a problem with pip.
11.4s | 71 | Looking in links: /kaggle/input/pm-81124002-at-04-21-2025-19-01-39
11.5s | 72 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/opensoundscape-0.12.0-py3-none-any.whl
11.5s | 73 | Requirement already satisfied: Deprecated<2.0.0,>=1.2.13 in /usr/local/lib/python3.11/dist-packages (from opensoundscape) (1.2.18)
11.5s | 74 | Requirement already satisfied: Jinja2>=3.1.3 in /usr/local/lib/python3.11/dist-packages (from opensoundscape) (3.1.6)
11.5s | 75 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/aru_metadata_parser-0.1.0-py3-none-any.whl (from opensoundscape)
11.5s | 76 | Requirement already satisfied: certifi>=2024.7.4 in /usr/local/lib/python3.11/dist-packages (from opensoundscape) (2025.1.31)
11.6s | 77 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/crowsetta-5.0.1-py3-none-any.whl (from opensoundscape)
11.6s | 78 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/docopt-0.6.2.tar.gz (from opensoundscape)
11.6s | 79 | Preparing metadata (setup.py): started
12.6s | 80 | Preparing metadata (setup.py): finished with status 'done'
12.6s | 81 | Requirement already satisfied: gitpython>=3.1.41 in /usr/local/lib/python3.11/dist-packages (from opensoundscape) (3.1.44)
12.6s | 82 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/grad-cam-1.5.5.tar.gz (from opensoundscape)
12.7s | 83 | Installing build dependencies: started
13.8s | 84 | error: subprocess-exited-with-error
13.8s | 85 |  
13.8s | 86 | × pip subprocess to install build dependencies did not run successfully.
13.8s | 87 | │ exit code: 1
13.8s | 88 | ╰─> See above for output.
13.8s | 89 |  
13.8s | 90 | note: This error originates from a subprocess, and is likely not a problem with pip.
13.8s | 91 | Installing build dependencies: finished with status 'error'
13.8s | 92 | error: subprocess-exited-with-error
13.8s | 93 |  
13.8s | 94 | × pip subprocess to install build dependencies did not run successfully.
13.8s | 95 | │ exit code: 1
13.8s | 96 | ╰─> See above for output.
13.8s | 97 |  
13.8s | 98 | note: This error originates from a subprocess, and is likely not a problem with pip.
15.1s | 99 | Looking in links: /kaggle/input/pm-81124002-at-04-21-2025-19-01-39
15.1s | 100 | Processing /kaggle/input/pm-81124002-at-04-21-2025-19-01-39/bioacoustics-model-zoo-0.11.0.zip
15.2s | 101 | Preparing metadata (pyproject.toml): started
15.4s | 102 | Preparing metadata (pyproject.toml): finished with status 'done'
15.4s | 103 | Building wheels for collected packages: bioacoustics-model-zoo
15.4s | 104 | Building wheel for bioacoustics-model-zoo (pyproject.toml): started
16.0s | 105 | Building wheel for bioacoustics-model-zoo (pyproject.toml): finished with status 'done'
16.0s | 106 | Created wheel for bioacoustics-model-zoo: filename=bioacoustics_model_zoo-0.11.0-py3-none-any.whl size=265525 sha256=1c1d4053a19143774f2de3f316555439744b190392036e58231db62904b66e68
16.0s | 107 | Stored in directory: /root/.cache/pip/wheels/b2/14/eb/e167cf36778ecd6de6bf9582c08b183c885cf567c50491a9e0
16.0s | 108 | Successfully built bioacoustics-model-zoo
17.5s | 109 | Installing collected packages: bioacoustics-model-zoo
17.6s | 110 | Successfully installed bioacoustics-model-zoo-0.11.0
21.7s | 111 | 0.00s - Debugger warning: It seems that frozen modules are being used, which may
21.7s | 112 | 0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
21.7s | 113 | 0.00s - to python to disable frozen modules.
21.7s | 114 | 0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
22.4s | 115 | 0.00s - Debugger warning: It seems that frozen modules are being used, which may
22.4s | 116 | 0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
22.4s | 117 | 0.00s - to python to disable frozen modules.
22.4s | 118 | 0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
32.8s | 119 | Traceback (most recent call last):
32.8s | 120 | File "<string>", line 1, in <module>
32.8s | 121 | File "/usr/local/lib/python3.11/dist-packages/papermill/execute.py", line 131, in execute_notebook
32.8s | 122 | raise_for_execution_errors(nb, output_path)
32.8s | 123 | File "/usr/local/lib/python3.11/dist-packages/papermill/execute.py", line 251, in raise_for_execution_errors
32.8s | 124 | raise error
32.8s | 125 | papermill.exceptions.PapermillExecutionError:
32.8s | 126 | ---------------------------------------------------------------------------
32.8s | 127 | Exception encountered at "In [1]":
32.8s | 128 | ---------------------------------------------------------------------------
32.8s | 129 | ModuleNotFoundError                       Traceback (most recent call last)
32.8s | 130 | /tmp/ipykernel_58/2976048245.py in <cell line: 0>()
32.8s | 131 | 8 import torch
32.8s | 132 | 9 from sklearn.metrics import average_precision_score, roc_auc_score
32.8s | 133 | ---> 10 from opensoundscape import Audio, Spectrogram
32.8s | 134 | 11 from opensoundscape.metrics import predict_multi_target_labels
32.8s | 135 | 12 from opensoundscape.ml.shallow_classifier import MLPClassifier, fit_classifier_on_embeddings
32.8s | 136 |  
32.8s | 137 | ModuleNotFoundError: No module named 'opensoundscape'
32.8s | 138 |  
34.7s | 139 | /usr/local/lib/python3.11/dist-packages/traitlets/traitlets.py:2915: FutureWarning: --Exporter.preprocessors=["remove_papermill_header.RemovePapermillHeader"] for containers is deprecated in traitlets 5.0. You can pass `--Exporter.preprocessors item` ... multiple times to add items to a list.
34.7s | 140 | warn(
34.8s | 141 | [NbConvertApp] Converting notebook __notebook__.ipynb to notebook
35.0s | 142 | [NbConvertApp] Writing 15985 bytes to __notebook__.ipynb
36.4s | 143 | /usr/local/lib/python3.11/dist-packages/traitlets/traitlets.py:2915: FutureWarning: --Exporter.preprocessors=["nbconvert.preprocessors.ExtractOutputPreprocessor"] for containers is deprecated in traitlets 5.0. You can pass `--Exporter.preprocessors item` ... multiple times to add items to a list.
36.4s | 144 | warn(
36.4s | 145 | [NbConvertApp] Converting notebook __notebook__.ipynb to html
37.1s | 146 | [NbConvertApp] Writing 313920 bytes to __results__.html

0 replies

User Story -> Kaggle Competition #1116

Uh oh!

Uh oh!

bw4sz Apr 14, 2025

User

Experience Level

User goal

Getting Started

Sample data

Installation

Load the audio

Predicting with a prebuilt model

Analyzing results

Fine-tuning

Replies: 7 comments · 12 replies

Uh oh!

sammlapp Apr 15, 2025 Maintainer

Uh oh!

bw4sz Apr 15, 2025 Author

Uh oh!

sammlapp Apr 15, 2025 Maintainer

Uh oh!

bw4sz Apr 15, 2025 Author

Uh oh!

bw4sz Apr 15, 2025 Author

Uh oh!

bw4sz Apr 16, 2025 Author

Potential workarounds

Uh oh!

sammlapp Apr 17, 2025 Maintainer

Uh oh!

bw4sz Apr 17, 2025 Author

Uh oh!

Uh oh!

bw4sz Apr 18, 2025 Author

Uh oh!

sammlapp Apr 21, 2025 Maintainer

Uh oh!

Uh oh!

bw4sz Apr 21, 2025 Author

Uh oh!

sammlapp Apr 21, 2025 Maintainer

Uh oh!

Uh oh!

sammlapp Apr 21, 2025 Maintainer

Uh oh!

bw4sz Apr 21, 2025 Author

Uh oh!

sammlapp Apr 21, 2025 Maintainer

Uh oh!

bw4sz Apr 21, 2025 Author

Uh oh!

bw4sz Apr 21, 2025 Author

bw4sz
Apr 14, 2025

Replies: 7 comments 12 replies

sammlapp
Apr 15, 2025
Maintainer

bw4sz
Apr 15, 2025
Author

sammlapp Apr 15, 2025
Maintainer

bw4sz
Apr 15, 2025
Author

bw4sz Apr 15, 2025
Author

bw4sz
Apr 16, 2025
Author

sammlapp Apr 17, 2025
Maintainer

bw4sz Apr 17, 2025
Author

bw4sz
Apr 18, 2025
Author

sammlapp Apr 21, 2025
Maintainer

bw4sz
Apr 21, 2025
Author

sammlapp Apr 21, 2025
Maintainer

sammlapp Apr 21, 2025
Maintainer

bw4sz Apr 21, 2025
Author

sammlapp Apr 21, 2025
Maintainer

bw4sz Apr 21, 2025
Author

bw4sz
Apr 21, 2025
Author