Dolphin “Click Train” Synthesis
Background
In 2020, I joined the Cetacean Hearing and Telemetry (CHAT) project, led by Dr. Thad Starner at Georgia Tech's Contextual Computing Group. The CHAT project is a collaboration with marine biologists from The Wild Dolphin Project to facilitate two-way interaction between humans and wild dolphins. You can learn more about it here.
An integral part of the CHAT project is training the wild dolphins to associate our synthesized signals with their respective meanings. It follows that our signals must already be a part of their language. Synthesizing dolphin whistles is a straightforward task - just an application of the chirp, a widely studied signal. However, “click trains” are a different story.
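To illustrate why whistles are the easy case, here is a minimal sketch of a linear chirp in plain Python. It is not the project's MATLAB code. The function name, the 8 to 16 kHz sweep, and the 50 ms duration are all made up for illustration. The 192 kHz sample rate is chosen to match the recordings.

```python
import math

def linear_chirp(f_start_hz, f_end_hz, duration_s, sample_rate_hz=192_000):
    """Sine sweep whose frequency moves linearly from f_start_hz to f_end_hz."""
    n_samples = int(round(duration_s * sample_rate_hz))
    sweep_rate = (f_end_hz - f_start_hz) / duration_s  # Hz per second
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        # Phase is 2*pi times the integral of f(t) = f_start + sweep_rate * t.
        phase = 2.0 * math.pi * (f_start_hz * t + 0.5 * sweep_rate * t * t)
        samples.append(math.sin(phase))
    return samples

# Hypothetical whistle: an upsweep from 8 kHz to 16 kHz lasting 50 ms.
whistle = linear_chirp(8_000, 16_000, duration_s=0.05)
```

A real whistle contour is rarely a straight line, but any contour can be built the same way by integrating the desired frequency curve to get the phase.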
Compared with whistles and echolocation clicks, click trains are the most phenomenal signals dolphins can produce. A click train is essentially a sequence of tightly packed clicks that have been filtered over time. The result is a signal that looks like the negative of a whistle on a spectrogram: energy in every frequency band except where the whistle would be.
Below are some examples shown on spectrograms made in Audacity. It’s worth pointing out that the sample rate is deliberately 192 kHz, so that the Nyquist frequency is 96 kHz. Dolphin communication is present at frequencies as high as 80 kHz.
As can be seen, synthesizing these signals is not a trivial task, but I wanted to be the first on the team to try.
Observations & Assumptions
Each vertical pink line is a click, which can be interpreted as a short burst of noise. An ideal white noise signal has non-zero amplitude at every frequency in the spectrum. As shown above, this is not exactly what is seen in nature, but it is definitely true that each click is “noisy”.
Clicks are more spaced-out at the beginning of the train before filtering occurs. This is somewhat of a warm-up phase, where the time between clicks becomes shorter and shorter until reaching a dense, stable state.
The contour of the low-amplitude frequencies resembles the output of a bandstop filter whose cutoff frequency is modulated over time. We like to call these regions “negative bands”.
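The warm-up observation above can be made concrete with a small sketch. It assumes the gap between clicks shrinks geometrically until it reaches a stable value. That decay law, the function name, and every number here are assumptions chosen for illustration, not measurements from real click trains.

```python
def warmup_gaps(n_clicks, start_gap_s, stable_gap_s, shrink=0.8):
    """Gaps between successive clicks: start wide, shrink, then hold steady."""
    gaps = []
    gap = start_gap_s
    for _ in range(n_clicks - 1):
        gaps.append(gap)
        # Shrink the gap each step, but never below the stable spacing.
        gap = max(stable_gap_s, gap * shrink)
    return gaps

# Hypothetical train of 20 clicks: 20 ms spacing at first, settling at 2 ms.
gaps = warmup_gaps(n_clicks=20, start_gap_s=0.020, stable_gap_s=0.002)

# Onset time of each click, accumulated from the gaps.
click_times = [0.0]
for g in gaps:
    click_times.append(click_times[-1] + g)
```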
Methodology
At the time I was working on this, I had just finished my first course in computer science, and my only programming experience was in MATLAB. I had also not taken an audio or signal processing course yet, so I had to teach myself.
I began by writing a “click generator”, which used MATLAB’s wgn() function to generate a short white Gaussian noise sample. The click generator took parameters for an amplitude envelope, dictating the duration of the attack and decay stages. This attack-decay envelope vector, ranging from 0 to 1, was then multiplied element-wise with the noise sample vector to give a single synthesized click. Later in the project, I also tried sampling single clicks from real click train audio produced by dolphins. Both methods gave a similar result.
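A click generator along these lines might look like the sketch below. It is written in plain Python rather than MATLAB, with random.gauss standing in for wgn(). The linear envelope shape, the unit noise variance, the function name, and the example durations are assumptions, since the description above only fixes the attack and decay lengths and the 0 to 1 range.

```python
import random

def make_click(attack_s, decay_s, sample_rate_hz=192_000, seed=None):
    """White Gaussian noise burst shaped by a linear attack-decay envelope."""
    rng = random.Random(seed)
    n_attack = max(1, int(round(attack_s * sample_rate_hz)))
    n_decay = max(1, int(round(decay_s * sample_rate_hz)))
    # Envelope rises linearly from 0 toward 1, then falls from 1 toward 0.
    envelope = [i / n_attack for i in range(n_attack)]
    envelope += [1.0 - i / n_decay for i in range(n_decay)]
    # One unit-variance Gaussian noise sample per envelope sample.
    noise = [rng.gauss(0.0, 1.0) for _ in envelope]
    # Element-wise product of envelope and noise gives the click.
    return [e * x for e, x in zip(envelope, noise)]

# Hypothetical click: 0.1 ms attack, 0.4 ms decay.
click = make_click(attack_s=0.0001, decay_s=0.0004, seed=0)
```

Passing a seed makes the noise repeatable, which is handy when comparing spectrograms between runs.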
The inputs for the click train synthesis function were:
time length
contour shape (string)
periodicity
bandstop width
frequency range
There were also optional arguments for the warm-up phase I mentioned in my observations. The algorithm for actual synthesis is simple:
Generate a click, either using my generator or sampling.
Calculate N, the number of clicks needed to fill the input time length.
Create a vector V of N cutoff-frequency values, the output of our modulation function given the input shape, periodicity, and frequency range.
Initialize an empty output vector. For each value in V, a click is bandstop-filtered at the corresponding cutoff frequency and then concatenated to the output vector. A vector of zeros is also appended in every iteration to leave silence between clicks.
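The four steps above can be sketched end to end in plain Python. This is a reconstruction under assumptions, not the original MATLAB script. The bandstop stage is swapped for a second-order notch biquad, using the widely published RBJ audio EQ cookbook coefficients. The shape names "sine" and "sawtooth", the reading of periodicity as a number of clicks per cycle, the fixed gap, and all parameter values are hypothetical. The warm-up phase is left out.

```python
import math
import random

FS = 192_000  # sample rate in Hz, matching the recordings

def make_click(n_samples, rng):
    """Noise burst under a linear attack-decay envelope (attack = first quarter)."""
    n_attack = max(1, n_samples // 4)
    n_decay = n_samples - n_attack
    env = [i / n_attack for i in range(n_attack)]
    env += [1.0 - i / n_decay for i in range(n_decay)]
    return [e * rng.gauss(0.0, 1.0) for e in env]

def contour(n_clicks, shape, period_clicks, f_lo, f_hi):
    """Vector V: one notch centre frequency per click, within [f_lo, f_hi]."""
    values = []
    for k in range(n_clicks):
        pos = (k % period_clicks) / period_clicks  # position within one cycle
        if shape == "sine":
            frac = 0.5 - 0.5 * math.cos(2.0 * math.pi * pos)
        elif shape == "sawtooth":
            frac = pos
        else:
            raise ValueError(f"unknown contour shape: {shape!r}")
        values.append(f_lo + frac * (f_hi - f_lo))
    return values

def notch(samples, f0, width_hz):
    """Second-order notch filter centred on f0 (assumed stand-in for the bandstop)."""
    w0 = 2.0 * math.pi * f0 / FS
    alpha = math.sin(w0) / (2.0 * (f0 / width_hz))  # Q = f0 / width
    a0 = 1.0 + alpha
    b0, b1, b2 = 1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x1, x2, y1, y2 = x, x1, y, y1
    return out

def synthesize_click_train(duration_s, shape, period_clicks, width_hz,
                           f_lo, f_hi, click_len=96, gap_len=288, seed=0):
    click = make_click(click_len, random.Random(seed))        # step 1
    slot = click_len + gap_len                                # click plus silence
    n_clicks = int(round(duration_s * FS)) // slot            # step 2: N
    centres = contour(n_clicks, shape, period_clicks, f_lo, f_hi)  # step 3: V
    output = []                                               # step 4
    for f0 in centres:
        output.extend(notch(click, f0, width_hz))  # filtered click
        output.extend([0.0] * gap_len)             # silence between clicks
    return output

# Hypothetical 100 ms train with a sinusoidal negative band from 20 to 60 kHz.
train = synthesize_click_train(0.1, "sine", period_clicks=25, width_hz=8_000,
                               f_lo=20_000, f_hi=60_000)
```

Because each click is only a few dozen samples long, a very narrow notch has little time to settle, so the negative band is clearer with a wider bandstop width.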
Results (?) & Comments
Here are some spectrograms of the signals generated by this script.
At the moment, the effectiveness of these signals is still unknown. Our hydrophones are physically unable to produce them at an audible level in the ocean. This is partly due to limited hardware, and even more so due to the signals spanning the entire spectrum. Because of this, the project has been delayed until the signals can be properly produced and tested.
However, I would be surprised if these signals succeed in the future. Just by visual comparison of real click trains and my synthesized ones, I believe that a different algorithm is needed for a more accurate representation. In their current state, the results look unrealistic and lack many properties of a real click train. For example, the clicks in my signals span the entire spectrum, while real signals vary widely over time. There is so much we don’t know about click trains in general: How do dolphins produce them? Why do they produce them? Do they prefer this over whistles? What are the important properties of a click train?
As for a better algorithm, I can definitely see machine learning playing a role. After all, click trains are complicated signals - perhaps a neural network could uncover the key features needed to synthesize a realistic one.