Sound Visualization Algorithm in MATLAB

by analuciarg24

portada1.png

This algorithm provides a real-time visualization of an audio file. The visualization is dynamic: the sound is represented as concentric layers whose colors correspond to the detected note and whose shape changes with the song's intensity at each moment. Once an audio file is loaded and play is pressed, both the waveform and the sound visualization are displayed while the song plays.


There are two attached videos showing how the algorithm works.

Supplies

  1. MATLAB software
  2. MATLAB Audio Toolbox
  3. Audio files

Initialization

inicializar colores y notas.png

This step initializes the coloresArcoiris variable with the colors of the rainbow and the notas variable with the musical notes. These are used later to represent each detected note with its own rainbow color.
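As a rough illustration, the initialization might look like this in App Designer, with both variables stored as app properties (the RGB values and note names shown here are assumptions, not taken from the original code):

% Rainbow palette, one RGB row per note (values are illustrative)
app.coloresArcoiris = [1 0 0;      % red    - Do
                       1 0.5 0;    % orange - Re
                       1 1 0;      % yellow - Mi
                       0 1 0;      % green  - Fa
                       0 0.5 1;    % blue   - Sol
                       0.3 0 0.5;  % indigo - La
                       0.6 0 0.8]; % violet - Si

% Musical notes, aligned with the palette rows
app.notas = {'Do','Re','Mi','Fa','Sol','La','Si'};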

Waveform Plotting

forma de onda.png

The method starts by obtaining a block of audio samples. The currentSample variable keeps track of the current position in the audio data. A specific block of samples, defined by nSamples, is extracted. This block represents a small portion of the audio data at a given moment.

This block is then plotted as a waveform on the UI axes (UIAxes1), showing the variation in amplitude over time. The x-axis represents time (in seconds), and the y-axis represents amplitude. The waveform is updated continuously during the playback of the audio.
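A sketch of this step is shown below, assuming the audio samples and sample rate were loaded with audioread into app.audioData and app.fs, and that nSamples is fixed at 2048 (the actual block size in the app may differ):

% Extract the next block of samples from the loaded audio
nSamples = 2048;
idxEnd   = min(app.currentSample + nSamples - 1, length(app.audioData));
bloque   = app.audioData(app.currentSample:idxEnd, 1);

% Time axis in seconds for this block
t = (app.currentSample:idxEnd) / app.fs;

% Plot the waveform on the first UI axes
plot(app.UIAxes1, t, bloque);
xlabel(app.UIAxes1, 'Time (s)');
ylabel(app.UIAxes1, 'Amplitude');

% Advance the read position for the next update
app.currentSample = app.currentSample + nSamples;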


Peak Detection

detectar picos.png

The findpeaks function is used to identify significant peaks in the waveform data. Peaks are points in the audio waveform where the amplitude is higher than in the surrounding samples; they are used to assess the overall intensity of the sound. The maximum peak value (maxPeak) is then used to scale the intensity of the next visual layer.
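In MATLAB this could look roughly as follows, continuing from the bloque variable above; findpeaks ships with the Signal Processing Toolbox, and the MinPeakHeight threshold here is an assumed value:

% Find local maxima in the current block
picos = findpeaks(abs(bloque), 'MinPeakHeight', 0.01);

if isempty(picos)
    maxPeak = max(abs(bloque));   % quiet passage: fall back to the block maximum
else
    maxPeak = max(picos);
end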

Frequency Spectrum Calculation

spectrum.png

The Fast Fourier Transform (FFT) is applied to the current audio block to analyze the frequency content of the sound. The FFT converts the time-domain signal (waveform) into the frequency domain, allowing the system to identify the dominant frequencies present in the audio block. The magnitude of each frequency is stored in spectrum.

Only the first half of the FFT result is used since it contains the positive frequencies. The spectrum is then normalized, and linear interpolation is applied to match the number of points required for visualization (nPoints).
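A possible implementation of this step, again working on the current bloque (the value of nPoints is an assumption):

% Frequency content of the current block
Y    = fft(bloque);
half = abs(Y(1:floor(numel(Y)/2)));   % keep positive frequencies only

% Normalize so the layer sizes stay in a fixed range
half = half / (max(half) + eps);

% Resample the spectrum to the number of points used for drawing
nPoints  = 100;
spectrum = interp1(linspace(0, 1, numel(half)), half, ...
                   linspace(0, 1, nPoints));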

Main Note Detection

detectar nota.png

From the calculated frequency spectrum, the code identifies the frequency with the highest magnitude (idxMax). This corresponds to the dominant frequency in the audio block. Using the position of this peak, the algorithm calculates the corresponding musical note.

The frequency is mapped to one of the notes defined in the notas array. A simple mapping is performed by rounding the frequency and using modulo arithmetic to select a note from the array.
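Sketched in code, the mapping could look like this (the frequency conversion assumes the resampled spectrum spans 0 to fs/2; the modulo trick matches the simplification described above, not true pitch detection):

% Index of the strongest frequency bin
[~, idxMax] = max(spectrum);

% Bin index -> frequency in Hz
freqMax = (idxMax - 1) / (nPoints - 1) * app.fs / 2;

% Round the frequency and wrap it onto the seven notes
idxNota    = mod(round(freqMax), numel(app.notas)) + 1;
notaActual = app.notas{idxNota};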

Dynamic Layer Visualization

color y ruido.png
layers.png

After detecting the main note, the method proceeds to visualize the sound. The visualization is dynamic and consists of concentric layers (represented as ellipses) that grow and change based on the sound's intensity and the detected note.

The color of each layer is based on the detected note. The coloresArcoiris array provides a range of colors corresponding to the different notes. Each layer is scaled according to the sound's intensity, and a small amount of noise is added to give the visualization some randomness.

For each layer, x and y coordinates are calculated based on the radius and the angle (theta). The layers are drawn using the fill function on the UI axes (UIAxes2 and UIAxes3), and the resulting effect is a circular, dynamic visualization that responds to the sound's characteristics.

Multiple layers are drawn with increasing scale, creating a visual effect that expands outward. The color, opacity, and position of the layers are adjusted dynamically as the audio plays, creating a rich and responsive visualization.
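Putting the pieces together, the layer drawing might look like the following sketch on UIAxes2 (the same drawing could be repeated on UIAxes3); the number of layers, the scaling factors, and the opacity ramp are all assumptions:

% Color for the detected note
colorCapa = app.coloresArcoiris(idxNota, :);

theta  = linspace(0, 2*pi, nPoints);
nCapas = 5;                            % number of layers (assumed)

cla(app.UIAxes2);
hold(app.UIAxes2, 'on');
for k = nCapas:-1:1                    % draw the largest layer first
    % Radius grows with the layer index and the sound intensity;
    % the spectrum and a little noise deform the outline
    radio = k * (0.3 + maxPeak) + 0.3 * spectrum + 0.05 * randn(1, nPoints);
    x = radio .* cos(theta);
    y = radio .* sin(theta);
    % Outer layers are drawn more transparent
    fill(app.UIAxes2, x, y, colorCapa, ...
         'FaceAlpha', 0.8 - 0.6 * (k - 1) / (nCapas - 1), ...
         'EdgeColor', 'none');
end
hold(app.UIAxes2, 'off');
axis(app.UIAxes2, 'equal');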

Full Code

In summary, the method combines real-time audio analysis (peak detection, FFT) with dynamic visualizations to create an engaging display that reflects the properties of the sound.

The application has two tabs: one shows the waveform together with the layered sound display, and the other shows only the sound display so it can be appreciated on its own.


The steps described above cover the visualization method. The attached PDF file contains the complete code for the app.