Setup

For this example, we have a 3 x 4 x 3 m room with a reverberation time (RT60) of 0.3 s. The critical distances of the speakers overlap here, which makes for a more challenging setup than the previous examples.
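
To give a rough sense of scale for the critical distance in this room, the following minimal sketch applies the common Sabine-based rule of thumb d_c ≈ 0.057 · sqrt(γ · V / RT60). This is an assumption on our part: the page does not state how the critical distances were obtained, and the rule presumes a diffuse sound field and an omnidirectional source (directivity factor γ = 1). The function name and parameters are purely illustrative.

```python
import math


def critical_distance(volume_m3, rt60_s, directivity=1.0):
    """Approximate critical distance in metres.

    Rule of thumb derived from Sabine's formula, assuming a diffuse field.
    directivity=1.0 corresponds to an omnidirectional source (assumption).
    """
    return 0.057 * math.sqrt(directivity * volume_m3 / rt60_s)


# Room from the setup above: 3 x 4 x 3 m, RT60 = 0.3 s
room_volume = 3 * 4 * 3  # cubic metres
d_c = critical_distance(room_volume, 0.3)
print(f"approximate critical distance: {d_c:.2f} m")
```

Under these assumptions the critical distance comes out at roughly 0.6 m per source, so two speakers placed less than about 1.2 m apart would have overlapping critical-distance regions.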

Mod-MFCC Based Clusters

cluster	reference microphone	masked reference signal	DSB signal
1
2

Speaker Embedding Clusters

cluster	reference microphone	masked reference signal	DSB signal
1
2

Discussion

Even in this challenging case, the speaker embeddings prove to be good clustering features and deliver plausible clusters. For the Mod-MFCC based features, in contrast, there seems to be no logic in the clustering. In this example, it seems to us that the Mod-MFCC features cluster according to the SINR rather than according to speaker-specific characteristics. Note also that the speaker embedding clusters avoid microphones located in the region where the speakers' critical distances overlap.

ASPIRE

Setup

Mod-MFCC Based Clusters

Speaker Embedding Clusters

Discussion