This example repeats the third example with more reverberant walls: the room measures 3 × 4 × 3 m and the reverberation time is 0.5 s.
Mod-MFCC Based Clusters
[Figure: Cluster 1, Cluster 2, background cluster]
[Audio samples for clusters 1 and 2: reference microphone, masked reference signal, DSB signal]
Speaker Embedding Clusters
[Figure: Cluster 1, Cluster 2, background cluster]
[Audio samples for clusters 1 and 2: reference microphone, masked reference signal, DSB signal]
Discussion
In this more reverberant condition (compared to the third example), the Mod-MFCC based features produce a less-than-ideal cluster for source 1
and miss the best-SINR microphone for source 2. The speaker embedding based features consistently deliver good (logical) clusters