
Srđan Lazendić

PhD student and assistant





  Three master's thesis topics are offered. The first is also offered in Dutch and is listed again as item 4:

    1. Quaternion sparse representation model for color image processing
    2. Convolutional Neural Networks (CNNs) analysed via sparse coding
    3. Sparse subspace clustering for large scale hyperspectral data
    4. Quaternion sparse representation model for color image processing (Quaternionisch schaars representatiemodel voor verwerking van kleurenafbeeldingen)



1. Quaternion sparse representation model for color image processing

Academic year: 2019/2020

Mentor: Srđan Lazendić (S8, office 130.071)
Promotors: Aleksandra Pižurica (Technicum TELIN-IPI), Hendrik De Bie (S8, office 130.062)


Description of the problem:

Many problems, such as image and video restoration, compression and coding, digital image inpainting and content analysis, benefit from the sparse representation model (see [1]). Because these techniques are powerful and widely applicable, sparse representations of signals (including images and higher-dimensional data) attract the interest of researchers from different fields. The goal of such a sparse representation is to approximate a signal well using only a few elements from a (typically redundant) dictionary (see Figure 1). One of the best-known and most widely used approaches to dictionary learning is the K-SVD method. K-SVD is an extension of the K-means clustering method that learns the dictionary efficiently using the singular value decomposition (SVD). Common dictionary learning techniques, including recent K-SVD variants, treat signals in a uniform way, irrespective of their dimensionality and of the nature of the different channels in multicomponent data (such as color, multispectral or hyperspectral images). All the data within a 2-D window (for a greyscale image) or a 3-D window (for a multicomponent image) are simply stacked in a given scanning order and treated as a single 1-D vector.

Fig 1. - For input data Y, a dictionary learning method aims to find a dictionary D and a representation matrix X whose columns Xi are sufficiently sparse (Data 61).
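
To make the model Y ≈ DX concrete, here is a minimal sketch of the sparse coding step for a single signal, using orthogonal matching pursuit (OMP). It assumes a fixed random dictionary and is not K-SVD itself: K-SVD would alternate a step like this with dictionary updates. The function name omp, the dictionary size and the sparsity level k are illustrative assumptions.

```python
import numpy as np

def omp(D, y, k):
    """Greedy sparse coding: approximate y with at most k atoms (columns) of D."""
    residual = y.copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # Re-fit the coefficients of all selected atoms by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x[support] = coef
    return x

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))   # redundant dictionary: 50 atoms in R^20
D /= np.linalg.norm(D, axis=0)      # normalise atoms to unit length
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.5, -2.0, 0.7]
y = D @ x_true                      # signal built from three atoms
x_hat = omp(D, y, k=3)              # sparse code with at most three non-zeros
```

Stacking such codes column by column for many signals gives the representation matrix X of Figure 1.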


A very recent trend in signal processing and machine learning attempts to build an improved sparse representation model of color images by introducing quaternions into dictionary construction (see [2,3]). Quaternions are a four-dimensional generalization of complex numbers (with three imaginary units instead of one). Because they describe rotations in 3D efficiently, quaternions have many applications in theoretical and applied mathematics, in engineering fields such as computer graphics and computer vision, and in applications including biomedical processing, remote sensing, hyperspectral image processing and many others. The quaternionic representation with three imaginary units is also ideally suited to representing three color channels, and quaternions have therefore already been used extensively in color image processing. A very recent method called K-QSVD, which generalizes the K-SVD algorithm to the quaternionic framework, has already shown remarkable results (see Figure 2). Beyond these first encouraging results, the potential of quaternions for improving sparse representations of multicomponent images is yet to be explored. The motivation is that the coefficient matrix preserves not only the correlation among the channels but also the orthogonal property. According to recent studies, this proves to be important in terms of computational complexity and also in terms of color fidelity in the reconstruction. However, many aspects of this approach remain open, both theoretically and in terms of practical design. In this master thesis, the student will be guided by supervisors from the Image Processing and Interpretation research group and the Clifford research group.

Fig 2. - An example showing the application of K-QSVD in image inpainting. Left: damaged image (70 % missing);
Right: an image reconstructed using K-QSVD.
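
As a small illustration of why quaternions suit color data, the sketch below stores an RGB pixel as a pure quaternion r i + g j + b k (a common convention, assumed here) and rotates it with a unit quaternion. The helper names qmul and qconj and the pixel values are illustrative assumptions and are not taken from K-QSVD.

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    ])

def qconj(q):
    """Quaternion conjugate: negate the three imaginary parts."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])

# One RGB pixel as a pure quaternion: zero real part, (r, g, b) imaginary parts.
pixel = np.array([0.0, 0.8, 0.4, 0.1])

# Unit quaternion for a rotation by 120 degrees about the grey axis (1, 1, 1).
theta = 2.0 * np.pi / 3.0
axis = np.ones(3) / np.sqrt(3.0)
u = np.concatenate(([np.cos(theta / 2.0)], np.sin(theta / 2.0) * axis))

# Rotating the colour vector is a single sandwich product u * pixel * conj(u).
rotated = qmul(qmul(u, pixel), qconj(u))
```

For this particular rotation the three channels are cyclically permuted, so rotated has zero real part and imaginary parts (0.1, 0.8, 0.4). The three channels are handled as one algebraic object instead of three stacked scalars.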


Goal of the thesis:

This thesis should combine emerging and hugely popular technologies in image processing and computer vision with solid mathematical theory, in order to build a sound framework that is validated in some concrete applications but is also more widely applicable. The main goal of the thesis is to build a powerful method for encoding color images using quaternionic dictionaries, starting from the literature and from already developed algorithms such as K-SVD and K-QSVD. The first task will be to explore the efficiency of K-QSVD dictionaries compared to more traditional ones in terms of approximation power (the goal is to compose the image as faithfully as possible by combining as few elements as possible at each local position). Secondly, the developed method will be applied in two concrete image processing applications: image denoising and digital inpainting (see the illustration in Figure 2). In these applications, the use of quaternionic dictionaries will be evaluated in practice and compared to some of the current state-of-the-art methods, which will be made available to the student.


References:

  1. Elad, Michael, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing, Springer, 2010.

  2. Xu, Yi, et al. Vector sparse representation of color image using quaternion matrix analysis. IEEE transactions on image processing, 2015, 24.4: 1315-1329.

  3. Zou, Cuiming, et al. Quaternion collaborative and sparse representation with application to color face recognition. IEEE Transactions on Image Processing, 2016, 25.7: 3287-3302.

  4. Mairal, Julien, et al. Learning multiscale sparse representations for image and video restoration. SIAM Multiscale Modelling and Simulation, 2008, 7(1): 214-241.





2. Convolutional Neural Networks (CNNs) analysed via sparse coding

Academic year: 2019/2020

Mentors: Srđan Lazendić (S8, office 130.071), Shaoguang Huang (Technicum TELIN-GAIM)
Promotor: Aleksandra Pižurica (Technicum TELIN-GAIM)


Description of the problem:

Recently, the use of deep networks has led to unprecedented results across various fields of image and data processing. Although these models reach impressive prediction accuracies, due to their non-linear structure, it is not clear what information in the input data makes them actually arrive at their decisions. The configuration and training of deep learning networks are largely driven by trial-and-error strategies. Since this lack of transparency can be a major drawback, the development of methods for explaining and interpreting deep learning models has recently attracted increasing attention.

Deep learning, as an instance of general representation learning, is naturally connected to sparse signal representations and dictionary learning. While the development of new variants of deep neural networks has been driven largely by intuition, dictionary learning offers a sound theoretical formulation. Hence, there is a lot of interest in combining the two into a powerful yet more interpretable framework. Many problems, such as image and video restoration, benefit from the sparse representation model. The goal of such a model is to approximate a signal well using only a few elements from a (typically redundant) dictionary. Recently, the convolutional sparse coding (CSC) paradigm has been introduced. CSC is a special case of the sparse representation model, built around a highly structured dictionary that is a union of banded and circulant matrices (see Fig. 1).

Fig. 1: The convolutional model description with the composition in terms of the local dictionary (see [1]).
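
The structure of the convolutional dictionary can be checked numerically. The sketch below is a 1-D toy with circular boundary conditions and two made-up local filters (all sizes and values are assumptions). It builds the global dictionary as a concatenation of banded circulant blocks and verifies that multiplying it by the stacked feature maps equals a sum of convolutions.

```python
import numpy as np

def circulant(d, n):
    """n x n banded circulant block whose columns are cyclic shifts of filter d."""
    col = np.zeros(n)
    col[:len(d)] = d
    return np.stack([np.roll(col, s) for s in range(n)], axis=1)

n = 8
filters = [np.array([1.0, -1.0, 0.0]), np.array([0.25, 0.5, 0.25])]

# Global dictionary: one circulant block per local filter, side by side.
D = np.hstack([circulant(d, n) for d in filters])

# Sparse feature maps, one per filter (most entries are zero).
rng = np.random.default_rng(1)
maps = rng.standard_normal((2, n)) * (rng.random((2, n)) < 0.25)

# Synthesis with the global dictionary ...
signal_matrix = D @ maps.ravel()

# ... equals the sum of circular convolutions of each filter with its map.
signal_conv = sum(
    np.real(np.fft.ifft(np.fft.fft(np.pad(d, (0, n - len(d)))) * np.fft.fft(z)))
    for d, z in zip(filters, maps)
)
```

In practice the matrix D is never formed explicitly. It is useful mainly for analysis, which is the point of the CSC model.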

An extension of the CSC model, known as the multi-layer sparse model, has revealed insightful connections between sparse representations and convolutional neural networks (CNNs) (see Fig. 2). The multi-layer CSC model leads to a solid and systematic theoretical justification of the architectures used in deep learning for CNNs. However, many aspects of this approach are yet to be explored, both theoretically and in terms of practical design. Because this approach allows the analysis of CNN architectures and suggests how to build new ones in a systematic fashion, different sparse coding techniques should be analyzed and compared: different solvers for the features result in different architectures. The student will be guided by supervisors from the research group GAIM, whose research expertise is in representation learning, deep learning and sparse coding.

Fig. 2: Decomposition of an image from MNIST in terms of two multilayer convolutional dictionaries. Two local convolutional atoms (bottom row) are combined to create molecules at the second level, which are then combined to create the global atom (the digit 6) (see [2]).
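
The connection between the multi-layer model and CNNs can be illustrated in a few lines. The sketch below uses dense random dictionaries instead of convolutional ones and arbitrary thresholds (both are simplifying assumptions). It shows that layered non-negative soft thresholding coincides with a forward pass through ReLU layers whose weights are the transposed dictionaries and whose biases are the negated thresholds.

```python
import numpy as np

def soft_nonneg(z, beta):
    """Non-negative soft thresholding: zero below beta, shrink the rest by beta."""
    return np.where(z > beta, z - beta, 0.0)

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.default_rng(2)
D1 = rng.standard_normal((16, 32))   # first-layer dictionary (hypothetical sizes)
D2 = rng.standard_normal((32, 24))   # second-layer dictionary
beta1, beta2 = 0.5, 0.3              # thresholds, chosen arbitrarily
x = rng.standard_normal(16)          # input signal

# Sparse-coding view: layered thresholding of dictionary correlations.
g1 = soft_nonneg(D1.T @ x, beta1)
g2 = soft_nonneg(D2.T @ g1, beta2)

# CNN view: weights W = D^T and biases b = -beta, followed by ReLU.
W1, b1 = D1.T, -beta1
W2, b2 = D2.T, -beta2
h2 = relu(W2 @ relu(W1 @ x + b1) + b2)
```

Here g2 and h2 agree entry by entry. Replacing the thresholding step with a different solver for the features yields a different architecture.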


Goal of the thesis:

The goal of the thesis is to build on recent work in multi-layer convolutional sparse coding. Firstly, the student should study and understand the theory behind representation learning and deep learning, and dictionary learning in particular. Secondly, concrete solvers for the features (e.g. ML-ISTA, ML-FISTA, LBP, ML-LISTA...) should be chosen, and the performance of the constructed architectures should be compared both among themselves and against some of the more conventional tools for image processing tasks. Practical applications will be chosen in agreement with the student, based on the student's interests. Possible applications are in large-scale hyperspectral data processing in remote sensing and multimodal data analysis in art investigation. The existing code and literature will be made available to the student.
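
As a starting point for the solvers mentioned above, here is a minimal sketch of plain single-layer ISTA for the problem min 0.5*||y - Dx||^2 + lam*||x||_1. It is not ML-ISTA: the multi-layer variants apply updates of this kind across several dictionaries. The dictionary size, the value of lam and the iteration count are assumptions.

```python
import numpy as np

def soft(z, t):
    """Soft thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(D, y, lam, n_iter=200):
    """Iterative shrinkage-thresholding for 0.5*||y - Dx||^2 + lam*||x||_1."""
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    history = []
    for _ in range(n_iter):
        # Gradient step on the quadratic term, then shrinkage.
        x = soft(x + D.T @ (y - D @ x) / L, lam / L)
        history.append(0.5 * np.sum((y - D @ x) ** 2) + lam * np.sum(np.abs(x)))
    return x, history

rng = np.random.default_rng(4)
D = rng.standard_normal((30, 60))
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms
y = D[:, [5, 20]] @ np.array([2.0, -1.0])
x_hat, history = ista(D, y, lam=0.1)
```

With step size 1/L the objective values in history never increase, which gives a simple sanity check when comparing solvers.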


References:

  1. V. Papyan, Y. Romano, M. Elad, Convolutional neural networks analyzed via convolutional sparse coding, 2017.

  2. V. Papyan, Y. Romano, J. Sulam, M. Elad, Theoretical Foundations of Deep Learning via Sparse Representations, 2018.




3. Sparse subspace clustering for large scale hyperspectral data

Academic year: 2019/2020

Mentors: Srđan Lazendić (S8, office 130.071), Shaoguang Huang (Technicum TELIN-GAIM)
Supervisors: Aleksandra Pižurica (Technicum TELIN-GAIM), Shaoguang Huang (Technicum TELIN-GAIM)


Description of the problem:

Supervised classification methods such as the classical support vector machine (SVM) and the modern convolutional neural network (CNN) require labeled training samples to train the classification model. In some applications, labeled data are scarce or not available, either because data labeling is labor-intensive and time-consuming or simply because not enough examples of a particular phenomenon of interest have been recorded yet. Clustering, as an unsupervised approach, partitions data points into different clusters (classes) without any labeled data. Clustering approaches are therefore especially interesting in cases where supervised classification is not applicable or not reliable enough due to the lack of sufficient annotated data. Such cases often arise in dynamic scenarios such as monitoring forest fires, disaster damage assessment, land use / cover change detection and trajectory data mining.

We focus on the subspace clustering approach, which yields state-of-the-art clustering performance in computer vision, image processing, remote sensing and pattern recognition. The main idea is to model the input data by a union of subspaces and to uncover the cluster structure in lower-dimensional subspaces, as shown in Fig. 1. Compared with the classical fuzzy c-means and k-means methods, subspace clustering approaches are able to reveal the data correlations more precisely, leading to superior clustering performance. We are particularly interested in the processing of hyperspectral images (HSIs) in remote sensing, where the goal in this proposal is to cluster the pixels of an HSI into different groups using their spectral signatures, as shown in Fig. 2.

Fig. 1. The framework of a typical subspace clustering method, which includes subspace learning and representation, graph construction and spectral clustering. X is an input matrix with each column representing a data point; D is a dictionary that models the underlying subspaces; A is the corresponding subspace representation matrix with respect to D.
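
The three stages named in the caption can be sketched on toy data. The example below deliberately simplifies each stage, and all of the following are assumptions: the dictionary is the data itself (D = X), a ridge-regularised least-squares representation replaces the sparse one, the two subspaces are orthogonal, and a simple row comparison stands in for the k-means step of spectral clustering.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy data: six points in each of two orthogonal subspaces of R^5.
X1 = np.zeros((5, 6))
X1[:2] = rng.standard_normal((2, 6))     # 2-D subspace (first two coordinates)
X2 = np.zeros((5, 6))
X2[2:] = rng.standard_normal((3, 6))     # 3-D subspace (last three coordinates)
X = np.hstack([X1, X2])                  # each column is one data point
truth = np.array([0] * 6 + [1] * 6)

# 1) Subspace representation: X ~ X A with a ridge penalty on A (here D = X).
lam = 0.1
G = X.T @ X
A = np.linalg.solve(G + lam * np.eye(G.shape[0]), G)

# 2) Graph construction: symmetric affinity from the representation matrix.
W = np.abs(A) + np.abs(A).T

# 3) Spectral clustering: embed with the graph Laplacian, then group the rows.
lap = np.diag(W.sum(axis=1)) - W
_, vecs = np.linalg.eigh(lap)            # eigenvalues in ascending order
emb = vecs[:, :2]                        # one embedded row per data point
dist = np.linalg.norm(emb - emb[0], axis=1)
labels = (dist > 0.5 * dist.max()).astype(int)
```

On this idealised data the affinity graph splits into two connected components, so labels matches truth. The dense n x n matrix A and the eigendecomposition are the steps that become expensive for large n.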

Despite the excellent clustering accuracy of subspace clustering techniques, their high computational complexity limits their applicability in real applications involving big data sets, especially in real-time processing tasks. It is therefore important to reduce the computational complexity of these clustering models and to develop scalable subspace clustering methods for large-scale data. Important aspects in addressing this problem are understanding representation learning (including dictionary learning and subspace representation) and efficient algorithm design. The research group GAIM has extensive experience in this domain and will provide full support in programming, model construction, optimization algorithms and experimental validation.

Fig. 2. An illustration of subspace clustering applied to hyperspectral remote sensing images.



Goal of the thesis:

The goal of this Master's thesis is to advance current subspace clustering methods, in particular by reducing their computational complexity so that they can be applied to large-scale hyperspectral data. The concrete objectives are:
- Studying recent representation learning approaches from the literature, focusing on the design of discriminative dictionaries from input data.
- Implementing one of these approaches and integrating it into the framework of sparse subspace clustering.
- Conducting experiments on real hyperspectral remote sensing images.

The student will start from the subspace clustering code of GAIM's best available technique and will also have access to other useful GAIM techniques and source code for dictionary learning, sparse coding and optimization algorithms. Motivated students will be encouraged to participate with the developed techniques in data classification challenges of the IEEE Geoscience and Remote Sensing community.




4. Quaternion sparse representation model for color image processing (Quaternionisch schaars representatiemodel voor verwerking van kleurenafbeeldingen)

Academic year: 2018/2019

Mentor: Srđan Lazendić (S8, office 130.071)
Promotors: Aleksandra Pižurica (Technicum TELIN-IPI), Hendrik De Bie (S8, office 130.062)


Description of the problem:

Many problems, such as image and video restoration, compression and coding, digital image inpainting and content analysis, benefit from the so-called sparse representation model (see [1]). Because these techniques are powerful and widely applicable, sparse representations of signals (including images and higher-dimensional data) attract the attention of researchers from diverse fields. The goal of a sparse representation is to give a good approximation of a signal using only a few elements of a dictionary (see Figures 1 and 2). One of the best-known and most widely used approaches to dictionary learning is the K-SVD method. K-SVD is an extension of the K-means clustering method that learns the dictionary efficiently using the singular value decomposition (SVD). Common dictionary learning techniques, including recent K-SVD variants, treat signals in the same way, regardless of their dimension and of the nature of the different channels in data consisting of several components (such as color, multispectral and hyperspectral images). All the data in a 2-D window (for a greyscale image) or a 3-D window (for a multicomponent image) are simply arranged in an array, following a predefined order, and treated as a single 1-D vector.

Fig 1. - For given data Y, the dictionary learning method looks for a dictionary D and a representation matrix X whose columns Xi are sufficiently sparse (Data 61).


A very recent trend in signal processing and machine learning attempts to build an improved sparse representation model of color images by introducing quaternions into the construction of the dictionary (see [2],[3]). Quaternions are four-dimensional generalizations of complex numbers (with three imaginary units instead of one). Because they describe rotations in 3-D efficiently, quaternions have many applications in theoretical and applied mathematics, in engineering fields such as computer graphics and computer vision, and in biomedical processing, remote sensing, hyperspectral image processing and many others. The quaternionic representation with three imaginary units is also perfectly suited to representing three color channels. Quaternions have therefore already been used extensively in color image processing. A very recent method called K-QSVD, a generalization of the K-SVD algorithm to the quaternionic setting, has already shown remarkable results (see Figure 2). Beyond these first encouraging results, the potential of quaternions for improving sparse representations of multicomponent images still has to be investigated. The motivation is that the coefficient matrix preserves not only the correlation between channels but also the orthogonality property. According to recent studies, this is important for the computational complexity and for color fidelity in the reconstruction. However, many aspects of this approach still have to be investigated, both theoretically and in terms of practical design. In this master thesis, the student will be guided by the promotors from the research group 'Image Processing and Interpretation' and the Clifford research group.

Fig. 2. - An example showing the application of K-QSVD in image inpainting. Left: damaged image (70% missing);
Right: an image reconstructed using K-QSVD.


Goal of the thesis:

This thesis combines emerging and very popular technologies in image processing and computer vision with well-founded mathematical theory, in order to build a sound framework that will be validated in a number of concrete applications but is also more widely applicable. The main goal of the thesis is to build a powerful method for encoding color images using quaternionic dictionaries, starting from the literature and from known algorithms such as K-SVD and K-QSVD. The first task is to explore the efficiency of K-QSVD dictionaries compared to more traditional dictionaries in terms of approximation power (the goal is to compose the image as faithfully as possible by combining as few elements as possible at each local position). Secondly, the developed method is to be applied in two concrete image processing applications: image denoising and digital inpainting (see Figure 2). In these applications, the use of quaternionic dictionaries will be evaluated in practice and compared to a number of current state-of-the-art methods, which the student will be able to use.


References:

  1. Elad, Michael, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing, Springer, 2010.

  2. Xu, Yi, et al. Vector sparse representation of color image using quaternion matrix analysis. IEEE transactions on image processing, 2015, 24.4: 1315-1329.

  3. Zou, Cuiming, et al. Quaternion collaborative and sparse representation with application to color face recognition. IEEE Transactions on Image Processing, 2016, 25.7: 3287-3302.

  4. Mairal, Julien, et al. Learning multiscale sparse representations for image and video restoration. SIAM Multiscale Modelling and Simulation, 2008, 7(1): 214-241.