Alternative clustering in subspace projections

Färber, Ines; Assent, Ira; Seidl, Thomas

doi:34519

Marc 21

001			560970
005			20240715104115.0
020	_	_	\|a 978-3-86359-368-1
024	7	_	\|2 URN \|a urn:nbn:de:hbz:82-rwth-2015-066881
024	7	_	\|2 HBZ \|a HT018820505
024	7	_	\|2 Laufende Nummer \|a 34519
037	_	_	\|a RWTH-2015-06688
041	_	_	\|a English
082	_	_	\|a 004
100	1	_	\|0 P:(DE-82)029749 \|a Färber, Ines \|b 0
245	_	_	\|a Alternative clustering in subspace projections \|c Ines Färber \|h print, online
246	_	3	\|a Erkennung alternativer Clusteringlösungen in Teilraumprojektionen \|y German
250	_	_	\|a 1. Aufl.
260	_	_	\|a Aachen \|b Apprimus-Verl. \|c 2015
260	_	_	\|c 2016
336	7	_	\|0 PUB:(DE-HGF)11 \|2 PUB:(DE-HGF) \|a Dissertation / PhD Thesis \|b phd \|m phd
336	7	_	\|0 PUB:(DE-HGF)3 \|2 PUB:(DE-HGF) \|a Book \|m book
336	7	_	\|0 2 \|2 EndNote \|a Thesis
336	7	_	\|2 DRIVER \|a doctoralThesis
336	7	_	\|2 BibTeX \|a PHDTHESIS
336	7	_	\|2 DataCite \|a Output Types/Dissertation
336	7	_	\|2 ORCID \|a DISSERTATION
490	0	_	\|a Ergebnisse aus der Informatik \|v 6
500	_	_	\|a Additional series: Edition Wissenschaft Apprimus. - Also published on the publication server of RWTH Aachen University, 2016
502	_	_	\|a Zugl.: Aachen, Techn. Hochsch., Diss., 2014 \|b Dissertation \|c Zugl.: Aachen, Techn. Hochsch. \|g Fak01 \|o 2014-12-04
520	3	_	\|a Der bisherige technologische Fortschritt führte zu einer Durchdringung aller Lebensbereiche mit Informationssystemen und ermöglicht das einfache und günstige Erfassen großer Datenmengen. Für unsere Informationsgesellschaft ist es jedoch entscheidend aus diesen reichhaltigen Datenquellen nützliche Informationen und Wissen zu generieren. Diesem Ziel hat sich der Forschungsbereich des Data Mining gewidmet, dessen Aufgabe es ist automatisiert oder semi-automatisiert vorher unbekannte Muster aus Daten zu extrahieren. Diese Arbeit beschäftigt sich mit der Aufgabe des Clusterings, welche Objekte anhand ihrer Ähnlichkeit gruppiert. Da moderne Speichertechnologien keine ernsthaften Grenzen mehr aufzeigen, können Daten meist in ihrer vollen Komplexität ohne eine Beschränkung auf lediglich ausgewählte Aspekte erfasst werden. Für solch komplexe Daten stellt jedoch ein einziges Clustering oft keine ausreichende Charakterisierung dar. Stattdessen lassen sich für einen Datensatz oft mehrere, unterschiedliche und sinnvolle Clusterings identifizieren. Das Paradigma des Multi-View Clusterings, auch als Alternative Clustering bezeichnet, hat sich dem Ziel verschrieben explizit nach einer solch diversen Menge mehrerer, alternativer Clusterings zu suchen um alle versteckten Muster der Daten aufzudecken. Eine zweite Beobachtung für komplexe Daten, bei welchen üblicherweise für jedes Objekt eine Vielzahl von Eigenschaften erfasst wurde, ist eine sehr schwach ausgeprägte Ähnlichkeit zwischen Objekten bei Berücksichtigung all ihrer Merkmalsausprägungen. Während ein Clustering unter Berücksichtigung aller Attribute nicht zielführend ist, lassen sich bei Betrachtung einzelner Attributteilmengen, d.h. in Teilraumprojektionen, durchaus sinnvolle Clusterstrukturen identifizieren. 
Dieser Problemstellung haben sich Ansätze des Subspace Clustering Paradigmas angenommen, welche Clusterstrukturen in Teilraumprojektionen identifizieren, sodass für jeden Cluster automatisch auch die Menge der relevanten Attribute bestimmt wird. In dieser Arbeit wollen wir die grundsätzlichen Parallelen beider Paradigmen, Multi-View Clustering und Subspace Clustering, hervorheben, da beiden die Eigenschaft der gleichzeitigen Zugehörigkeit einzelner Objekte zu mehreren Clustern gemein ist. Entsprechend stellen wir verschiedene Ansätze vor die durch die Kombination beider Paradigmen Synergieeffekte nutzen um mehrere, verschiedene Gruppierungen in Teilraumprojektionen zu identifizieren. \|l ger
520	_	_	\|a The technological advances of recent years have led information systems to pervade all areas of life and make it convenient and affordable to gather large amounts of data. The key to our information society is transforming the raw data in these comprehensive databases into information and knowledge. One research area committed to this goal is data mining, whose task is to automatically or semi-automatically extract previously unknown patterns from such data sources. The subject of this thesis is the mining task of clustering, which groups objects by similarity so that similar objects are placed together while dissimilar ones are separated. Since modern storage systems no longer impose practical limitations, data can be captured in its full complexity without restriction to a small, selected set of aspects. For such complex data, identifying just a single clustering is often not sufficient. Instead, multiple alternative and valid clusterings can be identified for a single dataset, each highlighting different aspects of the data. The paradigm of multi-view clustering, also referred to as alternative clustering, is dedicated to explicitly discovering such a diverse set of multiple, alternative clusterings in order to find all hidden patterns in the data. A second observation for complex data sources, where many characteristics are usually stored for each object, is that similar objects cannot be found when all of these characteristics are considered together. While clustering based on all attributes, that is, in the full space, is futile, valuable cluster patterns can be found for subsets of attributes, that is, in subspace projections. This problem is tackled by approaches of the subspace clustering paradigm, which aim to uncover clustering structures hidden in subspace projections, such that a set of relevant attributes is determined automatically for each cluster. 
In this thesis, we highlight fundamental parallels between the two paradigms of multi-view clustering and subspace clustering, since both account for the possibility of objects belonging to multiple clusters simultaneously. Consequently, we present several approaches that exploit synergies by combining both paradigms to find multiple, alternative clusterings in subspace projections of the data. \|l eng
591	_	_	\|a Germany
653	_	7	\|a Informatik
653	_	7	\|a data mining
653	_	7	\|a Cluster-Analyse
653	_	7	\|a Cluster
653	_	7	\|a Clusterverfahren
653	_	7	\|a Hochdimensionale Daten
653	_	7	\|a Datenbank
653	_	7	\|a Algorithmus
653	_	7	\|a Netzwerk
653	_	7	\|a Clustering
653	_	7	\|a Teilraum Clustering
653	_	7	\|a subspace clustering
653	_	7	\|a Teilraumprojektion
653	_	7	\|a subspace projections
653	_	7	\|a Redundanzentfernung
653	_	7	\|a redundancy avoidance
653	_	7	\|a graph mining
653	_	7	\|a network clustering
653	_	7	\|a multi-view clustering
653	_	7	\|a Wissensextraktion
700	1	_	\|0 P:(DE-82)001596 \|a Seidl, Thomas \|b 1 \|e Thesis advisor
700	1	_	\|0 P:(DE-82)001595 \|a Assent, Ira \|b 2 \|e Thesis advisor
856	4	_	\|u https://publications.rwth-aachen.de/record/560970/files/560970.pdf \|y OpenAccess
856	4	_	\|u https://publications.rwth-aachen.de/record/560970/files/560970_source.zip \|y restricted
856	4	_	\|u https://publications.rwth-aachen.de/record/560970/files/560970.gif?subformat=icon \|x icon \|y OpenAccess
856	4	_	\|u https://publications.rwth-aachen.de/record/560970/files/560970.jpg?subformat=icon-1440 \|x icon-1440 \|y OpenAccess
856	4	_	\|u https://publications.rwth-aachen.de/record/560970/files/560970.jpg?subformat=icon-180 \|x icon-180 \|y OpenAccess
856	4	_	\|u https://publications.rwth-aachen.de/record/560970/files/560970.jpg?subformat=icon-640 \|x icon-640 \|y OpenAccess
856	4	_	\|u https://publications.rwth-aachen.de/record/560970/files/560970.jpg?subformat=icon-700 \|x icon-700 \|y OpenAccess
856	4	_	\|u https://publications.rwth-aachen.de/record/560970/files/560970.pdf?subformat=pdfa \|x pdfa \|y OpenAccess
909	C	O	\|o oai:publications.rwth-aachen.de:560970 \|p dnbdelivery \|p VDB \|p driver \|p urn \|p open_access \|p openaire
914	1	_	\|y 2015
915	_	_	\|0 StatID:(DE-HGF)0510 \|2 StatID \|a OpenAccess
920	1	_	\|0 I:(DE-82)122510_20140620 \|k 122510 \|l Lehrstuhl für Informatik 9 (Datenmanagement und -exploration) \|x 0
920	1	_	\|0 I:(DE-82)120000_20140620 \|k 120000 \|l Fachgruppe Informatik \|x 1
980	1	_	\|a FullTexts
980	_	_	\|a phd
980	_	_	\|a VDB
980	_	_	\|a book
980	_	_	\|a I:(DE-82)122510_20140620
980	_	_	\|a I:(DE-82)120000_20140620
980	_	_	\|a UNRESTRICTED
