% Visual discovery of landmarks and their details in large-scale image collections
% Weyand, Tobias; Chum, Ondrej; Leibe, Bastian
% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@PHDTHESIS{Weyand:681424,
      author       = {Weyand, Tobias},
      othercontributors = {Leibe, Bastian and Chum, Ondrej},
      title        = {{V}isual discovery of landmarks and their details in
                      large-scale image collections},
      volume       = {2},
      school       = {RWTH Aachen University},
      type         = {Dissertation},
      address      = {Aachen},
      publisher    = {Shaker},
      reportid     = {RWTH-2017-00203},
      isbn         = {978-3-8440-4882-7},
      series       = {Selected topics in computer vision},
      pages        = {1 online resource (viii, 171 pages) : illustrations,
                      diagrams},
      year         = {2016},
      note         = {Print edition: 2016. - Also published on the
                      publication server of RWTH Aachen University, 2017;
                      Dissertation, RWTH Aachen University, 2016},
      abstract     = {With their rapid growth in recent years, Internet photo
                      collections have become an invaluable repository of visual
                      data. In particular, they provide detailed coverage of the
                      world’s landmark buildings, monuments, sculptures, and
                      paintings. This wealth of visual information can be used to
                      construct landmark recognition engines that can
                      automatically tag a photo of a landmark with its name and
                      location. Landmark recognition engines rely on clustering
                      algorithms that are able to group several millions of images
                      by the buildings or objects they depict. This grouping
                      problem is very challenging since the massive amount of
                      Internet images requires efficient and highly parallel
                      algorithms, and the appearance variability of buildings
                      caused by viewpoint, weather and lighting changes requires
                      robust image similarity measures. Most importantly, it is
                      critical to define a clustering criterion that results in
                      meaningful object clusters. The Iconoid Shift algorithm we
                      present in this thesis uses a very intuitive definition: It
                      represents each object by an iconic image, or Iconoid, which
                      is the image that has the highest overlap with all other
                      images of the object. The object cluster is then the set of
                      all images that have a certain minimum overlap with the
                      Iconoid. We find Iconoids by performing mode search using a
                      novel distance measure based on image overlap that is more
                      robust to viewpoint and lighting changes than traditional
                      image distance measures. We propose efficient parallel
                      algorithms for performing this mode search. In contrast to
                      most previous algorithms that produced a hard clustering,
                      Iconoid Shift produces an overlapping clustering and thus
                      elegantly handles images showing multiple nearby landmarks
                      by assigning them to multiple clusters. The increasing
                      density of Internet photo collections allows us to go a step
                      further and to even discover sub-structures of buildings
                      such as doors, spires, or facade details. To this end, we
                      present the Hierarchical Iconoid Shift algorithm that,
                      instead of a flat clustering, produces a hierarchy of
                      clusters, where each cluster represents a building
                      sub-structure. This algorithm is based on a novel
                      hierarchical variant of Medoid Shift that tracks the
                      evolution of modes through scale space by continuously
                      increasing the size of its kernel window. But which objects
                      can a landmark recognition engine built by automatically
                      mining Internet photo collections recognize? And how should
                      such a system be constructed so that it is efficient and
                      achieves high recognition performance? To answer these
                      questions, we perform a large-scale evaluation of the
                      different components of a landmark recognition system,
                      analyzing how different choices of components and parameters
                      affect performance for different object categories such as
                      buildings, paintings or sculptures. As a final contribution,
                      we consider a practical problem of the image retrieval
                      methods that our algorithms are based on: a large fraction
                      of the photos in Internet photo collections have visible
                      watermarks, timestamps, or frames embedded in the image
                      content. These artifacts often cause false-positive image
                      matches. We present a simple but highly efficient and
                      effective method to detect such matches and thus prevent
                      errors in landmark discovery and recognition.},
      cin          = {123720 / 120000},
      ddc          = {004},
      cid          = {$I:(DE-82)123720_20140620$ / $I:(DE-82)120000_20140620$},
      typ          = {PUB:(DE-HGF)3 / PUB:(DE-HGF)11},
      urn          = {urn:nbn:de:hbz:82-rwth-2017-002035},
      doi          = {10.18154/RWTH-2017-00203},
      url          = {https://publications.rwth-aachen.de/record/681424},
}