Conference paper Open Access

Unsupervised Video Summarization via Attention-Driven Adversarial Learning

Apostolidis, Evlampios; Adamantidou, Eleni; Metsai, Alexandros; Mezaris, Vasileios; Patras, Ioannis


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Video summarization</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Unsupervised learning</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Attention mechanism</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Adversarial learning</subfield>
  </datafield>
  <controlfield tag="005">20200120174125.0</controlfield>
  <controlfield tag="001">3605501</controlfield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="d">January 2020</subfield>
    <subfield code="g">MMM 2020</subfield>
    <subfield code="a">26th Int. Conf. on Multimedia Modeling</subfield>
    <subfield code="c">Daejeon, Korea</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">CERTH</subfield>
    <subfield code="a">Adamantidou, Eleni</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">CERTH</subfield>
    <subfield code="a">Metsai, Alexandros</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">CERTH</subfield>
    <subfield code="a">Mezaris, Vasileios</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">QMUL</subfield>
    <subfield code="a">Patras, Ioannis</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">978801</subfield>
    <subfield code="z">md5:beaa35b2015dcdd1b1c3f0c86dcc9199</subfield>
    <subfield code="u">https://zenodo.org/record/3605501/files/mmm2020_lncs11961_1_preprint.pdf</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2020-01-06</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="p">user-retv-h2020</subfield>
    <subfield code="o">oai:zenodo.org:3605501</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">CERTH &amp; QMUL</subfield>
    <subfield code="a">Apostolidis, Evlampios</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Unsupervised Video Summarization via Attention-Driven Adversarial Learning</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-retv-h2020</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">780656</subfield>
    <subfield code="a">Enhancing and Re-Purposing TV Content for Trans-Vector Engagement</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;This paper presents a new video summarization approach that integrates an attention mechanism to identify the significant parts of the video, and is trained unsupervisingly via generative adversarial learning. Starting from the SUM-GAN model, we rst develop an improved version of it (called SUM-GAN-sl) that has a significantly reduced number of learned parameters, performs incremental training of the model&amp;#39;s components, and applies a stepwise label-based strategy for updating the adversarial part. Subsequently, we introduce an attention mechanism to SUM-GAN-sl in two ways: i) by integrating an attention layer within the variational auto-encoder (VAE) of the architecture (SUM-GAN-VAAE), and ii) by replacing the VAE with a deterministic attention auto-encoder (SUM-GAN-AAE). Experimental evaluation on two datasets (SumMe and TVSum) documents the contribution of the attention auto-encoder to faster and more stable training of the model, resulting in a signicant performance improvement with respect to the original model and demonstrating the competitiveness of the proposed SUM-GAN-AAE against the state of the art.&amp;nbsp;Software is publicly available at: https://github.com/e-apostolidis/SUM-GAN-AAE&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.1007/978-3-030-37731-1_40</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">conferencepaper</subfield>
  </datafield>
</record>
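
The abstract describes the core SUM-GAN-AAE idea: a frame scorer weights the deep frame features by importance, and a deterministic attention auto-encoder reconstructs the feature sequence from the weighted input. Below is a minimal PyTorch sketch of such an attention auto-encoder. The layer sizes, the single-layer LSTMs, the dot-product attention, and the 1024-dimensional frame features are illustrative assumptions, not the paper's exact design; see the linked GitHub repository for the authors' implementation.

import torch
import torch.nn as nn

class AttentionAutoEncoder(nn.Module):
    """Deterministic attention auto-encoder over a sequence of frame features (sketch)."""
    def __init__(self, feat_dim=1024, hidden_dim=512):
        super().__init__()
        # Frame scorer: one importance score in [0, 1] per frame.
        self.scorer = nn.Sequential(nn.Linear(feat_dim, 1), nn.Sigmoid())
        self.encoder = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        # Decoder consumes the previous reconstruction plus an attention context.
        self.decoder = nn.LSTMCell(feat_dim + hidden_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, feat_dim)

    def forward(self, x):
        # x: (batch, frames, feat_dim) deep features of the video frames
        scores = self.scorer(x)                    # (batch, frames, 1)
        weighted = scores * x                      # down-weight unimportant frames
        enc_out, (h, c) = self.encoder(weighted)   # enc_out: (batch, frames, hidden)
        h, c = h.squeeze(0), c.squeeze(0)
        prev = x.new_zeros(x.size(0), x.size(2))   # previously reconstructed feature
        recon = []
        for _ in range(x.size(1)):
            # Dot-product attention of the decoder state over all encoder outputs.
            attn = torch.softmax(torch.bmm(enc_out, h.unsqueeze(2)), dim=1)
            context = (attn * enc_out).sum(dim=1)  # (batch, hidden)
            h, c = self.decoder(torch.cat([prev, context], dim=1), (h, c))
            prev = self.out(h)
            recon.append(prev)
        return scores, torch.stack(recon, dim=1)   # scores + reconstructed sequence

model = AttentionAutoEncoder()
scores, recon = model(torch.randn(1, 60, 1024))   # e.g. a 60-frame feature sequence

In the full adversarial setup described in the abstract, a discriminator would judge the reconstructed sequence against the original one, pushing the scorer to retain the frames needed for a faithful reconstruction.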
Views 462
Downloads 179
Data volume 175.2 MB
Unique views 448
Unique downloads 168
