Conference paper Open Access

A Stepwise, Label-based Approach for Improving the Adversarial Training in Unsupervised Video Summarization

Apostolidis, Evlampios; Metsai, Alexandros; Adamantidou, Eleni; Mezaris, Vasileios; Patras, Ioannis


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="942" ind1=" " ind2=" ">
    <subfield code="a">2019-10-21</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Video Summarization</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Unsupervised Learning</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Adversarial Training</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Evaluation Protocol</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Datasets</subfield>
  </datafield>
  <controlfield tag="005">20200120172107.0</controlfield>
  <controlfield tag="001">3395967</controlfield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="d">21 October 2019</subfield>
    <subfield code="g">AI4TV@ACMMM 2019</subfield>
    <subfield code="a">1st Int. Workshop on AI for Smart TV Content Production, Access and Delivery (AI4TV'19) at ACM Multimedia 2019</subfield>
    <subfield code="c">Nice, France</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">CERTH-ITI, Thermi, Greece</subfield>
    <subfield code="a">Metsai, Alexandros</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">CERTH-ITI, Thermi, Greece</subfield>
    <subfield code="a">Adamantidou, Eleni</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">CERTH-ITI, Thermi, Greece</subfield>
    <subfield code="a">Mezaris, Vasileios</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Queen Mary University of London, UK</subfield>
    <subfield code="a">Patras, Ioannis</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1442696</subfield>
    <subfield code="z">md5:756d6784b1ce31dbae500bd0d794d2cf</subfield>
    <subfield code="u">https://zenodo.org/record/3395967/files/Apostolidis_Summarization.pdf</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2019-10-21</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="p">user-retv-h2020</subfield>
    <subfield code="o">oai:zenodo.org:3395967</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">CERTH-ITI, Thermi, Greece, and Queen Mary University of London, UK</subfield>
    <subfield code="a">Apostolidis, Evlampios</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">A Stepwise, Label-based Approach for Improving the Adversarial Training in Unsupervised Video Summarization</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-retv-h2020</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">780656</subfield>
    <subfield code="a">Enhancing and Re-Purposing TV Content for Trans-Vector Engagement</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;In this paper we present our work on improving the efficiency of adversarial training for unsupervised video summarization. Our starting point is the SUM-GAN model, which creates a representative summary based on the intuition that such a summary should make it possible to reconstruct a video that is indistinguishable from the original one. We build on a publicly available implementation of a variation of this model that includes a linear compression layer to reduce the number of learned parameters and applies an incremental approach for training the different components of the architecture. After assessing the impact of these changes on the model&amp;rsquo;s performance, we propose a stepwise, label-based learning process to improve the training efficiency of the adversarial part of the model. Before evaluating our model&amp;rsquo;s efficiency, we perform a thorough study with respect to the used evaluation protocols and we examine the possible performance on two benchmarking datasets, namely SumMe and TVSum. Experimental evaluations and comparisons with the state of the art highlight the competitiveness of the proposed method. An ablation study indicates the benefit of each applied change on the model&amp;rsquo;s performance, and points out the advantageous role of the introduced stepwise, label-based training strategy on the learning efficiency of the adversarial part of the architecture.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.1145/3347449.3357482</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">conferencepaper</subfield>
  </datafield>
</record>
Views 303
Downloads 63
Data volume 90.9 MB
Unique views 291
Unique downloads 57
