Visual effects extension

Visual effects extension mods

Effect of Visual Extensions on Natural Language Understanding in Vision-and-Language Models. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.

Abstract: A method for creating a vision-and-language (V&L) model is to extend a language model through structural modifications and V&L pre-training. Such an extension aims to make a V&L model inherit the capability of natural language understanding (NLU) from the original language model. To see how well this is achieved, we propose to evaluate V&L models using an NLU benchmark (GLUE). We compare five V&L models, including single-stream and dual-stream models, trained with the same pre-training. Dual-stream models, with their higher modality independence achieved by approximately doubling the number of parameters, are expected to preserve the NLU capability better. Our main finding is that the dual-stream scores are not much different than the single-stream scores, contrary to expectation. Further analysis shows that pre-training causes the performance drop in NLU tasks with few exceptions. These results suggest that adopting a single-stream structure and devising the pre-training could be an effective method for improving the maintenance of language knowledge in V&L extensions.
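The comparison described in the abstract (scoring several models on the same NLU tasks and checking how far apart two model families land) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' evaluation code: the helper names `macro_average` and `per_task_gap` are hypothetical, and the scores are made-up placeholders, not results from the paper.

```python
from statistics import mean
from typing import Dict

# Maps a task name to a score on a 0-100 scale.
Scores = Dict[str, float]


def macro_average(scores: Scores) -> float:
    """Unweighted mean of the per-task scores (hypothetical helper)."""
    if not scores:
        raise ValueError("scores must not be empty")
    return mean(list(scores.values()))


def per_task_gap(a: Scores, b: Scores) -> Scores:
    """Score of `a` minus score of `b` on every task both models share."""
    shared = sorted(set(a) & set(b))
    return {task: a[task] - b[task] for task in shared}


# Placeholder values for illustration only; NOT numbers reported in the paper.
single_stream = {"CoLA": 50.0, "MRPC": 80.0, "QNLI": 85.0}
dual_stream = {"CoLA": 52.0, "MRPC": 79.0, "QNLI": 85.0}

gaps = per_task_gap(dual_stream, single_stream)
avg_gap = macro_average(dual_stream) - macro_average(single_stream)

print("per-task gap (dual minus single):", gaps)
print("macro-average gap:", round(avg_gap, 2))
```

A small gap in both the per-task and the averaged view would correspond to the kind of finding the abstract reports, namely that the two structures score about the same. Real GLUE tasks use different metrics per task, so a faithful comparison would compute each task's own metric before averaging.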



Anthology ID: 2021.emnlp-main.167
Volume: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2021
Address: Online and Punta Cana, Dominican Republic
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 2189–2196
DOI: 10.18653/v1/2021.emnlp-main.167
Bibkey: iki-aizawa-2021-effect
Code: alab-nii/eval_vl_glue
Data: CoLA, GLUE, MRPC, QNLI