
Pure substance, no ads: how to elegantly piece together, de-duplicate, and polish your graduation thesis


Contents
  • Piecing together
    • Selecting styles
    • Examples (translated into Chinese using GPT)
    • Piecing together (by hand)
    • Piecing together (quoting GPT directly)
      • CNKI plagiarism check rate
      • AIGC detection rate
  • Plagiarism checking
    • Standard check
    • AIGC check
  • Reducing the duplication rate
    • Using GPT directly
    • Condense, then expand
      • Condense
      • Expand
    • Chinese-Japanese-English round-trip translation
      • Chinese to Japanese
      • Japanese to English
      • English to Chinese
      • Result
  • Polishing (must be done)
    • GPT polishing
    • Manual polishing (recommended)

Piecing together

Selecting styles

Style 1

[screenshot]

Style 2

[screenshot]

Style 3

[screenshot]

Examples (translated into Chinese using GPT)

Attention-Based Multimodal Fusion for Video Description (1)

We evaluated our proposed feature fusion method on the YouTube2Text [9] and MSR-VTT [32] video datasets. YouTube2Text contains ......

MSR-VTT [32] contains 10,000 web video clips totaling 41.2 hours and provides 200,000 clip-sentence pairs covering 20 categories and a wide variety of video content. Each clip is annotated with an average of about 20 natural-language sentences. The dataset is split into training, validation, and test sets in proportions of 65%, 5%, and 30%, corresponding to 6,513, 497, and 2,990 video clips. However, since these video clips are hosted on YouTube, some MSR-VTT videos have been removed due to content or copyright issues. At the time we downloaded the videos (February 2017), approximately 12% were unavailable. We therefore used only the available subset of the MSR-VTT dataset for training and testing: 5,763 clips for the training set, 419 clips for the validation set, and 2,616 clips for the test set.

Global2Local: A Joint-Hierarchical Attention for Video Captioning (2)

We evaluated the proposed G2L method on the MSVD (Chen and Dolan 2011) and MSR-VTT (Xu et al. 2016) datasets. MSVD consists of ......

MSR-VTT is a widely used benchmark dataset for video caption generation and is more challenging than MSVD. It contains 10,000 video clips from 20 general categories, and each video clip is paired with 20 manually annotated natural-language captions. We follow its standard data split: 6,513 clips for training, 497 clips for validation, and the remaining 2,990 clips for testing.

Multimodal feature fusion based on object relation for video captioning (3)

Microsoft Research Video to Text (the translated title; note the translation is problematic). The Microsoft Research Video to Text (MSR-VTT) [17] dataset contains 10,000 video clips, divided into three parts: training, validation, and testing. Each video clip is annotated with approximately 20 English captions, for a total of 200,000 captions. In addition, MSR-VTT provides category information for each video (20 categories in total). Based on the experiments in these papers [9, 12], we use the publicly available split for training and testing: 6,513 video clips for training, 497 clips for validation, and 2,990 clips for testing.

[screenshot]

Controllable Video Captioning with POS Sequence Guidance Based on Gated Fusion Network (4)

MSR-VTT is a large-scale dataset for video caption generation, covering the most diverse visual content to date. It contains 10,000 video clips from 20 categories and 200,000 video-caption pairs containing a total of 29,000 unique words; each video clip corresponds to 20 English sentence descriptions. Following existing work, we use the publicly available data split for training and testing: 6,513 clips for training, 497 for validation, and 2,990 for testing.

Piecing together (by hand)

We evaluated the proposed G2L approach on the YouTube2Text [9] and MSR-VTT [32] video datasets. (1) MSVD consists of ...... (2)

The Microsoft Research Video to Text (MSR-VTT) [17] dataset contains 10,000 video clips, (3) along with 200,000 video-caption pairs containing a total of 29,000 unique words. (4) Based on the experiments in these papers [9, 12], (3) this dataset is divided into training, validation, and test sets, (1) of which 6,513 [video clips] are used for training, 497 [video clips] for validation, and 2,990 [video clips] for testing. (4) However, because these video clips are hosted on YouTube, some of the MSR-VTT videos have been removed due to content or copyright issues. At the time we downloaded the videos (February 2017), about 12% were unavailable. Therefore, we used only a subset of the available MSR-VTT dataset for training and testing: 5,763 clips for the training set, 419 clips for the validation set, and 2,616 clips for the test set. (1)

Note: the passage highlighted in yellow is the key touch. Which videos were actually removed for content or copyright reasons, or which part of the videos could not be found, who knows whether that is really the case; nobody verifies it, and it would be very hard to verify. So by removing some of the data that scores poorly and keeping the useful subset, the model is bound to perform well. This is a basic move many papers make with their datasets, especially private ones.
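For what it's worth, that kind of quiet filtering is only a few lines of code. The sketch below is hypothetical: the directory layout and file naming are invented for illustration, and a real paper would load the official MSR-VTT split files.

```python
# Hypothetical sketch of "keep only the clips that are still available".
# The directory layout and .mp4 naming are invented for illustration.
from pathlib import Path

VIDEO_DIR = Path("msrvtt/videos")  # assumed location of the downloaded clips

def filter_split(clip_ids: list[str]) -> list[str]:
    """Drop every clip whose video file is missing from disk."""
    return [cid for cid in clip_ids if (VIDEO_DIR / f"{cid}.mp4").exists()]

# e.g. the official 6,513-clip training split quietly shrinks to ~5,763:
# train_ids = filter_split(official_train_ids)
```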

Piecing together (quoting GPT directly)

[screenshot]

Because of CNKI's minimum word-count requirement, a chunk of filler text was added.

CNKI plagiarism check rate

0%

[screenshot]

AIGC detection rate

0%

[screenshot]

Now you see why this year's Nobel Prizes went to AI. Even so, you cannot trust AI completely: after generating a paragraph with GPT, be sure to polish it again, as covered later in this post.

Plagiarism checking

CNKI plagiarism check

[screenshot]

Standard check

[screenshot]

AIGC check

[screenshot]

Reducing the duplication rate

Using GPT directly

Use a prompt directly. The results will not be especially good, so the two methods that follow are recommended instead.

[screenshot]
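If you would rather script this than paste paragraphs into the chat window, something like the following works. It is only a sketch: it assumes the openai Python SDK (v1+) with an OPENAI_API_KEY in the environment, and the model name and prompt wording are my own, not from the screenshot.

```python
# Minimal sketch: one-shot "rewrite to reduce duplication" call.
# Assumes openai >= 1.0 and OPENAI_API_KEY set; model and prompt are illustrative.
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "Rewrite the user's paragraph so it keeps exactly the same meaning but "
    "uses different sentence structures and synonyms wherever possible."
)

def reduce_duplication(paragraph: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat model works; this choice is an assumption
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": paragraph},
        ],
    )
    return resp.choices[0].message.content
```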

Condense, then expand

Condense

[screenshot]

Expand

[screenshot]
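The two steps above chain naturally into one function. Again a sketch under the same assumptions (openai SDK, illustrative model and prompts): the condense step strips the original phrasing down to bare facts, and the expand step rebuilds a paragraph in fresh wording.

```python
# Sketch of condense-then-expand as a two-step prompt chain.
# Same assumptions as before: openai >= 1.0, OPENAI_API_KEY, illustrative prompts.
from openai import OpenAI

client = OpenAI()

def ask(instruction: str, text: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": instruction},
            {"role": "user", "content": text},
        ],
    )
    return resp.choices[0].message.content

def condense_then_expand(paragraph: str) -> str:
    # Step 1: condense to bare facts, discarding the original sentence forms.
    notes = ask("Condense this paragraph into two or three short factual sentences.", paragraph)
    # Step 2: expand the notes back into fluent academic prose.
    return ask("Expand these notes into a fluent academic paragraph.", notes)
```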

Chinese-Japanese-English round-trip translation

Chinese to Japanese

[screenshot]

Japanese to English

[screenshot]

English to Chinese

[screenshot]
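Scripted, the three hops chain like this (a sketch, same assumptions as the earlier ones). Each intermediate language forces GPT to re-phrase from scratch, which is exactly what lowers the duplication rate.

```python
# Sketch of the Chinese -> Japanese -> English -> Chinese round trip.
# Assumptions as before: openai >= 1.0, OPENAI_API_KEY, illustrative model name.
from openai import OpenAI

client = OpenAI()

def translate(text: str, target: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": f"Translate the user's text into {target}. Output only the translation."},
            {"role": "user", "content": text},
        ],
    )
    return resp.choices[0].message.content

def round_trip(chinese_paragraph: str) -> str:
    japanese = translate(chinese_paragraph, "Japanese")
    english = translate(japanese, "English")
    return translate(english, "Chinese")
```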

Result

Original: The MSR-VTT dataset is an important large-scale benchmark dataset in the field of video description and is widely used in research on video caption generation. The dataset contains 10,000 video clips from 20 different categories, each accompanied by 20 manually annotated English captions, for a total of 200,000 video-caption pairs. These captions cover 29,000 unique words, giving the dataset strong diversity and content richness. The MSR-VTT dataset follows a standard split of 6,513 video clips for training, 497 for validation, and 2,990 for testing, ensuring comparability and reliability of results. Thanks to its wide range of application scenarios and its difficulty, this dataset has been widely adopted for evaluating and comparing video description algorithms.

After de-duplication: The MSR-VTT dataset is a large-scale benchmark dataset that plays an important role in the field of video description and is widely used in research on video caption generation. The dataset contains 10,000 video clips collected from 20 different categories, each accompanied by 20 manually annotated English captions, for a total of 200,000 video-caption pairs. These captions cover 29,000 unique words, representing a high degree of diversity and content richness. The MSR-VTT dataset is divided according to the standard split: 6,513 video clips for training, 497 for validation, and 2,990 for testing, which ensures the comparability and reliability of the results. Owing to its wide range of applications and high difficulty, the dataset is widely used for evaluating and comparing video description algorithms.

Polishing (must be done)

GPT polishing

Edit the prompt yourself according to your needs.

[screenshot]
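As a starting point for a prompt of your own, here is a hypothetical polishing prompt that pins down the technical terms so GPT does not paraphrase them away; the protected-term list and wording are invented.

```python
# Hypothetical polishing prompt that protects domain terms from being rewritten.
# Assumptions as before; the term list is an invented example.
from openai import OpenAI

client = OpenAI()

PROTECTED_TERMS = ["MSR-VTT", "video captioning", "G2L"]  # invented example list

def polish(paragraph: str) -> str:
    instruction = (
        "Polish this academic paragraph: smooth the transitions and fix awkward "
        "phrasing, but leave these terms exactly as written: "
        + ", ".join(PROTECTED_TERMS)
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": instruction},
            {"role": "user", "content": paragraph},
        ],
    )
    return resp.choices[0].message.content
```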

Manual polishing (recommended)

After de-duplication: The MSR-VTT dataset is a large-scale benchmark dataset that plays an important role in the field of video description and is widely used in research on video caption generation. The dataset contains 10,000 video clips collected from 20 different categories, each accompanied by 20 manually annotated English captions, for a total of 200,000 video-caption pairs. These captions cover 29,000 unique words, representing a high degree of diversity and content richness. The MSR-VTT dataset is divided according to the standard split: 6,513 video clips for training, 497 for validation, and 2,990 for testing, which ensures the comparability and reliability of the results. Owing to its wide range of applications and high difficulty, the dataset is widely used for evaluating and comparing video description algorithms.

Microsoft Research Video to Text (the translated title; note the translation is problematic). The Microsoft Research Video to Text (MSR-VTT) [17] dataset contains 10,000 video clips, divided into three parts: training, validation, and testing. Each video clip is annotated with approximately 20 English captions, for a total of 200,000 captions. In addition, MSR-VTT provides category information for each video (20 categories in total). Based on the experiments in these papers [9, 12], we use the publicly available split for training and testing: 6,513 video clips for training, 497 clips for validation, and 2,990 clips for testing.

I cannot think of a better example off the top of my head, but the two paragraphs above should suffice. Whether a paragraph was pieced together or touched up by GPT, manual polishing is essential: in general that means correcting the technical terms, then reading the text through once or twice to make sure the logic holds and, above all, that the transitions between sentences flow smoothly.

There is no denying that GPT is powerful, exceptionally powerful, but its output must be manually polished!

There is no denying that GPT is powerful, exceptionally powerful, but its output must be manually polished!

There is no denying that GPT is powerful, exceptionally powerful, but its output must be manually polished!!!