Update webpage and readme #56

Merged
merged 5 commits into from
Jun 26, 2024
6 changes: 3 additions & 3 deletions README.md
@@ -15,11 +15,11 @@ This is the offical Github repository of Panda-70M.
[Ming-Hsuan Yang](https://faculty.ucmerced.edu/mhyang/),
[Sergey Tulyakov](http://www.stulyakov.com/)
</br>
*Computer Vision and Pattern Recognition 2024*
*Computer Vision and Pattern Recognition (CVPR) 2024*

<!-- [Arxiv Report](https://arxiv.org/abs/2307.04725) | [Project Page](https://snap-research.github.io/Panda-70M) -->
[![arXiv](https://img.shields.io/badge/arXiv-2402.19479-b31b1b.svg)](https://arxiv.org/abs/2402.19479)
[![Project Page](https://img.shields.io/badge/Project-Website-green)](https://snap-research.github.io/Panda-70M)
[![YouTube](https://badges.aleen42.com/src/youtube.svg)](https://youtu.be/m2NQ5k1oTcs)

## Introduction
Panda-70M is a large-scale dataset with 70M high-quality video-caption pairs.
@@ -86,7 +86,7 @@ More details can be found in [Dataset Dataloading](./dataset_dataloading) sectio
</tr>
</table>

<sup>**We will remove the video samples from our dataset / Github / project webpage as long as you need it. Please contact tsaishienchen at gmail dot com for the request.</sup>
<sup>**We will remove the video samples from our dataset / Github / project webpage / technical presentation as long as you need it. Please contact tsaishienchen at gmail dot com for the request.</sup>

Please check [here](https://snap-research.github.io/Panda-70M/more_samples) for more samples.

4 changes: 2 additions & 2 deletions captioning/README.md
@@ -1,6 +1,6 @@
# 🐼 Panda-70M: Video Captioning

**[Note] To use our captioning code, please make sure you follow [this guideline](https://github.com/lm-sys/FastChat/blob/main/docs/vicuna_weights_version.md#how-to-apply-delta-weights-only-needed-for-weights-v0) and correctly prepare vicuna-7b-v0 weight. Basically, you need to first download the original weights and then apply delta weights. Improper weights preparation will lead to meaningless outputs.**
**[Note] To run the captioning code, please make sure you follow [this guideline](https://github.com/lm-sys/FastChat/blob/main/docs/vicuna_weights_version.md#how-to-apply-delta-weights-only-needed-for-weights-v0) and correctly prepare vicuna-7b-v0 weight. You need to first download the original weights and then apply delta weights. Improper weights preparation will lead to meaningless outputs.**

## Introduction
We propose a video captioning model to generate a caption for a short video clip.
@@ -61,7 +61,7 @@ Please look at the video and faithfully summarize it in one sentence.</sup></td>
</tr>
</table>

<sup>**We will remove the video samples from our dataset / Github / project webpage as long as you need it. Please contact tsaishienchen at gmail dot com for the request.</sup>
<sup>**We will remove the video samples from our dataset / Github / project webpage / technical presentation as long as you need it. Please contact tsaishienchen at gmail dot com for the request.</sup>

- **[Note]** You might get different outputs due to the randomness of LLM's generation.

Binary file modified captioning/assets/architecture.png
100755 → 100644
2 changes: 1 addition & 1 deletion dataset_dataloading/README.md
@@ -1,7 +1,7 @@
# 🐼 Panda-70M: Dataset Dataloading
The section includes the csv files listing the data samples in Panda-70M and the code to download the videos.

**[Note] Please use the video2dataset tool from this repository to download the dataset, as the video2dataset from [the official repository](https://github.com/iejMac/video2dataset) cannot work with our csv format for now.**
**[Note] Please use the video2dataset tool from this repository to download the dataset, as the video2dataset from [the official repository](https://github.com/iejMac/video2dataset) cannot work with our csv format.**

## Data Splitting and Download Link
| Split | Download | # Source Videos | # Samples | Video Duration | Storage Space |
1 change: 1 addition & 0 deletions docs/assets/wordcloud.svg
1 change: 0 additions & 1 deletion docs/assets/worldcloud.svg

This file was deleted.

15 changes: 15 additions & 0 deletions docs/html_pages/resources/stylesheet.css
@@ -186,4 +186,19 @@ div.scroll-container {
.table-container {
width: 100%;
}
}

.youtube-container {
position: relative;
width: 100%;
padding-bottom: 56.25%; /* 16:9 aspect ratio */
height: 0;
overflow: hidden;
}
.youtube-container iframe {
position: absolute;
top: 0;
left: 0;
width: 100%;
height: 100%;
}
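
Note on the `56.25%` value above: percentage padding is resolved against the container's width, so `padding-bottom: 56.25%` reserves a box whose height is 9/16 of its width, and the absolutely positioned `iframe` then fills that box. A minimal sketch of the arithmetic, assuming a hypothetical helper named `paddingBottomPercent` that is not part of this PR:

```javascript
// Hypothetical helper (not part of this PR): computes the padding-bottom
// percentage that gives a full-width box the requested aspect ratio.
function paddingBottomPercent(widthUnits, heightUnits) {
  if (!(widthUnits > 0) || !(heightUnits > 0)) {
    throw new RangeError("aspect ratio terms must be positive numbers");
  }
  // Height expressed as a percentage of the width.
  return (heightUnits / widthUnits) * 100;
}

console.log(paddingBottomPercent(16, 9)); // prints 56.25
console.log(paddingBottomPercent(4, 3));  // prints 75
```

The same calculation gives the value to use if the embedded video ever has a different aspect ratio.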
19 changes: 15 additions & 4 deletions docs/index.html
@@ -27,6 +27,7 @@
</nav>
<nav>
<a href="#download">Download</a>
<a href="#presentation">Presentation</a>
<a href="#collection">Collection</a>
<a href="#demo">Demo</a>
<a href="#statistic">Statistic</a>
@@ -167,7 +168,7 @@ <h5 class="pt-1" style="font-size: 2rem; font-weight: normal">A Large-Scale Data
</div>
</div>
<div class="container text-center footnote">
We will remove the video samples from our dataset / Github / project webpage as long as you need it. Please contact tsaishienchen at gmail dot com for the request.
We will remove the video samples from our dataset / Github / project webpage / technical presentation as long as you need it. Please contact tsaishienchen at gmail dot com for the request.
</div>
</div>

@@ -208,6 +209,16 @@ <h1 class="jumbotron-heading">Download Panda-70M</h1>

<hr class="mt-5">

<section id="presentation">
<div class="container text-center" style="margin-top: 10px">
<div class="youtube-container">
<iframe src="https://www.youtube.com/embed/m2NQ5k1oTcs?si=jCc8gruNWA_oXNyP&autoplay=0&mute=0" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
</div>
</div>
</section>

<hr class="mt-5">

<section id="collection">
<div class="container text-center">
<h1 class="jumbotron-heading">Collection Pipeline of Panda-70M</h1>
@@ -268,7 +279,7 @@ <h1 class="jumbotron-heading">Demo of Long Video Annotation</h1>
</video>
</div>
<div class="container text-center footnote">
We will remove the video samples from our dataset / Github / project webpage as long as you need it. Please contact tsaishienchen at gmail dot com for the request.
We will remove the video samples from our dataset / Github / project webpage / technical presentation as long as you need it. Please contact tsaishienchen at gmail dot com for the request.
</div>
</div>
</section>
@@ -283,7 +294,7 @@ <h1 class="jumbotron-heading">Statistic</h1>
<img src="./assets/statistic.svg" style="margin-top: -20px;">
</div>
<div class="image-item">
<img src="./assets/worldcloud.svg" style="margin-top: 10px; margin-bottom: 10px; width: 90%">
<img src="./assets/wordcloud.svg" style="margin-top: 10px; margin-bottom: 10px; width: 90%">
</div>
</div>
</div>
@@ -360,7 +371,7 @@ <h1 class="jumbotron-heading text-center">Acknowledgement</h1>
imageItems.forEach(item => {
const elementPosition = item.getBoundingClientRect().top;

if (elementPosition < window.innerHeight * 0.7) {
if (elementPosition < window.innerHeight * 0.85) {
item.style.opacity = '1';
} else {
item.style.opacity = '0';
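
Note on the threshold change above: raising the factor from `0.7` to `0.85` means an image becomes visible once its top edge is within the upper 85% of the viewport rather than the upper 70%, so it fades in earlier while scrolling down. A sketch of the rule as a pure function, assuming a hypothetical helper named `opacityFor` (the PR itself keeps this logic inline):

```javascript
// Hypothetical helper (not in the PR): returns the opacity string for an
// element whose top edge is `elementTop` pixels below the top of the viewport.
function opacityFor(elementTop, viewportHeight, threshold = 0.85) {
  return elementTop < viewportHeight * threshold ? "1" : "0";
}

// In a 1000px-tall viewport, an element whose top sits at 800px is hidden
// under the old 0.7 threshold and visible under the new 0.85 threshold.
console.log(opacityFor(800, 1000, 0.7));  // prints 0
console.log(opacityFor(800, 1000, 0.85)); // prints 1
```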
6 changes: 3 additions & 3 deletions docs/more_samples.html
@@ -30,7 +30,7 @@
<a href="#Scenery">Scenery</a>
<a href="#Food">Food</a>
<a href="#Sports_Activity">Sports Activity</a>
<a href="#Vehicles">Vehicles</a>
<a href="#Vehicle">Vehicle</a>
<a href="#Tutorial_and_Narrative">Tutorial and Narrative</a>
<a href="#News_and_TV_Shows">News and TV Shows</a>
<a href="#Gaming_and_3D_Rendering">Gaming and 3D Rendering</a>
@@ -48,7 +48,7 @@ <h5 class="pt-1" style="font-size: 2rem; font-weight: normal">A Large-Scale Data
<a class="paper-btn" style="width: 130px" href="#Scenery">Scenery</a>
<a class="paper-btn" style="width: 130px" href="#Food">Food</a>
<a class="paper-btn" style="width: 130px" href="#Sports_Activity">Sports Activity</a>
<a class="paper-btn" style="width: 130px" href="#Vehicles">Vehicles</a>
<a class="paper-btn" style="width: 130px" href="#Vehicle">Vehicle</a>
</div>
<div class="paper-btn-parent">
<a class="paper-btn" style="width: 228px" href="#Tutorial_and_Narrative">Tutorial and Narrative</a>
@@ -462,7 +462,7 @@ <h2 class="pt-4"><p class="text-center" id="Sports_Activity">Sports Activity</p>

<hr class="mt-5">

<h2 class="pt-4"><p class="text-center" id="Vehicles">Vehicles</p></h2>
<h2 class="pt-4"><p class="text-center" id="Vehicle">Vehicle</p></h2>
<th style="text-align: center; vertical-align: top; padding: 10px;">
<video playsinline autoplay loop muted src="./assets/samples/489O6JiJ8Qk.14.mp4" style="width: 100%" type="video/mp4"></video>
<p class="responsive-text" style="font-family: Chalkduster; font-size: 16px; color: white">"A remote control monster truck is driving on rough terrain."</p>
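
Note on the `Vehicles` to `Vehicle` rename above: the two `href="#..."` links and the heading `id` have to change together, otherwise the navigation links point at a target that no longer exists. A rough sketch of a consistency check, assuming a hypothetical function named `findBrokenAnchors` that is not part of this PR; it uses regular expressions for brevity, so it only handles double-quoted attributes and would also match attributes such as `data-id`:

```javascript
// Hypothetical checker (not part of this PR): lists in-page link targets
// that have no matching id in the same HTML string.
function findBrokenAnchors(html) {
  // Collect every id="..." value.
  const ids = new Set();
  for (const match of html.matchAll(/\bid="([^"]+)"/g)) {
    ids.add(match[1]);
  }
  // Report every href="#..." whose target id was not seen.
  const broken = [];
  for (const match of html.matchAll(/\bhref="#([^"]+)"/g)) {
    if (!ids.has(match[1])) broken.push(match[1]);
  }
  return broken;
}

// A link still using the old name, with the heading already renamed.
const sample = `
  <a href="#Vehicles">Vehicles</a>
  <h2><p id="Vehicle">Vehicle</p></h2>
`;
console.log(findBrokenAnchors(sample)); // reports one broken target: Vehicles
```

Running a check like this over `docs/more_samples.html` after a rename is one way to catch a link that was missed.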
2 changes: 1 addition & 1 deletion splitting/README.md
@@ -61,7 +61,7 @@ The code will split the videos listed in the `video_list.txt` and output the vid
</tr>
</table>

<sup>**We will remove the video samples from our dataset / Github / project webpage as long as you need it. Please contact tsaishienchen at gmail dot com for the request.</sup>
<sup>**We will remove the video samples from our dataset / Github / project webpage / technical presentation as long as you need it. Please contact tsaishienchen at gmail dot com for the request.</sup>

## Acknowledgements
The code for video splitting is built upon [PySceneDetect](https://github.com/Breakthrough/PySceneDetect) and [ImageBind](https://github.com/facebookresearch/ImageBind).