This repository has been archived by the owner on Feb 22, 2023. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 50
Cache waveform data in database #510
Merged
Merged
Changes from 3 commits
Commits
Show all changes
10 commits
Select commit
Hold shift + click to select a range
b1c4b67
Allow running specific tests
sarayourfriend f53f3e8
Document external binary and library dependencies
sarayourfriend 32765cc
Cache audio waveform in database
sarayourfriend 049a083
Remove unnecessary abstractmethod
sarayourfriend 6e0b589
Add waveform directly to Audio model
sarayourfriend 9d471f5
Enable not following logs
sarayourfriend 7a0e29d
Switch back to side-table for audio waveform data
sarayourfriend 50006f7
Ensure refresh will not cause constraint errors
sarayourfriend 1731209
Apply linting changes
sarayourfriend 8423ab8
Use primary_key flag for audio id field on the add on
sarayourfriend File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file was deleted.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# Generated by Django 3.2.9 on 2022-02-14 23:22 | ||
|
||
import django.contrib.postgres.fields | ||
from django.db import migrations, models | ||
|
||
|
||
class Migration(migrations.Migration): | ||
|
||
dependencies = [ | ||
('api', '0044_singular_category'), | ||
] | ||
|
||
operations = [ | ||
migrations.CreateModel( | ||
name='AudioAddOn', | ||
fields=[ | ||
('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')), | ||
('created_on', models.DateTimeField(auto_now_add=True)), | ||
('updated_on', models.DateTimeField(auto_now=True)), | ||
('audio_identifier', models.UUIDField(db_index=True, help_text='The identifier of the audio object.', unique=True)), | ||
('waveform_peaks', django.contrib.postgres.fields.ArrayField(base_field=models.FloatField(), blank=True, help_text='The waveform peaks. A list of floats in the range of 0 -> 1 inclusively.', null=True, size=1500)), | ||
], | ||
options={ | ||
'abstract': False, | ||
}, | ||
), | ||
] |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -97,6 +97,46 @@ class Meta: | |
abstract = True | ||
|
||
|
||
class AudioAddOn(OpenLedgerModel): | ||
audio_identifier = models.UUIDField( | ||
db_index=True, | ||
unique=True, | ||
blank=False, | ||
null=False, | ||
sarayourfriend marked this conversation as resolved.
Show resolved
Hide resolved
|
||
help_text=( | ||
"The identifier of the audio object." | ||
) | ||
) | ||
""" | ||
This cannot be a "ForeignKey" or "OneToOneRel" because the refresh process | ||
wipes out the Audio table completely and recreates it. If we made these a FK | ||
or OneToOneRel there'd be foreign key constraint added that would be violated | ||
when the Audio table is recreated. | ||
|
||
The index is necessary as this column is used by the Audio object to query | ||
for the relevant add on. | ||
|
||
The refresh process will also eventually include cleaning up any potentially | ||
dangling audio_add_on rows. | ||
Comment on lines
+106
to
+115
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a great piece of documentation! |
||
""" | ||
|
||
waveform_peaks = ArrayField( | ||
base_field=models.FloatField(), | ||
# The approximate resolution of waveform generation | ||
# results in _about_ 1000 peaks. We use 1500 to give | ||
# sufficient wiggle room should we have any outlier | ||
# files pop up. | ||
# https://github.com/WordPress/openverse-api/blob/a7955c86d43bff504e8d41454f68717d79dd3a44/api/catalog/api/utils/waveform.py#L71 | ||
size=1500, | ||
help_text=( | ||
"The waveform peaks. A list of floats in" | ||
" the range of 0 -> 1 inclusively." | ||
), | ||
blank=True, | ||
null=True, | ||
) | ||
|
||
|
||
class Audio(AudioFileMixin, AbstractMedia): | ||
""" | ||
Inherited fields | ||
|
@@ -158,31 +198,17 @@ def audio_set(self): | |
except AudioSet.DoesNotExist: | ||
return None | ||
|
||
waveform = ArrayField( | ||
base_field=models.FloatField(), | ||
# The approximate resolution of waveform generation | ||
# results in _about_ 1000 peaks. We use 1500 to give | ||
# sufficient wiggle room should we have any outlier | ||
# files pop up. | ||
# https://github.com/WordPress/openverse-api/blob/a7955c86d43bff504e8d41454f68717d79dd3a44/api/catalog/api/utils/waveform.py#L71 | ||
size=1500, | ||
help_text=( | ||
"The waveform peaks. A list of floats in" | ||
" the range of 0 -> 1 inclusively." | ||
), | ||
blank=True, | ||
null=True, | ||
) | ||
|
||
def get_or_create_waveform(self): | ||
if self.waveform is not None: | ||
return self.waveform | ||
add_on, _ = AudioAddOn.objects.get_or_create(audio_identifier=self.identifier) | ||
|
||
if add_on.waveform_peaks is not None: | ||
return add_on.waveform_peaks | ||
|
||
self.waveform = generate_peaks(self) | ||
add_on.waveform_peaks = generate_peaks(self) | ||
|
||
self.save() | ||
add_on.save() | ||
|
||
return self.waveform | ||
return add_on.waveform_peaks | ||
|
||
class Meta(AbstractMedia.Meta): | ||
db_table = "audio" | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is the primary key of this model we should drop the
audio_
prefix.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also note that leaving it as it is wouldn't be wrong either, as it refers to the Audio model instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think we should keep it as it's not the identifier for the add-on exactly... it just happens to act as that and we're throwing away the actual canonical PK for the add on table because we never use it anyway.