Check for existence of layers in cache before returning cached base image #3767

Merged Sep 19, 2022 (19 commits)
Changes from 9 commits
@@ -445,8 +445,8 @@ List<Image> getCachedBaseImages()
LayerCountMismatchException, UnlistedPlatformInManifestListException,
PlatformNotFoundInBaseImageException {
ImageReference baseImage = buildContext.getBaseImageConfiguration().getImage();
- Optional<ImageMetadataTemplate> metadata =
- buildContext.getBaseImageLayersCache().retrieveMetadata(baseImage);
+ Cache baseImageLayersCache = buildContext.getBaseImageLayersCache();
+ Optional<ImageMetadataTemplate> metadata = baseImageLayersCache.retrieveMetadata(baseImage);
if (!metadata.isPresent()) {
return Collections.emptyList();
}
@@ -457,6 +457,12 @@ List<Image> getCachedBaseImages()
if (manifestList == null) {
Verify.verify(manifestsAndConfigs.size() == 1);
ManifestTemplate manifest = manifestsAndConfigs.get(0).getManifest();

// Verify all layers described in manifest are present in cache
if (!baseImageLayersCache.allLayersCached(Verify.verifyNotNull(manifest))) {
return Collections.emptyList();
}
Contributor

How about we move the Verify.verifyNotNull condition to allLayersCached?

Contributor Author

Moved this to a line above, since this condition is applied again to the manifest in toImage() a few lines down


if (manifest instanceof V21ManifestTemplate) {
return Collections.singletonList(
JsonToImageTranslator.toImage((V21ManifestTemplate) manifest));
@@ -486,6 +492,11 @@ List<Image> getCachedBaseImages()
}

ManifestTemplate manifest = Verify.verifyNotNull(manifestAndConfigFound.get().getManifest());
// Verify all layers described in manifest are present in cache
Contributor

(optional) there was quite a bit of code duplication in this method even in the past between the section that deals with general and platform-based manifest processing. Could there be an opportunity for refactoring common logic out into a helper method?

Contributor Author

Ah this is a great suggestion - will refactor this and see if I can make the methods cleaner to read.

if (!baseImageLayersCache.allLayersCached(manifest)) {
return Collections.emptyList();
}

Contributor

Is it possible to combine !manifestAndConfigFound.isPresent() and areAllLayersCached with an &&?

Contributor Author

It gets messier here to combine them I think? (Since grabbing the manifest is conditional on manifestAndConfigFound.isPresent(), plus it also needs to perform the verifyNotNull check)

Contributor

Ah I see! Thanks for trying this out. Hm, I'm thinking of a way in which we could reduce the number of if statements, since they are all testing similar things but just at different levels of granularity. Is it possible to put them in a helper method with a descriptive name?

Contributor Author

You're right, the many if statements did get more cumbersome with the changes added in this PR. I think the refactoring suggestion from @elefeint will help with this here!

Contributor

Excellent refactoring!
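
The thread above asks for the chained checks (entry found, manifest non-null, all layers cached) to be folded into one descriptively named helper. The refactoring that was actually merged is not shown in this view, so the following is only a minimal sketch of the idea. The `Entry` type, the `usableManifest` name, and the use of `String` for a manifest are all hypothetical, and `Objects.requireNonNull` stands in for Guava's `Verify.verifyNotNull` (it throws a different exception type).

```java
import java.util.Objects;
import java.util.Optional;
import java.util.function.Predicate;

public class UsableManifestSketch {
  /** Hypothetical stand-in for a cached manifest/config pair; the manifest may be null. */
  public static final class Entry {
    final String manifest;

    public Entry(String manifest) {
      this.manifest = manifest;
    }
  }

  /**
   * Returns the manifest only when an entry was found and all of its layers are cached.
   * A found entry with a null manifest is treated as a programming error and throws.
   */
  public static Optional<String> usableManifest(
      Optional<Entry> found, Predicate<String> allLayersCached) {
    return found
        .map(entry -> Objects.requireNonNull(entry.manifest, "manifest"))
        .filter(allLayersCached);
  }

  public static void main(String[] args) {
    // Pretend only the layers of the manifest named "alpine" are cached.
    Predicate<String> onlyAlpineCached = "alpine"::equals;

    System.out.println(usableManifest(Optional.empty(), onlyAlpineCached).isPresent());
    System.out.println(
        usableManifest(Optional.of(new Entry("alpine")), onlyAlpineCached).isPresent());
    System.out.println(
        usableManifest(Optional.of(new Entry("debian")), onlyAlpineCached).isPresent());
  }
}
```

With a helper of this shape, the caller needs a single `isPresent()` test before treating the lookup as a cache hit, instead of three separate if statements.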

ContainerConfigurationTemplate containerConfig =
Verify.verifyNotNull(manifestAndConfigFound.get().getConfig());
images.add(
11 changes: 11 additions & 0 deletions jib-core/src/main/java/com/google/cloud/tools/jib/cache/Cache.java
@@ -25,6 +25,7 @@
import com.google.cloud.tools.jib.image.json.ContainerConfigurationTemplate;
import com.google.cloud.tools.jib.image.json.ImageMetadataTemplate;
import com.google.cloud.tools.jib.image.json.ManifestAndConfigTemplate;
import com.google.cloud.tools.jib.image.json.ManifestTemplate;
import com.google.cloud.tools.jib.image.json.V21ManifestTemplate;
import com.google.common.collect.ImmutableList;
import java.io.IOException;
@@ -192,6 +193,16 @@ public Optional<ImageMetadataTemplate> retrieveMetadata(ImageReference imageRefe
return cacheStorageReader.retrieveMetadata(imageReference);
}

/**
* Returns {@code true} if all image layers described in a manifest exist in the cache.
*
* @param manifest the image manifest
* @return a boolean
*/
public boolean allLayersCached(ManifestTemplate manifest) {
return cacheStorageReader.allLayersCached(manifest);
}

/**
* Retrieves the {@link CachedLayer} that was built from the {@code layerEntries}.
*
@@ -81,6 +81,38 @@ static void verifyImageMetadata(ImageMetadataTemplate metadata, Path metadataCac
this.cacheStorageFiles = cacheStorageFiles;
}

/**
* Returns {@code true} if all image layers described in a manifest have a corresponding file
* entry in the cache.
*
* @param manifest the image manifest
* @return a boolean
*/
boolean allLayersCached(ManifestTemplate manifest) {
Contributor

Would it be possible to move this method directly to PullBaseImageStep? That is the only class where it is used, and it could help us avoid having to go through an extra layer of classes to reach this method. Or does this method require any variables that are specific to CacheStorageReader?

Contributor Author

Yeah, I was thinking about this too - this method goes through a few layers because it depends on CacheStorageReader’s cacheStorageFiles variable here.

Contributor

Got it, thanks! Hm, it looks like Cache takes in cacheStorageFiles as a parameter to its constructor, could we use that maybe?

private Cache(CacheStorageFiles cacheStorageFiles) {

Contributor Author

Ah you’re right - I missed that we could just add cacheStorageFiles as a field and move this logic to Cache.

I went ahead and tried to make this change, but noticed that the Cache class is set up in a way that many of its methods are wrappers around calls to either CacheStorageReader or CacheStorageWriter. Looking at the existing test suites, CacheStorageReaderTest is also more straightforward to add to than CacheTest for unit testing the new check.

I am tempted to leave this logic in CacheStorageReader just to stay consistent with the existing setup here - lmk what you think!


List<DescriptorDigest> layerDigests;

if (manifest instanceof V21ManifestTemplate) {
layerDigests = ((V21ManifestTemplate) manifest).getLayerDigests();
} else if (manifest instanceof BuildableManifestTemplate) {
layerDigests =
((BuildableManifestTemplate) manifest)
.getLayers().stream()
.map(BuildableManifestTemplate.ContentDescriptorTemplate::getDigest)
.collect(Collectors.toList());
} else {
throw new IllegalArgumentException("Unknown manifest type: " + manifest);
}
Comment on lines +103 to +105
Contributor

I wonder if this would break current behavior for those doing multi-platform image building? For context (#2730), we also do caching for manifest lists in addition to single manifests.

Contributor Author

That’s a good callout, and please let me know if I'm misunderstanding anything here!

Right now the added logic in PullBaseImageStep.getCachedBaseImages() never passes anything of V22ManifestList type into areAllLayersCached() calls explicitly. In the case of manifest lists, this check is made individually when looping over the platform-specific manifests, and returns an overall cache miss if any of the platform-specific manifests has incomplete layers.

But, do you think there is value here to have CacheStorageReader.areAllLayersCached() itself handle the manifest list type, rather than rely on the code calling it? Perhaps instead of throwing an exception here, it can also just return false, and leave the rest of the behavior to existing logic?

Contributor

Ah thanks for the detailed explanation! You're right, it is called for the individual manifests in the manifest list.

> But, do you think there is value here to have CacheStorageReader.areAllLayersCached() itself handle the manifest list type, rather than rely on the code calling it?

Hm, that's a good question. I was initially thinking about this too, but looking at the code for getCachedBaseImages, we would probably still have to iterate through the manifests in the manifest list again to retrieve the collection of images? What you have currently (which is a more fail-fast approach) seems like a better choice.

Contributor Author

Sounds good!
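
The behavior agreed on in this thread (check each platform-specific manifest individually and report an overall cache miss as soon as one of them has incomplete layers) can be sketched as below. This is an illustration under assumptions, not the code in PullBaseImageStep: manifests are modeled as plain strings, and `cachedImages` and the `image-for-` naming are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

public class PlatformFailFastSketch {
  /**
   * Builds one image name per platform manifest, or returns an empty list (a cache miss) as
   * soon as any single platform manifest has layers missing from the cache.
   */
  public static List<String> cachedImages(
      List<String> platformManifests, Predicate<String> allLayersCached) {
    List<String> images = new ArrayList<>();
    for (String manifest : platformManifests) {
      if (!allLayersCached.test(manifest)) {
        // Fail fast: one incomplete platform invalidates the whole cached result.
        return Collections.emptyList();
      }
      images.add("image-for-" + manifest);
    }
    return images;
  }

  public static void main(String[] args) {
    List<String> manifests = Arrays.asList("amd64", "arm64");

    // Every platform is fully cached: both images are returned.
    System.out.println(cachedImages(manifests, manifest -> true));

    // The arm64 layers are missing: the whole lookup is reported as a miss.
    System.out.println(cachedImages(manifests, manifest -> !manifest.equals("arm64")));
  }
}
```

Keeping the per-manifest check in the caller's loop means the layer check itself never needs to understand the manifest list type, which matches the decision to leave the exception for unknown manifest types in place.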


for (DescriptorDigest layerDigest : layerDigests) {
Path layerDirectory = cacheStorageFiles.getLayerDirectory(layerDigest);
if (!Files.exists(layerDirectory)) {
return false;
}
}
return true;
}
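
For readers who want to try the existence check outside of Jib, here is a small standalone sketch of the same loop using only `java.nio.file`. It assumes, purely for illustration, that each layer lives in a directory named after its digest directly under a root directory; the real layout is whatever `CacheStorageFiles.getLayerDirectory` returns, and the digest strings used here are made up.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class LayerPresenceSketch {
  /** Returns true only if every digest has a corresponding entry under layersRoot. */
  public static boolean allLayersCached(Path layersRoot, List<String> layerDigests) {
    for (String digest : layerDigests) {
      if (!Files.exists(layersRoot.resolve(digest))) {
        return false; // The first missing layer is enough to call it a cache miss.
      }
    }
    return true;
  }

  public static void main(String[] args) throws IOException {
    Path layersRoot = Files.createTempDirectory("layers");
    List<String> digests = Arrays.asList("digest-one", "digest-two");

    // Only one of the two layer directories exists at this point.
    Files.createDirectories(layersRoot.resolve("digest-one"));
    System.out.println(allLayersCached(layersRoot, digests));

    // Both layer directories exist now.
    Files.createDirectories(layersRoot.resolve("digest-two"));
    System.out.println(allLayersCached(layersRoot, digests));
  }
}
```

Note that this only checks that an entry exists for each digest. It does not verify the contents of the layer, which is also true of the check added in this PR.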

/**
* Retrieves the cached image metadata (a manifest list and a list of manifest/container
* configuration pairs) for an image reference.
@@ -40,6 +40,7 @@
import com.google.cloud.tools.jib.image.json.ContainerConfigurationTemplate;
import com.google.cloud.tools.jib.image.json.ImageMetadataTemplate;
import com.google.cloud.tools.jib.image.json.ManifestAndConfigTemplate;
import com.google.cloud.tools.jib.image.json.ManifestTemplate;
import com.google.cloud.tools.jib.image.json.OciIndexTemplate;
import com.google.cloud.tools.jib.image.json.PlatformNotFoundInBaseImageException;
import com.google.cloud.tools.jib.image.json.UnlistedPlatformInManifestListException;
@@ -155,6 +156,7 @@ public void testCall_digestBaseImage()
ImageMetadataTemplate imageMetadata =
new ImageMetadataTemplate(null, Arrays.asList(manifestAndConfig));
Mockito.when(cache.retrieveMetadata(imageReference)).thenReturn(Optional.of(imageMetadata));
Mockito.when(cache.allLayersCached(manifestAndConfig.getManifest())).thenReturn(true);

ImagesAndRegistryClient result = pullBaseImageStep.call();
Assert.assertEquals("fat system", result.images.get(0).getOs());
@@ -196,6 +198,7 @@ public void testCall_offlineMode_cached()
ImageMetadataTemplate imageMetadata =
new ImageMetadataTemplate(null, Arrays.asList(manifestAndConfig));
Mockito.when(cache.retrieveMetadata(imageReference)).thenReturn(Optional.of(imageMetadata));
Mockito.when(cache.allLayersCached(manifestAndConfig.getManifest())).thenReturn(true);

ImagesAndRegistryClient result = pullBaseImageStep.call();
Assert.assertEquals("fat system", result.images.get(0).getOs());
@@ -297,6 +300,25 @@ public void testGetCachedBaseImages_emptyCache()
Assert.assertEquals(Arrays.asList(), pullBaseImageStep.getCachedBaseImages());
}

@Test
public void testGetCachedBaseImages_cachedWithoutAllLayers()
throws InvalidImageReferenceException, CacheCorruptedException, IOException,
LayerCountMismatchException, PlatformNotFoundInBaseImageException,
BadContainerConfigurationFormatException, UnlistedPlatformInManifestListException {
ImageReference imageReference = ImageReference.parse("cat");
Mockito.when(buildContext.getBaseImageConfiguration())
.thenReturn(ImageConfiguration.builder(imageReference).build());
ManifestTemplate manifest = Mockito.mock(ManifestTemplate.class);
ImageMetadataTemplate imageMetadata =
new ImageMetadataTemplate(
null, Arrays.asList(new ManifestAndConfigTemplate(manifest, null)));

Mockito.when(cache.retrieveMetadata(imageReference)).thenReturn(Optional.of(imageMetadata));
Mockito.when(cache.allLayersCached(manifest)).thenReturn(false);

Assert.assertEquals(Arrays.asList(), pullBaseImageStep.getCachedBaseImages());
}

@Test
public void testGetCachedBaseImages_v21ManifestCached()
throws InvalidImageReferenceException, IOException, CacheCorruptedException,
@@ -316,6 +338,7 @@ public void testGetCachedBaseImages_v21ManifestCached()
null, Arrays.asList(new ManifestAndConfigTemplate(v21Manifest, null)));

Mockito.when(cache.retrieveMetadata(imageReference)).thenReturn(Optional.of(imageMetadata));
Mockito.when(cache.allLayersCached(v21Manifest)).thenReturn(true);

List<Image> images = pullBaseImageStep.getCachedBaseImages();

@@ -344,6 +367,7 @@ public void testGetCachedBaseImages_v22ManifestCached()
ImageMetadataTemplate imageMetadata =
new ImageMetadataTemplate(null, Arrays.asList(manifestAndConfig));
Mockito.when(cache.retrieveMetadata(imageReference)).thenReturn(Optional.of(imageMetadata));
Mockito.when(cache.allLayersCached(manifestAndConfig.getManifest())).thenReturn(true);

List<Image> images = pullBaseImageStep.getCachedBaseImages();

@@ -381,6 +405,10 @@ public void testGetCachedBaseImages_v22ManifestListCached()
new ManifestAndConfigTemplate(
new V22ManifestTemplate(), containerConfigJson2, "sha256:digest2")));
Mockito.when(cache.retrieveMetadata(imageReference)).thenReturn(Optional.of(imageMetadata));
Mockito.when(cache.allLayersCached(imageMetadata.getManifestsAndConfigs().get(0).getManifest()))
.thenReturn(true);
Mockito.when(cache.allLayersCached(imageMetadata.getManifestsAndConfigs().get(1).getManifest()))
.thenReturn(true);

Mockito.when(containerConfig.getPlatforms())
.thenReturn(ImmutableSet.of(new Platform("arch1", "os1"), new Platform("arch2", "os2")));
@@ -416,6 +444,8 @@ public void testGetCachedBaseImages_v22ManifestListCached_partialMatches()
new ContainerConfigurationTemplate(),
"sha256:digest1")));
Mockito.when(cache.retrieveMetadata(imageReference)).thenReturn(Optional.of(imageMetadata));
Mockito.when(cache.allLayersCached(imageMetadata.getManifestsAndConfigs().get(0).getManifest()))
.thenReturn(true);

Mockito.when(containerConfig.getPlatforms())
.thenReturn(ImmutableSet.of(new Platform("arch1", "os1"), new Platform("arch2", "os2")));
@@ -454,6 +484,7 @@ public void testGetCachedBaseImages_v22ManifestListCached_onlyPlatforms()
Arrays.asList(
unrelatedManifestAndConfig, targetManifestAndConfig, unrelatedManifestAndConfig));
Mockito.when(cache.retrieveMetadata(imageReference)).thenReturn(Optional.of(imageMetadata));
Mockito.when(cache.allLayersCached(targetManifestAndConfig.getManifest())).thenReturn(true);

Mockito.when(containerConfig.getPlatforms())
.thenReturn(ImmutableSet.of(new Platform("target-arch", "target-os")));
@@ -638,4 +638,52 @@ public void testVerifyImageMetadata_validOciImageIndex() throws CacheCorruptedEx
CacheStorageReader.verifyImageMetadata(metadata, Paths.get("/cache/dir"));
// should pass without CacheCorruptedException
}

@Test
public void testAllLayersCached_v21SingleManifest()
throws IOException, CacheCorruptedException, DigestException, URISyntaxException {

setupCachedMetadataV21(cacheDirectory);
Contributor

Minor suggestion: If possible, let's format the test into three blocks: arrange, act and assert, with a space between each of these blocks. The arrange block will take care of all setup, the act block will call the method we're testing and the assert block will do all the verification.

Contributor Author (@emmileaf, Sep 12, 2022)

Thanks for the suggestion, this makes a lot of sense! I tried to follow this idea, though with the act and assert blocks more or less combined, since a few of the tests had asserts both before and after certain actions.

Contributor

How about we store these calls in variables? For example:

cacheAfterFirstLayerDirectory = Files.createDirectories(cacheStorageFiles.getLayerDirectory(firstLayerDigest));
areAllLayersCachedAfterFirstLayer = cacheStorageReader.areAllLayersCached(manifest);
cacheAfterSecondLayerDirectory = ...
areAllLayersCachedAfterSecondLayer = ...

// Assert block

This is just an example so you can pick a name you think works better.

Contributor Author

Ohh apologies I completely misunderstood earlier! Thank you for the explanation, I see what is meant by having separate act and assert blocks now. Will update the tests 😃
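
The arrange/act/assert layout suggested in this thread, with each intermediate result stored in a variable so that all assertions can sit in one final block, might look roughly like the sketch below. It is written as plain Java rather than a JUnit test so that it is self-contained; `allPresent`, `require`, and the layer names are hypothetical stand-ins for the reader method, the JUnit assertions, and the real digests.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class ArrangeActAssertSketch {
  /** Stand-in for the method under test: true if every named entry exists under root. */
  public static boolean allPresent(Path root, List<String> names) {
    return names.stream().allMatch(name -> Files.exists(root.resolve(name)));
  }

  /** Stand-in for a test assertion; throws when the condition does not hold. */
  static void require(boolean condition, String message) {
    if (!condition) {
      throw new AssertionError(message);
    }
  }

  public static void checkLayerPresenceAfterEachStep() {
    try {
      // Arrange: an empty cache directory and the layers the manifest expects.
      Path root = Files.createTempDirectory("cache");
      List<String> layers = Arrays.asList("layer1", "layer2");

      // Act: perform each step and store the result instead of asserting inline.
      Files.createDirectories(root.resolve("layer1"));
      boolean cachedAfterFirstLayer = allPresent(root, layers);
      Files.createDirectories(root.resolve("layer2"));
      boolean cachedAfterSecondLayer = allPresent(root, layers);

      // Assert: all verification happens here, after every action has run.
      require(!cachedAfterFirstLayer, "one of two layers should not count as cached");
      require(cachedAfterSecondLayer, "both layers present should count as cached");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    checkLayerPresenceAfterEachStep();
    System.out.println("all checks passed");
  }
}
```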

ImageMetadataTemplate metadata =
cacheStorageReader.retrieveMetadata(ImageReference.of("test", "image", "tag")).get();

V21ManifestTemplate manifest =
(V21ManifestTemplate) metadata.getManifestsAndConfigs().get(0).getManifest();
Assert.assertEquals(2, manifest.getLayerDigests().size());
ManifestAndConfigTemplate manifestAndConfig =
new ManifestAndConfigTemplate(manifest, new ContainerConfigurationTemplate());

// Create only one of the layer directories.
DescriptorDigest firstLayerDigest =
DescriptorDigest.fromHash(manifest.getLayerDigests().get(0).getHash());
Files.createDirectories(cacheStorageFiles.getLayerDirectory(firstLayerDigest));
Assert.assertFalse(cacheStorageReader.allLayersCached(manifestAndConfig.getManifest()));

// Create the other layer directory.
DescriptorDigest secondLayerDigest =
DescriptorDigest.fromHash(manifest.getLayerDigests().get(1).getHash());
Files.createDirectories(cacheStorageFiles.getLayerDirectory(secondLayerDigest));
Assert.assertTrue(cacheStorageReader.allLayersCached(manifestAndConfig.getManifest()));
}

@Test
public void testAllLayersCached_v22SingleManifest()
throws IOException, CacheCorruptedException, DigestException, URISyntaxException {

setupCachedMetadataV22(cacheDirectory);
ImageMetadataTemplate metadata =
cacheStorageReader.retrieveMetadata(ImageReference.of("test", "image", "tag")).get();

V22ManifestTemplate manifest =
(V22ManifestTemplate) metadata.getManifestsAndConfigs().get(0).getManifest();
Assert.assertEquals(1, manifest.getLayers().size());
ManifestAndConfigTemplate manifestAndConfig =
new ManifestAndConfigTemplate(manifest, new ContainerConfigurationTemplate());

Assert.assertFalse(cacheStorageReader.allLayersCached(manifestAndConfig.getManifest()));
// Create the layer directory.
DescriptorDigest layerDigest = manifest.getLayers().get(0).getDigest();
Files.createDirectories(cacheStorageFiles.getLayerDirectory(layerDigest));
Assert.assertTrue(cacheStorageReader.allLayersCached(manifestAndConfig.getManifest()));
}
}