Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Change Request] Add the ability to create an Image XML Sitemap #81

Open
GeekInTheNorth opened this issue Jul 26, 2023 · 10 comments
Open

Comments

@GeekInTheNorth
Copy link

We are using the Geta Sitemaps module on a number of clients. One has recently requested the addition of an image xml sitemap.

Based on: https://www.holisticseo.digital/technical-seo/image-sitemap/

An Image only sitemap may look like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
    <image:image>
        <image:loc>https://www.holisticseo.digital/image-1.jpg</image:loc>
        <image:caption>Everything you need to know about Images</image:caption>
        <image:geo_location>London, United Kingdom</image:geo_location>
        <image:title>Image Sitemap Example</image:title>
        <image:license>Creator:HolisticSEO.Digital, Credit Line: HolisticSEO, Copyright Notice: Free to Use</image:licence>
    </image:image>
</urlset>

An Images within a regular sitemap may look like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" 
    xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
    <url>
        <loc>https://www.holisticseo.digital/image-sitemap</loc>
        <image:image>
            <image:loc>https://www.holisticseo.digital/image-1.jpg</image:loc>
        </image:image>
        <image:image>
            <image:loc>https://www.holisticseo.digital/image-2.jpg</image:loc>
        </image:image>
    </url>
</urlset> 

Do you have any plans to expand your functionality into including an Image Sitemap? are you open to collaborations if this is not on your roadmap?

@GeekInTheNorth
Copy link
Author

GeekInTheNorth commented Jul 27, 2023

@marisks I managed to prototype this for our client build and we will probably solve it this way. The way you have structured the XML builders here to be so extensible is brilliant as this allowed me to leverage a lot of the functionality you've created.

public class StandardAndImageSitemapXmlGenerator : SitemapXmlGenerator, IStandardSitemapXmlGenerator
{
    private readonly ILogger<StandardSitemapXmlGenerator> _logger;

    private readonly XNamespace _imageNamespace;

    private readonly XAttribute _imageAttribute;

    public StandardAndImageSitemapXmlGenerator(
        ISitemapRepository sitemapRepository,
        IContentRepository contentRepository,
        IUrlResolver urlResolver,
        ISiteDefinitionRepository siteDefinitionRepository,
        ILanguageBranchRepository languageBranchRepository,
        IContentFilter contentFilter,
        IUriAugmenterService uriAugmenterService,
        ISynchronizedObjectInstanceCache objectCache,
        IMemoryCache cache,
        ILogger<StandardSitemapXmlGenerator> logger)
        : base(
            sitemapRepository,
            contentRepository,
            urlResolver,
            siteDefinitionRepository,
            languageBranchRepository,
            contentFilter,
            uriAugmenterService,
            objectCache,
            cache,
            logger)
    {
        _logger = logger;

        _imageNamespace = XNamespace.Get("http://www.google.com/schemas/sitemap-image/1.1");
        _imageAttribute = new XAttribute(XNamespace.Xmlns + "image", _imageNamespace.NamespaceName);
    }

    protected override XElement GenerateRootElement()
    {
        XElement rootElement = new XElement(SitemapXmlGenerator.SitemapXmlNamespace + "urlset", _imageAttribute);

        if (this.SitemapData.IncludeAlternateLanguagePages)
            rootElement.Add((object)new XAttribute(XNamespace.Xmlns + "xhtml", (object)SitemapXmlGenerator.SitemapXhtmlNamespace));
        return rootElement;
    }

    protected override XElement GenerateSiteElement(IContent contentData, string url)
    {
        var element = base.GenerateSiteElement(contentData, url);
        var urlResolverArgs = new UrlResolverArguments { ForceAbsolute = true };
        
        if (contentData is SitePageData sitePageData && !sitePageData.TeaserImage.IsNullOrEmpty())
        {
            try
            {
                var imageUrl = UrlResolver.GetUrl(sitePageData.TeaserImage, "en", urlResolverArgs);
                var image = new XElement(_imageNamespace + "loc", (object)imageUrl);
                var imageParent = new XElement(_imageNamespace + "image", image);

                element.Add(imageParent);
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Oh No!");
            }
        }

        return element;
    }
}

I may make this my next blog subject if you're okay with that :)

I was thinking the more long term solution would be an interface a page could implement?

public interface IPageWithSitemapImages
{
   IList<ContentReference> SitemapImages { get; }
}

@marisks
Copy link
Member

marisks commented Jul 27, 2023

Looks good and a blog post is welcome.

If this is some standard or common feature, we can add it to the sitemaps and make it configurable. I haven't heard about images in sitemaps before.

@GeekInTheNorth
Copy link
Author

Normally our clients are happy with standard XML Sitemaps, this is literally the first time I've had a requirement for it.

It should be noted that the following nodes are deprecated as of May 2022 by google:

  • image:caption
  • image:geo_location
  • image:title
  • image:license

I find this quite odd because then you have an image without context, unless they are more interested in the version where images are child nodes of the url elements e.g. tied to pages.

@GeekInTheNorth
Copy link
Author

@GeekInTheNorth
Copy link
Author

@marisks Is this something you'd be open to a PR on? Though I suspect that has a greater implementation concern, how it should fit into the existing module.

@valdisiljuconoks
Copy link
Member

We are definitely open for PRs to make the package more flexible and feature-rich. Let's exchange some ideas here about how you might see this as part of the library.. (that would save time spent already on implementing something that needs to be changed later on).

@marisks
Copy link
Member

marisks commented Aug 1, 2023

@GeekInTheNorth Your implementation of generator looks good. A question is if it should be a default one and should replace the current generator or if it should be an additional generator. And then how it would work with other generators - commerce one for example.

@GeekInTheNorth
Copy link
Author

You could make it so you have interfaces for both Image Sitemap enabled content and Video Sitemap enabled content. Your standard and commerce builders could then Consume those common properties. Then it would be up to the individual implementation to decide whether it wants to provide that data or not.

The challenge would be to decide whether you need the additional namespaces or not. But then maybe that could be additional conditions on the Sitemap configuration?

E.g. You choose standard or commerce etc, then tick "include images" and/or "include videos"

There's a very definite product decision to be made either way. And maybe that decision is to do nothing or query the community to see how much of a real demand there is. Because if it's a one off, then that build can override the functionality itself.

@valdisiljuconoks
Copy link
Member

thanks @GeekInTheNorth for some insights/ideas.. @marisks it could be also something between the lines how I refactored Google ProductFeed - with "production pipeline" consisting of extractors, enrichers, transformer and what not. not necessary to be that complex as Google Feed (it had it's own complexity and performance penalty to extract Commerce data more than once for different feeds).. but just as an idea that the whole sitemap generation process is transparent, flexible and extendable to suit implementing projects.

more details on the refactoring - https://blog.tech-fellow.net/2022/04/21/googleproductfeed-for-optimizely-reloaded/

@marisks
Copy link
Member

marisks commented Aug 2, 2023

I just looked more at what is needed, and the best solution would be to include image and video element generation as base functionality.

  • Create an abstraction (interface) that would return additional elements under the url tag - IContentElementGenerator? It might have a method IEnumerable<XElement> GetElements(IContent content).
  • Use this new interface in sitemap generators - standard, commerce, etc.
  • Implement image and video content element abstract base generators that would implement XML generation logic, but would not implement data source (image/video) logic (which prop to take, etc.)
  • Implement a composite content generator that would be registered as the main one and would call other content generators
  • Next, implement default image and video content generators - these ones will implement logic to pass data for XML generation for the base class.
  • Create a configuration that would allow registering of multiple content generators. By default, register default generators.
  • Add some helper methods to easier register/replace content generators in configuration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants