Hreflang Builder Purpose

Hreflang Builder started as an internal tool for resolving website traffic cannibalization for multinational clients whose CMS could not add hreflang tags, which made hreflang XML sitemaps the only viable method. Anyone who works on web infrastructure at enterprise companies is aware of the “many challenge” that comes from the size and breadth of enterprise websites: many versions, pages, URL formats, CMS, teams, and, of course, opinions on how things should be done.

Solving Specific Challenges

Each iteration of Hreflang Builder worked to solve a new challenge faced due to complex website architectures or CMS limitations.

  • Enable using hreflang XML sitemaps where the CMS could not use hreflang tags.
  • Automate the creation of hreflang XML sitemaps for websites with complex infrastructures, minimal resources, and multiple stakeholders.

Hreflang Builder was developed to eliminate, or at least minimize, most of the hreflang challenges of the enterprise. Hreflang Builder can automatically import URLs, apply the mapping rules, generate hreflang XML sitemaps, and return the files to a server, using a reverse proxy to make them publicly available.
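
To make the "generate hreflang XML sitemaps" step concrete, here is a minimal Python sketch of what such a file contains. It is not Hreflang Builder's code. The function name build_hreflang_sitemap and the example.com URLs are assumptions made for illustration; only the sitemap and xhtml namespaces come from the public sitemap format. It assumes the URLs have already been imported and mapped into groups of alternates.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(groups):
    """Return sitemap XML for a list of {hreflang: url} alternate groups.

    Hypothetical helper for illustration only.
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for alternates in groups:
        # Every URL in a group gets its own <url> entry that lists all
        # alternates, including itself, so the annotations are reciprocal.
        for url in alternates.values():
            url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = url
            for hreflang, href in alternates.items():
                ET.SubElement(
                    url_el,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": hreflang, "href": href},
                )
    return ET.tostring(urlset, encoding="unicode")


# Example input: one page that exists in two markets (made-up URLs).
xml_text = build_hreflang_sitemap(
    [{"en-us": "https://example.com/us/", "es-mx": "https://example.com/mx/"}]
)
```

A page with 85 market versions would produce 85 url entries with 85 link elements each, which is why the markup is easier to maintain in a generated sitemap than in page templates.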

CMS Cannot Implement Hreflang

While not as big of a problem as in the early years of hreflang, some Content Management Systems still cannot add hreflang tags to their pages. Even those that can add the tags have challenges when the site has localized URLs or URLs that are formed dynamically. Another significant problem with some of the large CMS is “phantom URLs,” where the system creates a hreflang tag with URLs that do not exist. This happens when a single child page has been created for one market but not yet created or localized for all markets.
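
One way to guard against phantom URLs is to emit an alternate only when its URL appears in the set of URLs known to exist, for example the URLs imported from each market's own sitemap. The Python sketch below illustrates that idea. It does not describe how any particular CMS or Hreflang Builder handles it, and the function name and URLs are hypothetical.

```python
def drop_phantom_alternates(alternates, known_urls):
    """Keep only the alternates whose URL is known to exist.

    alternates: {hreflang: url} as proposed by a template or mapping rule.
    known_urls: set of URLs confirmed to exist (an assumption: for example,
                everything imported from the markets' XML sitemaps).
    """
    return {
        hreflang: url
        for hreflang, url in alternates.items()
        if url in known_urls
    }


# Example: the German page has not been created yet, so it is dropped.
known = {"https://example.com/us/page", "https://example.com/uk/page"}
proposed = {
    "en-us": "https://example.com/us/page",
    "en-gb": "https://example.com/uk/page",
    "de-de": "https://example.com/de/page",
}
safe = drop_phantom_alternates(proposed, known)
```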

Multiple Content Management Systems

As many are aware, implementing hreflang in a single CMS can be a challenge, and it is impossible to use tags when multiple CMS are deployed because there is no central repository of URLs. Please take a look at our guide to Hreflang for Multiple CMS.

The first two clients for Hreflang Builder were using multiple CMS that could not accommodate cross-system tags, which made it impossible to add hreflang tags to the pages and left hreflang XML sitemaps as the only viable option for solving their cannibalization problem. Even if they could have added the tags, they did not want to add 85 and 165 rows of code, respectively, to their pages, making hreflang XML sitemaps the perfect solution.

Many URLs and Language Versions

Our first “many challenge” was creating a solution for a tech B2B client with 85 market versions of their website with 200k pages each. These pages were managed across two different CMS, making adding tags to the pages impossible. They had significant traffic cannibalization between their various English and Spanish market pages. They estimated this cost them between $3 and $5 million each month in lost revenue.

Our second use case was for an e-commerce client with 165 markets and nearly 1 million URLs for each market. Because this was an e-commerce site, the product URLs were constantly changing and their number kept rising and falling, so an automated solution was required. They had similar cannibalization problems across multiple languages due to product name searches. They estimated cannibalization cost them $8 to $10 million in monthly revenue.

In addition to many markets, pages, and CMS (three), they also had many different product SKUs across the regions. This project forced us to build logic that uses rules and patterns to map alternates across six regional SKUs. Their dev team had told them it would take 18 months and about $500k to develop a solution internally.
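
As a rough illustration of mapping alternates when the same product carries different SKUs in different regions, the Python sketch below extracts a SKU from each product URL with a pattern and groups URLs that resolve to the same product. The URL format (/p/SKU), the SKU lookup table, and all names are assumptions made for this example, not the client's data or Hreflang Builder's actual rules.

```python
import re
from collections import defaultdict

# Hypothetical URL format: the regional SKU is the last path segment after /p/.
SKU_PATTERN = re.compile(r"/p/(?P<sku>[A-Z0-9-]+)/?$")


def group_by_product(urls_by_market, sku_to_product):
    """Group product URLs from different markets into alternate sets.

    urls_by_market: {hreflang: [url, ...]}
    sku_to_product: {regional_sku: product_id}, a hypothetical lookup table
                    that ties each regional SKU to one global product.
    """
    groups = defaultdict(dict)
    for hreflang, urls in urls_by_market.items():
        for url in urls:
            match = SKU_PATTERN.search(url)
            if not match:
                continue  # not a product URL under the assumed format
            product = sku_to_product.get(match.group("sku"))
            if product is not None:
                groups[product][hreflang] = url
    # A product that exists in only one market has no alternates to declare.
    return {p: alts for p, alts in groups.items() if len(alts) > 1}


# Example: two regional SKUs that refer to the same product.
product_groups = group_by_product(
    {
        "en-us": ["https://example.com/us/p/US-1001"],
        "en-gb": ["https://example.co.uk/p/EU-2001"],
        "de-de": ["https://example.de/p/XX-9"],
    },
    {"US-1001": "widget", "EU-2001": "widget"},
)
```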

Domains, Subdomains, and Folders

Another significant “many challenge” is often the diverse number of website domain name combinations. One multinational had over 175 market sites that used 37 different variations of ccTLDs and subdomains, along with nearly a dozen different folder combinations designating markets and languages. Hreflang Builder can import URLs from multiple sources; in this case, it used a combination of CMS-generated XML sitemaps and API calls into SEO diagnostic tools for 70 websites that did not have XML sitemaps. The diversity of the URLs required us to create even more sophisticated alternate URL mapping logic to match the URLs.
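
A simplified way to picture this kind of mapping logic is an ordered list of rules, each pairing a URL pattern with the market and language it designates. The Python sketch below is only an illustration under that assumption. The domains, patterns, and hreflang values are invented, and a real rule set covering dozens of variations would be far larger.

```python
import re
from urllib.parse import urlsplit

# Hypothetical rules, checked in order. Each pattern is matched against
# "host/path" and maps to the hreflang value that URL structure designates.
RULES = [
    (re.compile(r"^www\.example\.de/"), "de-de"),         # ccTLD
    (re.compile(r"^fr\.example\.com/"), "fr-fr"),         # subdomain
    (re.compile(r"^www\.example\.com/ca/fr/"), "fr-ca"),  # market and language folders
    (re.compile(r"^www\.example\.com/mx/"), "es-mx"),     # market folder
]


def classify(url):
    """Return the hreflang value for a URL, or None if no rule matches."""
    parts = urlsplit(url)
    # Normalize the host to lowercase and make sure a bare domain has a path.
    target = f"{parts.netloc.lower()}{parts.path or '/'}"
    for pattern, hreflang in RULES:
        if pattern.match(target):
            return hreflang
    return None
```

For example, classify("https://www.example.com/ca/fr/produits") returns "fr-ca", while a URL that matches no rule returns None and can be flagged for review instead of being guessed at.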

Cross-Domain Hosting

The complexity of multiple domain variations created a challenge for loading the XML sitemaps onto their respective servers and presenting them to search engines. Getting DevOps to upload these XML sitemaps was not feasible. We solved this by creating an AWS S3 storage bucket to which Hreflang Builder can auto-load all the XML sitemaps. By using either cross-domain hosting or a reverse proxy to serve the files on their respective domains, we could fully automate the process with minimal DevOps resources.

Becoming Commercial Software

Hreflang Builder was never designed to be a stand-alone commercial product. Even today, much of the functionality is not pretty, but it does precisely what it needs to do to help enterprise clients mitigate traffic cannibalization.