Hreflang is often described as one of the most complex tasks in SEO. The problem is not hreflang itself but the complexity of the sites where it is implemented. The following are four examples of hidden challenges international SEO professionals may encounter during hreflang implementation, and how to deal with them.
Having the wrong language or country pages show up in the search results is a fairly common problem among global websites, even when there are no duplicate language sites. It may not be a site-wide issue and may happen only with certain search queries. In the world of international SEO, the hreflang element is one of the most impactful tools for SEO professionals and website owners. In the old days, we had to create target country signals for every market, designating its location through distinct differences such as ccTLDs and server locations, but that didn’t solve all of the geotargeting problems.
So, when Google announced hreflang in 2013, you can imagine how high many international SEO professionals jumped for joy. However, six years later, many are still struggling to benefit from it. This post is not about how to use hreflang or whether you should implement it in the page header or in the XML sitemap format. It’s also not about simple mistakes such as using EN-UK instead of EN-GB, or JP for Japanese instead of JA. You can learn those basics from Google’s help pages and YouTube videos.
Instead, I’d like to talk about the hidden challenges that often come up during hreflang implementation. Below are the common challenges that I repeatedly find.
Hreflang works by listing the URLs of pages that are “alternates” of one another, meaning pages that have the same or similar content on each language/country site. Mapping or grouping these alternate pages is not difficult when all of the sites have the page with the same URL structure, but this is often not the case. There are a variety of reasons for this.
It could be a business reason, such as some items not being available in certain countries.
It could be an external reason, such as content being limited by the regulations of certain countries.
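To make the idea of an alternate group concrete, here is a minimal sketch that renders the header-style annotations for one group of pages. The function name `hreflang_links`, the `example.com` URLs, and the choice of a global page as `x-default` are all assumptions for illustration, not part of any particular site's setup.

```python
from html import escape


def hreflang_links(alternates, default_url=None):
    """Return <link> elements for one group of alternate pages.

    `alternates` maps an hreflang code (for example "en-us") to the
    absolute URL of that language/country version (hypothetical data).
    """
    lines = [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in sorted(alternates.items())
    ]
    if default_url:
        # x-default is the fallback for users who match none of the codes.
        lines.append(
            f'<link rel="alternate" hreflang="x-default" href="{escape(default_url)}" />'
        )
    return lines


# Hypothetical group: the same page on three language/country sites.
group = {
    "en-us": "https://example.com/en-us/widgets/",
    "en-gb": "https://example.com/en-gb/widgets/",
    "ja-jp": "https://example.com/ja-jp/widgets/",
}
links = hreflang_links(group, default_url="https://example.com/widgets/")
print("\n".join(links))
```

Every page in the group would carry the same full set of links, including one pointing to itself.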
Why is this a challenge?
It is because no one really has a grasp of which content is available, and which content is unique, on each site. Oftentimes, the person responsible for the hreflang sitemap takes the main site that he or she is familiar with and multiplies its URLs by swapping the language-country directory, assuming there is complete coverage. When pages are missing on some language/country sites, you need to decide whether to place replacement pages in the URL group or to default to the global URL.
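The decision described above can be sketched as a small mapping step. This is only an illustration under stated assumptions: the `build_groups` function, the `sites` inventory format, and the `replacements` table are hypothetical, and the inventory of available paths would have to come from your own crawl or CMS export.

```python
def build_groups(main_paths, sites, replacements=None):
    """Group alternate URLs without assuming complete coverage.

    main_paths:   paths taken from the main site, e.g. "/widgets/".
    sites:        hreflang code -> (base URL, set of paths that really exist).
    replacements: (code, main path) -> a different path chosen by a human
                  as the closest equivalent on that site (optional).
    Returns (groups, gaps); gaps lists pages with no local alternate,
    which would fall back to the global / x-default URL.
    """
    replacements = replacements or {}
    groups, gaps = {}, []
    for path in main_paths:
        group = {}
        for code, (base, available) in sites.items():
            if path in available:
                group[code] = base + path
            elif (code, path) in replacements:
                group[code] = base + replacements[(code, path)]
            else:
                gaps.append((code, path))  # left out of the group on purpose
        groups[path] = group
    return groups, gaps


# Hypothetical inventory: es-mx has no gadgets page, ja-jp uses another URL.
sites = {
    "en-us": ("https://example.com/en-us", {"/widgets/", "/gadgets/"}),
    "es-mx": ("https://example.com/es-mx", {"/widgets/"}),
    "ja-jp": ("https://example.com/ja-jp", {"/widgets/", "/products/gadget-lineup/"}),
}
replacements = {("ja-jp", "/gadgets/"): "/products/gadget-lineup/"}

groups, gaps = build_groups(["/widgets/", "/gadgets/"], sites, replacements)
print(groups["/gadgets/"])
print(gaps)
```

Returning the gaps separately keeps the missing pages visible for review instead of silently generating URLs that do not exist.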
The URL structure variations cause some headaches, too.
You hope that everyone just uses the same URL structures as the main site, but the reality is that many sites get creative on their own and use different URL structures.
This happens frequently on the same domain sites, so you can imagine the difficulty level of trying to group pages correctly from different domain sites.
Differences in URL structure don’t just happen between sites; they also happen within a single site.
Irregular URL structure examples:
Most validation tools don’t crawl the site to confirm that the listed URLs actually exist.
They don’t check whether any of the URLs are redirected or whether the page declares a different URL in its canonical tag. They just review what you entered against the alternate page logic of hreflang.
If the input is logically consistent, the tool reports no issues found. These tools work only when you are absolutely sure that the URLs used are all correct and live.
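The checks that such tools skip can be layered on top of data from any crawler. The sketch below assumes a hypothetical `crawl` dictionary that maps each URL to the status code and canonical URL your crawler recorded; the record format and the `audit_urls` name are illustrative, not the output of a specific tool.

```python
def audit_urls(listed_urls, crawl):
    """Flag hreflang URLs that are not live, redirect, or are canonicalized away.

    crawl maps URL -> {"status": int, "canonical": str or None}
    (an assumed format; adapt it to whatever your crawler exports).
    """
    problems = {}
    for url in listed_urls:
        record = crawl.get(url)
        if record is None:
            problems[url] = "not crawled"
        elif 300 <= record["status"] < 400:
            problems[url] = f"redirects ({record['status']})"
        elif record["status"] != 200:
            problems[url] = f"not live ({record['status']})"
        elif record.get("canonical") and record["canonical"] != url:
            problems[url] = f"canonical points to {record['canonical']}"
    return problems


# Hypothetical crawl results for four listed URLs.
crawl = {
    "https://example.com/en-us/widgets/": {"status": 200, "canonical": "https://example.com/en-us/widgets/"},
    "https://example.com/en-gb/widgets/": {"status": 301, "canonical": None},
    "https://example.com/es-mx/widgets/": {"status": 404, "canonical": None},
    "https://example.com/ja-jp/widgets/": {"status": 200, "canonical": "https://example.com/ja-jp/products/widgets/"},
}
problems = audit_urls(list(crawl), crawl)
for url, reason in problems.items():
    print(url, "->", reason)
```

Only the first URL passes; the other three would pass a logic-only validator but still send broken signals to search engines.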
One of the most important goals of using hreflang is to ensure that the correct language/country page appears in the search results based on the searcher’s location.
The aim is to provide the right content to users based on where the search is conducted. It is also to have business conversions happen in the right location so that the local team benefits from them.
If the wrong language/country page appears in the search results, the conversion is credited to the wrong local office or, in the worst case, doesn’t happen at all.
For example, when your page created for the U.S. market appears in the search results in Japan, a site visitor would probably bounce back to the search result page and click another blue link.
In this case, you just lost potential business. A U.S. page could appear in the results in Japan especially when a product name is in English letters or the product number is a combination of letters and numbers, and there is nothing uniquely Japanese about it.
This becomes even trickier when you have multiple websites in the same language targeting different countries.
An example of this case would be when a person searches in Spanish in Costa Rica, but a page designed for Mexico shows up in the search results.
Since it’s in Spanish, a site visitor may fill out the form with a question or request about the product. But because the form belongs to the Mexico site, the Costa Rica office will never get that information. And since the address on the form is not in Mexico, the possible lead is forgotten or deleted by the Mexico office.
A key contributor to this cannibalization is incomplete implementation of hreflang.
HREFLangBuilder’s 2019 research found that 42% of global sites implemented hreflang only on the home page and key category pages, leaving the choice of which product page to show up to Google and creating the potential for significant missed opportunities.
Sadly, it is a common problem that the hreflang list is not double-checked before it goes live. It happens with regular XML sitemap files, too.
I’m sure many readers have seen “submitted URLs sending 404 error” in the Google Search Console report.
The problem doesn’t always happen from the beginning. The website grows over time adding new content.
Pages are added to or removed from the site, but there is rarely any automated method to update hreflang.
There may be more unique content on certain language/country websites.
If you are using hreflang sitemap files and not updating them frequently, it is likely that many of the listed URLs now return 3xx and 4xx status codes.
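One way to keep an eye on this is to pull every URL out of the hreflang sitemap on a schedule and feed the list back into a crawler. The following is a rough sketch using Python’s standard `xml.etree.ElementTree`; the sample sitemap and the function names `sitemap_alternates` and `urls_to_recheck` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespaces used by XML sitemaps with xhtml:link hreflang annotations.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}


def sitemap_alternates(xml_text):
    """Return {loc: {hreflang: href}} for every <url> entry in the sitemap."""
    root = ET.fromstring(xml_text)
    result = {}
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        result[loc] = {
            link.get("hreflang"): link.get("href")
            for link in url.findall("xhtml:link", NS)
            if link.get("rel") == "alternate"
        }
    return result


def urls_to_recheck(alternates):
    """Collect every unique URL mentioned anywhere in the sitemap."""
    urls = set(alternates)
    for links in alternates.values():
        urls.update(links.values())
    return sorted(urls)


# Hypothetical sitemap fragment with a single entry.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/en-us/widgets/</loc>
    <xhtml:link rel="alternate" hreflang="en-us" href="https://example.com/en-us/widgets/"/>
    <xhtml:link rel="alternate" hreflang="es-mx" href="https://example.com/es-mx/widgets/"/>
  </url>
</urlset>"""

alternates = sitemap_alternates(SAMPLE)
recheck = urls_to_recheck(alternates)
print(recheck)
```

The resulting list includes alternates that have no `<url>` entry of their own, which is often where stale URLs hide.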
Do not assume that everyone has the same content using the same URL structure.
Once you know what content each site actually has and which URL structure it uses, create a list of mapped URLs, and use a crawler to make sure that every URL on the list is live, is not redirected, and does not declare a different URL in its canonical tag. If you find any errors, update the list.
Even a small site changes frequently as pages are added or removed. Put the hreflang XML sitemap update on your review schedule so that search engines always receive up-to-date URLs. Hopefully, these steps will prevent business cannibalization caused by the wrong language or country pages showing up. If you are not sure whether you have a cannibalization problem, you can review pages in Google Search Console to see if the majority of a page’s impressions come from other markets. This quick and easy check can help you find new opportunities to improve local market performance.
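The Search Console check can be roughed out on exported data. The sketch below assumes you already have rows with a page URL, a country code, and an impression count; the row format, the three-letter country codes, the `targets` prefix table, and the 50% threshold are all assumptions to adjust to your own export.

```python
from collections import defaultdict


def off_market_pages(rows, targets, threshold=0.5):
    """Return {page: share of impressions from outside its target market}.

    rows:    dicts with "page", "country", "impressions" (assumed format).
    targets: URL prefix -> country code the pages under it are meant for.
    Only pages whose outside share exceeds `threshold` are returned.
    """
    totals = defaultdict(int)
    outside = defaultdict(int)
    for row in rows:
        market = next(
            (code for prefix, code in targets.items() if row["page"].startswith(prefix)),
            None,
        )
        if market is None:
            continue  # page does not belong to a mapped market
        totals[row["page"]] += row["impressions"]
        if row["country"] != market:
            outside[row["page"]] += row["impressions"]
    return {
        page: outside[page] / total
        for page, total in totals.items()
        if total and outside[page] / total > threshold
    }


# Hypothetical export: the Mexico page gets most impressions from Costa Rica.
targets = {
    "https://example.com/es-mx/": "mex",
    "https://example.com/es-cr/": "cri",
}
rows = [
    {"page": "https://example.com/es-mx/widgets/", "country": "mex", "impressions": 100},
    {"page": "https://example.com/es-mx/widgets/", "country": "cri", "impressions": 300},
    {"page": "https://example.com/es-cr/widgets/", "country": "cri", "impressions": 200},
    {"page": "https://example.com/es-cr/widgets/", "country": "mex", "impressions": 50},
]
flagged = off_market_pages(rows, targets)
print(flagged)
```

Pages that come back flagged are candidates for a closer look at their hreflang group, not proof of a problem on their own.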