mattgadient.com

Joomla, SJSB, SMF, and Google Canonical to reduce duplicate content

Here’s the issue. For the EGRR site, I have a forum at forums.eyeglassretailerreviews.com. I decided to install SJSB (Simple Joomla! 1.5.x / SMF 2.x Bridge) so that I could also display the forums in-line (wrapped) on the www.eyeglassretailerreviews.com site. It works very well aside from the initially broken CSS, but I’ll give tips for formatting the CSS in another post.

In any case, one issue was duplicate content! Since all the pages were available at both “sites”, Googlebot would undoubtedly see them as separate sites with the same content, which is not good! There are 3 options:

  1. Leave it be and let Google figure it out (not ideal)
  2. Use the noindex and nofollow meta tags on one of the sites
  3. Use the canonical link tag (rel="canonical") to tell the bots which site is the preferred one

Option 1 was obviously not much of an option. I really considered Option 2. The downside, though, is that if somebody links to the site with the noindex, the potential PageRank from that link will be lost. I decided to get Option 3 to work, since I couldn’t see any real downsides to it.

What do we want to do?

The forum URLs themselves look like this:

http://forums.eyeglassretailerreviews.com/
http://forums.eyeglassretailerreviews.com/index.php?board=2.0
http://forums.eyeglassretailerreviews.com/index.php?topic=238.0
http://forums.eyeglassretailerreviews.com/index.php?action=search
(and so on)

I want to make them look like this:

http://www.eyeglassretailerreviews.com/forums.html
http://www.eyeglassretailerreviews.com/forums.html?board=2.0
http://www.eyeglassretailerreviews.com/forums.html?topic=238.0
http://www.eyeglassretailerreviews.com/forums.html?action=search
(and so on)
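The mapping between the two lists can be sketched as a small function. This is only an illustration of the idea, not code from SMF or SJSB: the name wrapped_forum_url() is made up, and the base URL is hard-coded to this site’s wrapped forum page.

```php
<?php
// Hypothetical helper (not part of SMF or SJSB): maps a standalone forum URL
// to the matching wrapped URL by carrying the query string over unchanged.
function wrapped_forum_url(string $forumUrl): string
{
    $base = 'http://www.eyeglassretailerreviews.com/forums.html';

    // parse_url() returns null when the URL has no query string
    // and false when the URL cannot be parsed at all.
    $query = parse_url($forumUrl, PHP_URL_QUERY);

    if (!is_string($query) || $query === '') {
        return $base;
    }
    return $base . '?' . $query;
}

echo wrapped_forum_url('http://forums.eyeglassretailerreviews.com/index.php?topic=238.0'), "\n";
// prints http://www.eyeglassretailerreviews.com/forums.html?topic=238.0
```

Because only the part after the ? is copied, board, topic, and action URLs all map the same way.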

You’ll notice that the parameters at the end of each URL (board=2.0, topic=238.0, action=search) are the same in both sets. This is good; otherwise it might be a nightmare to dynamically create canonicals. So, what’s the code that we use? It’s right here:
echo '
<link rel="canonical" href="http://www.eyeglassretailerreviews.com/forums.html?';
echo htmlspecialchars($_SERVER['QUERY_STRING'], ENT_QUOTES); // escape &, ", < and > before printing into the attribute
echo '" />';

This gets put into your index.template.php file (/Themes/default/index.template.php, for example, if you’re using the default theme). Place it in the function template_html_above() section, preferably somewhere between a couple of the other <link rel="..." entries. BACK UP THE FILE FIRST IN CASE SOMETHING GOES WRONG!!!

This basically creates a line that looks like this:

<link rel="canonical" href="http://www.eyeglassretailerreviews.com/forums.html?something=12345" />

Where something=12345 will be board=1.0, topic=234.0, action=search, or whatever the parameters in the URL happen to be. This is made possible by the 3rd line, where $_SERVER['QUERY_STRING'] holds everything after the ? in the requested URL; echoing it fills that part in.
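Because the query string comes straight from the visitor’s request, it is worth passing it through htmlspecialchars() before echoing it into an HTML attribute. The short sketch below shows what that does to two sample values; the second one is made up purely to show the escaping and is not a real forum URL.

```php
<?php
// Sample query strings. The second is a made-up value containing characters
// that would otherwise break out of the href="..." attribute.
$samples = [
    'topic=238.0',
    'action=search&q="glasses"',
];

foreach ($samples as $queryString) {
    // ENT_QUOTES escapes both double and single quotes, plus &, < and >.
    echo htmlspecialchars($queryString, ENT_QUOTES), "\n";
}
// prints:
// topic=238.0
// action=search&amp;q=&quot;glasses&quot;
```

Ordinary forum parameters such as topic=238.0 pass through unchanged, so the canonical URL itself is not affected.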

Simply change the URL in the href (http://www.eyeglassretailerreviews.com/forums.html here) to fit whatever your preferred URL is. It can be the forum in the SMF installation, or the forum in the Joomla! installation. Either works!

That leaves us with one problem!

What’s the problem, you might ask? Well, you may remember this from the very beginning:

http://forums.eyeglassretailerreviews.com/
http://forums.eyeglassretailerreviews.com/index.php?board=2.0
http://forums.eyeglassretailerreviews.com/index.php?topic=238.0
http://forums.eyeglassretailerreviews.com/index.php?action=search
(and so on)

and….

http://www.eyeglassretailerreviews.com/forums.html
http://www.eyeglassretailerreviews.com/forums.html?board=2.0
http://www.eyeglassretailerreviews.com/forums.html?topic=238.0
http://www.eyeglassretailerreviews.com/forums.html?action=search
(and so on)

We just took care of the URLs with parameters, but what about the base (the main forum page)?

Depending on which way you went, you’re either getting:

http://forums.eyeglassretailerreviews.com/index.php? instead of http://forums.eyeglassretailerreviews.com/

or

http://www.eyeglassretailerreviews.com/forums.html? instead of http://www.eyeglassretailerreviews.com/forums.html

Notice the trailing ? at the end. We don’t want it there, right?

How do we fix this?

It’s easy. We use an if condition for that special case. Change the code you may have entered above to look like this instead:
if (!empty($_SERVER['QUERY_STRING'])) {
    echo '
<link rel="canonical" href="http://www.eyeglassretailerreviews.com/forums.html?';
    echo htmlspecialchars($_SERVER['QUERY_STRING'], ENT_QUOTES); // escape &, ", < and > before printing into the attribute
    echo '" />';
} else {
    echo '
<link rel="canonical" href="http://www.eyeglassretailerreviews.com/forums.html" />
';
}

The first line (if) says “if there is a parameter” (technically it says “if the query string is not empty”, which is a double negative), then do the stuff in the following lines. This is the same stuff we did before, except it’s now inside an “if” statement.

The 6th line (else) says “otherwise (if there is not a parameter), print this line”. This line points the canonical to the main forum page you want, without any special parameters added on.
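If you would rather keep the template tidy, the same if/else logic can be wrapped in a function. This is just a sketch: canonical_link_tag() is a made-up name, not something SMF provides, and it compares against an empty string instead of using empty() so the check is explicit.

```php
<?php
// Hypothetical helper, not part of SMF: builds the canonical <link> tag,
// appending "?" and the query string only when there is one.
function canonical_link_tag(string $baseUrl, string $queryString): string
{
    $href = $baseUrl;
    if ($queryString !== '') {
        $href .= '?' . $queryString;
    }
    // Escape the whole URL so it is safe inside the href attribute.
    return '<link rel="canonical" href="' . htmlspecialchars($href, ENT_QUOTES) . '" />';
}

$base = 'http://www.eyeglassretailerreviews.com/forums.html';

// QUERY_STRING may be missing (for example when run from the command line),
// so fall back to an empty string.
echo canonical_link_tag($base, $_SERVER['QUERY_STRING'] ?? ''), "\n";
```

In the template, the echo line would take the place of the whole if/else block, with the function defined somewhere above it.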

I realize there’s a mess of color that some may find distracting. In part, I wanted to make it obvious where the difference between ' and " was (single quote and double quote), and I also wanted to make it easier to understand for someone who might be very new, or who is troubleshooting and wants a better visual impression of what each section of code actually does.

Finally, there’s a tiny issue that still exists here. Check out Part 2 for the next steps to resolve it!