This post continues the previous blog article, Most irrelevant Results in Google Search – A Case Study. I have since researched in more detail what is happening with the Google search results. This study considers the following aspects:
- How exactly the third-party site (site B) links to the hacked site (site A; we use the term "hacked site" even though the attack took place off-site)
- What the Title and Description of the attacker's site (site B) are, and which page the visitor is forwarded to on clicking the search result
- Whether site B, which was considered to be the attacker's site, is itself a victim. It is quite possible that the hacker used site B to redirect traffic to site C (the ultimate beneficiary?)
On point #1, it was observed that the following backlink URL had been used to link to the victim site:
http://www.sunnydaysandlovelyways.com/?htm=LabSim/network-lab-simulator.htm
If we replace the domain name in the above backlink and drop the "?htm=", it corresponds exactly to the victim site URL, which is
http://www.site-A.com/LabSim/network-lab-simulator.htm
Note: The victim's domain name has been changed to site-A.com.
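The mapping between the backlink and the victim URL can be sketched in a few lines of Python. This is only an illustration of the pattern observed above, not a tool used in the study: the function name `reconstruct_victim_url` and its parameters are hypothetical, and it assumes the spam backlink always carries the original path in a query parameter named `htm`.

```python
from typing import Optional
from urllib.parse import urlsplit, parse_qs


def reconstruct_victim_url(backlink: str, victim_host: str,
                           param: str = "htm") -> Optional[str]:
    """Rebuild the victim URL from a backlink of the form http://spam-site/?htm=<path>.

    Returns None when the backlink does not carry the expected query parameter.
    """
    parts = urlsplit(backlink)
    values = parse_qs(parts.query).get(param)
    if not values:
        return None
    # The query value is the page path on the victim site.
    path = values[0].lstrip("/")
    return f"{parts.scheme}://{victim_host}/{path}"


backlink = "http://www.sunnydaysandlovelyways.com/?htm=LabSim/network-lab-simulator.htm"
print(reconstruct_victim_url(backlink, "www.site-A.com"))
```

Running the same function over a full backlink export would show how much of the victim site's link structure has been mirrored.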
Going through the set of backlinks, it was observed that the spammer had reconstructed almost the entire site under a different domain name, but with the same link structure.
On point #2, meta tags such as Title and Description appear to have been duplicated. For example, take the search result for the keyword “ccna exam”.
In this search result, the title is “CCNA 200-125 Practice Exam” and the description is “200-125 ccna practice exam consists of 425+ questions with flashcard explanation”. The site A webpage corresponding to this result has exactly the same Title and Description. On clicking this search result, however, you will find that the destination page has nothing to do with the search term.
Result: Effectively, an irrelevant website (site B) has taken the place of site A without having to hack site A. It was also observed that the webpage matching the Title and Description shown above had been delisted and no longer shows in the list of indexed web pages.
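The duplicated Title and Description noted under point #2 can be checked mechanically once the HTML of both pages is at hand. The following is a minimal sketch using only the Python standard library. The class `MetaExtractor`, the helper names, and the two sample pages are hypothetical stand-ins; only the title and description strings come from the search result quoted above.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects the <title> text and the meta description of one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def title_and_description(html):
    """Return (title, description), lower-cased with whitespace collapsed."""
    parser = MetaExtractor()
    parser.feed(html)
    normalize = lambda text: " ".join(text.split()).lower()
    return normalize(parser.title), normalize(parser.description)


def same_meta(html_a, html_b):
    """True when both pages have a non-empty, identical Title and Description."""
    meta_a = title_and_description(html_a)
    return bool(meta_a[0]) and meta_a == title_and_description(html_b)


# Hypothetical page sources; the body text differs, the meta tags do not.
page_a = """<html><head><title>CCNA 200-125 Practice Exam</title>
<meta name="description" content="200-125 ccna practice exam consists of 425+ questions with flashcard explanation">
</head><body>Original page</body></html>"""
page_b = """<html><head><title>CCNA 200-125  Practice Exam</title>
<meta name="Description" content="200-125 ccna practice exam consists of 425+ questions with flashcard explanation" />
</head><body>Unrelated content</body></html>"""

print(same_meta(page_a, page_b))
```

A match on meta tags alone does not prove copying, but combined with a mirrored link structure it is a strong hint that a page has been cloned.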
On point #3, it was observed that clicking site B’s link intermittently redirected the visitor to another site (site C), which raises the suspicion that site B is an intermediary for the final beneficiary website.
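Because the redirect is intermittent, a single click proves little; it helps to record where repeated visits actually land and then summarize the log. The sketch below assumes such a log already exists as a list of final URLs. The function name `summarize_landings` and the `.example` domains are hypothetical placeholders, not the real sites B and C.

```python
from collections import Counter
from urllib.parse import urlsplit


def summarize_landings(requested_url, landing_urls):
    """Count how often visits to requested_url ended on a different host."""
    origin = urlsplit(requested_url).hostname
    hosts = Counter(urlsplit(url).hostname for url in landing_urls)
    off_site = sum(count for host, count in hosts.items() if host != origin)
    return {
        "total": len(landing_urls),
        "off_site": off_site,
        # Intermittent: some, but not all, visits left the requested host.
        "intermittent": 0 < off_site < len(landing_urls),
        "hosts": dict(hosts),
    }


# Hypothetical log of five visits to the same site B link.
observed = [
    "http://site-b.example/?htm=LabSim/network-lab-simulator.htm",
    "http://site-c.example/landing",
    "http://site-b.example/?htm=LabSim/network-lab-simulator.htm",
    "http://site-c.example/landing",
    "http://site-b.example/?htm=LabSim/network-lab-simulator.htm",
]
summary = summarize_landings(
    "http://site-b.example/?htm=LabSim/network-lab-simulator.htm", observed
)
print(summary)
```

The host that keeps receiving the off-site visits is the candidate for the final beneficiary.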
The results, though drawn from a single sample site, have far-reaching ramifications. If this kind of hacking were done on a broader scale (hopefully, it is not so as of now), search results as a whole would become more or less irrelevant for the user.
It is very surprising that the search algorithm could not detect this kind of site hacking, which is external to the victim’s website. It also suggests that the search engine’s “memory” is not deep enough to remember the history of a web page: new pages (weeks or months old) that duplicate the original pages (several years old; in this case the URL is 10+ years old and the core content of the page has not changed) are showing up in the results.
Will it continue? It appears so. The only solution is for the search algorithm to combine deep memory with processing power, which may not be possible because of the huge processing overhead, update schedules, and the need to deliver search results quickly. Even if the algorithm is modified to fix one hack, another form of hack may surface because of these limitations.
Though the results of one particular search engine were examined in this case, similar hacks might happen with other search engines such as Bing, as the mechanisms of hacking are the same. Webmasters therefore need to assess web metrics such as keywords, backlinks, and ranking, and carry out maintenance, on a continual basis.
So what happens now to the advice in some major search engine FAQs that says: “Just concentrate on your web pages and create value to your website visitors.” It is only partly true, as webmasters now have to work through web metrics such as keywords, ranking, backlinks, analysis of backlinks, various types of possible off-site hacks, removal of backlinks, and reporting of spam sites to the respective search engines. This is specialized work that needs a professional, and not many individuals and small businesses can afford it. As a result, most of these sites may vanish from major search engine results over a period of time, unless search mechanisms take a more heuristic approach.
Disclaimer: This is the opinion of the author and does not represent the views of Anand Software and Training.
Author: Vijay Anand