How Do Search Engines Work?

SocialB Digital Marketing Blog Last modified: 12 Oct 2015 by Cheryl

The Internet is a big place, with billions of websites in existence, each with a number of pages, so navigating around it is a complex task. We all understand how helpful search engines are in filtering through the chaos, and, for the most part, we understand that websites don’t end up at the top of search engine results by accident, nor do the ads at the top of the page. But do we really understand how they work?

Search engines are used to find, index and retrieve information on the Internet. This process follows a number of steps:


Long before a search term is entered, the search engine sends out ‘spiders’ or ‘bots’ to crawl through websites. In jargon-free terms, these spiders visit pages much as a person would, only far faster, browsing and storing information about each one. Every time a crawler visits a web page, it makes a copy of it and adds its URL to an index. It then follows the links on that page, copying and indexing each page it reaches and following their links in turn, building up an index of all these web pages. Crawlers look at specific items such as page titles, images, content, keywords and which other pages link to a page, to get an understanding of what the page is about.
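To make the copy-then-follow-links loop concrete, here is a minimal sketch in Python. It assumes a tiny, made-up "web" held in memory (the `TOY_WEB` dictionary and its example URLs are hypothetical), so it shows the general idea rather than how any real search engine's crawler is built.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example/": ("home page about seo", ["a.example/blog", "b.example/"]),
    "a.example/blog": ("blog about search engines", ["a.example/"]),
    "b.example/": ("search marketing tips", ["a.example/blog", "c.example/"]),
    "c.example/": ("contact us", []),
}


def crawl(seed, web):
    """Breadth-first crawl: store a copy of each page, then follow its links."""
    index = {}                # URL -> stored copy of the page text
    queue = deque([seed])     # pages waiting to be visited
    while queue:
        url = queue.popleft()
        if url in index or url not in web:
            continue          # already copied, or the link leads nowhere
        text, links = web[url]
        index[url] = text     # "make a copy and add its URL to the index"
        queue.extend(links)   # "follow all the links on the page"
    return index


index = crawl("a.example/", TOY_WEB)
for url, text in index.items():
    print(url, "->", text)
```

Starting from a single seed page, the crawler reaches every page that is linked from somewhere, which is why pages with no inbound links can be hard for search engines to discover.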


Once a page has been crawled and indexed, it can be ranked, or put into an order of relevance, when someone searches. At Google, one of the best-known algorithms behind this ordering is PageRank. Simply put, PageRank assesses the quality of a web page by looking at the links pointing to it. Links to a web page from other websites can boost its ranking: the more relevant, reputable links a page has, the more important PageRank deems it to be, and the higher up in search results it is likely to appear. However, it is not quite as simple as it sounds.
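The link-based idea behind PageRank can be sketched in a few lines of Python. This is a simplified version of the commonly published formulation, not what Google runs today; the damping factor of 0.85, the iteration count and the four page names in `LINKS` are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank along links.

    `links` maps each page to the list of pages it links to. Every link
    target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}   # start with equal rank
    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its rank equally among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page site: everything links to "home",
# and nothing links to "orphan".
LINKS = {
    "home": ["blog", "contact"],
    "blog": ["home"],
    "contact": ["home"],
    "orphan": ["home"],
}

ranks = pagerank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{page}: {ranks[page]:.3f}")
```

In this toy example "home" ends up with the highest score because three pages link to it, while "orphan" scores lowest because nothing links to it.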

Determining Relevance

In the early days, search engines would simply match up the keywords in the search term with keywords within the web page. They would also look at the number of links to a website and assume that websites with more links were more popular, and as such deserved a higher rank. Over time, it became clear that this needed to evolve, as people were simply stuffing their web pages full of keywords, or creating lots of irrelevant links, in order to fool the search engines and reach the top of the results. Today, hundreds of factors influence the relevance of a website. These are called ‘Ranking Signals’, and can be anything from whether your website is mobile friendly to social ranking signals.
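A small Python sketch shows why pure keyword matching was so easy to game. The scoring function and the two sample pages below are hypothetical and deliberately naive; they are not taken from any real search engine.

```python
def keyword_score(query, page_text):
    """Naive relevance: count how often each query word appears on the page."""
    terms = query.lower().split()
    words = page_text.lower().split()
    return sum(words.count(term) for term in terms)


# Two made-up pages: one genuinely useful, one stuffed with keywords.
PAGES = {
    "helpful": "a practical guide to choosing running shoes for beginners",
    "stuffed": "shoes shoes cheap shoes running shoes buy shoes shoes",
}

query = "running shoes"
ranked = sorted(PAGES, key=lambda name: keyword_score(query, PAGES[name]), reverse=True)
print(ranked)
```

Under this scoring the stuffed page wins simply by repeating the search words, which is the weakness that pushed search engines towards a much wider set of ranking signals.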

Google Penguin & Panda

Google dominates the search market, getting around 70% of search traffic at the time of writing, with Bing in second at 17% and Yahoo third with 13%. Back in 2011, Google launched Panda in order to keep low-quality and thin-content sites out of the high-ranking positions in search results. As such, pages with poor-quality content or lots of advertising saw significant drops in their ranking, and high-quality sites had a better shot at reaching high positions.

In 2012, Google launched Penguin to weed out of its search engine results pages (SERPs) any sites it deemed to be spamming its results, for example by buying links or obtaining them through link networks, both of which breach Google’s Webmaster Guidelines. This meant lowering the rankings of sites practising ‘black-hat’ SEO techniques such as duplicate content, keyword stuffing and cloaking.

As a result, SEO experts now have to be much more transparent in their approach to optimisation: ‘black-hat’ techniques may achieve short-term results, but in the long term they will be highly detrimental to the ranking of a website.

We hope this demystifies the world of search engines a little, but if you’d like to know more about any aspects of SEO then contact the team today for a chat, or let us know over on Twitter.
