Search engines have come a long way in indexing the Internet. The concept is not new; it has been in the making for nearly twenty-five years. The evolution of search engine technologies has been astounding, to say the least, but it raises the question: where is it going? Well, that is a good question, and one that we might be able to answer once we understand how search engines work, as well as what the ultimate goal of these search engine developers is.
Before there was widespread access to the Internet, it was mostly utilized by academics and governmental agencies so that they could share data and information. It was a pretty small group and there was no real need for any type of search capability. People would connect directly to another user’s computer via telephone line and they would already know which file they wanted to retrieve. As these individuals and agencies started freely sharing their information on a larger scale, it was soon too cumbersome to search the many computers on the network to locate a file of interest.
Along came what was technically the first search engine, as it had the ability to traverse the network and index files located on various computer systems. The name of this search engine was “Archie,” a play on the word “archive” and the name Archie of comic book fame. It was written by Alan Emtage, Bill Heelan, and Peter J. Deutsch while they were students at McGill University. The search engine was released in 1990, and it quickly gained recognition in the networking world. Although some disagree as to whether this can be considered a search engine, I am of the mind that it is, by definition, a search engine. The purpose of a search engine is to index websites or files and then present that data to the user on request. Archie did this, even if it was not able to search the text within the files.
It was when Tim Berners-Lee started piecing together the World Wide Web that the Internet became friendly to innovations in search engine technology. Tim Berners-Lee developed the HyperText Markup Language standard, otherwise known as HTML, which is the language that websites are written in. It was this coding standard that enabled search engine developers to index the web using “bots,” or “spiders.”
Bots and spiders are different names for the same thing: programs written to crawl across the Internet, gathering information from websites or files. These programs are the real brains behind search engine technology. If it were not for these little guys, people would have to manually index the entire Internet one site at a time. Given the quantity of sites on the web, it is unlikely that this could ever be accomplished; although the DMOZ Project, later owned by AOL, does a good job, it can never truly compete with an automated indexing program. These programs automate the task of indexing websites, and each one indexes in accordance with its programming instructions.
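To make the idea concrete, here is a minimal sketch of what such a crawler does, written in Python using only the standard library. It illustrates the general traverse-and-index technique, not any real engine’s bot; the seed URL, the page limit, and the regex-based link extraction are all simplifying assumptions of mine.

```python
import re
import urllib.request
from collections import deque

def crawl(seed_url, max_pages=10):
    """Breadth-first crawl: fetch a page, record its title, follow its links."""
    index = {}                      # url -> page title
    queue = deque([seed_url])
    seen = {seed_url}
    while queue and len(index) < max_pages:
        url = queue.popleft()
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                html = response.read().decode("utf-8", errors="replace")
        except Exception:
            continue                # skip pages that cannot be fetched
        match = re.search(r"<title>(.*?)</title>", html, re.I | re.S)
        index[url] = match.group(1).strip() if match else "(no title)"
        # queue up every absolute link found on the page
        for link in re.findall(r'href="(https?://[^"]+)"', html, re.I):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index

if __name__ == "__main__":
    for url, title in crawl("https://example.com").items():
        print(title, "->", url)
```

Real crawlers add politeness rules (robots.txt, rate limits) and far sturdier parsing, but this fetch-index-follow loop is the heart of the job.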
The early search engines on the Internet included companies like Excite, LookSmart, and Yahoo. These search engines started as just web directories. They were basically collections of links to websites and, through sheer necessity due to the explosive growth of the Internet, evolved into utilizing automated indexing. For the most part, this technology was an unprecedented jump in the performance of search engines and paved the way for the average individual to find exactly what they were looking for. These early bots indexed only the title of the web page and a brief description based upon the text presented at the beginning of the site. It was also an issue that the bots would list sites in the order in which they found them; there was no relevancy rating of websites at that time. It was not a perfect solution, and finding exactly what you were looking for required hours of reading through an overwhelming amount of search results.
It was not long after this that search engines improved their technology even further and developed bots that would index websites based upon certain META tags coded into the text of the web page. These tags included such things as “Title,” “Description,” and “Keywords.” “Title” would be the actual name of the website; “Description” would be just that, a description of the content presented on the website. The “Keywords” tag was very special indeed, for it was the actual means by which these sites were indexed according to relevancy. The bots would read these tags and index the sites according to a predetermined algorithm.
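As an illustration, the snippet below pulls those three tags out of a page the way a META-driven bot might have, using Python’s standard html.parser module. The sample page and the tag handling are my own assumptions for demonstration; the crawlers of the era were, of course, more involved.

```python
from html.parser import HTMLParser

# A hypothetical page of the era, with all three tags a bot would read.
SAMPLE = """<html><head>
<title>Crunchy Widgets</title>
<meta name="description" content="Crunchy widgets for every budget.">
<meta name="keywords" content="widgets, crunchy, cheap widgets">
</head><body>We sell widgets.</body></html>"""

class MetaExtractor(HTMLParser):
    """Collect the Title, Description, and Keywords tags from a page."""
    def __init__(self):
        super().__init__()
        self.tags = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if tag == "meta" and name in ("description", "keywords"):
            self.tags[name] = attrs.get("content", "")
        elif tag == "title":
            self._in_title = True

    def handle_data(self, data):
        if self._in_title:
            self.tags["title"] = data.strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

parser = MetaExtractor()
parser.feed(SAMPLE)
print(parser.tags)
# {'title': 'Crunchy Widgets',
#  'description': 'Crunchy widgets for every budget.',
#  'keywords': 'widgets, crunchy, cheap widgets'}
```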
Unfortunately, whenever somebody comes up with a great way to make things simpler, there are those who take advantage of it and soon discover ways to abuse it for their own personal gain. This was the case when web developers figured out just how the bots were indexing their sites. The bots would index and rank sites by comparing the “Title,” “Description,” and “Keywords” tags based upon their relationship with each other. In other words, you could rank higher in the search engines if your title was used in the description and both of those were listed in your keywords. This turned out to be a simple equation, and soon the bots were having a real hard time ranking sites, because if web developers scored low in the index, they simply had to modify those three tags to increase their standing.
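The toy scoring function below captures the spirit of that equation. The point weights are invented for illustration, since the real ranking formulas were proprietary; the thing to notice is how easily the keyword-stuffed page beats the honest one.

```python
def tag_overlap_score(title, description, keywords):
    """Toy relevancy score: reward pages whose title words reappear
    in the description and in the keywords list."""
    title_words = set(title.lower().split())
    desc_words = set(description.lower().split())
    keyword_set = {k.strip().lower() for k in keywords.split(",")}
    # one point per title word echoed in the description,
    # two points per title word echoed in the keywords
    return len(title_words & desc_words) + 2 * len(title_words & keyword_set)

# An honest page scores modestly...
print(tag_overlap_score("Crunchy Widgets",
                        "A catalog of widgets",
                        "widgets, catalog"))                  # 3

# ...and is trivially beaten by stuffing the same words into every tag.
print(tag_overlap_score("Crunchy Widgets",
                        "crunchy widgets crunchy widgets",
                        "crunchy, widgets"))                  # 6
```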
It was not long before the search engine developers started to realize that they were bogged down in the mire of trying to combat worthless links appearing in their search results. Yahoo was probably affected the most and soon found itself toe to toe with the up-and-comer Google. It was at this time that the developers made a slight but very important change to the code in their bots. They recoded their software to scan the web pages themselves and then compare the content to the tags. If the words contained within the tags were present on the web page, then the site was indexed at a higher level. Of course, this was not a perfect fix, but it did mean that webmasters could no longer so easily and flagrantly misrepresent their websites to potential users.
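Continuing the toy example above, the refinement might look something like this: tag words that actually appear in the page text count for the site, and tag words the page never uses count against it. The exact weighting is my assumption; checking tags against content is the point.

```python
def verified_score(title, description, keywords, page_text):
    """Toy refinement: only credit tag words that actually appear in the
    visible page text, and penalize tag words the page never uses."""
    body_words = set(page_text.lower().split())
    tag_words = set((title + " " + description).lower().split())
    tag_words |= {k.strip().lower() for k in keywords.split(",")}
    verified = tag_words & body_words       # tag words backed by content
    unverified = tag_words - body_words     # tag words with no support
    return len(verified) - len(unverified)

page = "We sell crunchy widgets in three sizes."
print(verified_score("Crunchy Widgets", "crunchy widgets",
                     "widgets, crunchy", page))               # 2
print(verified_score("Free Money", "free money",
                     "money, free, lottery", page))           # -3
```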
The search engine Google does not rely heavily on META tags to index sites, but they are still used by other search engines. The reason for this change in the importance of META tags is quite simple: it gives web developers less opportunity to manipulate their websites’ rankings and gives the search provider a better end product by making the results presented more suited to the user’s query. Search engine companies do not like web developers having the ability to manipulate the information they are collecting. In this day and age, data is money; if your data is corrupted or flawed, then you are going to have a difficult time competing in this market.
As of now, Google is the dominant force in the search engine market. Google’s profits continue to grow, and the search engine giant is not shy about developing new web-based technologies to lure new users to its services. In this market, users are the equivalent of money, and all of the search engine developers are vying for their piece of that pie. Ask.com has been trying to gain some momentum in recent years with its advertising and new search engine enhancements. Ask.com is pushing its algorithm as the wave of the future and an almost artificial-intelligence approach to searching the Internet. The user types a search phrase into the search box, and not only do relevant links pop up, but related topics as well. The other great enhancement to this search engine is the ability to preview the content of a website without having to enter it. This is a nice way of ensuring that the site is actually presenting the content it claims to.
With all of these great innovations in the search engine industry, one has to wonder what is next, and how far it can really go before we hit the ceiling. Google is now working on CSE, or Custom Search Engine, technology. According to an interview with Eric Enge, president of Stone Temple Consulting, Shashi Seth, who runs Google Co-op, said, “Google is working on a customizable search engine which will be personalized to the individual user.” It is clear that if search engine developers can offer users the ability to search in a more personal manner, then they will definitely open up the market and walk away with the prize. However, the obstacle they face is that of individuals not wanting to share their personal information and web browsing habits for the entire world to see. As for me, I like the idea of a personalized search engine, one which would include many of the Google pet projects as well.