
Nucleus Internet 101

Web Browsers

How to identify your web browser

There is often some confusion over what a web browser is and which one you are using. A web browser is a program used for viewing web pages over the internet; when you click an icon on your desktop to "open the internet", you are opening your web browser. There are many web browsers. The most common are Internet Explorer on the PC (this includes Dell, IBM, HP and Compaq machines, among others) and Safari on the Macintosh (including the iMac, MacBook, iBook and Mac Pro). A number of other browsers run on both Mac and PC as well, and both Internet Explorer and Safari are available for both platforms.

As you can see, figuring out which browser you're using can be confusing.

Your first clue is the icon you click to open it. The common ones are:

Internet Explorer

Safari

Mozilla Firefox

Google Chrome

   

When open on a PC, you should see the icon representing the browser type in the top right-hand corner of the browser window (unless it has been altered by a previous ISP).

When open on a Mac, the first menu at the top will be the name of the browser.

Identifying the version

Now that you have a basic idea of which family of browser you are using, you may need to troubleshoot an issue that is version specific. Different versions of each browser also offer different user-controllable options, so knowing which version you are running can come in handy.

To find the browser version:

On a PC

Click the Help menu and choose ‘About Browser Name’, where Browser Name is the browser you are using. For example, when using Internet Explorer, choose Help > About Internet Explorer as seen below.

Note: If you are using Internet Explorer 8, the Help menu may not be displayed by default. Follow the same instructions as above, only click the graphical help button and then click on About Internet Explorer. If you would like to display the Menu Bar (which contains File, Edit, Favorites, Help, etc.), simply right-click beside the graphical buttons at the top and enable Menu Bar (see below for more details on how this is done).

How to Enable the Menu Bar

The After Effect

As you can see, the Menu Bar is now displayed towards the top of the web browser as it is usually displayed in Internet Explorer 6 and 7.

On a Mac

Click the browser menu and choose ‘About Browser Name’, where Browser Name is the browser you are using. For example, when using Safari, choose the ‘Safari’ menu and click ‘About Safari’ as seen below.

Making & Managing Bookmarks

Using Web Browsers

There are many common features available in most web browsers. Many newer browsers support tabs (allowing you to have multiple pages open in a single window), and all browsers have basic navigation features (Back, Forward, Home and Refresh). For a brief overview of the various functions available, please see the diagrams below.

Internet Explorer 6

Internet Explorer 7

Internet Explorer 8

Safari

Creating Bookmarks/Favorites

Bookmarks are a good way of keeping track of your most frequently visited websites. Most browsers, including all of the browsers named here, offer the option of saving bookmarks.

IE 6

To create a favorite:

  1. Click the ‘Favorites’ menu and select ‘Add to Favorites’
  2. Follow the prompts to save your bookmark.

All favorites are saved in the ‘Favorites’ menu

IE 7 & 8

To create a favorite:

  1. Click the ‘Favorites’ menu and select ‘Add to Favorites’, or click the new ‘Favorites’ button and select ‘Add to Favorites’

All favorites are saved in the ‘Favorites’ menu or under the new ‘Favorites’ button.

Safari

To create a bookmark:

  1. Click the ‘+’ next to the address bar OR go to the Bookmarks menu and choose 'Add Bookmark'

All bookmarks are saved in the Bookmarks menu.

The Default Homepage

When you first open your web browser, you usually go directly to a main homepage. In some cases this page is set by software you have installed, by the computer manufacturer, or by the browser developers. In most cases, however, the computer user (that’s you) will want to change the default homepage to one of their choosing. It’s usually a good idea to set it to a site you visit often, or one with information that updates daily or more frequently. That said, it’s important to note you can set your homepage to ANY site on the internet, not just one it is pre-set to.

How you set your default homepage depends on the type of browser you’re using.

In Internet Explorer 6

Click ‘Tools’ and choose ‘Internet Options’

Then type the address of the homepage you want in the ‘Address’ line, or, if you are already on the page you want as your homepage, click ‘Use Current’

When you have the correct home page address in the ‘Address’ line, click ‘Apply’ then ‘OK’

In Internet Explorer 7 & 8

Click ‘Tools’ and choose ‘Internet Options’

Then type the address of the homepage you want in the ‘Home Page’ box, or, if you are already on the page you want as your homepage, click ‘Use Current’

When you have the correct home page address in the ‘Home page’ box, click ‘Apply’ then ‘OK’

In Safari

Click on the ‘Safari’ menu and click ‘Preferences’

Then type the address of the homepage you want in the ‘Home Page’ line, or, if you are already on the page you want as your homepage, click ‘Set to Current Page’

When the correct address is in the ‘Home Page’ line, click the red button in the top left corner to close the Preferences window.

Cookies, Java and Other Tasty Browsing Terms

There are a lot of terms used on and about the internet that can be quite confusing. Below are brief descriptions of what some of these terms mean and why you should know them.

Cookies

No, not the chocolate chip kind. Cookies are pieces of information delivered from a web site to your browser and then stored on the hard drive; examples are login or registration information, online “shopping cart” contents, and user preferences. A cookie can be read back by that web site on your next visit. The information stored in cookies can also be used by spyware and malicious websites to track browsing habits and retrieve personal information. Most cookies are fairly safe, and some are automatically deleted when you log out of a website. Some, however, keep information for long periods of time and are used to “automatically log in” to a website every time you go to it. While a cookie can be restricted by the website so that only the original website can read it, in some cases this information can be accessed by other websites or by spyware to hijack your online accounts or track the websites you commonly visit. Many security programs and clean-up utilities have a feature to automatically delete cookies, and you can also delete cookies manually in most browsers.
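For the curious, here is a minimal sketch of what a cookie actually looks like when a web site delivers it, built with Python's standard http.cookies module. The cookie name and values are invented purely for illustration:

```python
# Compose the "Set-Cookie" header a web site sends to deliver a cookie.
# (The name and values below are made up for illustration.)
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session_id"] = "abc123"          # e.g. login/registration info
cookie["session_id"]["max-age"] = 3600   # kept for one hour, then discarded
cookie["session_id"]["httponly"] = True  # hidden from scripts on the page

print(cookie.output())
# prints something like: Set-Cookie: session_id=abc123; Max-Age=3600; HttpOnly
```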

In Internet Explorer 6

Click ‘Tools’ and choose ‘Internet Options’

 

Then click the “Delete Cookies” button.

In Internet Explorer 7 & 8

Click ‘Tools’ and choose ‘Internet Options’

Click “Delete…”

In Internet Explorer 7 - click “Delete Cookies”

In Internet Explorer 8 - select what you would like to delete by checking the boxes on the left. Once this is done, click “Delete”

In Safari

Click on the ‘Safari’ menu and click ‘Preferences’

Go to the ‘Security’ Tab

Click ‘Show Cookies’

Click ‘Remove All’

And click ‘Remove All’ in the pop-up confirmation

Java, Flash, ASPX, Microsoft Silverlight, and Adobe Air

These are specific types of programs and scripts run from within a browser to display content that simple web page coding can’t. They are often used for online games, forums and chat rooms, and any number of other purposes; they are essentially programs made to run within other programs (browsers) and can be used to build almost any kind of program you could run on your computer. The advantage of these kinds of programs is that they are run, saved, and accessible from the internet: you can reach your information from any computer with an internet connection and don’t have to worry about them taking up space on your computer. The disadvantage is the very same thing: your information is being streamed across the internet, and if it is not properly secured it can be accessible to anyone with the time and motivation to find it. If your account on a web page is hijacked, the hijacker has unlimited access to all the information in your account. And because these programs are only accessible online, if you can’t reach the internet you cannot use them.

These program types typically involve installing an add-on, plug-in, or runtime on your computer so that your browser can read the programs. Any corruption in these add-ons, plug-ins, or runtimes can cause tools and content written in these languages to fail to display.

Java and Javascript

Let us make a distinction here: Java is not Javascript, a common misconception created by two different organizations naming their products without consideration for the names of other products. Java is a programming language for writing full-blown software applications which are typically displayed through a web page but are separate, embedded components. Java programs can also be run outside of the web browser; those built into a webpage are referred to as Java Applets. Javascript is a scripting language commonly used for achieving visual effects and limited interactivity on a webpage; it is part of the web page itself, not an embedded element.

Browsing Safety

Keeping your browser safe from infections and from those who wish to steal personal information is often one of the most confusing subjects when it comes to computers. The truth is there is no one right answer in terms of what you should be doing to keep your computer safe. Everyone has an opinion on what you should do, but there is no 100% guaranteed method of securing your computer. The most anyone can do is make themselves a harder target than someone else. That said, there are some things you should consider when deciding what kind of security works for you.

Antivirus

An antivirus is the basic tool most people use to protect their computers. Antivirus programs scan information on your computer (and sometimes data coming in and going out, as with mail scanning) for known viruses and for things that look like known viruses. There are a number of different antivirus programs available, ranging in price from free to $200.00+. The important fact to remember is that just about any antivirus will offer some level of protection, but in many cases too big or intrusive an antivirus will affect computer performance and user experience. It may sometimes be necessary to disable certain features of an antivirus to use other programs on your computer as intended.

Firewall

A firewall is a program that blocks information coming into and out of a computer. Most firewalls will automatically “learn” what programs you normally use and what kinds of things they are allowed to do when you are using them. Programs trying to get into your computer from outside, or trying to send information you don’t necessarily want sent (such as spyware), are usually blocked by a firewall. Please note a firewall will *not* move, remove, or delete malicious software; it just stops it from sending data out. In some cases a firewall will accidentally block a program you actually want to connect to the internet, and it may then be necessary to disable or manually reconfigure the firewall.

Anti-Spyware

Anti-spyware tools are programs that work much like antiviruses. They scan your computer for spyware and adware that may have infected it. While the difference between a virus and spyware may be a bit vague, it is important to note that they *are* different, and while many antivirus programs now include a spyware scanner, the reverse is not necessarily true (most anti-spyware programs are not virus scanners).

*Note: pop-up blockers are not anti-spyware programs. They are something like firewalls for spyware in that they simply stop the pop-ups created by spyware; they don’t remove the infection. In addition, since many types of spyware don’t create pop-ups, it is usually preferable to use a true anti-spyware program rather than a pop-up blocker.

General Security Tips

In general, the best software on earth isn’t going to protect you if you are browsing and using the internet in an unsafe manner. Your computer will never save you from actions you take of your own free will. With that in mind, here is a list of ways to protect yourself when online:

  Never click on a link in an email if you’re not sure where it will take you and who it came from.
  Don’t give out personal information over the internet if it can be avoided, and *never* send personal information to a website or email address that looks suspicious.
  If you are not sure that a web page or email is actually from the person, organization, or company it says it’s from, contact that person, organization, or company and confirm it is legitimate before you respond or give out any personal information.
  Computer infections will happen from time to time. Warning signs include: a sudden slowdown of the computer even when not on the internet; new files showing up with unusual or unreadable names; sudden changes in the appearance of the computer that you didn’t make; pop-ups appearing at random while surfing; your antivirus or anti-spyware software reporting an infection that cannot be removed; or your firewall reporting that a suspicious new program is trying to access the internet. If you notice any of these, you likely have an infection. Take the computer to a computer professional, such as our Nucleus technicians, and have it inspected and cleaned as soon as possible.

Search Engines Explained

Search Engines - Generation Technologies Summary

The term "Search Engine" generically refers to publically accessible web sites that allow you to perform searches on the Internet for all kinds of content and information. Most people are familar with Search Engines such as the present-day Google or Yahoo! web sites. The first applications that performed internet searching tasks were not found on web sites, but were instead programs that ran locally on your computer. Examples were Archie and Gopher and were used prior to 1993.

The technology of search engines is still evolving, and there are many companies world-wide offering products to improve "data mining". Generally speaking, the ultimate goal in this field is for computers to accurately produce personalized search results when queried in a way that's intuitive to humans.

Presently, there are many mathematically and syntactically formalized approaches for searching large bodies of data, as well as specialized systems for searching through specific types of data, such as audio files. For the average person, using such facilities is unwieldy and presents a large learning curve, which makes them unsuitable as public search products. Ideally, rather than using mathematical and "machine" language queries to find information, we would prefer a system that lets us ask it questions, as if we were speaking to another person who has an "intuitive sense" of the context of the question as well as the relevancy of the answers we're looking for. Such a task is highly human-oriented and very difficult to instruct a machine to perform. Presently there is no solution for this, but there are many proposed technologies that will bring us closer to it.

 

Searching the Web

The process of actually searching the Web for material that interests you is very simple and straightforward. You're able to search the entire Web, specific Web Sites, news articles, pictures and more from a single web site (the search engine). Having such a page bookmarked in your Web Browser, or even set as your Home Page, is a typical, easy way to get to the search engine when you need it.

Below, we've provided a list of several Web Pages on the Internet that are search engines, sorted alphabetically. Just click a link to explore the engine! In each case the idea is always the same - simply type words or a question related to your interests into the search field on the page and click the Search button.

Clusty
Google
Grokker
Picsearch
Wikipedia
Yahoo!

 

Searching for People

One niche application for search engines is locating specific people (or information about them). Typical resources might simply be online phone books, but there are other online search engines that are capable of revealing incredibly detailed data about the people searched for. Here are a few examples of these engines:

Super Pages
Telus
USA People Search
White Pages
Yahoo! People

Searching for Places

Finally, another novel variety of searchable content on the web is geographic maps: simply typing in a complete address is usually enough for these types of engines to pinpoint the location on either a drawn map, a satellite map, or both (a hybrid map). In addition to location services, these engines are usually also capable of providing optimized travel directions between two addresses. Have a look at the following links:

Map Quest
Yahoo! Maps

Search Technology - Advanced Details

Most people won't be too interested in the details presented about Search Engines on the remainder of this page, as the material is a bit more technical and certainly not required information. If you don't really care about how a search engine finds stuff for you, then you can just skip the content that follows...

If, on the other hand, you're still curious about how Search Engines work and where they're headed in the future, then you should check out what we've included below. To start with, we've made a brief summary of the various "generations" of technologies that have been used for searching internet content since the early 1990's. First- and second-generation search engines are now things of the past, while ideas about a fourth generation are only just beginning to appear. As of this writing (early 2008), we're at the border between 2nd- and 3rd-generation search engines...

First-generation Web Page search engines ranked sites based solely on page content - examples were Wandex and Aliweb, which appeared in 1993. Many search engines followed in subsequent years (1994 - 1997), all vying for popularity - for instance Webcrawler, Infoseek, Altavista, Yahoo, Excite and Dogpile (among others).
 
Second-generation engines rely on link analysis for ranking - so not only do they consider page content, but they also take the structure of the Web (ie. what other pages link to a target page) into account. Examples are Google, MSN Live Search, and Overture. Both Google and MSN appeared in 1998 but Google did not rise to significant prominence until about 2000. Since that time, the designers of these engines have been adding additional features, moving the total state of this technology closer to Third-generation.

New and innovative approaches to searching are now becoming available to the public - Google's research department, for example, has many projects that can be accessed. Other engines like Clusty, Grokker, and Ask Jeeves have developed their own proprietary solutions that are also freely available.
 
Third-generation search technologies are designed to combine the scalability of existing internet search engines with new and improved relevancy models; they bring into the equation user preferences, social collaboration, collective intelligence, a rich user experience, and many other specialized capabilities that make information search more productive. Computer algorithms which implement these approaches are under heavy research and development, but some are beginning to appear.
 
Fourth-generation search will no longer require explicit queries, but will instead provide "active information supply driven by user activity and context". At this point, speculating about what software design approaches will be employed to meet this vague description is almost pointless, since some experts suggest that this generation of search engines is roughly 20 years away. From a modern perspective, it's probably most accurate to simply say that fourth-generation search will be comparable to science fiction made real: interacting with such an engine will probably be indistinguishable from asking questions of an incredibly intelligent and intuitive human.
 

Search Engines - Gen 2 Technologies

To follow up the previous section, let's take a closer look at how search engines actually work by breaking their mechanisms down into identifiable pieces and describing each one. Keep in mind that the technology behind search engines is private - proprietary solutions developed in-house by the companies that host the search services - so one can really only guess at how their products actually produce results. Nevertheless, a lot about search engine operation is known, even though the software that runs them is secret.

Search engines have three states of operation:

1. Web crawling (Spidering) - at this stage, automated web browsers follow every link they encounter, set in motion from a "core" group of web pages supplied to them by human operators. Page contents are subsequently analyzed using a variety of methods (see below), the results of which are then stored in a large, organized database.

2. Indexing - having harvested a web page for information and analyzed its contents in step 1, the results are now stored in a database which will later be referenced to answer search queries from an end-user. To avoid redundant work, pages already indexed are not spidered again. An index has a single purpose: it allows information to be found as quickly as possible. There are quite a few ways for an index to be built, but one of the most effective is what's called a hash table. In hashing, a mathematical formula is applied to attach a numeric value to each word that will be entered into the database. The formula is designed to evenly distribute the entries across a predetermined number of divisions, and it is this numerical distribution - which is different from the distribution of words across the alphabet - that is the key to a hash table's effectiveness: a given word will always "hash" to a specific spot in the database, which can almost instantly return matched results (a toy sketch of this idea follows the list below).

3. The Search Facility - this is the user interface and software engine (a web page like Google or a locally-run application) that provides a syntax (or some other means) for a user to retrieve tailored results from the indexed database content once they submit a search query.
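To make the indexing stage concrete, here is a toy sketch in Python. Python's built-in dictionary is itself backed by a hash table, so each word "hashes" to a bucket that can be found almost instantly. The pages and their text are invented for illustration:

```python
from collections import defaultdict

# Hypothetical pages harvested by the crawler in step 1.
pages = {
    "example.com/cake":  "chocolate dessert recipe",
    "example.com/steak": "steak dinner recipe",
}

# Step 2: build the index. A dict is a hash table, so looking up a
# word jumps straight to its spot instead of scanning every entry.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.split():
        index[word].add(url)

print(index["recipe"])     # both pages, found in (near) constant time
print(index["chocolate"])  # only the cake page
```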

When 2nd-generation search engines first appeared, the techniques used to produce results had evolved from simple keyword lookups to involve the following major components, which helped adjust the relevancy of search results and allowed engines to order (or even exclude) matched pages with poor ratings (a toy scoring sketch follows the list):

Tracking Clicks
This is a value that reflects the total volume of clicks on links to a site, the assumption being that this is a partial indicator of popularity.
Page Reputation
A value reflecting how frequently the content on a page actually matches its "advertised" content.
Link Popularity
Also a frequency value, showing the number of links to a specific page that users are actually clicking on.
Temporal Tracking
This is a "vectored" value (one that shows a direction) and indicates whether links to a target web page are growing or shrinking over time.
Link Quality
This is an indicator of how often links to a target site break, or how often they actually link to the page they are intended to.
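Purely for illustration, here is one way signals like those above might be folded into a single relevancy score. The weights and signal values are invented; real engines keep their actual formulas secret:

```python
def relevancy_score(page):
    """Blend hypothetical per-page signals (each scaled 0..1) into one value."""
    return (0.30 * page["clicks"]           # tracking clicks
          + 0.25 * page["reputation"]       # page reputation
          + 0.20 * page["link_popularity"]  # link popularity
          + 0.15 * page["link_trend"]       # temporal tracking (growth/shrink)
          + 0.10 * page["link_quality"])    # link quality (how often links work)

candidate = {"clicks": 0.8, "reputation": 0.9, "link_popularity": 0.7,
             "link_trend": 0.2, "link_quality": 0.95}
print(round(relevancy_score(candidate), 3))  # 0.73 - ranks ahead of lower scores
```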

Advances in Gen 2 Engine Technology

Following the introduction of 2nd Generation search engines, their authors began to introduce a variety of new systems to improve the accuracy of search results. Here's a summary of the techniques that have been employed to date:

Term Vector Databases

These systems weigh page keyword density to calculate the Page Vector of each web site indexed. The numbers assigned to the keyword phrases are called Term Vectors.

The engine first looks at all the information (keywords, one and two keyword phrase densities, page length, etc.) on a 'seed set' or group of sites and pages that it has already spidered and has in its index. This becomes the 'core' of the search engine to which other pages will be compared to generate comparative ranking data.

The Page Vectors of new sites (calculated in the same manner as Term Vectors) are compared to the seed set statistics and used to assign the new page a relative number for each keyword phrase. Using these values, a web page's reputation is calculated by graphing interconnectivity and link relevancy, making sure the reputation of the page and the content on the page actually match.

The closest matches get the highest search engine positioning. The closer the term vector is to the page vector, the better the chance that the page has of being a top ten contender for any particular keyword phrase. It may even be 'folded' into the Core, bumping off some other page, causing it to fall out of the search engine (some engines allow a 'pay to stay' model so paid sites don't get bumped out).

The end result of the analysis and comparison is essentially, "what you say about your Web page, how the structure of other people's Web pages compares on the same topic, and what other people say your site is about all must match for your site to rank well in search results."
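Here is a heavily simplified sketch of the comparison described above: each page becomes a vector of keyword frequencies, and the angle between two vectors measures how alike the pages are. Real term vector databases weigh far more statistics (phrase densities, page length, and so on); the vocabulary and page text here are invented:

```python
import math

def term_vector(text, vocabulary):
    """Crude term vector: the frequency of each vocabulary word in the text."""
    words = text.lower().split()
    return [words.count(term) / len(words) for term in vocabulary]

def cosine_similarity(a, b):
    """1.0 means the vectors point the same way; 0.0 means nothing in common."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0

vocab = ["chocolate", "dessert", "recipe"]
seed = term_vector("chocolate dessert recipe chocolate dessert", vocab)  # 'core' page
new_page = term_vector("chocolate recipe with extra chocolate", vocab)
print(cosine_similarity(seed, new_page))  # closer to 1.0 = stronger contender
```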


 

Page Rank

This is a proprietary page-ranking algorithm by Google (similar in spirit to Term/Page Vectors) that positions web pages against others based on the number and Page Rank of the other web sites and pages that link to them. It operates on the premise that "good or desirable pages are linked to more than others".
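Google's production algorithm is proprietary, but the basic idea was published in 1998, and a toy version fits in a few lines. The three-page link graph below is invented; rank flows along links, iteration after iteration, until the values settle:

```python
# Toy PageRank by power iteration over a tiny, made-up link graph.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}  # page -> pages it links to
damping = 0.85
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}

for _ in range(50):  # repeat until the ranks stop changing much
    rank = {
        p: (1 - damping) / len(pages)
           + damping * sum(rank[q] / len(links[q]) for q in pages if p in links[q])
        for p in pages
    }

print({p: round(r, 3) for p, r in rank.items()})  # most-linked-to pages rank highest
```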


 

Boolean Operators

Some engines allow a special set of symbols to be specified in the search text itself to filter the results produced in a search. A Boolean operator is a language construct used in the mathematical field of logic. Although such operators produce entirely different results than arithmetic operators, you can loosely compare them to one another: just as arithmetic has the '+' operator, logic has the 'AND' operator.

When a Boolean operation is performed between two pieces of data, you generate a result that is either True or False. In the context of a search engine, you could perform a search like "dessert AND chocolate", which tells the engine that, for a result to be returned, the page must match BOTH "dessert" and "chocolate" - pages about desserts in general, or about chocolate in general, are not enough. To exclude results (say, vanilla desserts), most engines also support a 'NOT' operator, as in "dessert AND chocolate NOT vanilla".
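As a sketch of what the engine does behind the scenes, Boolean operators map neatly onto set operations against the index (the index contents below are made up):

```python
# Hypothetical inverted index: word -> set of pages containing it.
index = {
    "dessert":   {"page1", "page2", "page3"},
    "chocolate": {"page1", "page3"},
    "vanilla":   {"page2", "page3"},
}

# "dessert AND chocolate": intersection - a page must contain both words.
print(index["dessert"] & index["chocolate"])  # {'page1', 'page3'}

# "dessert AND chocolate NOT vanilla": subtract the excluded term's pages.
print((index["dessert"] & index["chocolate"]) - index["vanilla"])  # {'page1'}
```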


Proximity Search

This type of search function allows users to define the maximum and minimum distance between the keywords being searched. The reason for this is that you may search for "chocolate recipe" and, without Proximity Search, any webpage that contains both of these words anywhere on the page will be returned. So, if a webpage mentions chocolate in one section but then, much later, lists a recipe for preparing a steak dinner, the page would be returned.

With a Proximity Search, you can say that the words "chocolate" and "recipe" must occur within, for example, 5 words of each other. The assumption here is that if two words occur near to one another, they have a stronger contextual relationship than the same two words if they appear far apart. Words near to each other have a shared meaning, but words far from each other have only coincidental meaning, if they have any at all.
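A minimal sketch of the proximity test itself: word positions are compared after splitting the page text, with the five-word window mirroring the example above. The page texts are invented:

```python
def within_proximity(text, word1, word2, max_distance=5):
    """True if word1 and word2 occur within max_distance words of each other."""
    words = text.lower().split()
    spots1 = [i for i, w in enumerate(words) if w == word1]
    spots2 = [i for i, w in enumerate(words) if w == word2]
    return any(abs(i - j) <= max_distance for i in spots1 for j in spots2)

print(within_proximity("a rich chocolate layer-cake recipe", "chocolate", "recipe"))
# True - the words are a few positions apart, so they likely share a context

page = "chocolate is delicious " + "lorem " * 200 + "a steak dinner recipe"
print(within_proximity(page, "chocolate", "recipe"))
# False - hundreds of words apart, so any relationship is coincidental
```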


Stop Words

Stop Words are common words such as "a", "the", "do", "i" and "for", which Search Engines will ignore by default since they generally do not enhance the meaning of the surrounding text. Occasionally, you might want Stop Words to be recognized for some reason - in such situations, Search Engines can usually be forced to recognize them by employing specific commands or symbols along with your search text.
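A sketch of the default filtering step; the stop word list here is a tiny invented sample, while real engines maintain much longer ones:

```python
STOP_WORDS = {"a", "the", "do", "i", "for", "and", "of", "to"}  # tiny sample

def significant_terms(query):
    """Drop stop words before the query is matched against the index."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]

print(significant_terms("a recipe for the best chocolate dessert"))
# ['recipe', 'best', 'chocolate', 'dessert']
```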


Media Searching

Media Searching is the task of finding pictures, audio, or video files when the search engine has been supplied only a text description of what you're looking for. Since computers cannot yet "understand" seen or heard objects, they have no way of automatically deciding whether a picture, for example, contains a baby or an elephant. Creating computer systems that do this is an area of intense research, but the field is still in its infancy and comes with significant drawbacks, such as being excessively demanding on a computer's available processing power.

Some of these engines are human-powered, which means human operators manually decide what keywords to attach to a media file after reviewing the file itself. Others rely on an automated process of extracting keywords from the context in which the file was found, by assuming that the text surrounding it (ie. links, captions, etc.) is an indicator of the subject material.

Video Search - It is generally acknowledged that visual search into video does not work well and that no company is using it publicly. Automated indexing works by assessing keywords that might be found in the title of the media, attached to the media, or in links to the media, all of which could be defined by the authors and users of video-hosting resources. Audio processing is also applied to the video in order to create a text-based transcript of any speech detected; the text itself can then be used for further analysis in the indexing process. Analyzing audio, however, is also highly unreliable, with its own set of problems.

Audio Search – Again, keywords for each search can be found in the title of the media, in any attached text, or in linked web pages. Processing of speech in the audio is done to extract additional keywords, but ambient music, noise, and multiple people talking (such as in crowds or arguments) dramatically reduce the effectiveness of this approach.

Image Search - A common misunderstanding when it comes to image search is that the technology is based on detecting information in the image itself. In fact, most image search works like other search engines: the metadata of the image is indexed as keywords that describe the contents of the image and stored in a large database. When a search query is performed, the image search engine looks up the index and matches the query against the stored information. More recent and advanced technologies like Facial Recognition, Edge Detection, and Social Collaboration are finding limited “niche” applications in this area.

Cambridge Ontology has developed a system that automatically identifies visual content in images and associates defined keywords to recognized objects. The system recognizes a wide range of object categories (e.g. grass, sky, sunset, beach etc.), and can also spot and interpret faces, postures, and body parts. From these components, it is able to derive other characteristics about human subjects such as gender. The technology works by grouping image pixels into definable regions (eg. Sky vs. ground), and then identifies objects in those regions (eg. Grass and rocks). Once objects are found, the “semantics” engine associates a list of predefined keywords and synonyms to each object.

It's expected that, in the future, this type of technology will be applied to individual frames in video, combined with even more analytical techniques that allow data from those frames to be associated to others, thereby enabling the processing (ie. recognition & analysis) of moving vs. static objects in the video.


Regular Expressions

Abbreviated as "RegEx", Regular Expressions are a formal syntax, like math, that provide a concise and flexible means for identifying text or patterns of text in large bodies of data which interests us.

Regular Expressions consist of rules and operators that can be applied to text in both simple and complicated ways to produce extremely accurate and flexible search results. They build on an older, weaker syntax that might be familiar to some: "wildcard" symbols (such as '*') were used to indicate to a search program that "anything matches". In this way, you could search for the text "her*" and expect to get results like "hers" and "herself".

By employing mathematics, Language Theory, Logic, and Set Theory these engines are able to select very specific information from otherwise unwieldy quantities of data. Some examples are:

Find the sequence of characters "car" in any context. Results may include "car", "cartoon", or "bicarbonate".
Return whole lines of data, but only those that start with a date where the date can be in any number of accepted formats (eg. YY/MM/DD, DDD/MM/YY, etc.)
Find sentences of a book that contain the exact word "car" precisely twice.
Count the total occurrences of the words "Harry Potter" in a book.
Find all instances of any email address in any file on a computer, but only email addresses (ie. not simply instances of text that contain an '@' symbol).
Find entire sequences of a mail transaction from a given email address to any addresses on a given domain (while searching a mail log).
Find sentences containing a specific word, but also find sentences with a synonym of that word.
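A few of the examples above, sketched with Python's built-in re module. Note that the email pattern is deliberately simplified; the real grammar of email addresses is far messier:

```python
import re

text = "Contact sales@example.com about the cartoon car.\n99/05/21 backup ran"

# The sequence "car" in any context (matches inside "cartoon" too):
print(re.findall(r"car", text))                      # ['car', 'car']

# A simplified email pattern - not just any text containing an '@':
print(re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)) # ['sales@example.com']

# Whole lines that start with a YY/MM/DD date:
print(re.findall(r"^\d{2}/\d{2}/\d{2}.*$", text, re.MULTILINE))
```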

Geocoding and Geoparsing

This is the processing of web pages so that results are matched to locations in a "geospatial" frame of reference, such as a street address or area. The assumption is that the searcher expects personally geo-relevant results from the engine. For example, if you search for "pizza restaurants" without specifying a location, the engine assumes you want restaurants near you, not in another city or country. In this case, the engine might return a few locations of pizza establishments and display them on a map for you.


Search Engines - Gen 3 Technologies

Web maps

Contrary to what you might think, this technology will not actually produce results with respect to geographic maps. Instead, Web Maps will be useful filtering tools that enable us to get rid of duplicate sites and the many stand-alone pages that drive traffic to only a few destinations. This means pages like doorways, gateways and splash screens will get filtered out of search results.


Agents

Neural Nets and related AI (Artificially Intelligent software) will be used to 'come to know you' over a period of time, based on your past searching habits and the results you selected and browsed for lengthy periods. Agents are still an area of Computer Science that is under heavy research.


User Preferences

This allows searchers to load the engine with lists of keywords covering their general interests, likes/dislikes, geographical info, favorite Web sites and more. As a result, Context Engines can be enabled on a customized, per-person basis.


Clustering

Search results are automatically sorted into categories that are built "on the fly" by the engine as matching results are located. This means that an engine will, for example, group hits about Paris Hilton separately from hits about the city of Paris. When the categories are presented, it is immediately clear which search results are relevant to you, allowing much faster assessment of the displayed pages.


Meta Search

This is a "search of other searches" and will likely include Clustering. Not much else is currently (publically) known about the future plans for this type of search.


Social Networking

In this scenario, one that is already gaining a lot of popularity, users tag their favorite results and block irrelevant content such as spam or misleading pages. When enabled, your search results will include statistically analyzed matches hand-picked by other users with similar areas of interest.


Answer Extraction

Instead of finding the page on which the answer might be located, this technology will produce the answers themselves in the search results. It emphasizes the processing of language rather than symbols, using the positions of words in the search phrase and the meanings associated with them.


 

Concept Searching

Some of this research involves using statistical analysis on pages containing the words or phrases you search for, in order to find other pages you might be interested in. Obviously, the information stored about each page is greater for a concept-based search engine, and far more processing is required for each search. Many groups are working to improve both results and performance of this type of search engine.


 

Natural Language Queries

The idea here is that you can type a question the same way you would ask it of a human sitting beside you - no need to keep track of Boolean operators or complex query structures. The most popular natural-language query site today is AskJeeves.com, which parses the query for keywords that it then applies to the index of sites it has built. It only works with simple queries, but competition is heavy to develop a natural-language query engine that can accept queries of great complexity.


 

 
