
How to use AltaVista to search the Internet

Table of Contents

This material is taken from the help pages that AltaVista provides for on-line use. It is gathered here so that the basic information can be read from a single web page.

Understanding the syntax - words, phrases, capitalization...

Simple Queries

Advanced Queries

Understanding the syntax - words, phrases, capitalization...

Words

AltaVista treats every page on the Web and every article of Usenet news as a sequence of words. A word in this context means any string of letters and digits delimited either by punctuation and other non-alphanumeric characters (for example, &, %, $, /, #, _, ~), or by white space (spaces, tabs, line ends, start of document, end of document). To be a word, a string of alphanumerics does not have to be spelled correctly or be found in any dictionary. All that is required is that someone typed it as a single word in a Web page or Usenet news article. Thus, the following are words if they appear delimited in a document: HAL5000, Gorbachevnik, 602e21, www, http, EasierSaidThanDone, etc. The following are all considered to be two words because the internal punctuation separates them: don't, digital.com, x-y, AT&T, 3.14159, U.S., All'sFairInLoveAndWar.

Only the words in a document are significant to AltaVista. AltaVista does not index punctuation or white space, so you can use AltaVista to look only for words and phrases, not punctuation.
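As a rough illustration of the word rule above, the following sketch splits text into words with a regular expression. The function name split_words is hypothetical, and treating every Unicode letter or digit as part of a word is an assumption. AltaVista's actual indexer is not described on this page.

```python
import re

def split_words(text):
    """Return the words in text: maximal runs of letters and digits."""
    # [^\W_] matches a letter or digit but not the underscore, which the
    # text above lists among the delimiters. Everything else (punctuation,
    # white space) only separates words and is discarded.
    return re.findall(r"[^\W_]+", text)

print(split_words("HAL5000"))      # one word
print(split_words("digital.com"))  # two words
print(split_words("AT&T"))         # two words
```

Under this sketch, http://www.election.digital.com yields the five words http, www, election, digital and com.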

Phrases

A phrase is a string of words that are contiguous in a document, although they may be separated by any amount of white space or punctuation. They do not have to be grammatical in any human language--they just have to occur in a document as a contiguous sequence of words. Some examples:

  • President of the U.S.A. (6-word phrase)
  • http://www.election.digital.com (5-word phrase)

Since the punctuation and white space are insignificant to AltaVista (except that they delimit words), the phrases above are indistinguishable from the following variants:

  • President of the U S A
  • http www election digital com

There are two conventions for typing a phrase in a query. The best way, leading to the least ambiguity, is to type the phrase as "a sequence of words separated by spaces and surrounded by double quotes". However, as an alternative, you may type the words of the phrase with punctuation (and no white space) between each pair of words. For example, these are all equivalent as queries:

  • "President of the U S A"
  • President-of-the-U-S-A
  • President/of/the/U/S/A
  • President.of.the.U-S-A

The first is the one we generally recommend. Be aware that the punctuation characters & | ! and ~ have meaning in Advanced queries, and * indicates the *-notation used in both Simple and Advanced queries.
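One way to picture why these query forms are equivalent is to reduce both the query and the document to word sequences and look for one inside the other. The sketch below does that. The names words and contains_phrase are hypothetical, and the comparison is exact on case and accents purely to keep the example short (an assumption, not AltaVista's matching rule).

```python
import re

def words(text):
    """Split text into words, discarding punctuation and white space."""
    return re.findall(r"[^\W_]+", text)

def contains_phrase(document, phrase):
    """True if the words of phrase occur contiguously in document."""
    doc = words(document)
    target = words(phrase)  # quotes and punctuation vanish here
    n = len(target)
    if n == 0:
        return False
    # Slide a window of n words across the document.
    return any(doc[i:i + n] == target for i in range(len(doc) - n + 1))

doc = "He was elected President of the U.S.A. in November."
for query in ['"President of the U S A"', "President-of-the-U-S-A",
              "President/of/the/U/S/A", "President.of.the.U-S-A"]:
    print(query, contains_phrase(doc, query))
```

All four query forms reduce to the same six words, so they give the same answer for any document.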

Capitalization

Capital letters are considered distinct from lower-case letters. When a word is found in a Web page or a news article, its case is preserved when it is stored in the index.

When you enter a word in a query, therefore, it is always safe, and generally recommended, to type it all in lower-case, because lower-case letters indicate a case-insensitive match. If you type any capital letters, you force an exact case match on the entire word.

Thus, the word turkey in a query will match any of turkey, Turkey, tUrKeY or TURKEY occurring in a document. But the capitalized word Turkey in a query will match only Turkey in the document, and not any of the other capitalization variants.
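The capitalization rule can be sketched as a small comparison function. The name case_matches is hypothetical; the logic follows the rule as stated above and is not taken from AltaVista itself.

```python
def case_matches(query_word, document_word):
    """Apply the case rule: lower-case queries match any capitalization."""
    if query_word == query_word.lower():
        # No capital letters typed: case-insensitive match.
        return query_word == document_word.lower()
    # At least one capital letter typed: exact match on the entire word.
    return query_word == document_word

for candidate in ["turkey", "Turkey", "tUrKeY", "TURKEY"]:
    print(candidate,
          case_matches("turkey", candidate),
          case_matches("Turkey", candidate))
```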

Accents

Accents are treated in the same way as capitalization. An accented word used in a query forces an exact match on the entire word. For example, if you use éléphant in a query, you will match only the French spelling for the pachyderm. However, if you do not care to enter accents in the search window (how to do so depends on your browser, platform, and keyboard), you can always safely omit the accents, thereby matching both the French and English spellings.
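The accent rule can be sketched in the same spirit as the capitalization rule. The names strip_accents and accent_matches are hypothetical. Using Unicode decomposition to remove accents is an assumption made for this sketch, and the sketch deliberately ignores how accents and capitalization interact, which this page does not spell out.

```python
import unicodedata

def strip_accents(word):
    """Remove combining accent marks, e.g. turn the French spelling into elephant."""
    decomposed = unicodedata.normalize("NFD", word)
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", bare)

def accent_matches(query_word, document_word):
    """Apply the accent rule: unaccented queries match any accenting."""
    query_word = unicodedata.normalize("NFC", query_word)
    document_word = unicodedata.normalize("NFC", document_word)
    if strip_accents(query_word) == query_word:
        # No accents typed: accent-insensitive match.
        return query_word == strip_accents(document_word)
    # Accents typed: exact match on the entire word.
    return query_word == document_word

print(accent_matches("elephant", "éléphant"))
print(accent_matches("éléphant", "elephant"))
```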

The *-notation

To search for occurrences of any of a group of words with a similar pattern, AltaVista provides the *-notation. For example, you might want to search for matches of sing, singer, singers, singing. In this case, place the * at the end of the word whose inflections you want to include in the search: sing*. A word of warning, however: AltaVista will also match unrelated words that happen to begin with the same letters. So the query sing* will also find matches for singe, single, singular, and for foreign words such as French singulier.

The *-notation cannot be used without restriction. To make such queries computationally feasible, AltaVista requires that the * be used only after at least three letters. The * matches from zero to five additional letters, in lower-case only; capital letters and digits are therefore not matched.

The *-notation can sometimes be useful for finding variant spellings: for example, cantalo* will find matches for cantaloup, cantaloupe, cantalope, and their plurals. But take care how you construct the query word. For example, if you want to find matches for both color and colour, a query of the form col*r is not the most efficient. This query will also find matches for collector and atomic collider. In this case, it is more efficient to submit the query colo*r, which will find matches for both color and colour.
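The restrictions on the *-notation can be expressed as a regular expression, which makes it easy to check which words a pattern would accept. This is only a sketch: the function names are hypothetical, "lower-case letters" is taken to mean a to z (an assumption), and a single * per word is assumed.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a query word containing one * into a compiled regex."""
    prefix, star, suffix = pattern.partition("*")
    if not star or "*" in suffix:
        raise ValueError("expected exactly one * in the query word")
    if len(prefix) < 3 or not prefix.isalpha():
        raise ValueError("the * must follow at least three letters")
    # The * stands for zero to five additional lower-case letters.
    return re.compile(re.escape(prefix) + "[a-z]{0,5}" + re.escape(suffix))

def wildcard_matches(pattern, word):
    """True if the whole word is accepted by the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None

for word in ["color", "colour", "collector", "collider"]:
    print(word, wildcard_matches("col*r", word), wildcard_matches("colo*r", word))
```

Running the loop shows that col*r accepts all four words, while colo*r accepts only color and colour.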

Finally, if your search using the *-notation finds too many matches, AltaVista will ignore the query. The query inte*, for example, produces the result:

    Ignored inte*: 4292323

    No documents match this query

Simple Queries

paris "petite galerie" louvre
Finds documents containing as many of these words and phrases as possible, ranked so that documents with the most matches are presented first. A phrase is any string of adjacent words. The preferred way to link words into a phrase is to use quotes. Lower-case search will find matches of capitalized words also. For example, paris will find matches for paris, Paris, and PARIS. Capital letters in a search will force an exact case match on the entire word. For example, submitting a query for parIS will search only for matches of parIS. (Don't be surprised if there are none.)

+noir +film -"pinot noir"
Matches may be required, or prohibited. Precede a required word or phrase with + and a prohibited one with -. This query finds documents containing film and noir, but not containing pinot noir.
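The behaviour of required (+) and prohibited (-) terms, together with the "most matches first" ordering, can be approximated in a few lines. This is a simplified sketch and not AltaVista's ranking algorithm: all names are hypothetical, every comparison is made in lower-case (so the capitalization rule is ignored), and the score is simply the number of query terms found.

```python
import re

# A term is an optional + or - followed by a quoted phrase or a bare token.
TERM = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')

def words(text):
    return re.findall(r"[^\W_]+", text.lower())

def has_phrase(doc_words, phrase):
    n = len(phrase)
    return n > 0 and any(doc_words[i:i + n] == phrase
                         for i in range(len(doc_words) - n + 1))

def score(query, document):
    """Return the number of matched terms, or None if the document is rejected."""
    doc_words = words(document)
    matched = 0
    for sign, quoted, bare in TERM.findall(query):
        present = has_phrase(doc_words, words(quoted or bare))
        if sign == "+" and not present:
            return None          # a required term is missing
        if sign == "-":
            if present:
                return None      # a prohibited term is present
            continue
        matched += present
    return matched

def rank(query, documents):
    """Keep documents with at least one match, best score first."""
    scored = [(score(query, d), d) for d in documents]
    kept = [(s, d) for s, d in scored if s]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in kept]

docs = ["A film noir classic", "Pinot noir in film", "A noir novel"]
print(rank('+noir +film -"pinot noir"', docs))
```

Only the first document survives: the second contains the prohibited phrase and the third lacks the required word film.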

antique;pump;organ
Punctuation glues words into a phrase, just as quotes do. Punctuation is treated as white space, so this example is equivalent to "antique pump organ" (that is, three words enclosed in quotes).

quilt*
This query matches pages that contain at least one word such as quilt, quilts, quilting, quilted, quilter etc. Hint: The *-notation is also useful for searching for variant spellings. For example, alumi*m will find matches for both aluminum and the British English aluminium.

Constraining Searches

It is possible to restrict searches to certain portions of documents by using the following syntax. The keyword (link, title, image,...) should be in lower-case, and immediately followed by a colon.

Constraining searches in Web pages:

anchor:click-here
Matches pages with the phrase click here in the text of a hyperlink.

applet:NervousText
Matches pages containing the name of the Java applet class found in an applet tag; in this case, NervousText.

host:digital.com
Matches pages with the phrase digital.com in the host name of the Web server.

image:comet.jpg
Matches pages with comet.jpg in an image tag.

link:thomas.gov
Matches pages that contain at least one link to a page with thomas.gov in its URL.

text:algol68
Matches pages that contain the word algol68 in any part of the visible text of a page (i.e., the word is not in a link or an image tag, for example).

title:"The Wall Street Journal"
Matches pages with the phrase The Wall Street Journal in the title.

url:home.html
Matches pages with the words home and html together in the page's URL. Equivalent to url:"home html".
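The shape of a constrained term (a lower-case keyword, a colon, then a word or phrase) can be illustrated with a small parser. The function name parse_constraint is hypothetical, and the keyword set below contains only the Web-page keywords listed on this page.

```python
import re

# Keywords taken from the examples above; they must be typed in lower-case.
KEYWORDS = {"anchor", "applet", "host", "image", "link", "text", "title", "url"}

def parse_constraint(term):
    """Split keyword:value into the keyword and the words of the value."""
    keyword, colon, value = term.partition(":")
    if not colon or keyword not in KEYWORDS:
        raise ValueError("not a constrained term: %r" % term)
    # The value is an ordinary word or phrase, so punctuation and quotes
    # only separate its words.
    return keyword, re.findall(r"[^\W_]+", value)

print(parse_constraint("url:home.html"))
print(parse_constraint('title:"The Wall Street Journal"'))
```

Both url:home.html and url:"home html" parse to the same keyword and word list, which is why the two forms are equivalent.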

The importance of parentheses

gold or silver and platinum
gold or (silver and platinum)
(gold or silver) and platinum

The first two queries are equivalent. They return documents containing both silver and platinum, together with documents containing gold. If you want the search to find documents containing platinum and, in addition, in the same document, either gold or silver, you must use the third query pattern.

not gold and silver
(not gold) and silver
not (gold and silver)

The first two queries are equivalent. They return documents containing silver but not gold. If you want the search to eliminate documents that contain both gold and silver, you must use the third query pattern.

gold near silver and platinum
(gold near silver) and platinum
(gold near silver) and (gold near platinum)

The first two queries are equivalent. They return documents containing gold located close to silver, and in addition, in the same document, the word platinum. If you want the search to find documents containing gold located close to silver and, in addition, in the same document, gold close to platinum, you must use the third query pattern.

not gold near silver
not (gold near silver)
silver and not (gold near silver)

The first two queries are equivalent. They eliminate from the search all documents containing silver located close to gold. If you want the search to find documents containing silver but want to eliminate those that contain gold located close to silver, you must use the third query pattern.

gold near silver or platinum
(gold near silver) or platinum

The two queries above are equivalent. They find documents containing gold located close to silver, together with documents containing platinum.

gold near (silver or platinum)
(gold near silver) or (gold near platinum)

The two queries above are equivalent. They find documents containing gold located close to silver, together with documents containing gold located close to platinum.
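The equivalences in this section are what you would get if near were grouped first, then not, then and, then or. That ordering is inferred from the examples above rather than quoted from AltaVista, so treat it as an assumption. The sketch below is a small recursive-descent parser built on that assumption; it handles single words and parentheses only (no quoted phrases), and all names are hypothetical.

```python
import re

OPERATORS = {"and", "or", "not", "near"}

def parse(query):
    """Parse a query into nested tuples such as ("or", "gold", "silver")."""
    tokens = re.findall(r"\(|\)|[^\s()]+", query.lower())
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def primary():
        tok = peek()
        if tok is None or tok in OPERATORS or tok == ")":
            raise ValueError("expected a word or '('")
        take()
        if tok != "(":
            return tok
        node = or_expr()
        if peek() != ")":
            raise ValueError("missing ')'")
        take()
        return node

    def near_expr():              # tightest: near, grouping to the left
        node = primary()
        while peek() == "near":
            take()
            node = ("near", node, primary())
        return node

    def not_expr():               # next: not
        if peek() == "not":
            take()
            return ("not", not_expr())
        return near_expr()

    def and_expr():               # next: and
        node = not_expr()
        while peek() == "and":
            take()
            node = ("and", node, not_expr())
        return node

    def or_expr():                # loosest: or
        node = and_expr()
        while peek() == "or":
            take()
            node = ("or", node, and_expr())
        return node

    tree = or_expr()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return tree

print(parse("gold or silver and platinum"))
print(parse("not gold near silver"))
```

Two queries that this section calls equivalent because of grouping alone (for example, not gold and silver and (not gold) and silver) parse to identical trees.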

Advanced Queries

Advanced Queries use operators and expression syntax to construct queries. The rules for defining words and phrases, capitalization and wildcards are, however, the same as for Simple Queries.

  • You must use the binary operators AND, OR, NEAR, and the unary operator NOT to combine words and phrases. The operators can also be written in lower-case: and, or, not, near. Alternatively, you can use the symbols & for AND, | for OR, ! for NOT, and ~ for NEAR.
  • If you need to use any of these words as search words in a query, you must place them in quotes.
  • You are allowed to use parentheses to group search expressions; in fact it's recommended as less confusing.

Using the binary operators AND, OR, and NEAR, and the unary operator NOT

AND
kayak AND "San Juan Islands"
The operator AND ensures that both operands (here, the word kayak and the phrase San Juan Islands) are present in the resulting documents. The operator AND binds less tightly than juxtaposition.

OR
"Digital Equipment Corporation" OR DEC
The operator OR ensures that at least one of the two operands is present in the resulting documents. The operator OR binds less tightly than the operator AND.

NEAR
Louis NEAR Monier
The operator NEAR ensures that both operands are within ten words of each other in the resulting documents. The operator NEAR binds more tightly than the operator NOT, so not gold near silver means not (gold near silver), and it associates to the left. This query matches Louis Monier, Louis M. Monier and Monier, Louis.
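A proximity test of this kind can be sketched by comparing word positions. The function name near is hypothetical, and reading "within ten words" as "word positions differ by at most ten" is an assumption; the comparison is made in lower-case for simplicity.

```python
import re

def near(document, first, second, window=10):
    """True if first and second occur within window words of each other."""
    doc_words = re.findall(r"[^\W_]+", document.lower())
    first_at = [i for i, w in enumerate(doc_words) if w == first.lower()]
    second_at = [i for i, w in enumerate(doc_words) if w == second.lower()]
    # Order does not matter: only the distance between positions counts.
    return any(abs(i - j) <= window for i in first_at for j in second_at)

for text in ["Louis Monier", "Louis M. Monier", "Monier, Louis"]:
    print(text, near(text, "Louis", "Monier"))
```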

AND NOT
vegetable AND NOT "brussel sprouts"
The operator NOT is used to exclude words or phrases from a query. The operator NOT binds more tightly than the operator AND, so not gold and silver means (not gold) and silver. This query is equivalent to vegetable and (not "brussel sprouts"). Do not use vegetable NOT "brussel sprouts"; this query is syntactically illegal.



Blacksburg Electronic Village