Architect-On-Duty architect-on-duty.com                                      Last Update 20061209

Boolean Logic - History and Basic Information

George Boole was a math whiz born in England in 1815, the year Napoleon was being hassled at Waterloo. George was the son of a shoemaker who couldn't afford to send him to Oxford or Cambridge, so young George educated himself for the most part. This self-taught scholar was an expert linguist, a teacher, an aspiring clergyman and a father of five daughters. His bride was the niece of George Everest, for whom that notable mountain is named. But his main claim to fame is a form of symbolic logic, later christened Boolean algebra, in which there are only two ultimate values--true or false.

Boole was a contemporary of a couple of other 19th century English pioneers of computer science. Charles Babbage was busily developing an "analytical engine," the ancestor of the digital computer of today, just about the time George was defining his logic. And Ada Byron King, Countess of Lovelace and daughter of the poet, Lord Byron, worked closely with Babbage on what the public of that era deemed his greatest folly. Ada is thought to be the first computer programmer. She theorized that punch cards, used in weaving intricate designs on Jacquard looms, could be employed as the "software" to give life to Babbage's "hardware." And she might have pulled it off had she not died at the tender age of 36. But IBM sure liked her idea!

When computers finally got themselves more fully developed in the mid-20th century, they were based on the binary number system in which each and every bit has one of only two values--zero or one. (The proverbial switch--on or off--signified by a difference in voltage!) In 1938 an MIT grad student, Claude Shannon, noticed that George's true/false logic system fit nicely with computer science's 0/1 binary bent. Thanks to Shannon, Boole's logic was discovered to be the most sensible way to sift through vast amounts of computer-based data and the Boolean search was born.

Boolean logic uses punctuation and a handful of small connecting words to narrow a search. With the mundane words AND, OR and NOT, and sometimes NEAR, as operators, and occasionally quotation marks or brackets, wheat generally emerges from chaff. Since a picture is said to be worth a thousand words, take a look at this site for a graphic representation in living colour of how Boolean logic narrows down a search. This link will save several dozen paragraphs of confusing descriptions about overlapping circles or boxes and it will give you a pretty standard illustration of the classic diagrams dreamed up by John Venn, another Boole peer, representing the AND, OR and NOT logic concepts.

As a rule, AND trumps OR, so you may need to use parentheses to clarify the search:

A or B and not C or D might be read as

A or (B and not C) or D unless you specify

(A or B) and not (C or D)
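
The grouping rule above can be checked mechanically. As a small sketch in Python, whose not/and/or keywords happen to follow the same precedence (NOT binds tightest, then AND, then OR), the snippet below tries all sixteen combinations of A, B, C and D and compares the three readings. The function names are made up for illustration, and a particular search engine may order its operators differently.

```python
from itertools import product

def unparenthesized(a, b, c, d):
    # Python applies `not` first, then `and`, then `or`.
    return a or b and not c or d

def default_reading(a, b, c, d):
    # The grouping the text says an engine might assume.
    return a or (b and not c) or d

def forced_reading(a, b, c, d):
    # The grouping you get only by adding parentheses yourself.
    return (a or b) and not (c or d)

rows = list(product([False, True], repeat=4))

# The bare expression always agrees with the default reading ...
same_as_default = all(unparenthesized(*r) == default_reading(*r) for r in rows)

# ... but the default and forced readings disagree for some inputs.
differing = [r for r in rows if default_reading(*r) != forced_reading(*r)]

print(same_as_default)  # True
print(len(differing))   # 10
```

In other words, leaving the parentheses out changes the answer for ten of the sixteen possible cases, which is why it pays to spell the grouping out.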

Some engines allow the use of quotation marks in place of brackets. And some engines will let you do "proximity searches" using the operator NEAR. For example: mortality AND (auto NEAR crash) may give you a list of hits on car accident survival statistics where the word auto is not necessarily always exactly adjacent to the word crash.

Wasn't that enlightening? Now that you've got a grip on how that's done, let's go over a few more tips for successful searching. The Internet has been likened to a library after an earthquake. All the books are still there, but they're scattered all over the floor. And if the card catalogue survived the quake, you'd at least be able to tell what part of the library your book was in, if not the shelf it used to be on. A search engine is like a card catalogue. It can give you a general sense of direction, and the more you can tell it, the more it can tell you about how to locate the information you need in the gigantic database that is the Internet.

But not all search engines are created equal. In fact some search engines are fast becoming directories a la Yahoo. AltaVista is my personal favourite. It's an industrial-strength, keyword-based search engine with both simple and advanced (Boolean) options, and it's fast--and getting faster since it was recently acquired by Compaq. Infoseek is also a keyword-based engine, and a great one, but instead of Boolean operators, it uses + and - symbols. Excite is keyword-plus-concept based, while Lycos is keyword-based but tending toward becoming a subject directory. Webcrawler now has Boolean options, but is still somewhat slow. HotBot is flashy and has some nice special options, while Yahoo is the granddaddy of directories, a subject index extraordinaire, which doesn't actually search the entire Web or Usenet, but is fast becoming the leading "portal" service--next to AOL, that is.

Each of the engines or directories has specific search instructions right on its site for your edification and enlightenment. But if you don't want to bother with their subtle differences, you can always use the "meta" solution--Internet Sleuth. With this tool, you can search all engines and directories in one fell swoop.

 

QUICK ADVICE ON BOOLEAN SEARCHING

The Internet is a vast computer database. As such, its contents must be searched according to the rules of computer database searching. Much database searching is based on the principles of Boolean logic. Boolean logic refers to the logical relationship among search terms, and is named for the British mathematician George Boole.

On Internet search engines, the options to construct logical relationships among search terms extend beyond the traditional practice of Boolean searching. This will be covered in the section below, Boolean Searching on the Internet.

Boolean logic consists of three logical operators:

  • OR
  • AND
  • NOT

Each operator can be visually described by using Venn diagrams, as shown below.

 

OR
Venn diagram for OR

college OR university

Query:    I would like information about college.

  • In this search, we will retrieve records in which AT LEAST ONE of the search terms is present. We are searching on the terms college and also university since documents containing either of these words might be relevant.
  • This is illustrated by:
    • the shaded circle with the word college representing all the records that contain the word "college"
    • the shaded circle with the word university representing all the records that contain the word "university"
    • the shaded overlap area representing all the records that contain both "college" and "university"

OR logic is most commonly used to search for synonymous terms or concepts.

Here is an example of how OR logic works:

Search terms                          Results

college                            17,320,770
university                         33,685,205
college OR university              33,702,660

OR logic collates the results to retrieve all the unique records containing one term, the other, or both.
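
In set terms, OR is a union, which is why the combined count is smaller than the sum of the two individual counts: a record containing both words is counted only once. Here is a minimal sketch in Python, using the counts from the table above and assuming they are exact (real engines often report estimates). The variable names and the tiny record sets are illustrative.

```python
# Counts taken from the table above.
college = 17_320_770
university = 33_685_205
college_or_university = 33_702_660

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|,
# so the overlap is implied by the three reported counts.
both = college + university - college_or_university
print(both)  # 17303315

# The same identity on a tiny, made-up collection of record IDs.
has_college = {1, 2, 3}
has_university = {3, 4}
union = has_college | has_university
overlap = has_college & has_university
print(sorted(union))  # [1, 2, 3, 4]
print(len(union) == len(has_college) + len(has_university) - len(overlap))  # True
```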

The more terms or concepts we combine in a search with OR logic, the more records we will retrieve.

Venn diagram for OR

For example:

Search terms                          Results

college                            17,320,770
university                         33,685,205
college OR university              33,702,660
college OR university OR campus    33,703,082


AND
Venn diagram for AND

poverty AND crime

Query:    I'm interested in the relationship between poverty and crime.

  • In this search, we retrieve records in which BOTH of the search terms are present
  • This is illustrated by the shaded area overlapping the two circles representing all the records that contain both the word "poverty" and the word "crime"
  • Notice how we do not retrieve any records with only "poverty" or only "crime"

Here is an example of how AND logic works:

Search terms                          Results

poverty                               783,447
crime                               2,962,165
poverty AND crime                       1,677

The more terms or concepts we combine in a search with AND logic, the fewer records we will retrieve.

Venn diagram for AND

For example:

Search terms                          Results

poverty                               783,447
crime                               2,962,165
poverty AND crime                       1,677
poverty AND crime AND gender               76
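
AND corresponds to set intersection, so each extra term can only keep or shrink the result set. A small sketch in Python over a made-up inverted index follows; the record IDs are invented for illustration and are unrelated to the counts above, and the function name search_and is hypothetical.

```python
# Hypothetical inverted index: term -> set of record IDs containing it.
index = {
    "poverty": {1, 2, 3, 5, 8},
    "crime": {2, 3, 5, 7, 9},
    "gender": {3, 4, 9},
}

def search_and(*terms):
    """Return the IDs of records that contain every term."""
    result = set(index[terms[0]])
    for term in terms[1:]:
        result &= index[term]
    return result

two_terms = search_and("poverty", "crime")
three_terms = search_and("poverty", "crime", "gender")

print(sorted(two_terms))    # [2, 3, 5]
print(sorted(three_terms))  # [3]
```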

A few Internet search engines make use of the proximity operator NEAR. A proximity operator determines the closeness of terms within a source document. NEAR is a restrictive AND. The closeness of the search terms is determined by the particular search engine. For example, NEAR in AltaVista (Power Search) is 10 words. As another example, Google uses proximity searching by default.
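
A proximity test can be sketched as follows. This Python example assumes a simple whitespace tokenizer and treats NEAR as "the two words occur within N word positions of each other", with N set to 10 to echo the AltaVista figure above. How a real engine tokenizes text and measures distance is not specified here, so treat the function near as hypothetical.

```python
def near(text, word_a, word_b, max_distance=10):
    """Return True if word_a and word_b occur within max_distance words."""
    # Lowercase each word and trim surrounding punctuation.
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    positions_a = [i for i, w in enumerate(words) if w == word_a]
    positions_b = [i for i, w in enumerate(words) if w == word_b]
    return any(abs(i - j) <= max_distance
               for i in positions_a for j in positions_b)

close = "Mortality rose after the auto was involved in a highway crash."
far = "auto " + "filler " * 20 + "crash"

print(near(close, "auto", "crash"))  # True
print(near(far, "auto", "crash"))    # False
```

Both sample texts would satisfy auto AND crash; only the first satisfies auto NEAR crash, which is what makes NEAR the more restrictive of the two.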


NOT
Venn diagram for NOT

cats NOT dogs

Query:    I want to see information about cats, but I want to avoid seeing anything about dogs.

  • In this search, we retrieve records in which the first term is present and the second term is absent
  • This is illustrated by the shaded area with the word cats representing all the records that contain the word "cats" but not the word "dogs"
  • No records are retrieved in which the word "dogs" appears, even if the word "cats" appears there too

Here is an example of how NOT logic works:

Search terms                          Results

cats                                3,651,252
dogs                                4,556,515
cats NOT dogs                          81,497

NOT logic excludes records from your search results. Be careful when you use NOT: the term you do want may be present in an important way in documents that also contain the word you wish to avoid.
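
NOT behaves like set difference, and the caution above is easy to see on a toy collection. In this Python sketch the documents, their IDs and the helper records_with are all invented for illustration.

```python
# Hypothetical mini-collection: record ID -> text.
docs = {
    1: "caring for cats at home",
    2: "dogs and their owners",
    3: "why cats and dogs fight, and how to keep cats calm",
}

def records_with(word):
    """Return the IDs of records whose text contains the word."""
    return {doc_id for doc_id, text in docs.items() if word in text.split()}

cats = records_with("cats")
dogs = records_with("dogs")

cats_not_dogs = cats - dogs
print(sorted(cats_not_dogs))  # [1]

# Record 3 is mostly about cats, yet it is dropped because it mentions dogs.
dropped = cats & dogs
print(sorted(dropped))  # [3]
```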


BOOLEAN SEARCHING ON THE INTERNET


When you use an Internet search engine, the use of Boolean logic may be manifested in three distinct ways:

  1. Full Boolean logic with the use of the logical operators
  2. Implied Boolean logic with keyword searching
  3. Predetermined language in a user fill-in template

1. Full Boolean logic with the use of the logical operators

Many search engines offer the option to do full Boolean searching requiring the use of the Boolean logical operators.

Examples:

Query:    I need information about cats.

Boolean logic:    OR

Search:    cats OR felines

Query:    I'm interested in dyslexia in adults.

Boolean logic:    AND

Search:    dyslexia AND adults

Query:    I'm interested in radiation, but not nuclear radiation.

Boolean logic:    NOT

Search:    radiation NOT nuclear

Query:    I want to learn about cat behavior.

Boolean logic:    OR, AND

Search:    (cats OR felines) AND behavior

Note: Use of parentheses in this search is known as forcing the order of processing. In this case, we surround the OR words with parentheses so that the search engine will first process this part of the search. Next, the search engine will combine this result with the last part of the search. Using this method, we are assured that the OR terms are kept together as a logical unit.
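
To see why the parentheses matter, here is a sketch in Python using set operators, where | plays the role of OR and & the role of AND. Python's & binds more tightly than |, which mirrors an engine that processes AND before OR. The index contents are made up for illustration.

```python
# Hypothetical inverted index: term -> set of record IDs.
index = {
    "cats": {1, 2, 3},
    "felines": {3, 4},
    "behavior": {2, 4, 5},
}

# (cats OR felines) AND behavior
grouped = (index["cats"] | index["felines"]) & index["behavior"]

# cats OR felines AND behavior, read as cats OR (felines AND behavior)
ungrouped = index["cats"] | index["felines"] & index["behavior"]

print(sorted(grouped))    # [2, 4]
print(sorted(ungrouped))  # [1, 2, 3, 4]
```

Without the parentheses, records 1 and 3 come back even though they never mention behavior.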

2. Implied Boolean logic with keyword searching

Keyword searching refers to a search type in which you enter terms representing the concepts you wish to retrieve. Boolean operators are not used.

Implied Boolean logic refers to a search in which symbols are used to represent Boolean logical operators. In this type of search on the Internet, the absence of a symbol is also significant, as the space between keywords defaults to either OR logic or AND logic. Many well-known search engines traditionally defaulted to OR logic, but as a rule are moving away from the practice and defaulting to AND.

Implied Boolean logic has become so common in Web searching that it may be considered a de facto standard.

Examples:

Query:    I need information about cats.

Boolean logic:    OR

Search:    cats    felines

This example holds true for the search engines that interpret the space between keywords as the Boolean OR. To find out which logic the engine is using as the default, consult the help files at the site. Nowadays, there are few engines that use OR logic as the default.

Query:    I'm interested in dyslexia in adults.

Boolean logic:    AND

Search:    +dyslexia    +adults

Query:    I'm interested in radiation, but not nuclear radiation.

Boolean logic:    NOT

Search:    radiation    -nuclear

Query:    I want to learn about cat behavior.

Boolean logic:    OR, AND

Search:    cats    felines    +behavior
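
One way to picture implied Boolean logic is as a small translation step. The Python sketch below sorts each keyword into required (+), excluded (-) or optional (no symbol) and then tests a document against the result. It assumes an engine whose default for unmarked terms is OR; the function names and the matching rule are illustrative, not any particular engine's behavior.

```python
def parse_query(query):
    """Split a +/- keyword query into required, excluded and optional terms."""
    required, excluded, optional = [], [], []
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            optional.append(token)
    return required, excluded, optional

def matches(text, query):
    """Apply the parsed query to one document (unmarked terms are ORed)."""
    words = set(text.lower().split())
    required, excluded, optional = parse_query(query)
    if any(term in words for term in excluded):
        return False
    if not all(term in words for term in required):
        return False
    # If optional terms were given, at least one must be present.
    return not optional or any(term in words for term in optional)

print(parse_query("cats    felines    +behavior"))
# (['behavior'], [], ['cats', 'felines'])
print(matches("feeding behavior of felines", "cats felines +behavior"))  # True
print(matches("nuclear radiation safety", "radiation -nuclear"))         # False
```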

3. Predetermined language in a user fill-in template

Some search engines offer a search template which allows the user to choose the Boolean operator from a menu. Often the logical operator is expressed with substitute language rather than with the operator itself.

Examples:

Query:    I need information about cats.

Boolean logic:    OR

Search:    Any of these words/Can contain the words/Should contain the words

Query:    I'm interested in dyslexia in adults.

Boolean logic:    AND

Search:    All of these words/Must contain the words

Query:    I'm interested in radiation, but not nuclear radiation.

Boolean logic:    NOT

Search:    Must not contain the words/Should not contain the words

Query:    I want to learn about cat behavior.

Boolean logic:    OR, AND

Search:    Combine options as above if the template allows multiple search statements
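
A template is essentially a lookup from friendly phrases to operators. As a rough sketch in Python, the mapping below uses three of the phrases listed above. The function build_query and its field layout are hypothetical, and the sketch assumes that each template field holds a space-separated list of words.

```python
# Template phrases from the examples above, mapped to Boolean operators.
TEMPLATE_TO_OPERATOR = {
    "any of these words": "OR",
    "all of these words": "AND",
    "must not contain the words": "NOT",
}

def build_query(fields):
    """Turn {template phrase: words} into a full Boolean search string."""
    positives, negatives = [], []
    for phrase, words in fields.items():
        operator = TEMPLATE_TO_OPERATOR[phrase]
        terms = words.split()
        if operator == "NOT":
            negatives.extend(terms)
        else:
            joined = f" {operator} ".join(terms)
            # Parenthesize multi-word groups so they stay a logical unit.
            positives.append(f"({joined})" if len(terms) > 1 else joined)
    query = " AND ".join(positives)
    return query + "".join(f" NOT {term}" for term in negatives)

query = build_query({
    "any of these words": "cats felines",
    "all of these words": "behavior",
})
print(query)  # (cats OR felines) AND behavior
```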

Quick Comparison Chart:
Full Boolean vs. Implied Boolean vs. Templates

 

Operator     Full Boolean             Implied Boolean           Template Terminology

OR           college or university    college    university     any of these words
                                      (*see note below)         can contain the words
                                                                should contain the words

AND          poverty and crime        +poverty    +crime        all of these words
                                                                must contain the words

NOT          cats not dogs            cats    -dogs             must not contain the words
                                                                should not contain the words

NEAR, etc.   cats near dogs           N/A                       near

* This search statement will resolve to AND logic at search engines that use AND as the default. Nowadays most search engines default to AND. Always play it safe, however, and consult the Help files at each site to find out which logic is the default.

LAST UPDATE    09/12/2006