Savvy Searching: There's more to searching than hitting the enter key!

This reading discusses various strategies and tools for searching the Internet. This is a crowded field, with exotic names such as Dogpile, Google, and Yahoo!, to name a few. Hopefully this lesson will give you a flavor of the myriad tools (usually called "search engines") and strategies available for searching the Internet. While most folks believe they're fairly proficient at web searching, in fact most folks don't know the most basic elements of how searching works or how to be a more savvy searcher. There are also quite a number of very important but often unknown aspects of searching that dramatically affect the integrity, safety, and overall nature of the searching experience. It's not as simple as you may think! So let's make sure you're not one of those folks! We will start with the basics of searching. Later, you'll learn more about many of the risks, challenges, and leading-edge issues that affect your searching much more than you may think.

Searching the World Wide Web and More: Some General Tips for Searching

First, search engines typically use Boolean logical operators to do their searching. Basically, this means that you can use specific words to set up a fairly sophisticated search that can be precisely defined. Search engines often recognize the operators "and," "not," "or," and parentheses. The particular syntax that activates these logical operators varies from one search tool to another, but the general principle applies to all search engines. I'll explain each of these briefly below:

And - this allows you to search for items or directories that include two or more terms of interest. Both terms will have to be in the document or directory title to be included in your search results. An example of an "and" search is the following:

Words to search for: utah and weather

(Note: Most search engines will assume an "and" if you do not place an operator between the two words - "Utah weather" will return the same search results).

Not - this operator allows you to exclude a term from the items you are searching for. You might wish to locate documents on foreign language programs except for ones involving Spanish. This search would look like the following:

Words to search for: language programs not Spanish

Or - the "or" operator should only be used rarely because it will return items that include _either_ of the terms you enter. For instance, the following search would give you a huge number of results:

Words to search for: ibm or mac

Parentheses - Parentheses allow you to do rather complex searches by breaking your search into separate elements. For instance, say you want to find items on travel in either Spain or Portugal. Your search would look like this:

Words to search for: travel and (Spain or Portugal)

There are several combinations of this type that can be made. My advice is to avoid making your search too complex. Start with a fairly specific search, and then simplify it one piece at a time until you get the results you need.
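To see what the operators above are doing behind the scenes, here is a minimal Python sketch. It is not how any real search engine is built; the four document titles and their words are invented for illustration, and the helper name `having` is my own. The point is only that "and," "or," and "not" behave like intersection, union, and difference on sets of matching documents.

```python
# Toy collection: document title -> set of lowercase words it contains.
# These titles and words are invented for illustration only.
DOCS = {
    "Iberian rail guide": {"travel", "spain", "portugal", "train"},
    "Lisbon on a budget": {"travel", "portugal", "budget"},
    "Spanish grammar": {"spain", "language", "grammar"},
    "Utah ski report": {"utah", "weather", "travel"},
}


def having(word):
    """Return the set of titles whose text contains the word."""
    return {title for title, words in DOCS.items() if word.lower() in words}


# travel and (Spain or Portugal): "and" is intersection, "or" is union
and_or = having("travel") & (having("Spain") | having("Portugal"))

# travel not Spain: "not" is set difference
and_not = having("travel") - having("Spain")

print(sorted(and_or))
print(sorted(and_not))
```

Notice how the parentheses matter: the "or" is worked out first, and only then is the result narrowed by "and."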

Second, search engines allow for word truncation. This means that you can enter a word root as a search term and then add an asterisk at the end (sometimes the asterisk is not required). The search tool will search for any word that begins with that root. For example, the search term "librar*" would return items with the following words:

library

libraries

librarian

librarians

Truncation can be quite useful, particularly with the issue of plurals and singulars. Say you were interested in items on "librarians" or "librarian." You could enter "librarian*" as your search term and get items with either term in them.
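Here is a small Python sketch of the truncation idea. The word list and the function name `truncation_match` are made up for this example, and real search tools differ in how (or whether) they support the asterisk, so treat this as an illustration of the principle rather than any engine's actual behavior.

```python
# A made-up word list standing in for the words found in a set of documents.
WORDS = ["library", "libraries", "librarian", "librarians", "liberty", "book"]


def truncation_match(term, words):
    """Match words against a search term; a trailing '*' means 'any ending'."""
    if term.endswith("*"):
        root = term[:-1].lower()
        return [w for w in words if w.lower().startswith(root)]
    # Without the asterisk, only an exact match counts.
    return [w for w in words if w.lower() == term.lower()]


print(truncation_match("librar*", WORDS))
print(truncation_match("librarian*", WORDS))
```

The shorter the root, the wider the net: "librar*" catches all four library-related words, while "librarian*" catches only the singular and plural of "librarian."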

Third, typically you can control how many results or "hits" a search engine will display per page. A standard search will default to no more than 50 or 100 items per page. You can usually modify this parameter via a pop-up menu or dialog box at the site.

How Search Engines Work

In the past few years, many very sophisticated web-based search tools have been developed which can search web pages, Usenet news postings, electronic phonebooks and more. There is a lot more to searching the Internet than Google. Search engines are essentially databases in which computer users may search by asking questions, called "queries." As useful and powerful as many of these search sites are, they are not as comprehensive or as useful as they might appear. Google is one of those databases, but like any search engine database, it doesn't catalog the whole Internet, not by a long shot!

When we say that Google or other search engines don't cover the entire Internet, what does that mean? A recent Associated Press report cited a study of 11 of the largest search sites. The study found that even the most powerful sites only cover about one-sixth of the web pages available on the Internet at any particular time and that the time it takes them to list (i.e., find and index) new sites is growing. The study found that it can take a new web page six months or more to make it into a search engine's listings. With nearly one billion websites out there, one-sixth coverage means any one search engine could be missing roughly five-sixths of them, or more than 800 million sites. Also, search engines vary widely in their quality, speed, and ease of use.

There are other problems with our current web search approaches. A recent article in the New York Times reviewed the so-called "deep web" problem. Search engines rely on programs known as crawlers (or spiders) that gather information by following the trails of hyperlinks that tie the Web together (see more on this below). While that approach works well for the pages that make up the surface Web, these programs have a harder time penetrating databases that are set up to respond to typed queries. Click here to read more about the strategies researchers are developing to mine the "deep web."

In general, search engines use two methods to gather new web sites for their listings. First, they accept self "nominations" from web page developers. You will see a link on most search engine home pages indicating where someone can go to add pages to that search site. Second, most search sites employ sophisticated software robots or "spiders" whose job is to continuously surf the web "harvesting" new web pages. These spiders use sets of rules (called algorithms) to place sites in more-or-less relevant categories. Not all of the spiders do a great job of categorizing the web sites they find (as you surely have found if you have used them).
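The link-following behavior of spiders, and the "deep web" gap it leaves, can be sketched in a few lines of Python. Everything here is hypothetical: the five page names and their links are invented, and a real spider fetches live pages over the network rather than reading a dictionary. The sketch only shows that a page nobody links to never gets found.

```python
from collections import deque

# Hypothetical miniature "web": page name -> pages it links to.
WEB = {
    "home": ["news", "catalog-search-form"],
    "news": ["home", "article"],
    "article": [],
    "catalog-search-form": [],  # a form; results appear only after a typed query
    "catalog-record-123": [],   # reachable only by querying the form ("deep web")
}


def crawl(start, web):
    """Breadth-first crawl: visit every page reachable by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in web.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


indexed = crawl("home", WEB)
missed = set(WEB) - indexed

print(sorted(indexed))
print(sorted(missed))
```

The spider reaches the search form itself, but not the record that appears only after someone types a query into that form.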

The best search sites don't rely on computers to do all of their categorizing. They do it the old fashioned way--with librarians! That's one of the reasons scholarly index sites such as the ones found in the Lindell Library are more helpful than a traditional search engine. Despite its popularity, Google's index is much less helpful as an academic database than a dedicated library-based search tool such as PsychLit. Smaller special-purpose databases such as PsychLit make up for their relative smallness with quality. The PsychLit index is very high quality because a real live person actually looks at the items before they are placed in it.

Library databases such as PsychLit and others are very high quality as compared with general-purpose search engines such as Google, Yahoo! and Bing, but there's a trade-off. The subscription to PsychLit costs the library about $15,000 per year. Remember one of the basic rules of media literacy: "If you don't pay for it, then you are the product." Likewise, if you do pay for it, you often get something of value in return--in this case, a higher quality vetted database of resources. For academic and scholarly work, it is almost always better to log in to your library's website and use its resources. As another wise person once said, "There's no free lunch."

The best way to test search sites is to try several and then bookmark your favorites for future visits. Typically these sites require a Web browser, although some may use stand-alone applications, some of which are Java-based. A partial list of some popular general-purpose and special-purpose search engines and indices may be found below.

Start With The Right Site

When you're looking for something specific, like movie reviews, zip codes, legislation, etc., the key to finding useful data on the Internet may be starting with the right search resource. Many sites are specific to one or just a few topics or databases, making them a much better resource for that specific domain than the big generic search engines like Google. Here's a quick review of some of the sites best suited for finding specific kinds of information.

To find information about... | ...check here.
Government | www.usa.gov
Health | www.medlineplus.gov
Law | Legal Information (from LexisNexis)
Movies | www.imdb.com
People | www.accurint.com (also from LexisNexis--this site is not free)
News | therealnews.com
Reference | www.refdesk.com
Words | www.onelook.com

Do More With Google

Google has some hidden features that may be extremely useful. Here's a table describing some of these features. Go to http://www.google.com/help/features.html for a complete list.

Feature: | What to type: | Result you get:
Dictionary | define:word | Links to definitions
Calculator | 10*35+4 (or any other equation) | The answer
Phone Book | first name, last name, zip code, or last name, zip code | Phone book matches
Special codes | package tracking numbers, area codes, vehicle ID numbers | Relevant results
Stock Quotes | stocks:ticker symbol | Recent stock quotes
Maps | street address, city, state, or zip code | Links to maps
Who Links to... | link:site URL | Websites that link to that URL
Search only one website | search term site:site URL, e.g., graduation site:www.augsburg.edu | Search results limited to that site
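If you ever want to build one of these searches from a program or a bookmark rather than typing it, the sketch below shows one way to do it in Python. The function name `google_search_url` is my own, and the address pattern `https://www.google.com/search?q=...` is an assumption about how Google's search page is addressed; it is not something Google guarantees. The `site:` and `define:` operators come from the table above.

```python
from urllib.parse import urlencode


def google_search_url(terms, site=None):
    """Build a search address, optionally limited to one website via 'site:'."""
    query = terms if site is None else f"{terms} site:{site}"
    # urlencode turns spaces into '+' and escapes characters such as ':'.
    # The base address below is an assumed pattern, not an official API.
    return "https://www.google.com/search?" + urlencode({"q": query})


print(google_search_url("graduation", site="www.augsburg.edu"))
print(google_search_url("define:savvy"))
```

Pasting either printed address into a browser should run the same search you would get by typing the operator into the search box yourself.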

A website has been developed that automates access to many of these special Google features. It's called Soople, and is available at: <http://www.soople.com/>.

Even More Google-ology

Google has prepared an online course to assist users in becoming power-searchers. The class is an actual online course: you need to register and attend (virtually) and you receive a certificate when it's completed. New sections of the course open periodically. Click here to learn more about the next class.

Other Issues to Consider: Security, Safety, and the "Filter Bubble"

As a recent New York Times article discussed, as the mobile web becomes more important, the nature of searching is changing. The traditional search (type in a search term, then sort through a list of blue-underlined hyperlinks) is being replaced by plain-language searching (Hello Siri? Where's the closest coffee shop?) and focused search (going directly to Amazon.com to shop-search). Having to attend to the details of properly formatted search queries (such as we discussed above) is giving way to plain-language searches and context-based search where you don't really search at all, per se. For example, Android-powered smart phones using Google Now and Google Assistant (Google's answer to Siri) provide you with home-screen updates regarding traffic conditions, but only at 5:00 p.m. on a weekday (or whenever your smart phone determines that you commute). How does your smart phone do that? It's paying attention to your movements and habits, that's how! Is this a good thing or a bad thing?

Artificial Intelligence-powered digital assistants such as Amazon's Alexa and Google's Assistant are now available in a hands-free mode via the Amazon Echo and Google Home devices. Again, searching is morphing into something completely different than the traditional search. It's too early to tell exactly how this will impact schools and learning, but I can't imagine children who have these devices in their homes aren't already using them for homework help.

And then there's the potential for privacy concerns: Alexa and Google are always listening. How do you think they know when to respond when you say, "Alexa," or "Hey Google"?

Of course, this raises other questions on which we need to reflect: If your phone is tracking your movements and behavior (to give you score updates for your favorite team or notifications if your flight is delayed), and it's always listening to everything you say, who is in charge of that information and precisely what are they doing with it? As Danny Sullivan, an editor of Search Engine Land, said regarding all of the information users type into those "little boxes" and provide by agreeing to be tracked, “You have millions of people a day saying exactly what they want, and if you’re an advertiser, it’s a beautiful vehicle.” Are your search queries some advertiser's "beautiful vehicle"? Did you even know this was happening?

This tracking of user behavior leads to the search tools' anticipating your wishes. At first glance, that sounds good, right? But think about it. If you regularly read news only from MSNBC or Fox News, and the search engine then anticipates your preferences, you'll begin to see only search results that conform to your pre-existing political biases. This anticipatory algorithm is one of the reasons why you get different search results when you use different search engines. This phenomenon has a name: it is called "the filter bubble," and it is one of the many new risks experts see as searching continues to evolve. To watch a TED Talk on this topic, click here. The search engine DuckDuckGo was developed specifically to deal with the emerging problem of the filter bubble. (See below for more about DuckDuckGo and how it attempts to circumvent the filter bubble.)
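A toy Python example may make the filter bubble more concrete. Real search engines do not publish how they rank results, so this is emphatically not any company's algorithm; the outlet names, headlines, and the `personalize` function are all invented. It simply shows how re-ordering results by a person's past clicks gives two readers two different views of the same search.

```python
from collections import Counter


def personalize(results, click_history):
    """Re-rank results so sources the user clicked most often come first."""
    clicks = Counter(click_history)
    # sorted() is stable, so sources with equal clicks keep their original order.
    return sorted(results, key=lambda r: -clicks[r["source"]])


# Invented results for a single search, in their "neutral" order.
RESULTS = [
    {"source": "Outlet A", "title": "Budget vote: analysis"},
    {"source": "Outlet B", "title": "Budget vote: reaction"},
    {"source": "Outlet C", "title": "Budget vote: the numbers"},
]

reader_1 = personalize(RESULTS, ["Outlet B", "Outlet B", "Outlet C"])
reader_2 = personalize(RESULTS, ["Outlet A"])

print([r["source"] for r in reader_1])
print([r["source"] for r in reader_2])
```

Same search, same three results, but each reader sees a favorite source at the top. Repeat that over thousands of searches and you have a bubble.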

Another wrinkle is reflected in Samsung's decision to incorporate internet-based "whitepages" into their mobile devices. This may permit you to have a better idea of who is calling even if that number is not in your contact/dialer list. Of course, inasmuch as this service is "free," you know what that means; if you don't pay for it, then you are the product!

Of course, the search engine companies aren't the only ones watching what you're searching. The U.S. Government's PRISM project is one effort to fight terrorism, but in order to do it, they have to monitor phone calls, web searches, text messages and tweets. Read more about it here. As I mentioned above, there's a lot more to searching than typing a word in the box and clicking the blue links. A lot more.

Security and Privacy Issues: Beyond Information Searches

If corporate and government tracking of your information searches wasn't enough to worry about, we also need to deal with tracking of your spending and consumer habits. Each of us has a "profile" that businesses use to target ads to us based on our previous behavior on the Internet. Search for the closest Italian restaurant? You'll probably see more ads for Italian restaurants in the future. Update your Facebook status to "engaged"? You can bet you'll see bridal ads (if you're a woman) or vacation ads (if you're a man). And it's more than just Google and Facebook. Your mobile phone provider and your Internet Service Provider are in on the act, too. It's all very sophisticated. In fact, the practitioners of these "black arts" claim they can predict your vote in the upcoming election just by looking at the data they have gathered about you. For more information on how this all works and what you can do about it, please take a look at this recent YahooTech article on the subject.

As teachers and leaders of young learners, we need to attend to these activities and help prepare our students to navigate an Internet where everything you do and type is a commodity. Remember: If you don't pay for it, then you are the product. Searching is not as simple as most users think. I hope you, like me, are interested in understanding these issues so we don't become pawns in someone else's shell game.


DuckDuckGo is a search engine which bills itself as, "The search engine that doesn't track you." All modern digital devices track users' activities and habits. This is why you get personalized ads on Facebook pages and how your iPad knows to connect to your favorite WiFi network. But some have noted an interesting and perhaps unintended result of tracking user searches, something called "the filter bubble." (For more on this phenomenon, see the discussion above.) DuckDuckGo is an attempt to provide relevant search results without maintaining records of users' searches beyond the current search. See what you think at:

https://duckduckgo.com/

Where To Go for More Information About Search Engines

Search Engine Showdown is a web site which offers comprehensive comparisons of the major search engines including lists of features of the major search sites, reviews, hints on effective search strategies, statistics on usage of major sites, and more. The site even provides links to a Usenet newsgroup and a listserv on the topic of search engine technology and performance.

http://www.searchengineshowdown.com/

Happy searching!


Internet Lessons version 2.2. Copyright of lessons (C) 2017 by Joseph A. Erickson, All Rights Reserved. Permission Granted for Individual Usage.

If you plan to distribute multiple copies of this work, please contact the author.


