Glossary Terms
S
Safari
A popular Apple browser.
Salton, Gerard
Scientist who pioneered the information retrieval field.
See also: A Theory of Indexing - 1975 book by Gerard Salton
Scumware
Intrusive software and programs which usually target ads, violate privacy, and are often installed without
the computer owner knowing what the software does.
Search History
Many search engines store user search history information. This data can be used for better ad targeting
or to make old information more findable.
Search engines may also determine what a document is about and how much they trust a domain based on
aggregate usage data. A high volume of brand related search queries is a strong signal of quality.
Search Engine
A tool or device used to find relevant information. Search engines consist of a spider, index, relevancy
algorithms and search results.
SEM
Search engine marketing.
Also known as: Search Marketing
SEO
Search engine optimization is the art and science of publishing information and marketing it in a manner
that helps search engines understand that your information is relevant to related search queries.
SEO consists largely of keyword research, SEO copywriting, information architecture, link building, brand
building, building mindshare, reputation management, and viral marketing.
SEO Copywriting
Writing and formatting copy in a way that will help make the documents appear relevant to a wide array of
relevant search queries. There are two main ways to write titles that are SEO friendly:
Write literal titles that are well aligned with things people search for. This works well if you need backfill
content for your site or already have an amazingly authoritative site.
Write page titles that are exceptionally compelling to link at. If enough people link at them then your pages
and site will rank for many relevant queries even if the keywords are not in the page titles.
See also:
Search Engine Friendly Copywriting - What Does 'Write Naturally' Mean for SEO?
Copyblogger: Magnetic Headlines
SERP
Search Engine Results Page is the page on which the search engines show the results for a search query.
Search Marketing
Marketing a website in search engines. Typically via SEO, buying pay per click ads, and paid inclusion.
Server
Computer used to host files and serve them to the WWW.
Dedicated servers usually run from $100 to $500 a month. Virtual servers typically run from $5 to $50 per
month.
Server Logs
Files hosted on servers which record website traffic trends and sources.
Server logs typically do not show as much data and are not as user friendly as analytics software. Not all
hosts provide server logs.
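As a rough illustration of the traffic sources a server log can reveal, the sketch below pulls the referrer out of each log line and counts them. It assumes the widely used "combined" log layout; your host's format may differ, and the sample lines, addresses, and URLs are made up for the example.

```python
import re
from collections import Counter

# Assumed layout: the "combined" access log format. Real hosts may differ.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

def count_referrers(lines):
    """Count how many requests arrived from each referring URL."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        # "-" means the browser sent no referrer (direct visit or a spider).
        if fields and fields["referrer"] not in ("", "-"):
            counts[fields["referrer"]] += 1
    return counts

# Hypothetical sample lines, not taken from a real server.
SAMPLE_LINES = [
    '203.0.113.5 - - [10/Oct/2007:13:55:36 -0700] "GET /glossary/ HTTP/1.1" '
    '200 2326 "http://www.example.com/search?q=seo" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2007:13:56:01 -0700] "GET /index.html HTTP/1.1" '
    '200 512 "-" "ExampleBot/1.0"',
]

if __name__ == "__main__":
    for referrer, hits in count_referrers(SAMPLE_LINES).most_common():
        print(hits, referrer)
```

Analytics software does this kind of counting for you, which is one reason it tends to be more user friendly than raw logs.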
Singular Value Decomposition
The process of breaking down a large database to find the document vector (relevance) for various items
by comparing them to other items and documents. Important steps:
Stemming: taking into account the various forms of a word on a page
Local Weighting: increasing the relevance of a given document based on how frequently a term appears in
the document
Global Weighting: increasing the relevance of terms which appear in a small number of pages, as they are
more likely to be on topic than words that appear in almost all documents
Normalization: penalizing long copy and rewarding short copy to allow them fair distribution in results. A
good way of looking at this is like standardizing things to a scale of 100.
Multi dimensional scaling is more efficient than singular value decomposition because it requires
far less computation. When combined with other ranking factors only a rough approximation of
relevance is necessary.
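The weighting steps listed above (not the decomposition itself) can be sketched in a few lines. This is a minimal illustration, assuming term frequency for local weighting, a logarithmic inverse document frequency for global weighting, and unit-length scaling for normalization. Those specific formulas are common choices picked for the example, not necessarily what any search engine uses, and the sample documents are made up.

```python
import math
from collections import Counter

def weight_documents(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists (assumed to be stemmed already).
    """
    n_docs = len(docs)
    # Global weighting: terms found in fewer documents get a larger weight.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}

    vectors = []
    for doc in docs:
        # Local weighting: how often each term appears in this document.
        tf = Counter(doc)
        vec = {term: count * idf[term] for term, count in tf.items()}
        # Normalization: scale to unit length so long copy gets no built-in edge.
        length = math.sqrt(sum(w * w for w in vec.values()))
        if length > 0:
            vec = {term: w / length for term, w in vec.items()}
        vectors.append(vec)
    return vectors

# Hypothetical three-document collection.
DOCS = [
    ["swim", "pool", "swim"],
    ["swim", "lesson"],
    ["pool", "table", "game"],
]
VECTORS = weight_documents(DOCS)

if __name__ == "__main__":
    for doc, vec in zip(DOCS, VECTORS):
        print(doc, {term: round(w, 3) for term, w in vec.items()})
```

In the second document "lesson" ends up weighted above "swim" because it appears in only one document of the collection, which is the global weighting idea at work.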
Siphoning
Techniques used to steal another website's traffic, including the use of spyware or cybersquatting.
Site Map
Page which can be used to help give search engines a secondary route to navigate through your site.
Tips: On large websites the on page navigation should help search engines find all applicable web pages.
On large websites it does not make sense to list every page on the site map, just the most important pages.
Site maps can be used to help redistribute internal link authority toward important pages or sections, or
sections of your site that are seasonally important. Site maps can use slightly different or more descriptive
anchor text than other portions of your site to help search engines understand what your pages are about.
Site maps should be created such that they are useful to humans, not just search engines.
Slashdot
Central editorially driven community news site focusing on technology and nerd related topics created by
Rob Malda.
See also: Slashdot.org
Snippet (see Description)
Social Media
Websites which allow users to create the content that makes the site valuable. A few examples of social media sites are social
bookmarking sites and social news sites.
See also: Del.icio.us - social bookmarking program, Digg - social news site
Spam
Unsolicited email messages.
Search engines also like to outsource their relevancy issues by calling low quality search results spam.
They have vague, ever-changing guidelines which determine what marketing techniques are acceptable at
any given time. Typically search engines try hard not to flag false positives as spam, so most algorithms
are quite lenient, as long as you do not build lots of low quality links, host large quantities of duplicate
content, or perform other actions that are considered well outside of relevancy guidelines. If your site is
banned from a search engine you may request reinclusion after fixing the problem.
See also:
Google Webmaster Guidelines
Microsoft Live Search: Guidelines for successful indexing
Yahoo! Search Content Quality Guidelines
BMW Spamming - Matt Cutts posted about BMW using search spam. Due to their brand strength BMW was
reincluded in Google quickly.
Spamming
The act of creating and distributing spam.
Spider
Search engine crawlers which search or "spider" the web for pages to include in the index.
Many non-traditional search companies have different spiders which perform other tasks. For
example, TurnItInBot searches for plagiarism. Spiders should obey the robots.txt protocol.
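As a small illustration of what obeying robots.txt means, the sketch below uses Python's standard urllib.robotparser module to decide whether a spider may fetch a URL. The robots.txt rules, the ExampleSpider name, and the example.com URLs are hypothetical; a real spider would download the file from the site it is crawling.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: TurnItInBot
Disallow: /
"""

def build_parser(robots_txt):
    """Load robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

def may_crawl(parser, user_agent, url):
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)

PARSER = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    for agent, url in [
        ("ExampleSpider", "http://www.example.com/index.html"),
        ("ExampleSpider", "http://www.example.com/private/page.html"),
        ("TurnItInBot", "http://www.example.com/index.html"),
    ]:
        print(agent, url, may_crawl(PARSER, agent, url))
```

A well-behaved spider runs a check like this before every request and skips any URL the rules disallow.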
Splash Page
Feature rich or elegantly designed web page which typically offers poor usability and does not
offer search engines much content to index.
Make sure your home page has relevant content on it if possible.
Splog
Spam blog, typically consisting of stolen or automated low quality content.
Spyware
Software programs which spy on web users, often used to collect consumer research and to behaviorally
target ads.
See also: Ad Aware - spyware removal software
Stop Badware - site about fighting spyware and other adverse sleazy software programs
Squidoo
Topical lens site created by Seth Godin.
See also:
Squidoo.com
SSI
Server Side Includes are a way to call portions of a page in from another page. SSI makes it easier to
update websites. To use a server side include you have to meet one of the following conditions:
end file names in a .shtml or .shtm extension
use PHP or some other language which makes it easy to include files via that programming language
change your .htaccess file to make .html or .htm files be processed as though they were .shtml files.
The code to create a server side include looks like this: <!--#include virtual="/includes/filename.html" -->
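To make the idea concrete, here is a minimal sketch of what happens to an include directive before the page is sent to the visitor. It is an illustration only: a real web server performs this step itself, the dictionary of files stands in for the server's document root, and the function names and sample page are made up for the example.

```python
import re

# Matches directives of the form <!--#include virtual="/path/to/file" -->
INCLUDE_DIRECTIVE = re.compile(r'<!--#include virtual="([^"]+)"\s*-->')

def expand_includes(page, files):
    """Replace each include directive with the content of the named file.

    files maps a virtual path to its content and stands in for the web
    server's document root (an assumption made for this sketch).
    """
    def replace(match):
        path = match.group(1)
        # Unknown paths become an HTML comment; this error text is invented here.
        return files.get(path, "<!-- include not found: %s -->" % path)
    return INCLUDE_DIRECTIVE.sub(replace, page)

# Hypothetical shared file and page.
FILES = {"/includes/filename.html": "<p>Shared footer</p>"}
PAGE = (
    "<body>\n"
    "<h1>Home</h1>\n"
    '<!--#include virtual="/includes/filename.html" -->\n'
    "</body>"
)
EXPANDED = expand_includes(PAGE, FILES)

if __name__ == "__main__":
    print(EXPANDED)
```

Because every page pulls in the same file, editing that one file updates the whole site, which is why SSI makes websites easier to maintain.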
Static Content
Content which does not change frequently. May also refer to content that does not have any social
elements to it and does not use dynamic programming languages.
Many static sites do well, but the reasons fresh content works great for SEO are:
If you keep building content every day you eventually build a huge archive of content
By frequently updating your content you keep building mindshare, brand equity, and give people fresh
content worth linking at
Stemming
Using the stem of a word to help satisfy search relevancy requirements. Ex: searching for swimming can
return results which contain swim. This usually enhances the quality of search results due to the extreme
diversity of word forms and their usage in the English language.
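The swimming and swim example can be sketched with a deliberately naive suffix stripper. This is only an illustration under simple assumptions: the suffix list and the doubled-consonant rule are invented for the example, and production systems use far more careful algorithms (the Porter stemmer is one well known approach).

```python
# Suffixes this toy stemmer knows about (an assumption, far from complete).
SUFFIXES = ("ing", "ed", "es", "s")

def naive_stem(word):
    """Strip one common suffix so related word forms share a stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three letters so short words such as "is" stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Undo consonant doubling ("swimm" -> "swim"), but leave
            # endings such as "ll" and "ss" alone ("fall", "miss").
            if (
                suffix in ("ing", "ed")
                and stem[-1] == stem[-2]
                and stem[-1] not in "aeiouls"
            ):
                stem = stem[:-1]
            return stem
    return word

def matches(query, document_words):
    """Return the document words that share a stem with the query word."""
    stem = naive_stem(query)
    return [w for w in document_words if naive_stem(w) == stem]

if __name__ == "__main__":
    print(naive_stem("swimming"))
    print(matches("swimming", ["swim", "swims", "pool"]))
```

With these rules a search for swimming matches documents containing swim or swims, while pool is left out.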
Stop Words
Common words (ex: a, to, and, is ...) which add little relevancy to a search query, and thus are removed
from the search query prior to finding relevant search results.
It is both fine and natural to use stop words in your page content. The reason stop words are ignored when
people search is that the words are so common that they offer little to no discrimination value.
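Stop word removal can be sketched as a simple filter applied to the query before matching. The list below starts from the examples in this entry; the extra words ("the", "of", "in") and the fallback for queries made up entirely of stop words are assumptions added for the illustration.

```python
# "a", "to", "and", "is" come from the entry above; the rest are assumed.
STOP_WORDS = {"a", "to", "and", "is", "the", "of", "in"}

def remove_stop_words(query):
    """Return the query's words with common stop words filtered out."""
    words = query.lower().split()
    kept = [w for w in words if w not in STOP_WORDS]
    # If every word is a stop word, keep the original words rather than
    # searching for nothing at all.
    return kept or words

if __name__ == "__main__":
    print(remove_stop_words("how to swim in a pool"))
```

Only the query is filtered here; the page content itself is left untouched, which is why using stop words in your copy does no harm.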
Sullivan, Danny
Founder and lead editor of SearchEngineWatch.com, who later started SearchEngineLand.com.
See also:
Daggle - Danny's personal blog
SearchEngineWatch - Danny's old website
SearchEngineLand - Danny's new website
Submission
The act of making information systems and related websites aware of your website. In most cases you no
longer need to submit your website to large scale search engines, because they follow links and index
content. The best way to submit your site is to get others to link to it.
Some topical or vertical search systems will require submission, but you should not need to submit your site
to large scale search engines.
Supplemental Results
Documents which generally are trusted less and rank lower than documents in the main search index.
Some search engines, such as Google, have multiple indices. Documents may be placed in the supplemental
index when they are not well trusted due to any of the following conditions:
limited link authority relative to the number of pages on the site
duplicate content or near duplication
exceptionally complex URLs
Documents in the supplemental results are crawled less frequently than documents in the main index. Since
documents in the supplemental results are typically considered to be trusted less than documents in the
regular results, those pages probably carry less weight when they vote for other pages by linking at them.
You can find documents on this site that are in Google's supplemental results by searching for
site:seobook.com *** -view:randomstring
T
