Category: Robots & Crawlers

Using A Web Crawler Simulator to Improve SEO

Have you ever dreamed of owning a treasure map? A real one? A map that showed you exactly where a trove worth millions of dollars was located? Unfortunately, since you are living in the real world, the chances of owning the map and finding buried treasure are nil.

Or are they? Today’s treasure isn’t buried. It’s spread across a vast online network called the Internet and it involves the nascent science of SEO.

As the science of SEO has developed, so have the SEO tools. One tool that you should be aware of is called a web crawler simulator. The web crawler simulator essentially shows you how search engines are viewing your website so you can respond accordingly, tweaking your SEO strategies to reflect the algorithms of the search engines.

What is a web crawler simulator?

First, let’s make sure we understand what a web crawler simulator is. That requires backing up a bit and making sure we understand what a web crawler is. A web crawler is a complex computer program that systematically “crawls” or searches the Internet and forms search engine entries from what it reads. A good way to think of it is as a giant automated indexing tool for the Internet. The “index” in this case belongs to a search engine like Google, Yahoo, or Bing. Some people refer to web crawlers as web spiders.

That helps us understand what a web crawler simulator is. A web crawler simulator helps you analyze your website the way search engine web crawlers see it.

What can a web crawler simulator do?

If you have a tool that tells you how Google is looking at your site, you have a treasure map. Taking your site to the first page of Google’s search results is essentially like striking gold. The web crawler simulator can help to take you there.

Web crawler simulators do not look like very attractive maps. Gone are the frayed parchments and the red “X.” A web crawler simulator simply shows you what a web crawler “sees” when it crawls your page. The result is a bland readout of HTML code, sans pictures.

The new perspective, viewing your website through the eyes of a crawler, can help you make changes that will improve the SEO quality of your site. It shows your keyword density and your metadata. It is a “dry run” of how the crawler will look at your website, which in turn influences where you’ll end up in line when Google, Yahoo, or Bing return results for a search query.
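To make the idea concrete, here is a minimal sketch in Python of what such a simulator does under the hood. It is not the code behind any of the tools listed below. The class name CrawlerView, the keyword_density helper, and the sample page are all made up for illustration, and real crawlers are far more sophisticated. The sketch simply strips a page down to its title, meta tags, links, and visible text, then measures how often one keyword appears.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class CrawlerView(HTMLParser):
    """Collects the title, meta tags, links, and visible text of a page."""

    SKIPPED = {"script", "style"}  # content a text-only view ignores

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.links = []
        self.text_parts = []
        self._stack = []  # tracks whether we are inside title/script/style

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        if tag in self.SKIPPED or tag == "title":
            self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        current = self._stack[-1] if self._stack else None
        if current == "title":
            self.title += data.strip()
        elif current is None and data.strip():
            self.text_parts.append(data.strip())


def keyword_density(text, keyword):
    """Share of the words in `text` that equal `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return Counter(words)[keyword.lower()] / len(words)


# Hypothetical sample page, used only to exercise the sketch.
SAMPLE = """<html><head><title>Treasure Maps</title>
<meta name="description" content="Hand-drawn treasure maps.">
<style>body { color: red; }</style></head>
<body><h1>Treasure maps</h1><img src="x.png" alt="map">
<p>Buy a treasure map <a href="/shop">here</a>.</p>
<script>track();</script></body></html>"""

view = CrawlerView()
view.feed(SAMPLE)
page_text = " ".join(view.text_parts)

print("Title:", view.title)
print("Meta:", view.meta)
print("Links:", view.links)
print("Text:", page_text)
print("Density of 'treasure':", round(keyword_density(page_text, "treasure"), 3))
```

Notice that the image, the stylesheet, and the script vanish from the readout. That is the “bland” view described above, and it is the one worth optimizing.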

Free Web Crawler Simulators

Here are a few common web crawler simulators that you can try:

  • http://www.smart-it-consulting.com/internet/google/googlebot-spoofer/index.htm
  • http://www.webconfs.com/search-engine-spider-simulator.php
  • http://www.seobench.com/search-engine-crawler-simulator/
  • http://www.mydigitallife.info/tools/search-engine-spider-and-crawler-simulator/

Nobody ever said that treasure hunting was easy work. It still takes time, requires that you dig a little bit, and you may end up with some sore muscles. But I think you’ll be glad you did.

Is AJAX SEO Possible?

Among the stickiest of SEO conundrums is Ajax SEO. Just seeing those two words together—Ajax SEO—makes the SEO gurus shudder and lose a few hours of sleep at night. While the wonderful dynamism of Ajax web apps is so smooth and tantalizing, it has always posed a problem for SEO purposes. So, is AJAX SEO possible?

The Problem with Ajax SEO

SEO crawlers were originally designed for HTML. Somehow, they’ve never been able to totally crawl out of the HTML age, though they are getting better. Because of this, many Ajax pages simply are not crawled by the SEO robots. Other problems are prevalent among Ajax pages, too. Since one URL governs an entire Ajax page, the distinct links and navigation that crawlers rely on are missing. SEO crawlers may even suspect unethical SEO techniques when crawling Ajax pages. Further problems abound: lack of indexing, crummy site navigation, and trouble saving, sending, or bookmarking pages.

A Helpful Tool for Ajax SEO

Ajax doesn’t support deep linking on its own, which can create some frustration for bookmarking and site navigation. Even so, there are ways to skirt this problem. Perhaps the most popular and widely used tool for creating unique virtual URLs is SWFAddress. This powerful tool essentially gives each state of a Flash or Ajax page its own address, allowing users to navigate, bookmark, save, e-mail and refresh Ajax pages. The SEO power of SWFAddress is that it makes it possible to expose actual HTML links that give the Ajax site more visibility to SEO robots. As long as the Ajax engineer submits an XML sitemap of those SWFAddress URLs to the search engines, there is a much better possibility of improving Ajax SEO.

Insightful Tips for Ajax SEO

Rather than create an entire site in Ajax, make a distinct URL for individual pages in order to get more visibility with the SEO robot.

Include keywords, and place them at the beginning of the site where they are more likely to be found by the SEO crawler. Use Ajax only when it’s necessary, not merely for a flashy site. Ajax is helpful for achieving a page’s dynamic interaction with the server. The fact still holds true that although you can obtain some degree of Ajax SEO, it will not be as high as that of a non-Ajax site.

In case you are craving a yes or no answer to the article title, here it is. Yes, AJAX SEO is possible. Of course, there is a disclaimer. Ajax SEO is limited. By following the insightful tips above, you’ll obtain at least some level of AJAX SEO.

Bing Crawler Optimization Tips

Just when you thought you had Google figured out, a major monkey wrench comes hurtling through cyberspace. With bone-crunching chaos, it smashes your perfectly engineered SEO strategy into smithereens. When the dust clears and the fallout passes, what you’re staring at is a strange new animal. It’s not Google. It’s not Yahoo. It’s Bing. It’s Microsoft’s attempt to gain ascendancy in the king-of-the-search-engine-hill competition.

Enter a new head-scratching dilemma:  what is involved in Bing crawler optimization? Even though Google still tops the charts when it comes to search engine usage, there are some important facts you need to know when optimizing your site for Bing’s SEO crawler. Crawler mavens and SEO masters have come up with some answers. Here’s what you need to know about Bing crawler optimization. First, some general advice, and then a list of tips.

General Advice:  Bing is Not Google

First, you must be aware that Bing is not Google. Yes, that’s a painfully obvious statement, but here’s why it’s worth stating:  Bing uses a totally different system of crawling than Google does. For this reason, the optimization techniques and expectations that you may be familiar with under Google simply will not work with Bing. Here are some specifics.

Bing crawls more slowly than Google. Bing is what you could call a very cautious search engine. Rather than instantly crawl and rank any post on the web that has high keyword or backlink levels, Bing is patient enough to wait for verification, warranting, and approval from a network of sites. Obviously, this means a longer wait for better Bing search engine ranking, but the slow-and-steady approach also protects Bing’s reputation. Bing demands a higher degree of honesty and integrity from the get-go than does Google. Bing wants high quality content. Bing’s SEO robot wants unique pages. Focus on quality and be patient. You’ll hit Bing’s pages soon enough.

Bing wants you to submit your site. For faster site ranking, you should submit your site to Bing. Bing wants it that way. If you open a webmaster account with Bing, which requires signing up with a Windows Live ID, you will have a higher degree of ‘trust’ from Bing. Adding your site to Bing and including a site map is an important first step in getting recognized by Bing.

Bing is young … but growing. In the final analysis, don’t get overly panicky if you just can’t figure out Bing. SEO crawlers and SEO robots are confusing. Bing is a young search engine, and as formidable as it may seem, it simply hasn’t hit the tipping point of being the major search engine. Google should still be your bread and butter when it comes to optimization techniques. Bing may catch up over time, and by then we’ll have some more things figured out about Bing’s SEO robot.

Specific Advice: A list of tips

So for the practical, brass tacks kind of advice, here’s a list of tips on Bing crawler optimization:

  • Go for low keyword density. If you have more than three keywords per page, you’ll get the hairy eyebrow from the SEO crawler.
  • Go for uniqueness. Specifically, you should have unique <title> tags and <meta> description tags. The title tags should match the keywords on the rest of the content in your site.
  • Where possible, use text navigation links rather than graphical links within the site.
  • Go for stellar content. Think, “A very intelligent human being is going to read this,” not, “Hmm. Wonder how the Bing crawler is going to do here?” Aim for powerful content, not just Bing crawler optimization.
  • Focus. A narrow collection of keywords is important. Bing crawler optimization means that you should not use a broad range of keywords in your site content. Stick to a topic and focus on the keywords within that topic.
  • Stay ethical. Any no-nos of SEO technique are still no-nos on Bing. They nab keyword stuffing and hidden text remarkably well.
  • Aim for high quality backlinks, not just a big quantity of backlinks. To borrow a cliché: quality not quantity. Older sites are considered “high quality.” Bing doesn’t like Blogspot, nor really any blog sites for that matter. On the other hand, Bing likes Hubpages a lot.
  • Put keywords into your URL.

That list of advice is pretty simple and short, but will go a long way as you engage in the kind of Bing optimization that will put you at the top of the page.
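The “go for uniqueness” tip is easy to check automatically. Here is a rough sketch in Python, assuming you have already collected each page’s title and meta description into a dictionary. The find_duplicates helper, the URLs, and the sample data are all hypothetical. It reports every title or description shared by two or more pages.

```python
from collections import defaultdict


def find_duplicates(pages, field):
    """Return {value: [urls]} for every value of `field` used by 2+ pages."""
    seen = defaultdict(list)
    for url, info in pages.items():
        # Compare case-insensitively and ignore stray whitespace.
        seen[info[field].strip().lower()].append(url)
    return {value: urls for value, urls in seen.items() if len(urls) > 1}


# Hypothetical site data, used only to exercise the sketch.
pages = {
    "/": {"title": "Acme Widgets", "description": "Widgets for every job."},
    "/blue": {"title": "Blue Widgets | Acme", "description": "Widgets for every job."},
    "/red": {"title": "Acme Widgets", "description": "Red widgets in stock."},
}

duplicate_titles = find_duplicates(pages, "title")
duplicate_descriptions = find_duplicates(pages, "description")

print("Duplicate titles:", duplicate_titles)
print("Duplicate descriptions:", duplicate_descriptions)
```

Any page that shows up in either report is a candidate for a rewritten title or description.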

Sitemap.XML – Why Changefreq & Priority Are Important

If your website has an XML sitemap, Changefreq and Priority are two important tags for supplying data to the search engines. They affect when and how often search engine “spiders” (also called “robots” or “crawlers”) visit your site’s individual pages, which has various implications. Although using the Changefreq and Priority XML sitemap tags is voluntary, they remain important for several reasons…

According to Google.com, the Changefreq XML tag may be set to one of seven frequencies: “never”, “yearly”, “monthly”, “weekly”, “daily”, “hourly”, or “always”. This tells the search engines approximately how often each page is updated. An update refers to actual changes to the HTML code or text of the page, not updated Flash content or modified images. Changefreq examples…

NEVER: Old news stories, press releases, etc.
YEARLY: Contact, “About Us”, login, registration pages.
MONTHLY: FAQs, instructions, occasionally updated articles.
WEEKLY: Product info pages, website directories.
DAILY: Blog entry index, classifieds, small message board.
HOURLY: Major news site, weather information, forum.
ALWAYS: Stock market data, social bookmarking categories.

The Priority XML sitemap tag is useful, although not quite as important. It is set to a number ranging from zero to one; if no number is assigned, a page’s priority is 0.5. A high priority page may be indexed more often and/or appear above other pages from the same site in search results. Here are some examples of different types of pages and how their Priority sitemap XML tag value might be set, depending upon how important they are…

0.8-1.0: Homepage, subdomains, product info, major features.
0.4-0.7: Articles and blog entries, category pages, FAQs.
0.0-0.3: Outdated news, info that has become irrelevant.
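
To see how the two tags fit together, here is a small sketch in Python that writes a sitemap with both of them. The build_sitemap function and the example.com URLs are made up for illustration. The namespace and the seven Changefreq values come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
CHANGEFREQ_VALUES = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def build_sitemap(entries):
    """Build sitemap XML from (loc, changefreq, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq, priority in entries:
        if changefreq not in CHANGEFREQ_VALUES:
            raise ValueError(f"invalid changefreq: {changefreq!r}")
        if not 0.0 <= priority <= 1.0:
            raise ValueError(f"priority must be between 0.0 and 1.0: {priority!r}")
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, following the example values given above.
sitemap_xml = build_sitemap([
    ("https://www.example.com/", "daily", 1.0),
    ("https://www.example.com/faq", "monthly", 0.5),
    ("https://www.example.com/news/2009-launch", "never", 0.2),
])
print(sitemap_xml)
```

The validation step is there because a misspelled frequency or an out-of-range priority is an easy mistake to make when a sitemap is edited by hand.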

How strictly they want to follow the Priority and Changefreq sitemap specifications is up to the search engines; these XML tags are considered preferences, not orders. This doesn’t mean search engines don’t consider Priority and Changefreq important, just that they won’t put sitemap instructions before their own interests (like making sure a site hasn’t changed its subject or become pornographic).

But why is it important when or how frequently search engine “spiders” index your pages? When a “spider” visits a web page, it records information about the page’s content, title, META tags, links, and other characteristics. This ensures that search results reflect its latest content and take into account any recent improvements (such as new META tags or repaired links).

However, it is unnecessary for “spiders” to regularly scan pages that are seldom or never updated. Spider indexing consumes bandwidth (which can increase the cost of operating your website), and may briefly slow access to your site if it is run on a low-capacity server. Thus it is important to set the Changefreq sitemap tag to accurately reflect how often individual pages are updated.
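
If you keep a record of when each page was edited, you can let that history suggest the setting. Here is one possible approach in Python. The suggest_changefreq function and its thresholds are assumptions for illustration, not part of the sitemap protocol. It averages the gaps between updates and picks the closest frequency, so tune the cutoffs to your own site.

```python
from datetime import datetime, timedelta

# Assumed cutoffs: the average gap between updates maps to the first
# label whose limit it does not exceed.
THRESHOLDS = [
    (timedelta(hours=1), "hourly"),
    (timedelta(days=1), "daily"),
    (timedelta(weeks=1), "weekly"),
    (timedelta(days=31), "monthly"),
    (timedelta(days=366), "yearly"),
]


def suggest_changefreq(update_times):
    """Suggest a changefreq value from a list of past update datetimes."""
    if len(update_times) < 2:
        # Assumption: a page with no recorded edits after creation is static.
        return "never"
    ordered = sorted(update_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = sum(gaps, timedelta()) / len(gaps)
    for limit, label in THRESHOLDS:
        if average_gap <= limit:
            return label
    return "never"


# Hypothetical update history: a page edited once a week.
start = datetime(2024, 1, 1)
weekly_page = [start + timedelta(days=7 * i) for i in range(4)]
print(suggest_changefreq(weekly_page))
```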

SEO for PDFs – Optimize Your PDF Documents for Search

Here are a few SEO tips for your website when creating PDF documents.

1. Optimize the PDF file name. When you create PDFs, be sure to include your targeted SEO keywords in the file name. Make sure that Google can identify what the PDF is about. Don’t overdo it, but try to work a few keywords into the file name, like target-keyword.pdf.

2. Complete Document Properties. Most PDFs are indexed without specified document properties, the most important of which is the Title. This document property is the equivalent of the HTML title tag. If you don’t complete the Title property, the search engine is going to create a title from the PDF’s content, and it may not be optimized for your targeted SEO keywords. There are other PDF metadata properties that can be completed, but the only other one of importance is the Subject property, which is the equivalent of a meta description for .html pages.

3. Optimize your text in the PDF. Optimize the copy in your PDFs just like you would web page copy. Don’t overdo it, but use target keywords in the first hundred words. Try to use some variations to support long-tail searches while focusing on a main keyword for optimization.

4. Build links into PDFs. Include links in your PDFs, and pay attention to the anchor text used, because search engines recognize these links. In addition to including links in PDFs for search-related purposes, there’s also a good business reason: PDFs are often passed along to others via email.
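
Tip 1 is simple to automate. Here is a minimal sketch in Python. The pdf_file_name helper, its five-word limit, and the sample title are made up for illustration. It turns a document title into a short, hyphenated, keyword-bearing file name.

```python
import re


def pdf_file_name(title, max_words=5):
    """Turn a document title into a lowercase, hyphenated PDF file name."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    # Keep the name short so the keywords are not buried.
    return "-".join(words[:max_words]) + ".pdf"


# Hypothetical white paper title.
file_name = pdf_file_name("Industrial Pump Maintenance Guide (2024)")
print(file_name)  # industrial-pump-maintenance-guide-2024.pdf
```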

For B2B websites with PDFs and white papers, this optimization of PDFs can be incredibly important for your overall SEO strategy. Qualified B2B visitors can be hard to find, so make sure you give yourself every opportunity by leveraging your PDFs for top rankings.