

Total Indexed Count

Google says that this count is accurate (unlike the site: search operator) and is post-canonicalization. In other words, if your site includes a lot of duplicate URLs (due to things like tracking parameters) and the pages include the canonical attribute or Google has otherwise identified and clustered those duplicate URLs, this count only includes that canonical version and not the duplicates. You can also get this data by submitting XML Sitemaps but you’ll only see complete indexing numbers if your Sitemaps are comprehensive.
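If you go the XML Sitemap route, it helps to know how many URLs you're actually submitting before comparing against the indexed count Google reports. Below is a minimal Python sketch of counting the entries in a standard sitemap; the sitemap URL is hypothetical, and sitemap index files and gzipped sitemaps aren't handled.

    import urllib.request
    import xml.etree.ElementTree as ET

    SITEMAP_URL = "http://www.example.com/sitemap.xml"  # hypothetical location
    NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

    def count_sitemap_urls(sitemap_url):
        # Each <url><loc> entry is one canonical URL you are asking Google to index.
        with urllib.request.urlopen(sitemap_url) as response:
            tree = ET.parse(response)
        return len(tree.getroot().findall("sm:url/sm:loc", NS))

    print("URLs submitted:", count_sitemap_urls(SITEMAP_URL))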

Google also charts this data over time for the past year.

Edited to add: Google has told me that the data may have a lag time of a couple of weeks, which makes it more useful for trends than for real-time action. Also, if you look at domain.com, you’ll see stats for all subdomains, and if you look at www.domain.com, you’ll see stats for only the www subdomain (of course, this means that if you don’t use www for your site, as with searchengineland.com, there’s no easy way to see this data with other subdomains excluded).

Advanced Status: How This Data Is Useful and Actionable

The Advanced option provides additional details:

Google Index Status Advanced

Great, right? More data is always good! Well, maybe. The key is what you take away from the data and how you can use it. To make sense of this data, the best approach is to exclude the Ever Crawled number and look at it separately (more on that in a moment). So, you’re left with:

  • total indexed
  • not selected
  • blocked by robots

The sum of these three numbers tells you how many URLs Google is currently considering. In the example above, Google is looking at 252,252 URLs. Of those, 22,482 are blocked by robots.txt, which is fairly straightforward and mostly matches the number of URLs reported under Blocked URLs (22,346). Unfortunately, it’s become difficult to look at the list of what those URLs are: the blocked URLs report is no longer available in the UI, although it is available through the API. That leaves 229,770 URLs, which means 74% of the URLs weren’t selected for the index. Why not? Is this bad? The trouble with looking at these numbers without context is that it’s difficult to tell.
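To make the relationship concrete, here is the arithmetic from that example as a small Python snippet. Only the 252,252 total and the 22,482 blocked count are taken from the report above; the other figures are simply derived.

    # Figures taken from the example report above; everything else is derived.
    total_considered = 252252   # total indexed + not selected + blocked by robots
    blocked_by_robots = 22482

    remaining = total_considered - blocked_by_robots
    print(remaining)  # 229,770 URLs that were either indexed or not selected
    print(round(blocked_by_robots / total_considered * 100, 1))  # ~8.9% blocked by robots.txt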

Let’s say we’re looking at a site with 50,000 indexable pages. Has Google crawled only 31,480 unique pages and indexed all of them? (In this case, all of the not selected would be non-canonical URL variations with tracking codes and the like.) Or has Google crawled all 50,000 (plus non-canonical variations) but has decided only 31,480 of the 50,000 were valuable enough to index? Or maybe only 10,000 of those URLs indexed are unique, and due to problems with canonicalization, a lot of duplicates are indexed as well.

This problem is difficult to solve without a lot of other data points to provide context. Google told me that:

“A URL can be not selected for indexing for many reasons including:

  • It redirects to another page
  • It has a rel="canonical" to another page
  • Our algorithms have detected that its contents are substantially similar to another URL and picked the other URL to represent the content.”

If the not selected count is solely showing the number of non-canonical URLs, then we can generally extrapolate that, for our example, Google has seen 31,480 unique pages from our 50,000-page site and has crawled a lot of non-canonical versions of those pages as well. If the not selected count also includes pages that Google has decided aren’t valuable enough to index (because they are blank, boilerplate only, or spammy), then things are less clear. (Edited to add: Google has further clarified that “not selected” includes any URLs flagged as non-canonical (the third bullet above could include blank, boilerplate, or duplicate pages), URLs with meta robots noindex tags, and URLs that redirect, and that the count is not based on page quality.)
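For the first two reasons in Google’s list (redirects and a rel="canonical" pointing elsewhere), you can spot-check individual URLs yourself. Here is a rough standard-library Python sketch; the URL at the bottom is hypothetical, and a real check would need error handling for timeouts and non-HTML responses.

    import urllib.request
    from html.parser import HTMLParser

    class CanonicalParser(HTMLParser):
        # Records the href of a <link rel="canonical"> tag, if one is present.
        def __init__(self):
            super().__init__()
            self.canonical = None

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "link" and attrs.get("rel", "").lower() == "canonical":
                self.canonical = attrs.get("href")

    def inspect(url):
        with urllib.request.urlopen(url) as response:
            final_url = response.geturl()  # differs from url if we were redirected
            html = response.read().decode("utf-8", errors="replace")
        if final_url != url:
            print(url, "redirects to", final_url)
        parser = CanonicalParser()
        parser.feed(html)
        if parser.canonical and parser.canonical != url:
            print(url, "declares canonical", parser.canonical)

    inspect("http://www.example.com/product?sessionid=123")  # hypothetical URL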

If 74% of Google’s crawl consists of non-canonical URLs and redirects that aren’t indexed, is that a bad thing? Not necessarily. But it’s worth taking a look at your URL structure. Non-canonical URLs are unavoidable: tracking parameters, sort orders, and the like. But can you make the crawl more efficient so that Google can get to all 50,000 of those unique URLs? Google’s Maile Ohye has some good tips for ecommerce sites on her blog. Make sure you’re making full use of Google’s parameter handling features to indicate which parameters shouldn’t be crawled at all. For very large sites, crawl efficiency can make a substantial difference in long tail traffic. More pages crawled = more pages indexed = more search traffic.
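One way to get a feel for how much of the crawl goes to non-canonical variations is to normalize a sample of crawled URLs (from your server logs, for example) by stripping the same parameters you’d configure in parameter handling. A rough Python sketch follows; the parameter names and URLs are purely illustrative.

    from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

    # Illustrative parameter names; substitute the ones your site actually uses.
    IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "sort"}

    def canonicalize(url):
        # Drop ignored query parameters so URL variations collapse to one page.
        parts = urlparse(url)
        kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in IGNORED_PARAMS]
        return urlunparse(parts._replace(query=urlencode(kept)))

    crawled = [
        "http://www.example.com/widgets?utm_source=feed",
        "http://www.example.com/widgets?sort=price",
        "http://www.example.com/widgets",
    ]
    unique = {canonicalize(u) for u in crawled}
    print(len(unique), "unique pages out of", len(crawled), "crawled URLs")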

Ever Crawled

What about the ever crawled number? This data point should be looked at separately from the rest, as it’s an aggregate number from all time. In our example, 1.5 million URLs have been crawled, but Google is currently considering only 252,252 URLs. What’s up with the other 1.2 million? This number includes things like 404s, but for this same site, Google is reporting only 5,000 of those, so that doesn’t account for everything. Since this count is “ever” rather than “current”, things like 404s have surely piled up over time. Edited to add: Google has clarified that all numbers are for HTML files only, and not for filetypes like images, CSS files or JavaScript files.
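As straight arithmetic, using the approximate figures quoted above, the unexplained portion looks like this:

    # Approximate figures quoted above for this example site.
    ever_crawled = 1500000
    currently_considered = 252252
    reported_404s = 5000

    unaccounted = ever_crawled - currently_considered - reported_404s
    print(unaccounted)  # roughly 1.24 million crawled URLs still unexplained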

In any case, I think this number is much more difficult to gain actionable insight from. If the ever crawled number is substantially smaller than the size of your site, then this number is very useful indeed as some problem definitely exists that you should dive into. But for the sites I’ve looked at so far, the ever crawled number is substantially higher than the site size.

Site size can be difficult to pin down, but for those of you who have a good sense of that, are you finding that most of your pages are indexed?

Source - http://searchengineland.com/google-reveals-index-secrets-charts-indexing-of-your-site-over-time-128559


Global Consumers Place Highest Trust in Earned Media

Online automotive consumers place the most trust in earned media, and the least in ads served on mobile phones, finds Nielsen [download page] in an April 2012 report. An impressive 92% of automotive consumers surveyed around the world said they trust earned media, such as word-of-mouth or recommendations from friends and family, an 18% increase from 2007. Automotive consumer opinions posted online (70%) were the next-most trusted format, outpacing others such as editorial content within newspaper articles and dealership and car company websites (both at 58%). Text ads on mobile phones are trusted by just 29% of automotive consumers.

This finding contrasts with April 2012 survey results from Ipsos, which found that while automotive consumers worldwide may turn to their friends for advice on vehicle purchases and repair services, only 38% will trust a dealership, make of vehicle or a service department more because friends recommended it.

Traditional Media Takes a Fall
Data from Nielsen’s “Global Trust in Advertising 2012” indicates that automotive consumer trust in traditional paid advertising messages has taken a significant drop. While close to half say they trust TV (47%), magazine (47%), and newspaper ads (46%), confidence in these ads has dropped by 24%, 20%, and 25%, respectively, from 2009 to 2011, when the latest survey was conducted.

Despite this fall in trust, traditional media ads, particularly on TV, appear to have their intended effect. According to April 2012 survey results from ExactTarget, TV ads influence a larger proportion of online automotive consumers to purchase a product or service than a variety of other advertising media. 53% of respondents said a TV ad had influenced them to purchase a vehicle or maintenance service in the past 12 months, putting TV ads far ahead of newspaper ads (32%) and magazine ads (30%). In fact, three times more respondents said they had been influenced by a TV ad than by a banner or other ad on a website (53% vs. 18%).

Trust in Online Ads Low, But Growing
The Nielsen study finds that trust in most online ads is relatively lower than trust in traditional media, save for ads found on OEM branded websites, which are trusted by 58% of consumers. For example, only 40% trust ads served in search engine results, while just 36% trust online video ads or ads on social networks. These findings are similar to Nielsen and NM Incite survey results released in February 2012, which found more trust in branded website ads than in any other form of online advertising.

Despite the low rate of trust in online banner ads (33%), that figure represents a 27% increase since 2007. Similarly, while the level of trust placed in mobile phone advertising is still low, at 29%, this is an increase of 61% since 2007, and 21% since 2009.

Attitudes Towards Relevance Mirror Trust
The Nielsen survey also asked respondents to identify which advertising and brand messaging platforms are most relevant to them when searching for information about products, finding that the relevancy results often mirrored the trust responses. Recommendations from friends and family again topped the list, at 90% of respondents, followed by consumer opinions posted online (75%), branded websites (59%), and editorial content such as newspaper articles (55%). The relevance of paid traditional media platforms ranged from about 40-50%, while many online platforms scored lower, save for ads served in search engine results (42%).

Other Findings:

  • Latin American consumers had the highest levels of trust across 17 of the 19 advertising methods identified, when compared to other regions.
  • Trust in mobile phone ads was highest in the Middle East and Africa, with 40% indicating trust in text ads on mobile phones. These consumers also placed more trust in billboard and outdoor advertising than the global average (59% vs. 47%).
  • Consumers in Asia Pacific reported a higher level of trust in all formats surveyed when compared to the global average. They also had the highest level of trust in earned media, such as recommendations from friends and family (94%) and consumer opinions posted online (76%).
  • North Americans and Europeans appear to be the most skeptical consumers, with European respondents reporting the lowest levels of trust in all but one format (consumer opinions posted online, at 64%).

About the Data: The Nielsen Global Trust in Advertising Survey was conducted in August/September 2011 and polled more than 28,000 consumers in 56 countries throughout Asia Pacific, Europe, Latin America, the Middle East, Africa and North America. The Nielsen survey is based on the behavior of respondents with online access only. Internet penetration rates vary by country. Nielsen uses a minimum reporting standard of 60% internet penetration or 10M online population for survey inclusion.

Source - http://www.marketingcharts.com/television/global-consumer-trust-highest-in-earned-media-21766/ 

