Thursday - Jan 6, 2011
As part of its mission of organizing the world’s information, Google has been scanning and digitizing millions of books. The company estimates that there are around 130 million existing books. So far, it has scanned more than 12 million. The goal is to digitize all of them by the end of the decade, creating an unprecedented resource.
Google Books offers full-text search of all the volumes in its digital library. It has also amassed a database of 500 billion words from books published between 1500 and 2008. Using a recently released tool, you can graph the occurrence of keywords over time. Warning: It can be addictive.
According to Erez Lieberman Aiden, a junior fellow at the Society of Fellows at Harvard, “The goal is to give an 8-year-old the ability to browse cultural trends throughout history, as recorded in books.” Aiden was a member of the team that compiled the data.
So how does it work? You can search for the number of times a word or term appears over a specified time period. I searched for “chocolate” between 1700 and 2000. Here’s the result:
Next, I added “vanilla”. As you can see by the trend line, it’s not nearly as popular, although it generally follows a similar trend.
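For the curious, here is a rough sketch of the idea behind such a graph. It is not Google's code: the toy corpus, the function name and the assumption that each year's count is divided by the total number of words from that year are all mine, for illustration only.

```python
from collections import Counter

def yearly_frequency(corpus, term):
    """Return {year: share of that year's words that equal `term`}.

    `corpus` is a hypothetical mapping of year -> list of text snippets.
    """
    term = term.lower()
    result = {}
    for year, texts in sorted(corpus.items()):
        # Split on whitespace, strip surrounding punctuation, lowercase.
        words = [w.strip('.,;:!?"\'()').lower() for text in texts for w in text.split()]
        words = [w for w in words if w]
        counts = Counter(words)
        # Normalize by the year's total so busy publishing years don't dominate.
        result[year] = counts[term] / len(words) if words else 0.0
    return result

# Made-up sample data, standing in for scanned books.
toy_corpus = {
    1900: ["Chocolate and vanilla cake.", "She drank hot chocolate."],
    1950: ["Vanilla ice cream is plain."],
}

chocolate = yearly_frequency(toy_corpus, "chocolate")
vanilla = yearly_frequency(toy_corpus, "vanilla")
```

Plotting `chocolate` and `vanilla` against the years would give two trend lines like the ones described above, just on a much smaller scale.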
For those of you sidelined by winter weather and looking for some fun with words, give Google’s Ngram Viewer a try.
Monday - Aug 2, 2010
In the late 1990s, an epic battle raged to dominate the Internet. In one corner was Microsoft, the behemoth that controlled the computer desktop with its Windows operating system. In the other was Netscape, a scrappy Silicon Valley start-up. The battleground was the web browser, the software essential to access the Internet.
It’s a classic David vs. Goliath story: a small, venture-capital-funded company fighting the world’s largest and richest software company. Many people rooted for Netscape, the acknowledged underdog, against Bill Gates’ Evil Empire. I don’t have to tell you who won. Netscape was eventually acquired by AOL, which stopped supporting Navigator in 2008, relegating Netscape to an Internet footnote. Today, Microsoft Internet Explorer holds a commanding 60% share of the browser market.
Regardless of who won, the battle was a boon for consumers. Not only did it lead to continual improvements in the software, but the demise of Netscape Navigator led to the ascent of Mozilla Firefox, another worthy Internet Explorer opponent.
Today, another battle is being waged, also between a Silicon Valley company (this one is not so scrappy anymore) and Microsoft: the Search War. Google now dominates Internet search, and 95% of its revenues are derived from search-related advertising. Microsoft wants a piece of the action, so last year it rebranded MSN Search as Bing, its “decision engine,” and launched a huge advertising push to promote its “new” search service. The score now stands at 12.7% for Bing vs. 62.6% for Google. Bing is slowly gaining, but that’s not the point: Competition is good.
As in the Browser War, consumers are the ultimate beneficiaries as each Goliath improves its product, spending billions of dollars in the process. For an analysis of the battle plans, check out this story in the New York Times.
Friday - Jun 11, 2010
If you believe the Big Bang theory, then the Universe is constantly expanding. The same can be said about the Internet. Every day hundreds of thousands of gigabytes of information add to this electronic repository of human knowledge. Think of all the articles, discussions, photos, tweets and videos uploaded daily. How do you find it all?
Google is arguably the best all-purpose tool. Around the clock, it sends out software programs called spiders that crawl the Web, indexing all the content they discover. (Here’s an animated overview of how it works.) Those spiders have their work cut out for them.
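To make the idea of crawling and indexing a little more concrete, here is a toy sketch. It is nothing like Google's real system: the three-page "web" and its addresses are made up, and the function name is mine. The spider starts at one page, follows links to pages it has not seen yet, and records which pages contain each word.

```python
from collections import deque

# Hypothetical in-memory "web": address -> (page text, outgoing links).
TOY_WEB = {
    "a.example": ("chocolate cake recipe", ["b.example", "c.example"]),
    "b.example": ("vanilla cake recipe", ["a.example"]),
    "c.example": ("browser history", []),
}

def crawl_and_index(web, start):
    """Visit every page reachable from `start` and build a word -> pages index."""
    index = {}
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        text, links = web[url]
        # Index step: remember that each word appears on this page.
        for word in text.split():
            index.setdefault(word, set()).add(url)
        # Crawl step: queue up linked pages we have not visited yet.
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return index

index = crawl_and_index(TOY_WEB, "a.example")
```

Looking up `index["cake"]` then returns both pages that mention cake, without rereading any page. That lookup table is what makes a search fast; keeping it current is the hard part.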
Until recently, Google’s index was updated every 30 days. Now it’s getting a mega-jolt with Caffeine, a new technology that speeds up the indexing process, updating content almost as soon as it’s published. That means search results are fresher.
Why should you care? If you want the latest information (and who doesn’t), the new, caffeinated Google aims to deliver it. This post appeared in a Google search within a minute after it was published. In the ballooning universe of the Internet, that’s impressive. And when a service like Google upgrades its software we don’t have to do a thing, except take advantage of the improvements.

