Google as a Neural Network
by Phil Marks
Is Google Really a Neural Network?
Firstly, what do we mean by Google? For the purposes of this discussion, Google means the combination of a
search engine and a near-instantaneous results set drawn from websites and blogs worldwide. The last numbers I saw
(Feb 2010) estimated 750 million websites worldwide, plus 200 million blogs, and there are of course other domains
which Google also scans. Other figures suggest 25 billion indexed web pages (Netcraft, March 2009); that was a year
ago, so the numbers will be larger now.
Here, I use the term neural network not in the strict AI sense, but in a more general sense.
Now, consider the human brain as I understand it (a very simple model). It has a set of data inputs – visual,
auditory, chemical (taste and smell), pressure (touch), thermal, and inertial (the inner ear's balance organs), at
least those we know of – and a memory structure. Data input is stored in short-term memory, where brain processing
adds context and turns it into information; it is then sorted, filtered, and moved to long-term memory. Both short-
and long-term memory take the form of synapses (junctions between brain cells), and more input in a given memory
area strengthens the relevant synapses. We know that as we age, the more salient memories (stronger synapses from
earlier in our lives) are easier to retrieve, while short-term memory becomes less efficient; our ability to build
new synapses falls off with age in most people. Autonomic responses (e.g. breathing) use 'hard-wired' circuitry in
the brain stem, a very primitive part of the brain's structure. And of course, it was long thought that once we
mature we cannot grow new brain cells (though research on adult neurogenesis is promising).
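To make that strengthening idea concrete, here is a toy sketch in Python – purely illustrative, not neuroscience,
and every name and number in it is my own invention. Each exposure to a topic strengthens its 'synapse' (here just
a weight), and recall favours the strongest ones:

    # Toy model of the memory idea above: repeated input strengthens a
    # "synapse", and retrieval favours the most strongly reinforced memories.
    # Entirely illustrative -- not a model of real neurons.

    long_term_memory = {}  # topic -> synapse strength

    def perceive(topic):
        """Each exposure to a topic strengthens the relevant synapse."""
        long_term_memory[topic] = long_term_memory.get(topic, 0) + 1

    def recall(n=3):
        """Retrieval favours the strongest synapses."""
        ranked = sorted(long_term_memory.items(), key=lambda kv: kv[1], reverse=True)
        return [topic for topic, strength in ranked[:n]]

    for experience in ["music", "music", "faces", "music", "faces", "numbers"]:
        perceive(experience)

    print(recall())  # ['music', 'faces', 'numbers']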
So, consider Google to have a set of data inputs – primarily the bot/crawler data gathering, but also input
about the 'popularity' of web pages as gathered through users' searches. The data these bots collect about a given
web page – for example, the keyword relevance of its content, the number of external links to the page, and so on –
is converted into Google's proprietary and secret PageRank scores, which provide a 'salience' for the analogous, or
proxy, Google synapse. The Google synapse analogue is simply (I assume, as I am not privy to Google's design) a
database row for the website/page, with the aforementioned data items (including the PageRank/scoring factors) in
the columns, along with site map entries, the site's refresh rate, search history information, and very probably a
whole lot more besides.
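Since I am guessing at the design anyway, here is a purely hypothetical sketch of what such a row might look like,
with a made-up salience score standing in for the secret scoring – every field and weighting below is invented for
illustration:

    # Hypothetical "Google synapse" record: one row per page, with the
    # scoring factors as columns and a derived strength. Google's real
    # schema and scoring formula are secret; this is a guess for illustration.
    from dataclasses import dataclass

    @dataclass
    class PageRecord:
        url: str
        keyword_relevance: float   # relevance of content to its keywords
        external_links: int        # number of links pointing at the page
        refresh_rate_days: float   # how often the page changes (stored column)
        search_clicks: int         # 'popularity' signal from search usage

        def salience(self):
            """Invented stand-in for the secret scoring: the synapse 'strength'."""
            return (self.keyword_relevance * 10
                    + self.external_links
                    + self.search_clicks / 100)

    page = PageRecord("http://example.com/", 0.8, 120, 7.0, 5400)
    print(page.salience())  # 182.0

The point is only the shape of the thing: one row per page, with a strength that grows as more signals accumulate –
just as repeated input strengthens a synapse.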
Of course, the analogy with the human brain breaks down over time, as we would not expect the Google model to
suffer from capacity limitations or from constraints imposed by its 'technology' (as happens with the brain when we
age and the synapse-building and strengthening processes become less efficient).
So, what use is this analogy to us? Well, consider how we might wish to add to human brain capacity and extend
its efficiency – we are getting into William Gibson territory now (Gibson is the author who coined the term
'cyberspace'). Why plug additional memory chips into the brain, when all that is needed is a wireless brain
connection to Google? Science fiction? I don't think it is that far away (less than 50 years). The potential social
consequences are quite frightening to consider, and I'm sure that legislators will start work on this area.
© 2010 Phil Marks
Building sites and marketing at => http://www.ezeesoft.co.uk