Friday 27 July 2018

RDF (the 'Semantic Web') and the Human Brain

I was introduced to the RDF (Resource Description Framework) data model by the chairpeople (waving to Paul Rouet ;) of the "Paris Time Machine" project. They were sorely in need of a 'tech guy' (and I was the only one on the 'team'), but RDF was the only computer-oriented thing on their 'cahier des charges' that I wasn't qualified for: not only had I no experience with it, I was totally unaware of its existence until then. I'm wondering how I managed to miss it: it's been around since 1999, created almost in tandem with the XML format (which is only beginning to seem 'more value than noise' to me), but it never took off, and is still far from anything approaching standard use today. Now that I've looked into it, its potential utility is, well, amazing, but it's going to require a lot of work to implement: either the whole of the web is going to have to be refactored to accommodate it, or we're going to have to develop an AI that can reliably read and extract data from all forms of publication (print and web). I'm working towards, and vying for, the latter.

RDF at its base is not a complicated affair, and its syntax took only a couple of days to master. Basically, each bit of data is a 'subject-predicate-object' "triple", for example: 'Bob=>last_name=>Smith', or 'Bob=>address=>25 maple lane', or 'Bob=>phone_number=>0 (145) 628-5400'. So if we were to do a search for (subject) 'Bob', we would get all the data 'attributed' to that subject: last_name, address, phone_number. Of course, in larger data collections, 'Bob' would be a bad choice of 'central node/identifier' (because that's what it becomes in this context), but I'm sure you get the picture: in this way, it would be possible to attribute any 'type' of information (dictated by the predicate) to that subject, without limitation or possibility of conflict (Bob can have two phone numbers: both will have a 'Bob=>phone_number' subject-predicate, and a query for 'Bob=>phone_number' (or just 'Bob') will return both). Furthermore, one triple's object (data) can be a subject with data of its own, for example: '0 (145) 628-5400=>phone_number_type=>land_line' would turn up as a 'second level' of data in a simple 'all about Bob' query. (Strictly speaking, RDF only lets identified resources, not plain literal values, act as subjects, so the phone number would need an identifier of its own, but the principle holds.) So with this method, data linked to data linked to data... the possibilities are endless.
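
To make the 'subject-predicate-object' idea concrete, here is a minimal sketch in plain Python (deliberately not a real RDF library or RDF syntax). The function names 'query' and 'describe', and Bob's second phone number, are my own inventions for illustration; the other values come from the examples above.

```python
from collections import defaultdict

# Each fact is a (subject, predicate, object) tuple.
triples = [
    ("Bob", "last_name", "Smith"),
    ("Bob", "address", "25 maple lane"),
    ("Bob", "phone_number", "0 (145) 628-5400"),
    ("Bob", "phone_number", "0 (145) 628-5401"),  # hypothetical second number
    ("0 (145) 628-5400", "phone_number_type", "land_line"),
]

def query(triples, subject=None, predicate=None):
    """Return every triple matching the given subject and/or predicate."""
    return [
        (s, p, o) for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
    ]

def describe(triples, subject, depth=2):
    """Collect a subject's data, following objects that are themselves subjects.

    'depth' bounds how many levels are followed, so circular links cannot loop forever.
    """
    result = defaultdict(list)
    for _, p, o in query(triples, subject=subject):
        nested = describe(triples, o, depth - 1) if depth > 1 else {}
        # Keep the object alone if nothing is known about it, else pair it with its own data.
        result[p].append((o, nested) if nested else o)
    return dict(result)

if __name__ == "__main__":
    print(query(triples, subject="Bob", predicate="phone_number"))
    print(describe(triples, "Bob"))
```

Asking for 'Bob' plus 'phone_number' returns both numbers without any conflict, and 'describe' shows the land-line detail as a second level hanging off the first number.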

But that's not what excited me about it: I've always been fascinated by neuroscience (basically: understanding my own (brain's) quirks), and as I learned more about RDF, my thoughts, with bells ringing, returned increasingly there: there are a lot of similarities between the workings of RDF and the human brain.

Granted, RDF is a step 'above' our 'fired-or-not' basically-binary synapses, but the organisation seems the same. If we were to think of 'Bob', our brain would return all the data it contained that could be attributed to that entity. Our brain 'identifies' "Bob" by a group of synapses (an 'identifier'), and that is where I thought the difference with RDF was; but if we were to examine a more complicated RDF dataset, easily-conflict-prone subjects such as 'Bob' would have to become unique identifiers as well, and 'that particular Bob's' first name would become, say: '10001010=>first_name=>Bob' (and '10001010=>address=>25 maple lane', etc.). In reality, to avoid conflicts, most likely every 'thing' in existence should have a unique identifier (save, for example, our most fundamental elements: atom-types, fermion-types, etc.)... so if we reductio ad absurdum our computer's 'unique id', it will be a collection of 'on or off' binary values... the same as our brain's.
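
A rough sketch of the unique-identifier point, again in plain Python. The second Bob, his address, and the helper names are hypothetical and only there to show the conflict being avoided. (Real RDF uses web-style IRIs as identifiers rather than short codes like this, but the idea is the same.)

```python
# Two different people who happen to share the first name 'Bob'.
triples = [
    ("10001010", "first_name", "Bob"),
    ("10001010", "address", "25 maple lane"),
    ("10001011", "first_name", "Bob"),          # hypothetical second Bob
    ("10001011", "address", "12 oak street"),   # hypothetical address
]

def subjects_named(triples, first_name):
    """Find every identifier whose first_name matches."""
    return sorted(s for s, p, o in triples if p == "first_name" and o == first_name)

def facts_about(triples, identifier):
    """Return a predicate -> list-of-objects mapping for one identifier."""
    facts = {}
    for s, p, o in triples:
        if s == identifier:
            facts.setdefault(p, []).append(o)
    return facts

def as_bits(identifier):
    """Show the on/off values the machine stores for the identifier's characters."""
    return " ".join(format(byte, "08b") for byte in identifier.encode("ascii"))

if __name__ == "__main__":
    for ident in subjects_named(triples, "Bob"):
        print(ident, facts_about(triples, ident), as_bits(ident))
```

'Bob' is now just one more piece of data attached to an identifier, and each identifier, once stored, is nothing but a pattern of bits.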

====

Just a footnote here to underline that this 'binary cocktail' outline most likely does not describe the entirety of the brain's thought-memory-recall process; probably other chemical 'filters' figure in there too (and this is how we give 'value' to retrieved memories (over others)). This is yet something else to explore (and perhaps even exploit, if it can be re-created technologically), but for the purposes of this what-is-supposed-to-be RDF perusal, going there would be but a distraction.