Tuesday 9 May 2017: Distances Between Drugs

Wikipedia is agnostic, quite intentionally, about many contentious issues. On Wikidata, this agnosticism can translate into a permissive approach to “ontology”, and it can look like a misnomer to say that Wikidata even has one, in the sense commonly used for databases.

That issue becomes abundantly clear in the tree of species I wrote about last week. I was implicitly relying on traditional taxonomic ranks, so that Homo sapiens as a species is a definite number of ranks below Chordata, a phylum one step above the vertebrates. Clades, based more directly on genomes, form a competing system. And Wikidata includes both, as well as some parts of classifications that are no longer in use.

But the point about distances between scientific concepts is that they should be useful. Passing to pharmaceuticals, it is at first sight easy enough to recognise a tree-like structure. Antibiotics are a subgroup of drugs, and sulphonamides are a subgroup of antibiotics. A particular sulphonamide drug is by that reckoning three levels down from drugs as a whole.
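
Counting levels in such a tree is mechanical. Here is a minimal sketch in Python, assuming a hypothetical child-to-parent map. The names in it, including sulfamethoxazole as the example drug, are chosen for illustration and are not drawn from Wikidata.

```python
# Hypothetical child -> parent map; illustrative only, not taken from Wikidata.
PARENT = {
    "antibiotic": "drug",
    "sulphonamide": "antibiotic",
    "sulfamethoxazole": "sulphonamide",
}


def levels_below(term, ancestor, parent=PARENT):
    """Count the steps from term up to ancestor; None if ancestor is never reached."""
    steps = 0
    while term != ancestor:
        if term not in parent:
            return None  # ran off the top of the tree without meeting ancestor
        term = parent[term]
        steps += 1
    return steps


print(levels_below("sulfamethoxazole", "drug"))  # 3
```

The function only walks upwards, so it says nothing about two terms that sit on different branches.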

A problem arises when one notices that antibiotics are defined by function: they kill bacteria. The name “sulphonamide”, on the other hand, suggests, correctly, a classification by the chemical properties of a molecule. So the two steps down the tree are of different kinds.

Think about New York, where distance is counted in blocks. If you need to go five blocks north and three east, your destination is eight blocks away by road. And that’s the minimum. We can take the same attitude if there are two dimensions in which to separate terms: just add the distances. A New York map shows a grid, so we are moving away from tree-like structures.
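
The block count is what is usually called the Manhattan or taxicab distance. As a minimal sketch in Python, with coordinates invented for the example:

```python
def block_distance(a, b):
    """Add up the absolute differences, one dimension at a time."""
    return sum(abs(x - y) for x, y in zip(a, b))


start = (0, 0)
destination = (3, 5)  # three blocks east, five blocks north
print(block_distance(start, destination))  # 8
```

For two terms, the coordinates would be positions along each dimension of classification, say one for function and one for chemistry. How to assign those positions is left open here, and is an assumption of the sketch.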

It has been put to me that what I have been proposing is an “ontological distance”. Maybe so. But perhaps more important is that “distance by blocks” is just one choice: there is nothing set in granite about counting blocks, rather than distance on the sidewalk, and blocks tend to be rectangular rather than square. There is something arbitrary going on.

The discussion in these first three posts is going to bite its tail here. Ontology, these days, is not really dealing with the great chain of being. I like something said by John Webster in 1623, in The Devil’s Law Case, about “nets to catch the wind”. He meant the ambitions of kings, but the point is ambition, not who suffers from it. The finer the mesh of our net, the more we should be able to catch.

Nature doesn’t differentiate itself for us: concepts are what we build with, to pin down its flow. The queasy confusions around “ox” are just one aspect: we need the definite to move forward in understanding the indefinite in our environment. The very idea of a species is possibly “only” a way of discussing a cluster of genomes. In contrast, a drug these days is likely to be a pure chemical compound. If it is trickier for me to define the distance between drugs than between species, that rather proves the point. Operationally, my interest is in a “signal” of emergence in the scientific literature, and it may be faint.

