Saturday, November 13, 2010

Ontologies of Algorithm Implementations

A place for you to leave comments about my essay "Ontologies of Algorithm Implementations", in which I gave examples of why it's not simple to say "this data set was generated by algorithm X."

5 comments:

chem-bla-ics said...

Hi Andrew,

Part of the CHEMINF ontology is based on the earlier work on the Blue Obelisk Descriptor Ontology, and both make the distinction between implementation and algorithm very clear.

Egon

Andrew Dalke said...

Hi Egon,

Implementation vs. algorithm was one part of my commentary. The biggest part (I added some more text, btw) is that many algorithms go into an implementation.

Suppose you say that ToolkitX's hash fingerprints identify compound X as the most similar structure to Y in dataset A. Behind that one statement you've got 1) Tanimoto for binary fingerprints, 2) aromaticity perception, 3) fingerprint generation, 4) RNG choice, 5) possibly the use of Swamidass and Baldi for constraining the search space, and 6) my own modification so that one stage of the Swamidass and Baldi algorithm runs in constant memory.

Which of these are relevant and how should they be noted?
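To make the question concrete, here is a minimal Python sketch of one way such a record might look. Everything in it is hypothetical: the `Component` and `Provenance` names, the `affects_result` flag, and especially my guesses about which steps change the answer versus only the speed or memory use. It is not how any toolkit or ontology actually records this.

```python
from dataclasses import dataclass, field


def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto similarity of two binary fingerprints stored as Python ints."""
    union = bin(fp_a | fp_b).count("1")
    if union == 0:
        return 0.0  # convention chosen here for two all-zero fingerprints
    return bin(fp_a & fp_b).count("1") / union


@dataclass
class Component:
    role: str             # what the step does in the pipeline
    algorithm: str        # the named method or variant used
    affects_result: bool  # assumption: False for pure speed/memory changes


@dataclass
class Provenance:
    toolkit: str
    components: list = field(default_factory=list)

    def result_relevant(self):
        """Components that could change which compound is reported."""
        return [c for c in self.components if c.affects_result]


# The six items from the comment above; the True/False values are guesses.
record = Provenance("ToolkitX", [
    Component("similarity measure", "Tanimoto, binary fingerprints", True),
    Component("aromaticity perception", "ToolkitX model", True),
    Component("fingerprint generation", "ToolkitX hash fingerprints", True),
    Component("RNG choice", "unspecified", True),
    Component("search-space constraint", "Swamidass and Baldi", False),
    Component("constant-memory stage", "local modification", False),
])

for c in record.result_relevant():
    print(f"{c.role}: {c.algorithm}")
```

Even this toy version shows the problem: someone has to decide, per component, whether it belongs in the "generated by" statement at all.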

Dmitry Pavlov said...

Hello Andrew,

You can look at Indigo's implementation in the Matr3f::bestFit() method:

https://github.com/ggasoftware/indigo/blob/master/common/math/best_fit.cpp

It apparently includes the 1978 correction, although when we wrote the code we did not know about the 1978 paper :) We made the correction independently.

Happy New Year!

Regards,

Dmitry

Dmitry Pavlov said...

Correction: the method is Transform3f::bestFit(), not Matr3f::bestFit().

Andrew Dalke said...

С Новым Годом (Happy New Year), Dmitry! Thanks for the comment. I figured you all did it right. And it's a case I hadn't considered.

That you hadn't known about the updated algorithm highlights a gap in the literature record: ideally, when I read the original paper, I would see a note saying "corrected in ...".

Since we're talking about ontologies, perhaps an ontology of literature references: "this paper implements XYZ", "... corrects XYZ", "... implements a subset of XYZ", "... was inspired by XYZ", "... thinks XYZ is horrible but has to reference it because my advisor said so." And so on. :)
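As a rough illustration only, typed references like these could be stored as simple triples and queried in reverse, which is what would produce the "corrected in ..." note. The `Relation`, `Citation`, and `corrected_in` names and the paper identifiers below are all made up for this sketch.

```python
from enum import Enum
from typing import NamedTuple


class Relation(Enum):
    IMPLEMENTS = "implements"
    CORRECTS = "corrects"
    IMPLEMENTS_SUBSET_OF = "implements a subset of"
    INSPIRED_BY = "was inspired by"


class Citation(NamedTuple):
    citing: str         # the work making the reference
    relation: Relation  # how it relates to the cited work
    cited: str          # the work being referenced


def corrected_in(paper, citations):
    """Return the works that declare a correction to `paper`."""
    return [c.citing for c in citations
            if c.cited == paper and c.relation is Relation.CORRECTS]


# Hypothetical identifiers, not real publications.
citations = [
    Citation("paper-B", Relation.CORRECTS, "paper-A"),
    Citation("toolkit-C", Relation.IMPLEMENTS, "paper-B"),
    Citation("paper-D", Relation.INSPIRED_BY, "paper-A"),
]

print(corrected_in("paper-A", citations))
```

The reverse lookup is the useful part: a reader of paper-A learns about paper-B without paper-A ever being edited.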