Wednesday, August 3, 2016
Reading an ASCII file in Python 3.5 is 2-3x faster as bytes than as a string
I also checked how the RDKit handles invalid Unicode, to see what another toolkit did for the same problem. I concluded that it uses bytes internally and exposes strings, which causes problems if those bytes cannot be converted to strings.
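As a rough illustration of the bytes-versus-string distinction above, here is a minimal sketch in Python 3. It is not the benchmark from the post and it says nothing about how the RDKit itself works; the helper names (`count_lines_bytes`, `count_lines_text`, `make_sample_file`, `strict_decode_ok`) and the sample data are made up for this example. It shows that a file read in binary mode never needs decoding, while the same file read in text mode can fail when a byte is not valid UTF-8, unless an error handler such as `surrogateescape` is used.

```python
import os
import tempfile


def count_lines_bytes(path):
    """Count lines without decoding; works for any byte content."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def count_lines_text(path, errors="strict"):
    """Count lines after decoding as UTF-8; may raise UnicodeDecodeError."""
    # newline="\n" makes text mode split lines only on "\n", like binary mode.
    with open(path, "r", encoding="utf-8", errors=errors, newline="\n") as f:
        return sum(1 for _ in f)


def make_sample_file():
    """Write two ASCII lines and one line containing an invalid UTF-8 byte."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        f.write(b"C methane\n")
        f.write(b"CC ethane\n")
        f.write(b"O wat\xffer\n")  # 0xFF never appears in valid UTF-8
    return path


def strict_decode_ok(path):
    """Return True if the whole file decodes as strict UTF-8."""
    try:
        count_lines_text(path)
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    sample = make_sample_file()
    try:
        print("lines (bytes):", count_lines_bytes(sample))
        print("strict UTF-8 decode ok:", strict_decode_ok(sample))
        print("lines (surrogateescape):",
              count_lines_text(sample, errors="surrogateescape"))
    finally:
        os.remove(sample)
```

To compare speed on your own data, the two counting functions could be wrapped with the standard `timeit` module; the ratio will depend on the file, the platform, and the Python version.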
This is the place to leave comments about that post.
Sunday, July 6, 2014
The origin of the connection table
I've been trying to understand the origin of the connection table, and the origin of the term "connection table." The full details are in my essay "The origin of the connection table."
My investigations lead me to believe that Calvin Mooers in 1951 described the first practical connection table, which was used by many people in the 1950s and 1960s. In the mid-1990s, people started saying that George Wheland in 1949 was the first to describe the connection table. I investigated that earlier claim. Wheland's textbook does not describe a practical connection table for general-purpose use, nor was the proposal suggested for use by a computer. Wheland brought it up to emphasize that nongeometrical representations were as descriptive as diagrams, but he did not believe that the connection matrix was of practical use.
I then tried to figure out why people say that Wheland is the creator of the connection table, but that's unresolved.
Finally, I tried to figure out when the term "connection table" was coined. It appears to be 1963, from people working at or affiliated with Chemical Abstracts, and perhaps due to influences from electrical engineering.
If you have comments about that essay, leave them here.
Wednesday, June 18, 2014
Calvin Mooers
Saturday, July 27, 2013
Comments about 'Varkony Reconsidered'
I presented a talk titled "Varkony Reconsidered: Subgraph enumeration and the Multiple MCS problem" at the 6th Joint Sheffield Conference on Chemoinformatics on 23 July 2013.
The major goal of the presentation was to show how the 1979 paper of Varkony, which is part of the MCS literature, is better seen as an early paper in frequent subgraph mining. My own MCS algorithm, fmcs, is a variation of that approach, and is best seen as an intellectual descendant of an algorithm by Takahashi. Both the Varkony and Takahashi papers are relatively uncited in the MCS literature, so I spent some time explaining how they fit into the larger context. The full MCS talk is on my web site. This is the place to leave comments.
Monday, May 21, 2012
Testing hard algorithms
Thursday, May 17, 2012
Topologically non-planar molecules
Saturday, May 12, 2012
Maximum Common Substructures and fmcs
- MCS background
- fmcs - find the MCS of a set of compounds
- Finding the MCSes for the ChEBI ontology
- Some analysis information in Testing hard problems