Refining data stored in SQLite - how to join several contacts?

Posted by Krab on Stack Overflow See other posts from Stack Overflow or by Krab
Published on 2010-05-25T11:45:15Z Indexed on 2010/05/25 11:51 UTC
Read the original article Hit count: 344

Filed under:
|
|
|
|

Problem background

Imagine this problem. You have a water molecule which is in contact with other molecules (if the contact is a hydrogen bond, there can be 4 other molecules around my water). Like in the following picture (A, B, C, D are some other atoms and dots mean the contact).

 A   B
  . .
   O
  / \
 H   H
 .   .
 C   D

I have the information about all the dots and I need to eliminate the water in the center and create records describing contacts of A-C, A-D, A-B, B-C, B-D, and C-D.

Database structure

Currently, I have the following structure in the database:

Table atoms:

  • "id" integer PRIMARY KEY,
  • "amino" char(3) NOT NULL, (HOH for water or other value)
  • other columns identifying the atom

Table contacts:

  • "acceptor_id" integer NOT NULL, (the atom near to my hydrogen, here C or D)
  • "donor_id" integer NOT NULL, (here A or B)
  • "directness" char(1) NOT NULL, (this should be D for direct and W for water-mediated)
  • other columns about the contact, such as the distance

Current solution (insufficient)

Now, I'm going through all the contacts which have donor.amino = "HOH". In this sample case, this would select contacts from C and D. For each of these selected contacts, I look up contacts having the same acceptor_id as is the donor_id in the currently selected contact. From this information, I create the new contact. At the end, I delete all contacts to or from HOH.

This way, I am obviously unable to create C-D and A-B contacts (the other 4 are OK).

If I try a similar approach - trying to find two contacts having the same donor_id, I end up with duplicate contacts (C-D and D-C).

Is there a simple way to retrieve all six contacts without duplicates?

I'm dreaming about some one page long SQL query which retrievs just these six wanted rows. :-)
It is preferable to conserve information about who is donor where possible, but not strictly necessary.

Big thanks to all of you who read this question to this point.

© Stack Overflow or respective owner

Related posts about python

Related posts about sql