Querying a Cassandra column family for rows that have not been updated in X days

Posted by knorv on Stack Overflow
Published on 2010-03-31T12:53:05Z

I'm moving an existing MySQL-based application over to Cassandra. So far, finding the equivalent Cassandra data model has been quite easy, but I've stumbled on the following problem, for which I'd appreciate some input:

Consider a MySQL table holding millions of entities:

CREATE TABLE entities (
  id INT AUTO_INCREMENT NOT NULL,
  entity_information VARCHAR(...),
  entity_last_updated DATETIME,
  PRIMARY KEY (id),
  KEY (entity_last_updated)
);

The table is regularly queried for entities that need to be updated:

 SELECT id FROM entities 
  WHERE entity_last_updated IS NULL 
     OR entity_last_updated < DATE_ADD(NOW(), INTERVAL -7*24 HOUR)
  ORDER BY entity_last_updated ASC;

The entities returned by this query are then updated using the following query:

 UPDATE entities 
    SET entity_information = ?, 
        entity_last_updated = NOW()
  WHERE id = ?;
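To make the access pattern concrete, here is a minimal Python sketch of the two operations above, with an in-memory dict standing in for the table. The function names (`stale_ids`, `refresh`) and the dict layout are assumptions made up for illustration; they are not part of the application and say nothing about how Cassandra would store the data.

```python
from datetime import datetime, timedelta

# Entities older than this are considered due for an update (7*24 hours).
STALE_AFTER = timedelta(days=7)


def stale_ids(entities, now):
    """Return ids of entities never updated or last updated before the cutoff.

    Mirrors the SELECT: never-updated entities (None) come first, then the
    rest ordered by entity_last_updated ascending, as MySQL sorts NULLs
    first for ORDER BY ... ASC.
    """
    cutoff = now - STALE_AFTER
    stale = [
        (entity_id, row["entity_last_updated"])
        for entity_id, row in entities.items()
        if row["entity_last_updated"] is None
        or row["entity_last_updated"] < cutoff
    ]
    # Sort key: None first, then oldest timestamp first.
    stale.sort(key=lambda pair: (pair[1] is not None, pair[1] or datetime.min))
    return [entity_id for entity_id, _ in stale]


def refresh(entities, entity_id, information, now):
    """Mirror the UPDATE: store new information and stamp the update time."""
    entities[entity_id]["entity_information"] = information
    entities[entity_id]["entity_last_updated"] = now
```

Whatever the Cassandra model looks like, it needs to support both of these: a range lookup by last-updated time (including "never updated") and a point update by id that also moves the entity out of the stale range.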

What would be the corresponding Cassandra data model that lets me store the given information and efficiently query for entities that need to be updated (that is, entities that have never been updated or have not been updated in the last seven days)?

© Stack Overflow or respective owner
