Is this over-abstraction? (And is there a name for it?)
- by mwhite
I work on a large Django application that uses CouchDB as a database and couchdbkit for mapping CouchDB documents to objects in Python, similar to Django's default ORM. It has dozens of model classes and a hundred or two CouchDB views.
The application allows users to register a "domain". Each domain gets a unique URL containing the domain name and gives access to a project whose data does not overlap with the data of other domains. Each document that is part of a domain has its domain property set to that domain's name.
As far as relationships between the documents go, all domains are effectively mutually exclusive subsets of the data, except for a few edge cases (some users can be members of more than one domain, and there are some administrative reports that include all domains, etc.).
The code is full of explicit references to the domain name, and I'm wondering if it would be worth the added complexity to abstract this out. I'd also like to know if there's a name for the sort of bound property approach I'm taking here.
Basically, I have something like this in mind:
Before
in models.py:

    from couchdbkit.ext.django.schema import (
        Document, StringProperty, StringListProperty)


    class User(Document):
        domain = StringProperty()


    class Group(Document):
        domain = StringProperty()
        name = StringProperty()
        user_ids = StringListProperty()

        # method that returns related document set
        def users(self):
            return [User.get(user_id) for user_id in self.user_ids]

        # method that queries a couch view optimized for a specific lookup
        @classmethod
        def by_name(cls, domain, name):
            # the view method is provided by couchdbkit and handles
            # wrapping json CouchDB results as Python objects, and
            # can take various parameters modifying behavior
            return cls.view('groups/by_name', key=[domain, name])

        # method that creates a related document
        def get_new_user(self):
            user = User(domain=self.domain)
            user.save()
            self.user_ids.append(user._id)
            return user
in views.py:

    from models import User, Group

    # there are tons of views like this, (request, domain, ...)
    def create_new_user_in_group(request, domain, group_name):
        group = Group.by_name(domain, group_name)[0]
        user = User(domain=domain)
        user.save()
        group.user_ids.append(user._id)
        group.save()
in group/by_name/map.js:

    function (doc) {
        if (doc.doc_type == "Group") {
            emit([doc.domain, doc.name], null);
        }
    }
After
in models.py:

    class DomainDocument(Document):
        domain = StringProperty()

        @classmethod
        def domain_view(cls, *args, **kwargs):
            # prepend the bound domain to the view key
            kwargs['key'] = [cls.domain.default] + kwargs['key']
            return super(DomainDocument, cls).view(*args, **kwargs)

        @classmethod
        def get(cls, *args, **kwargs):
            # validate_domain is an optional keyword argument (default True)
            validate_domain = kwargs.pop('validate_domain', True)
            ret = super(DomainDocument, cls).get(*args, **kwargs)
            if validate_domain and ret.domain != cls.domain.default:
                raise Exception("document belongs to a different domain")
            return ret

        @property
        def models(self):
            # a mapping of all models in the application. accessing one
            # (e.g. self.models.User) returns the equivalent of
            class BoundUser(User):
                domain = StringProperty(default=self.domain)

    class User(DomainDocument):
        pass


    class Group(DomainDocument):
        name = StringProperty()
        user_ids = StringListProperty()

        def users(self):
            return [self.models.User.get(user_id) for user_id in self.user_ids]

        @classmethod
        def by_name(cls, name):
            return cls.domain_view('groups/by_name', key=[name])

        def get_new_user(self):
            # the new user picks up the domain from the bound class
            user = self.models.User()
            user.save()
            self.user_ids.append(user._id)
            return user
in views.py:

    # domain_view is a decorator that sets request.models to the same sort
    # of object that is returned by DomainDocument.models, and strips the
    # domain argument that the URL router passes in
    @domain_view
    def create_new_user_in_group(request, group_name):
        group = request.models.Group.by_name(group_name)[0]
        user = request.models.User()
        user.save()
        group.user_ids.append(user._id)
        group.save()
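A minimal sketch of what the @domain_view decorator could look like, written in plain Python without Django. BoundModels and FakeRequest are hypothetical stand-ins introduced only for illustration (the real object would be whatever DomainDocument.models returns), and show_group is a made-up view.

```python
import functools


class BoundModels(object):
    """Hypothetical stand-in for the object returned by DomainDocument.models."""

    def __init__(self, domain):
        self.domain = domain


def domain_view(view_func):
    """Set request.models for the URL's domain and drop the domain argument."""
    @functools.wraps(view_func)
    def wrapper(request, domain, *args, **kwargs):
        # The URL router still passes `domain`; the wrapper consumes it.
        request.models = BoundModels(domain)
        return view_func(request, *args, **kwargs)
    return wrapper


class FakeRequest(object):
    """Minimal placeholder for Django's HttpRequest, for illustration only."""


@domain_view
def show_group(request, group_name):
    # The view body never sees the domain directly.
    return (request.models.domain, group_name)


result = show_group(FakeRequest(), "acme", "admins")
```

The URL pattern would still capture the domain; only the view function's signature loses it.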
(It might be better to leave the abstraction leaky here, that is, to keep emitting doc.domain explicitly in the map function, in order to avoid having to deal with a couchapp-style //! include of a wrapper for emit that prepends doc.domain to the key, or some other similar solution.)
in group/by_name/map.js:

    function (doc) {
        if (doc.doc_type == "Group") {
            emit([doc.name], null);
        }
    }
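To make the models mapping more concrete, here is a rough sketch of one way it might work, in plain Python. Everything here is an assumption for illustration: the Document class is a simplified stand-in and not couchdbkit's, and BoundModels is a hypothetical name for the mapping object. It hands out subclasses of the registered models with domain already set.

```python
class Document(object):
    """Simplified stand-in for couchdbkit's Document; not the real API."""
    domain = None

    def __init__(self, **fields):
        # Fall back to the class-level domain when none is passed explicitly.
        self.domain = fields.pop("domain", type(self).domain)
        for name, value in fields.items():
            setattr(self, name, value)


class User(Document):
    pass


class Group(Document):
    pass


class BoundModels(object):
    """Hands out subclasses of the registered models with `domain` pre-set."""

    def __init__(self, domain, model_classes):
        self._domain = domain
        self._registry = dict((cls.__name__, cls) for cls in model_classes)
        self._cache = {}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, e.g. for `.User`.
        if name.startswith("_") or name not in self._registry:
            raise AttributeError(name)
        if name not in self._cache:
            base = self._registry[name]
            # Build the bound subclass once and reuse it afterwards.
            self._cache[name] = type(name, (base,), {"domain": self._domain})
        return self._cache[name]


models = BoundModels("acme", [User, Group])
user = models.User()
group = models.Group(name="admins")
```

Because the bound classes are real subclasses, isinstance(user, User) still holds, and the unbound User class is left untouched.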
Pros and Cons
So what are the pros and cons of this?
Pros:
DRYer
prevents you from creating related documents but forgetting to set the domain.
prevents you from accidentally writing a Django view to CouchDB view execution path that leads to a security breach
doesn't prevent you from accessing the underlying self.domain and the normal Document.view() method
potentially removes the need for many sanity checks that verify that two documents we expect to share a domain actually do
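The domain check behind the security point could be sketched as follows. This is an illustration only: FAKE_DB is a hypothetical in-memory stand-in for the CouchDB database, and get_in_domain and DomainMismatch are made-up names that mirror the validate_domain idea in DomainDocument.get.

```python
class DomainMismatch(Exception):
    """Raised when a document belongs to a different domain than expected."""


# Hypothetical in-memory stand-in for the CouchDB database.
FAKE_DB = {
    "u1": {"_id": "u1", "doc_type": "User", "domain": "acme"},
    "u2": {"_id": "u2", "doc_type": "User", "domain": "globex"},
}


def get_in_domain(doc_id, domain, validate_domain=True):
    """Fetch a document, refusing to return one from another domain."""
    doc = FAKE_DB[doc_id]
    if validate_domain and doc["domain"] != domain:
        raise DomainMismatch(
            "%s belongs to %r, not %r" % (doc_id, doc["domain"], domain))
    return doc


own_doc = get_in_domain("u1", "acme")
```

With the check on by default, a view that is handed an id from another domain fails loudly instead of returning the document; passing validate_domain=False keeps the escape hatch for the cross-domain edge cases.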
Cons:
adds some complexity
hides what's really happening
requires model class names to be unique across all model modules, or you would need to add per-module sub-attributes to self.models. However, requiring project-wide unique class names for models should actually be fine, because they correspond to the doc_type property that couchdbkit uses to decide which class to instantiate a document as, and that should be unique anyway.
removes explicit dependency documentation (from group.models import Group)
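The unique-name requirement could be enforced when models are registered rather than discovered later. A small sketch, assuming a hypothetical ModelRegistry class that is not part of couchdbkit:

```python
class ModelRegistry(object):
    """Hypothetical registry that maps class names to model classes."""

    def __init__(self):
        self._by_name = {}

    def register(self, cls):
        # Reject a second, different class that reuses an existing name.
        existing = self._by_name.get(cls.__name__)
        if existing is not None and existing is not cls:
            raise ValueError("duplicate model name: %s" % cls.__name__)
        self._by_name[cls.__name__] = cls
        return cls  # returning cls lets register double as a decorator

    def get(self, name):
        return self._by_name[name]


registry = ModelRegistry()


@registry.register
class Group(object):
    pass


# A different class that happens to share the name, as if defined
# in another module.
OtherGroup = type("Group", (object,), {})
```

Registering OtherGroup raises ValueError, so a name collision surfaces at import time instead of as a wrong class at lookup time.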