I'm storing some very basic information "data sources" coming into my application. These data sources can be in the form of a document (e.g. PDF, etc.), audio (e.g. MP3, etc.) or video (e.g. AVI, etc.). Say, for example, I am only interested in the filename of the data source. Thus, I have the following table:
DataSource
Id (PK)
Filename
For each data source, I also need to store some of its attributes. Example for a PDF would be "numbe of pages." Example for audio would be "bit rate." Example for video would be "duration." Each DataSource will have different requirements for the attributes that need to be stored. So, I have modeled "data source attribute" this way:
DataSourceAttribute
Id (PK)
DataSourceId (FK)
Name
Value
Thus, I would have records like these:
DataSource->Id = 1
DataSource->Filename = 'mydoc.pdf'
DataSource->Id = 2
DataSource->Filename = 'mysong.mp3'
DataSource->Id = 3
DataSource->Filename = 'myvideo.avi'
DataSourceAttribute->Id = 1
DataSourceAttribute->DataSourceId = 1
DataSourceAttribute->Name = 'TotalPages'
DataSourceAttribute->Value = '10'
DataSourceAttribute->Id = 2
DataSourceAttribute->DataSourceId = 2
DataSourceAttribute->Name = 'BitRate'
DataSourceAttribute->Value '16'
DataSourceAttribute->Id = 3
DataSourceAttribute->DataSourceId = 3
DataSourceAttribute->Name = 'Duration'
DataSourceAttribute->Value = '1:32'
My problem is that this doesn't seem to scale. For example, say I need to query for all the PDF documents along with thier total number of pages:
Filename, TotalPages
'mydoc.pdf', '10'
'myotherdoc.pdf', '23'
...
The JOINs needed to produce the above result is just too costly. How should I address this problem?