Can this Query be corrected or different table structure needed? (database dumps provided)
- by sandeepan
This is a bit lengthy but I have provided sufficient details and kept things very clear. Please see if you can help. (I will surely accept answer if it solves my problem)
I am sure a person experienced with this can surely help or suggest me to decide the tables structure.
About the system:-
There are tutors who create classes
A tags based search approach is being followed
Tag relations are created/edited when new tutors registers/edits profile data and when tutors create classes (this makes tutors and classes searcheable).For simplicity, let us consider only tutor name and class name are the fields which are matched against search keywords.
In this example, I am considering -
tutor "Sandeepan Nath" has created a class called "first class"
tutor "Bob Cratchit" has created a class called "new class"
Desired search results-
AND logic to be appied on the search keywords and match against class and tutor data(class name + tutor name), in other words, All those classes be shown such that all the search terms are present in the class name or its tutor name. Example to be clear -
Searching "first class" returns class with id_wc = 1. Working
Searching "Sandeepan class" should also return class with id_wc = 1. Not working in System 2.
Problem with profile editing and searching
To tell in one sentence, I am facing a conflict between the ease of profile edition (edition of tag relations when tutor profiles are edited) and the ease of search logic. In the beginning, we had one table structure and search was easy but tag edition logic was very clumsy and unmaintainable(Check System 1 in the section below) .
So we created separate tag relations tables to make profile edition simpler but search has become difficult.
Please dump the tables so that you can run the search query I have given below and see the results.
System 1 (previous system - search easy - profile edition difficult):-
Only one table called All_Tag_Relations table had the all the tag relations. The tags table below is common to both systems 1 and 2.
CREATE TABLE IF NOT EXISTS `all_tag_relations` (
`id_tag_rel` int(10) NOT NULL AUTO_INCREMENT,
`id_tag` int(10) unsigned NOT NULL DEFAULT '0',
`id_tutor` int(10) DEFAULT NULL,
`id_wc` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id_tag_rel`),
KEY `All_Tag_Relations_FKIndex1` (`id_tag`),
KEY `id_wc` (`id_wc`),
KEY `id_tag` (`id_tag`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `all_tag_relations` (`id_tag_rel`, `id_tag`, `id_tutor`, `id_wc`) VALUES
(1, 1, 1, NULL),
(2, 2, 1, NULL),
(3, 1, 1, 1),
(4, 2, 1, 1),
(5, 3, 1, 1),
(6, 4, 1, 1),
(7, 6, 2, NULL),
(8, 7, 2, NULL),
(9, 6, 2, 2),
(10, 7, 2, 2),
(11, 5, 2, 2),
(12, 4, 2, 2);
CREATE TABLE IF NOT EXISTS `tags` (
`id_tag` int(10) unsigned NOT NULL AUTO_INCREMENT,
`tag` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id_tag`),
UNIQUE KEY `tag` (`tag`),
KEY `id_tag` (`id_tag`),
KEY `tag_2` (`tag`),
KEY `tag_3` (`tag`),
KEY `tag_4` (`tag`),
FULLTEXT KEY `tag_5` (`tag`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=8 ;
INSERT INTO `tags` (`id_tag`, `tag`) VALUES
(1, 'Sandeepan'),
(2, 'Nath'),
(3, 'first'),
(4, 'class'),
(5, 'new'),
(6, 'Bob'),
(7, 'Cratchit');
Please note that for every class, the tag rels of its tutor have to be duplicated. Example, for class with id_wc=1, the tag rel records with id_tag_rel = 3 and 4 are actually extras if you compare with the tag rel records with id_tag_rel = 1 and 2.
System 2 (present system - profile edition easy, search difficult)
Two separate tables Tutors_Tag_Relations and Webclasses_Tag_Relations have the corresponding tag relations data (Please dump into a separate database)-
CREATE TABLE IF NOT EXISTS `tutors_tag_relations` (
`id_tag_rel` int(10) NOT NULL AUTO_INCREMENT,
`id_tag` int(10) unsigned NOT NULL DEFAULT '0',
`id_tutor` int(10) DEFAULT NULL,
PRIMARY KEY (`id_tag_rel`),
KEY `All_Tag_Relations_FKIndex1` (`id_tag`),
KEY `id_tag` (`id_tag`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `tutors_tag_relations` (`id_tag_rel`, `id_tag`, `id_tutor`) VALUES
(1, 1, 1),
(2, 2, 1),
(3, 6, 2),
(4, 7, 2);
CREATE TABLE IF NOT EXISTS `webclasses_tag_relations` (
`id_tag_rel` int(10) NOT NULL AUTO_INCREMENT,
`id_tag` int(10) unsigned NOT NULL DEFAULT '0',
`id_tutor` int(10) DEFAULT NULL,
`id_wc` int(10) DEFAULT NULL,
PRIMARY KEY (`id_tag_rel`),
KEY `webclasses_Tag_Relations_FKIndex1` (`id_tag`),
KEY `id_wc` (`id_wc`),
KEY `id_tag` (`id_tag`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `webclasses_tag_relations` (`id_tag_rel`, `id_tag`, `id_tutor`, `id_wc`) VALUES
(1, 3, 1, 1),
(2, 4, 1, 1),
(3, 5, 2, 2),
(4, 4, 2, 2);
CREATE TABLE IF NOT EXISTS `tags` (
`id_tag` int(10) unsigned NOT NULL AUTO_INCREMENT,
`tag` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id_tag`),
UNIQUE KEY `tag` (`tag`),
KEY `id_tag` (`id_tag`),
KEY `tag_2` (`tag`),
KEY `tag_3` (`tag`),
KEY `tag_4` (`tag`),
FULLTEXT KEY `tag_5` (`tag`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=8 ;
INSERT INTO `tags` (`id_tag`, `tag`) VALUES
(1, 'Sandeepan'),
(2, 'Nath'),
(3, 'first'),
(4, 'class'),
(5, 'new'),
(6, 'Bob'),
(7, 'Cratchit');
CREATE TABLE IF NOT EXISTS `all_tag_relations` (
`id_tag_rel` int(10) NOT NULL AUTO_INCREMENT,
`id_tag` int(10) unsigned NOT NULL DEFAULT '0',
`id_tutor` int(10) DEFAULT NULL,
`id_wc` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id_tag_rel`),
KEY `All_Tag_Relations_FKIndex1` (`id_tag`),
KEY `id_wc` (`id_wc`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
insert into All_Tag_Relations select NULL,id_tag,id_tutor,NULL from Tutors_Tag_Relations;
insert into All_Tag_Relations select NULL,id_tag,id_tutor,id_wc from Webclasses_Tag_Relations;
Here you can see how easily tutor first name can be edited only in one place.
But search has become really difficult, so on being advised to use a Temporary table, I am creating one at every search request, then dumping all the necessary data and then searching from it, I am creating this All_Tag_Relations table at search run time. Here I am just dumping all the data from the two tables Tutors_Tag_Relations and Webclasses_Tag_Relations. But, I am still not able to get classes if I search with tutor name
This is the query which searches "first class". Running them on both the systems shows correct results (returns the class with id_wc = 1).
SELECT wtagrels.id_wc,SUM(DISTINCT( wtagrels.id_tag =3)) AS
key_1_total_matches,
SUM(DISTINCT( wtagrels.id_tag =4)) AS
key_2_total_matches
FROM all_tag_relations AS wtagrels
WHERE ( wtagrels.id_tag =3
OR wtagrels.id_tag =4 )
GROUP BY wtagrels.id_wc
HAVING key_1_total_matches = 1
AND key_2_total_matches = 1
LIMIT 0, 20
But, searching for "Sandeepan class" works only with the 1st system
Here is the query which searches "Sandeepan class"
SELECT wtagrels.id_wc,SUM(DISTINCT( wtagrels.id_tag =1)) AS
key_1_total_matches,
SUM(DISTINCT( wtagrels.id_tag =4)) AS
key_2_total_matches
FROM all_tag_relations AS wtagrels
WHERE ( wtagrels.id_tag =1
OR wtagrels.id_tag =4 )
GROUP BY wtagrels.id_wc
HAVING key_1_total_matches = 1
AND key_2_total_matches = 1
LIMIT 0, 20
Can anybody alter this query and somehow do a proper join or something to get correct results. That solves my problem in a nice way.
As you can figure out, the reason why it does not work in system 2 is that in system 1, for every class, one additional tag relation linking class and tutor name is present. e.g. for class first class, (records with id_tag_rel 3 and 4) which returns the class on searching with tutor name.
So, you see the trade-off between the search and profile edition difficulty with the two systems. How do I overcome both. I have to reach a conclusion soon.
So far my reasoning is
it is definitely not good from a code maintainability point of view to follow the single tag rel table structure of system one, because in a real system while editing a field like "tutor qualifications", there can be as many records in tag rels table as there are words in qualification of a tutor (one word in a field = one tag relation). Now suppose a tutor has 100 classes. When he edits his qualification, all the tag rel rows corresponding to him are deleted and then as many copies are to be created (as per the new qualification data) as there are classes. This becomes particularly difficult if later more searcheable fields are added. The code cannot be robust.
Is the best solution to follow system 2 (edition has to be in one table - no extra work for each and every class) and somehow re-create the all_tag_relations table like system 1 (from the tables tutor_tag_relations and webclasses_tag_relations), creating the extra tutor tag rels for each and every class by a tutor (which is currently missing in system 2's temporary all_tag_relations table). That would be a time consuming logic script. I doubt that table can be recreated without resorting to PHP sript (mysql alone cannot do that). But the problem is that running all this at search time will make search definitely slow.
So, how do such systems work? How are such situations handled?
I thought about we can run a cron which initiates that PHP script, say every 1 minute and replaces the existing all_tag_relations table as per new tag rels from tutor_tag_relations and webclasses_tag_relations (replaces means creates a new table, deletes the original and renames the new one as all_tag_relations, otherwise search won't work during that period- or is there any better way to that?).
Anyway, the result would be that any changes by tutors will reflect in search in the next 1 minute and not immediately.
An alternateve would be to initate that PHP script every time a tutor edits his profile. But here again, since many users may edit their profiles concurrently, will the creation of so many tables be a burden and can mysql make the server slow?
Any help would be appreciated and working solution will be accepted as answer.
Thanks,
Sandeepan