index = {'Michael': [['mj.com', 1], ['Nine.com', 9], ['i.com', 34]],
         'Jackson': [['One.com', 4], ['mj.com', 2], ['Nine.com', 10], ['i.com', 45]],
         'Thriller': [['Seven.com', 7], ['Ten.com', 10], ['One.com', 5], ['mj.com', 3]]}
# In this dictionary (index), each entry has the form
# 'KEYWORD': [[LINK, POSITION], ...]
# where LINK is a page in which KEYWORD is present and POSITION is the
# position of KEYWORD in that page.
For example, 'Michael' is present in mj.com, Nine.com, and i.com at positions 1, 9, and 34 of the respective pages.
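To make the structure concrete, here is a small sketch of reading the entries for one keyword. It uses a one-keyword copy of the index so it runs on its own; the variable name `pairs` is only illustrative.

```python
# One-keyword copy of the index, just for illustration.
index = {'Michael': [['mj.com', 1], ['Nine.com', 9], ['i.com', 34]]}

# Each entry is a [link, position] pair, so it unpacks into two names.
pairs = [(link, position) for link, position in index['Michael']]

for link, position in pairs:
    print(link, position)
```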
Please help me with a Python procedure that takes index and the KEYWORDS as input.
When I enter 'MICHAEL', the result should be:
>>['mj.com', 'Nine.com', 'i.com']
When I enter 'MICHAEL JACKSON', the result should be:
>>['mj.com', 'Nine.com']
because 'Michael' and 'Jackson' appear consecutively in 'mj.com' and 'Nine.com', i.e. at positions (1, 2) and (9, 10) respectively. The result should not include 'i.com': it contains both KEYWORDS, but they are not placed consecutively there (positions 34 and 45).
When I enter 'MICHAEL JACKSON THRILLER', the result should be:
['mj.com']
because the three words 'MICHAEL', 'JACKSON', 'THRILLER' are placed consecutively in 'mj.com', i.e. at positions (1, 2, 3) respectively.
If I enter 'THRILLER JACKSON' or 'THRILLER FEDERER', the result should be None.
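
One possible approach, as a rough sketch rather than a definitive answer: start from the (link, position) pairs of the first keyword, and for the n-th following keyword keep only the pairs whose link also has that keyword at position + n. The function name `lookup` is hypothetical, and the sketch assumes matching should be case-insensitive, since the examples query 'MICHAEL' against the key 'Michael'.

```python
index = {'Michael': [['mj.com', 1], ['Nine.com', 9], ['i.com', 34]],
         'Jackson': [['One.com', 4], ['mj.com', 2], ['Nine.com', 10], ['i.com', 45]],
         'Thriller': [['Seven.com', 7], ['Ten.com', 10], ['One.com', 5], ['mj.com', 3]]}


def lookup(index, query):
    """Return links where the query words occur at consecutive positions, else None."""
    # Assumption: keywords are compared case-insensitively.
    lowered = {key.lower(): entries for key, entries in index.items()}
    words = query.lower().split()
    if not words or words[0] not in lowered:
        return None

    # Candidate (link, start position) pairs, taken from the first word.
    candidates = [(link, pos) for link, pos in lowered[words[0]]]

    # The word at offset n must appear in the same link at start position + n.
    for offset, word in enumerate(words[1:], start=1):
        if word not in lowered:
            return None
        seen = {(link, pos) for link, pos in lowered[word]}
        candidates = [(link, pos) for link, pos in candidates
                      if (link, pos + offset) in seen]

    # Keep each link once, in the order it appears for the first word.
    result = []
    for link, _ in candidates:
        if link not in result:
            result.append(link)
    return result or None


print(lookup(index, 'MICHAEL'))
print(lookup(index, 'MICHAEL JACKSON'))
print(lookup(index, 'MICHAEL JACKSON THRILLER'))
print(lookup(index, 'THRILLER JACKSON'))
```

Word order matters here: 'THRILLER JACKSON' finds nothing because no page has 'Jackson' one position after 'Thriller', and a keyword missing from the index, such as 'FEDERER', ends the search immediately.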