Message Queue: Which one is the best scenario?

Posted by pandaforme on Programmers
Published on 2014-06-09T08:44:52Z Indexed on 2014/06/09 9:41 UTC

Filed under: message-queue | web-crawler

I am writing a web crawler.

The crawler has two steps:

1. get an HTML page
2. parse the page

I want to use a message queue to improve performance and throughput.

I have two scenarios in mind:

scenario 1:

    structure: 
    urlProducer -> queue1 -> urlConsumer -> queue2 -> parserConsumer

urlProducer: gets a target URL and adds it to queue1

urlConsumer: using the job info, fetches the HTML page and adds it to queue2

parserConsumer: using the job info, parses the page
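
A minimal sketch of scenario 1, assuming in-process `queue.Queue` objects stand in for a real message broker and one thread runs each role. `fetch_html`, `parse_page`, `run_scenario1` and the `SENTINEL` end marker are hypothetical names for illustration, and the fetch step is a stub rather than a real HTTP request.

```python
import queue
import threading

SENTINEL = None  # end-of-stream marker (an assumption of this sketch)


def fetch_html(url):
    # Hypothetical stand-in for a real HTTP request.
    return f"<html><title>{url}</title></html>"


def parse_page(html):
    # Hypothetical stand-in for real parsing: return the title text.
    start = html.index("<title>") + len("<title>")
    return html[start:html.index("</title>")]


def url_producer(urls, queue1):
    # urlProducer: add each target URL to queue1.
    for url in urls:
        queue1.put(url)
    queue1.put(SENTINEL)


def url_consumer(queue1, queue2):
    # urlConsumer: fetch each page and pass it straight on to queue2.
    while True:
        url = queue1.get()
        if url is SENTINEL:
            queue2.put(SENTINEL)  # propagate shutdown down the chain
            break
        queue2.put((url, fetch_html(url)))


def parser_consumer(queue2, results):
    # parserConsumer: parse each fetched page.
    while True:
        job = queue2.get()
        if job is SENTINEL:
            break
        url, html = job
        results[url] = parse_page(html)


def run_scenario1(urls):
    queue1, queue2, results = queue.Queue(), queue.Queue(), {}
    threads = [
        threading.Thread(target=url_producer, args=(urls, queue1)),
        threading.Thread(target=url_consumer, args=(queue1, queue2)),
        threading.Thread(target=parser_consumer, args=(queue2, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The fetched HTML exists only in queue2 here, so the parse step depends directly on the fetch step having just run. With several consumers per queue, this shutdown scheme would need one sentinel per consumer.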

scenario 2:

    structure:
    urlProducer -> queue1 -> urlConsumer
    parserProducer -> queue2 -> parserConsumer

urlProducer: gets a target URL and adds it to queue1

urlConsumer: using the job info, fetches the HTML page and writes it to the db

parserProducer: reads the HTML page from the db and adds it to queue2

parserConsumer: using the job info, parses the page
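
A minimal sketch of scenario 2, assuming a plain dict stands in for the db and each stage runs in a single thread for brevity. `fetch_html`, `parse_page`, `run_fetch_stage` and `run_parse_stage` are hypothetical names for illustration, and the fetch step is a stub rather than a real HTTP request.

```python
import queue


def fetch_html(url):
    # Hypothetical stand-in for a real HTTP request.
    return f"<html><title>{url}</title></html>"


def parse_page(html):
    # Hypothetical stand-in for real parsing: return the title text.
    start = html.index("<title>") + len("<title>")
    return html[start:html.index("</title>")]


def run_fetch_stage(urls, db):
    """urlProducer -> queue1 -> urlConsumer, ending with a write to the db."""
    queue1 = queue.Queue()
    for url in urls:               # urlProducer
        queue1.put(url)
    while not queue1.empty():      # urlConsumer
        url = queue1.get()
        db[url] = fetch_html(url)  # store the page instead of forwarding it


def run_parse_stage(db):
    """parserProducer -> queue2 -> parserConsumer, reading pages from the db."""
    queue2 = queue.Queue()
    for url in db:                 # parserProducer
        queue2.put(url)
    results = {}
    while not queue2.empty():      # parserConsumer
        url = queue2.get()
        results[url] = parse_page(db[url])
    return results
```

Because the pages are stored in the db, the parse stage can be run again on its own without fetching anything a second time:

```python
db = {}
run_fetch_stage(["http://a.example"], db)
first = run_parse_stage(db)
second = run_parse_stage(db)  # re-parse from the db, no new fetch
```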

Each structure can have multiple producers and consumers.

Scenario 1 works like a chain of calls. When an error occurs, it is difficult to find where the problem is.

Scenario 2 decouples queue1 from queue2. When an error occurs, it is easy to find where the problem is.

I'm not sure this reasoning is correct.

Which scenario is better? Or are there other scenarios I should consider?

Thanks~

© Programmers or respective owner
