Search Results

Search found 1 result on 1 page for 'user332912'.

  • Paginating requests to an API

    - by user332912
    I'm consuming (via urllib/urllib2) an API that returns XML results. The API always returns the total_hit_count for my query, but only allows me to retrieve results in batches of, say, 100 or 1000. The API stipulates that I specify a start_pos and end_pos as offsets in order to walk through the results.

    Say the urllib request looks like "http://someservice?query='test'&start_pos=X&end_pos=Y".

    If I send an initial 'taster' query with the lowest possible data transfer, such as "http://someservice?query='test'&start_pos=1&end_pos=1", and get back a result of, for the sake of argument, total_hits = 1234, I'd like to work out the cleanest way to request those 1234 results in batches of, again, say 100 or 1000.

    This is what I have come up with so far. It seems to work, but I'd like to know whether you would have done things differently or whether I could improve on it:

        hits_per_page = 1000  # or 100 or 200 or whatever, adjustable
        total_hits = 1234     # retrieved with BSoup from the 'taster' query
        base_url = "http://someservice?query='test'"

        # 1-based start offset of each batch
        startdoc_positions = [n for n in range(1, total_hits, hits_per_page)]
        # matching inclusive end offset of each batch
        enddoc_positions = [startdoc_position + hits_per_page - 1
                            for startdoc_position in startdoc_positions]

        for start, end in zip(startdoc_positions, enddoc_positions):
            # clamp the last batch to the number of hits available
            if end > total_hits:
                end = total_hits
            print "url to request is:\n ",
            print "%s&start_pos=%s&end_pos=%s" % (base_url, start, end)

    P.S. I'm a long-time consumer of StackOverflow, especially the Python questions, but this is my first question posted. You guys are just brilliant.
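
One possible variation on the batching loop in the question, offered only as a sketch: a generator yields each (start_pos, end_pos) pair directly, so the two intermediate lists and the zip are not needed. The names page_ranges and page_urls are hypothetical and not part of any API; the 1-based, inclusive offsets and the URL shape are taken from the question. Using total_hits + 1 as the range bound is an assumption about the intended behaviour: with range(1, total_hits, hits_per_page) the last hit is skipped whenever total_hits is exactly one more than a multiple of hits_per_page (for example 1001 hits with 1000 per page).

```python
def page_ranges(total_hits, hits_per_page):
    """Yield 1-based, inclusive (start_pos, end_pos) pairs covering total_hits."""
    if hits_per_page < 1:
        raise ValueError("hits_per_page must be at least 1")
    # total_hits + 1 so that a final hit which begins a new batch is included
    for start in range(1, total_hits + 1, hits_per_page):
        # clamp the last batch to the number of hits actually available
        yield start, min(start + hits_per_page - 1, total_hits)


def page_urls(base_url, total_hits, hits_per_page):
    """Build one request URL per batch (URL shape as given in the question)."""
    return ["%s&start_pos=%d&end_pos=%d" % (base_url, start, end)
            for start, end in page_ranges(total_hits, hits_per_page)]


if __name__ == "__main__":
    for url in page_urls("http://someservice?query='test'", 1234, 1000):
        print(url)
```

The sketch only builds the URLs; fetching and parsing each batch would stay as in the original code.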
