Search Results

Search found 13683 results on 548 pages for 'python sphinx'.

Page 370/548 | < Previous Page | 366 367 368 369 370 371 372 373 374 375 376 377  | Next Page >

  • Scrapy Not Returning Additonal Info from Scraped Link in Item via Request Callback

    - by zoonosis
    Basically the code below scrapes the first 5 items of a table. One of the fields is another href and clicking on that href provides more info which I want to collect and add to the original item. So parse is supposed to pass the semi populated item to parse_next_page which then scrapes the next bit and should return the completed item back to parse Running the code below only returns the info collected in parse If I change the return items to return request I get a completed item with all 3 "things" but I only get 1 of the rows, not all 5. Im sure its something simple, I just can't see it. class ThingSpider(BaseSpider): name = "thing" allowed_domains = ["somepage.com"] start_urls = [ "http://www.somepage.com" ] def parse(self, response): hxs = HtmlXPathSelector(response) items = [] for x in range (1,6): item = ScrapyItem() str_selector = '//tr[@name="row{0}"]'.format(x) item['thing1'] = hxs.select(str_selector")]/a/text()').extract() item['thing2'] = hxs.select(str_selector")]/a/@href').extract() print 'hello' request = Request("www.nextpage.com", callback=self.parse_next_page,meta={'item':item}) print 'hello2' request.meta['item'] = item items.append(item) return items def parse_next_page(self, response): print 'stuff' hxs = HtmlXPathSelector(response) item = response.meta['item'] item['thing3'] = hxs.select('//div/ul/li[1]/span[2]/text()').extract() return item

    Read the article

  • Getting unpredictable data into a tabular format

    - by Acorn
    The situation: Each page I scrape has <input> elements with a title= and a value= I don't know what is going to be on the page. I want to have all my collected data in a single table at the end, with a column for each title. So basically, I need each row of data to line up with all the others, and if a row doesn't have a certain element, then it should be blank (but there must be something there to keep the alignment). eg. First page has: {animal: cat, colour: blue, fruit: lemon, day: monday} Second page has: {animal: fish, colour: green, day: saturday} Third page has: {animal: dog, number: 10, colour: yellow, fruit: mango, day: tuesday} Then my resulting table should be: animal | number | colour | fruit | day cat | none | blue | lemon | monday fish | none | green | none | saturday dog | 10 | yellow | mango | tuesday Although it would be good to keep the order of the title value pairs, which I know dictionaries wont do. So basically, I need to generate columns from all the titles (kept in order but somehow merged together) What would be the best way of going about this without knowing all the possible titles and explicitly specifying an order for the values to be put in?

    Read the article

  • How can I draw a log-normalized imshow plot with a colorbar representing the raw data in matplotlib

    - by Adam Fraser
    I'm using matplotlib to plot log-normalized images but I would like the original raw image data to be represented in the colorbar rather than the [0-1] interval. I get the feeling there's a more matplotlib'y way of doing this by using some sort of normalization object and not transforming the data beforehand... in any case, there could be negative values in the raw image. import matplotlib.pyplot as plt import numpy as np def log_transform(im): '''returns log(image) scaled to the interval [0,1]''' try: (min, max) = (im[im > 0].min(), im.max()) if (max > min) and (max > 0): return (np.log(im.clip(min, max)) - np.log(min)) / (np.log(max) - np.log(min)) except: pass return im a = np.ones((100,100)) for i in range(100): a[i] = i f = plt.figure() ax = f.add_subplot(111) res = ax.imshow(log_transform(a)) # the colorbar drawn shows [0-1], but I want to see [0-99] cb = f.colorbar(res) I've tried using cb.set_array, but that didn't appear to do anything, and cb.set_clim, but that rescales the colors completely. Thanks in advance for any help :)

    Read the article

  • initialize a numpy array

    - by Curious2learn
    Is there way to initialize a numpy array of a shape and add to it? I will explain what I need with a list example. If I want to create a list of objects generated in a loop, I can do: a = [] for i in range(5): a.append(i) I want to do something similar with a numpy array. I know about vstack, concatenate etc. However, it seems these require two numpy arrays as inputs. What I need is: big_array # Initially empty. This is where I don't know what to specify for i in range(5): array i of shape = (2,4) created. add to big_array The big_array should have a shape (10,4). How to do this? Thanks for your help.

    Read the article

  • Deterministic key serialization

    - by Mike Boers
    I'm writing a mapping class which uses SQLite as the storage backend. I am currently allowing only basestring keys but it would be nice if I could use a couple more types hopefully up to anything that is hashable (ie. same requirements as the builtin dict). To that end I would like to derive a deterministic serialization scheme. Ideally, I would like to know if any implementation/protocol combination of pickle is deterministic for hashable objects (e.g. can only use cPickle with protocol 0). I noticed that pickle and cPickle do not match: >>> import pickle >>> import cPickle >>> def dumps(x): ... print repr(pickle.dumps(x)) ... print repr(cPickle.dumps(x)) ... >>> dumps(1) 'I1\n.' 'I1\n.' >>> dumps('hello') "S'hello'\np0\n." "S'hello'\np1\n." >>> dumps((1, 2, 'hello')) "(I1\nI2\nS'hello'\np0\ntp1\n." "(I1\nI2\nS'hello'\np1\ntp2\n." Another option is to use repr to dump and ast.literal_eval to load. This would only be valid for builtin hashable types. I have written a function to determine if a given key would survive this process (it is rather conservative on the types it allows): def is_reprable_key(key): return type(key) in (int, str, unicode) or (type(key) == tuple and all( is_reprable_key(x) for x in key)) The question for this method is if repr itself is deterministic for the types that I have allowed here. I believe this would not survive the 2/3 version barrier due to the change in str/unicode literals. This also would not work for integers where 2**32 - 1 < x < 2**64 jumping between 32 and 64 bit platforms. Are there any other conditions (ie. do strings serialize differently under different conditions)? (If this all fails miserably then I can store the hash of the key along with the pickle of both the key and value, then iterate across rows that have a matching hash looking for one that unpickles to the expected key, but that really does complicate a few other things and I would rather not do it.) Any insights?

    Read the article

  • Form (or Formset?) to handle multiple table rows in Django

    - by Ben
    Hi, I'm working on my first Django application. In short, what it needs to do is to display a list of film titles, and allow users to give a rating (out of 10) to each film. I've been able to use the {{ form }} and {{ formset }} syntax in a template to produce a form which lets you rate one film at a time, which corresponds to one row in a MySQL table, but how do I produce a form that iterates over all the movie titles in the database and produces a form that lets you rate lots of them at once? At first, I thought this was what formsets were for, but I can't see any way to automatically iterate over the contents of a database table to produce items to go in the form, if you see what I mean. Currently, my views.py has this code: def survey(request): ScoreFormSet = formset_factory(ScoreForm) if request.method == 'POST': formset = ScoreFormSet(request.POST, request.FILES) if formset.is_valid(): return HttpResponseRedirect('/') else: formset = ScoreFormSet() return render_to_response('cf/survey.html', { 'formset':formset, }) And my survey.html has this: <form action="/survey/" method="POST"> <table> {{ formset }} </table> <input type = "submit" value = "Submit"> </form> Oh, and the definition of ScoreForm and Score from models.py are: class Score(models.Model): movie = models.ForeignKey(Movie) score = models.IntegerField() user = models.ForeignKey(User) class ScoreForm(ModelForm): class Meta: model = Score So, in case the above is not clear, what I'm aiming to produce is a form which has one row per movie, and each row shows a title, and has a box to allow the user to enter their score. If anyone can point me at the right sort of approach to this, I'd be most grateful. Thanks, Ben

    Read the article

  • How to get bit rotation function to accept any bit size?

    - by calccrypto
    i have these 2 functions i got from some other code def ROR(x, n): mask = (2L**n) - 1 mask_bits = x & mask return (x >> n) | (mask_bits << (32 - n)) def ROL(x, n): return ROR(x, 32 - n) and i wanted to use them in a program, where 16 bit rotations are required. however, there are also other functions that require 32 bit rotations, so i wanted to leave the 32 in the equation, so i got: def ROR(x, n, bits = 32): mask = (2L**n) - 1 mask_bits = x & mask return (x >> n) | (mask_bits << (bits - n)) def ROL(x, n, bits = 32): return ROR(x, bits - n) however, the answers came out wrong when i tested this set out. yet, the values came out correctly when the code is def ROR(x, n): mask = (2L**n) - 1 mask_bits = x & mask return (x >> n) | (mask_bits << (16 - n)) def ROL(x, n,bits): return ROR(x, 16 - n) what is going on and how do i fix this?

    Read the article

  • How small is *too small* for an opensource project?

    - by Adam Lewis
    I have a fair number of smaller projects / libraries that I have been using over the past 2 years. I am thinking about moving them to Google Code to make it easier to share with co-workers and easier to import them into new projects on my own environments. The are things like a simple FSMs, CAN (Controller Area Network) drivers, and GPIB drivers. Most of them are small (less than 500 lines), so it makes me wonder are these types of things too small for a stand alone open-source project? Note that I would like to make it opensource because it does not give me, or my company, any real advantage.

    Read the article

  • twisted reactor stops too early

    - by pygabriel
    I'm doing a batch script to connect to a tcp server and then exiting. My problem is that I can't stop the reactor, for example: cmd = raw_input("Command: ") # custom factory, the protocol just send a line reactor.connectTCP(HOST,PORT, CommandClientFactory(cmd) d = defer.Deferred() d.addCallback(lambda x: reactor.stop()) reactor.callWhenRunning(d.callback,None) reactor.run() In this code the reactor stops before that the tcp connection is done and the cmd is passed. How can I stop the reactor after that all the operation are finished?

    Read the article

  • method __getattr__ is not inherited from parent class

    - by ??????
    Trying to subclass mechanize.Browser class: from mechanize import Browser class LLManager(Browser, object): IS_AUTHORIZED = False def __init__(self, login = "", passw = "", *args, **kwargs): super(LLManager, self).__init__(*args, **kwargs) self.set_handle_robots(False) But when I make something like this: lm["Widget[LinksList]_link_1_title"] = anc then I get an error: Traceback (most recent call last): File "<pyshell#8>", line 1, in <module> lm["Widget[LinksList]_link_1_title"] = anc TypeError: 'LLManager' object does not support item assignment Browser class have overridden method __getattr__ as shown: def __getattr__(self, name): # pass through _form.HTMLForm methods and attributes form = self.__dict__.get("form") if form is None: raise AttributeError( "%s instance has no attribute %s (perhaps you forgot to " ".select_form()?)" % (self.__class__, name)) return getattr(form, name) Why my class or instance don't get this method as in parent class?

    Read the article

  • has any tools easy to download or uploaed data from gae ..

    - by zjm1126
    i find this: http://aralbalkan.com/1784 but it is : Gaebar is an easy-to-use, standalone Django application that you can plug in to your existing Google App Engine Django or app-engine-patch-based Django applications on Google App Engine to give them datastore backup and restore functionality. my app is not based on django,so did you know any tools esay to do this . thanks

    Read the article

  • How to create instances of a class from a static method?

    - by Pierre
    Hello. Here is my problem. I have created a pretty heavy readonly class making many database calls with a static "factory" method. The goal of this method is to avoid killing the database by looking in a pool of already-created objects if an identical instance of the same object (same type, same init parameters) already exists. If something was found, the method will just return it. No problem. But if not, how may I create an instance of the object, in a way that works with inheritance? >>> class A(Object): >>> @classmethod >>> def get_cached_obj(self, some_identifier): >>> # Should do something like `return A(idenfier)`, but in a way that works >>> class B(A): >>> pass >>> A.get_cached_obj('foo') # Should do the same as A('foo') >>> A().get_cached_obj('foo') # Should do the same as A('foo') >>> B.get_cached_obj('bar') # Should do the same as B('bar') >>> B().get_cached_obj('bar') # Should do the same as B('bar') Thanks.

    Read the article

  • Performing non-blocking requests? - Django

    - by RadiantHex
    Hi folks, I have been playing with other frameworks, such as NodeJS, lately. I love the possibility to return a response, and still being able to do further operations. e.g. def view(request): do_something() return HttpResponse() do_more_stuff() #not possible!!! Maybe Django already offers a way to perform operations after returning a request, if that is the case that would be great. Help would be very much appreciated! =D

    Read the article

  • Execute function without sending 'self' to it

    - by Sergey
    Is that possible to define a function without referencing to self this way? def myfunc(var_a,var_b) But so that it could also get sender data, like if I defined it like this: def myfunc(self, var_a,var_b) That self is always the same so it looks a little redundant here always to run a function this way: myfunc(self,'data_a','data_b'). Then I would like to get its data in the function like this sender.fields. UPDATE: Here is some code to understand better what I mean. The class below is used to show a page based on Jinja2 templates engine for users to sign up. class SignupHandler(webapp.RequestHandler): def get(self, *args, **kwargs): utils.render_template(self, 'signup.html') And this code below is a render_template that I created as wrapper to Jinja2 functions to use it more conveniently in my project: def render_template(response, template_name, vars=dict(), is_string=False): template_dirs = [os.path.join(root(), 'templates')] logging.info(template_dirs[0]) env = Environment(loader=FileSystemLoader(template_dirs)) try: template = env.get_template(template_name) except TemplateNotFound: raise TemplateNotFound(template_name) content = template.render(vars) if is_string: return content else: response.response.out.write(content) As I use this function render_template very often in my project and usually the same way, just with different template files, I wondered if there was a way to get rid of having to call it like I do it now, with self as the first argument but still having access to that object.

    Read the article

  • mod_wsgi daemon mode vs threaded fastcgi

    - by t0ster
    Can someone explain the difference between apache mod_wsgi in daemon mode and django fastcgi in threaded mode. They both use threads for concurrency I think. Supposing that I'm using nginx as front end to apache mod_wsgi. UPDATE: I'm comparing django built in fastcgi(./manage.py method=threaded maxchildren=15) and mod_wsgi in 'daemon' mode(WSGIDaemonProcess example threads=15). They both use threads and acquire GIL, am I right?

    Read the article

  • Not able to pass multiple override parameters using nose-testconfig 0.6 plugin in nosetests

    - by Jaikit
    Hi, I am able to override multiple config parameters using nose-testconfig plugin only if i pass the overriding parameters on commandline. e.g. nosetests -c nose.cfg -s --tc=jack.env1:asl --tc=server2.env2:abc But when I define the same thing inside nose.cfg, than only the value for last parameter is modified. e.g. tc = server2.env2:abc tc = jack.env1:asl I checked the plugin code. It looks fine to me. I am pasting the part of plugin code below: parser.add_option( "--tc", action="append", dest="overrides", default = [], help="Option:Value specific overrides.") configure: if options.overrides: self.overrides = [] overrides = tolist(options.overrides) for override in overrides: keys, val = override.split(":") if options.exact: config[keys] = val else: ns = ''.join(['["%s"]' % i for i in keys.split(".") ]) # BUG: Breaks if the config value you're overriding is not # defined in the configuration file already. TBD exec('config%s = "%s"' % (ns, val)) Let me know if any one has any clue.

    Read the article

  • Finding matching submatrics inside a matrix

    - by DaveO
    I have a 100x200 2D array expressed as a numpy array consisting of black (0) and white (255) cells. It is a bitmap file. I then have 2D shapes (it's easiest to think of them as letters) that are also 2D black and white cells. I know I can naively iterate through the matrix but this is going to be a 'hot' portion of my code so speed is an concern. Is there a fast way to perform this in numpy/scipy? I looked briefly at Scipy's correlate function. I am not interested in 'fuzzy matches', only exact matches. I also looked at some academic papers but they are above my head.

    Read the article

  • Programmatically sync the db in Django

    - by Attila Oláh
    I'm trying to sync my db from a view, something like this: from django import http from django.core import management def syncdb(request): management.call_command('syncdb') return http.HttpResponse('Database synced.') The issue is, it will block the dev server by asking for user input from the terminal. How can I pass it the '--noinput' option to prevent asking me anything? I have other ways of marking users as super-user, so there's no need for the user input, but I really need to call syncdb (and flush) programmatically, without logging on to the server via ssh. Any help is appreciated.

    Read the article

  • Extract points within a shape from a raster

    - by user308827
    Hi, I have a raster file (basically 2D array) with close to a million points. I am trying to extract a circle from the raster (and all the points that lie within the circle. Using ArcGIS is exceedingly slow for this. Can anyone suggest any image processing library that is both easy to learn and powerful and quick enough for something like this? Thanks!

    Read the article

  • Why does my buffered GraphicsContext application have a flickering problem?

    - by Bibendum
    import wx class MainFrame(wx.Frame): def __init__(self,parent,title): wx.Frame.__init__(self, parent, title=title, size=(640,480)) self.mainPanel=DoubleBufferTest(self,-1) self.Show(True) class DoubleBufferTest(wx.Panel): def __init__(self,parent=None,id=-1): wx.Panel.__init__(self,parent,id,style=wx.FULL_REPAINT_ON_RESIZE) self.SetBackgroundColour("#FFFFFF") self.timer = wx.Timer(self) self.timer.Start(100) self.Bind(wx.EVT_TIMER, self.update, self.timer) self.Bind(wx.EVT_PAINT,self.onPaint) def onPaint(self,event): event.Skip() dc = wx.MemoryDC() dc.SelectObject(wx.EmptyBitmap(640, 480)) gc = wx.GraphicsContext.Create(dc) gc.PushState() gc.SetBrush(wx.Brush("#CFCFCF")) bgRect=gc.CreatePath() bgRect.AddRectangle(0,0,640,480) gc.FillPath(bgRect) gc.PopState() dc2=wx.PaintDC(self) dc2.Blit(0,0,640,480,dc,0,0) def update(self,event): self.Refresh() app = wx.App(False) f=MainFrame(None,"Test") app.MainLoop() I've come up with this code to draw double buffered GraphicsContext content onto a panel, but there's a constant flickering across the window. I've tried different kinds of paths, like lines and curves but it's still there and I don't know what's causing it.

    Read the article

  • Problem with anchor tags in Django after using lighttpd + fastcgi

    - by Drew A
    I just started using lighttpd and fastcgi for my django site, but I've noticed my anchor links are no longer working. I used the anchor links for sorting links on the page, for example I use an anchor to sort links by the number of points (or votes) they have received. For example: the code in the html template: ... {% load sorting_tags %} ... {% ifequal sort_order "points" %} {% trans "total points" %} {% trans "or" %} {% anchor "date" "date posted" %} {% order_by_votes links request.direction %} {% else %} {% anchor "points" "total points" %} {% trans "or" %} {% trans "date posted" %} ... The anchor link on "www.mysite.com/my_app/" for total points will be directed to "my_app/?sort=points" But the correct URL should be "www.mysite.com/my_app/?sort=points" All my other links work, the problem is specific to anchor links. The {% anchor %} tag is taken from django-sorting, the code can be found at http://github.com/directeur/django-sorting Specifically in django-sorting/templatetags/sorting_tags.py Thanks in advance.

    Read the article

< Previous Page | 366 367 368 369 370 371 372 373 374 375 376 377  | Next Page >