Search Results

Search found 3758 results on 151 pages for 'efficient'.

Page 139/151 | < Previous Page | 135 136 137 138 139 140 141 142 143 144 145 146 | Next Page >

Where is my VMware-ws FreeNAS CIFS(ZFS) bottle-neck?

- by maka

Background: I'm building a quiet HTPC + NAS that is also supposed to be used for general computer usage. I'm so far generally happy with things, it was just that I was expecting a little better IO performance. I have no clue if my expectations are unreal. The NAS is there as a general purpose file storage and as a media server for XBMC and other devices. ZFS is a requirement. Question: Where is my bottle-neck, and is there anything I can do config wise, to improve my performance? I'm thinking VM-disk settings could be something but I really have no idea where to go since I'm neither experienced with FreeNAS nor VMware-WS. Tests: When I'm on the host OS and copy files (from the SSD) to the CIFS share, I get around 30 Mbytes/sec read and write. When I'm on my laptop laptop, wired to the network, I get about the same specs. The test I've done are with a 16 GB ISO, and with about 200 MB of RARs and I've tried avoiding the RAM-cache by reading different files than the ones I'm writing ( 10 GB). It feels like having less CPU cores is a lot more efficient, since the resource manager in Windows reports less CPU-usage. With 4 cores in VMware, CPU usage was 50-80%, with 1 core it was 25-60%. EDIT: HD ActiveTime was quite high on SSD so I moved the page file, disabled hibernate and enabled Win DiskCache both on SSD and RAID. This resulted in no real performance difference for one file, but if i transferred 2 files the total speed went up to 50 Mbytes/s vs ~40. The ActiveTime avg also went down a lot (to ~20%) but has now higher bursts. DiskIO is on ~ 30-35 Mbytes/s avgs, with ~100Mb bursts. Network is on 200-250Mbits/s with ~45 active TCP connections. Hardware Asus F2A85-M Pro A10-5700 16GB DDR3 1600 OCZ Vertex 2 128GB SSD 2x Generic 1tb 7200 RPM drives as RAID0 (in win7) Intel Gigabit Desktop CT Software Host OS: Win7 (SSD) VMware Worksation 9 (SSD) FreeNAS 8.3 VM (20GB VDisk on SSD) CPU: I've tried 1, 2 and 4 cores. Virtualisation engine, Preferred mode: Automatic 10,24Gb ram 50Gb SCSI VDisk on the RAID0, VDisk is formatted as ZFS and exposed through CIFS through FreeNAS. NIC Bridge, Replicate physical network state Below are two typical process print-outs while I'm transfering one file to the CIFS share. last pid: 2707; load averages: 0.60, 0.43, 0.24 up 0+00:07:05 00:34:26 32 processes: 2 running, 30 sleeping Mem: 101M Active, 53M Inact, 1620M Wired, 2188K Cache, 149M Buf, 8117M Free Swap: 4096M Total, 4096M Free PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND 2640 root 1 102 0 50164K 10364K RUN 0:25 25.98% smbd 1897 root 6 44 0 168M 74808K uwait 0:02 0.00% python last pid: 2746; load averages: 0.93, 0.60, 0.33 up 0+00:08:53 00:36:14 33 processes: 2 running, 31 sleeping Mem: 101M Active, 53M Inact, 4722M Wired, 2188K Cache, 152M Buf, 5015M Free Swap: 4096M Total, 4096M Free PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND 2640 root 1 76 0 50164K 10364K RUN 0:52 16.99% smbd 1897 root 6 44 0 168M 74816K uwait 0:02 0.00% python I'm sorry if my question isn't phrased right, I'm really bad at these kind of things, and it is the first time I post here at SU. I also appreciate any other suggestions to something, I could have missed.

Read the article
Distributed and/or Parallel SSIS processing

- by Jeff

Background: Our company hosts SaaS DSS applications, where clients provide us data Daily and/or Weekly, which we process & merge into their existing database. During business hours, load in the servers are pretty minimal as it's mostly users running simple pre-defined queries via the website, or running drill-through reports that mostly hit the SSAS OLAP cube. I manage the IT Operations Team, and so far this has presented an interesting "scaling" issue for us. For our daily-refreshed clients, the server is only "busy" for about 4-6 hrs at night. For our weekly-refresh clients, the server is only "busy" for maybe 8-10 hrs per week! We've done our best to use some simple methods of distributing the load by spreading the daily clients evenly among the servers such that we're not trying to process daily clients back-to-back over night. But long-term this scaling strategy creates two notable issues. First, it's going to consume a pretty immense amount of hardware that sits idle for large periods of time. Second, it takes significant Production Support over-head to basically "schedule" the ETL such that they don't over-lap, and move clients/schedules around if they out-grow the resources on a particular server or allocated time-slot. As the title would imply, one option we've tried is running multiple SSIS packages in parallel, but in most cases this has yielded VERY inconsistent results. The most common failures are DTExec, SQL, and SSAS fighting for physical memory and throwing out-of-memory errors, and ETLs running 3,4,5x longer than expected. So from my practical experience thus far, it seems like running multiple ETL packages on the same hardware isn't a good idea, but I can't be the first person that doesn't want to scale multiple ETLs around manual scheduling, and sequential processing. One option we've considered is virtualizing the servers, which obviously doesn't give you any additional resources, but moves the resource contention onto the hypervisor, which (from my experience) seems to manage simultaneous CPU/RAM/Disk I/O a little more gracefully than letting DTExec, SQL, and SSAS battle it out within Windows. Question to the forum: So my question to the forum is, are we missing something obvious here? Are there tools out there that can help manage running multiple SSIS packages on the same hardware? Would it be more "efficient" in terms of parallel execution if instead of running DTExec, SQL, and SSAS same machine (with every machine running that configuration), we run in pairs of three machines with SSIS running on one machine, SQL on another, and SSAS on a third? Obviously that would only make sense if we could process more than the three ETL we were able to process on the machine independently. Another option we've considered is completely re-architecting our SSIS package to have one "master" package for all clients that attempts to intelligently chose a server based off how "busy" it already is in terms of CPU/Memory/Disk utilization, but that would be a herculean effort, and seems like we're trying to reinvent something that you would think someone would sell (although I haven't had any luck finding it). So in summary, are we missing an obvious solution for this, and does anyone know if any tools (for free or for purchase, doesn't matter) that facilitate running multiple SSIS ETL packages in parallel and on multiple servers? (What I would call a "queue & node based" system, but that's not an official term). Ultimately VMWare's Distributed Resource Scheduler addresses this as you simply run a consistent number of clients per VM that you know will never conflict scheduleing-wise, then leave it up to VMWare to move the VMs around to balance out hardware usage. I'm definitely not against using VMWare to do this, but since we're a 100% Microsoft app stack, it seems like -someone- out there would have solved this problem at the application layer instead of the hypervisor layer by checking on resource utilization at the OS, SQL, SSAS levels. I'm open to ANY discussion on this, and remember no suggestion is too crazy or radical! :-) Right now, VMWare is the only option we've found to get away from "manually" balancing our resources, so any suggestions that leave us on a pure Microsoft stack would be great. Thanks guys, Jeff

Read the article
jqGrid multi-checkbox custom edittype solution

- by gsiler

For those of you trying to understand jqGrid custom edit types ... I created a multi-checkbox form element, and thought I'd share. This was built using version 3.6.4. If anyone has a more efficient solution, please pass it on. Within the colModel, the appropriate edit fields look like this: edittype:'custom' editoptions:{ custom_element:MultiCheckElem, custom_value:MultiCheckVal, list:'Check1,Check2,Check3,Check4' } Here are the javascript functions (BTW, It also works – with some modifications – when the list of checkboxes is in a DIV block): //———————————————————— // Description: // MultiCheckElem is the "custom_element" function that builds the custom multiple check box input // element. From what I have gathered, jqGrid calls this the first time the form is launched. After // that, only the "custom_value" function is called. // // The full list of checkboxes is in the jqGrid "editoptions" section "list" tag (in the options // parameter). //———————————————————— function MultiCheckElem( value, options ) { //———- // for each checkbox in the list // build the input element // set the initial "checked" status // endfor //———- var ctl = ''; var ckboxAry = options.list.split(','); for ( var i in ckboxAry ) { var item = ckboxAry[i]; ctl += '<input type="checkbox" '; if ( value.indexOf(item + '|') != -1 ) ctl += 'checked="checked" '; ctl += 'value="' + item + '"> ' + item + '</input><br /> '; } ctl = ctl.replace( /<br /> $/, '' ); return ctl; } //———————————————————— // Description: // MultiCheckVal is the "custom_value" function for the custom multiple check box input element. It // appears that jqGrid invokes this function the first time the form is submitted and, the rest of // the time, when the form is launched (action = set) and when it is submitted (action = 'get'). //———————————————————— function MultiCheckVal(elem, action, val) { var items = ''; if (action == 'get') // the form has been submitted { //———- // for each input element // if it's checked, add it to the list of items // endfor //———- for (var i in elem) { if (elem[i].tagName == 'INPUT' && elem[i].checked ) items += elem[i].value + ','; } // items contains a comma delimited list that is returned as the result of the element items = items.replace(/,$/, ''); } else // the form is launched { //———- // for each input element // based on the input value, set the checked status // endfor //———- for (var i in elem) { if (elem[i].tagName == 'INPUT') { if (val.indexOf(elem[i].value + '|') == -1) elem[i].checked = false; else elem[i].checked = true; } } // endfor } return items; }

Read the article
Computer science undergraduate project ideas

- by Mehrdad Afshari

Hopefully, I'm going to finish my undergraduate studies next semester and I'm thinking about the topic of my final project. And yes, I've read the questions with duplicate title. I'm asking this from a bit different viewpoint, so it's not an exact dupe. I've spent at least half of my life coding stuff in different languages and frameworks so I'm not looking at this project as a way to learn much about coding and preparing for real world apps or such. I've done lots of those already. But since I have to do it to complete my degree, I felt I should spend my time doing something useful instead of throwing the whole thing out. I'm planning to make it an open source project or a hosted Web app (depending on the type) if I can make a high quality thing out of it, so I decided to ask StackOverflow what could make a useful project. Situation I've plenty of freedom about the topic. They also require 30-40 pages of text describing the project. I have the following points in mind (the more satisfied, the better): Something useful for software development Something that benefits the community Having academic value is great Shouldn't take more than a month of development (I know I'm lazy). Shouldn't be related to advanced theoretical stuff (soft computing, fuzzy logic, neural networks, ...). I've been a business-oriented software developer. It should be software oriented. While I love hacking microcontrollers and other fun embedded electronic things, I'm not really good at soldering and things like that. I'm leaning toward a Web application (think StackOverflow, PasteBin, NerdDinner, things like those). Technology It's probably going to be done in .NET (C#, F#) and Windows platform. If I really like the project (cool low level hacking), I might actually slip to C/C++. But really, C# is what I'm efficient at. Ideas Programming language, parsing and compiler related stuff: Designing a domain specific programming language and compiler Templating language compiled to C# or IL Database tools and related code generation stuff Web related technologies: ASP.NET MVC View engine doing something cool (don't know what exactly...) Specific-purpose, small, fast ASP.NET-based Web framework Applications: Visual Studio plugin to integrate with Bazaar (it's too much work, I think). ASP.NET based, jQuery-powered issue tracker (and possibly, project lifecycle management as a whole - poor man's TFS) Others: Something related to GPGPU Looking forward for great ideas! Unfortunately, I can't help on a currently existing project. I need to start my own to prevent further problems (as it's an undergrad project, nevertheless).

Read the article
Delete Nodes + attributes that match Xpath except specific attributes

- by Ryan Ternier

I'm trying to find the best (efficient) way of doing this. I have a medium sized XML document. Depending on specific settings certain portions of it need to be filtered out for security reasons. I'll be doing this in XSLT as it's configurable and no code should need changing. I've looked around, but not getting much luck on it. For example: I have the following XPath: //*[@root='2.16.840.1.113883.3.51.1.1.6.1'] Whicrooth gives me all nodes with a root attribute equal to a specific OID. In these nodes I want to have all attributes except for a few (ex. foo and bar) erased, and then having another attribute added (ex. reason) I also need to have multiple XPath expressions that can be ran to zero down on a specific node and clear it's contents out in a similar fashion, with respect to nodes with specific attributes. I'm playing around with information from: XPath expression to select all XML child nodes except a specific list? and XSLT Remove Elements and/or Attributes by Name per XSL Parameters Will update shortly when I can have access what what I"ve done so far. Example: XML Before Transformation <root> <childNode> <innerChild root="2.16.840.1.113883.3.51.1.1.6.1" a="b" b="c" type="innerChildness"/> <innerChildSibling/> </childNode> <animals> <cat> <name>bob</name> </cat> </animals> <tree/> <water root="2.16.840.1.113883.3.51.1.1.6.1" z="zed" l="ell" type="liquidLIke"/> </root> After <root> <childNode> <innerChild root="2.16.840.1.113883.3.51.1.1.6.1" flavor="MSK"/>  <innerChildSibling/> </childNode> <animals> <cat flavor="MSK" />  </animals> <tree/> <water root="2.16.840.1.113883.3.51.1.1.6.1" flavor="MSK"/>  </root>

Read the article
delta-dictionary/dictionary with revision awareness in python?

- by shabbychef

I am looking to create a dictionary with 'roll-back' capabilities in python. The dictionary would start with a revision number of 0, and the revision would be bumped up only by explicit method call. I do not need to delete keys, only add and update key,value pairs, and then roll back. I will never need to 'roll forward', that is, when rolling the dictionary back, all the newer revisions can be discarded, and I can start re-reving up again. thus I want behaviour like: >>> rr = rev_dictionary() >>> rr.rev 0 >>> rr["a"] = 17 >>> rr[('b',23)] = 'foo' >>> rr["a"] 17 >>> rr.rev 0 >>> rr.roll_rev() >>> rr.rev 1 >>> rr["a"] 17 >>> rr["a"] = 0 >>> rr["a"] 0 >>> rr[('b',23)] 'foo' >>> rr.roll_to(0) >>> rr.rev 0 >>> rr["a"] 17 >>> rr.roll_to(1) Exception ... Just to be clear, the state associated with a revision is the state of the dictionary just prior to the roll_rev() method call. thus if I can alter the value associated with a key several times 'within' a revision, and only have the last one remembered. I would like a fairly memory-efficient implementation of this: the memory usage should be proportional to the deltas. Thus simply having a list of copies of the dictionary will not scale for my problem. One should assume the keys are in the tens of thousands, and the revisions are in the hundreds of thousands. We can assume the values are immutable, but need not be numeric. For the case where the values are e.g. integers, there is a fairly straightforward implementation (have a list of dictionaries of the numerical delta from revision to revision). I am not sure how to turn this into the general form. Maybe bootstrap the integer version and add on an array of values? all help appreciated.

Read the article
Python: How to read huge text file into memory

- by asmaier

I'm using Python 2.6 on a Mac Mini with 1GB RAM. I want to read in a huge text file $ ls -l links.csv; file links.csv; tail links.csv -rw-r--r-- 1 user user 469904280 30 Nov 22:42 links.csv links.csv: ASCII text, with CRLF line terminators 4757187,59883 4757187,99822 4757187,66546 4757187,638452 4757187,4627959 4757187,312826 4757187,6143 4757187,6141 4757187,3081726 4757187,58197 So each line in the file consists of a tuple of two comma separated integer values. I want to read in the whole file and sort it according to the second column. I know, that I could do the sorting without reading the whole file into memory. But I thought for a file of 500MB I should still be able to do it in memory since I have 1GB available. However when I try to read in the file, Python seems to allocate a lot more memory than is needed by the file on disk. So even with 1GB of RAM I'm not able to read in the 500MB file into memory. My Python code for reading the file and printing some information about the memory consumption is: #!/usr/bin/python # -*- coding: utf-8 -*- import sys infile=open("links.csv", "r") edges=[] count=0 #count the total number of lines in the file for line in infile: count=count+1 total=count print "Total number of lines: ",total infile.seek(0) count=0 for line in infile: edge=tuple(map(int,line.strip().split(","))) edges.append(edge) count=count+1 # for every million lines print memory consumption if count%1000000==0: print "Position: ", edge print "Read ",float(count)/float(total)*100,"%." mem=sys.getsizeof(edges) for edge in edges: mem=mem+sys.getsizeof(edge) for node in edge: mem=mem+sys.getsizeof(node) print "Memory (Bytes): ", mem The output I got was: Total number of lines: 30609720 Position: (9745, 2994) Read 3.26693612356 %. Memory (Bytes): 64348736 Position: (38857, 103574) Read 6.53387224712 %. Memory (Bytes): 128816320 Position: (83609, 63498) Read 9.80080837067 %. Memory (Bytes): 192553000 Position: (139692, 1078610) Read 13.0677444942 %. Memory (Bytes): 257873392 Position: (205067, 153705) Read 16.3346806178 %. Memory (Bytes): 320107588 Position: (283371, 253064) Read 19.6016167413 %. Memory (Bytes): 385448716 Position: (354601, 377328) Read 22.8685528649 %. Memory (Bytes): 448629828 Position: (441109, 3024112) Read 26.1354889885 %. Memory (Bytes): 512208580 Already after reading only 25% of the 500MB file, Python consumes 500MB. So it seem that storing the content of the file as a list of tuples of ints is not very memory efficient. Is there a better way to do it, so that I can read in my 500MB file into my 1GB of memory?

Read the article
Append indexed input fields with attached datepicker widget

- by Micor

I am looking for the most efficient way to add a set of fields with datepicker widget attached on click event. So here I have a set of indexed events being generated on a page: <div id="events"> <div class="event"> <input type="text" id="event_0_startDate" name="event[0].startDate" class="date"/> <input type="hidden" id="event_0_startDate_day" name="event[0].startDate_day" /> <input type="hidden" id="event_0_startDate_month" name="event[0].startDate_month" /> <input type="hidden" id="event_0_startDate_year" name="event[0].startDate_year"/> </div> <div class="event"> <input type="text" id="event_1_startDate" name="event[1].startDate" class="date"/> <input type="hidden" id="event_1_startDate_day" name="event[1].startDate_day" /> <input type="hidden" id="event_1_startDate_month" name="event[1].startDate_month" /> <input type="hidden" id="event_1_startDate_year" name="event[1].startDate_year" /> </div> .... more event divs .... </div> <a href="#" id="add_event">Add event</a> .. with datepicker widget attached: $(".date").datepicker({ onClose: function(dateText,picker) { var prefix = $(this).attr('id'); $('#' + prefix + '_month').val( dateText.split(/\//)[0] ); $('#' + prefix + '_day').val( dateText.split(/\//)[1] ); $('#' + prefix + '_year').val( dateText.split(/\//)[2] ); } }); I need a function that will add inside div id="events" after the last div class="event" another div class="event" where N is the next index and attach datepicker widget above to the new input field with class="date" like: <div class="event"> <input type="text" id="event_N_startDate" name="event[N].startDate" class="date"/> <input type="hidden" id="event_N_startDate_day" name="event[N].startDate_day"/> <input type="hidden" id="event_N_startDate_month" name="event[N].startDate_month" /> <input type="hidden" id="event_N_startDate_year" name="event[N].startDate_year" /> </div> Thank you.

Read the article
Need help implementing simple socket server using GIOService (GLib, Glib-GIO)

- by Mark Renouf

I'm learning the basics of writing a simple, efficient socket server using GLib. I'm experimenting with GSocketService. So far I can only seem to accept connections but then they are immediately closed. From the docs I can't figure out what step I am missing. I'm hoping someone can shed some light on this for me. When running the following: # telnet localhost 4000 Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. Connection closed by foreign host. # telnet localhost 4000 Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. Connection closed by foreign host. # telnet localhost 4000 Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. Connection closed by foreign host. Output from the server: # ./server New Connection from 127.0.0.1:36962 New Connection from 127.0.0.1:36963 New Connection from 127.0.0.1:36965 Current code: /* * server.c * * Created on: Mar 10, 2010 * Author: mark */ #include <glib.h> #include <gio/gio.h> gchar *buffer; gboolean network_read(GIOChannel *source, GIOCondition cond, gpointer data) { GString *s = g_string_new(NULL); GError *error; GIOStatus ret = g_io_channel_read_line_string(source, s, NULL, &error); if (ret == G_IO_STATUS_ERROR) g_error ("Error reading: %s\n", error->message); else g_print("Got: %s\n", s->str); } gboolean new_connection(GSocketService *service, GSocketConnection *connection, GObject *source_object, gpointer user_data) { GSocketAddress *sockaddr = g_socket_connection_get_remote_address(connection, NULL); GInetAddress *addr = g_inet_socket_address_get_address(G_INET_SOCKET_ADDRESS(sockaddr)); guint16 port = g_inet_socket_address_get_port(G_INET_SOCKET_ADDRESS(sockaddr)); g_print("New Connection from %s:%d\n", g_inet_address_to_string(addr), port); GSocket *socket = g_socket_connection_get_socket(connection); gint fd = g_socket_get_fd(socket); GIOChannel *channel = g_io_channel_unix_new(fd); g_io_add_watch(channel, G_IO_IN, (GIOFunc) network_read, NULL); return TRUE; } int main(int argc, char **argv) { g_type_init(); GSocketService *service = g_socket_service_new(); GInetAddress *address = g_inet_address_new_from_string("127.0.0.1"); GSocketAddress *socket_address = g_inet_socket_address_new(address, 4000); g_socket_listener_add_address(G_SOCKET_LISTENER(service), socket_address, G_SOCKET_TYPE_STREAM, G_SOCKET_PROTOCOL_TCP, NULL, NULL, NULL); g_object_unref(socket_address); g_object_unref(address); g_socket_service_start(service); g_signal_connect(service, "incoming", G_CALLBACK(new_connection), NULL); GMainLoop *loop = g_main_loop_new(NULL, FALSE); g_main_loop_run(loop); }

Read the article
How to crop or get a smaller size UIImage in iPhone without memory leaks?

- by rkbang

Hello all, I am using a navigation controller in which I push a tableview Controller as follows: TableView *Controller = [[TableView alloc] initWithStyle:UITableViewStylePlain]; [self.navigationController pushViewController:Controller animated:NO]; [Controller release]; In this table view I am using following two methods to display images: - (UIImage*) getSmallImage:(UIImage*) img { CGSize size = img.size; CGFloat ratio = 0; if (size.width < size.height) { ratio = 36 / size.width; } else { ratio = 36 / size.height; } CGRect rect = CGRectMake(0.0, 0.0, ratio * size.width, ratio * size.height); UIGraphicsBeginImageContext(rect.size); [img drawInRect:rect]; return UIGraphicsGetImageFromCurrentImageContext(); UIGraphicsEndImageContext(); } - (UIImage*)imageByCropping:(UIImage *)imageToCrop toRect:(CGRect)rect { //create a context to do our clipping in UIGraphicsBeginImageContext(rect.size); CGContextRef currentContext = UIGraphicsGetCurrentContext(); //create a rect with the size we want to crop the image to //the X and Y here are zero so we start at the beginning of our //newly created context CGFloat X = (imageToCrop.size.width - rect.size.width)/2; CGFloat Y = (imageToCrop.size.height - rect.size.height)/2; CGRect clippedRect = CGRectMake(X, Y, rect.size.width, rect.size.height); //CGContextClipToRect( currentContext, clippedRect); //create a rect equivalent to the full size of the image //offset the rect by the X and Y we want to start the crop //from in order to cut off anything before them CGRect drawRect = CGRectMake(0, 0, imageToCrop.size.width, imageToCrop.size.height); CGContextTranslateCTM(currentContext, 0.0, drawRect.size.height); CGContextScaleCTM(currentContext, 1.0, -1.0); //draw the image to our clipped context using our offset rect //CGContextDrawImage(currentContext, drawRect, imageToCrop.CGImage); CGImageRef tmp = CGImageCreateWithImageInRect(imageToCrop.CGImage, clippedRect); //pull the image from our cropped context UIImage *cropped = [UIImage imageWithCGImage:tmp];//UIGraphicsGetImageFromCurrentImageContext(); CGImageRelease(tmp); //pop the context to get back to the default UIGraphicsEndImageContext(); //Note: this is autoreleased*/ return cropped; } But when I pop the Controller, the memory being used is not released. Is there any leaks in the above code used to create and crop images. Also are there any efficient method to deal with images in iPhone. I am having a lot of images and facing major challeges in resolving the memory issues. tnx in advance.

Read the article
Fulltext search for django : Mysql not so bad ? (vs sphinx, xapian)

- by Eric

I am studying fulltext search engines for django. It must be simple to install, fast indexing, fast index update, not blocking while indexing, fast search. After reading many web pages, I put in short list : Mysql MYISAM fulltext, djapian/python-xapian, and django-sphinx I did not choose lucene because it seems complex, nor haystack as it has less features than djapian/django-sphinx (like fields weighting). Then I made some benchmarks, to do so, I collected many free books on the net to generate a database table with 1 485 000 records (id,title,body), each record is about 600 bytes long. From the database, I also generated a list of 100 000 existing words and shuffled them to create a search list. For the tests, I made 2 runs on my laptop (4Go RAM, Dual core 2.0Ghz): the first one, just after a server reboot to clear all caches, the second is done juste after in order to test how good are cached results. Here are the "home made" benchmark results : 1485000 records with Title (150 bytes) and body (450 bytes) Mysql 5.0.75/Ubuntu 9.04 Fulltext : ========================================================================== Full indexing : 7m14.146s 1 thread, 1000 searchs with single word randomly taken from database : First run : 0:01:11.553524 next run : 0:00:00.168508 Mysql 5.5.4 m3/Ubuntu 9.04 Fulltext : ========================================================================== Full indexing : 6m08.154s 1 thread, 1000 searchs with single word randomly taken from database : First run : 0:01:11.553524 next run : 0:00:00.168508 1 thread, 100000 searchs with single word randomly taken from database : First run : 9m09s next run : 5m38s 1 thread, 10000 random strings (random strings should not be found in database) : just after the 100000 search test : 0:00:15.007353 1 thread, boolean search : 1000 x (+word1 +word2) First run : 0:00:21.205404 next run : 0:00:00.145098 Djapian Fulltext : ========================================================================== Full indexing : 84m7.601s 1 thread, 1000 searchs with single word randomly taken from database with prefetch : First run : 0:02:28.085680 next run : 0:00:14.300236 python-xapian Fulltext : ========================================================================== 1 thread, 1000 searchs with single word randomly taken from database : First run : 0:01:26.402084 next run : 0:00:00.695092 django-sphinx Fulltext : ========================================================================== Full indexing : 1m25.957s 1 thread, 1000 searchs with single word randomly taken from database : First run : 0:01:30.073001 next run : 0:00:05.203294 1 thread, 100000 searchs with single word randomly taken from database : First run : 12m48s next run : 9m45s 1 thread, 10000 random strings (random strings should not be found in database) : just after the 100000 search test : 0:00:23.535319 1 thread, boolean search : 1000 x (word1 word2) First run : 0:00:20.856486 next run : 0:00:03.005416 As you can see, Mysql is not so bad at all for fulltext search. In addition, its query cache is very efficient. Mysql seems to me a good choice as there is nothing to install (I need just to write a small script to synchronize an Innodb production table to a MyISAM search table) and as I do not really need advanced search feature like stemming etc... Here is the question : What do you think about Mysql fulltext search engine vs sphinx and xapian ?

Read the article
Overriding LINQ extension methods

- by Ruben Vermeersch

Is there a way to override extension methods (provide a better implementation), without explicitly having to cast to them? I'm implementing a data type that is able to handle certain operations more efficiently than the default extension methods, but I'd like to keep the generality of IEnumerable. That way any IEnumerable can be passed, but when my class is passed in, it should be more efficient. As a toy example, consider the following: // Compile: dmcs -out:test.exe test.cs using System; namespace Test { public interface IBoat { void Float (); } public class NiceBoat : IBoat { public void Float () { Console.WriteLine ("NiceBoat floating!"); } } public class NicerBoat : IBoat { public void Float () { Console.WriteLine ("NicerBoat floating!"); } public void BlowHorn () { Console.WriteLine ("NicerBoat: TOOOOOT!"); } } public static class BoatExtensions { public static void BlowHorn (this IBoat boat) { Console.WriteLine ("Patched on horn for {0}: TWEET", boat.GetType().Name); } } public class TestApp { static void Main (string [] args) { IBoat niceboat = new NiceBoat (); IBoat nicerboat = new NicerBoat (); Console.WriteLine ("## Both should float:"); niceboat.Float (); nicerboat.Float (); // Output: // NiceBoat floating! // NicerBoat floating! Console.WriteLine (); Console.WriteLine ("## One has an awesome horn:"); niceboat.BlowHorn (); nicerboat.BlowHorn (); // Output: // Patched on horn for NiceBoat: TWEET // Patched on horn for NicerBoat: TWEET Console.WriteLine (); Console.WriteLine ("## That didn't work, but it does when we cast:"); (niceboat as NiceBoat).BlowHorn (); (nicerboat as NicerBoat).BlowHorn (); // Output: // Patched on horn for NiceBoat: TWEET // NicerBoat: TOOOOOT! Console.WriteLine (); Console.WriteLine ("## Problem is: I don't always know the type of the objects."); Console.WriteLine ("## How can I make it use the class objects when the are"); Console.WriteLine ("## implemented and extension methods when they are not,"); Console.WriteLine ("## without having to explicitely cast?"); } } } Is there a way to get the behavior from the second case, without explict casting? Can this problem be avoided?

Read the article
NSXMLParser Memory Allocation Efficiency for the iPhone

- by Staros

Hello, I've recently been playing with code for an iPhone app to parse XML. Sticking to Cocoa, I decided to go with the NSXMLParser class. The app will be responsible for parsing 10,000+ "computers", all which contain 6 other strings of information. For my test, I've verified that the XML is around 900k-1MB in size. My data model is to keep each computer in an NSDictionary hashed by a unique identifier. Each computer is also represented by a NSDictionary with the information. So at the end of the day, I end up with a NSDictionary containing 10k other NSDictionaries. The problem I'm running into isn't about leaking memory or efficient data structure storage. When my parser is done, the total amount of allocated objects only does go up by about 1MB. The problem is that while the NSXMLParser is running, my object allocation is jumping up as much as 13MB. I could understand 2 (one for the object I'm creating and one for the raw NSData) plus a little room to work, but 13 seems a bit high. I can't imaging that NSXMLParser is that inefficient. Thoughts? Code... The code to start parsing... NSXMLParser *parser = [[NSXMLParser alloc] initWithData: data]; [parser setDelegate:dictParser]; [parser parse]; output = [[dictParser returnDictionary] retain]; [parser release]; [dictParser release]; And the parser's delegate code... -(void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict { if(mutableString) { [mutableString release]; mutableString = nil; } mutableString = [[NSMutableString alloc] init]; } -(void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string { if(self.mutableString) { [self.mutableString appendString:string]; } } -(void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName { if([elementName isEqualToString:@"size"]){ //The initial key, tells me how many computers returnDictionary = [[NSMutableDictionary alloc] initWithCapacity:[mutableString intValue]]; } if([elementName isEqualToString:hashBy]){ //The unique identifier if(mutableDictionary){ [mutableDictionary release]; mutableDictionary = nil; } mutableDictionary = [[NSMutableDictionary alloc] initWithCapacity:6]; [returnDictionary setObject:[NSDictionary dictionaryWithDictionary:mutableDictionary] forKey:[NSMutableString stringWithString:mutableString]]; } if([fields containsObject:elementName]){ //Any of the elements from a single computer that I am looking for [mutableDictionary setObject:mutableString forKey:elementName]; } } Everything initialized and released correctly. Again, I'm not getting errors or leaking. Just inefficient. Thanks for any thoughts!

Read the article
Best way to handle multiple tables to replace one big table in Rails? (e.g. 'Books1', 'Books2', etc.

- by mikep

Hello, I've decided to use multiple tables for an entity (e.g. Books1, Books2, Books3, etc.), instead of just one main table which could end up having a lot of rows (e.g. just Books). I'm doing this to try and to avoid a potential future performance drop that could come with having too many rows in one table. With that, I'm looking for a good way to handle this in Rails, mainly by trying to avoid loading a bunch of unused associations. (I know that I could use a partition for this, but, for now, I've decided to go the 'multiple tables' route.) Each user has their books placed into a specific table. The actual book table is chosen when the user is created, and all of their books go into the same table. I'm going to split the adds across the tables. The goal is to try and keep each table pretty much even -- but that's a different issue. One thing I don't particularly want to have is a bunch of unused associations in the User class. Right now, it looks like I'd have to do the following: class User < ActiveRecord::Base has_many :books1, :books2, :books3, :books4, :books5 end class Books1 < ActiveRecord::Base belongs_to :user end class Books2 < ActiveRecord::Base belongs_to :user end class Books3 < ActiveRecord::Base belongs_to :user end I'm assuming that the main performance hit would come in terms of memory and possibly some method call overhead for each User object, since it has to load all of those associations, which in turn creates all of those nice, dynamic model accessor methods like User.find_by_. But for each specific user, only one of the book tables would be usable/applicable, since all of a user's books are stored in the same table. So, only one of the associations would be in use at any time and any other has_many :bookX association that was loaded would be a waste. For example, with a user.id of 2, I'd only need books3.find_by_author('Author'), but the way I'm thinking of setting this up, I'd still have access to Books1..n. I don't really know Ruby/Rails does internally with all of those has_many associations though, so maybe it's not so bad. But right now I'm thinking that it's really wasteful, and that there may just be a better, more efficient way of doing this. So, a few questions: 1) Is there's some sort of special Ruby/Rails methodology that could be applied to this 'multiple tables to represent one entity' scheme? Are there any 'best practices' for this? 2) Is it really bad to have so many unused has_many associations for each object? Is there a better way to do this? 3) Does anyone have any advice on how to abstract the fact that there's multiple book tables behind a single books model/class? For example, so I can call books.find_by_author('Author') instead of books3.find_by_author('Author'). Thank you!

Read the article
How to optimize Core Data query for full text search

- by dk

Can I optimize a Core Data query when searching for matching words in a text? (This question also pertains to the wisdom of custom SQL versus Core Data on an iPhone.) I'm working on a new (iPhone) app that is a handheld reference tool for a scientific database. The main interface is a standard searchable table view and I want as-you-type response as the user types new words. Words matches must be prefixes of words in the text. The text is composed of 100,000s of words. In my prototype I coded SQL directly. I created a separate "words" table containing every word in the text fields of the main entity. I indexed words and performed searches along the lines of SELECT id, * FROM textTable JOIN (SELECT DISTINCT textTableId FROM words WHERE word BETWEEN 'foo' AND 'fooz' ) ON id=textTableId LIMIT 50 This runs very fast. Using an IN would probably work just as well, i.e. SELECT * FROM textTable WHERE id IN (SELECT textTableId FROM words WHERE word BETWEEN 'foo' AND 'fooz' ) LIMIT 50 The LIMIT is crucial and allows me to display results quickly. I notify the user that there are too many to display if the limit is reached. This is kludgy. I've spent the last several days pondering the advantages of moving to Core Data, but I worry about the lack of control in the schema, indexing, and querying for an important query. Theoretically an NSPredicate of textField MATCHES '.*\bfoo.*' would just work, but I'm sure it will be slow. This sort of text search seems so common that I wonder what is the usual attack? Would you create a words entity as I did above and use a predicate of "word BEGINSWITH 'foo'"? Will that work as fast as my prototype? Will Core Data automatically create the right indexes? I can't find any explicit means of advising the persistent store about indexes. I see some nice advantages of Core Data in my iPhone app. The faulting and other memory considerations allow for efficient database retrievals for tableview queries without setting arbitrary limits. The object graph management allows me to easily traverse entities without writing lots of SQL. Migration features will be nice in the future. On the other hand, in a limited resource environment (iPhone) I worry that an automatically generated database will be bloated with metadata, unnecessary inverse relationships, inefficient attribute datatypes, etc. Should I dive in or proceed with caution?

Read the article
"Wrapping" a BindingList<T> propertry with a List<T> property for serialization.

- by Eric

I'm writing an app that allows users search and browse catalogs of widgets. My WidgetCatalog class is serialized and deserialized to and from XML files using DataContractSerializer. My app is working now but I think I can make the code a lot more efficient if I started taking advantage of data binding rather then doing everything manually. Here's a stripped down version of my current WidgetCatalog class. [DataContract(Name = "WidgetCatalog")] class WidgetCatalog { [DataContract(Name = "Name")] public string Name { get; set; } [DataContract(Name = "Widgets")] public List<Widget> Widgets { get; set; } } I had to write a lot of extra code to keep my UI in sync when widgets are added or removed from a catalog, or their internal properties change. I'm pretty inexperienced with data-binding, but I think I want a BindingList<Widget> rather than a plain old List<Widget>. Is this right? In the past when I had similar needs I found that BindingList<T> does not serialize very well. Or at least the Event Handers between the items and the list are not serialized. I was using XmlSerializer though, and DataContractSerializer may work better. So I'm thinking of doing something like the code below. [DataContract(Name = "WidgetCatalog")] class WidgetCatalog { [DataMember(Name = "Name")] public string Name { get; set; } [DataMember(Name = "Widgets")] private List<Widget> WidgetSerializationList { get { return this._widgetBindingList.ToList<Widget>(); } set { this._widgetBindingList = new BindingList<Widget>(value); } } //these do not get serialized private BindingList<Widget> _widgetBindingList; public BindingList<Widget> WidgetBindingList { get { return this._widgetBindingList; } } public WidgetCatalog() { this.WidgetSerializationList = new List<Widget>(); } } So I'm serializing a private List<Widget> property, but the GET and SET accessors of the property are reading from, and writing to theBindingList<Widget> property. Will this even work? It seems like there should be a better way to do this.

Read the article
Fixed point math in c#?

- by x4000

Hi there, I was wondering if anyone here knows of any good resources for fixed point math in c#? I've seen things like this (http://2ddev.72dpiarmy.com/viewtopic.php?id=156) and this (http://stackoverflow.com/questions/79677/whats-the-best-way-to-do-fixed-point-math), and a number of discussions about whether decimal is really fixed point or actually floating point (update: responders have confirmed that it's definitely floating point), but I haven't seen a solid C# library for things like calculating cosine and sine. My needs are simple -- I need the basic operators, plus cosine, sine, arctan2, PI... I think that's about it. Maybe sqrt. I'm programming a 2D RTS game, which I have largely working, but the unit movement when using floating-point math (doubles) has very small inaccuracies over time (10-30 minutes) across multiple machines, leading to desyncs. This is presently only between a 32 bit OS and a 64 bit OS, all the 32 bit machines seem to stay in sync without issue, which is what makes me think this is a floating point issue. I was aware from this as a possible issue from the outset, and so have limited my use of non-integer position math as much as possible, but for smooth diagonal movement at varying speeds I'm calculating the angle between points in radians, then getting the x and y components of movement with sin and cos. That's the main issue. I'm also doing some calculations for line segment intersections, line-circle intersections, circle-rect intersections, etc, that also probably need to move from floating-point to fixed-point to avoid cross-machine issues. If there's something open source in Java or VB or another comparable language, I could probably convert the code for my uses. The main priority for me is accuracy, although I'd like as little speed loss over present performance as possible. This whole fixed point math thing is very new to me, and I'm surprised by how little practical information on it there is on google -- most stuff seems to be either theory or dense C++ header files. Anything you could do to point me in the right direction is much appreciated; if I can get this working, I plan to open-source the math functions I put together so that there will be a resource for other C# programmers out there. UPDATE: I could definitely make a cosine/sine lookup table work for my purposes, but I don't think that would work for arctan2, since I'd need to generate a table with about 64,000x64,000 entries (yikes). If you know any programmatic explanations of efficient ways to calculate things like arctan2, that would be awesome. My math background is all right, but the advanced formulas and traditional math notation are very difficult for me to translate into code.

Read the article
Data access pattern

- by andlju

I need some advice on what kind of pattern(s) I should use for pushing/pulling data into my application. I'm writing a rule-engine that needs to hold quite a large amount of data in-memory in order to be efficient enough. I have some rather conflicting requirements; It is not acceptable for the engine to always have to wait for a full pre-load of all data before it is functional. Only fetching and caching data on-demand will lead to the engine taking too long before it is running quickly enough. An external event can trigger the need for specific parts of the data to be reloaded. Basically, I think I need a combination of pushing and pulling data into the application. A simplified version of my current "pattern" looks like this (in psuedo-C# written in notepad): // This interface is implemented by all classes that needs the data interface IDataSubscriber { void RegisterData(Entity data); } // This interface is implemented by the data access class interface IDataProvider { void EnsureLoaded(Key dataKey); void RegisterSubscriber(IDataSubscriber subscriber); } class MyClassThatNeedsData : IDataSubscriber { IDataProvider _provider; MyClassThatNeedsData(IDataProvider provider) { _provider = provider; _provider.RegisterSubscriber(this); } public void RegisterData(Entity data) { // Save data for later StoreDataInCache(data); } void UseData(Key key) { // Make sure that the data has been stored in cache _provider.EnsureLoaded(key); Entity data = GetDataFromCache(key); } } class MyDataProvider : IDataProvider { List<IDataSubscriber> _subscribers; // Make sure that the data for key has been loaded to all subscribers public void EnsureLoaded(Key key) { if (HasKeyBeenMarkedAsLoaded(key)) return; PublishDataToSubscribers(key); MarkKeyAsLoaded(key); } // Force all subscribers to get a new version of the data for key public void ForceReload(Key key) { PublishDataToSubscribers(key); MarkKeyAsLoaded(key); } void PublishDataToSubscribers(Key key) { Entity data = FetchDataFromStore(key); foreach(var subscriber in _subscribers) { subscriber.RegisterData(data); } } } // This class will be spun off on startup and should make sure that all data is // preloaded as quickly as possible class MyPreloadingThread { IDataProvider _provider; MyPreloadingThread(IDataProvider provider) { _provider = provider; } void RunInBackground() { IEnumerable<Key> allKeys = GetAllKeys(); foreach(var key in allKeys) { _provider.EnsureLoaded(key); } } } I have a feeling though that this is not necessarily the best way of doing this.. Just the fact that explaining it seems to take two pages feels like an indication.. Any ideas? Any patterns out there I should have a look at?

Read the article
Would Like Multiple Checkboxes to Update World PNG Image Using Mogrify to FloodFill Countries With C

- by socrtwo

Hi. I'm seeing in another forum if the best way to do this is with Javascript or Ajax but I'm wondering if there is an even easier simpler way. I'm trying to create a web service where users can check which countries they have visited from a list of 175 or so and a World map image would then instantly update with a filled color. There are other similar services, but I'm envisioning mine to be both updating from checks in checkboxes and by clicking on the target country in the displayed image say with an imagemap. Additionally other solutions display all the visited countries in the same color. I would like different colors for different countries or at least for those countries that touch. Eventually I would like to include a feature that enables the choice of which colors to assign countries. I found a Sourceforge project called pwmfccd. It's simply an open source image of the world and the coordinates on the PNG image for all the countries. You can use mogrify from ImageMagick and floodfill to fill the countries with color. I have done this successfully, locally with batch files. My ISP has told me where mogrify is located, basically "/usr/bin/mogrify". I now have a horrendously complicated cgi script which if it worked is set to redraw the world map image with each checkbox. It's here. It also redraws the whole web page with each check. The web page starts here. Of course this is not at all efficient, and I think probably the real way to go is Ajax or Javascript, so that maybe just the image gets changed and redrawn, not the whole web page. Sorry I don't even know the difference between Javascript and Ajax and their relative merits at this point. I suppose you could make just one part of the image update with each check or click on the image instead of even just the image redrawing, but I have never even heard of a hint at being able to do that for irregularly shaped image elements like countries. So I guess an Image map and sister checkbox entries tied to mogrify events redrawing the user's personal copy of the image with an image refresh would be the only way to go. So how do you do this with something other than Javascript or Ajax or is that definitely the way to go and if so, how would you do it? Or can you after all cut up a web based image into irregular puzzle shaped piece which you can redraw individually at will. Thanks in advance for reading and considering answering this post.

Read the article
CSS selectors : should I make my CSS easier to read or optimise the speed

- by Laurent Bourgault-Roy

As I was working on a small website, I decided to use the PageSpeed extension to check if their was some improvement I could do to make the site load faster. However I was quite surprise when it told me that my use of CSS selector was "inefficient". I was always told that you should keep the usage of the class attribute in the HTML to a minimum, but if I understand correctly what PageSpeed tell me, it's much more efficient for the browser to match directly against a class name. It make sense to me, but it also mean that I need to put more CSS classes in my HTML. It make my .css file harder to read. I usually tend to mark my CSS like this : #mainContent p.productDescription em.priceTag { ... } Which make it easy to read : I know this will affect the main content and that it affect something in a paragraph tag (so I wont start to put all sort of layout code in it) that describe a product and its something that need emphasis. However it seem I should rewrite it as .priceTag { ... } Which remove all context information about the style. And if I want to use differently formatted price tag (for example, one in a list on the sidebar and one in a paragraph), I need to use something like that .paragraphPriceTag { ... } .listPriceTag { ... } Which really annoy me since I seem to duplicate the semantic of the HTML in my classes. And that mean I can't put common style in an unqualified .priceTag { ... } and thus I need to replicate the style in both CSS rule, making it harder to make change. (Altough for that I could use multiple class selector, but IE6 dont support them) I believe making code harder to read for the sake of speed has never been really considered a very good practice . Except where it is critical, of course. This is why people use PHP/Ruby/C# etc. instead of C/assembly to code their site. It's easier to write and debug. So I was wondering if I should stick with few CSS classes and complex selector or if I should go the optimisation route and remove my fancy CSS selectors for the sake of speed? Does PageSpeed make over the top recommandation? On most modern computer, will it even make a difference?

Read the article
How can I get image data from QTKit without color or gamma correction in Snow Leopard?

- by Nick Haddad

Since Snow Leopard, QTKit is now returning color corrected image data from functions like QTMovies frameImageAtTime:withAttributes:error:. Given an uncompressed AVI file, the same image data is displayed with larger pixel values in Snow Leopard vs. Leopard. Currently I'm using frameImageAtTime to get an NSImage, then ask for the tiffRepresentation of that image. After doing this, pixel values are slightly higher in Snow Leopard. For example, a file with the following pixel value in Leopard: [0 180 0] Now has a pixel value like: [0 192 0] Is there any way to ask a QTMovie for video frames that are not color corrected? Should I be asking for a CGImageRef, CIImage, or CVPixelBufferRef instead? Is there a way to disable color correction altogether prior to reading in the video files? I've attempted to work around this issue by drawing into a NSBitmapImageRep with the NSCalibratedColroSpace, but that only gets my part of the way there: // Create a movie NSDictionary *dict = [NSDictionary dictionaryWithObjectsAndKeys : nsFileName, QTMovieFileNameAttribute, [NSNumber numberWithBool:NO], QTMovieOpenAsyncOKAttribute, [NSNumber numberWithBool:NO], QTMovieLoopsAttribute, [NSNumber numberWithBool:NO], QTMovieLoopsBackAndForthAttribute, (id)nil]; _theMovie = [[QTMovie alloc] initWithAttributes:dict error:&error]; // .... NSMutableDictionary *imageAttributes = [NSMutableDictionary dictionary]; [imageAttributes setObject:QTMovieFrameImageTypeNSImage forKey:QTMovieFrameImageType]; [imageAttributes setObject:[NSArray arrayWithObject:@"NSBitmapImageRep"] forKey: QTMovieFrameImageRepresentationsType]; [imageAttributes setObject:[NSNumber numberWithBool:YES] forKey:QTMovieFrameImageHighQuality]; NSError* err = nil; NSImage* image = (NSImage*)[_theMovie frameImageAtTime:frameTime withAttributes:imageAttributes error:&err]; // copy NSImage into an NSBitmapImageRep (Objective-C) NSBitmapImageRep* bitmap = [[image representations] objectAtIndex:0]; // Draw into a colorspace we know about NSBitmapImageRep *bitmapWhoseFormatIKnow = [[NSBitmapImageRep alloc] initWithBitmapDataPlanes:NULL pixelsWide:getWidth() pixelsHigh:getHeight() bitsPerSample:8 samplesPerPixel:4 hasAlpha:YES isPlanar:NO colorSpaceName:NSCalibratedRGBColorSpace bitmapFormat:0 bytesPerRow:(getWidth() * 4) bitsPerPixel:32]; [NSGraphicsContext saveGraphicsState]; [NSGraphicsContext setCurrentContext:[NSGraphicsContext graphicsContextWithBitmapImageRep:bitmapWhoseFormatIKnow]]; [bitmap draw]; [NSGraphicsContext restoreGraphicsState]; This does convert back to a 'Non color corrected' colorspace, but the color values NOT are exactly the same as what is stored in the Uncompressed AVI files we are testing with. Also this is much less efficient because it is converting from RGB - "Device RGB" - RGB. Also, I am working in a 64-bit application, so dropping down to the Quicktime-C API is not an option. Thanks for your help.

Read the article
Using Hibernate to do a query involving two tables

- by Nathan Spears

I'm inexperienced with sql in general, so using Hibernate is like looking for an answer before I know exactly what the question is. Please feel free to correct any misunderstandings I have. I am on a project where I have to use Hibernate. Most of what I am doing is pretty basic and I could copy and modify. Now I would like to do something different and I'm not sure how configuration and syntax need to come together. Let's say I have two tables. Table A has two (relevant) columns, user GUID and manager GUID. Obviously managers can have more than one user under them, so queries on manager can return more than one row. Additionally, a manager can be managing the same user on multiple projects, so the same user can be returned multiple times for the same manager query. Table B has two columns, user GUID and user full name. One-to-one mapping there. I want to do a query on manager GUID from Table A, group them by unique User GUID (so the same User isn't in the results twice), then return those users' full names from Table B. I could do this in sql without too much trouble but I want to use Hibernate so I don't have to parse the sql results by hand. That's one of the points of using Hibernate, isn't it? Right now I have Hibernate mappings that map each column in Table A to a field (well the get/set methods I guess) in a DAO object that I wrote just to hold that Table's data. I could also use the Hibernate DAOs I have to access each table separately and do each of the things I mentioned above in separate steps, but that would be less efficient (I assume) that doing one query. I wrote a Service object to hold the data that gets returned from the query (my example is simplified - I'm going to keep some other data from Table A and get multiple columns from Table B) but I'm at a loss for how to write a DAO that can do the join, or use the DAOs I have to do the join. FYI, here is a sample of my hibernate config file (simplified to match my example): <hibernate-mapping package="com.my.dao"> <class name="TableA" table="table_a"> <id name="pkIndex" column="pk_index" /> <property name="userGuid" column="user_guid" /> <property name="managerGuid" column="manager_guid" /> </class> </hibernate-mapping> So then I have a DAOImplementation class that does queries and returns lists like public List<TableA> findByHQL(String hql, Map<String, String> params) etc. I'm not sure how "best practice" that is either.

Read the article
Comparing two large sets of attributes

- by andyashton

Suppose you have a Django view that has two functions: The first function renders some XML using a XSLT stylesheet and produces a div with 1000 subelements like this: <div id="myText"> <p id="p1"><a class="note-p1" href="#" style="display:none" target="bot">?</a></strong>Lorem ipsum</p> <p id="p2"><a class="note-p2" href="#" style="display:none" target="bot">?</a></strong>Foo bar</p> <p id="p3"><a class="note-p3" href="#" style="display:none" target="bot">?</a></strong>Chocolate peanut butter</p> (etc for 1000 lines) <p id="p1000"><a class="note-p1000" href="#" style="display:none" target="bot">?</a></strong>Go Yankees!</p> </div> The second function renders another XML document using another stylesheet to produce a div like this: <div id="myNotes"> <p id="n1"><cite class="note-p1"><sup>1</sup><span>Trololo</span></cite></p> <p id="n2"><cite class="note-p1"><sup>2</sup><span>Trololo</span></cite></p> <p id="n3"><cite class="note-p2"><sup>3</sup><span>lololo</span></cite></p> (etc for n lines) <p id="n"><cite class="note-p885"><sup>n</sup><span>lololo</span></cite></p> </div> I need to see which elements in #myText have classes that match elements in #myNotes, and display them. I can do this using the following jQuery: $('#myText').find('a').each(function() { var $anchor = $(this); $('#myNotes').find('cite').each(function() { if($(this).attr('class') == $anchor.attr('class')) { $anchor.show(); }); }); However this is incredibly slow and inefficient for a large number of comparisons. What is the fastest/most efficient way to do this - is there a jQuery/js method that is reasonable for a large number of items? Or do I need to reengineer the Django code to do the work before passing it to the template?

Read the article
Write binary stream to browser using PHP

- by Dave Jarvis

Background Trying to stream a PDF report written using iReport through PHP to the browser. The general problem is: how do you write binary data to the browser using PHP? Working Code The following code does the job, but (for many reasons) it is not as efficient as it should be (the code writes a file then sends the file contents the browser). // Load the MySQL database driver. // java( 'java.lang.Class' )->forName( 'com.mysql.jdbc.Driver' ); // Attempt a database connection. // $conn = java( 'java.sql.DriverManager' )->getConnection( "jdbc:mysql://localhost:3306/climate?user=$user&password=$password" ); // Extract parameters. // $params = new java('java.util.HashMap'); $params->put('DistrictCode', '101'); $params->put('StationCode', '0066'); $params->put('CategoryCode', '010'); // Use the fill manager to produce the report. // $fm = java('net.sf.jasperreports.engine.JasperFillManager'); $pm = $fm->fillReport($report, $params, $conn); header('Cache-Control: no-cache private'); header('Content-Description: File Transfer'); header('Content-Disposition: attachment, filename=climate-report.pdf'); header('Content-Type: application/pdf'); header('Content-Transfer-Encoding: binary'); header('Content-Length: ' . strlen( $result ) ); $path = realpath( "." ) . "/output.pdf"; $em = java('net.sf.jasperreports.engine.JasperExportManager'); $result = $em->exportReportToPdfFile($pm,$path); readfile( $path ); $conn->close(); Non-working Code To remove the slight redundancy (i.e., write directly to the browser), the following code looks like it should work, but it does not: $em = java('net.sf.jasperreports.engine.JasperExportManager'); $result = $em->exportReportToPdf($pm); header('Content-Length: ' . strlen( $result ) ); echo $result; Content is sent to the browser, but the file is corrupt (it begins with the PDF header) and cannot be read by any PDF reader. Question How can I take out the middle step of writing to the file and write directly to the browser so that the PDF is not corrupted? Thank you!

Read the article
Making a Grid in an NSView

- by Hooligancat

I currently have an NSView that draws a grid pattern (essentially a guide of horizontal and vertical lines) with the idea being that a user can change the spacing of the grid and the color of the grid. The purpose of the grid is to act as a guideline for the user when lining up objects. Everything works just fine with one exception. When I resize the NSWindow by dragging the resize handle, if my grid spacing is particularly small (say 10 pixels). the drag resize becomes lethargic in nature. My drawRect code for the grid is as follows: -(void)drawRect:(NSRect)dirtyRect { NSRect thisViewSize = [self bounds]; // Set the line color [[NSColor colorWithDeviceRed:0 green:(255/255.0) blue:(255/255.0) alpha:1] set]; // Draw the vertical lines first NSBezierPath * verticalLinePath = [NSBezierPath bezierPath]; int gridWidth = thisViewSize.size.width; int gridHeight = thisViewSize.size.height; int i; while (i < gridWidth) { i = i + [self currentSpacing]; NSPoint startPoint = {i,0}; NSPoint endPoint = {i, gridHeight}; [verticalLinePath setLineWidth:1]; [verticalLinePath moveToPoint:startPoint]; [verticalLinePath lineToPoint:endPoint]; [verticalLinePath stroke]; } // Draw the horizontal lines NSBezierPath * horizontalLinePath = [NSBezierPath bezierPath]; i = 0; while (i < gridHeight) { i = i + [self currentSpacing]; NSPoint startPoint = {0,i}; NSPoint endPoint = {gridWidth, i}; [horizontalLinePath setLineWidth:1]; [horizontalLinePath moveToPoint:startPoint]; [horizontalLinePath lineToPoint:endPoint]; [horizontalLinePath stroke]; } } I suspect this is entirely to do with the way that I am drawing the grid and am open to suggestions on how I might better go about it. I can see where the inefficiency is coming in, drag-resizing the NSWindow is constantly calling the drawRect in this view as it resizes, and the closer the grid, the more calculations per pixel drag of the parent window. I was thinking of hiding the view on the resize of the window, but it doesn't feel as dynamic. I want the user experience to be very smooth without any perceived delay or flickering. Does anyone have any ideas on a better or more efficient method to drawing the grid? All help, as always, very much appreciated.

Read the article

Search Results

Search found 3758 results on 151 pages for 'efficient'.

Page 139/151 | < Previous Page | 135 136 137 138 139 140 141 142 143 144 145 146 | Next Page >

- by maka

- by Jeff

- by gsiler

- by Mehrdad Afshari

- by Ryan Ternier

- by shabbychef

- by asmaier

- by Micor

- by Mark Renouf

- by rkbang

- by Eric

- by Ruben Vermeersch

- by Staros

- by mikep

- by dk

- by Eric

- by x4000

- by andlju

- by socrtwo

- by Laurent Bourgault-Roy

- by Nick Haddad

- by Nathan Spears

- by andyashton

- by Dave Jarvis

- by Hooligancat

< Previous Page | 135 136 137 138 139 140 141 142 143 144 145 146 | Next Page >