Search Results

Search found 13300 results on 532 pages for 'exalytics performance tuning'.


  • How to handle very frequent updates to a Lucene index

    - by fsm
    I am trying to prototype an indexing/search application that uses very volatile data sources (forums, social networks, etc.). Here are my performance requirements:
    1. Very fast turnaround: any new data (such as a new message on a forum) should be available in the search results very soon, within a minute.
    2. Old documents must be discarded on a fairly regular basis so that the search results are not dated.
    3. The search application must stay responsive, with latency on the order of 100 milliseconds, and should support at least 10 queries per second.
    All of my current requirements could be met without Lucene (and that would satisfy 1, 2 and 3), but I anticipate other requirements in the future (such as search relevance) that Lucene makes easier to implement. However, since Lucene is designed for use cases far more complex than the one I'm currently working on, I'm having a hard time satisfying my performance requirements. My questions:
    a. I have read that the optimize() method in the IndexWriter class is expensive and should not be used by applications that do frequent updates; what are the alternatives?
    b. To do incremental updates, I need to keep committing new data and keep refreshing the index reader so the new data is visible. This hurts requirements 1 and 3 above. Should I try duplicate indices? What are the common approaches to solving this problem?
    c. Lucene provides a delete method that removes all documents matching a query. In my case I need to delete all documents older than a certain age; one option is to add a date field to every document and use it to delete documents later. Could I instead run range queries on document ids (I can create my own id field, since the one Lucene assigns keeps changing) to delete documents, and would that be any faster than comparing dates represented as strings?
    I know these are very open questions, so I am not looking for a detailed answer; I will treat all of your answers as suggestions and use them to inform my design. Thanks! Please let me know if you need any other information.
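
    For questions (a) and (b), the pattern usually suggested is Lucene's near-real-time search: keep a single IndexWriter open, avoid optimize() entirely, and refresh a lightweight reader against the writer instead of committing for every change. The sketch below is only a minimal illustration assuming a modern Lucene API (SearcherManager, TextField, LongPoint); class names differ in the 3.x line that was current when this was asked, and the index path and field names are made up for the example.

        import java.nio.file.Paths;
        import org.apache.lucene.analysis.standard.StandardAnalyzer;
        import org.apache.lucene.document.Document;
        import org.apache.lucene.document.Field;
        import org.apache.lucene.document.LongPoint;
        import org.apache.lucene.document.StringField;
        import org.apache.lucene.document.TextField;
        import org.apache.lucene.index.IndexWriter;
        import org.apache.lucene.index.IndexWriterConfig;
        import org.apache.lucene.search.IndexSearcher;
        import org.apache.lucene.search.SearcherManager;
        import org.apache.lucene.store.FSDirectory;

        public class NearRealTimeIndex {
            public static void main(String[] args) throws Exception {
                FSDirectory dir = FSDirectory.open(Paths.get("/tmp/forum-index")); // illustrative path
                IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));

                // A SearcherManager hands out searchers that can see not-yet-committed segments.
                SearcherManager searchers = new SearcherManager(writer, null);

                // Indexing side: add a new forum message, then make it visible without optimize().
                Document doc = new Document();
                doc.add(new StringField("msgId", "forum-12345", Field.Store.YES)); // illustrative fields
                doc.add(new TextField("body", "text of the new forum message", Field.Store.NO));
                doc.add(new LongPoint("createdAt", System.currentTimeMillis()));
                writer.addDocument(doc);
                searchers.maybeRefresh(); // much cheaper than commit(); run every few seconds

                // Search side: borrow a searcher per query and always give it back.
                IndexSearcher searcher = searchers.acquire();
                try {
                    // run queries against `searcher` here
                } finally {
                    searchers.release(searcher);
                }

                // Question (c): expire old documents with a range delete on the date field.
                long cutoff = System.currentTimeMillis() - 7L * 24 * 60 * 60 * 1000;
                writer.deleteDocuments(LongPoint.newRangeQuery("createdAt", Long.MIN_VALUE, cutoff));

                writer.commit(); // durable commits can run on a slower schedule than maybeRefresh()
                searchers.close();
                writer.close();
                dir.close();
            }
        }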

    Read the article

  • Is JDBC or LDAP faster for basic read operations?

    - by Brandon
    I have a set of user data which I am trying to access. Because of the way our company's employee data is set up, the information is available both through LDAP and through a table in our DB. I was curious: for standard read operations, which would generally give better query performance?
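
    The honest answer is usually "measure it against your own directory and database", since the result depends on indexing, network hops and driver overhead on each side. A rough, illustrative Java harness for timing the same lookup both ways might look like the sketch below; the JDBC URL, credentials, table and column names, LDAP host, search base and filter are all placeholders.

        import java.sql.Connection;
        import java.sql.DriverManager;
        import java.sql.PreparedStatement;
        import java.sql.ResultSet;
        import java.util.Hashtable;
        import javax.naming.Context;
        import javax.naming.NamingEnumeration;
        import javax.naming.directory.DirContext;
        import javax.naming.directory.InitialDirContext;
        import javax.naming.directory.SearchControls;
        import javax.naming.directory.SearchResult;

        public class ReadLatencyComparison {
            public static void main(String[] args) throws Exception {
                // --- JDBC lookup (connection and statement reused across iterations) ---
                try (Connection db = DriverManager.getConnection(
                        "jdbc:oracle:thin:@//dbhost:1521/HR", "user", "pass");      // placeholder URL/credentials
                     PreparedStatement ps = db.prepareStatement(
                        "SELECT email FROM employees WHERE login = ?")) {           // placeholder table/columns
                    long start = System.nanoTime();
                    for (int i = 0; i < 1000; i++) {
                        ps.setString(1, "jdoe");
                        try (ResultSet rs = ps.executeQuery()) {
                            while (rs.next()) rs.getString(1);
                        }
                    }
                    System.out.println("JDBC: " + (System.nanoTime() - start) / 1_000_000 + " ms");
                }

                // --- LDAP lookup via JNDI (context reused across iterations) ---
                Hashtable<String, String> env = new Hashtable<>();
                env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
                env.put(Context.PROVIDER_URL, "ldap://ldaphost:389");               // placeholder host
                DirContext ldap = new InitialDirContext(env);
                SearchControls controls = new SearchControls();
                controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
                controls.setReturningAttributes(new String[] {"mail"});
                long start = System.nanoTime();
                for (int i = 0; i < 1000; i++) {
                    NamingEnumeration<SearchResult> results =
                        ldap.search("ou=people,dc=example,dc=com", "(uid=jdoe)", controls); // placeholder base/filter
                    while (results.hasMore()) results.next();
                }
                System.out.println("LDAP: " + (System.nanoTime() - start) / 1_000_000 + " ms");
                ldap.close();
            }
        }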

    Read the article

  • Resources for Performance testing

    - by munna
    Our small firm has been entrusted with creating an ASP.NET application using a client-server model. Now that we are almost done with development, we are putting together a small team for performance testing. I have searched the web on the topic but without much help. If any of you can share the 'how, what and why' of performance testing, it would be a great help.

    Read the article

  • Microsoft Access query speed...

    - by V.S.
    Hello everyone! I am writing a report about MS Access and I can't find any information about its performance compared to alternatives such as Microsoft SQL Server, MySQL, Oracle, etc. It's obvious that MS Access is going to be the slowest of the group, but there are no solid documents confirming this other than forum threads, and I don't have the time and resources to do the research myself :( Hoping for your help, V.S.

    Read the article

  • C# - Fast and simple multi dimensional data structures?

    - by Jeremy Rudd
    I need to store multi-dimensional data consisting of numbers in a manner that's easy to work with. I'm capturing data in real time, and once processed, older data would be destroyed and garbage collected. This data structure must be fast so it won't hurt my overall app performance; the faster the better. What are my choices in terms of platform-supported data structures? I'm using VS 2010 and .NET 4.

    Read the article

  • Is Mono fast enough for Mac OS X?

    - by prosseek
    I have to use .NET/C# for the next company project. Since I develop on a Mac, I looked into Mono as the development environment/toolchain. Is Mono on Mac OS X fast enough? That is, how does the performance of running an assembly compare to running the same code on .NET under Windows? In practical terms, do I have to buy a PC laptop for C#/.NET development?

    Read the article

  • Mod_rewrite on all website images

    - by Esteve Camps
    I'm designing an image repository. I want to decouple the filename in the HTML from the image file on disk. For instance, the image in the filesystem is called images/items/12543.jpg while the HTML is <img src="images/car.jpg" />. Would anyone strongly discourage rewriting all image requests through PHP, so that when images/car.jpg is requested, Apache actually serves the content of images/items/12543.jpg? I don't know whether this may cause performance problems.

    Read the article

  • What's the fastest lookup algorithm for a key-value pair data structure (i.e., a map)?

    - by truncheon
    In the following example a std::map is filled with 26 entries, keys 'a'–'z' and values 0–25. The time taken (on my system) to look up the last entry 10,000,000 times is roughly 250 ms for the vector and 125 ms for the map. (I compiled in release mode, with -O3 turned on, for g++ 4.4.) But if for some odd reason I wanted better performance than std::map, what data structures and functions would I need to consider using? I apologize if the answer seems obvious to you, but I haven't had much experience with the performance-critical aspects of C++ programming.

        #include <ctime>
        #include <map>
        #include <vector>
        #include <iostream>

        struct mystruct
        {
            char key;
            int value;
            mystruct(char k = 0, int v = 0) : key(k), value(v) { }
        };

        int find(const std::vector<mystruct>& ref, char key)
        {
            for (std::vector<mystruct>::const_iterator i = ref.begin(); i != ref.end(); ++i)
                if (i->key == key)
                    return i->value;
            return -1;
        }

        int main()
        {
            std::map<char, int> mymap;
            std::vector<mystruct> myvec;
            for (int i = 'a'; i < 'a' + 26; ++i)
            {
                mymap[i] = i - 'a';
                myvec.push_back(mystruct(i, i - 'a'));
            }

            int pre = clock();
            for (int i = 0; i < 10000000; ++i)
                find(myvec, 'z');
            std::cout << "linear scan: milli " << clock() - pre << "\n";

            pre = clock();
            for (int i = 0; i < 10000000; ++i)
                mymap['z'];
            std::cout << "map scan: milli " << clock() - pre << "\n";

            return 0;
        }

    Read the article

  • Maximum capabilities of MySQL

    - by cdated
    How do I know when a project is just too big for MySQL and I should use something with a better reputation for scalability? Is there a maximum database size for MySQL before performance starts to degrade? What factors make MySQL no longer a viable option compared to a commercial DBMS like Oracle or SQL Server?

    Read the article

  • What does "performant" software actually mean?

    - by Roddy
    I see it used a lot, but haven't seen a definition that makes complete sense. Wiktionary says "characterized by an adequate or excellent level of performance or efficiency", which isn't much help. Initially I thought performant just meant "fast", but others seem to think it's also about stability, code quality, memory use/footprint, or some combination of all of those. I think this is a "real" question, but if enough people reckon it is subjective, that's an answer in itself.

    Read the article

  • What would be better: (1 database + 4 tables) or (2 databases + 2 tables each)?

    - by griseldas
    Hi there, I would like advice on which would perform better: (A) one database with 4 tables, or (B) two databases on the same server, each with 2 tables. Table size and usage are more or less similar, so the two tables in database 1 would have similar usage/size to the two tables in database 2. The tables could hold 500,000+ records, and the two tables in each database are not related (no join queries etc. between them). Thanks in advance for your comments.

    Read the article

  • Regex vs. string:find() for simple word boundary

    - by user576267
    Say I only need to find out whether a line read from a file contains a word from a finite set of words. One way of doing this is to use a regex like:

        .*\y(good|better|best)\y.*

    Another way of accomplishing this is pseudocode like this:

        if ( (readLine.find("good") != string::npos) ||
             (readLine.find("better") != string::npos) ||
             (readLine.find("best") != string::npos) )
        {
            // line contains a word from a finite set of words.
        }

    Which way will have better performance (i.e. speed and CPU utilization)?

    Read the article

  • Improving performance for WRITE operation on Oracle DB in Java

    - by Lucky
    I have a typical scenario and need to understand the best possible way to handle it, so here it goes. I am developing a solution that retrieves data from a remote SOAP-based web service and then pushes that data to an Oracle database on the network, as a scheduled task that executes every 15 minutes. The remote service keeps event queues containing the INSERT/UPDATE/DELETE operations done since the last retrieval; once I retrieve the events for the last 15 minutes, it starts accumulating events for the next retrieval. Since I am only pushing data to Oracle, all my interactions are INSERT and UPDATE statements. There are around 60 tables in Oracle, some of them with 100+ columns. Moreover, each 15-minute cycle produces around 60-70 inserts, 100+ updates and 10-20 deletes. The application is an executable jar that terminates after each run and starts again on the next 15-minute cycle. So, how should I handle the WRITE operations (best practices) to improve performance for this application as a whole? Current test code (on every cycle):
    1. Connect to the remote service to get the events.
    2. Create a connection to the DB (a single connection object).
    3. Identify the type of operation (INSERT/UPDATE/DELETE) and the table it applies to.
    4. Call the respective method based on the operation type and table.
    5. Use a PreparedStatement with positional parameters, retrieve each column value from the remote service and assign it to the corresponding statement parameter.
    6. Execute and commit the statement, then return to the event-retrieval class to process the next event.
    The above repeats until all the retrieved events are processed, after which the program closes and everything starts again on the next cycle. Thanks for the help!
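
    A common first improvement for this kind of workload is to stop committing per statement and batch rows that hit the same PreparedStatement instead: disable auto-commit, queue rows with addBatch(), and flush with executeBatch() plus one commit per chunk. The sketch below is only an illustration against a plain Oracle JDBC connection; the table, columns, event-holder class and batch size of 100 are made-up placeholders.

        import java.sql.Connection;
        import java.sql.DriverManager;
        import java.sql.PreparedStatement;
        import java.util.Arrays;
        import java.util.List;

        public class BatchedWriter {

            // Placeholder event holder; in the real application the values come from the SOAP service.
            static class InsertEvent {
                final String id;
                final String payload;
                InsertEvent(String id, String payload) { this.id = id; this.payload = payload; }
            }

            static void writeInserts(Connection conn, List<InsertEvent> events) throws Exception {
                conn.setAutoCommit(false); // one commit per batch instead of one per row
                String sql = "INSERT INTO item_events (event_id, payload) VALUES (?, ?)"; // placeholder table
                try (PreparedStatement ps = conn.prepareStatement(sql)) {
                    int pending = 0;
                    for (InsertEvent e : events) {
                        ps.setString(1, e.id);
                        ps.setString(2, e.payload);
                        ps.addBatch();                  // queue the row instead of executing it immediately
                        if (++pending % 100 == 0) {     // flush in chunks of 100 (tunable)
                            ps.executeBatch();
                            conn.commit();
                        }
                    }
                    ps.executeBatch();                  // flush whatever is left
                    conn.commit();
                } catch (Exception ex) {
                    conn.rollback();
                    throw ex;
                }
            }

            public static void main(String[] args) throws Exception {
                try (Connection conn = DriverManager.getConnection(
                        "jdbc:oracle:thin:@//dbhost:1521/ORCL", "user", "pass")) { // placeholder URL
                    writeInserts(conn, Arrays.asList(
                            new InsertEvent("1", "first event"),
                            new InsertEvent("2", "second event")));
                }
            }
        }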

    Read the article

  • How much slower is a try/catch block? [closed]

    - by Euclid
    Possible Duplicate: What is the real overhead of try/catch in C#?
    How much slower is a try/catch block than a conditional? For example:

        try { v = someArray[10]; } catch { v = defaultValue; }

    or

        if (null != someArray) { v = someArray[10]; } else { v = defaultValue; }

    Is there much in it, or isn't there a definitive performance difference?

    Read the article

  • Improving the performance of XSL

    - by Rachel
    In the XSL below, the variable "insert-data" takes an input param with this structure:

        <insert-data>
            <data compareIndex="4" nodeName="d1e1">
                <a/>
            </data>
            <data compareIndex="5" nodeName="d1e1">
                <b/>
            </data>
            <data compareIndex="7" nodeName="d1e2">
                <a/>
            </data>
            <data compareIndex="9" nodeName="d1e2">
                <b/>
            </data>
        </insert-data>

    where "nodeName" is the id of a node and "compareIndex" is the position of the text content relative to the node having id "$nodeName". I am using the XSL below to select all the text nodes (via generate-id) that satisfy the above condition and construct a data XML. The implementation works correctly, but the execution time runs into minutes. Is there a better way of implementing this, or is an inefficient operation being used somewhere? From my observation, the code that calculates the length of the preceding text consumes most of the time. Please share your thoughts on improving the performance of this XSL. I am using the Java Saxon XSLT transformer.

        <xsl:variable name="insert-data" as="element()*">
            <xsl:for-each select="$insert-file/insert-data/data">
                <xsl:sort select="xsd:integer(@index)"/>
                <xsl:variable name="compareIndex" select="xsd:integer(@compareIndex)"/>
                <xsl:variable name="nodeName" select="@nodeName"/>
                <xsl:variable name="nodeContent" as="node()">
                    <xsl:copy-of select="node()"/>
                </xsl:variable>
                <xsl:for-each select="$main-root/*//text()[ancestor::*[@id = $nodeName]]">
                    <xsl:variable name="preTextLength" as="xsd:integer"
                        select="sum((preceding::text())[. ancestor::*[@id = $nodeName]]/string-length(.))"/>
                    <xsl:variable name="currentTextLength" as="xsd:integer" select="string-length(.)"/>
                    <xsl:variable name="sum" select="$preTextLength + $currentTextLength" as="xsd:integer"/>
                    <xsl:variable name="split-index" select="$compareIndex - $preTextLength" as="xsd:integer"/>
                    <xsl:if test="($sum ge $compareIndex) and ($compareIndex gt $preTextLength)">
                        <data split-index="{$split-index}" text-id="{generate-id(.)}">
                            <xsl:copy-of select="$nodeContent"/>
                        </data>
                    </xsl:if>
                </xsl:for-each>
            </xsl:for-each>
        </xsl:variable>

    Read the article

  • HttpClient multithread performance

    - by pepper
    I have an application which downloads more than 4500 HTML pages from 62 target hosts using HttpClient (4.1.3 or 4.2-beta). It runs on Windows 7 64-bit. Processor: Core i7 2600K. Network bandwidth: 54 Mb/s. At the moment it uses these parameters:
    - DefaultHttpClient and PoolingClientConnectionManager;
    - an IdleConnectionMonitorThread from http://hc.apache.org/httpcomponents-client-ga/tutorial/html/connmgmt.html;
    - maximum total connections = 80;
    - default maximum connections per route = 5;
    - for thread management, a ForkJoinPool with parallelism level = 5 (do I understand correctly that this is the number of worker threads?).
    In this configuration my network usage (in Windows Task Manager) does not rise above 2.5%, and downloading the 4500 pages takes 70 minutes. In the HttpClient logs I see entries like:

        DEBUG ForkJoinPool-2-worker-1 [org.apache.http.impl.conn.PoolingClientConnectionManager]: Connection released: [id: 209][route: {}-http://stackoverflow.com][total kept alive: 6; route allocated: 1 of 5; total allocated: 10 of 80]

    Total allocated connections never rise above 10-12, even though I have configured 80 connections. If I raise the parallelism level to 20 or 80, network usage remains the same but a lot of connection timeouts are generated. I have read the tutorials on hc.apache.org (HttpClient Performance Optimization Guide and HttpClient Threading Guide) but they do not help. The task's code looks like this:

        public class ContentDownloader extends RecursiveAction {
            private final HttpClient httpClient;
            private final HttpContext context;
            private List<Entry> entries;

            public ContentDownloader(HttpClient httpClient, List<Entry> entries) {
                this.httpClient = httpClient;
                context = new BasicHttpContext();
                this.entries = entries;
            }

            private void computeDirectly(Entry entry) {
                final HttpGet get = new HttpGet(entry.getLink());
                try {
                    HttpResponse response = httpClient.execute(get, context);
                    int statusCode = response.getStatusLine().getStatusCode();
                    if ((statusCode >= 400) && (statusCode <= 600)) {
                        logger.error("Couldn't get content from " + get.getURI().toString()
                                + "\n" + response.toString());
                    } else {
                        HttpEntity entity = response.getEntity();
                        if (entity != null) {
                            String htmlContent = EntityUtils.toString(entity).trim();
                            entry.setHtml(htmlContent);
                            EntityUtils.consumeQuietly(entity);
                        }
                    }
                } catch (Exception e) {
                } finally {
                    get.releaseConnection();
                }
            }

            @Override
            protected void compute() {
                if (entries.size() <= 1) {
                    computeDirectly(entries.get(0));
                    return;
                }
                int split = entries.size() / 2;
                invokeAll(new ContentDownloader(httpClient, entries.subList(0, split)),
                          new ContentDownloader(httpClient, entries.subList(split, entries.size())));
            }
        }

    So the question is: what is the best practice for using HttpClient from multiple threads? Are there rules for setting up the ConnectionManager and HttpClient? How can I use all 80 connections and raise network usage? If necessary, I will provide more code.
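
    For reference, a minimal sketch of wiring the pool limits discussed above into the HttpClient 4.2-style classes the question's logs show might look like this. The host name and the exact numbers are illustrative; the point is that the per-route limit (not just the total) has to be at least as large as the number of worker threads hitting any single host, otherwise threads queue on the pool no matter how high the total is.

        import org.apache.http.HttpHost;
        import org.apache.http.client.HttpClient;
        import org.apache.http.conn.routing.HttpRoute;
        import org.apache.http.impl.client.DefaultHttpClient;
        import org.apache.http.impl.conn.PoolingClientConnectionManager;

        public class PooledClientFactory {
            public static HttpClient create() {
                PoolingClientConnectionManager cm = new PoolingClientConnectionManager();
                cm.setMaxTotal(80);              // pool-wide ceiling
                cm.setDefaultMaxPerRoute(20);    // raise from 5 so threads on the same host are not serialized

                // Optionally give a particularly busy host its own, higher limit (illustrative host).
                HttpRoute heavyRoute = new HttpRoute(new HttpHost("stackoverflow.com", 80));
                cm.setMaxPerRoute(heavyRoute, 30);

                // DefaultHttpClient is safe to share across threads when backed by a pooling manager;
                // hand this single instance to all ForkJoin tasks.
                return new DefaultHttpClient(cm);
            }
        }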

    Read the article

  • ListView slow performance

    - by Mohamed Hemdan
    I've created a list of recipes using a ListView with a custom CursorAdapter. The custom row layout includes a photo for the recipe. I now have problems with the performance of viewing and scrolling the ListView even though it has only 10 records (the target is 150). Sometimes I get the error java.lang.OutOfMemoryError: bitmap size exceeds VM budget. I've tried to implement an AsyncTask but failed to get it working. Is there any way I can overcome this problem? Your help is highly appreciated! Here is my getView method:

        public View getView(int position, View convertView, ViewGroup parent) {
            View row = super.getView(position, convertView, parent);
            Cursor cursbbn = getCursor();
            if (row == null) {
                LayoutInflater inflater = (LayoutInflater) localContext.getSystemService(Context.LAYOUT_INFLATER_SERVICE);
                row = inflater.inflate(R.layout.listtype, null);
            }
            String Title = cursbbn.getString(2);
            String SandID = cursbbn.getString(1);
            String Readyin = cursbbn.getString(4);
            String Faovoites = cursbbn.getString(8);
            TextView titler = (TextView) row.findViewById(R.id.listmaintitle);
            TextView readyinr = (TextView) row.findViewById(R.id.listreadyin);
            int colorPos = position % colors.length;
            row.setBackgroundColor(colors[colorPos]);
            titler.setText(Title);
            readyinr.setText(Readyin);
            ImageView picture = (ImageView) row.findViewById(R.id.imageView1);
            Bitmap bitImg1 = BitmapFactory.decodeResource(localContext.getResources(), R.drawable.rec0001);
            Bitmap bitImg2 = BitmapFactory.decodeResource(localContext.getResources(), R.drawable.rec0002);
            Bitmap bitImg3 = BitmapFactory.decodeResource(localContext.getResources(), R.drawable.rec0003);
            Bitmap bitImg4 = BitmapFactory.decodeResource(localContext.getResources(), R.drawable.rec0004);
            Bitmap bitImg5 = BitmapFactory.decodeResource(localContext.getResources(), R.drawable.rec0005);
            Bitmap bitImg6 = BitmapFactory.decodeResource(localContext.getResources(), R.drawable.rec0006);
            Bitmap bitImg7 = BitmapFactory.decodeResource(localContext.getResources(), R.drawable.rec0007);
            Bitmap bitImg8 = BitmapFactory.decodeResource(localContext.getResources(), R.drawable.rec0008);
            Bitmap bitImg9 = BitmapFactory.decodeResource(localContext.getResources(), R.drawable.rec0009);
            Bitmap bitImg10 = BitmapFactory.decodeResource(localContext.getResources(), R.drawable.rec0010);
            if (SandID.contentEquals("0001")) picture.setImageBitmap(getRoundedCornerImage(bitImg1));
            if (SandID.contentEquals("0002")) picture.setImageBitmap(getRoundedCornerImage(bitImg2));
            if (SandID.contentEquals("0003")) picture.setImageBitmap(getRoundedCornerImage(bitImg3));
            if (SandID.contentEquals("0004")) picture.setImageBitmap(getRoundedCornerImage(bitImg4));
            if (SandID.contentEquals("0005")) picture.setImageBitmap(getRoundedCornerImage(bitImg5));
            if (SandID.contentEquals("0006")) picture.setImageBitmap(getRoundedCornerImage(bitImg6));
            if (SandID.contentEquals("0007")) picture.setImageBitmap(getRoundedCornerImage(bitImg7));
            if (SandID.contentEquals("0008")) picture.setImageBitmap(getRoundedCornerImage(bitImg8));
            if (SandID.contentEquals("0009")) picture.setImageBitmap(getRoundedCornerImage(bitImg9));
            if (SandID.contentEquals("0010")) picture.setImageBitmap(getRoundedCornerImage(bitImg10));
            return row;
        }

    And this is the error:

        05-02 03:11:55.898: E/AndroidRuntime(376): FATAL EXCEPTION: main
        05-02 03:11:55.898: E/AndroidRuntime(376): java.lang.OutOfMemoryError: bitmap size exceeds VM budget
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.graphics.BitmapFactory.nativeDecodeAsset(Native Method)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.graphics.BitmapFactory.decodeStream(BitmapFactory.java:460)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.graphics.BitmapFactory.decodeResourceStream(BitmapFactory.java:336)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.graphics.BitmapFactory.decodeResource(BitmapFactory.java:359)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.graphics.BitmapFactory.decodeResource(BitmapFactory.java:385)
        05-02 03:11:55.898: E/AndroidRuntime(376): at master.chef.mediamaster.AlternateRowCursorAdapter.getView(AlternateRowCursorAdapter.java:83)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.widget.AbsListView.obtainView(AbsListView.java:1409)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.widget.ListView.makeAndAddView(ListView.java:1745)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.widget.ListView.fillUp(ListView.java:700)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.widget.ListView.fillGap(ListView.java:646)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.widget.AbsListView.trackMotionScroll(AbsListView.java:3399)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.widget.AbsListView.onTouchEvent(AbsListView.java:2233)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.widget.ListView.onTouchEvent(ListView.java:3446)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.view.View.dispatchTouchEvent(View.java:3885)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.view.ViewGroup.dispatchTouchEvent(ViewGroup.java:903)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.view.ViewGroup.dispatchTouchEvent(ViewGroup.java:942)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.view.ViewGroup.dispatchTouchEvent(ViewGroup.java:942)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.view.ViewGroup.dispatchTouchEvent(ViewGroup.java:942)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.view.ViewGroup.dispatchTouchEvent(ViewGroup.java:942)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.view.ViewGroup.dispatchTouchEvent(ViewGroup.java:942)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.view.ViewGroup.dispatchTouchEvent(ViewGroup.java:942)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.view.ViewGroup.dispatchTouchEvent(ViewGroup.java:942)
        05-02 03:11:55.898: E/AndroidRuntime(376): at com.android.internal.policy.impl.PhoneWindow$DecorView.superDispatchTouchEvent(PhoneWindow.java:1691)
        05-02 03:11:55.898: E/AndroidRuntime(376): at com.android.internal.policy.impl.PhoneWindow.superDispatchTouchEvent(PhoneWindow.java:1125)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.app.Activity.dispatchTouchEvent(Activity.java:2096)
        05-02 03:11:55.898: E/AndroidRuntime(376): at com.android.internal.policy.impl.PhoneWindow$DecorView.dispatchTouchEvent(PhoneWindow.java:1675)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.view.ViewRoot.deliverPointerEvent(ViewRoot.java:2194)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.view.ViewRoot.handleMessage(ViewRoot.java:1878)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.os.Handler.dispatchMessage(Handler.java:99)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.os.Looper.loop(Looper.java:123)
        05-02 03:11:55.898: E/AndroidRuntime(376): at android.app.ActivityThread.main(ActivityThread.java:3683)
        05-02 03:11:55.898: E/AndroidRuntime(376): at java.lang.reflect.Method.invokeNative(Native Method)
        05-02 03:11:55.898: E/AndroidRuntime(376): at java.lang.reflect.Method.invoke(Method.java:507)
        05-02 03:11:55.898: E/AndroidRuntime(376): at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:839)
        05-02 03:11:55.898: E/AndroidRuntime(376): at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:597)
        05-02 03:11:55.898: E/AndroidRuntime(376): at dalvik.system.NativeStart.main(Native Method)
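
    One likely culprit is that every getView() call decodes all ten full-size drawables, so the heap fills with bitmaps the adapter never reuses. A hedged sketch of the usual fix, decoding each drawable once at a reduced size and caching it by resource id, is shown below; the target dimensions are illustrative, the getRoundedCornerImage() call refers back to the question's own code, and everything else is an assumption rather than the asker's actual class names.

        import java.util.HashMap;
        import java.util.Map;
        import android.content.res.Resources;
        import android.graphics.Bitmap;
        import android.graphics.BitmapFactory;

        public class RecipeImageCache {
            private final Map<Integer, Bitmap> cache = new HashMap<Integer, Bitmap>();
            private final Resources resources;

            public RecipeImageCache(Resources resources) {
                this.resources = resources;
            }

            // Decode each resource at most once, downsampled to roughly the row's thumbnail size.
            public Bitmap get(int resId, int targetWidth, int targetHeight) {
                Bitmap cached = cache.get(resId);
                if (cached != null) {
                    return cached;
                }
                // First pass: read only the image bounds so we can pick a sample size.
                BitmapFactory.Options options = new BitmapFactory.Options();
                options.inJustDecodeBounds = true;
                BitmapFactory.decodeResource(resources, resId, options);

                int sampleSize = 1;
                while (options.outWidth / (sampleSize * 2) >= targetWidth
                        && options.outHeight / (sampleSize * 2) >= targetHeight) {
                    sampleSize *= 2;
                }

                // Second pass: decode for real at the reduced size and remember it.
                options = new BitmapFactory.Options();
                options.inSampleSize = sampleSize;
                Bitmap bitmap = BitmapFactory.decodeResource(resources, resId, options);
                cache.put(resId, bitmap);
                return bitmap;
            }
        }

    In getView() the adapter would then map SandID to the matching drawable id once (for example with a small lookup table) and call something like picture.setImageBitmap(getRoundedCornerImage(cache.get(resId, 96, 96))), instead of decoding all ten resources for every row.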

    Read the article

  • Optimizing python code performance when importing zipped csv to a mongo collection

    - by mark
    I need to import a zipped csv into a mongo collection, but there is a catch: every record contains a timestamp in Pacific Time, which must be converted to the local time corresponding to the (longitude, latitude) pair found in the same record. The code looks like so:

        def read_csv_zip(path, timezones):
            with ZipFile(path) as z, z.open(z.namelist()[0]) as input:
                csv_rows = csv.reader(input)
                header = csv_rows.next()
                check, converters = get_aux_stuff(header)
                for csv_row in csv_rows:
                    if check(csv_row):
                        row = {
                            converter[0]: converter[1](value)
                            for converter, value in zip(converters, csv_row)
                            if allow_field(converter)
                        }
                        ts = row['ts']
                        lng, lat = row['loc']
                        found_tz_entry = timezones.find_one(SON({'loc': {'$within': {'$box': [[lng-tz_lookup_radius, lat-tz_lookup_radius], [lng+tz_lookup_radius, lat+tz_lookup_radius]]}}}))
                        if found_tz_entry:
                            tz_name = found_tz_entry['tz']
                            local_ts = ts.astimezone(timezone(tz_name)).replace(tzinfo=None)
                            row['tz'] = tz_name
                        else:
                            local_ts = (ts.astimezone(utc) + timedelta(hours=int(lng/15))).replace(tzinfo=None)
                        row['local_ts'] = local_ts
                        yield row

        def insert_documents(collection, source, batch_size):
            while True:
                items = list(itertools.islice(source, batch_size))
                if len(items) == 0:
                    break
                try:
                    collection.insert(items)
                except:
                    for item in items:
                        try:
                            collection.insert(item)
                        except Exception as exc:
                            print("Failed to insert record {0} - {1}".format(item['_id'], exc))

        def main(zip_path):
            with Connection() as connection:
                data = connection.mydb.data
                timezones = connection.timezones.data
                insert_documents(data, read_csv_zip(zip_path, timezones), 1000)

    The code proceeds as follows:
    1. Every record read from the csv is checked and converted to a dictionary, where some fields may be skipped, some fields renamed (from those appearing in the csv header), and some values converted (to datetime, to integers, to floats, etc.).
    2. For each record read from the csv, a lookup is made into the timezones collection to map the record location to the respective time zone.
    3. If the mapping is successful, that timezone is used to convert the record timestamp (Pacific Time) to the respective local timestamp. If no mapping is found, a rough approximation is calculated.
    The timezones collection is appropriately indexed, of course; calling explain() confirms it. The process is slow. Naturally, having to query the timezones collection for every record kills the performance. I am looking for advice on how to improve it. Thanks.
    EDIT: The timezones collection contains 8176040 records, each containing four values:

        > db.data.findOne()
        { "_id" : 3038814, "loc" : [ 1.48333, 42.5 ], "tz" : "Europe/Andorra" }

    EDIT2: OK, I have compiled a release build of http://toblerity.github.com/rtree/ and configured the rtree package. Then I created an rtree dat/idx pair of files corresponding to my timezones collection. So, instead of calling collection.find_one I call index.intersection. Surprisingly, not only is there no improvement, it now works even more slowly! Maybe rtree could be fine-tuned to load the entire dat/idx pair into RAM (704M), but I do not know how to do it. Until then, it is not an alternative. In general, I think the solution should involve parallelization of the task.
    EDIT3: Profile output when using collection.find_one:

        >>> p.sort_stats('cumulative').print_stats(10)
        Tue Apr 10 14:28:39 2012    ImportDataIntoMongo.profile

                 64549590 function calls (64549180 primitive calls) in 1231.257 seconds

           Ordered by: cumulative time
           List reduced from 730 to 10 due to restriction <10>

           ncalls  tottime  percall  cumtime  percall filename:lineno(function)
                1    0.012    0.012 1231.257 1231.257 ImportDataIntoMongo.py:1(<module>)
                1    0.001    0.001 1230.959 1230.959 ImportDataIntoMongo.py:187(main)
                1  853.558  853.558  853.558  853.558 {raw_input}
                1    0.598    0.598  370.510  370.510 ImportDataIntoMongo.py:165(insert_documents)
           343407    9.965    0.000  359.034    0.001 ImportDataIntoMongo.py:137(read_csv_zip)
           343408    2.927    0.000  287.035    0.001 c:\python27\lib\site-packages\pymongo\collection.py:489(find_one)
           343408    1.842    0.000  274.803    0.001 c:\python27\lib\site-packages\pymongo\cursor.py:699(next)
           343408    2.542    0.000  271.212    0.001 c:\python27\lib\site-packages\pymongo\cursor.py:644(_refresh)
           343408    4.512    0.000  253.673    0.001 c:\python27\lib\site-packages\pymongo\cursor.py:605(__send_message)
           343408    0.971    0.000  242.078    0.001 c:\python27\lib\site-packages\pymongo\connection.py:871(_send_message_with_response)

    Profile output when using index.intersection:

        >>> p.sort_stats('cumulative').print_stats(10)
        Wed Apr 11 16:21:31 2012    ImportDataIntoMongo.profile

                 41542960 function calls (41542536 primitive calls) in 2889.164 seconds

           Ordered by: cumulative time
           List reduced from 778 to 10 due to restriction <10>

           ncalls  tottime  percall  cumtime  percall filename:lineno(function)
                1    0.028    0.028 2889.164 2889.164 ImportDataIntoMongo.py:1(<module>)
                1    0.017    0.017 2888.679 2888.679 ImportDataIntoMongo.py:202(main)
                1 2365.526 2365.526 2365.526 2365.526 {raw_input}
                1    0.766    0.766  502.817  502.817 ImportDataIntoMongo.py:180(insert_documents)
           343407    9.147    0.000  491.433    0.001 ImportDataIntoMongo.py:152(read_csv_zip)
           343406    0.571    0.000  391.394    0.001 c:\python27\lib\site-packages\rtree-0.7.0-py2.7.egg\rtree\index.py:384(intersection)
           343406  379.957    0.001  390.824    0.001 c:\python27\lib\site-packages\rtree-0.7.0-py2.7.egg\rtree\index.py:435(_intersection_obj)
           686513   22.616    0.000   38.705    0.000 c:\python27\lib\site-packages\rtree-0.7.0-py2.7.egg\rtree\index.py:451(_get_objects)
           343406    6.134    0.000   33.326    0.000 ImportDataIntoMongo.py:162(<dictcomp>)
              346    0.396    0.001   30.665    0.089 c:\python27\lib\site-packages\pymongo\collection.py:240(insert)

    EDIT4: I have parallelized the code, but the results are still not very encouraging. I am convinced it could be done better. See my own answer to this question for details.

    Read the article
