Search Results

Search found 13403 results on 537 pages for 'epm performance tuning'.


  • New record may be written twice in clustered index structure

    - by Cupidvogel
    As per the article at Microsoft, under the "Test 1: INSERT Performance" section, it is written that for the table with the clustered index, only a single write operation is required since the leaf nodes of the clustered index are data pages (as explained in the section "Clustered Indexes and Heaps"), whereas for the table with the nonclustered index, two write operations are required: one for the entry into the index B-tree and another for the insert of the data itself. I don't think that is necessarily true. Clustered indexes are implemented through B+ tree structures, right? If you look at this article, which gives a simple example of inserting into a B+ tree, we can see that when 8 is initially inserted, it is written only once, but then when 5 comes in, it is written to the root node as well (thus written twice, albeit not initially at the time of insertion). Also, when 8 comes in next, it is written twice, once at the root and then at the leaf. So wouldn't it be correct to say that the number of rewrites in the case of a clustered index is much lower than in a nonclustered index structure (where it must occur every time), instead of saying that a rewrite doesn't occur in a clustered index at all?


  • Testing approach for multi-threaded software

    - by Shane MacLaughlin
    I have a piece of mature geospatial software that has recently had areas rewritten to take better advantage of the multiple processors available in modern PCs. Specifically, display, GUI, spatial searching, and main processing have all been hived off to separate threads. The software has a pretty sizeable GUI automation suite for functional regression, and another smaller one for performance regression. While all automated tests are passing, I'm not convinced that they provide nearly enough coverage in terms of finding bugs relating to race conditions, deadlocks, and other nasties associated with multi-threading. What techniques would you use to see if such bugs exist? What techniques would you advocate for rooting them out, assuming there are some in there to root out? What I'm doing so far is running the GUI functional automation on the app running under a debugger, so that I can break out of deadlocks and catch crashes, and I plan to make a bounds checker build and repeat the tests against that version. I've also carried out a static analysis of the source via PC-Lint in the hope of locating potential deadlocks, but have not had any worthwhile results. The application is C++, MFC, multiple document/view, with a number of threads per doc. The locking mechanism I'm using is based on an object that includes a pointer to a CMutex, which is locked in the ctor and freed in the dtor. I use local variables of this object to lock various bits of code as required, and my mutex has a timeout that fires a warning if the timeout is reached. I avoid locking where possible, using resource copies instead. What other tests would you carry out?
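
    The scoped lock with a timeout warning described above, combined with a small stress harness, can be sketched as follows. This is only an illustration in Python rather than the poster's C++/MFC code; the names timed_lock, stress, lock_warnings and state are hypothetical and do not come from the original post.

```python
import threading
from contextlib import contextmanager

lock_warnings = []      # hypothetical sink for timeout warnings
state = {"count": 0}    # shared data protected by counter_lock
counter_lock = threading.Lock()


@contextmanager
def timed_lock(lock, name, timeout=1.0):
    """Hold `lock` for the duration of a with-block; record a warning on timeout."""
    acquired = lock.acquire(timeout=timeout)
    if not acquired:
        lock_warnings.append("timeout waiting for " + name)
    try:
        yield acquired
    finally:
        if acquired:
            lock.release()


def increment():
    """Example worker: a read-modify-write that is only safe under the lock."""
    with timed_lock(counter_lock, "counter_lock") as ok:
        if ok:
            state["count"] += 1


def stress(worker, n_threads=8, iterations=2000):
    """Call `worker` repeatedly from several threads to provoke races."""
    def run():
        for _ in range(iterations):
            worker()

    threads = [threading.Thread(target=run) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

    Running stress(increment) and then checking both state["count"] and lock_warnings gives a cheap, repeatable check that can sit alongside the GUI automation. It does not prove the absence of races; it only raises the odds of hitting them.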


  • Guid Primary /Foreign Key dilemma SQL Server

    - by Xience
    Hi guys, I am faced with the dilemma of changing my primary keys from int identities to Guid. I'll put my problem straight up. It's a typical retail management app, with POS and back-office functionality. It has about 100 tables. The database synchronizes with other databases and receives/sends new data. Most tables don't have frequent inserts, updates or select statements executing on them. However, some do have frequent inserts and selects on them, e.g. the products and orders tables. Some tables have up to 4 foreign keys in them. If I changed my primary keys from 'int' to 'Guid', would there be a performance issue when inserting or querying data from tables that have many foreign keys? I know people have said that indexes will be fragmented and that 16 bytes is an issue. Space wouldn't be an issue in my case, and apparently index fragmentation can also be taken care of using the 'NEWSEQUENTIALID()' function. Can someone tell me, from their experience, if Guid will be problematic in tables with many foreign keys? I'd much appreciate your thoughts on it...


  • How good is the memory mapped Circular Buffer on Wikipedia?

    - by abroun
    I'm trying to implement a circular buffer in C, and have come across this example on Wikipedia. It looks as if it would provide a really nice interface for anyone reading from the buffer, as reads which wrap around from the end to the beginning of the buffer are handled automatically. So all reads are contiguous. However, I'm a bit unsure about using it straight away as I don't really have much experience with memory mapping or virtual memory and I'm not sure that I fully understand what it's doing. What I think I understand is that it's mapping a shared memory file the size of the buffer into memory twice. Then, whenever data is written into the buffer it appears in memory in 2 places at once. This allows all reads to be contiguous. What would be really great is if someone with more experience of POSIX memory mapping could have a quick look at the code and tell me if the underlying mechanism used is really that efficient. Am I right in thinking for example that the file in /dev/shm used for the shared memory always stays in RAM or could it get written to the hard drive (performance hit) at some point? Are there any gotchas I should be aware of? As it stands, I'm probably going to use a simpler method for my current project, but it'd be good to understand this to have it in my toolbox for the future. Thanks in advance for your time.


  • OptimisticLockException in inner transaction ruins outer transaction

    - by Pace
    I have the following code (OLE = OptimisticLockException):

        public void outer() {
            try { middle(); } catch (OLE e) { updateEntities(); outer(); }
        }

        @Transactional
        public void middle() {
            try { inner(); } catch (OLE e) { updateEntities(); middle(); }
        }

        @Transactional
        public void inner() {
            // Do DB operation
        }

    inner() is called by other non-transactional methods, which is why both middle() and inner() are transactional. As you can see, I deal with OLEs by updating the entities and retrying the operation. The problem I'm having is that when I designed things this way I was assuming that the only time one could get an OLE was when a transaction closed. This is apparently not the case, as the call to inner() is throwing an OLE even when the stack is outer()->middle()->inner(). Now, middle() is properly handling the OLE and the retry succeeds, but when it comes time to close the transaction it has been marked rollbackOnly by Spring. When the middle() method call finally returns, the closing aspect throws an exception because it can't commit a transaction marked rollbackOnly. I'm uncertain what to do here. I can't clear the rollbackOnly state. I don't want to force the creation of a transaction on every call to inner() because that kills my performance. Am I missing something, or can anyone see a way I can structure this differently?

    EDIT: To clarify what I'm asking, let me explain my main question. Is it possible to catch and handle an OLE if you are inside an @Transactional method?

    FYI: The transaction manager is a JpaTransactionManager and the JPA provider is Hibernate.


  • Flex profiling - what is [enterFrameEvent] doing?

    - by Herms
    I've been tasked with finding (and potentially fixing) some serious performance problems with a Flex application that was delivered to us. The application will consistently take up 50 to 100% of the CPU at times when it is simply idling and shouldn't be doing anything. My first step was to run the profiler that comes with FlexBuilder. I expected to find some method that was taking up most of the time, showing me where the bottleneck was. However, I got something unexpected. The top 4 methods were:

        [enterFrameEvent] - 84% cumulative, 32% self time
        [reap] - 20% cumulative and self time
        [tincan] - 8% cumulative and self time
        global.isNaN - 4% cumulative and self time

    All other methods had less than 1% for both cumulative and self time. From what I've found online, the [bracketed methods] are what the profiler lists when it doesn't have an actual Flex method to show. I saw someone claim that [tincan] is the processing of RTMP requests, and I assume [reap] is the garbage collector. Does anyone know what [enterFrameEvent] is actually doing? I assume it's essentially the "main" function for the event loop, so the high cumulative time is expected. But why is the self time so high? What's actually going on? I didn't expect the player internals to be taking up so much time, especially since nothing is actually happening in the app (and there are no UI updates going on). Is there any good way to dig into what's happening? I know something is going on that shouldn't be (it looks like there must be some kind of busy wait or other runaway loop), but the profiler isn't giving me the results I was expecting. My next step is going to be to start adding debug trace statements in various places to try and track down what's actually happening, but I feel like there has to be a better way.


  • What nonclustered index would be better to create on SQL Server?

    - by Junior Mayhé
    Here I am studying nonclustered indexes in SQL Server Management Studio. I've created a table with more than 1 million records. This table has a primary key. I run the following query:

        SELECT CustomerName FROM Customers

    The execution plan shows me:

        I/O cost = 3.45646
        Operator cost = 4.57715

    As a first attempt to improve performance, I've created a nonclustered index for this table:

        CREATE NONCLUSTERED INDEX [IX_CustomerID_CustomerName] ON [dbo].[Customers]
        (
            [CustomerId] ASC,
            [CustomerName] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
        GO

    With this first try, I've executed the select statement and the execution plan shows me:

        I/O cost = 2.79942
        Operator cost = 3.92001

    For the second try, I've deleted this nonclustered index in order to create a new one:

        CREATE NONCLUSTERED INDEX [IX_CategoryName] ON [dbo].[Categories]
        (
            [CategoryId] ASC
        )
        INCLUDE ( [CategoryName])
        WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
        GO

    With this second try, I've executed the select statement and the execution plan shows me the same result:

        I/O cost = 2.79942
        Operator cost = 3.92001

    Am I doing something wrong, or is this expected? Shall I use the first nonclustered index with two fields, or the second nonclustered index with one field (CategoryID) including the second field (CategoryName)?


  • Populating and Using Dynamic Classes in C#/.NET 4.0

    - by Bob
    In our application we're considering using dynamically generated classes to hold a lot of our data. The reason for doing this is that we have customers with tables that have different structures. So you could have a customer table called "DOG" (just making this up) that contains the columns "DOGID", "DOGNAME", "DOGTYPE", etc. Customer #2 could have the same table "DOG" with the columns "DOGID", "DOG_FIRST_NAME", "DOG_LAST_NAME", "DOG_BREED", and so on. We can't create classes for these at compile time as the customer can change the table schema at any time. At the moment I have code that can generate a "DOG" class at run-time using reflection. What I'm trying to figure out is how to populate this class from a DataTable (or some other .NET mechanism) without extreme performance penalties. We have one table that contains ~20 columns and ~50k rows. Doing a foreach over all of the rows and columns to create the collection takes about 1 minute, which is a little too long. Am I trying to come up with a solution that's too complex, or am I on the right track? Has anyone else experienced a problem like this? Creating dynamic classes was the solution that a developer at Microsoft proposed. If we can just populate this collection and use it efficiently, I think it could work.


  • Seeking suggestions on redesigning the interface

    - by ratkok
    As part of maintaining a large piece of legacy code, we need to change part of the design, mainly to make it more testable (unit testing). One of the issues we need to resolve is the existing interface between components. The interface between two components is a class that contains static methods only. Simplified example:

        class ABInterface {
            static methodA();
            static methodB();
            ...
            static methodZ();
        };

    The interface is used by component A so that different methods can use ABInterface::methodA() in order to prepare some input data and then invoke appropriate functions within component B. Now we are trying to redesign this interface for various reasons:

      - Extending our unit test coverage: we need to resolve this dependency between the components, and stubs/mocks are to be introduced.
      - The interface between these components has diverged from the original design (i.e. a lot of newer functions used for the inter-component interface were created outside this interface class).
      - The code is old, has changed a lot over time, and needs to be refactored.

    The change should not be disruptive for the rest of the system. We try to limit the number of test-only artifacts left in the production code. Performance is very important, and there should be no (or very minimal) degradation after the redesign. The code is OO in C++. I am looking for some ideas on what approach to take. Any suggestions on how to do this efficiently?


  • Index an array expression directly in PostgreSQL

    - by wich
    I'm trying to insert data into a table from a template table. I need to rewrite one of the columns, for which I wanted to use a directly indexed array expression, but I can't seem to find how to do this, if it is even possible. The scenario:

        create table template ( id integer, index integer, foo integer);
        insert into template values
            (0, 1, 23), (0, 2, 18), (0, 3, 16), (0, 4, 7),
            (1, 1, 17), (1, 2, 26), (1, 3, 11), (1, 4, 3);
        create table data ( data_id integer, foo integer);

    Now what I'd like to do is the following:

        insert into data
        select (array[3,7,5,2])[index], foo from template where id = 1;

    But this doesn't work; the (array[3,7,5,2])[index] syntax isn't valid. I tried a few variants, but was unable to get anything working and wasn't able to find the correct syntax in the docs, nor even whether this is possible at all. As a current workaround I've devised the following, but it is less than ideal, from an elegance perspective at least, and it may also be a performance hit; I haven't looked into that yet.

        insert into data
        select arr[index], foo
        from template, (select array[3,7,5,2] as arr) as q
        where id = 1;

    If anyone could suggest a (better) alternative to accomplish this, I'd like to hear that as well.


  • Detaching all entities of T to get fresh data

    - by Goran
    Let's take an example where there are two types of entities loaded: Product and Category, with Product.CategoryId - Category.Id. We have CRUD operations available on Products (not Categories). If Categories are updated on another screen (or by another user on the network), we would like to be able to reload the Categories while preserving the context we currently use, since we could be in the middle of editing data, and we do not want changes to be lost (and we cannot depend on saving, since we have incomplete data). Since there is no easy way to tell EF to get fresh data (added, removed and modified), we thought of two possible ways:

    1) Keeping products attached to the context, and categories detached from the context. This would mean that we lose the ability to access Product.Category.Name, which we do sometimes require, so we would need to resolve it manually (for example when printing data).

    2) Detaching / attaching all Categories from the current context:

        Context.ChangeTracker.Entries().Where(x => x.Entity.GetType() == typeof(T)).ForEach(x => x.State = EntityState.Detached);

    And then reloading the categories, which will get fresh data.

    Do you find any problem with this second approach? We understand that this will require all constraints to be put on foreign keys, and not navigation properties, since when detaching all Categories, Product.Category navigation properties would be reset to null as well. Also, there could be a potential performance problem, which we did not test, since there could be a couple of thousand products loaded, and all would need to resolve the navigation property when reloading. Which of the two do you prefer, and is there a better way (EF6 + .NET 4.0)?


  • Could I do this blind relative to absolute path conversion (for perforce depot paths) better?

    - by wonderfulthunk
    I need to "blindly" (i.e. without access to the filesystem, in this case the source control server) convert some relative paths to absolute paths. So I'm playing with dotdots and indices. For those that are curious, I have a log file produced by someone else's tool that sometimes outputs relative paths, and for performance reasons I don't want to access the source control server where the paths are located to check if they're valid and more easily convert them to their absolute path equivalents. I've gone through a number of (probably foolish) iterations trying to get it to work, mostly a few variations of iterating over the array of folders and trying delete_at(index) and delete_at(index-1), but my index kept incrementing while I was deleting elements of the array out from under myself, which didn't work for cases with multiple dotdots. Any tips on improving it in general, or specifically the lack of non-consecutive dotdot support, would be welcome. Currently this is working with my limited examples, but I think it could be improved. It can't handle non-consecutive '..' directories, and I am probably doing a lot of wasteful (and error-prone) things that I don't need to do because I'm a bit of a hack. I've found a lot of examples of converting other types of relative paths using other languages, but none of them seemed to fit my situation.

    These are my example paths that I need to convert, from:

        //depot/foo/../bar/single.c
        //depot/foo/docs/../../other/double.c
        //depot/foo/usr/bin/../../../else/more/triple.c

    to:

        //depot/bar/single.c
        //depot/other/double.c
        //depot/else/more/triple.c

    And my script:

        begin
          paths = File.open(ARGV[0]).readlines
          puts(paths)
          new_paths = Array.new
          paths.each { |path|
            folders = path.split('/')
            if ( folders.include?('..') )
              num_dotdots = 0
              first_dotdot = folders.index('..')
              last_dotdot = folders.rindex('..')
              folders.each { |item|
                if ( item == '..' )
                  num_dotdots += 1
                end
              }
              if ( first_dotdot and ( num_dotdots > 0 ) ) # this might be redundant?
                folders.slice!(first_dotdot - num_dotdots..last_dotdot) # dependent on consecutive dotdots only
              end
            end
            folders.map! { |elem|
              if ( elem !~ /\n/ )
                elem = elem + '/'
              else
                elem = elem
              end
            }
            new_paths << folders.to_s
          }
          puts(new_paths)
        end
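
    One common way to handle non-consecutive dotdots without touching any filesystem is to walk the segments once and keep a stack, popping on each '..'. The sketch below is in Python rather than the poster's Ruby, and the function name resolve_dotdots is hypothetical. It assumes that a '..' which would climb above the first segment is simply ignored.

```python
def resolve_dotdots(path):
    """Collapse '.' and '..' segments purely textually (no filesystem access)."""
    path = path.strip()  # drop the trailing newline left by readlines()
    # Preserve the leading '//' of a depot path (or a single '/' if present).
    if path.startswith("//"):
        prefix = "//"
    elif path.startswith("/"):
        prefix = "/"
    else:
        prefix = ""

    stack = []
    for segment in path.split("/"):
        if segment in ("", "."):
            continue            # empty pieces come from the leading slashes
        if segment == "..":
            if stack:
                stack.pop()     # assumption: extra '..' at the top is ignored
        else:
            stack.append(segment)
    return prefix + "/".join(stack)


if __name__ == "__main__":
    for example in (
        "//depot/foo/../bar/single.c",
        "//depot/foo/docs/../../other/double.c",
        "//depot/foo/usr/bin/../../../else/more/triple.c",
        "//depot/a/../b/../c/mixed.c",
    ):
        print(resolve_dotdots(example))
```

    Because each '..' only ever removes the most recently kept segment, the order and spacing of the dotdots no longer matters, and there is no index to keep in sync while deleting.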


  • Advanced search engine or server for relational database [closed]

    - by Pawel
    In my current project we are storing a big volume of data in a relational database. One of the recent key requirements is to enrich the application by adding some advanced search capabilities. In the project, performance is one of the important factors due to very large tables (10+ million records) with parent-child relations (for example: a multi-level parent-child relationship, where I am looking for all parents with specific children). The search engine should also be able to check these references for hits. I have found some potential engines on Stack Overflow; however, it looks like all of them are dedicated to text search rather than relational databases, and are hosted on Linux:

      - Lucene
      - Solr
      - Sphinx

    As I understand it, some of them use documents as a source for searching, but is it possible or efficient to programmatically create documents based on my relational data? As I am not familiar with all of their features/capabilities, can anyone please make some recommendations or propose a different solution? To summarize my requirements:

      - a framework/engine to search a relational database, including descendants
      - support for Microsoft SQL Server
      - can be used in .NET applications
      - preferably hosted on Windows systems

    Are any of the engines mentioned above able to solve my problem? Do you know of any better solution?


  • customer.name joining transactions.name vs. customer.id [serial] joining transactions.id [integer]

    - by Frank Computer
    INFORMIX-SQL 7.32 pawnshop application: a one-to-many relationship where each customer (master) can have many transactions (detail).

        customer( id serial, pk_name char(30), {PATERNAL-NAME MATERNAL-NAME, FIRST-NAME MIDDLE-NAME} [...] );
        unique index on id;
        unique cluster index on name;

        transaction( fk_name char(30), ticket_number serial, [...] );
        dups cluster index on fk_name;
        unique index on ticket_number;

    Several people have told me this is not the correct way to join master to detail. They said I should always join customer.id[serial] to transactions.id[integer]. When a customer pawns merchandise, the clerk queries the master using wildcards on name. The query usually returns several customers; the clerk scrolls until locating the right name, enters a 'D' to change to the detail transactions table, all transactions are automatically queried, and then the clerk enters an 'A' to add a new transaction. The problem with joining customer.id to transaction.id is that although the customer table is maintained in sorted name order, clustering the transaction table by fk_id groups the transactions by fk_id, but they are not in the same order as the customer names, so when the clerk is scrolling through customer names in the master, the system has to jump all over the place to locate the clustered transactions belonging to each customer. As each new customer is added, the next id is assigned to that customer, but new customers don't show up in alphabetical order. I experimented with id joins and confirmed the decrease in performance. How can I use id joins instead of name joins and still preserve the clustered transaction order by name if transactions has no name column?


  • MySQL Database Design with Internationalization

    - by Some name
    Hello, I'm going to start work on a medium-sized application, and I'm planning its db design. One thing that I'm not sure about is this. I will have many tables which will need internationalization, such as "membership_options, gender_options, language_options etc". Each of these tables will share common i18n fields, like "title, alternative_title, short_description, description". In your opinion, which is the best way to do it? Have an i18n table with the same fields for each of the tables that will need them? Or do something like:

        Membership table          Gender table
        ----------------          --------------
        id | created_at           id | created_at
        1  - 22.03.2001           1  - 14.08.2002
        2  - 22.03.2001           2  - 14.08.2002

        General translation table
        -------------------------
        record_id | table_name | string_name | alternative_title | .... | id_language
        1         - membership   regular       null                       1 (english)
        1         - membership   normale       null                       2 (italian)
        1         - gender       man           null                       1 (english)
        1         - gender       uomo          null                       2 (italian)

    This would avoid me repeating something like:

        membership_translation table
        -----------------------------
        membership_id | name    | alternative_title | id_lang
        1               regular   null                1
        1               normale   null                2

        gender_translation table
        -----------------------------
        gender_id | name | alternative_title | id_lang
        1           man    null                1
        1           uomo   null                2

    and so on, so I would probably reduce the number of db tables, but I'm not sure about performance. I'm not much of a DB designer, so please let me know.
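
    To make the single "general translation table" option concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The table name translation, the column name `name` (standing in for string_name) and the helper translated_name are assumptions for illustration only. The composite primary key is one way to keep lookups to a single index seek.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE translation (
    table_name        TEXT    NOT NULL,  -- which logical table the row belongs to
    record_id         INTEGER NOT NULL,  -- id of the row in that table
    id_language       INTEGER NOT NULL,  -- 1 = english, 2 = italian (as in the post)
    name              TEXT    NOT NULL,
    alternative_title TEXT,
    PRIMARY KEY (table_name, record_id, id_language)
);
INSERT INTO translation VALUES
    ('membership', 1, 1, 'regular', NULL),
    ('membership', 1, 2, 'normale', NULL),
    ('gender',     1, 1, 'man',     NULL),
    ('gender',     1, 2, 'uomo',    NULL);
""")


def translated_name(table_name, record_id, id_language):
    """Return the translated name, or None if no translation exists."""
    row = conn.execute(
        "SELECT name FROM translation "
        "WHERE table_name = ? AND record_id = ? AND id_language = ?",
        (table_name, record_id, id_language),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    print(translated_name("membership", 1, 2))
```

    One trade-off of this layout is that record_id cannot be declared as a foreign key to any one parent table, so referential integrity has to be enforced in application code; the per-table translation layout keeps that check in the database.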


  • Groovy as a substitute for Java when using BigDecimal?

    - by geejay
    I have just completed an evaluation of Java, Groovy and Scala. The factors I considered were readability and precision. The factors I would like to know more about are performance and ease of integration. I needed a BigDecimal level of precision. Here are my results:

    Java

        void someOp() {
            BigDecimal del_theta_1 = toDec(6);
            BigDecimal del_theta_2 = toDec(2);
            BigDecimal del_theta_m = toDec(0);
            del_theta_m = abs(del_theta_1.subtract(del_theta_2))
                .divide(log(del_theta_1.divide(del_theta_2)));
        }

    Groovy

        void someOp() {
            def del_theta_1 = 6.0
            def del_theta_2 = 2.0
            def del_theta_m = 0.0
            del_theta_m = Math.abs(del_theta_1 - del_theta_2) / Math.log(del_theta_1 / del_theta_2);
        }

    Scala

        def other(){
            var del_theta_1 = toDec(6);
            var del_theta_2 = toDec(2);
            var del_theta_m = toDec(0);
            del_theta_m = ( abs(del_theta_1 - del_theta_2) / log(del_theta_1 / del_theta_2) )
        }

    Note that in Java and Scala I used static imports.

    Java:
      Pros: it is Java
      Cons: no operator overloading (lots of methods), barely readable/codeable

    Groovy:
      Pros: default BigDecimal means no visible typing, least surprising BigDecimal support for all operations (division included)
      Cons: another language to learn

    Scala:
      Pros: has operator overloading for BigDecimal
      Cons: some surprising behaviour with division (fixed with Decimal128), another language to learn
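
    For comparison only, the same expression, |d1 - d2| / ln(d1 / d2), can be written with Python's standard decimal module, which also offers operator overloading on an arbitrary-precision decimal type. This is a sketch, not part of the original evaluation; the function name log_mean_difference and the precision of 34 digits (roughly Decimal128) are assumptions.

```python
from decimal import Decimal, getcontext

# Assumption: 34 significant digits, roughly matching Decimal128.
getcontext().prec = 34


def log_mean_difference(d1, d2):
    """Compute |d1 - d2| / ln(d1 / d2) entirely in decimal arithmetic.

    Pass ints or strings; floats would carry their binary rounding error in.
    """
    d1, d2 = Decimal(d1), Decimal(d2)
    return abs(d1 - d2) / (d1 / d2).ln()


if __name__ == "__main__":
    print(log_mean_difference("6", "2"))
```

    Equal inputs would need separate handling, because ln(1) is 0 and the division is then undefined.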


  • How to insert zeros between bits in a bitmap?

    - by anatolyg
    I have some performance-heavy code that performs bit manipulations. It can be reduced to the following well-defined problem: given a 13-bit bitmap, construct a 26-bit bitmap that contains the original bits spaced at even positions. To illustrate:

        0000000000000000000abcdefghijklm (input, 32 bits)
        0000000a0b0c0d0e0f0g0h0i0j0k0l0m (output, 32 bits)

    I currently have it implemented in the following way in C:

        if (input & (1 << 12)) output |= 1 << 24;
        if (input & (1 << 11)) output |= 1 << 22;
        if (input & (1 << 10)) output |= 1 << 20;
        ...

    My compiler (MS Visual Studio) turned this into the following:

        test eax,1000h
        jne 0064F5EC
        or edx,1000000h
        ... (repeated 13 times with minor differences in constants)

    I wonder whether I can make it any faster. I would like to have my code written in C, but switching to assembly language is possible.

      - Can I use some MMX/SSE instructions to process all bits at once?
      - Maybe I can use multiplication? (multiply by 0x11111111 or some other magical constant)
      - Would it be better to use a condition-set instruction (SETcc) instead of a conditional-jump instruction? If yes, how can I make the compiler produce such code for me?
      - Any other idea how to make it faster?
      - Any idea how to do the inverse bitmap transformation (I have to implement it too, but it's less critical)?
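
    A well-known branch-free way to space bits out like this is the shift-and-mask sequence used for Morton (Z-order) codes. The sketch below shows it in Python for clarity; the same shifts and masks carry over directly to C on a 32-bit unsigned integer. The function names spread_bits and compact_bits are hypothetical, and whether this beats the 13 branches on a given compiler and CPU would need measuring.

```python
def spread_bits(x):
    """Move bit i of a value (up to 16 bits) to bit 2*i, with zeros in between."""
    x &= 0xFFFF
    x = (x | (x << 8)) & 0x00FF00FF  # split into two 8-bit groups
    x = (x | (x << 4)) & 0x0F0F0F0F  # then 4-bit groups
    x = (x | (x << 2)) & 0x33333333  # then 2-bit groups
    x = (x | (x << 1)) & 0x55555555  # finally single bits on even positions
    return x


def compact_bits(x):
    """Inverse of spread_bits: gather the even-position bits back together."""
    x &= 0x55555555
    x = (x | (x >> 1)) & 0x33333333
    x = (x | (x >> 2)) & 0x0F0F0F0F
    x = (x | (x >> 4)) & 0x00FF00FF
    x = (x | (x >> 8)) & 0x0000FFFF
    return x


def spread_bits_naive(x):
    """Reference version mirroring the one-test-per-bit C code in the question."""
    out = 0
    for i in range(13):
        if x & (1 << i):
            out |= 1 << (2 * i)
    return out


if __name__ == "__main__":
    # Exhaustively compare against the naive version for all 13-bit inputs.
    assert all(spread_bits(v) == spread_bits_naive(v) for v in range(1 << 13))
    print(hex(spread_bits(0x1000)))
```

    With only 8192 possible inputs, a precomputed lookup table indexed by the input is another option worth benchmarking.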


  • Running a Model::find in for loop in cakephp v1.3

    - by Gaurav Sharma
    Hi all, how can I achieve the following result in CakePHP? In my application a Topic is related to a category, a category is related to a city, and a city is finally related to a state. In other words: topic belongs to category, category belongs to city, city belongs to state. Now, in the Topic controller's index action I want to find all the topics along with their city and state. How can I do this? I can easily do this using a custom query (the $this->Model->query() function), but then I will be facing pagination difficulties. I tried doing it like this:

        function index() {
            $this->Topic->recursive = 0;
            $topics = $this->paginate();
            for($i=0; $i<count($topics);$i++) {
                $topics[$i]['City'] = $this->Topic->Category->City->find('all', array('conditions' => array('City.id' => $topics[$i]['Category']['city_id'])));
            }
            $this->set(compact('messages'));
        }

    The method that I have adopted is not a good one (running a query in a loop). Using the recursive property and setting it to the highest value (2) will degrade performance and is not going to yield the state information. How shall I solve this? Please help. Thanks


  • Non standard interaction among two tables to avoid very large merge

    - by riko
    Suppose I have two tables A and B. Table A has a multi-level index (a, b) and one column (ts). b uniquely determines ts.

        A = pd.DataFrame(
            [('a', 'x', 4), ('a', 'y', 6), ('a', 'z', 5),
             ('b', 'x', 4), ('b', 'z', 5), ('c', 'y', 6)],
            columns=['a', 'b', 'ts']).set_index(['a', 'b'])
        AA = A.reset_index()

    Table B is another one-column (ts) table with a non-unique index (a). The ts values are sorted "inside" each group, i.e., B.ix[x] is sorted for each x. Moreover, there is always a value in B.ix[x] that is greater than or equal to the values in A.

        B = pd.DataFrame(
            dict(a=list('aaaaabbcccccc'),
                 ts=[1, 2, 4, 5, 7, 7, 8, 1, 2, 4, 5, 8, 9])).set_index('a')

    The semantics here are that B contains observations of occurrences of an event whose type is indicated by the index. I would like to find from B the timestamp of the first occurrence of each event type after the timestamp specified in A for each value of b. In other words, I would like to get a table with the same shape as A that, instead of ts, contains the "minimum value occurring after ts" as specified by table B. So, my goal would be:

        C:
        ('a', 'x') 4
        ('a', 'y') 7
        ('a', 'z') 5
        ('b', 'x') 7
        ('b', 'z') 7
        ('c', 'y') 8

    I have some working code, but it is terribly slow.

        C = AA.apply(lambda row: (
            row[0],
            row[1],
            B.ix[row[0]].irow(np.searchsorted(B.ts[row[0]], row[2]))), axis=1).set_index(['a', 'b'])

    Profiling shows the culprit is obviously B.ix[row[0]].irow(np.searchsorted(B.ts[row[0]], row[2])). However, standard solutions using merge/join would take too much RAM in the long run. Consider that right now I have 1000 a's, assume the average number of b's per a is constant (probably 100-200), and consider that the number of observations per a is probably on the order of 300. In production I will have 1000 times more a's. 1,000,000 x 200 x 300 = 60,000,000,000 rows may be a bit too much to keep in RAM, especially considering that the data I need is perfectly described by a C like the one I discussed above. How would I improve the performance?
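
    The lookup described above amounts to one binary search per row of A inside the matching group of B, which never needs the merged cross product. A minimal sketch with only the standard library (plain tuples instead of DataFrames; the function name first_at_or_after is hypothetical) could look like this. It relies on the two properties stated in the question: each group of B is sorted, and a value greater than or equal to the queried ts always exists.

```python
from bisect import bisect_left
from collections import defaultdict


def first_at_or_after(a_rows, b_rows):
    """For each (a, b, ts) in a_rows, find the first ts' >= ts among B's rows for a.

    a_rows: iterable of (a, b, ts)
    b_rows: iterable of (a, ts), sorted by ts within each a
    """
    by_key = defaultdict(list)
    for key, ts in b_rows:
        by_key[key].append(ts)          # one sorted list per event type

    result = {}
    for a, b, ts in a_rows:
        series = by_key[a]
        # bisect_left mirrors np.searchsorted's default side='left'
        result[(a, b)] = series[bisect_left(series, ts)]
    return result


A_ROWS = [('a', 'x', 4), ('a', 'y', 6), ('a', 'z', 5),
          ('b', 'x', 4), ('b', 'z', 5), ('c', 'y', 6)]
B_ROWS = list(zip('aaaaabbcccccc', [1, 2, 4, 5, 7, 7, 8, 1, 2, 4, 5, 8, 9]))

if __name__ == "__main__":
    for key, value in first_at_or_after(A_ROWS, B_ROWS).items():
        print(key, value)
```

    Memory stays proportional to the sizes of A and B. In pandas/NumPy terms the same idea would be to group once and call np.searchsorted a single time per group with all of that group's ts values, since it accepts an array of query values.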


  • Common "truisms" needing correction the most

    - by Charles Bretana
    In addition to "I never met a man I didn't like", Will Rogers had another great little ditty I've always remembered. It went: "It's not what you don't know that'll hurt you, it's what you do know that ain't so." We all know or subscribe to many IT "truisms" that mostly have a strong basis in fact, in something in our professional careers, something we learned from others, lessons learned the hard way by ourselves, or by others who came before us. Unfortunately, as these truisms spread throughout the community, the details (why they came about and the caveats that affect when they apply) tend not to spread along with them. We all have a tendency to look for, and latch on to, small "rules" or principles that we can use to avoid doing a complete exhaustive analysis for every decision. But even though they are correct much of the time, when we sometimes misapply them, we pay a penalty that could be avoided by understanding the details behind them. For example, when user-defined functions were first introduced in SQL Server it became "common knowledge" within a year or so that they had extremely bad performance (because they required a re-compilation for each use) and should be avoided. This "truism" still increases many database developers' aversion to using UDFs, even though Microsoft's introduction of inline UDFs, which do not suffer from this issue at all, mitigates it substantially. In recent years I have run into numerous DBAs who still believe you should "never" use UDFs because of this. What other common not-so-"truisms" do you know, which many developers believe, that are not quite as universally true as is commonly understood, and which the developer community would benefit from being better educated about? Please include why it was "true" to start off with, and under what circumstances it's not true.
    Limit responses to issues that are technical, where the "common" application of a "rule or principle" is in fact correct most of the time, or was correct back when it was first elucidated, but can easily backfire or cause the opposite of the intended effect: in the edge cases, because the principle is not thoroughly understood, because technology has changed since it first spread, or because the rule is applied today without understanding the details behind it.

    Read the article

  • I'm having an issue to use GLshort for representing Vertex, and Normal.

    - by Xylopia
    As my project gets close to the optimization stage, I notice that reducing vertex metadata could vastly improve the performance of 3D rendering. I searched around and found the following advice on Stack Overflow: "Using GL_SHORT instead of GL_FLOAT in an OpenGL ES vertex array"; "How do you represent a normal or texture coordinate using GLshorts?"; "Advice on speeding up OpenGL ES 1.1 on the iPhone". Simple experiments show that switching from "FLOAT" to "SHORT" for vertices and normals isn't tough, but what troubles me is that when you scale vertices back to their original size (with glScalef), normals are multiplied by the reciprocal of the scale. The natural remedy for this is to multiply the normals by the scale before you submit them to the GPU. Then my short normals become almost 0, because the scale factor is usually smaller than 1. Duh! How do you use "short" for both vertices and normals at the same time? I've been trying this and that for about a full day, but so far I could only get "float vertex w/ byte normal" or "short vertex w/ float normal" to work. Your help would be truly appreciated.
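
    The arithmetic behind the problem can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions: the helper names quantize_positions and quantize_normals are hypothetical, and whether the GL implementation maps short normals back to the [-1, 1] range should be checked against the OpenGL ES specification. It shows why pre-multiplying a unit normal by a small position scale rounds to zero, while quantizing normals with their own fixed factor keeps their direction.

```python
import numpy as np

SHORT_MAX = 32767  # largest value representable in a signed 16-bit short


def quantize_positions(positions):
    """Hypothetical helper: map float positions onto int16 and return the
    scale factor needed to restore the original size later."""
    max_abs = float(np.abs(positions).max())
    scale = max_abs / SHORT_MAX if max_abs > 0 else 1.0
    return np.round(positions / scale).astype(np.int16), scale


def quantize_normals(normals):
    """Hypothetical helper: normalize, then use the full short range,
    independently of the position scale."""
    unit = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    return np.round(unit * SHORT_MAX).astype(np.int16)


positions = np.array([[0.5, -2.0, 1.0]])
normals = np.array([[0.0, 0.0, 2.0]])

q_pos, scale = quantize_positions(positions)
q_nrm = quantize_normals(normals)

# Pre-multiplying the unit normal by the (tiny) position scale loses it entirely.
unit = normals / np.linalg.norm(normals, axis=1, keepdims=True)
naive = np.round(unit * scale).astype(np.int16)
```

    Here naive ends up as all zeros, which matches the symptom described above, whereas q_nrm still points along the z axis.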

    Read the article

  • Architecture for database analytics

    - by David Cournapeau
    Hi, We have an architecture where we provide each customer with Business Intelligence-like services for their website (internet merchant). Now, I need to analyze that data internally (for algorithmic improvement, performance tracking, etc.) and the queries are potentially quite heavy: we have up to millions of rows / customer / day, and I may want to know how many queries we had in the last month, compared week by week, etc. That is on the order of billions of entries, if not more. The way it is currently done is quite standard: daily scripts which scan the databases and generate big CSV files. I don't like this solution for several reasons: (1) as is typical with those kinds of scripts, they fall into the write-once and never-touched-again category; (2) tracking things in "real-time" is necessary (we have a separate toolset to query the last few hours ATM); (3) this is slow and non-"agile". Although I have some experience in dealing with huge datasets for scientific usage, I am a complete beginner as far as traditional RDBMSs go. It seems that using a column-oriented database for analytics could be a solution (the analytics don't need most of the data we have in the app database), but I would like to know what other options are available for this kind of issue.

    Read the article

  • Why doesn't java.lang.Number implement Comparable?

    - by Julien Chastang
    Does anyone know why java.lang.Number does not implement Comparable? This means that you cannot sort Numbers with Collections.sort, which seems a little strange to me. Post-discussion update: Thanks for all the helpful responses. I ended up doing some more research on this topic. The simplest explanation for why java.lang.Number does not implement Comparable is rooted in mutability concerns. For a bit of review, java.lang.Number is the abstract super-type of AtomicInteger, AtomicLong, BigDecimal, BigInteger, Byte, Double, Float, Integer, Long and Short. On that list, AtomicInteger and AtomicLong do not implement Comparable. Digging around, I discovered that it is not good practice to implement Comparable on mutable types, because the objects can change during or after comparison, rendering the result of the comparison useless. Both AtomicLong and AtomicInteger are mutable. The API designers had the forethought not to have Number implement Comparable, because it would have constrained the implementation of future subtypes. Indeed, AtomicLong and AtomicInteger were added in Java 1.5, long after java.lang.Number was initially implemented. Apart from mutability, there are probably other considerations here too. A compareTo implementation in Number would have to promote all numeric values to BigDecimal, because it is capable of accommodating all the Number sub-types. The implication of that promotion in terms of mathematics and performance is a bit unclear to me, but my intuition finds that solution kludgy.

    Read the article

  • What database strategy to choose for a large web application

    - by Snoopy
    I have to rewrite a large database application running on 32 servers. The hardware is up to date: each machine has two quad-core Xeons and 32 GByte RAM. The database is multi-tenant; each customer has his own file, around 5 to 10 GByte each. I run around 50 databases on this hardware. The app is open to the web, so I have no control over the load. There are no really complex queries, so SQL is not required if there is a better solution. The databases get updated via FTP every day at midnight. Otherwise the database is read-only. C# is my favourite language and I want to use ASP.NET MVC. I thought about the following options: (1) Use two big SQL servers running SQL Server 2012 to serve the 32 servers with data, with the 32 servers running IIS and providing REST services. (2) Denormalize the database and use Redis on each web server, with BookSleeve as the Redis client. (3) Use a combination of SQL Server and Redis. (4) Use SQL Server 2012 together with Hadoop. (5) Use Hadoop without SQL Server. What is the best way to get the best performance from a read-only database without losing maintainability? Does Map-Reduce make sense at all in such a scenario? The reason for the rewrite is that the old app, written in C++ with ISAM technology, is too slow, and its interfaces are old-fashioned and not nice to use from a website, especially when using Ajax. The app uses a relational data model with many tables, but it is possible to write one accelerator table on which all queries can be performed, with all other information from the other tables available through a simple key lookup.

    Read the article

  • Google Adwords API response parse

    - by Yun Ling
    I am trying to figure out how to parse the AdWords API query response without exceptions, and one issue that I came across is that sometimes the data itself contains commas in addition to the commas between columns. Say I run a query on ad group, campaign, and impressions using <reportDefinition xmlns="https://adwords.google.com/api/adwords/cm/v201209"> <selector> <fields>CampaignName</fields> <fields>AdgroupName</fields> <fields>Impressions</fields> <predicates> <field>Status</field> <operator>IN</operator> <values>ENABLED</values> <values>PAUSED</values> </predicates> </selector> <reportName>Custom Adgroup Performance Report</reportName> <reportType>ADGROUP_PERFORMANCE_REPORT</reportType> <dateRangeType>LAST_7_DAYS</dateRangeType> <downloadFormat>CSV</downloadFormat> </reportDefinition> My campaign name contains a comma, so the response looks like this: "Adgroup,Campaign,Impressions, Premium Beer, Beer, Chicago, 1000", where the ad group is "Premium Beer" and the campaign is "Beer, Chicago". That causes a problem if we parse this information by splitting on commas. Does anyone know how to solve this problem?
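
    A minimal sketch of one way to handle this, assuming the downloaded CSV wraps any field containing a comma in double quotes (an assumption about the report output, not something shown in the question): a real CSV parser, rather than a plain split on commas, keeps such fields intact. The function name parse_report and the sample text are made up for illustration.

```python
import csv
import io


def parse_report(text):
    """Hypothetical helper: parse CSV report text into a list of dicts,
    honouring double-quoted fields that contain commas."""
    reader = csv.DictReader(io.StringIO(text))
    return list(reader)


# Illustrative sample only; the campaign name is quoted because it has a comma.
sample = 'Ad group,Campaign,Impressions\nPremium Beer,"Beer, Chicago",1000\n'
rows = parse_report(sample)
print(rows[0]['Campaign'])
```

    If the response does not quote such fields at all, no parser can recover the column boundaries reliably, and a different download format (for example a tab-separated or XML one, if the API offers it) would be the safer route.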

    Read the article
