Search Results

Search found 80052 results on 3203 pages for 'data load performance'.

Page 21/3203 | < Previous Page | 17 18 19 20 21 22 23 24 25 26 27 28  | Next Page >

  • Best data-structure to use for two ended sorted list

    - by fmark
    I need a collection data-structure that can do the following: Be sorted Allow me to quickly pop values off the front and back of the list Remain sorted after I insert a new value Allow a user-specified comparison function, as I will be storing tuples and want to sort on a particular value Thread-safety is not required Optionally allow efficient haskey() lookups (I'm happy to maintain a separate hash-table for this though) My thoughts at this stage are that I need a priority queue and a hash table, although I don't know if I can quickly pop values off both ends of a priority queue. I'm interested in performance for a moderate number of items (I would estimate less than 200,000). Another possibility is simply maintaining an OrderedDictionary and doing an insertion sort it every-time I add more data to it. Furthermore, are there any particular implementations in Python. I would really like to avoid writing this code myself.

    Read the article

  • Queue-like data structure with fast search and insertion

    - by Max
    I need a datastructure with the following properties: It contains integer numbers, no duplicates. After it reaches the maximal size the first element is removed. So if the capacity is 3, then this is how it would look when putting in it sequential numbers: {}, {1}, {1, 2}, {1, 2, 3}, {2, 3, 4}, {3, 4, 5} etc. Only two operations are needed: inserting a number into this container (INSERT) and checking if the number is already in the container (EXISTS). The number of EXISTS operations is expected to be approximately 2 * number of INSERT operations. I need these operations to be as fast as possible. What would be the fastest data structure or combination of data structures for this scenario?

    Read the article

  • Managing Data Dependecies of Java Classes that Load Data from the Classpath at Runtime

    - by Martin Potthast
    What is the simplest way to manage dependencies of Java classes to data files present in the classpath? More specifically: How should data dependencies be annotated? Perhaps using Java annotations (e.g., @Data)? Or rather some build entries in a build script or a properties file? Is there build tool that integrates and evaluates such information (Ant, Scons, ...)? Do you have examples? Consider the following scenario: A few lines of Ant create a Jar from my sources that includes everything found on the classpath. Then jarjar is used to remove all .class files that are not necessary to execute, say, class Foo. The problem is that all the data files that class Bar depends upon are still there in the Jar. The ideal deployment script, however, would recognize that the data files on which only class Bar depends can be removed while data files on which class Foo depends must be retained. Any hints?

    Read the article

  • Best way to implement user-powered data validation

    - by vegetables
    I run a product recommendation engine and I'm hitting a few snags. I'm looking to see if anyone has any recommendations on what I should do to minimize these issues. Here's how the site works: Users come to the site and are presented with product recommendations based on some criteria. If a user knows of a product that is not in our system, they can add it by providing the product name and manufacturer. We take that information, and: Hit one API to gather all the product meta-data (and to validate the product spelling, etc). If the product is not in this first API, we do not allow it in our system. Use the information from step 1 to hit another API for pricing information (gathered from many places online). For the sake of discussion, assume that I am searching both APIs in the most efficient/successful manner possible. For the most part, this works very well. I'd say ~80% of our data is perfectly accurate, but there are a few issues: Sometimes the pricing API (Step 2) doesn't have any information for the product. The way the pricing API is built, it will always return something (theoretically, the closest possible match), and there's no guarantee that the product name is spelled exactly the same way in both APIs, so there's no automated way of knowing if it's the right product. When the pricing API finds the right product, occasionally it has outdated, or even invalid pricing data (e.g. if it screen-scraped the wrong price from a website). Since the site was fairly small at first, I was able to manually verify every product that was added to the website. However, the site has grown to the point where this is taking several hours per day, and is just not efficient use of my time. So, my question is: Aside from hiring someone (or getting an intern) to validate all the data manually, what would be the best system of letting my userbase self-manage the data. Specifically, how can I allow users to edit the data while minimizing the risk of someone ambushing my website, or accidentally setting the data incorrectly.

    Read the article

  • SQL SERVER – Plan Cache and Data Cache in Memory

    - by pinaldave
    I get following question almost all the time when I go for consultations or training. I often end up providing the scripts to my clients and attendees. Instead of writing new blog post, today in this single blog post, I am going to cover both the script and going to link to original blog posts where I have mentioned about this blog post. Plan Cache in Memory USE AdventureWorks GO SELECT [text], cp.size_in_bytes, plan_handle FROM sys.dm_exec_cached_plans AS cp CROSS APPLY sys.dm_exec_sql_text(plan_handle) WHERE cp.cacheobjtype = N'Compiled Plan' ORDER BY cp.size_in_bytes DESC GO Further explanation of this script is over here: SQL SERVER – Plan Cache – Retrieve and Remove – A Simple Script Data Cache in Memory USE AdventureWorks GO SELECT COUNT(*) AS cached_pages_count, name AS BaseTableName, IndexName, IndexTypeDesc FROM sys.dm_os_buffer_descriptors AS bd INNER JOIN ( SELECT s_obj.name, s_obj.index_id, s_obj.allocation_unit_id, s_obj.OBJECT_ID, i.name IndexName, i.type_desc IndexTypeDesc FROM ( SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id ,allocation_unit_id, OBJECT_ID FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.hobt_id AND (au.TYPE = 1 OR au.TYPE = 3) UNION ALL SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id, allocation_unit_id, OBJECT_ID FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.partition_id AND au.TYPE = 2 ) AS s_obj LEFT JOIN sys.indexes i ON i.index_id = s_obj.index_id AND i.OBJECT_ID = s_obj.OBJECT_ID ) AS obj ON bd.allocation_unit_id = obj.allocation_unit_id WHERE database_id = DB_ID() GROUP BY name, index_id, IndexName, IndexTypeDesc ORDER BY cached_pages_count DESC; GO Further explanation of this script is over here: SQL SERVER – Get Query Plan Along with Query Text and Execution Count Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL Tagged: SQL Memory

    Read the article

  • data handling with javascript

    - by Vincent Warmerdam
    Python has a very neat package called pandas which allows for quick data transformation; tables, aggregation, that sort of thing. A lot of these types of functionality can also be found in the python itertools module. The plyR package in R is also very similar. Usually one woud use this functionality to produce a table which is later visualized with a plot. I am personally very fond of d3, and I would like to allow the user to first indicate what type of data aggregation he wants on the dataset before it is visualized. The visualisation in question involves making a heatmap where the user gets to select the size of the bins of the heatmap beforehand (I want d3 to project this through leaflet). I want to visually select the ideal size of the bins for the heatmap. The way I work now is that I take the dataset, aggregate it with python and then manually load it in d3. This is a process that takes a lot of human effort and I was wondering if the data aggregation can be done through the javascript of the browser. I couldn't find a package for javascript specifically built for data, suggesting (to me) that this is a bad idea and that one should not use javascript for the data handling. Is there a good module/package for javascript to handle data aggregation? Is it a good/bad idea to do the data aggregation in javascript (performance wise)?

    Read the article

  • Customizing the NUnit GUI for data-driven testing

    - by rwong
    My test project consists of a set of input data files which is fed into a piece of legacy third-party software. Since the input data files for this software are difficult to construct (not something that can be done intentionally), I am not going to add new input data files. Each input data file will be subject to a set of "test functions". Some of the test functions can be invoked independently. Other test functions represent the stages of a sequential operation - if an earlier stage fails, the subsequent stages do not need to be executed. I have experimented with the NUnit parametrized test case (TestCaseAttribute and TestCaseSourceAttribute), passing in the list of data files as test cases. I am generally satisfied with the the ability to select the input data for testing. However, I would like to see if it is possible to customize its GUI's tree structure, so that the "test functions" become the children of the "input data". For example: File #1 CheckFileTypeTest GetFileTopLevelStructureTest CompleteProcessTest StageOneTest StageTwoTest StageThreeTest File #2 CheckFileTypeTest GetFileTopLevelStructureTest CompleteProcessTest StageOneTest StageTwoTest StageThreeTest This will be useful for identifying the stage that failed during the processing of a particular input file. Is there any tips and tricks that will enable the new tree layout? Do I need to customize NUnit to get this layout?

    Read the article

  • Test Data in a Distributed System

    - by Davin Tryon
    A question that has been vexing me lately has been about how to effectively test (end-to-end) features in a distributed system. Particuarly, how to effectively manage (through time) test data for feature testing. The system in question is a typical SOA setup. The composition is done in JavaScript when call to several REST APIs. Each service is built as an independent block. Each service has some kind of persistent storage (SQL Server in most cases). The main issue at the moment is how to approach test data when testing end-to-end features. Functional end-to-end testing occurs through the UI, and it is therefore necessary for test data to be set up before the test run (this could be manual or automated testing). As is typical in a distributed system, identifiers from one service are used as a link in another service. So, some level of synchronization needs to be present in the data to effectively test. What is the best way to manage and set up this data after a successful deployment to a test environment? For example, is it better to manage this test data inside each service? Or package it together with the testing suite? Does that testing suite exist as a separate project? I'm interested in design guidance about how to store and manage this test data as the application features evolve.

    Read the article

  • Data Management Business Continuity Planning

    Business Continuity Governance In order to ensure data continuity for an organization, they need to ensure they know how to handle a data or network emergency because all systems have the potential to fail. Data Continuity Checklist: Disaster Recovery Plan/Policy Backups Redundancy Trained Staff Business Continuity Policies In order to protect data in case of any emergency a company needs to put in place a Disaster recovery plan and policies that can be executed by IT staff to ensure the continuity of the existing data and/or limit the amount of data that is not contiguous.  A disaster recovery plan is a comprehensive statement of consistent actions to be taken before, during and after a disaster, according to Geoffrey H. Wold. He also states that the primary objective of disaster recovery planning is to protect the organization in the event that all or parts of its operations and/or computer services are rendered unusable. Furthermore, companies can mandate through policies that IT must maintain redundant hardware in case of any hardware failures and redundant network connectivity incase the primary internet service provider goes down.  Additionally, they can require that all staff be trained in regards to the Disaster recovery policy to ensure that all parties evolved are knowledgeable to execute the recovery plan. Business Continuity Procedures Business continuity procedure vary from organization to origination, however there are standard procedures that most originations should follow. Standard Business Continuity Procedures Backup and Test Backups to ensure that they work Hire knowledgeable and trainable staff  Offer training on new and existing systems Regularly monitor, test, maintain, and upgrade existing system hardware and applications Maintain redundancy regarding all data, and critical business functionality

    Read the article

  • Perfomance of 8 bit operations on 64 bit architechture

    - by wobbily_col
    I am usually a Python / Database programmer, and I am considering using C for a problem. I have a set of sequences, 8 characters long with 4 possible characters. My problem involves combining sets of these sequences and filtering which sets match a criteria. The combinations of 5 run into billions of rows and takes around an hour to run. So I can represent each sequence as 2 bytes. If I am working on a 64 bit architechture will I gain any advantage by keeping these data structures as 2 bytes when I generate the combinations, or will I be as well storing them as 8 bytes / double ? (64 bit = 8 x 8) If I am on a 64 bit architecture, all registers will be 64 bit, so in terms of operations that shouldn´t be any faster (please correct me if I am wrong). Will I gain anything from the smaller storage requirements - can I fit more combinations in memory, or will they all take up 64 bits anyway? And finally, am I likley to gain anything coding in C. I have a first version, which stores the sequence as a small int in a MySQL database. It then self joins the tabe to itself a number of times in order to generate all the possible combinations. The performance is acceptable, depending on how many combinations are generated. I assume the database must involve some overhead.

    Read the article

  • How to choose how to store data?

    - by Eldros
    Give a man a fish and you feed him for a day. Teach a man to fish and you feed him for a lifetime. - Chinese Proverb I could ask what kind of data storage I should use for my actual project, but I want to learn to fish, so I don't need to ask for a fish each time I begin a new project. So, until I used two methods to store data on my non-game project: XML files, and relational databases. I know that there is also other kind of database, of the NoSQL kind. However I wouldn't know if there is more choice available to me, or how to choose in the first place, aside arbitrary picking one. So the question is the following: How should I choose the kind of data storage for a game project? And I would be interested on the following criterion when choosing: The size of the project. The platform targeted by the game. The complexity of the data structure. Added Portability of data amongst many project. Added How often should the data be accessed Added Multiple type of data for a same application Any other point you think is of interest when deciding what to use. EDIT I know about Would it be better to use XML/JSON/Text or a database to store game content?, but thought it didn't address exactly my point. Now if I am wrong, I would gladely be shown the error in my ways.

    Read the article

  • Ubuntu Server 12.04 CPU Load

    - by zertux
    I have a Server (2x Hexa-Core Xeon E5649 2.53GHz w/HT with 32GB RAM and 20000 GB Bandwidth) running Ubuntu Server 12.04 LTS. The server runs LAMP and serves one website only, the estimated number of users is to be ~ 15,000 at the same time. At the moment i have around 2000 users online each of them runs 50 MySQL queries (small values mostly select and insert) from the beginning until the end of the session. Server CPU Load is high at this number of connections while the RAM usage is almost 1GB out of 32GB its worth mentioning that the server was running very fast with no problems at all but am concerned about the load average. http://s12.postimage.org/z7hi6mz3h/photo.png top - 03:02:43 up 9 min, 2 users, load average: 50.83, 30.14, 12.83 Tasks: 432 total, 1 running, 430 sleeping, 0 stopped, 1 zombie Cpu(s): 0.1%us, 0.2%sy, 0.0%ni, 66.5%id, 33.1%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 32939992k total, 3111604k used, 29828388k free, 84108k buffers Swap: 2048280k total, 0k used, 2048280k free, 1621640k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2860 root 20 0 25820 2288 1420 S 3 0.0 0:11.18 htop 1182 root 20 0 0 0 0 D 2 0.0 0:01.46 kjournald 1935 mysql 20 0 12.3g 161m 7924 S 1 0.5 102:31.45 mysqld 11 root 20 0 0 0 0 S 0 0.0 0:00.38 kworker/0:1 1822 www-data 20 0 247m 25m 4188 D 0 0.1 0:01.81 apache2 2920 www-data 20 0 0 0 0 Z 0 0.0 0:01.20 apache2 <defunct> 2942 www-data 20 0 247m 23m 3056 D 0 0.1 0:00.20 apache2 3516 www-data 20 0 247m 23m 3028 D 0 0.1 0:00.06 apache2 3521 www-data 20 0 247m 23m 3020 D 0 0.1 0:00.09 apache2 3664 www-data 20 0 247m 23m 3132 D 0 0.1 0:00.09 apache2 3674 www-data 20 0 247m 23m 3252 D 0 0.1 0:00.06 apache2 3713 www-data 20 0 247m 23m 3040 D 0 0.1 0:00.09 apache2 1 root 20 0 24328 2284 1344 S 0 0.0 0:03.09 init 2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd 3 root 20 0 0 0 0 S 0 0.0 0:00.01 ksoftirqd/0 6 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/0 7 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/0 8 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/1 9 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/1:0 root@server:~/codes# vmstat 1 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 19 0 0 29684012 86112 1689844 0 0 19 590 254 231 48 0 47 5 23 0 0 29704812 86128 1697672 0 0 4 320 11100 8121 77 1 22 0 33 0 0 29671044 86156 1705308 0 0 0 5440 13190 9140 95 1 4 0 33 3 0 29670088 86160 1706288 0 0 0 32932 12275 7297 99 0 1 0 35 0 0 29693456 86188 1710724 0 0 4 676 12701 7867 98 1 1 0 ^C I have not changed any of the default configurations that comes with Ubuntu. Is this load normal for such powerful server ? is there any optimization i can make to Apache/MySQL to minimize the load ? What do you recommend ?

    Read the article

  • Plesk Postfix Mail Server 9.5.4 very heavy load, 1000s of processes

    - by Eugene van der Merwe
    Our Plesk Linux Ubuntu 64-bit mail server has extremely high load and we don't know how to isolate it. The load was okay will two weeks ago but in the last two weeks it's seriously deteriorated. The mail server has been running for years and we have had sporadic performance issues. Normally we reduce the load by turning off all SPAM checks until the problem is sorted (which sometimes resolves itself). Currently we have turned of real time block lists, SPF checking and we have attempted to turn off SpamAssassin. No matter what we do the SpamAssassin check box stays ticked in the GUI. Out of desperation we have done /etc/init.d/psa-spamassassin stop. For years we haven't been able to do SpamAssassin because it kills the server. We would like to use it but performance is more important for now. We cannot turn off Greylisting. The moment we turn off Greylisting our help desk is inandated with calls. Out of desperation we investigated truncating the Greylisting database which is now 2.5 GB big but we abandoned this after noticing turning of Greylisting doesn't improve the performance at all. We have no anti-virus. It's just more load and Dr. Web never really worked that well for us. But we'll try that if it will make a difference. We have implemented Postfix Anvil. This seems to have made the situation worse so we disabled it. We’re not sure if this is the case. Our current mail server is configured to forward all SMTP to a relay server. We did so to reduce the load. This helped a lot because outgoing queues are generally empty. We are running in an Expand configuration. The mail server has about 12 000 accounts of which maybe half are active. We have read through this document: http://www.postfix.org/STRESS_README.html but there are too many settings and we don’t know which ones to choose. Please assist urgently. We need advice on how to fix this problem before all our clients abandon is. The only clue we have is that there are 100s of these processes: 30 13205 1 0 13:18 ? 00:00:00 /usr/lib/plesk-9.0/postfix-queue 127.0.0.1 10027 before-queue 30 13207 1 0 11:38 ? 00:00:00 /usr/lib/plesk-9.0/postfix-queue 127.0.0.1 10027 before-queue 30 13208 1 0 13:18 ? 00:00:00 /usr/lib/plesk-9.0/postfix-queue 127.0.0.1 10026 before-remote 30 13209 1 0 11:38 ? 00:00:00 /usr/lib/plesk-9.0/postfix-queue 127.0.0.1 10026 before-remote 30 13213 1 0 13:18 ? 00:00:00 /usr/lib/plesk-9.0/postfix-queue 127.0.0.1 10027 before-queue

    Read the article

  • Data Masking Pack 12.1.0.3 Certified with E-Business Suite 12.1.3

    - by Elke Phelps (Oracle Development)
    I'm pleased to announce the certification of the E-Business Suite 12.1.3 Data Masking Template for the Data Masking Pack with Enterprise Manager Cloud Control 12.1.0.3. You can use the Oracle Data Masking Pack with Oracle Enterprise Manager Grid Control 12c to scramble sensitive data in cloned E-Business Suite environments.     You may scramble data in E-Business Suite cloned environments with EM12.1.0.3 using the following template: E-Business Suite 12.1.3 Data Masking Template for Data Masking Pack with EM12c (Patch 18462641) What does data masking do in E-Business Suite environments? Application data masking does the following: De-identify the data:  Scramble identifiers of individuals, also known as personally identifiable information or PII.  Examples include information such as name, account, address, location, and driver's license number. Mask sensitive data:  Mask data that, if associated with personally identifiable information (PII), would cause privacy concerns.  Examples include compensation, health and employment information.   Maintain data validity:  Provide a fully functional application.  How can EBS customers use data masking? The Oracle E-Business Suite Template for Data Masking Pack can be used in situations where confidential or regulated data needs to be shared with other non-production users who need access to some of the original data, but not necessarily every table.  Examples of non-production users include internal application developers or external business partners such as offshore testing companies, suppliers or customers.  Due to data dependencies, scrambling E-Business Suite data is not a trivial task.  The data needs to be scrubbed in such a way that allows the application to continue to function. The template works with the Oracle Data Masking Pack and Oracle Enterprise Manager to obscure sensitive E-Business Suite information that is copied from production to non-production environments.  The Oracle E-Business Suite Template for Data Masking Pack is applied to a non-production environment with the Enterprise Manager Grid Control Data Masking Pack.  When applied, the Oracle E-Business Suite Template for Data Masking Pack will create an irreversibly scrambled version of your production database for development and testing. Is there a charge for this? Yes. You must purchase licenses for the Oracle Data Masking Pack to use the Oracle E-Business Suite 12.1.3 template. The Oracle E-Business Suite 12.1.3 Template for the Data Masking Pack is included with the Oracle Data Masking Pack license.  You can contact your Oracle account manager for more details about licensing. References Additional details and requirements are provided in the following My Oracle Support Note: Using Oracle E-Business Suite Release 12.1.3 Template for the Data Masking Pack with Oracle Enterprise Manager 12.1 Data Masking Tool (Note 1481916.1) Masking Sensitive Data in the Oracle Database Real Application Testing User's Guide 11g Release 2 (11.2) Related Articles Scrambling Sensitive Data in E-Business Suite E-Business Suite 12.1.3 Data Masking Certified with Enterprise Manager 12c

    Read the article

  • Revisiting ANTS Performance Profiler 7.4

    - by James Michael Hare
    Last year, I did a small review on the ANTS Performance Profiler 6.3, now that it’s a year later and a major version number higher, I thought I’d revisit the review and revise my last post. This post will take the same examples as the original post and update them to show what’s new in version 7.4 of the profiler. Background A performance profiler’s main job is to keep track of how much time is typically spent in each unit of code. This helps when we have a program that is not running at the performance we expect, and we want to know where the program is experiencing issues. There are many profilers out there of varying capabilities. Red Gate’s typically seem to be the very easy to “jump in” and get started with very little training required. So let’s dig into the Performance Profiler. I’ve constructed a very crude program with some obvious inefficiencies. It’s a simple program that generates random order numbers (or really could be any unique identifier), adds it to a list, sorts the list, then finds the max and min number in the list. Ignore the fact it’s very contrived and obviously inefficient, we just want to use it as an example to show off the tool: 1: // our test program 2: public static class Program 3: { 4: // the number of iterations to perform 5: private static int _iterations = 1000000; 6: 7: // The main method that controls it all 8: public static void Main() 9: { 10: var list = new List<string>(); 11: 12: for (int i = 0; i < _iterations; i++) 13: { 14: var x = GetNextId(); 15: 16: AddToList(list, x); 17: 18: var highLow = GetHighLow(list); 19: 20: if ((i % 1000) == 0) 21: { 22: Console.WriteLine("{0} - High: {1}, Low: {2}", i, highLow.Item1, highLow.Item2); 23: Console.Out.Flush(); 24: } 25: } 26: } 27: 28: // gets the next order id to process (random for us) 29: public static string GetNextId() 30: { 31: var random = new Random(); 32: var num = random.Next(1000000, 9999999); 33: return num.ToString(); 34: } 35: 36: // add it to our list - very inefficiently! 37: public static void AddToList(List<string> list, string item) 38: { 39: list.Add(item); 40: list.Sort(); 41: } 42: 43: // get high and low of order id range - very inefficiently! 44: public static Tuple<int,int> GetHighLow(List<string> list) 45: { 46: return Tuple.Create(list.Max(s => Convert.ToInt32(s)), list.Min(s => Convert.ToInt32(s))); 47: } 48: } So let’s run it through the profiler and see what happens! Visual Studio Integration First, let’s look at how the ANTS profilers integrate with Visual Studio’s menu system. Once you install the ANTS profilers, you will get an ANTS menu item with several options: Notice that you can either Profile Performance or Launch ANTS Performance Profiler. These sound similar but achieve two slightly different actions: Profile Performance: this immediately launches the profiler with all defaults selected to profile the active project in Visual Studio. Launch ANTS Performance Profiler: this launches the profiler much the same way as starting it from the Start Menu. The profiler will pre-populate the application and path information, but allow you to change the settings before beginning the profile run. So really, the main difference is that Profile Performance immediately begins profiling with the default selections, where Launch ANTS Performance Profiler allows you to change the defaults and attach to an already-running application. Let’s Fire it Up! So when you fire up ANTS either via Start Menu or Launch ANTS Performance Profiler menu in Visual Studio, you are presented with a very simple dialog to get you started: Notice you can choose from many different options for application type. You can profile executables, services, web applications, or just attach to a running process. In fact, in version 7.4 we see two new options added: ASP.NET Web Application (IIS Express) SharePoint web application (IIS) So this gives us an additional way to profile ASP.NET applications and the ability to profile SharePoint applications as well. You can also choose your level of detail in the Profiling Mode drop down. If you choose Line-Level and method-level timings detail, you will get a lot more detail on the method durations, but this will also slow down profiling somewhat. If you really need the profiler to be as unintrusive as possible, you can change it to Sample method-level timings. This is performing very light profiling, where basically the profiler collects timings of a method by examining the call-stack at given intervals. Which method you choose depends a lot on how much detail you need to find the issue and how sensitive your program issues are to timing. So for our example, let’s just go with the line and method timing detail. So, we check that all the options are correct (if you launch from VS2010, the executable and path are filled in already), and fire it up by clicking the [Start Profiling] button. Profiling the Application Once you start profiling the application, you will see a real-time graph of CPU usage that will indicate how much your application is using the CPU(s) on your system. During this time, you can select segments of the graph and bookmark them, giving them mnemonic names. This can be useful if you want to compare performance in one part of the run to another part of the run. Notice that once you select a block, it will give you the call tree breakdown for that selection only, and the relative performance of those calls. Once you feel you have collected enough information, you can click [Stop Profiling] to stop the application run and information collection and begin a more thorough analysis. Analyzing Method Timings So now that we’ve halted the run, we can look around the GUI and see what we can see. By default, the times are shown in terms of percentage of time of the total run of the application, though you can change it in the View menu item to milliseconds, ticks, or seconds as well. This won’t affect the percentages of methods, it only affects what units the times are shown. Notice also that the major hotspot seems to be in a method without source, ANTS Profiler will filter these out by default, but you can right-click on the line and remove the filter to see more detail. This proves especially handy when a bottleneck is due to a method in the BCL. So now that we’ve removed the filter, we see a bit more detail: In addition, ANTS Performance Profiler gives you the ability to decompile the methods without source so that you can dive even deeper, though typically this isn’t necessary for our purposes. When looking at timings, there are generally two types of timings for each method call: Time: This is the time spent ONLY in this method, not including calls this method makes to other methods. Time With Children: This is the total of time spent in both this method AND including calls this method makes to other methods. In other words, the Time tells you how much work is being done exclusively in this method, and the Time With Children tells you how much work is being done inclusively in this method and everything it calls. You can also choose to display the methods in a tree or in a grid. The tree view is the default and it shows the method calls arranged in terms of the tree representing all method calls and the parent method that called them, etc. This is useful for when you find a hot-spot method, you can see who is calling it to determine if the problem is the method itself, or if it is being called too many times. The grid method represents each method only once with its totals and is useful for quickly seeing what method is the trouble spot. In addition, you can choose to display Methods with source which are generally the methods you wrote (as opposed to native or BCL code), or Any Method which shows not only your methods, but also native calls, JIT overhead, synchronization waits, etc. So these are just two ways of viewing the same data, and you’re free to choose the organization that best suits what information you are after. Analyzing Method Source If we look at the timings above, we see that our AddToList() method (and in particular, it’s call to the List<T>.Sort() method in the BCL) is the hot-spot in this analysis. If ANTS sees a method that is consuming the most time, it will flag it as a hot-spot to help call out potential areas of concern. This doesn’t mean the other statistics aren’t meaningful, but that the hot-spot is most likely going to be your biggest bang-for-the-buck to concentrate on. So let’s select the AddToList() method, and see what it shows in the source window below: Notice the source breakout in the bottom pane when you select a method (from either tree or grid view). This shows you the timings in this method per line of code. This gives you a major indicator of where the trouble-spot in this method is. So in this case, we see that performing a Sort() on the List<T> after every Add() is killing our performance! Of course, this was a very contrived, duh moment, but you’d be surprised how many performance issues become duh moments. Note that this one line is taking up 86% of the execution time of this application! If we eliminate this bottleneck, we should see drastic improvement in the performance. So to fix this, if we still wanted to maintain the List<T> we’d have many options, including: delay Sort() until after all Add() methods, using a SortedSet, SortedList, or SortedDictionary depending on which is most appropriate, or forgoing the sorting all together and using a Dictionary. Rinse, Repeat! So let’s just change all instances of List<string> to SortedSet<string> and run this again through the profiler: Now we see the AddToList() method is no longer our hot-spot, but now the Max() and Min() calls are! This is good because we’ve eliminated one hot-spot and now we can try to correct this one as well. As before, we can then optimize this part of the code (possibly by taking advantage of the fact the list is now sorted and returning the first and last elements). We can then rinse and repeat this process until we have eliminated as many bottlenecks as possible. Calls by Web Request Another feature that was added recently is the ability to view .NET methods grouped by the HTTP requests that caused them to run. This can be helpful in determining which pages, web services, etc. are causing hot spots in your web applications. Summary If you like the other ANTS tools, you’ll like the ANTS Performance Profiler as well. It is extremely easy to use with very little product knowledge required to get up and running. There are profilers built into the higher product lines of Visual Studio, of course, which are also powerful and easy to use. But for quickly jumping in and finding hot spots rapidly, Red Gate’s Performance Profiler 7.4 is an excellent choice. Technorati Tags: Influencers,ANTS,Performance Profiler,Profiler

    Read the article

  • SSI includes not working on Debian with Apache

    - by Mike
    I'm trying to get SSI to work on Debian running Apache, however the .shtml files are not being parsed. From a PHP file with phpinfo() I can see that the following show up in the loaded modules section: mod_mime_xattr mod_mime mod_mime_magic In /etc/apache2/mods-enabled/mime.conf I have (among other things): AddType text/html .shtml AddOutputFilter INCLUDES .shtml In /etc/apache2/sites-enabled/domain.com.conf (for the virtual host in question) I have: <Directory /home/username/public_html> Options +Includes allow from all AllowOverride All </Directory> and for good measure, I added the following as well: <Directory /> Options +Includes </directory> In the user's .htaccess file, I tried adding: Options +Includes AddType text/html shtml AddHandler server-parsed shtml Nothing seems to work. How can I even debug this? Edit: Here is the output of ls /etc/apache2/mods-enabled/ in case this helps actions.conf dav_svn.load proxy_balancer.load actions.load deflate.conf proxy.conf alias.conf deflate.load proxy_connect.load alias.load dir.conf proxy_http.load auth_basic.load dir.load proxy.load auth_digest.load env.load python.load authn_file.load fcgid.conf reqtimeout.conf authz_default.load fcgid.load reqtimeout.load authz_groupfile.load mime.conf rewrite.load authz_host.load mime.load ruby.load authz_user.load mime_magic.conf setenvif.conf autoindex.conf mime_magic.load setenvif.load autoindex.load mime-xattr.load ssl.conf cgi.load negotiation.conf ssl.load dav_fs.conf negotiation.load status.conf dav_fs.load php5.conf status.load dav.load php5.load suexec.load dav_svn.conf proxy_balancer.conf

    Read the article

  • High load average, low CPU and IO (Centos 5.7)

    - by Ben
    A Drupal 7 site with CiviCRM, after running smoothly for a year on a 1&1 VPS suddenly became unresponsive. Now pages eventually load, but can take more than a minute. Looking at resource use in Virtuozzo, the load average carries a warning, and has remained above 1. While I understand this isn't particularly high, this is a change from when the site was working. Here is a typical snapshot of top: top - 03:10:32 up 3:21, 1 user, load average: 1.16, 1.22, 1.30 Tasks: 43 total, 1 running, 42 sleeping, 0 stopped, 0 zombie Cpu(s): 0.1%us, 0.1%sy, 0.1%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 2097152k total, 1015112k used, 1082040k free, 0k buffers Swap: 0k total, 0k used, 0k free, 0k cached CPU idle level never seems to go much below 70%. wa is virtually always at 0. There appears to be lots of free memory. And here is some vmstat output, again showign wa at 0, plenty of free memory, and an idle CPU: procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 2 0 0 1100872 0 0 0 0 2783 23672 0 538 1 0 99 0 0 1 0 0 1100872 0 0 0 0 0 16 0 101754 0 0 100 0 0 0 0 0 1100872 0 0 0 0 0 17 0 103133 0 0 100 0 0 0 0 0 1100872 0 0 0 0 0 1 0 102080 0 0 100 0 0 1 0 0 1100872 0 0 0 0 0 6 0 99881 0 0 100 0 0 0 0 0 1100872 0 0 0 0 0 1 0 105187 0 0 100 0 0 I've spoken to 1&1 but they don't have any ideas as to what could be causing the high load average. Instead they suggested an upgrade :) I've looked for processes that might be causing this, examined MySQL showprocesslist, and restarted the container with no result. Does anyone have more troubleshooting suggestions or insights?

    Read the article

  • Virtualbox HTTP load testing, host CPU overload issues

    - by aschuler
    I'm doing HTTP load testing benchmarks (using Apache Benchmark and Siege) on a small Java EE 1.7.0 / Tomcat 7.0.26 application running on a Debian Squeeze 6.0.4 x64 virtualized with Virtualbox 4.1.8. The computer host is Ubuntu 11.10 x64. I've modified those parameters in the Tomcat server.xml : <Connector port="8080" protocol="HTTP/1.1" connectionTimeout="200000" redirectPort="8443" acceptCount="2000" maxThreads="150" minSpareThreads="50" /> The application executed on the server takes around 300ms. This app is running well until a certain amount of concurrent connections like those one : ab -n 500 -c 150 http://xx.xx.xx.xx:8080/myapp/ ab -n 1000 -c 50 http://xx.xx.xx.xx:8080/myapp/ siege -b -c 100 -r 20 http://xx.xx.xx.xx:8080/myapp/ A lot of socket connection timed out happens and this completly overload the host processor (but the CPU load inside the VM is normal). Doing an htop on the host, i can see that the Virtualbox processus is running under 300% CPU and never come down even after the load test is finished. (I've allocated 4 processors to the VM, if I allocate only one processor, CPU load goes under 100%). Restarting Tomcat don't do anything, i'm forced to restart the whole VM. I've tryed to launch those ab/siege commands locally on the VM and everything goes well. I first thought it was related to a linux network limit as explained here: Running some benchmarks using ab, and tomcat starts to really slow down So I've modified those TCP parameters : echo 15 > /proc/sys/net/ipv4/tcp_fin_timeout echo 30 > /proc/sys/net/ipv4/tcp_keepalive_intvl echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse It seems to be better, but it continues to overload the host CPU and output socket connections time out at a certain amount of concurrent connections. I'm wondering if this is not related to how Virtualbox handles external concurrent connections.

    Read the article

  • How to measure disk performance?

    - by Jakub Šturc
    I am going to "fix" a friend's computer this weekend. By the symptoms he describes it looks like he has a disk performance problem with his 5400 rpm disk. I want to be sure that disk is the problem so I want to "scientificaly" measure the performance. Which tools do you recommend me for this job? Is there any standard set of numbers I can compare the result of measurement with?

    Read the article

  • Question about network topology and routing performance

    - by algorithms
    Hello I am currently working on a uni project about routing protocols and network performance, one of the criteria i was going to test under was to see what effect lan topology has, ie workstations arranged in mesh, star, ring etc, but i am having doubts as to whether that would have any affect on the routing performance thus would be useless to do, rather i'm thinking it would be better to test under the topology of the routers themselves, ie routers arranged in either star, mesh ring etc. I would appreciate some feedback on this as I am rather confused. Thank You

    Read the article

  • is it worth to use load balancer on web server/website

    - by user427969
    I have a website and a while ago, the web server of the company hosting my website was down for about a day. I consulted the company for a solution on how i can stop this from happening in future and they suggested to have a second machine and which will be connected to my current website/web server by a "load balancer" (at an additional huge cost!!!). The second machine will be replicate of the first one and so if i goes down, the other will always be running. ---- Explanation ----- My hosting company suggested that it will be a good idea to have a second machine running at the same time and both the machines will be connected by a load balancer which reduces the rist of a downtime. The second machine will be a mirror of the first and any changes to first must be replicated in the second. I don't mind spending money if it really saves my website from going down. I want to know is it worth having this "load balancer" for my purpose? My website is a 24/7 service. I cannot afford an outage of 24 hours/1 hour. I don't mind using this "load balancer" as far as it is really worth. I am not sure if its just a marketing trick of my hosting company or really a "best" solution Thanks for help. Regards

    Read the article

  • VMWare Server - Writing files to virtual hard drive performance

    - by Ardman
    We have just moved our infrastructure from physical servers to virtual machines. Everything is running great and we are happy with the result of the move. We have identified one problem, and that is reading/writing performance. We have an application that compiles files and writes to disk. This is considerably slower on the new virtual machines compared to the physical machines. Is there a performance bottleneck when writing to a virtual hard drive compared to a physical hard drive?

    Read the article

  • Improving performance of fuzzy string matching against a dictionary [closed]

    - by Nathan Harmston
    Hi, So I'm currently working for with using SecondString for fuzzy string matching, where I have a large dictionary to compare to (with each entry in the dictionary has an associated non-unique identifier). I am currently using a hashMap to store this dictionary. When I want to do fuzzy string matching, I first check to see if the string is in the hashMap and then I iterate through all of the other potential keys, calculating the string similarity and storing the k,v pair/s with the highest similarity. Depending on which dictionary I am using this can take a long time ( 12330 - 1800035 entries ). Is there any way to speed this up or make it faster? I am currently writing a memoization function/table as a way of speeding this up, but can anyone else think of a better way to improve the speed of this? Maybe a different structure or something else I'm missing. Many thanks in advance, Nathan

    Read the article

< Previous Page | 17 18 19 20 21 22 23 24 25 26 27 28  | Next Page >