Search Results

Search found 415 results on 17 pages for 'bottleneck'.

Page 12/17 | < Previous Page | 8 9 10 11 12 13 14 15 16 17 | Next Page >

My client's solution of a Windows SBS 2011 VM on an Ubuntu host and VirtualBox is pinning the host CPU

- by Scott Stamp

Here's my situation, I've got a client hosting two servers (one VM), with the host providing VMware Zimbra, the other Windows Small Business Server 2011. Unfortunately, the person before me had configured this setup as follows. Host: Ubuntu Desktop Edition 10.04 (I know, again, not my choice) running VMware Zimbra 8GB of RAM On-board RAID1 of two 320GB Seagate Barracuda drives for the OS Software RAID5 of four 500GB WD Caviar Black drives on MDADM for bulk storage (sorry, I don't know the model #) A relatively competent quad-core Intel Core i7 CPU from the Nehalem architecture (not suspicious of this as the bottleneck) Guest: Windows Small Business Server 2011 4GB of RAM Host-equivalent CPU allocation VDI file for OS hosted on the on-board RAID, VDI file for storage hosted on the on-board RAID For some reason when running, the VM locks up when sitting nearly idle, and the VirtualBox process reports values of 240%+ in top (how is that even possible?!). Anyone have any ideas or suggestions? I'm totally stumped on this one. Happy to provide whatever logs you'd like to take a look at. Ideally I'd drop VirtualBox and provision this with VMware Workstation, but the client has objected to the (very nominal) costs involved. If hardware needs to be purchased to help, it will be, but we're considering upgrades a last-resort at this time. Thanks in advance! *fingers crossed*

Read the article
SQL Server replication and load balance

- by Ahmed Galal

I'm running a web service that serves a mobile app on IIS 8 and SQL Server 2014, my service has a massive load and i'm trying to improve performance, most of the load is happening on SQL. i don't think i have a bottleneck, my processor and ram is up to the max and i think my code is not that bad, am already using memcached and other stuff to avoid hitting SQL too much. i know i can always upgrade the server hardware but i already have a spare server that i would like to use, so i was thinking to split the SQL load on the 2 servers. What i was thinking of is to setup replication on the other server and do some load balancing, but am not sure how to do the load balance. I know i can adjust my code to hit the other server for some queries but i was hoping to find a solution that avoid changing my code. So my question is, What are the ways of doing load balancing between 2 SQL servers ? I would appreciate suggestions or best practices or some directions. Thanks.

Read the article
Scaling databases with cheap SSD hard drives

- by Dennis Kashkin

Hey guys! I hope that many of you are working with high traffic database-driven websites, and chances are that your main scalability issues are in the database. I noticed a couple of things lately: Most large databases require a team of DBAs in order to scale. They constantly struggle with limitations of hard drives and end up with very expensive solutions (SANs or large RAIDs, frequent maintenance windows for defragging and repartitioning, etc.) The actual annual cost of maintaining such databases is in $100K-$1M range which is too steep for me :) Finally, we got several companies like Intel, Samsung, FusionIO, etc. that just started selling extremely fast yet affordable SSD hard drives based on SLC Flash technology. These drives are 100 times faster in random read/writes than the best spinning hard drives on the market (up to 50,000 random writes per second). Their seek time is pretty much zero, so the cost of random I/O is the same as sequential I/O, which is awesome for databases. These SSD drives cost around $10-$20 per gigabyte, and they are relatively small (64GB). So, there seems to be an opportunity to avoid the HUGE costs of scaling databases the traditional way by simply building a big enough RAID 5 array of SSD drives (which would cost only a few thousand dollars). Then we don't care if the database file is fragmented, and we can afford 100 times more disk writes per second without having to spread the database across 100 spindles. . Is anybody else interested in this? I've been testing a few SSD drives and can share my results. If anybody on this site has already solved their I/O bottleneck with SSDs, I would love to hear your war stories! PS. I know that there are plenty of expensive solutions out there that help with scalability, for example the time proven RAM-based SANs. I want to be clear that even $50K is too expensive for my project. I have to find a solution that costs no more than $10K and does not take much time to implement.

Read the article
Find slow network nodes between two data centers

- by 2called-chaos

I've got a problem with syncing big amount of data between two data centers. Both machines have got a gigabit connection and are not fully occupied but the fastest that I am able to get is something between 6 and 10 Mbit = not acceptable! Yesterday I made some traceroute which indicates huge load on a LEVEL3 router but the problem exists for weeks now and the high response time is gone (20ms instead of 300ms). How can I trace this to find the actual slow node? Thought about a traceroute with bigger packages but will this work? In addition this problem might not be related to one of our servers as there are much higher transmission rates to other servers or clients. Actually office = server is faster than server <= server! Any idea is appreciated ;) Update We actually use rsync over ssh to copy the files. As encryption tends to have more bottlenecks I tried a HTTP request but unfortunately it is just as slow. We have a SLA with one of the data centers. They said they already tried to change the routing because they say this is related to a cheap network where the traffic gets routed through. It is true that it will route through a "cheapnet" but only the other way around. Our direction goes through LEVEL3 and the other way goes through lambdanet (which they said is not a good network). If I got it right (I'm a network intermediate) they simulated a longer path to force routing through LEVEL3 and they announce LEVEL3 in the AS path. I basically want to know if they're right or they're just trying to abdicate their responsibility. The thing is that the problem exists in both directions (while different routes), so I think it is in the responsibility of our hoster. And honestly, I don't believe that there is a DC2DC connection which only can handle 600kb/s - 1,5 MB/s for weeks! The question is how to detect WHERE this bottleneck is

Read the article
Performance tweaks and upgrades for VMWare Server 2

- by sjohnston

Our software department has a server running VMWare Server 2. We typically have 8-10 VMs running as test environments (Win XP and Server 08) for various versions of our software, and one VM that is used as a build server (Win XP). The host is running Server 2003 R2. It has 32GB RAM, 8 core Xeon 3.16GHz CPU, one disk for host OS and two raid disks for VMs. The majority of the time, this setup behaves very well and there are no complaints. Other times, the VMs can be very laggy. This is sometimes, but not always, correlated to heavy load on the build server. I'm a software developer, not an IT pro, but it seems to me that this machine should be beefy enough to handle this many VMs. Is this occasional performance hit likely just because we're hitting the limits of the hardware, or should I be looking for another culprit? From what I've read, I'm guessing if there's a bottleneck, it's probably disk I/O with all these VMs running off two disks (especially the build server). Would spreading the VMs over more disks, and/or switching to SSDs give us a significant performance boost? Other things I've read may increase performance: single virtual processor per VM removing/disabling unused virtual hardware preallocated disk space not using snapshots setting a reserved memory limit on the host and disabling VM memory swapping Can anyone confirm or deny if any of these improve performance? What other good tweaks have I missed?

Read the article
Firefox takes a really long time to load some sites on Ubuntu

- by Dave

Hello guys, I have an issue here. Some sites - just a few - takes a really long time to load on Firefox. One example is A List Apart (http://www.alistapart.com/) which takes more than 30 minutes (yes, minutes, not seconds). On Opera, ou even through a telnet session, the problematic sites run without problem, fast as expected. I am using Linux 8.04, running Firefox 3.6.3 downloaded from mozilla site, with a 10M ADSL connection. I tried many tweaks I found googling, like disable IPv6, and change http pipelining settings on FF's about:config. None worked. I also used Firebug to find what phase during negotiation is the bottleneck. Findings are in the screenshot. Well guys, any idea what is the issue? And how to solve it? I repeat, this only happens with firefox (3.6.3 and prior versions), for a few sites only (even sites with much more requests, images, javascripts, stylesheets work fine), and http pipelines and IPv6 tweaks on about:config didn't work. Thanks

Read the article
Squid reverse proxy array - siblings not communicating with each other

- by V. Romanov

I want to set up 2 squid servers to act as reverse proxy and cache for a webserver on our intranet. The load balancing will be done with DNS round robin or just different mappings for different clients. The thing is, I want both servers to try and contact each other to see if they have the object required in cache before contacting the webserver for it (the network that servers the webserver is the bottleneck and I'm trying to eliminate it) Both squids are configured the same, here are the relevant config lines : acl dvr1_cache_it_best_tv_com dstdomain dvr1.cache.it.best-tv.com acl squid1_it_best_tv_com dstdomain squid1.it.best-tv.com acl squid2_it_best_tv_com dstdomain squid2.it.best-tv.com http_access allow dvr1_cache_it_best_tv_com http_access allow squid1_it_best_tv_com http_access allow squid2_it_best_tv_com http_access allow all http_port 8081 accel defaultsite=dvr1.cache.it.best-tv.com cache_peer dvr1.origin.it.best-tv.com parent 80 0 no-query originserver name=Proxy_dvr1_origin_it_best_tv_com cache_peer squid1.it.best-tv.com sibling 8081 3130 weight=10 name=Proxy_Squid1_it_best_tv_com cache_peer squid2.it.best-tv.com sibling 8081 3130 weight=10 name=Proxy_Squid2_it_best_tv_com cache_peer_access Proxy_dvr1_origin_it_best_tv_com allow dvr1_cache_it_best_tv_com cache_peer_access Proxy_squid1_it_best_tv_com allow squid1_it_best_tv_com cache_peer_access Proxy_squid1_it_best_tv_com allow squid2_it_best_tv_com cache_peer_access Proxy_squid1_it_best_tv_com allow dvr1_cache_it_best_tv_com cache_peer_access Proxy_squid2_it_best_tv_com allow squid1_it_best_tv_com cache_peer_access Proxy_squid2_it_best_tv_com allow squid2_it_best_tv_com cache_peer_access Proxy_squid2_it_best_tv_com allow dvr1_cache_it_best_tv_com just to make it clear - dvr1.cache is the alias for the proxy servers. dvr1.origin is the web server. Both servers work, both serve content and cache it and work fine. However, when I clear the cache on one server and then access it, it gets the content from the parent (DVR1_ORIGIN) instead of going to the sibling squid. What did I configure wrong? Or perhaps I don't understand the architecture correctly? I read the squid manuals but as far as I see i did it all by the book and yet it doesn't work right. Any help will be appreciated!

Read the article
Bad performance with Linux software RAID5 and LUKS encryption

- by Philipp Wendler

I have set up a Linux software RAID5 on three hard drives and want to encrypt it with cryptsetup/LUKS. My tests showed that the encryption leads to a massive performance decrease that I cannot explain. The RAID5 is able to write 187 MB/s [1] without encryption. With encryption on top of it, write speed is down to about 40 MB/s. The RAID has a chunk size of 512K and a write intent bitmap. I used -c aes-xts-plain -s 512 --align-payload=2048 as the parameters for cryptsetup luksFormat, so the payload should be aligned to 2048 blocks of 512 bytes (i.e., 1MB). cryptsetup luksDump shows a payload offset of 4096. So I think the alignment is correct and fits to the RAID chunk size. The CPU is not the bottleneck, as it has hardware support for AES (aesni_intel). If I write on another drive (an SSD with LVM) that is also encrypted, I do have a write speed of 150 MB/s. top shows that the CPU usage is indeed very low, only the RAID5 xor takes 14%. I also tried putting a filesystem (ext4) directly on the unencrypted RAID so see if the layering is problem. The filesystem decreases the performance a little bit as expected, but by far not that much (write speed varying, but 100 MB/s). Summary: Disks + RAID5: good Disks + RAID5 + ext4: good Disks + RAID5 + encryption: bad SSD + encryption + LVM + ext4: good The read performance is not affected by the encryption, it is 207 MB/s without and 205 MB/s with encryption (also showing that CPU power is not the problem). What can I do to improve the write performance of the encrypted RAID? [1] All speed measurements were done with several runs of dd if=/dev/zero of=DEV bs=100M count=100 (i.e., writing 10G in blocks of 100M). Edit: If this helps: I'm using Ubuntu 11.04 64bit with Linux 2.6.38. Edit2: The performance stays approximately the same if I pass a block size of 4KB, 1MB or 10MB to dd.

Read the article
task blocked for more than

- by Manuel Sopena Ballesteros

I have a webserver with the configuration below: VMWare ESXi environemt CPanel installed CentOS release 6.5 (Final) 4 CPUs 2G RAM 2x VM disks 100G each LVM system My issue is I am getting kernel panic quite frequently. These is a list of some processes blocked I could see from the console: mysqld queueprocd httpd suphp vmtoolsd loop0 auditd this is my sar logs Linux 2.6.32-431.3.1.el6.x86_64 (test01) 08/22/2014 _x86_64_ (4 CPU) 12:00:01 AM CPU %user %nice %system %iowait %steal %idle 12:10:01 AM all 26.86 0.01 0.98 0.57 0.00 71.57 12:20:01 AM all 1.78 0.02 1.03 0.08 0.00 97.09 12:30:01 AM all 26.34 0.02 0.85 0.05 0.00 72.74 12:40:01 AM all 27.12 0.01 1.11 1.22 0.00 70.54 12:50:01 AM all 1.59 0.02 0.94 0.13 0.00 97.32 01:00:01 AM all 26.10 0.01 0.77 0.04 0.00 73.07 01:10:01 AM all 27.51 0.01 1.16 0.14 0.00 71.18 01:20:01 AM all 1.80 0.07 1.06 0.08 0.00 96.99 01:30:01 AM all 26.19 0.01 0.78 0.05 0.00 72.96 01:40:01 AM all 26.62 0.02 0.87 0.05 0.00 72.45 01:50:02 AM all 1.35 0.01 0.87 0.02 0.00 97.75 02:00:01 AM all 26.11 0.02 0.69 0.02 0.00 73.17 02:10:01 AM all 26.73 0.02 0.89 0.14 0.00 72.21 02:20:01 AM all 1.45 0.01 0.92 0.04 0.00 97.58 02:30:01 AM all 26.59 0.01 1.06 0.03 0.00 72.31 02:40:01 AM all 26.27 0.01 0.72 0.05 0.00 72.95 02:50:01 AM all 0.86 0.01 0.50 0.09 0.00 98.53 03:00:01 AM all 25.61 0.02 0.39 0.03 0.00 73.96 03:10:01 AM all 26.30 0.08 0.66 0.14 0.00 72.82 03:20:01 AM all 0.81 0.01 0.51 0.04 0.00 98.63 03:30:02 AM all 26.15 0.02 0.53 0.07 0.00 73.24 03:40:01 AM all 26.06 0.01 0.47 0.04 0.00 73.42 03:50:01 AM all 0.96 0.02 0.51 0.03 0.00 98.48 Average: all 17.69 0.02 0.79 0.14 0.00 81.36 06:58:14 AM LINUX RESTART 07:00:01 AM CPU %user %nice %system %iowait %steal %idle 07:10:01 AM all 1.04 0.02 0.57 0.95 0.00 97.42 07:20:02 AM all 0.66 0.01 0.39 0.06 0.00 98.87 07:30:01 AM all 25.71 0.01 0.45 0.16 0.00 73.67 07:40:01 AM all 25.88 0.01 0.35 0.08 0.00 73.68 As you can see the server became unresponsive at 03.50 AM and I had to reset the VM at 06.58 AM to fix it. dmesg does not show any relevant information. I don't see any bottleneck in sar, any idea what can I check next? thank you very much

Read the article
which is best smart automatic file replication solution for cloud storage based systems.

- by TORr0t

I am looking for a solution for a project i am working on. We are developing a websystem where people can upload their files and other people can download it. (similar to rapidshare.com model) Problem is, some files can be demanded much more than other files. The scenerio is like: I have uploaded my birthday video and shared it with all of my friend, I have uploaded it to myproject.com and it was stored in one of the cluster which has 100mbit connection. Problem is, once all of my friends want to download the file, they cant download it since the bottleneck here is 100mbit which is 15MB per second, but i got 1000 friends and they can only download 15KB per second. I am not taking into account that the hdd is serving same files. My network infrastrucre is as follows: 1 gbit server(client) and connected to 4 Nodes of storage servers that have 100mbit connection. 1gbit server can handle the 1000 users traffic if one of storage node can stream more than 15MB per second to my 1gbit (client) server and visitor will stream directly from client server instead of storage nodes. I can do it by replicating the file into 2 nodes. But i dont want to replicate all files uploadded to my network since it is costing much more. So i need a cloud based system, which will push the files into replicated nodes automatically when demanded to those files are high, and when the demand is low, they will delete from other nodes and it will stay in only 1 node. I have looked to gluster and asked in their irc channel that, gluster cant do such a thing. It is only able to replicate all the files or none of the files. But i need it the cluster software to do it automatically. Any solutions ? (instead of recommending me amazon s3) S

Read the article
What may the reason of slowness be (see details in message body)?

- by Ivan

I've got a really weird situation I'm beating to solve. A performance problem which looks really like an empty waiting sequence set in code (while it probably isn't so). I've got a pretty powerful dedicated server (10 GB RAM, eight Xeon cores, etc) running Ubuntu 10.04 with all the functionality services (except OpenVPN server used to provide secure access to clients) deployed in separate VirtualBox (vboxheadless) machines (one for the company e-mail server, one for web server and one for accounting/crm server (Firebird + proprietary app server working with Delphi-made clients)). CPU load (as "top" says) is almost always near zero. Host system RAM is close to 100% usage but not overloaded (as very little swapping gets used, and freed (by stopping one of VMs) memory doesn't get reused any quickly). Approximately 50% of guests RAM is used. iostat usually shows near zero %util. Network bandwidth seems to be underused. But the accounting/crm client (a Win32 Delphi application run on WinXP machines) software works hell-slow with this server (and works much better using an inside-LAN Windows server). I just can't imagine what can make it be slow if there are so plenty of CPU, RAM, HDD and bandwidth resources available on clients and on the server even in their hardest moments. Saying bandwidth is underused I not only know that clients and the server are connected to the Internet with a bigger channels than really used (which leaves the a chance they may have a bottleneck of a sort on the route between them), I've tested bandwidth between clients and the server by copying files among them.

Read the article
What hardware would I need (approx) to run ESXi server?

- by mr.b

Hi, I am considering to purchase off-the-shelf commodity hardware in order to build server that will host virtual machines using ESXi server. Intended purpose for this server is NOT mission critical tasks. It will have to run perhaps 20-50 Windows XP/Vista/7 virtual machines (in total, but closer to 20 figure). Each guest would have to have 1-2 GB of ram, and probably two-three times more disk space than guest OS needs with clean install and all updates applied (that would be around 6-8 GB for XP, and i believe closer to 10-15 for win7). Those guests will act as a test ground for a new product that is network management software, thus guests will idle most of their time once initially loaded, but if I give them some task to complete, they should be able to perform reasonably well. Now, from what I have learned... CPU is usually not much of an issue (6 cores would do it), memory should not be lacking, but doesn't have to be sum of all guests, because of overcommitment... That leads me to IO, which is, as it seems, the bottleneck. Since I have very little experience with ESXi (and ESX, too) server, I'd like to ask: How much memory could I save by overcommitment, and how does it affect performance? Is 6-core cpu enough to run above described system? Would it be possible to run entire server off two (or even one) SSD drives (to host system virtual disks, with few additional HDDs (2-3) in RAID 0 to be used as secondary storage? I read somewhere that ESXi allows having something like "master image", essentially virtual machine that is "deployed" many times, so that disk space can be saved by having only differences stored by specific guests, instead of copying around whole virtual disks. Is this true, and how can this help me? Are there any other things I need to take into consideration when building this off-the-shelf solution? I should probably mention here that I'm fully aware of issues like SPOF regarding power supply, raid 0, etc, but since it's only a testing ground and not a production system, it's not so important for me. Thanks, B.

Read the article
What does the 'Burst Rate' stat mean in HDTune?

- by UpTheCreek

I recently upgraded my laptop's v slow hard drive to a seagate momentus 7200. Everything is working fine, but I'm a bit confused by these benchmark results: The burst rate is significantly less than the Maximim transfer rate, and not much higher than the normal minimum (if you ignore the spikes). What's going on here? On the HDtune website it defines Burst Rate as: ...the highest speed (in megabytes per second) at which data can be transferred from the drive interface (IDE or SCSI for example) to the operating system. Which begs some questions... e.g. if this is the highest, then how did the bechmarking tool record the 103MB/sec maximum? And if this really is the true maximum, then where is the bottleneck? The laptops SATA interface is on an Intel 82801GBM southbridge controller. When I check in hardware manager, I see that it's driver is iaStor.sys from 2005. Maybe that's the issue? I'll look for a newever version, but any insights would be appreciated. Thanks

Read the article
Low 'Burst Rate' from SATA drive in HDTune?

- by UpTheCreek

I recently upgraded my laptop's v slow hard drive to a seagate momentus 7200. Everything is working fine, but I'm a bit confused by these benchmark results: The burst rate is significantly less than the Maximim transfer rate, and not much higher than the normal minimum (if you ignore the spikes). What's going on here? On the HDtune website it defines Burst Rate as: ...the highest speed (in megabytes per second) at which data can be transferred from the drive interface (IDE or SCSI for example) to the operating system. Which begs some questions... e.g. if this is the highest, then how did the bechmarking tool record the 103MB/sec maximum? And if this really is the true maximum, then where is the bottleneck? The laptops SATA interface is on an Intel 82801GBM southbridge controller. When I check in hardware manager, I see that it's driver is iaStor.sys from 2005. Maybe that's the issue? I'll look for a newever version, but any insights would be appreciated. Thanks UPDATE: Acorting to this page on the HDTune website... An important parameter of the test is the Burst Rate. This value should always be higher than the maximum transfer rate. A lower value is usually an indication of a configuration problem. So what might be the configuration problem?

Read the article
Performance of external USB disk with ESXi5

- by PeterMmm

I have a new HP DL120 G7 server with ESXi5. One VM is a Win2003 instalation and I have an external USB2.0 drive attached by USB Controller and USB Device. I copy a 4GB file from external USB to server disk. In the VM that takes up to 10 minutes. On a native Win2003 that takes aprox. 3 minutes. I have no explaination for that diference: In any case the bottleneck is the USB connection, much slower than the disks (SAS, RAID1). If the USB connection on the VM would be USB1.1 and not USB2.0 it would take much more time. (The disk performance between server partitions on the VM is correct. - see update) Could be that my native box is extremely fast and the VM is the normal case. ??? Update I try with passtrough and a first run copy the same data in aprox. 7 minutes. Still 2 times slower than the native connection. I also did another messure and the copy between partitions on the same VM takes 3 minutes.

Read the article
Seeking faster access/transfer times for accounting application

- by Markaway

Our accounting software, Sage 50, has been getting slower to open on workstations and reading the company file. The company file only contains 2 years worth of transactions, and we just cleared out 2011 so the file size has gotten a lot smaller. There are 10 users, 6 of which are on it all day, 4 are on and off throughout the day. Our network is entirely GbE and the switches are set to prioritize traffic on that port number. Watching network traffic, we barely use 40% of the network capability on the workstation, so I don't think that is our bottleneck. Our server contains two older Raptors Sata 2(3GB/s) 150GB in RAID 1. We were considering switching to SSD's, but a lot of what I read says to stay away from MLC's, especially for production environment and definitely avoid putting them in a RAID config. So would upgrading to newer Raptors with SATA 3(6GB/s) offer noticable benefits? What other options are out there that aren't so expensive? Trying to keep it to 200-300 per drive. We need at least 150GB, but going to 250-300GB would be better as it gives us more room to grow. We have about 30% space remaining on what we have now.

Read the article
Handling inheritance with overriding efficiently

- by Fyodor Soikin

I have the following two data structures. First, a list of properties applied to object triples: Object1 Object2 Object3 Property Value O1 O2 O3 P1 "abc" O1 O2 O3 P2 "xyz" O1 O3 O4 P1 "123" O2 O4 O5 P1 "098" Second, an inheritance tree: O1 O2 O4 O3 O5 Or viewed as a relation: Object Parent O2 O1 O4 O2 O3 O1 O5 O3 O1 null The semantics of this being that O2 inherits properties from O1; O4 - from O2 and O1; O3 - from O1; and O5 - from O3 and O1, in that order of precedence. NOTE 1: I have an efficient way to select all children or all parents of a given object. This is currently implemented with left and right indexes, but hierarchyid could also work. This does not seem important right now. NOTE 2: I have tiggers in place that make sure that the "Object" column always contains all possible objects, even when they do not really have to be there (i.e. have no parent or children defined). This makes it possible to use inner joins rather than severely less effiecient outer joins. The objective is: Given a pair of (Property, Value), return all object triples that have that property with that value either defined explicitly or inherited from a parent. NOTE 1: An object triple (X,Y,Z) is considered a "parent" of triple (A,B,C) when it is true that either X = A or X is a parent of A, and the same is true for (Y,B) and (Z,C). NOTE 2: A property defined on a closer parent "overrides" the same property defined on a more distant parent. NOTE 3: When (A,B,C) has two parents - (X1,Y1,Z1) and (X2,Y2,Z2), then (X1,Y1,Z1) is considered a "closer" parent when: (a) X2 is a parent of X1, or (b) X2 = X1 and Y2 is a parent of Y1, or (c) X2 = X1 and Y2 = Y1 and Z2 is a parent of Z1 In other words, the "closeness" in ancestry for triples is defined based on the first components of the triples first, then on the second components, then on the third components. This rule establishes an unambigous partial order for triples in terms of ancestry. For example, given the pair of (P1, "abc"), the result set of triples will be: O1, O2, O3 -- Defined explicitly O1, O2, O5 -- Because O5 inherits from O3 O1, O4, O3 -- Because O4 inherits from O2 O1, O4, O5 -- Because O4 inherits from O2 and O5 inherits from O3 O2, O2, O3 -- Because O2 inherits from O1 O2, O2, O5 -- Because O2 inherits from O1 and O5 inherits from O3 O2, O4, O3 -- Because O2 inherits from O1 and O4 inherits from O2 O3, O2, O3 -- Because O3 inherits from O1 O3, O2, O5 -- Because O3 inherits from O1 and O5 inherits from O3 O3, O4, O3 -- Because O3 inherits from O1 and O4 inherits from O2 O3, O4, O5 -- Because O3 inherits from O1 and O4 inherits from O2 and O5 inherits from O3 O4, O2, O3 -- Because O4 inherits from O1 O4, O2, O5 -- Because O4 inherits from O1 and O5 inherits from O3 O4, O4, O3 -- Because O4 inherits from O1 and O4 inherits from O2 O5, O2, O3 -- Because O5 inherits from O1 O5, O2, O5 -- Because O5 inherits from O1 and O5 inherits from O3 O5, O4, O3 -- Because O5 inherits from O1 and O4 inherits from O2 O5, O4, O5 -- Because O5 inherits from O1 and O4 inherits from O2 and O5 inherits from O3 Note that the triple (O2, O4, O5) is absent from this list. This is because property P1 is defined explicitly for the triple (O2, O4, O5) and this prevents that triple from inheriting that property from (O1, O2, O3). Also note that the triple (O4, O4, O5) is also absent. This is because that triple inherits its value of P1="098" from (O2, O4, O5), because it is a closer parent than (O1, O2, O3). The straightforward way to do it is the following. First, for every triple that a property is defined on, select all possible child triples: select Children1.Id as O1, Children2.Id as O2, Children3.Id as O3, tp.Property, tp.Value from TriplesAndProperties tp -- Select corresponding objects of the triple inner join Objects as Objects1 on Objects1.Id = tp.O1 inner join Objects as Objects2 on Objects2.Id = tp.O2 inner join Objects as Objects3 on Objects3.Id = tp.O3 -- Then add all possible children of all those objects inner join Objects as Children1 on Objects1.Id [isparentof] Children1.Id inner join Objects as Children2 on Objects2.Id [isparentof] Children2.Id inner join Objects as Children3 on Objects3.Id [isparentof] Children3.Id But this is not the whole story: if some triple inherits the same property from several parents, this query will yield conflicting results. Therefore, second step is to select just one of those conflicting results: select * from ( select Children1.Id as O1, Children2.Id as O2, Children3.Id as O3, tp.Property, tp.Value, row_number() over( partition by Children1.Id, Children2.Id, Children3.Id, tp.Property order by Objects1.[depthInTheTree] descending, Objects2.[depthInTheTree] descending, Objects3.[depthInTheTree] descending ) as InheritancePriority from ... (see above) ) where InheritancePriority = 1 The window function row_number() over( ... ) does the following: for every unique combination of objects triple and property, it sorts all values by the ancestral distance from the triple to the parents that the value is inherited from, and then I only select the very first of the resulting list of values. A similar effect can be achieved with a GROUP BY and ORDER BY statements, but I just find the window function semantically cleaner (the execution plans they yield are identical). The point is, I need to select the closest of contributing ancestors, and for that I need to group and then sort within the group. And finally, now I can simply filter the result set by Property and Value. This scheme works. Very reliably and predictably. It has proven to be very powerful for the business task it implements. The only trouble is, it is awfuly slow. One might point out the join of seven tables might be slowing things down, but that is actually not the bottleneck. According to the actual execution plan I'm getting from the SQL Management Studio (as well as SQL Profiler), the bottleneck is the sorting. The problem is, in order to satisfy my window function, the server has to sort by Children1.Id, Children2.Id, Children3.Id, tp.Property, Parents1.[depthInTheTree] descending, Parents2.[depthInTheTree] descending, Parents3.[depthInTheTree] descending, and there can be no indexes it can use, because the values come from a cross join of several tables. EDIT: Per Michael Buen's suggestion (thank you, Michael), I have posted the whole puzzle to sqlfiddle here. One can see in the execution plan that the Sort operation accounts for 32% of the whole query, and that is going to grow with the number of total rows, because all the other operations use indexes. Usually in such cases I would use an indexed view, but not in this case, because indexed views cannot contain self-joins, of which there are six. The only way that I can think of so far is to create six copies of the Objects table and then use them for the joins, thus enabling an indexed view. Did the time come that I shall be reduced to that kind of hacks? The despair sets in.

Read the article
How to best handle exception to repeating calendar events

- by blcArmadillo

I'm working on a project that will require me to implement a calendar. I'm trying to come up with a system that is very flexible: can handle repeating events, exceptions to repeats, etc. I've looked at the schema for applications like iCal, Lotus Notes, and Mozilla to get an idea of how to go about implementing such a system. Currently I'm having trouble deciding what is the best way to handle exceptions to repeating events. I've used databases quite a bit but don't have a ton of experience with really optimizing everything so I'm not sure which method of the two I'm considering would be optimal in terms of overall performance and ability to query/search: Breaking the repeating event. So taking the changing the ending date on the current row for the repeating event, inserting a new row with the exception, and adding another row continuing the old sequence. Simply adding an exception. So adding a new row with some field that indicates it as an override. So here is why I can't decide. Method one will result in a lot more rows since each edit requires 2 extra rows as apposed to only one row by the second method. On the other hand I think the query to find an event would be much simper, and thus possibly faster(?) using the first method. The second method seems like it will require more calculating on the application server since once you get the data you'll have to remove the intersection of the two rows. I know databases are often the bottleneck for websites and while I'm sure a lot of you are thinking either is fine because your project will probably never get large enough for the difference in efficiency to really matter, I'd still like to implement the best solution. So what method would you guys pick, or would you do something completely different? Also, as a side note I'll be using MySQL and PHP. If there is another technology that you think would be better suited for this, especially in the database area, please mention it. Thanks for the advice.

Read the article
iPhone openGLES performance tuning

- by genesys

Hey there! I'm trying now for quite a while to optimize the framerate of my game without really making progress. I'm running on the newest iPhone SDK and have a iPhone 3G 3.1.2 device. I invoke arround 150 drawcalls, rendering about 1900 Triangles in total (all objects are textured using two texturelayers and multitexturing. most textures come from the same textureAtlasTexture stored in pvrtc 2bpp compressed texture). This renders on my phone at arround 30 fps, which appears to me to be way too low for only 1900 triangles. I tried many things to optimize the performance, including batching together the objects, transforming the vertices on the CPU and rendering them in a single drawcall. this yelds 8 drawcalls (as oposed to 150 drawcalls), but performance is about the same (fps drop to arround 26fps) I'm using 32byte vertices stored in an interleaved array (12bytes position, 12bytes normals, 8bytes uv). I'm rendering triangleLists and the vertices are ordered in TriStrip order. I did some profiling but I don't really know how to interprete it. instruments-sampling using Instruments and Sampling yelds this result: http://neo.cycovery.com/instruments_sampling.gif telling me that a lot of time is spent in "mach_msg_trap". I googled for it and it seems this function is called in order to wait for some other things. But wait for what?? instruments-openGL instruments with the openGL module yelds this result: http://neo.cycovery.com/intstruments_openglES_debug.gif but here i have really no idea what those numbers are telling me shark profiling: profiling with shark didn't tell me much either: http://neo.cycovery.com/shark_profile_release.gif the largest number is 10%, spent by DrawTriangles - and the whole rest is spent in very small percentage functions Can anyone tell me what else I could do in order to figure out the bottleneck and could help me to interprete those profiling information? Thanks a lot!

Read the article
Concurrent cartesian product algorithm in Clojure

- by jqno

Is there a good algorithm to calculate the cartesian product of three seqs concurrently in Clojure? I'm working on a small hobby project in Clojure, mainly as a means to learn the language, and its concurrency features. In my project, I need to calculate the cartesian product of three seqs (and do something with the results). I found the cartesian-product function in clojure.contrib.combinatorics, which works pretty well. However, the calculation of the cartesian product turns out to be the bottleneck of the program. Therefore, I'd like to perform the calculation concurrently. Now, for the map function, there's a convenient pmap alternative that magically makes the thing concurrent. Which is cool :). Unfortunately, such a thing doesn't exist for cartesian-product. I've looked at the source code, but I can't find an easy way to make it concurrent myself. Also, I've tried to implement an algorithm myself using map, but I guess my algorithmic skills aren't what they used to be. I managed to come up with something ugly for two seqs, but three was definitely a bridge too far. So, does anyone know of an algorithm that's already concurrent, or one that I can parallelize myself?

Read the article
Will fixed-point arithmetic be worth my trouble?

- by Thomas

I'm working on a fluid dynamics Navier-Stokes solver that should run in real time. Hence, performance is important. Right now, I'm looking at a number of tight loops that each account for a significant fraction of the execution time: there is no single bottleneck. Most of these loops do some floating-point arithmetic, but there's a lot of branching in between. The floating-point operations are mostly limited to additions, subtractions, multiplications, divisions and comparisons. All this is done using 32-bit floats. My target platform is x86 with at least SSE1 instructions. (I've verified in the assembler output that the compiler indeed generates SSE instructions.) Most of the floating-point values that I'm working with have a reasonably small upper bound, and precision for near-zero values isn't very important. So the thought occurred to me: maybe switching to fixed-point arithmetic could speed things up? I know the only way to be really sure is to measure it, that might take days, so I'd like to know the odds of success beforehand. Fixed-point was all the rage back in the days of Doom, but I'm not sure where it stands anno 2010. Considering how much silicon is nowadays pumped into floating-point performance, is there a chance that fixed-point arithmetic will still give me a significant speed boost? Does anyone have any real-world experience that may apply to my situation?

Read the article
Writing to a log4net FileAppender with multiple threads performance problems

- by Wayne

TickZoom is a very high performance app which uses it's own parallelization library and multiple O/S threads for smooth utilization of multi-core computers. The app hits a bottleneck where users need to write information to a LogAppender from separate O/S threads. The FileAppender uses the MinimalLock feature so that each thread can lock and write to the file and then release it for the next thread to write. If MinimalLock gets disabled, log4net reports errors about the file being already locked by another process (thread). A better way for log4net to do this would be to have a single thread that takes care of writing to the FileAppender and any other threads simply add their messages to a queue. In that way, MinimalLock could be disabled to greatly improve performance of logging. Additionally, the application does a lot of CPU intensive work so it will also improve performance to use a separate thread for writing to the file so the CPU never waits on the I/O to complete. So the question is, does log4net already offer this feature? If so, how do you do enable threaded writing to a file? Is there another, more advanced appender, perhaps? If not, then since log4net is already wrapped in the platform, that makes it possible to implement a separate thread and queue for this purpose in the TickZoom code. Sincerely, Wayne

Read the article
Data in two databases, eager spool resulting in query

- by Valkyrie

I have two databases in SQL2k5: one that holds a large amount of static data (SQL Database 1) (never updated but frequently inserted into) and one that holds relational data (SQL Database 2) related to the static data. They're separated mainly because of corporate guidelines and business requirements: assume for the following problem that combining them is not practical. There are places in SQLDB2 that PKs in SQLDB1 are referenced; triggers control the referential integrity, since cross-database relationships are troublesome in SQL Server. BUT, because of the large amount of data in SQLDB1, I'm getting eager spools on queries that join from the Id in SQLDB2 that references the data in SQLDB1. (With me so far? Maybe an example will help:) SELECT t.Id, t.Name, t2.Company FROM SQLDB1.table t INNER JOIN SQLDB2.table t2 ON t.Id = t2.FKId This query results in a eager spool that's 84% of the load of the query; the table in SQLDB1 has 35M rows, so it's completely choking this query. I can't create a view on the table in SQLDB1 and use that as my FK/index; it doesn't want me to create a constraint based on a view. Anyone have any idea how I can fix this huge bottleneck? (Short of putting the static data in the first db: believe me, I've argued that one until I'm blue in the face to no avail.) Thanks! valkyrie Edit: also can't create an indexed view because you can't put schemabinding on a view that references a table outside the database where the view resides. Dang it.

Read the article
Online voice chat: Why client-server model vs. peer-to-peer model?

- by sstallings

I am adding online voice chat to a Silverlight app. I've been reviewing current apps, services and SDKs found thru online searches and forums. I'm finding that the majority of these implement a client-server (C/S) model and I'm trying to understand why that model versus a peer-to-peer (PTP) model. To me PTP would be preferable because going direct between peers would be more efficient (fewer IP hops and no processing along the way by a server computer) and no need for a server and its costs and dependencies. I found some products offer the ability to switch from PTP to C/S if the PTP proves insufficient. As I thought more about it, I could see that C/S could be better if there are more than two peers involved in a conversation, then the server (supposedly with more bandwidth) could do a better job of relaying each peers outgoing traffic to the multiple other peers. In C/S many-to-many voice chatting, each peer's upstream broadband (which is where the bottleneck inherently is) would only have to carry each item of voice traffic once, then the server would use its superior bandwidth to relay the message to the multiple other peers. But, in a situation with one-on-one voice chatting it seems that PTP would be best. A server would not reduce each of the two peer's bandwidth requirements and would only add unnecessary overhead, dependency and cost. In one-on-one voice chatting: Am I mistaken on anything above? Would peer-to-peer be best? Would a server provide anything of value that could not be provided by a client-only program? Is there anything else that I should be taking into consideration? And lastly, can you recommend any Silverlight PTP or C/S voice chat products? Thanks in advance for any info.

Read the article
Problems with WCF endpoints hosted from Windows Service

- by Dilip

I have a managed Windows Service that hosts a couple of WCF endpoints. The service is set to start automatically when the PC is restarted. On reboot I find that this line of code: ServiceHost wcfHost1 = new ServiceHost(typeof(WCFHost1)); in the OnStart() method of the service takes somewhere between 15 - 20 seconds to execute. Actually I have two such statements but the second one executes in a flash. It is the first one that takes so long. Does anyone know what could be causing the bottleneck? Because of this, sometimes the call exceeds 30 seconds and as a result the SCM thinks my service timed out while trying to initialize itself. Now, I know its easy for me to just spin off a thread to do this and return from OnStart() right away but I'd like to know what could cause this delay. This happens only when the service starts up on PC reboot. If the PC is up and running, the service starts & stops in less than a second.

Read the article

Search Results

Search found 415 results on 17 pages for 'bottleneck'.

Page 12/17 | < Previous Page | 8 9 10 11 12 13 14 15 16 17 | Next Page >

- by Scott Stamp

- by Ahmed Galal

- by Dennis Kashkin

- by 2called-chaos

- by sjohnston

- by Dave

- by V. Romanov

- by Philipp Wendler

- by Manuel Sopena Ballesteros

- by TORr0t

- by Ivan

- by mr.b

- by UpTheCreek

- by UpTheCreek

- by PeterMmm

- by Markaway

- by Fyodor Soikin

- by blcArmadillo

- by genesys

- by jqno

- by Thomas

- by Wayne

- by Valkyrie

- by sstallings

- by Dilip

< Previous Page | 8 9 10 11 12 13 14 15 16 17 | Next Page >