Search Results

Search found 882 results on 36 pages for 'concurrency coordination'.

Page 30/36 | < Previous Page | 26 27 28 29 30 31 32 33 34 35 36  | Next Page >

  • Best Practices - Dynamic Reconfiguration

    - by jsavit
    This post is one of a series of "best practices" notes for Oracle VM Server for SPARC (formerly named Logical Domains) Overview of dynamic Reconfiguration Oracle VM Server for SPARC supports Dynamic Reconfiguration (DR), making it possible to add or remove resources to or from a domain (virtual machine) while it is running. This is extremely useful because resources can be shifted to or from virtual machines in response to load conditions without having to reboot or interrupt running applications. For example, if an application requires more CPU capacity, you can add CPUs to improve performance, and remove them when they are no longer needed. You can use even use Dynamic Resource Management (DRM) policies that automatically add and remove CPUs to domains based on load. How it works (in broad general terms) Dynamic Reconfiguration is done in coordination with Solaris, which recognises a hypervisor request to change its virtual machine configuration and responds appropriately. In essence, Solaris receives a message saying "you now have 16 more CPUs numbered 16 to 31" or "8GB more RAM starting at address X" or "here's a new network or disk device - have fun with it". These actions take very little time. Solaris then can start using the new resource. In the case of added CPUs, that means dispatching processes and potentially binding interrupts to the new CPUs. For memory, Solaris adds the new memory pages to its "free" list and starts using them. Comparable actions occur with network and disk devices: they are recognised by Solaris and then used. Removing is the reverse process: after receiving the DR message to free specific CPUs, Solaris unbinds interrupts assigned to the CPUs and stops dispatching process threads. That takes very little time. primary # ldm list NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME primary active -n-cv- SP 16 4G 1.0% 6d 22h 29m ldom1 active -n---- 5000 16 8G 0.9% 6h 59m primary # ldm set-core 5 ldom1 primary # ldm list NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME primary active -n-cv- SP 16 4G 0.2% 6d 22h 29m ldom1 active -n---- 5000 40 8G 0.1% 6h 59m primary # ldm set-core 2 ldom1 primary # ldm list NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME primary active -n-cv- SP 16 4G 1.0% 6d 22h 29m ldom1 active -n---- 5000 16 8G 0.9% 6h 59m Memory pages are vacated by copying their contents to other memory locations and wiping them clean. Solaris may have to swap memory contents to disk if the remaining RAM isn't enough to hold all the contents. For this reason, deallocating memory can take longer on a loaded system. Even on a lightly loaded system it took several 7 or 8 seconds to switch the domain below between 8GB and 24GB of RAM. primary # ldm set-mem 24g ldom1 primary # ldm list NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME primary active -n-cv- SP 16 4G 0.1% 6d 22h 36m ldom1 active -n---- 5000 16 24G 0.2% 7h 6m primary # ldm set-mem 8g ldom1 primary # ldm list NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME primary active -n-cv- SP 16 4G 0.7% 6d 22h 37m ldom1 active -n---- 5000 16 8G 0.3% 7h 7m What if the device is in use? (this is the anecdote that inspired this blog post) If CPU or memory is being removed, releasing it pretty straightforward, using the method described above. The resources are released, and Solaris continues with less capacity. It's not as simple with a network or I/O device: you don't want to yank a device out from underneath an application that might be using it. In the following example, I've added a virtual network device to ldom1 and want to take it away, even though it's been plumbed. primary # ldm rm-vnet vnet19 ldom1 Guest LDom returned the following reason for failing the operation: Resource Information ---------------------------------------------------------- ----------------------- /devices/virtual-devices@100/channel-devices@200/network@1 Network interface net1 VIO operation failed because device is being used in LDom ldom1 Failed to remove VNET instance That's what I call a helpful error message - telling me exactly what was wrong. In this case the problem is easily solved. I know this NIC is seen in the guest as net1 so: ldom1 # ifconfig net1 down unplumb Now I can dispose of it, and even the virtual switch I had created for it: primary # ldm rm-vnet vnet19 ldom1 primary # ldm rm-vsw primary-vsw9 If I had to take away the device disruptively, I could have used ldm rm-vnet -f but that could disrupt whoever was using it. It's better if that can be avoided. Summary Oracle VM Server for SPARC provides dynamic reconfiguration, which lets you modify a guest domain's CPU, memory and I/O configuration on the fly without reboot. You can add and remove resources as needed, and even automate this for CPUs by setting up resource policies. Taking things away can be more complicated than giving, especially for devices like disks and networks that may contain application and system state or be involved in a transaction. LDoms and Solaris cooperative work together to coordinate resource allocation and de-allocation in a safe and effective way. For best practices, use dynamic reconfiguration to make the best use of your system's resources.

    Read the article

  • Is there a pedagogical game engine?

    - by K.G.
    I'm looking for a book, website, or other resource that gives modern 3D game engines the same treatment as Operating Systems: Design and Implementation gave operating systems. I have read Jason Gregory's Game Engine Architecture, which I enjoyed. However, by intent the author treated components of the architecture as atomic units, whereas what I'm interested in is the plumbing between those units that makes a coherent whole out of ideally loosely coupled parts. In books such as these, one usually reads that "that's academic," but that's the point! I have also read Julian Gold's Object-oriented Game Development, which likewise was good, but I feel is beginning to show its age. Since even mobile platforms these days are multicore and have fast video memory, those kinds of things (concurrency, display item buffering) would ideally be covered. There are other resources, such as the Doom 3 source code, which is highly instructive for its being a shipped product. The problem with those is as follows: float Q_rsqrt( float number ) { long i; float x2, y; const float threehalfs = 1.5F; x2 = number * 0.5F; y = number; i = * ( long * ) &y; // evil floating point bit level hacking i = 0x5f3759df - ( i >> 1 ); // what the f***? y = * ( float * ) &i; y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration // y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed return y; } To wit, while brilliant, this kind of source requires more enlightenment than I can usually muster upon first read. In summary, here's my white whale: For an adult reader with experience in programming. I wish I could save all the trees killed by every. Single. Game Programming book ever devoting the first two chapters to "Now just what is a variable anyway?" In C or C++, very preferably C++. Languages that are more concise are fantastic for teaching, except for when what you want to learn is how to cope with a verbose language. There is also the benefit of the guardrails that C++ doesn't provide, such as garbage collection. Platform agnostic. I'm sincerely afraid that this book is out there and it's Visual C++/DirectX oriented. I'm a Linux guy, and I'd do what it takes, but I would very much like to be able to use OpenGL. Thanks for everything! Before anyone gets on my case about it, Fast inverse square root was from Quake III Arena, not Doom 3!

    Read the article

  • How granular should a command be in a CQ[R]S model?

    - by Aaronaught
    I'm considering a project to migrate part of our WCF-based SOA over to a service bus model (probably nServiceBus) and using some basic pub-sub to achieve Command-Query Separation. I'm not new to SOA, or even to service bus models, but I confess that until recently my concept of "separation" was limited to run-of-the-mill database mirroring and replication. Still, I'm attracted to the idea because it seems to provide all the benefits of an eventually-consistent system while sidestepping many of the obvious drawbacks (most notably the lack of proper transactional support). I've read a lot on the subject from Udi Dahan who is basically the guru on ESB architectures (at least in the Microsoft world), but one thing he says really puzzles me: As we get larger entities with more fields on them, we also get more actors working with those same entities, and the higher the likelihood that something will touch some attribute of them at any given time, increasing the number of concurrency conflicts. [...] A core element of CQRS is rethinking the design of the user interface to enable us to capture our users’ intent such that making a customer preferred is a different unit of work for the user than indicating that the customer has moved or that they’ve gotten married. Using an Excel-like UI for data changes doesn’t capture intent, as we saw above. -- Udi Dahan, Clarified CQRS From the perspective described in the quotation, it's hard to argue with that logic. But it seems to go against the grain with respect to SOAs. An SOA (and really services in general) are supposed to deal with coarse-grained messages so as to minimize network chatter - among many other benefits. I realize that network chatter is less of an issue when you've got highly-distributed systems with good message queuing and none of the baggage of RPC, but it doesn't seem wise to dismiss the issue entirely. Udi almost seems to be saying that every attribute change (i.e. field update) ought to be its own command, which is hard to imagine in the context of one user potentially updating hundreds or thousands of combined entities and attributes as it often is with a traditional web service. One batch update in SQL Server may take a fraction of a second given a good highly-parameterized query, table-valued parameter or bulk insert to a staging table; processing all of these updates one at a time is slow, slow, slow, and OLTP database hardware is the most expensive of all to scale up/out. Is there some way to reconcile these competing concerns? Am I thinking about it the wrong way? Does this problem have a well-known solution in the CQS/ESB world? If not, then how does one decide what the "right level" of granularity in a Command should be? Is there some "standard" one can use as a starting point - sort of like 3NF in databases - and only deviate when careful profiling suggests a potentially significant performance benefit? Or is this possibly one of those things that, despite several strong opinions being expressed by various experts, is really just a matter of opinion?

    Read the article

  • Data Source Security Part 5

    - by Steve Felts
    If you read through the first four parts of this series on data source security, you should be an expert on this focus area.  There is one more small topic to cover related to WebLogic Resource permissions.  After that comes the test, I mean example, to see with a real set of configuration parameters what the results are with some concrete values. WebLogic Resource Permissions All of the discussion so far has been about database credentials that are (eventually) used on the database side.  WLS has resource credentials to control what WLS users are allowed to access JDBC resources.  These can be defined on the Policies tab on the Security tab associated with the data source.  There are four permissions: “reserve” (get a new connection), “admin”, “shrink”, and reset (plus the all-inclusive “ALL”); we will focus on “reserve” here because we are talking about getting connections.  By default, JDBC resource permissions are completely open – anyone can do anything.  As soon as you add one policy for a permission, then all other users are restricted.  For example, if I add a policy so that “weblogic” can reserve a connection, then all other users will fail to reserve connections unless they are also explicitly added.  The validation is done for WLS user credentials only, not database user credentials.  Configuration of resources in general is described at “Create policies for resource instances” http://docs.oracle.com/cd/E24329_01/apirefs.1211/e24401/taskhelp/security/CreatePoliciesForResourceInstances.html.  This feature can be very useful to restrict what code and users can get to your database. There are the three use cases: API Use database credentials User for permission checking getConnection() True or false Current WLS user getConnection(user,password) False User/password from API getConnection(user,password) True Current WLS user If a simple getConnection() is used or database credentials are enabled, the current user that is authenticated to the WLS system is checked. If database credentials are not enabled, then the user and password on the API are used. Example The following is an actual example of the interactions between identity-based-connection-pooling-enabled, oracle-proxy-session, and use-database-credentials. On the database side, the following objects are configured.- Database users scott; jdbcqa; jdbcqa3- Permission for proxy: alter user jdbcqa3 grant connect through jdbcqa;- Permission for proxy: alter user jdbcqa grant connect through jdbcqa; The following WebLogic Data Source objects are configured.- Users weblogic, wluser- Credential mapping “weblogic” to “scott”- Credential mapping "wluser" to "jdbcqa3"- Data source descriptor configured with user “jdbcqa”- All tests are run with Set Client ID set to true (more about that below).- All tests are run with oracle-proxy-session set to false (more about that below). The test program:- Runs in servlet- Authenticates to WLS as user “weblogic” Use DB Credentials Identity based getConnection(scott,***) getConnection(weblogic,***) getConnection(jdbcqa3,***) getConnection()  true  true Identity scottClient weblogicProxy null weblogic fails - not a db user User jdbcqa3Client weblogicProxy null Default user jdbcqaClient weblogicProxy null  false  true scott fails - not a WLS user User scottClient scottProxy null jdbcqa3 fails - not a WLS user User scottClient scottProxy null  true  false Proxy for scott fails weblogic fails - not a db user User jdbcqa3Client weblogicProxy jdbcqa Default user jdbcqaClient weblogicProxy null  false  false scott fails - not a WLS user Default user jdbcqaClient scottProxy null jdbcqa3 fails - not a WLS user Default user jdbcqaClient scottProxy null If Set Client ID is set to false, all cases would have Client set to null. If this was not an Oracle thin driver, the one case with the non-null Proxy in the above table would throw an exception because proxy session is only supported, implicitly or explicitly, with the Oracle thin driver. When oracle-proxy-session is set to true, the only cases that will pass (with a proxy of "jdbcqa") are the following.1. Setting use-database-credentials to true and doing getConnection(jdbcqa3,…) or getConnection().2. Setting use-database-credentials to false and doing getConnection(wluser, …) or getConnection(). Summary There are many options to choose from for data source security.  Considerations include the number and volatility of WLS and Database users, the granularity of data access, the depth of the security identity (property on the connection or a real user), performance, coordination of various components in the software stack, and driver capabilities.  Now that you have the big picture (remember that table in part 1), you can make a more informed choice.

    Read the article

  • "Hello World" in C++ AMP

    - by Daniel Moth
    Some say that the equivalent of "hello world" code in the data parallel world is matrix multiplication :) Below is the before C++ AMP and after C++ AMP code. For more on what it all means, watch the recording of my C++ AMP introduction (the example below is part of the session). void MatrixMultiply(vector<float>& vC, const vector<float>& vA, const vector<float>& vB, int M, int N, int W ) { for (int y = 0; y < M; y++) { for (int x = 0; x < N; x++) { float sum = 0; for(int i = 0; i < W; i++) { sum += vA[y * W + i] * vB[i * N + x]; } vC[y * N + x] = sum; } } } Change the function to use C++ AMP and hence offload the computation to the GPU, and now the calling code (which I am not showing) needs no changes and the overall operation gives you really nice speed up for large datasets…  #include <amp.h> using namespace concurrency; void MatrixMultiply(vector<float>& vC, const vector<float>& vA, const vector<float>& vB, int M, int N, int W ) { array_view<const float,2> a(M, W, vA); array_view<const float,2> b(W, N, vB); array_view<writeonly<float>,2> c(M, N, vC); parallel_for_each( c.grid, [=](index<2> idx) mutable restrict(direct3d) { float sum = 0; for(int i = 0; i < a.x; i++) { sum += a(idx.y, i) * b(i, idx.x); } c[idx] = sum; } ); } Again, you can understand the elements above, by using my C++ AMP presentation slides and recording… Stay tuned for more… Comments about this post welcome at the original blog.

    Read the article

  • Java @Contented annotation to help reduce false sharing

    - by Dave
    See this posting by Aleksey Shipilev for details -- @Contended is something we've wanted for a long time. The JVM provides automatic layout and placement of fields. Usually it'll (a) sort fields by descending size to improve footprint, and (b) pack reference fields so the garbage collector can process a contiguous run of reference fields when tracing. @Contended gives the program a way to provide more explicit guidance with respect to concurrency and false sharing. Using this facility we can sequester hot frequently written shared fields away from other mostly read-only or cold fields. The simple rule is that read-sharing is cheap, and write-sharing is very expensive. We can also pack fields together that tend to be written together by the same thread at about the same time. More generally, we're trying to influence relative field placement to minimize coherency misses. Fields that are accessed closely together in time should be placed proximally in space to promote cache locality. That is, temporal locality should condition spatial locality. Fields accessed together in time should be nearby in space. That having been said, we have to be careful to avoid false sharing and excessive invalidation from coherence traffic. As such, we try to cluster or otherwise sequester fields that tend to written at approximately the same time by the same thread onto the same cache line. Note that there's a tension at play: if we try too hard to minimize single-threaded capacity misses then we can end up with excessive coherency misses running in a parallel environment. Theres no single optimal layout for both single-thread and multithreaded environments. And the ideal layout problem itself is NP-hard. Ideally, a JVM would employ hardware monitoring facilities to detect sharing behavior and change the layout on the fly. That's a bit difficult as we don't yet have the right plumbing to provide efficient and expedient information to the JVM. Hint: we need to disintermediate the OS and hypervisor. Another challenge is that raw field offsets are used in the unsafe facility, so we'd need to address that issue, possibly with an extra level of indirection. Finally, I'd like to be able to pack final fields together as well, as those are known to be read-only.

    Read the article

  • Criteria for a programming language to be considered "mature"

    - by Giorgio
    I was recently reading an answer to this question, and I was struck by the statement "The language is mature". So I was wondering what we actually mean when we say that "A programming language is mature"? Normally, a programming language is initially developed out of a need, e.g. Try out / implement a new programming paradigm or a new combination of features that cannot be found in existing languages. Try to solve a problem or overcome a limitation of an existing language. Create a language for teaching programming. Create a language that solves a particular class of problems (e.g. concurrency). Create a language and an API for a special application field, e.g. the web (in this case the language might reuse a well-known paradigm, but the whole API must be new). Create a language to push your competitor out of the market (in this case the creator might want the new language to be very similar to an existing one, in order to attract developers to the new programming language and platform). Regardless of what the original motivation and scenario in which a language has been created, eventually some languages are considered mature. In my intuition, this means that the language has achieved (at least one of) its goals, e.g. "We can now use language X as a reliable tool for writing web applications." This is however a bit vague, so I wanted to ask what you consider the most important criteria (if any) that are applied when saying that a language is mature. IMPORTANT NOTE This question is (on purpose) language-agnostic because I am only interested in general criteria. Please write only language-agnostic answers and comments! I am not asking whether any specific "language X is mature" or "which programming languages can be considered mature", or whether "language X is more mature than language Y": please avoid posting any opinions or reference about any specific languages because these are out of the scope of this question. EDIT To make the question more precise, by criteria I mean such things as "tool support", "adoption by the industry", "stability", "rich API", "large user community", "successful application record", "standardization", "clean and uniform semantics", and so on.

    Read the article

  • Programming tourism

    - by Andrew_B
    I'm going on vacation to Paris, France for 10 days. Actually, it's my girlfriend's wish to go there but I'm not very interested in visiting, sightseeing, etc. Recently, I came up with an idea of trying to do something like programming tourism. :) I'd like to do something related to programming in a startup-like company. I do not want a salary or any kind of compensation. I want to overview process, social aspects, environment and "what it feels like" to development software in another country. I'm from Russia. I've been a software developer since 2003. I prefer C#4 but I'm ready to use anything Turing-complete. I have some MS certifications and am familiar with all .NETs since 1.1. Currently I'm finishing PhD in CS. I'm interested in multidimensional indexing and I can turn any piece of data and code to OLAP system. :) But it'd take too much time. What can I do? I have no more than one week. I want a totally complete project in a short amount of time. Implement some features in well-tested project Do a code review Debug memory, performance and concurrency issues Do unit testing So, about the questions: Is it legal? I'm ready to sign NDA if it's necessary. I'll have tourist visa. Is it possible? I'm really sure that bureaucratic companies with lots of HRs and PMs will not allow such experiments. But small companies can afford it. I'm ready to guarantee support on my code after leaving home :) P.S. I still havn't started learning French :) I hope it will not take too much time :) P.P.S. Yes, it's girlfriend-approved. What's in it for me? It's fun. It's fun to see new systems and people who created them. It's fun to complete meaningful things. Quickly. What's in it for them? Feature, debug, review or test. If my short-term colleagues will like this style of working I can invite them to make same trip into my company :) I think in Russia it's even more exciting :)

    Read the article

  • Occasional disk I/O errors in SQLite

    - by Alix Axel
    I have a very simple website running PHP and SQLite 3.7.9 (with PDO). After establishing the SQLite connection I immediately execute the following queries: PRAGMA busy_timeout=0; PRAGMA cache_size=8192; PRAGMA foreign_keys=ON; PRAGMA journal_size_limit=67110000; PRAGMA legacy_file_format=OFF; PRAGMA page_size=4096; PRAGMA recursive_triggers=ON; PRAGMA secure_delete=ON; PRAGMA synchronous=NORMAL; PRAGMA temp_store=MEMORY; PRAGMA journal_mode=WAL; PRAGMA wal_autocheckpoint=4096; This website only has one writer and a few occasional readers, so I don't expect any concurrency problems (and I'm even using WAL). Every couple of days, I've seen this error being reported by PHP: Fatal error: Uncaught exception 'PDOException' with message 'SQLSTATE[HY000]: General error: 10 disk I/O error' in ... Stack trace: #0 ...: PDO-exec('PRAGMA cache_si...') There are several things that make this error very weird to me: it's not a transient problem - no matter how many times I refresh the page, it won't go away the database file is not corrupted - the sqlite3 executable can open the database without problems If the following pragmas are commented out, PHP stops throwing the disk I/O exception: PRAGMA cache_size=8192; PRAGMA synchronous=NORMAL; PRAGMA journal_mode=WAL; Then, after successfully reconnecting to the database, I'm able to reintroduce these pragmas and the code with run smoothly for days - until eventually, the same error will occur without any apparent reason. I wasn't able to reproduce this error so far, so I'm clueless about the origin of it. I'm really curious what may be causing this problem... Any ideas? Environment: Ubuntu Server 12.04 LTS PHP 5.4.15 SQLite 3.7.9 Database size: ? 10MiB Transaction (write) size: ? 1KiB EDIT: Might these symptoms have something to do with busy_timeout?

    Read the article

  • Could not continue scan with NOLOCK due to data movement during installation

    - by dbdev1
    I am running Windows Server 2008 Standard Edition R2 x64 and I installed SQL Server 2008 Developer Edition. All of the preliminary checks run fine (Apart from a warning about Windows Firewall and opening ports which is unrelated to this and shouldn't be an issue - I can open those ports). Half way through the actual installation, I get a popup with this error: Could not continue scan with NOLOCK due to data movement. The installation still runs to completion when I press ok. However, at the end, it states that the following services "failed": database engine services sql server replication full-text search reporting services How do I know if this actually means that anything from my installation (which is on a clean Windows Server setup - nothing else on there, no previous SQL Servers, no upgrades, etc) is missing? I know from my programming experience that locks are for concurrency control and the Microsoft help on this issue points to changing my query's lock/transactions in a certain way to fix the issue. But I am not touching any queries? Also, now that I have installed the app, when I login, I keep getting this message: TITLE: Connect to Server ------------------------------ Cannot connect to MSSQLSERVER. ------------------------------ ADDITIONAL INFORMATION: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server) (Microsoft SQL Server, Error: 67) For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft+SQL+Server&EvtSrc=MSSQLServer&EvtID=67&LinkId=20476 ------------------------------ BUTTONS: OK ------------------------------ I went into the Configuration Manager and enabled named pipes and restarted the service (this is something I have done before as this message is common and not serious). I have disabled Windows Firewall temporarily. I have checked the instance name against the error logs. Please advise on both of these errors. I think these two errors are related. Thanks

    Read the article

  • Apachebench on node.js server returning "apr_poll: The timeout specified has expired (70007)" after ~30 requests

    - by Scott
    I just started working with node.js and doing some experimental load testing with ab is returning an error at around 30 requests or so. I've found other pages showing a lot better concurrency numbers than I am such as: http://zgadzaj.com/benchmarking-nodejs-basic-performance-tests-against-apache-php Are there some critical server configuration settings that need done to achieve those numbers? I've watched memory on top and I still see a decent amount of free memory while running ab, watched mongostat as well and not seeing anything that looks suspicious. The command I'm running, and the error is: ab -k -n 100 -c 10 postrockandbeyond.com/ This is ApacheBench, Version 2.0.41-dev <$Revision: 1.121.2.12 $> apache-2.0 Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Copyright (c) 2006 The Apache Software Foundation, http://www.apache.org/ Benchmarking postrockandbeyond.com (be patient)...apr_poll: The timeout specified has expired (70007) Total of 32 requests completed Does anyone have any suggestions on things I should look in to that may be causing this? I'm running it on osx lion, but have also run the same command on the server with the same results. EDIT: I eventually solved this issue. I was using a TTAPI, which was connecting to turntable.fm through websockets. On the homepage, I was connecting on every request. So what was happening was that after a certain number of connections, everything would fall apart. If you're running into the same issue, check out whether you are hitting external services each request.

    Read the article

  • PostgreSQL lots of writes

    - by strife911
    Hi, I am using postgreSQL for a scientific application (unsupervised clustering). The python program is multi-threaded so that each thread manages its own postmaster process (one per core). Hence, their is a lot of concurrency. Each thread-process loop infinitely though two SQL queries. The first is for reading, the second is for writing. The read operation considers 500 time the amount of rows the write operation considers. Here is the output of dstat: ----total-cpu-usage---- ------memory-usage----- -dsk/total- --paging-- --io/total- usr sys idl wai hiq siq| used buff cach free| read writ| in out | read writ 4 0 32 64 0 0|3599M 63M 57G 1893M|1524k 16M| 0 0 | 98 2046 1 0 35 64 0 0|3599M 63M 57G 1892M|1204k 17M| 0 0 | 68 2062 2 0 32 66 0 0|3599M 63M 57G 1890M|1132k 17M| 0 0 | 62 2033 2 1 32 65 0 0|3599M 63M 57G 1904M|1236k 18M| 0 0 | 80 1994 2 0 31 67 0 0|3599M 63M 57G 1903M|1312k 16M| 0 0 | 70 1900 2 0 37 60 0 0|3599M 63M 57G 1899M|1116k 15M| 0 0 | 71 1594 2 1 37 60 0 0|3599M 63M 57G 1898M| 448k 17M| 0 0 | 39 2001 2 0 25 72 0 0|3599M 63M 57G 1896M|1192k 17M| 0 0 | 78 1946 1 0 40 58 0 0|3599M 63M 57G 1895M| 432k 15M| 0 0 | 38 1937 I am pretty sure I could write more often than that for I have seen it write up to 110-140M on dstat. How can I optimize this process?

    Read the article

  • Could not continue scan with NOLOCK due to data movement during installation

    - by dbdev1
    Hi, I am running Windows Server 2008 Standard Edition R2 x64 and I installed SQL Server 2008 Developer Edition. All of the preliminary checks run fine (Apart from a warning about Windows Firewall and opening ports which is unrelated to this and shouldn't be an issue - I can open those ports). Half way through the actual installation, I get a popup with this error: Could not continue scan with NOLOCK due to data movement. The installation still runs to completion when I press ok. However, at the end, it states that the following services "failed": database engine services sql server replication full-text search reporting services How do I know if this actually means that anything from my installation (which is on a clean Windows Server setup - nothing else on there, no previous SQL Servers, no upgrades, etc) is missing? I know from my programming experience that locks are for concurrency control and the Microsoft help on this issue points to changing my query's lock/transactions in a certain way to fix the issue. But I am not touching any queries? Also, now that I have installed the app, when I login, I keep getting this message: TITLE: Connect to Server Cannot connect to MSSQLSERVER. ADDITIONAL INFORMATION: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server) (Microsoft SQL Server, Error: 67) For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft+SQL+Server&EvtSrc=MSSQLServer&EvtID=67&LinkId=20476 BUTTONS: OK I went into the Configuration Manager and enabled named pipes and restarted the service (this is something I have done before as this message is common and not serious). I have disabled Windows Firewall temporarily. I have checked the instance name against the error logs. Please advise on both of these errors. I think these two errors are related. Thanks

    Read the article

  • How to collect the performance data of a server during an unreachable/down period using Nagios?

    - by gsc-frank
    Some time services and host stop responding due to a poor server performance. I mean, if for some reason (could be lot of concurrency services access, a expensive backup execution on the server or whatever that consume tons of server resources) a server performance is very degraded, that could lead that the server isn't capable to establish any "normal network communication" (without trigger whatever standards timeouts defined for such communication). Knowing host's performance data (cpu, memory, ...) in case of available during that period (host is not down and despite of its performance degradation still allow plugins collect performance data) could be very useful for sysadmin to try to determine what cause the problem, or at least, if the host performance was good and don't interfered at all in the host/service down. This problem could be solved using remote active (NRPE) or remote passive (NSCA) if such remote solutions could store (buffered) perf data to be send to central Nagios server when host performance or network outage allow it. I read the doc of both solutions and can't find any reference to such buffer mechanism neither what happened in case that NSCA can't reach Nagios server. Any idea of how solve this lack of info? so useful for forensic analysis. EDIT: My questions isn about which tools I can use to debug perf problems or gather perf data to analysis, but is about how collect (using Nagios) host perf data even during a network outage for its posterior analysis (kind of forensic analysis). The idea is integrate such data to Nagios graphers like pnp4nagios and NagiosGrapther. I know that I could install tools like Cacti in each of my host, and have a kind of performance data collection redundancy, but I really want avoid that and try to solve all perf analysis requirements with one tools: Nagios

    Read the article

  • how do i write an init script for django-supervisor

    - by amateur
    pardon me as this is my first time attempting to write a init script for centos 5. I am using django + supervisor to manage my celery workers, scheduler. Now, this is my naive simple attempt /etc/init.d/supervisor #!/bin/sh # # /etc/rc.d/init.d/supervisord # # Supervisor is a client/server system that # allows its users to monitor and control a # number of processes on UNIX-like operating # systems. # # chkconfig: - 64 36 # description: Supervisor Server # processname: supervisord # Source init functions /home/foo/virtualenv/property_env/bin/python /home/foo/bar/manage.py supervisor --daemonize inside my supervisor.conf: [program:celerybeat] command=/home/property/virtualenv/property_env/bin/python manage.py celerybeat --loglevel=INFO --logfile=/home/property/property_buyer/logfiles/celerybeat.log [program:celeryd] command=/home/foo/virtualenv/property_env/bin/python manage.py celeryd --loglevel=DEBUG --logfile=/home/foo/bar/logfiles/celeryd.log --concurrency=1 -E [program:celerycam] command=/home/foo/virtualenv/property_env/bin/python manage.py celerycam I couldn't get it to work. 2013-08-06 00:21:03,108 INFO exited: celerybeat (exit status 2; not expected) 2013-08-06 00:21:06,114 INFO spawned: 'celeryd' with pid 11772 2013-08-06 00:21:06,116 INFO spawned: 'celerycam' with pid 11773 2013-08-06 00:21:06,119 INFO spawned: 'celerybeat' with pid 11774 2013-08-06 00:21:06,146 INFO exited: celerycam (exit status 2; not expected) 2013-08-06 00:21:06,147 INFO gave up: celerycam entered FATAL state, too many start retries too quickly 2013-08-06 00:21:06,147 INFO exited: celeryd (exit status 2; not expected) 2013-08-06 00:21:06,152 INFO gave up: celeryd entered FATAL state, too many start retries too quickly 2013-08-06 00:21:06,152 INFO exited: celerybeat (exit status 2; not expected) 2013-08-06 00:21:07,153 INFO gave up: celerybeat entered FATAL state, too many start retries too quickly I believe it is the init script, but please help me understand what is wrong.

    Read the article

  • Samba PDC share slow with LDAP backend

    - by hmart
    The scenario I have a SUSE SLES 11.1 SP1 machine as Samba master PDC with LDAP backend. In one share there are Database files for a Client-Server application. I log XP and Windows 7 machines to the local domain (example.local), the login is a little slow but works. In the client computers have an executable which opens, reads and writes the database files from the server share. The Problem When running Samba with LDAP password backend the client application runs VERY SLOW with a maximum transfer rate of 2500 MBit per second. If disable LDAP the client app speed increases 20x, with transfer rate of 50Mbit/sec and running smoothly. I'm doing test with just two users and two machines, so concurrency, or LDAP size shouldn't be the problem here. The suspect LDAP, Smb.conf [global] section configuration. The Question What can I do? I've googled a lot, but still have no answer. Slow smb.conf WITH LDAP [global] workgroup = zmartsoft.local passdb backend = ldapsam:ldap://127.0.0.1 printing = cups printcap name = cups printcap cache time = 750 cups options = raw map to guest = Bad User logon path = \\%L\profiles\.msprofile logon home = \\%L\%U\.9xprofile logon drive = P: usershare allow guests = Yes add machine script = /usr/sbin/useradd -c Machine -d /var/lib/nobody -s /bin/false %m$ domain logons = Yes domain master = Yes local master = Yes netbios name = server os level = 65 preferred master = Yes security = user wins support = Yes idmap backend = ldap:ldap://127.0.0.1 ldap admin dn = cn=Administrator,dc=zmartsoft,dc=local ldap group suffix = ou=Groups ldap idmap suffix = ou=Idmap ldap machine suffix = ou=Machines ldap passwd sync = Yes ldap ssl = Off ldap suffix = dc=zmartsoft,dc=local ldap user suffix = ou=Users

    Read the article

  • Performance: Nginx SSL slowness or just SSL slowness in general?

    - by Mauvis Ledford
    I have an Amazon Web Services setup with an Apache instance behind Nginx with Nginx handling SSL and serving everything but the .php pages. In my ApacheBench tests I'm seeing this for my most expensive API call (which cache via Memcached): 100 concurrent calls to API call (http): 115ms (median) 260ms (max) 100 concurrent calls to API call (https): 6.1s (median) 11.9s (max) I've done a bit of research, disabled the most expensive SSL ciphers and enabled SSL caching (I know it doesn't help in this particular test.) Can you tell me why my SSL is taking so long? I've set up a massive EC2 server with 8CPUs and even applying consistent load to it only brings it up to 50% total CPU. I have 8 Nginx workers set and a bunch of Apache. Currently this whole setup is on one EC2 box but I plan to split it up and load balance it. There have been a few questions on this topic but none of those answers (disable expensive ciphers, cache ssl, seem to do anything.) Sample results below: $ ab -k -n 100 -c 100 https://URL This is ApacheBench, Version 2.3 <$Revision: 655654 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/ Benchmarking URL.com (be patient).....done Server Software: nginx/1.0.15 Server Hostname: URL.com Server Port: 443 SSL/TLS Protocol: TLSv1/SSLv3,AES256-SHA,2048,256 Document Path: /PATH Document Length: 73142 bytes Concurrency Level: 100 Time taken for tests: 12.204 seconds Complete requests: 100 Failed requests: 0 Write errors: 0 Keep-Alive requests: 0 Total transferred: 7351097 bytes HTML transferred: 7314200 bytes Requests per second: 8.19 [#/sec] (mean) Time per request: 12203.589 [ms] (mean) Time per request: 122.036 [ms] (mean, across all concurrent requests) Transfer rate: 588.25 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 65 168 64.1 162 268 Processing: 385 6096 3438.6 6199 11928 Waiting: 379 6091 3438.5 6194 11923 Total: 449 6264 3476.4 6323 12196 Percentage of the requests served within a certain time (ms) 50% 6323 66% 8244 75% 9321 80% 9919 90% 11119 95% 11720 98% 12076 99% 12196 100% 12196 (longest request)

    Read the article

  • Calling Excel from PHP 5 through COM fails on Windows 7 when Apache started through Task Planner

    - by Stefan Pantke
    I currently write an application, which controls Excel through COM: The app creates a COM-based Excel instance, opens some XLS files and reads their contents. Scenario I On Windows 7, I start Apache and mySQL using xmapp-control with system administrator rights. All works as expected. The PHP-based controller script interacts with Excel as expected. Scenario II A problem appears, if I start Apache and mySQL as 'background jobs'. Here is how: I created two jobs using Windows 7 Task Planner. One runs apache_start.bat, the other runs mysql_start.bat. Both tasks run as SYSTEM with elevated privileges when Windows 7 boots. Apache and mySQL work as expected. Specifically, Apache serves HTTP request from clients and PHP is able to talk to mySQL. When I call the PHP controller, which calls and interacts with Excel using COM, I do receive an error. The error seems to come from Excel [not COM itself] and reads like this: Excel can't read the XLS-file Excel failed to save the file due to an ill-name worksheet Interestingly, during the first run of the PHP-based controller script, it takes a few seconds to render the error message. Each subsequent run immediately renders the error message. Windows system logs didn't show a single problem report entry. Note, that the PHP program and the Apache instance didn't change - except the way Apache was started. At least the PHP controller script is perfectly able to read the file-system, since it provides the pathes to the XLS-file through scandir() of a certain directory. Concurrency issues can't be the cause of the problem. A single instance of the specific PHP controller interacts with Excel. Question Could someone provide details, why this happens?

    Read the article

  • Celery daemon as a Ubuntu service does not consume tasks while running from terminal does

    - by Guy
    On Ubuntu 11.10, I have to issue python tasks from django using celery. I'm currently testing on the same machine but eventually the celery worker should run on a remote machine. django uses the following settings: BROKER_HOST = "127.0.0.1" BROKER_PORT = 5672 BROKER_VHOST = "/my_vhost" BROKER_USER = "celery" BROKER_PASSWORD = "celery" I can also see my task queued in http://localhost:55672/#/queues the celery daemon uses the following configuration (celeryconfig.py): BROKER_HOST = "127.0.0.1" BROKER_PORT = 5672 BROKER_USER = "celery" BROKER_PASSWORD = "celery" BROKER_VHOST = "/my_vhost" CELERY_RESULT_BACKEND = "amqp" import os import sys sys.path.append(os.getcwd()) CELERY_IMPORTS = ("tasks", ) running celeryd -l info works well and now I want to run it as a service. I've followed the instructions from http://ask.github.com/celery/cookbook/daemonizing.html and now I'm trying to run it using: sudo /etc/init.d/celeryd start But the message is not being consumed, no error in the celery log either. /etc/default/celeryd CELERYD_NODES="w1" CELERYD_CHDIR="/path/to/django/project" CELERYD_OPTS="--time-limit=300 --concurrency=1" CELERY_CONFIG_MODULE="celeryconfig" # %n will be replaced with the nodename. CELERYD_LOG_FILE="/var/log/celery/%n.log" CELERYD_PID_FILE="/var/run/celery/%n.pid" # Workers should run as an unprivileged user. CELERYD_USER="celery" CELERYD_GROUP="celery" I've also created user celery in Ubuntu not sure if its necessary. Any help will be appreciated, Thanks, Guy

    Read the article

  • Server Recovery from Denial of Service

    - by JMC
    I'm looking at a server that might be misconfigured to handle Denial of Service. The database was knocked offline after the attack, and was unable to restart itself after it failed to restart when the attack subsided. Details of the Attack: The Attacker either intentionally or unintentionally sent 1000's of search queries using the applications search query url within a couple of seconds. It looks like the server was overwhelmed and it caused the database to log this message: Server Specs: 1.5GB of dedicated memory Are there any obvious mis-configurations here that I'm missing? **mysql.log** 121118 20:28:54 mysqld_safe Number of processes running now: 0 121118 20:28:54 mysqld_safe mysqld restarted 121118 20:28:55 [Warning] option 'slow_query_log': boolean value '/var/log/mysqld.slow.log' wasn't recognized. Set to OFF. 121118 20:28:55 [Note] Plugin 'FEDERATED' is disabled. 121118 20:28:55 InnoDB: The InnoDB memory heap is disabled 121118 20:28:55 InnoDB: Mutexes and rw_locks use GCC atomic builtins 121118 20:28:55 InnoDB: Compressed tables use zlib 1.2.3 121118 20:28:55 InnoDB: Using Linux native AIO 121118 20:28:55 InnoDB: Initializing buffer pool, size = 512.0M InnoDB: mmap(549453824 bytes) failed; errno 12 121118 20:28:55 InnoDB: Completed initialization of buffer pool 121118 20:28:55 InnoDB: Fatal error: cannot allocate memory for the buffer pool 121118 20:28:55 [ERROR] Plugin 'InnoDB' init function returned error. 121118 20:28:55 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed. 121118 20:28:55 [ERROR] Unknown/unsupported storage engine: InnoDB 121118 20:28:55 [ERROR] Aborting **ulimit -a** core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 13089 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 1024 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited **httpd.conf** StartServers 10 MinSpareServers 8 MaxSpareServers 12 ServerLimit 256 MaxClients 256 MaxRequestsPerChild 4000 **my.cnf** innodb_buffer_pool_size=512M # Increase Innodb Thread Concurrency = 2 * [numberofCPUs] + 2 innodb_thread_concurrency=4 # Set Table Cache table_cache=512 # Set Query Cache_Size query_cache_size=64M query_cache_limit=2M # A sort buffer is used for optimizing sorting sort_buffer_size=8M # Log slow queries slow_query_log=/var/log/mysqld.slow.log long_query_time=2 #performance_tweak join_buffer_size=2M **php.ini** memory_limit = 128M post_max_size = 8M

    Read the article

  • Apache https is slsow

    - by raucous12
    Hey, I've set apache up to use SSL with a self signed certificate. With http (KeepAlive off), I can get over 5000 requests per second. However, with https, I can only get 13 requests per second. I know there is supposed to be a bit of an overhead, but this seems abnormal. Can anyone suggest how I might go about debugging this. Here is the ab log for https: Server Software: Apache/2.2.3 Server Hostname: 127.0.0.1 Server Port: 443 SSL/TLS Protocol: TLSv1/SSLv3,DHE-RSA-AES256-SHA,4096,256 Document Path: /hello.html Document Length: 29 bytes Concurrency Level: 5 Time taken for tests: 30.49425 seconds Complete requests: 411 Failed requests: 0 Write errors: 0 Total transferred: 119601 bytes HTML transferred: 11919 bytes Requests per second: 13.68 [#/sec] (mean) Time per request: 365.565 [ms] (mean) Time per request: 73.113 [ms] (mean, across all concurrent requests) Transfer rate: 3.86 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 190 347 74.3 333 716 Processing: 0 14 24.0 1 166 Waiting: 0 11 21.6 0 165 Total: 191 361 80.8 345 716 Percentage of the requests served within a certain time (ms) 50% 345 66% 377 75% 408 80% 421 90% 468 95% 521 98% 578 99% 596 100% 716 (longest request)

    Read the article

  • Bonnie does not provide speed for Sequential Input / Block

    - by Lqp1
    I'm using ProxmoxVE and I would like to run some benchmarks regarding performances of this product. One of these benchmarks is bonnie++ ; it runs very well in a VM (qemu-kvm) but when I run it in a conainer (openVZ), it does not provide me reading speed (only writing). I don't understand why... Does anyone know what's happenning ? VMs ans Containers are Debian 7.4. Here's the output of bonnie in the container: root@ct2:/# bonnie++ -u root Using uid:0, gid:0. Writing a byte at a time...done Writing intelligently...done Rewriting...done Reading a byte at a time...done Reading intelligently...done start 'em...done...done...done...done...done... Create files in sequential order...done. Stat files in sequential order...done. Delete files in sequential order...done. Create files in random order...done. Stat files in random order...done. Delete files in random order...done. Version 1.96 ------Sequential Output------ --Sequential Input- --Random- Concurrency 1 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP ct2 1G 843 99 59116 8 60351 4 4966 99 +++++ +++ 2745 8 Latency 9558us 3582ms 527ms 1672us 936us 5248us Version 1.96 ------Sequential Create------ --------Random Create-------- ct2 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ Latency 19567us 358us 368us 107us 59us 25us 1.96,1.96,ct2,1,1401810323,1G,,843,99,59116,8,60351,4,4966,99,+++++,+++,2745,8,16,,,,,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,9558us,3582ms,527ms,1672us,936us,5248us,19567us,358us,368us,107us,59us,25us The filesystem for / is of type "simfs", which is a pseudo filesystem for openVZ. Maybe it's related to this issue but I can't find anyone with the same issue with bonnie and openVZ... Thanks for your help. Regards, Thomas.

    Read the article

  • Apache https is slow

    - by raucous12
    Hey, I've set apache up to use SSL with a self signed certificate. With https (KeepAlive on), I can get over 3000 requests per second. However, with https (KeepAlive off), I can only get 13 requests per second. I know there is supposed to be a bit of an overhead, but this seems abnormal. Can anyone suggest how I might go about debugging this. Here is the ab log for https: Server Software: Apache/2.2.3 Server Hostname: 127.0.0.1 Server Port: 443 SSL/TLS Protocol: TLSv1/SSLv3,DHE-RSA-AES256-SHA,4096,256 Document Path: /hello.html Document Length: 29 bytes Concurrency Level: 5 Time taken for tests: 30.49425 seconds Complete requests: 411 Failed requests: 0 Write errors: 0 Total transferred: 119601 bytes HTML transferred: 11919 bytes Requests per second: 13.68 [#/sec] (mean) Time per request: 365.565 [ms] (mean) Time per request: 73.113 [ms] (mean, across all concurrent requests) Transfer rate: 3.86 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 190 347 74.3 333 716 Processing: 0 14 24.0 1 166 Waiting: 0 11 21.6 0 165 Total: 191 361 80.8 345 716 Percentage of the requests served within a certain time (ms) 50% 345 66% 377 75% 408 80% 421 90% 468 95% 521 98% 578 99% 596 100% 716 (longest request)

    Read the article

  • amazon ec2-medium apache requests per second terrible

    - by TheDayIsDone
    EDITED -- test running from localhost now to rule out network... i have a c1.medium using EBS. when i do an apache benchmark and i'm just printing a "hello" for the test from localhost - no database hits, it's very slow. i can repeat this test many times with the same results. any thoughts? thanks in advance. ab -n 1000 -c 100 http://localhost/home/test/ Benchmarking localhost (be patient) Completed 100 requests Completed 200 requests Completed 300 requests Completed 400 requests Completed 500 requests Completed 600 requests Completed 700 requests Completed 800 requests Completed 900 requests Completed 1000 requests Finished 1000 requests Server Software: Apache/2.2.23 Server Hostname: localhost Server Port: 80 Document Path: /home/test/ Document Length: 5 bytes Concurrency Level: 100 Time taken for tests: 25.300 seconds Complete requests: 1000 Failed requests: 0 Write errors: 0 Total transferred: 816000 bytes HTML transferred: 5000 bytes Requests per second: 39.53 [#/sec] (mean) Time per request: 2530.037 [ms] (mean) Time per request: 25.300 [ms] (mean, across all concurrent requests) Transfer rate: 31.50 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 7 21.0 0 73 Processing: 81 2489 665.7 2500 4057 Waiting: 80 2443 654.0 2445 4057 Total: 85 2496 653.5 2500 4057 Percentage of the requests served within a certain time (ms) 50% 2500 66% 2651 75% 2842 80% 2932 90% 3301 95% 3506 98% 3762 99% 3838 100% 4057 (longest request)

    Read the article

  • Strange 3-second tcp connection latencies (Linux, HTTP)

    - by user25417
    Our webservers with static content are experiencing strange 3 second latencies occasionally. Typically, an ApacheBench run ( 10000 requests, concurrency 1 or 40, no difference, but keepalive off) looks like this: Connection Times (ms) min mean[+/-sd] median max Connect: 2 10 152.8 3 3015 Processing: 2 8 34.7 3 663 Waiting: 2 8 34.7 3 663 Total: 4 19 157.2 6 3222 Percentage of the requests served within a certain time (ms) 50% 6 66% 7 75% 7 80% 7 90% 9 95% 11 98% 223 99% 225 100% 3222 (longest request) I have tried many things: - Apache2 2.2.9 with worker or prefork MPM, no difference (with KeepAliveTimeout 10-15) - Nginx 0.6.32 - various tcp parameters (net.core.somaxconn=3000, net.ipv4.tcp_sack=0, net.ipv4.tcp_dsack=0) - putting the files/DocumentRoot on tmpfs - shorewall on or off (i.e. empty iptables or not) - AllowOverride None is on for /, so no .htaccess checks (verified with strace) - the problem persists whether the webservers are accessed directly or through a Foundry load balancer Kernel is 2.6.32 (Debian Lenny backports), but it occurred with 2.6.26 also. IPv6 is enabled, but not used. Does the issue look familiar to anyone? Help/suggestions are much appreciated. It sounds a bit like a SYN,ACK packet getting lost or ignored.

    Read the article

< Previous Page | 26 27 28 29 30 31 32 33 34 35 36  | Next Page >