Search Results

Search found 1914 results on 77 pages for 'mongrel cluster'.

Page 43/77 | < Previous Page | 39 40 41 42 43 44 45 46 47 48 49 50  | Next Page >

  • Further Performance Tuning on Medium SharePoint Farm?

    - by elorg
    I figured I would post this here, since it may be related more to the server configuration than the SharePoint configuration or a combination of both? I'm open for ideas to try, or even feedback on things that maybe have been configured incorrectly as far as performance is concerned. We have a medium MOSS 2007 install prepped and ready for receiving the WSS 2003 data to upgrade. The environment was originally architected by a previous coworker, and I have since added a few configuration modifications to assist with performance before we finally performed the install. When testing the new site collections & SharePoint install (no actual data yet), things seemed a bit slow. I had assumed that it was because I was accessing it remotely. Apparently the client is still experiencing this and it is unacceptably slow. 1 SQL Server running SQL Server 2008 2x SharePoint WFEs - hosting queries (no index) 1x SharePoint Index - hosting index (no queries) MOSS 2007 installed and patched up through December '09 on WFEs & Index All 4 servers are VMs, should have more than sufficient disk space & RAM (don't recall at the moment), and are running Windows Server 2008 - everything is 64-bit. The WFEs have Windows NLB configured, with a DNS name & IP for the NLB cluster. Single NIC on each server (virtual, since VMWare). The Index server is configured as a WFE (outside of the NLB cluster) so that it can index itself and replicate the indexes to the WFEs that will serve the queries. Everything is configured & working properly - it just takes a minute or two to load a page on the local LAN. The client is still using their old portal (we haven't started the migration/upgrade just yet) so there's virtually no data or users. We need to either further tune the configuration, or fix anything that may have been configured incorrectly which is causing this slowness? I've already reviewed & taken into account everything that I could find that was relevant before we even started the install. Does anyone have ideas or pointers? Perhaps there's something that I've missed?

    Read the article

  • How does NMap decide to print a progress line?

    - by Andrew Bolster
    Checking a larger subnet than I normally do; mapping out a cluster suite in a university for a traffic mapping project (permission attained), and I was wondering something. NMap usually prints its progress periodically, but I'm unclear to what that 'periodically' is, because the cirrent scan printed a line for basically every 100th of a percent up to 1% done, then one at 1.5%, and has said nothing since. I suspect that it changes at different 'levels' but does anyone have an actual answer?

    Read the article

  • How to scale out image hosting/serving?

    - by Continuation
    I asked this question on stackoverflow and it was suggested that I try it here: I'm building a website where users can upload photos and I'd also convert uploaded photos into thumbnails. Planning ahead, if the website gets popular, how do I scale it out so that the images (both original and thumbnails) will be stored in and served from multiple servers? Maybe a cluster? Is there any open source software that would help me in this? Thanks.

    Read the article

  • Proxy Access to my Squid Proxy

    - by Fake4d
    I have a squid proxy cluster to let my users surf in the internet and on intranet ressources. Now there is a special user, that wants to configure another squid in the net of the users. So this proxy wants to access the internet over a proxy-proxy configuration. It doesnt work at the moment. So here is the question: Whats the configuration line for my squid.conf to allow an IP to use my squid as an upstream proxy?

    Read the article

  • Invalid AuthenticityToken everywhere

    - by bwizzy
    I have a rails app that I just deployed which is generating Invalid AuthenticityToken errors anywhere a form is submitted. The app uses subdomains as account names and will also eventually allow for a custom domain to be entered. I have an entry in production.rb to allow for cross-domain session handling. The problem is that you can't login / or submit any form because everything raises an Invalid AuthenticityToken error. The issue looks similar but not the same as http://stackoverflow.com/questions/1201901/rails-invalid-authenticity-token-after-deploy plus I'm not using mongrel. I've tried clearing cookies in the browser, and restarting passenger but no luck. Anyone have any ideas? The server is running nginx + passenger 2.3.11, and Rails 2.3.5. #production.rb config.action_controller.session[:domain] = '.domain.com' #environment.rb config.action_controller.session = { :session_key => '_app_session', :secret => '.... nums and chars .....' }

    Read the article

  • Mercurial clone operation works, but I don't have write access

    - by normalocity
    Somewhere I did something silly. I was deploying my Rails app via cloning the Mercurial repo down onto my Ubuntu server. It worked the first time, and then...well, I made a small change on my dev machine, pushed the changes to the repo, and then deleted the copy on the Ubuntu server and re-cloned from the repo. The clone operation (the second, and third, and 'n' times) works without error, but I don't have write access to the files that were cloned. When I try to startup my mongrel - it can't create the /tmp folder, and because of no write access, fails to start the Rails app.

    Read the article

  • access_log item w/out IP. Starts with "::1 - - [<date>]"

    - by Meltemi
    Looking at our Apache log I see normal requests like: 174.133.xxx.xxx - - [20/May/2010:17:36:44 -0700] "GET /index.html HTTP/1.1" 200 2004 but every so often i get a cluster of these w/out an IP address. ::1 - - [20/May/2010:18:47:21 -0700] "OPTIONS * HTTP/1.0" 200 - ::1 - - [20/May/2010:18:47:22 -0700] "OPTIONS * HTTP/1.0" 200 - ::1 - - [20/May/2010:18:47:23 -0700] "OPTIONS * HTTP/1.0" 200 - what do they mean and curious what causes them?

    Read the article

  • Can I use ext3 as a shared fs if I add a lock manager ?

    - by edomaur
    I need a cluster filesystem for an iSCSI device. The problem is that the servers to which it is connected generate datafiles which must be read by every other servers. Except for the writing and deleting of such files, I do not need a full locking scheme like in OCFS2 or GFS2. So, can I use a distributed lock manager (DLM) on top of an ext3 filesystem or must I use only specialized filesystem ?

    Read the article

  • HP StorageWorks P4500 G2 Manager Management

    - by MDMarra
    According to the documentation, a management group should have an odd number of managers greater than 1. I have a four node SAN consisting of P4500 G2s. I plan on having two clusters with two nodes each in this management group, i.e.: -Managent_Group1 -Cluster1 -Node1 -Node2 -Cluster2 -Node3 -Node4 Are there any issues running standard managers on Node1, Node2, and Node3? After reading the documentation, I'm still unclear about whether or not cluster membership matters in quorum consistency, or if they don't matter at all.

    Read the article

  • openvz graphing user_beancounters

    - by Jona
    Hi there, We run a cluster of openvz servers and are looking for a way to automatically graph the content of user_beancounters for all ves. We currently have a fairly rudimentary cron which alerts us when limits are hit but we could like a graphing solution to show us history. Obviously we could roll our own using some fancy bash/php/perl and rrdtool but we're wondering if there are any existing solutions before we go down this path. We current run a cacti/snmp based graphing infrastructure.

    Read the article

  • Web hosting colo stack, what do i need

    - by james
    I'm looking to colocate a website, obviously running a few apache servers, mysql cluster, SAN servers and load balancer. What else do I need to protect the servers and what would be the best method for DNS as all the web side of things will be in the colo centre and email hosted with another company? Thanks in advance

    Read the article

  • Hot deploy on Heroku with no downtime

    - by zetarun
    A bad side of pushing to Heroku is that I must push the code (and the server restarts automatically) before running my db migrations. This can obviously cause some 500 errors on users navigating the website having the new code without the new tables/attributes: the solution proposed by Heroku is to use the maintenance mode, but I want a way with no downside letting my webapp running everytime! Is there a way? For example with Capistrano: I prepare the code to deploy in a new dir I run (backward) migrations and the old code continue to work perfectly I swith mongrel instance to the new dir and restart the server ...and I have no downtime!

    Read the article

  • Corosync :: Restarting some resources after Lan connectivity issue

    - by moebius_eye
    I am currently looking into corosync to build a two-node cluster. So, I've got it working fine, and it does what I want to do, which is: Lost connectivity between the two nodes gives the first node '10node' both Failover Wan IPs. (aka resources WanCluster100 and WanCluster101 ) '11node' does nothing. He "thinks" he still has his Failover Wan IP. (aka WanCluster101) But it doesn't do this: '11node' should restart the WanCluster101 resource when the connectivity with the other node is back. This is to prevent a condition where node10 simply dies (and thus does not get 11node's Failover Wan IP), resulting in a situation where none of the nodes have 10node's failover IP because 10node is down 11node has "given back" his failover Wan IP. Here's the current configuration I'm working on. node 10sch \ attributes standby="off" node 11sch \ attributes standby="off" primitive LanCluster100 ocf:heartbeat:IPaddr2 \ params ip="172.25.0.100" cidr_netmask="32" nic="eth3" \ op monitor interval="10s" \ meta is-managed="true" target-role="Started" primitive LanCluster101 ocf:heartbeat:IPaddr2 \ params ip="172.25.0.101" cidr_netmask="32" nic="eth3" \ op monitor interval="10s" \ meta is-managed="true" target-role="Started" primitive Ping100 ocf:pacemaker:ping \ params host_list="192.0.2.1" multiplier="500" dampen="15s" \ op monitor interval="5s" \ meta target-role="Started" primitive Ping101 ocf:pacemaker:ping \ params host_list="192.0.2.1" multiplier="500" dampen="15s" \ op monitor interval="5s" \ meta target-role="Started" primitive WanCluster100 ocf:heartbeat:IPaddr2 \ params ip="192.0.2.100" cidr_netmask="32" nic="eth2" \ op monitor interval="10s" \ meta target-role="Started" primitive WanCluster101 ocf:heartbeat:IPaddr2 \ params ip="192.0.2.101" cidr_netmask="32" nic="eth2" \ op monitor interval="10s" \ meta target-role="Started" primitive Website0 ocf:heartbeat:apache \ params configfile="/etc/apache2/apache2.conf" options="-DSSL" \ operations $id="Website-one" \ op start interval="0" timeout="40" \ op stop interval="0" timeout="60" \ op monitor interval="10" timeout="120" start-delay="0" statusurl="http://127.0.0.1/server-status/" \ meta target-role="Started" primitive Website1 ocf:heartbeat:apache \ params configfile="/etc/apache2/apache2.conf.1" options="-DSSL" \ operations $id="Website-two" \ op start interval="0" timeout="40" \ op stop interval="0" timeout="60" \ op monitor interval="10" timeout="120" start-delay="0" statusurl="http://127.0.0.1/server-status/" \ meta target-role="Started" group All100 WanCluster100 LanCluster100 group All101 WanCluster101 LanCluster101 location AlwaysPing100WithNode10 Ping100 \ rule $id="AlWaysPing100WithNode10-rule" inf: #uname eq 10sch location AlwaysPing101WithNode11 Ping101 \ rule $id="AlWaysPing101WithNode11-rule" inf: #uname eq 11sch location NeverLan100WithNode11 LanCluster100 \ rule $id="RAND1083308" -inf: #uname eq 11sch location NeverPing100WithNode11 Ping100 \ rule $id="NeverPing100WithNode11-rule" -inf: #uname eq 11sch location NeverPing101WithNode10 Ping101 \ rule $id="NeverPing101WithNode10-rule" -inf: #uname eq 10sch location Website0NeedsConnectivity Website0 \ rule $id="Website0NeedsConnectivity-rule" -inf: not_defined pingd or pingd lte 0 location Website1NeedsConnectivity Website1 \ rule $id="Website1NeedsConnectivity-rule" -inf: not_defined pingd or pingd lte 0 colocation Never -inf: LanCluster101 LanCluster100 colocation Never2 -inf: WanCluster100 LanCluster101 colocation NeverBothWebsitesTogether -inf: Website0 Website1 property $id="cib-bootstrap-options" \ dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \ cluster-infrastructure="openais" \ expected-quorum-votes="2" \ no-quorum-policy="ignore" \ stonith-enabled="false" \ last-lrm-refresh="1408954702" \ maintenance-mode="false" rsc_defaults $id="rsc-options" \ resource-stickiness="100" \ migration-threshold="3" I also have a less important question concerning this line: colocation NeverBothLans -inf: LanCluster101 LanCluster100 How do I tell it that this collocation only applies to '11node'.

    Read the article

  • Nginx and Passenger deploy issue

    - by merlin
    Currently I can only get the default nginx page to come up on my domain name. I am pretty sure the error is either in the /etc/hosts file or the enginx.config file. my /etc/hosts file is 127.0.0.1 localhost.localdomain localhost myip server.mydomain.com server and nginx.config is: server { listen 80; server_name server.mydomain.com; root /whatever/pulic; passenger_enabled on; rails_env production; I don't get any errors in the log. Incidentally I can run mongrel and on mydomain:3000 see the application there.

    Read the article

  • Operative systems on SD cards

    - by HisDudeness
    I was getting some wild ideas the last days, like putting some operative systems into SD cards rather than on my hard drive. I'll go further into details now and explain what lead me to consider this probably abominable decision. I am on a laptop (that means I have a native SD-card reader) which is currently running a cross-distro setup, with a bunch of Linux systems (placed in dedicated ext4 logical partitions into a huge extended one) regulated by an unique GRUB. Since today, my laptop haven't even seen any Windows system with binoculars. I was thinking about placing all the os part of my setup into a Secure Digital to save all my 500 Gb Hard Drive for documents, music, videos and so on, and being able to just remove the SD and boot my system into another computer too, as well as having the possibility of booting other systems into mine by just plugging in another SD, without having to keep it constantly placed in my PC. Also, in the remote case in the near future I just wanted to boot Windows 8 in it, I read it causes major boot incompatibility issues with other systems by needing a digital signature in order for them to start. By having it in a removable drive, I could just get rid of it when I'm needing him and switch its card with Linux one, and so not having any obstacles to their boot. Now, my questions are: I know unlikely traditional rotating disk drives, integrated circuits ones have a limited lifespan in terms of cluster rewriting. Is it an obstacle to that kind of usage? I mean, some Ultrabooks are using SSD now, is it the same issue, or there are some differences between Solid State Drives and Secure Digitals in that sense? Maybe having them to store system files which are in fixed positions (making the even-usage of cluster technology useless) constantly being re-read and updated and similar things just gets them soon unserviceable, do it? Second question: are all motherboards and BIOSes able to boot from SDs just like they are from USB pen drives (I mean, provided card reader is USB-connected, isn't it)? Or can't bootloaders like GRUB be installed on SDs working? If they can't, is it a solution installing GRUB to MBR and making boot option pointing to SD? Will it work? Are there any other problems to installing OSs on a Secure Digital?

    Read the article

  • Transient network dropout for Xen DomU's

    - by Stephen C
    We've got a CentOS server running a cluster of virtuals. Occasionally the cluster's internal network drops out for a minute or so ... and then comes back. The problem is somehow related to the actual network traffic, but it is not a simple load issue. (The system is generally lightly loaded, and the problem occurs irrespective of actual load.) The setup: CentOS 5.6 on Dom0, various CentOS on the DomU's Hardware - a Dell R710 with a BroadCom NextXpress 2 NIC (sigh) using the latest drivers for the NIC from BroadCom Xen configured to use network-bridge and vif-bridge Some iptable tweaks to route an unrelated port to one of the virtuals. The system has one externally visible IP address, and Dom0 runs an Apache httpd configured with a number of virtual hosts each of which reverse proxies to web servers running on the virtuals. (The virtuals have to be NAT'ed, primarily because we don't have enough allocated public IP addresses.) The symptoms: Works fine most of the time. When someone tries to UPLOAD a large file to one virtuals, the internal network drops out ... for all virtuals: The Dom0 httpd sees a network timeout talking to the backend server on the virtual and reports a 502. A previously established ssh connection from Dom0 to any of the DomU's freezes. Our monitoring shows ping failures for traffic between virtuals. The Xen consoles to the DomU's do not freeze. No log messages in any log files that I can see, on either Dom0 or the DomU's ... apart from the Dom0 httpd logs. After a minute or so, the problem clears by itself. This is 100% reproducible. What we've tried: Downloading, building and installing the latest BNX2 driver on Dom0 Turning off MSI on the NIC - adding "options bnx2 disable_msi=1" to /etc/modprobe.conf Turning off tcp segmentation offload - "ethtool -K eth0 tso off". Sacrificing a black rooster at midnight. I've exhausted all my options apart from switching to KVM ... or slaughtering more roosters. Any suggestions?

    Read the article

  • Uneven Cassandra load

    - by David Keen
    Should a three node Cassandra cluster with a replication factor of 3 have the same load value for all three nodes? We are using a random partitioner and NetworkTopologyStrategy. Nodetool ring shows equal values for "Owns" but unequal values for "Load". Load Owns Token 113427455640312821154458202477256070484 16.53 GB 33.33% 0 14.8 GB 33.33% 56713727820156410577229101238628035242 15.65 GB 33.33% 113427455640312821154458202477256070484 Running nodetool repair and cleanup on each node brought the load a little closer but it still seems quite unbalanced. Is this considered normal?

    Read the article

  • Ram cache on Windows Server 2008

    - by Jonas Lincoln
    Scenario: We have a file cluster on a UNC share. A couple of IIS web servers serve files from this UNC share. This is done through a IIS-module, and this module does not use the built-in IIS-caching feature. We'd like to cache the files from the UNC share in a ram disk. So far, we've found this product: http://www.superspeed.com/servers/supercache.php Are there other products that can help us cache the files from the UNC-share in ram?

    Read the article

  • cannot access localhost using ip

    - by Robert
    I have done a small web development project using eclipse. It runs well when I try running it on browser with url localhost:8080/myproject/home.html. But if I want to access it on another machine (laptop, mobile, etc. using the same wifi) it is not possible; it is not able to connect. After Googling for a while found out that I have to use the IP address instead of 'localhost'. So I tried 10.0.0.4:8080/myproject/home.html, but still does not work. In fact i am unable to open that url on the same machine (where localhost:8080/myproject/home.html works fine). I also added a new Inbound rule in control panel firewall settings, allowing access to all ports for protocol TCP. Still have problem in running application with the url 10.0.0.4:8080/myproject/home.html (both on same machine as well as laptop and mobile). FYI i am using Eclipse Indigo, Apache tomcat 6.0 and server.xml file contents is as below: <?xml version="1.0" encoding="UTF-8"?> <!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --><!-- Note: A "Server" is not itself a "Container", so you may not define subcomponents such as "Valves" at this level. Documentation at /docs/config/server.html --><Server port="8005" shutdown="SHUTDOWN"> <!--APR library loader. Documentation at /docs/apr.html --> <Listener SSLEngine="on" className="org.apache.catalina.core.AprLifecycleListener"/> <!--Initialize Jasper prior to webapps are loaded. Documentation at /docs/jasper-howto.html --> <Listener className="org.apache.catalina.core.JasperListener"/> <!-- Prevent memory leaks due to use of particular java/javax APIs--> <Listener className="org.apache.catalina.core.JreMemoryLeakPreventionListener"/> <!-- JMX Support for the Tomcat server. Documentation at /docs/non-existent.html --> <Listener className="org.apache.catalina.mbeans.ServerLifecycleListener"/> <Listener className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener"/> <!-- Global JNDI resources Documentation at /docs/jndi-resources-howto.html --> <GlobalNamingResources> <!-- Editable user database that can also be used by UserDatabaseRealm to authenticate users --> <Resource auth="Container" description="User database that can be updated and saved" factory="org.apache.catalina.users.MemoryUserDatabaseFactory" name="UserDatabase" pathname="conf/tomcat-users.xml" type="org.apache.catalina.UserDatabase"/> </GlobalNamingResources> <!-- A "Service" is a collection of one or more "Connectors" that share a single "Container" Note: A "Service" is not itself a "Container", so you may not define subcomponents such as "Valves" at this level. Documentation at /docs/config/service.html --> <Service name="Catalina"> <!--The connectors can use a shared executor, you can define one or more named thread pools--> <!-- <Executor name="tomcatThreadPool" namePrefix="catalina-exec-" maxThreads="150" minSpareThreads="4"/> --> <!-- A "Connector" represents an endpoint by which requests are received and responses are returned. Documentation at : Java HTTP Connector: /docs/config/http.html (blocking & non-blocking) Java AJP Connector: /docs/config/ajp.html APR (HTTP/AJP) Connector: /docs/apr.html Define a non-SSL HTTP/1.1 Connector on port 8080 --> <Connector port="8080" protocol="HTTP/1.1" address="10.0.0.4" connectionTimeout="20000" redirectPort="8443" /> <!-- A "Connector" using the shared thread pool--> <!-- <Connector executor="tomcatThreadPool" port="8080" protocol="HTTP/1.1" connectionTimeout="20000" redirectPort="8443" /> --> <!-- Define a SSL HTTP/1.1 Connector on port 8443 This connector uses the JSSE configuration, when using APR, the connector should be using the OpenSSL style configuration described in the APR documentation --> <!-- <Connector port="8443" protocol="HTTP/1.1" SSLEnabled="true" maxThreads="150" scheme="https" secure="true" clientAuth="false" sslProtocol="TLS" /> --> <!-- Define an AJP 1.3 Connector on port 8009 --> <Connector port="8009" protocol="AJP/1.3" redirectPort="8443"/> <!-- An Engine represents the entry point (within Catalina) that processes every request. The Engine implementation for Tomcat stand alone analyzes the HTTP headers included with the request, and passes them on to the appropriate Host (virtual host). Documentation at /docs/config/engine.html --> <!-- You should set jvmRoute to support load-balancing via AJP ie : <Engine name="Catalina" defaultHost="localhost" jvmRoute="jvm1"> --> <Engine defaultHost="localhost" name="Catalina"> <!--For clustering, please take a look at documentation at: /docs/cluster-howto.html (simple how to) /docs/config/cluster.html (reference documentation) --> <!-- <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"/> --> <!-- The request dumper valve dumps useful debugging information about the request and response data received and sent by Tomcat. Documentation at: /docs/config/valve.html --> <!-- <Valve className="org.apache.catalina.valves.RequestDumperValve"/> --> <!-- This Realm uses the UserDatabase configured in the global JNDI resources under the key "UserDatabase". Any edits that are performed against this UserDatabase are immediately available for use by the Realm. --> <Realm className="org.apache.catalina.realm.UserDatabaseRealm" resourceName="UserDatabase"/> <!-- Define the default virtual host Note: XML Schema validation will not work with Xerces 2.2. --> <Host appBase="webapps" autoDeploy="true" name="localhost" unpackWARs="true" xmlNamespaceAware="false" xmlValidation="false"> <!-- SingleSignOn valve, share authentication between web applications Documentation at: /docs/config/valve.html --> <!-- <Valve className="org.apache.catalina.authenticator.SingleSignOn" /> --> <!-- Access log processes all example. Documentation at: /docs/config/valve.html --> <!-- <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs" prefix="localhost_access_log." suffix=".txt" pattern="common" resolveHosts="false"/> --> <Context docBase="myproject" path="/myproject" reloadable="true" source="org.eclipse.jst.jee.server:myproject"/></Host> </Engine> </Service> </Server>

    Read the article

  • What is the point/purpose of Ruby EventMachine, Python Twisted, or JavaScript Node.js?

    - by CCw
    I don't understand what problem these frameworks solve. Are they replacements for a HTTP server like Apache HTTPD, Tomcat, Mongrel, etc? Or are they more? Why might I use them... some real world examples? I've seen endless examples of chat rooms and broadcast services, but don't see how this is any different than, for instance, setting up a Java program to open sockets and dispatch a thread for each request. I think I understand the non-blocking I/O, but I don't understand how that is any different than a multi-threaded web server.

    Read the article

  • How to get the best LINPACK result and conquer the Top500?

    - by knweiss
    Given a large Linux HPC cluster with hundreds/thousands of nodes. What are your best practices to get the best possible LINPACK benchmark (HPL) result to submit for the Top500 supercomputer list? To give you an idea what kind of answers I would appreciate here are some sub-questions (with links): How to you tune the parameters (N, NB, P, Q, memory-alignment, etc) for the HPL.dat file (without spending too much time trying each possible permutation - esp with large problem sizes N)? Are there any Top500 submission rules to be aware of? What is allowed, what isn't? Which MPI product, which version? Does it make a difference? Any special host order in your MPI machine file? Do you use CPU pinning? How to you configure your interconnect? Which interconnect? Which BLAS package do you use for which CPU model? (Intel MKL, AMD ACML, GotoBLAS2, etc.) How do you prepare for the big run (on all nodes)? Start with small runs on a subset of nodes and then scale up? Is it really necessary to run LINPACK with a big run on all of the nodes (or is extrapolation allowed)? How do you optimize for the latest Intel/AMD CPUs? Hyperthreading? NUMA? Is it worth it to recompile the software stack or do you use precompiled binaries? Which settings? Which compiler optimizations, which compiler? (What about profile-based compilation?) How to get the best result given only a limited amount of time to do the benchmark run? (You can block a huge cluster forever) How do you prepare the individual nodes (stopping system daemons, freeing memory, etc)? How do you deal with hardware faults (ruining a huge run)? Are there any must-read documents or websites about this topic? E.g. I would love to hear about some background stories of some of the current Top500 systems and how they did their LINPACK benchmark. I deliberately don't want to mention concrete hardware details or discuss hardware recommendations because I don't want to limit the answers. However, feel free to mention hints e.g. for specific CPU models.

    Read the article

< Previous Page | 39 40 41 42 43 44 45 46 47 48 49 50  | Next Page >