Search Results

Search found 3641 results on 146 pages for 'threads'.

Page 39/146 | < Previous Page | 35 36 37 38 39 40 41 42 43 44 45 46  | Next Page >

  • Inverted schedctl usage in the JVM

    - by Dave
    The schedctl facility in Solaris allows a thread to request that the kernel defer involuntary preemption for a brief period. The mechanism is strictly advisory - the kernel can opt to ignore the request. Schedctl is typically used to bracket lock critical sections. That, in turn, can avoid convoying -- threads piling up on a critical section behind a preempted lock-holder -- and other lock-related performance pathologies. If you're interested see the man pages for schedctl_start() and schedctl_stop() and the schedctl.h include file. The implementation is very efficient. schedctl_start(), which asks that preemption be deferred, simply stores into a thread-specific structure -- the schedctl block -- that the kernel maps into user-space. Similarly, schedctl_stop() clears the flag set by schedctl_stop() and then checks a "preemption pending" flag in the block. Normally, this will be false, but if set schedctl_stop() will yield to politely grant the CPU to other threads. Note that you can't abuse this facility for long-term preemption avoidance as the deferral is brief. If your thread exceeds the grace period the kernel will preempt it and transiently degrade its effective scheduling priority. Further reading : US05937187 and various papers by Andy Tucker. We'll now switch topics to the implementation of the "synchronized" locking construct in the HotSpot JVM. If a lock is contended then on multiprocessor systems we'll spin briefly to try to avoid context switching. Context switching is wasted work and inflicts various cache and TLB penalties on the threads involved. If context switching were "free" then we'd never spin to avoid switching, but that's not the case. We use an adaptive spin-then-park strategy. One potentially undesirable outcome is that we can be preempted while spinning. When our spinning thread is finally rescheduled the lock may or may not be available. If not, we'll spin and then potentially park (block) again, thus suffering a 2nd context switch. Recall that the reason we spin is to avoid context switching. To avoid this scenario I've found it useful to enable schedctl to request deferral while spinning. But while spinning I've arranged for the code to periodically check or poll the "preemption pending" flag. If that's found set we simply abandon our spinning attempt and park immediately. This avoids the double context-switch scenario above. One annoyance is that the schedctl blocks for the threads in a given process are tightly packed on special pages mapped from kernel space into user-land. As such, writes to the schedctl blocks can cause false sharing on other adjacent blocks. Hopefully the kernel folks will make changes to avoid this by padding and aligning the blocks to ensure that one cache line underlies at most one schedctl block at any one time.

    Read the article

  • Thread travailleur avec Qt en utilisant les signaux et les slots, un article de Christophe Dumez traduit par Thibaut Cuvelier

    Qt fournit des classes de threads indépendantes de la plateforme, une manière thread-safe de poster des événements et des connexions entre signaux et slots entre les threads. La programmation multithreadée s'avantage des machines à plusieurs processeurs et est aussi utile pour effectuer les opérations chronophages sans geler l'interface utilisateur d'une application. Sans multithreading, tout est fait dans le thread principal.

    Read the article

  • What is the best way to diagrammatically represent a system threading architecture?

    - by thegreendroid
    I am yet to find the perfect way to diagrammatically represent the overall threading architecture for a system (using UML or otherwise). I am after a diagramming technique that would show all the threads in a given system and how they interact with each other. There are a few similar questions - Drawing Thread Interaction, UML Diagrams of Multithreaded Applications and Intuitive UML Approach to Depict Threads but they don't fully answer my question. What are some of the techniques that you've found useful to depict the overall threading architecture for a system?

    Read the article

  • Load Balance and Parallel Performance

    Load balancing an application workload among threads is critical to performance. However, achieving perfect load balance is non-trivial, and it depends on the parallelism within the application, workload, the number of threads, load balancing policy, and the threading implementation.

    Read the article

  • Forum widget for website

    - by Ivan Kuckir
    I have Discussion section at my website and I am getting 3 - 10 comments each week (it may increase in the future). I am using one Disqus "widget" for the whole discussion, but I would like to give it some better structure, with threads, categories etc. Do you know about some "forum widget" service, with functionality of phpBB (threads, categories, ...) and simplicity of Disqus (installing with IFRAME, login with Facebook/Golge, ...) ?

    Read the article

  • Is a 1:* write:read thread system safe?

    - by Di-0xide
    Theoretically, thread-safe code should fix race conditions. Race conditions, as I understand it, occur because two threads attempt to write to the same location at the same time. However, what about a threading model in which a single thread is designed to write to a location, and several slave/worker threads simply read from the location? Assuming the value/timing at which they read the data isn't relevant/doesn't hinder the worker thread's outcome, wouldn't this be considered 'thread safe', or am I missing something in my logic?

    Read the article

  • JBOSS 7.1 started hanging after 6 months of deployment

    - by PVR
    My application is been live from 6 months. The application is host on jboss 7.1 server. From last few days I am finding numerous problem of hanging of jboss server. Though I restart the jboss server again, it does not invoke. I need to restart the server machine itself. Can anyone please let me know what could be the cause of these problems and the workable resolutions or any suggestion ? Kindly dont degrade the question as I am facing a lot problems due to this hanging issue. Also for the information, the application is based on Java, GWT, Hibernate 3. Please find the standalone.xml file in case if it helps. <extensions> <extension module="org.jboss.as.clustering.infinispan"/> <extension module="org.jboss.as.configadmin"/> <extension module="org.jboss.as.connector"/> <extension module="org.jboss.as.deployment-scanner"/> <extension module="org.jboss.as.ee"/> <extension module="org.jboss.as.ejb3"/> <extension module="org.jboss.as.jaxrs"/> <extension module="org.jboss.as.jdr"/> <extension module="org.jboss.as.jmx"/> <extension module="org.jboss.as.jpa"/> <extension module="org.jboss.as.logging"/> <extension module="org.jboss.as.mail"/> <extension module="org.jboss.as.naming"/> <extension module="org.jboss.as.osgi"/> <extension module="org.jboss.as.pojo"/> <extension module="org.jboss.as.remoting"/> <extension module="org.jboss.as.sar"/> <extension module="org.jboss.as.security"/> <extension module="org.jboss.as.threads"/> <extension module="org.jboss.as.transactions"/> <extension module="org.jboss.as.web"/> <extension module="org.jboss.as.webservices"/> <extension module="org.jboss.as.weld"/> </extensions> <system-properties> <property name="org.apache.coyote.http11.Http11Protocol.COMPRESSION" value="on"/> <property name="org.apache.coyote.http11.Http11Protocol.COMPRESSION_MIME_TYPES" value="text/javascript,text/css,text/html,text/xml,text/json"/> </system-properties> <management> <security-realms> <security-realm name="ManagementRealm"> <authentication> <properties path="mgmt-users.properties" relative-to="jboss.server.config.dir"/> </authentication> </security-realm> <security-realm name="ApplicationRealm"> <authentication> <properties path="application-users.properties" relative-to="jboss.server.config.dir"/> </authentication> </security-realm> </security-realms> <management-interfaces> <native-interface security-realm="ManagementRealm"> <socket-binding native="management-native"/> </native-interface> <http-interface security-realm="ManagementRealm"> <socket-binding http="management-http"/> </http-interface> </management-interfaces> </management> <profile> <subsystem xmlns="urn:jboss:domain:logging:1.1"> <console-handler name="CONSOLE"> <level name="INFO"/> <formatter> <pattern-formatter pattern="%d{HH:mm:ss,SSS} %-5p [%c] (%t) %s%E%n"/> </formatter> </console-handler> <periodic-rotating-file-handler name="FILE"> <formatter> <pattern-formatter pattern="%d{HH:mm:ss,SSS} %-5p [%c] (%t) %s%E%n"/> </formatter> <file relative-to="jboss.server.log.dir" path="server.log"/> <suffix value=".yyyy-MM-dd"/> <append value="true"/> </periodic-rotating-file-handler> <logger category="com.arjuna"> <level name="WARN"/> </logger> <logger category="org.apache.tomcat.util.modeler"> <level name="WARN"/> </logger> <logger category="sun.rmi"> <level name="WARN"/> </logger> <logger category="jacorb"> <level name="WARN"/> </logger> <logger category="jacorb.config"> <level name="ERROR"/> </logger> <root-logger> <level name="INFO"/> <handlers> <handler name="CONSOLE"/> <handler name="FILE"/> </handlers> </root-logger> </subsystem> <subsystem xmlns="urn:jboss:domain:configadmin:1.0"/> <subsystem xmlns="urn:jboss:domain:datasources:1.0"> <datasources> <datasource jndi-name="java:jboss/datasources/ExampleDS" pool-name="ExampleDS" enabled="true" use-java-context="true"> <connection-url>jdbc:h2:mem:test;DB_CLOSE_DELAY=-1</connection-url> <driver>h2</driver> <security> <user-name>sa</user-name> <password>sa</password> </security> </datasource> <drivers> <driver name="h2" module="com.h2database.h2"> <xa-datasource-class>org.h2.jdbcx.JdbcDataSource</xa-datasource-class> </driver> </drivers> </datasources> </subsystem> <subsystem xmlns="urn:jboss:domain:deployment-scanner:1.1"> <deployment-scanner path="deployments" relative-to="jboss.server.base.dir" scan-interval="5000"/> </subsystem> <subsystem xmlns="urn:jboss:domain:ee:1.0"/> <subsystem xmlns="urn:jboss:domain:ejb3:1.2"> <session-bean> <stateless> <bean-instance-pool-ref pool-name="slsb-strict-max-pool"/> </stateless> <stateful default-access-timeout="5000" cache-ref="simple"/> <singleton default-access-timeout="5000"/> </session-bean> <pools> <bean-instance-pools> <strict-max-pool name="slsb-strict-max-pool" max-pool-size="20" instance-acquisition-timeout="5" instance-acquisition-timeout-unit="MINUTES"/> <strict-max-pool name="mdb-strict-max-pool" max-pool-size="20" instance-acquisition-timeout="5" instance-acquisition-timeout-unit="MINUTES"/> </bean-instance-pools> </pools> <caches> <cache name="simple" aliases="NoPassivationCache"/> <cache name="passivating" passivation-store-ref="file" aliases="SimpleStatefulCache"/> </caches> <passivation-stores> <file-passivation-store name="file"/> </passivation-stores> <async thread-pool-name="default"/> <timer-service thread-pool-name="default"> <data-store path="timer-service-data" relative-to="jboss.server.data.dir"/> </timer-service> <remote connector-ref="remoting-connector" thread-pool-name="default"/> <thread-pools> <thread-pool name="default"> <max-threads count="10"/> <keepalive-time time="100" unit="milliseconds"/> </thread-pool> </thread-pools> </subsystem> <subsystem xmlns="urn:jboss:domain:infinispan:1.2" default-cache-container="hibernate"> <cache-container name="hibernate" default-cache="local-query"> <local-cache name="entity"> <transaction mode="NON_XA"/> <eviction strategy="LRU" max-entries="10000"/> <expiration max-idle="100000"/> </local-cache> <local-cache name="local-query"> <transaction mode="NONE"/> <eviction strategy="LRU" max-entries="10000"/> <expiration max-idle="100000"/> </local-cache> <local-cache name="timestamps"> <transaction mode="NONE"/> <eviction strategy="NONE"/> </local-cache> </cache-container> </subsystem> <subsystem xmlns="urn:jboss:domain:jaxrs:1.0"/> <subsystem xmlns="urn:jboss:domain:jca:1.1"> <archive-validation enabled="true" fail-on-error="true" fail-on-warn="false"/> <bean-validation enabled="true"/> <default-workmanager> <short-running-threads> <core-threads count="50"/> <queue-length count="50"/> <max-threads count="50"/> <keepalive-time time="10" unit="seconds"/> </short-running-threads> <long-running-threads> <core-threads count="50"/> <queue-length count="50"/> <max-threads count="50"/> <keepalive-time time="100" unit="seconds"/> </long-running-threads> </default-workmanager> <cached-connection-manager/> </subsystem> <subsystem xmlns="urn:jboss:domain:jdr:1.0"/> <subsystem xmlns="urn:jboss:domain:jmx:1.1"> <show-model value="true"/> <remoting-connector/> </subsystem> <subsystem xmlns="urn:jboss:domain:jpa:1.0"> <jpa default-datasource=""/> </subsystem> <subsystem xmlns="urn:jboss:domain:mail:1.0"> <mail-session jndi-name="java:jboss/mail/Default"> <smtp-server outbound-socket-binding-ref="mail-smtp"/> </mail-session> </subsystem> <subsystem xmlns="urn:jboss:domain:naming:1.1"/> <subsystem xmlns="urn:jboss:domain:osgi:1.2" activation="lazy"> <properties> <property name="org.osgi.framework.startlevel.beginning"> 1 </property> </properties> <capabilities> <capability name="javax.servlet.api:v25"/> <capability name="javax.transaction.api"/> <capability name="org.apache.felix.log" startlevel="1"/> <capability name="org.jboss.osgi.logging" startlevel="1"/> <capability name="org.apache.felix.configadmin" startlevel="1"/> <capability name="org.jboss.as.osgi.configadmin" startlevel="1"/> </capabilities> </subsystem> <subsystem xmlns="urn:jboss:domain:pojo:1.0"/> <subsystem xmlns="urn:jboss:domain:remoting:1.1"> <connector name="remoting-connector" socket-binding="remoting" security-realm="ApplicationRealm"/> </subsystem> <subsystem xmlns="urn:jboss:domain:resource-adapters:1.0"/> <subsystem xmlns="urn:jboss:domain:sar:1.0"/> <subsystem xmlns="urn:jboss:domain:security:1.1"> <security-domains> <security-domain name="other" cache-type="default"> <authentication> <login-module code="Remoting" flag="optional"> <module-option name="password-stacking" value="useFirstPass"/> </login-module> <login-module code="RealmUsersRoles" flag="required"> <module-option name="usersProperties" value="${jboss.server.config.dir}/application-users.properties"/> <module-option name="rolesProperties" value="${jboss.server.config.dir}/application-roles.properties"/> <module-option name="realm" value="ApplicationRealm"/> <module-option name="password-stacking" value="useFirstPass"/> </login-module> </authentication> </security-domain> <security-domain name="jboss-web-policy" cache-type="default"> <authorization> <policy-module code="Delegating" flag="required"/> </authorization> </security-domain> <security-domain name="jboss-ejb-policy" cache-type="default"> <authorization> <policy-module code="Delegating" flag="required"/> </authorization> </security-domain> </security-domains> </subsystem> <subsystem xmlns="urn:jboss:domain:threads:1.1"/> <subsystem xmlns="urn:jboss:domain:transactions:1.1"> <core-environment> <process-id> <uuid/> </process-id> </core-environment> <recovery-environment socket-binding="txn-recovery-environment" status-socket-binding="txn-status-manager"/> <coordinator-environment default-timeout="300"/> </subsystem> <subsystem xmlns="urn:jboss:domain:web:1.1" default-virtual-server="default-host" native="false"> <connector name="http" protocol="HTTP/1.1" scheme="http" socket-binding="http"/> <virtual-server name="default-host" enable-welcome-root="false"> <alias name="localhost"/> <alias name="nextenders.com"/> </virtual-server> </subsystem> <subsystem xmlns="urn:jboss:domain:webservices:1.1"> <modify-wsdl-address>true</modify-wsdl-address> <wsdl-host>${jboss.bind.address:127.0.0.1}</wsdl-host> <endpoint-config name="Standard-Endpoint-Config"/> <endpoint-config name="Recording-Endpoint-Config"> <pre-handler-chain name="recording-handlers" protocol-bindings="##SOAP11_HTTP ##SOAP11_HTTP_MTOM ##SOAP12_HTTP ##SOAP12_HTTP_MTOM"> <handler name="RecordingHandler" class="org.jboss.ws.common.invocation.RecordingServerHandler"/> </pre-handler-chain> </endpoint-config> </subsystem> <subsystem xmlns="urn:jboss:domain:weld:1.0"/> </profile> <interfaces> <interface name="management"> <inet-address value="${jboss.bind.address.management:127.0.0.1}"/> </interface> <interface name="public"> <inet-address value="${jboss.bind.address:127.0.0.1}"/> </interface> <interface name="unsecure"> <inet-address value="${jboss.bind.address.unsecure:127.0.0.1}"/> </interface> </interfaces> <socket-binding-group name="standard-sockets" default-interface="public" port-offset="${jboss.socket.binding.port-offset:0}"> <socket-binding name="management-native" interface="management" port="${jboss.management.native.port:9999}"/> <socket-binding name="management-http" interface="management" port="${jboss.management.http.port:9990}"/> <socket-binding name="management-https" interface="management" port="${jboss.management.https.port:9443}"/> <socket-binding name="ajp" port="8009"/> <socket-binding name="http" port="80"/> <socket-binding name="https" port="443"/> <socket-binding name="osgi-http" interface="management" port="8090"/> <socket-binding name="remoting" port="4447"/> <socket-binding name="txn-recovery-environment" port="4712"/> <socket-binding name="txn-status-manager" port="4713"/> <outbound-socket-binding name="mail-smtp"> <remote-destination host="localhost" port="25"/> </outbound-socket-binding> </socket-binding-group>

    Read the article

  • Issue accessing remote Infinispan mbeans

    - by user1960172
    I am able to access the Mbeans by local Jconsole but not able to access the MBEANS from a remote Host. My COnfiguration: <?xml version='1.0' encoding='UTF-8'?> <server xmlns="urn:jboss:domain:1.4"> <extensions> <extension module="org.infinispan.server.endpoint"/> <extension module="org.jboss.as.clustering.infinispan"/> <extension module="org.jboss.as.clustering.jgroups"/> <extension module="org.jboss.as.connector"/> <extension module="org.jboss.as.jdr"/> <extension module="org.jboss.as.jmx"/> <extension module="org.jboss.as.logging"/> <extension module="org.jboss.as.modcluster"/> <extension module="org.jboss.as.naming"/> <extension module="org.jboss.as.remoting"/> <extension module="org.jboss.as.security"/> <extension module="org.jboss.as.threads"/> <extension module="org.jboss.as.transactions"/> <extension module="org.jboss.as.web"/> </extensions> <management> <security-realms> <security-realm name="ManagementRealm"> <authentication> <local default-user="$local"/> <properties path="mgmt-users.properties" relative-to="jboss.server.config.dir"/> </authentication> </security-realm> <security-realm name="ApplicationRealm"> <authentication> <local default-user="$local" allowed-users="*"/> <properties path="application-users.properties" relative-to="jboss.server.config.dir"/> </authentication> </security-realm> </security-realms> <management-interfaces> <native-interface security-realm="ManagementRealm"> <socket-binding native="management-native"/> </native-interface> <http-interface security-realm="ManagementRealm"> <socket-binding http="management-http"/> </http-interface> </management-interfaces> </management> <profile> <subsystem xmlns="urn:jboss:domain:logging:1.2"> <console-handler name="CONSOLE"> <level name="INFO"/> <formatter> <pattern-formatter pattern="%K{level}%d{HH:mm:ss,SSS} %-5p [%c] (%t) %s%E%n"/> </formatter> </console-handler> <periodic-rotating-file-handler name="FILE" autoflush="true"> <formatter> <pattern-formatter pattern="%d{HH:mm:ss,SSS} %-5p [%c] (%t) %s%E%n"/> </formatter> <file relative-to="jboss.server.log.dir" path="server.log"/> <suffix value=".yyyy-MM-dd"/> <append value="true"/> </periodic-rotating-file-handler> <logger category="com.arjuna"> <level name="WARN"/> </logger> <logger category="org.apache.tomcat.util.modeler"> <level name="WARN"/> </logger> <logger category="org.jboss.as.config"> <level name="DEBUG"/> </logger> <logger category="sun.rmi"> <level name="WARN"/> </logger> <logger category="jacorb"> <level name="WARN"/> </logger> <logger category="jacorb.config"> <level name="ERROR"/> </logger> <root-logger> <level name="INFO"/> <handlers> <handler name="CONSOLE"/> <handler name="FILE"/> </handlers> </root-logger> </subsystem> <subsystem xmlns="urn:infinispan:server:endpoint:6.0"> <hotrod-connector socket-binding="hotrod" cache-container="clustered"> <topology-state-transfer lazy-retrieval="false" lock-timeout="1000" replication-timeout="5000"/> </hotrod-connector> <memcached-connector socket-binding="memcached" cache-container="clustered"/> <!--<rest-connector virtual-server="default-host" cache-container="clustered" security-domain="other" auth-method="BASIC"/> --> <rest-connector virtual-server="default-host" cache-container="clustered" /> <websocket-connector socket-binding="websocket" cache-container="clustered"/> </subsystem> <subsystem xmlns="urn:jboss:domain:datasources:1.1"> <datasources/> </subsystem> <subsystem xmlns="urn:infinispan:server:core:5.3" default-cache-container="clustered"> <cache-container name="clustered" default-cache="default"> <transport executor="infinispan-transport" lock-timeout="60000"/> <distributed-cache name="default" mode="SYNC" segments="20" owners="2" remote-timeout="30000" start="EAGER"> <locking isolation="READ_COMMITTED" acquire-timeout="30000" concurrency-level="1000" striping="false"/> <transaction mode="NONE"/> </distributed-cache> <distributed-cache name="memcachedCache" mode="SYNC" segments="20" owners="2" remote-timeout="30000" start="EAGER"> <locking isolation="READ_COMMITTED" acquire-timeout="30000" concurrency-level="1000" striping="false"/> <transaction mode="NONE"/> </distributed-cache> <distributed-cache name="namedCache" mode="SYNC" start="EAGER"/> </cache-container> <cache-container name="security"/> </subsystem> <subsystem xmlns="urn:jboss:domain:jca:1.1"> <archive-validation enabled="true" fail-on-error="true" fail-on-warn="false"/> <bean-validation enabled="true"/> <default-workmanager> <short-running-threads> <core-threads count="50"/> <queue-length count="50"/> <max-threads count="50"/> <keepalive-time time="10" unit="seconds"/> </short-running-threads> <long-running-threads> <core-threads count="50"/> <queue-length count="50"/> <max-threads count="50"/> <keepalive-time time="10" unit="seconds"/> </long-running-threads> </default-workmanager> <cached-connection-manager/> </subsystem> <subsystem xmlns="urn:jboss:domain:jdr:1.0"/> <subsystem xmlns="urn:jboss:domain:jgroups:1.2" default-stack="${jboss.default.jgroups.stack:udp}"> <stack name="udp"> <transport type="UDP" socket-binding="jgroups-udp"/> <protocol type="PING"/> <protocol type="MERGE2"/> <protocol type="FD_SOCK" socket-binding="jgroups-udp-fd"/> <protocol type="FD_ALL"/> <protocol type="pbcast.NAKACK"/> <protocol type="UNICAST2"/> <protocol type="pbcast.STABLE"/> <protocol type="pbcast.GMS"/> <protocol type="UFC"/> <protocol type="MFC"/> <protocol type="FRAG2"/> <protocol type="RSVP"/> </stack> <stack name="tcp"> <transport type="TCP" socket-binding="jgroups-tcp"/> <!--<protocol type="MPING" socket-binding="jgroups-mping"/>--> <protocol type="TCPPING"> <property name="initial_hosts">10.32.50.53[7600],10.32.50.64[7600]</property> </protocol> <protocol type="MERGE2"/> <protocol type="FD_SOCK" socket-binding="jgroups-tcp-fd"/> <protocol type="FD"/> <protocol type="VERIFY_SUSPECT"/> <protocol type="pbcast.NAKACK"> <property name="use_mcast_xmit">false</property> </protocol> <protocol type="UNICAST2"/> <protocol type="pbcast.STABLE"/> <protocol type="pbcast.GMS"/> <protocol type="UFC"/> <protocol type="MFC"/> <protocol type="FRAG2"/> <protocol type="RSVP"/> </stack> </subsystem> <subsystem xmlns="urn:jboss:domain:jmx:1.1"> <show-model value="true"/> <remoting-connector use-management-endpoint="false"/> </subsystem> <subsystem xmlns="urn:jboss:domain:modcluster:1.1"> <mod-cluster-config advertise-socket="modcluster" connector="ajp" excluded-contexts="console"> <dynamic-load-provider> <load-metric type="busyness"/> </dynamic-load-provider> </mod-cluster-config> </subsystem> <subsystem xmlns="urn:jboss:domain:naming:1.2"/> <subsystem xmlns="urn:jboss:domain:remoting:1.1"> <connector name="remoting-connector" socket-binding="remoting" security-realm="ApplicationRealm"/> </subsystem> <subsystem xmlns="urn:jboss:domain:security:1.2"> <security-domains> <security-domain name="other" cache-type="infinispan"> <authentication> <login-module code="Remoting" flag="optional"> <module-option name="password-stacking" value="useFirstPass"/> </login-module> <login-module code="RealmUsersRoles" flag="required"> <module-option name="usersProperties" value="${jboss.server.config.dir}/application-users.properties"/> <module-option name="rolesProperties" value="${jboss.server.config.dir}/application-roles.properties"/> <module-option name="realm" value="ApplicationRealm"/> <module-option name="password-stacking" value="useFirstPass"/> </login-module> </authentication> </security-domain> <security-domain name="jboss-web-policy" cache-type="infinispan"> <authorization> <policy-module code="Delegating" flag="required"/> </authorization> </security-domain> </security-domains> </subsystem> <subsystem xmlns="urn:jboss:domain:threads:1.1"> <thread-factory name="infinispan-factory" group-name="infinispan" priority="5"/> <unbounded-queue-thread-pool name="infinispan-transport"> <max-threads count="25"/> <keepalive-time time="0" unit="milliseconds"/> <thread-factory name="infinispan-factory"/> </unbounded-queue-thread-pool> </subsystem> <subsystem xmlns="urn:jboss:domain:transactions:1.2"> <core-environment> <process-id> <uuid/> </process-id> </core-environment> <recovery-environment socket-binding="txn-recovery-environment" status-socket-binding="txn-status-manager"/> <coordinator-environment default-timeout="300"/> </subsystem> <subsystem xmlns="urn:jboss:domain:web:1.1" default-virtual-server="default-host" native="false"> <connector name="http" protocol="HTTP/1.1" scheme="http" socket-binding="http"/> <connector name="ajp" protocol="AJP/1.3" scheme="http" socket-binding="ajp"/> <virtual-server name="default-host" enable-welcome-root="false"> <alias name="localhost"/> <alias name="example.com"/> </virtual-server> </subsystem> </profile> <interfaces> <interface name="management"> <inet-address value="${jboss.bind.address.management:10.32.222.111}"/> </interface> <interface name="public"> <inet-address value="${jboss.bind.address:10.32.222.111}"/> </interface> </interfaces> <socket-binding-group name="standard-sockets" default-interface="public" port-offset="${jboss.socket.binding.port-offset:0}"> <socket-binding name="management-native" interface="management" port="${jboss.management.native.port:9999}"/> <socket-binding name="management-http" interface="management" port="${jboss.management.http.port:9990}"/> <socket-binding name="management-https" interface="management" port="${jboss.management.https.port:9443}"/> <socket-binding name="ajp" port="8089"/> <socket-binding name="hotrod" port="11222"/> <socket-binding name="http" port="8080"/> <socket-binding name="https" port="8443"/> <socket-binding name="jgroups-mping" port="0" multicast-address="${jboss.default.multicast.address:234.99.54.14}" multicast-port="45700"/> <socket-binding name="jgroups-tcp" port="7600"/> <socket-binding name="jgroups-tcp-fd" port="57600"/> <socket-binding name="jgroups-udp" port="55200" multicast-address="${jboss.default.multicast.address:234.99.54.14}" multicast-port="45688"/> <socket-binding name="jgroups-udp-fd" port="54200"/> <socket-binding name="memcached" port="11211"/> <socket-binding name="modcluster" port="0" multicast-address="224.0.1.115" multicast-port="23364"/> <socket-binding name="remoting" port="4447"/> <socket-binding name="txn-recovery-environment" port="4712"/> <socket-binding name="txn-status-manager" port="4713"/> <socket-binding name="websocket" port="8181"/> </socket-binding-group> </server> Remote Process: service:jmx:remoting-jmx://10.32.222.111:4447 I added user to both management and application realm admin=2a0923285184943425d1f53ddd58ec7a test=2b1be81e1da41d4ea647bd82fc8c2bc9 But when i try to connect its says's: Connection failed: Retry When i use Remote process as:10.32.222.111:4447 on the sever it prompts a warning : 16:29:48,084 ERROR [org.jboss.remoting.remote.connection] (Remoting "djd7w4r1" read-1) JBREM000200: Remote connection failed: java.io.IOException: Received an invali d message length of -2140864253 Also disabled Remote authentication: -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=12345 Still not able to connect. Any help will be highly appreciated . Thanks

    Read the article

  • cpu load measure with hyperthreading on linux

    - by dronus
    How can I get the true usage of a multicore hyperthreading enabled cpu? For example lets consider a 2 core CPU, expressing 4 virtual cores. A single threaded workload would now show up as 100% in top, as one core of the virtual cores is completely used. The CPU and top work as expected, like there would be 4 real cores. With two threads however, the things get arkward: If all works well, they are balanced to the two real cores, so we got 200% usage: Two times 100% and two idle virtual cores, and are using all of the available CPU power. Seems ok to me. However, if the two threads would run on a single real core, they would show up as using two times 100%, that makes 200% virtual core usage. But on the real side, that would be one core sharing its power on the two threads, which are then using only one half of the total CPU power. So the usage numbers shown by top can not be used to measure the total CPU workload. I also wonder how hyperthreading balances two virtual on a real core. If two threads take a different amount of cycles, would the virtual cores 'adapt' so that both show a 100% load even if the real load differ?

    Read the article

  • placing shell script under systemd control

    - by Calvin Cheng
    Assuming I have a shell script like this:- #!/bin/sh # cherrypy_server.sh PROCESSES=10 THREADS=1 # threads per process BASE_PORT=3035 # the first port used # you need to make the PIDFILE dir and insure it has the right permissions PIDFILE="/var/run/cherrypy/myproject.pid" WORKDIR=`dirname "$0"` cd "$WORKDIR" cp_start_proc() { N=$1 P=$(( $BASE_PORT + $N - 1 )) ./manage.py runcpserver daemonize=1 port=$P pidfile="$PIDFILE-$N" threads=$THREADS request_queue_size=0 verbose=0 } cp_start() { for N in `seq 1 $PROCESSES`; do cp_start_proc $N done } cp_stop_proc() { N=$1 #[ -f "$PIDFILE-$N" ] && kill `cat "$PIDFILE-$N"` [ -f "$PIDFILE-$N" ] && ./manage.py runcpserver pidfile="$PIDFILE-$N" stop rm -f "$PIDFILE-$N" } cp_stop() { for N in `seq 1 $PROCESSES`; do cp_stop_proc $N done } cp_restart_proc() { N=$1 cp_stop_proc $N #sleep 1 cp_start_proc $N } cp_restart() { for N in `seq 1 $PROCESSES`; do cp_restart_proc $N done } case "$1" in "start") cp_start ;; "stop") cp_stop ;; "restart") cp_restart ;; *) "$@" ;; esac From the bash script, we can essentially do 3 things: start the cherrypy server by calling ./cherrypy_server.sh start stop the cherrypy server by calling ./cherrypy_server.sh stop restart the cherrypy server by calling ./cherrypy_server.sh restart How would I place this shell script under systemd's control as a cherrypy.service file (with the obvious goal of having systemd start up the cherrypy server when a machine has been rebooted)? Reference systemd service file example here - https://wiki.archlinux.org/index.php/Systemd#Using_service_file

    Read the article

  • Remote Socket Read In Multi-Threaded Application Returns Zero Bytes or EINTR (104)

    - by user39891
    Hi. Am a c-coder for a while now - neither a newbie nor an expert. Now, I have a certain daemoned application in C on a PPC Linux. I use PHP's socket_connect as a client to connect to this service locally. The server uses epoll for multiplexing connections via a Unix socket. A user submitted string is parsed for certain characters/words using strstr() and if found, spawns 4 joinable threads to different websites simultaneously. I use socket, connect, write and read, to interact with the said webservers via TCP on their port 80 in each thread. All connections and writes seems successful. Reads to the webserver sockets fail however, with either (A) all 3 threads seem to hang, and only one thread returns -1 and errno is set to 104. The responding thread takes like 10 minutes - an eternity long:-(. *I read somewhere that the 104 (is EINTR?), which in the network context suggests that ...'the connection was reset by peer'; or (B) 0 bytes from 3 threads, and only 1 of the 4 threads actually returns some data. Isn't the socket read/write thread-safe? I use thread-safe (and reentrant) libc functions such as strtok_r, gethostbyname_r, etc. *I doubt that the said webhosts are actually resetting the connection, because when I run a single-threaded standalone (everything else equal) all things works perfectly right, but of course in series not parallel. There's a second problem too (oops), I can't write back to the client who connect to my epoll-ed Unix socket. My daemon application will hang and hog CPU 100% for ever. Yet nothing is written to the clients end. Am sure the client (a very typical PHP socket application) hasn't closed the connection whenever this is happening - no error(s) detected either. Any ideas? I cannot figure-out whatever is wrong even with Valgrind, GDB or much logging. Kindly help where you can.

    Read the article

  • sysbench memory test on ec2 small instance

    - by caribio
    I'm seeing a problem with sysbench memory test (the default version that's compiled in). This is on Ubuntu Maverick, sysbench installed via apt-get install sysbench. Running the same thing on Ubuntu @ Rackspace worked just as expected. While the CPU and I/O tests worked fine on EC2 servers, the memory test just runs without doing anything (notice the 0M in the test results). The instance used was the publicly available 'stock' Ubuntu image with no changes to it: ./ec2-run-instances ami-ccf405a5 --instance-type m1.small --region us-east-1 --key mykey Supplying more arguments (such as: --memory-block-size=1K --memory-total-size=102400M) didn't help. What am I doing wrong? Thanks. sysbench --num-threads=4 --test=memory run sysbench 0.4.12: multi-threaded system evaluation benchmark Running the test with following options: Number of threads: 4 Doing memory operations speed test Memory block size: 1K Memory transfer size: 0M Memory operations type: write Memory scope type: global Threads started! Done. Operations performed: 0 ( 0.00 ops/sec) 0.00 MB transferred (0.00 MB/sec) Test execution summary: total time: 0.0003s total number of events: 0 total time taken by event execution: 0.0000 per-request statistics: min: 18446744073709.55ms avg: 0.00ms max: 0.00ms Threads fairness: events (avg/stddev): 0.0000/0.00 execution time (avg/stddev): 0.0000/0.00

    Read the article

  • Use IIS Application Initialization for keeping ASP.NET Apps alive

    - by Rick Strahl
    I've been working quite a bit with Windows Services in the recent months, and well, it turns out that Windows Services are quite a bear to debug, deploy, update and maintain. The process of getting services set up,  debugged and updated is a major chore that has to be extensively documented and or automated specifically. On most projects when a service is built, people end up scrambling for the right 'process' to use for administration. Web app deployment and maintenance on the other hand are common and well understood today, as we are constantly dealing with Web apps. There's plenty of infrastructure and tooling built into Web Tools like Visual Studio to facilitate the process. By comparison Windows Services or anything self-hosted for that matter seems convoluted.In fact, in a recent blog post I mentioned that on a recent project I'd been using self-hosting for SignalR inside of a Windows service, because the application is in fact a 'service' that also needs to send out lots of messages via SignalR. But the reality is that it could just as well be an IIS application with a service component that runs in the background. Either way you look at it, it's either a Windows Service with a built in Web Server, or an IIS application running a Service application, neither of which follows the standard Service or Web App template.Personally I much prefer Web applications. Running inside of IIS I get all the benefits of the IIS platform including service lifetime management (crash and restart), controlled shutdowns, the whole security infrastructure including easy certificate support, hot-swapping of code and the the ability to publish directly to IIS from within Visual Studio with ease.Because of these benefits we set out to move from the self hosted service into an ASP.NET Web app instead.The Missing Link for ASP.NET as a Service: Auto-LoadingI've had moments in the past where I wanted to run a 'service like' application in ASP.NET because when you think about it, it's so much easier to control a Web application remotely. Services are locked into start/stop operations, but if you host inside of a Web app you can write your own ticket and control it from anywhere. In fact nearly 10 years ago I built a background scheduling application that ran inside of ASP.NET and it worked great and it's still running doing its job today.The tricky part for running an app as a service inside of IIS then and now, is how to get IIS and ASP.NET launched so your 'service' stays alive even after an Application Pool reset. 7 years ago I faked it by using a web monitor (my own West Wind Web Monitor app) I was running anyway to monitor my various web sites for uptime, and having the monitor ping my 'service' every 20 seconds to effectively keep ASP.NET alive or fire it back up after a reload. I used a simple scheduler class that also includes some logic for 'self-reloading'. Hacky for sure, but it worked reliably.Luckily today it's much easier and more integrated to get IIS to launch ASP.NET as soon as an Application Pool is started by using the Application Initialization Module. The Application Initialization Module basically allows you to turn on Preloading on the Application Pool and the Site/IIS App, which essentially fires a request through the IIS pipeline as soon as the Application Pool has been launched. This means that effectively your ASP.NET app becomes active immediately, Application_Start is fired making sure your app stays up and running at all times. All the other features like Application Pool recycling and auto-shutdown after idle time still work, but IIS will then always immediately re-launch the application.Getting started with Application InitializationAs of IIS 8 Application Initialization is part of the IIS feature set. For IIS 7 and 7.5 there's a separate download available via Web Platform Installer. Using IIS 8 Application Initialization is an optional install component in Windows or the Windows Server Role Manager: This is an optional component so make sure you explicitly select it.IIS Configuration for Application InitializationInitialization needs to be applied on the Application Pool as well as the IIS Application level. As of IIS 8 these settings can be made through the IIS Administration console.Start with the Application Pool:Here you need to set both the Start Automatically which is always set, and the StartMode which should be set to AlwaysRunning. Both have to be set - the Start Automatically flag is set true by default and controls the starting of the application pool itself while Always Running flag is required in order to launch the application. Without the latter flag set the site settings have no effect.Now on the Site/Application level you can specify whether the site should pre load: Set the Preload Enabled flag to true.At this point ASP.NET apps should auto-load. This is all that's needed to pre-load the site if all you want is to get your site launched automatically.If you want a little more control over the load process you can add a few more settings to your web.config file that allow you to show a static page while the App is starting up. This can be useful if startup is really slow, so rather than displaying blank screen while the user is fiddling their thumbs you can display a static HTML page instead: <system.webServer> <applicationInitialization remapManagedRequestsTo="Startup.htm" skipManagedModules="true"> <add initializationPage="ping.ashx" /> </applicationInitialization> </system.webServer>This allows you to specify a page to execute in a dry run. IIS basically fakes request and pushes it directly into the IIS pipeline without hitting the network. You specify a page and IIS will fake a request to that page in this case ping.ashx which just returns a simple OK string - ie. a fast pipeline request. This request is run immediately after Application Pool restart, and while this request is running and your app is warming up, IIS can display an alternate static page - Startup.htm above. So instead of showing users an empty loading page when clicking a link on your site you can optionally show some sort of static status page that says, "we'll be right back".  I'm not sure if that's such a brilliant idea since this can be pretty disruptive in some cases. Personally I think I prefer letting people wait, but at least get the response they were supposed to get back rather than a random page. But it's there if you need it.Note that the web.config stuff is optional. If you don't provide it IIS hits the default site link (/) and even if there's no matching request at the end of that request it'll still fire the request through the IIS pipeline. Ideally though you want to make sure that an ASP.NET endpoint is hit either with your default page, or by specify the initializationPage to ensure ASP.NET actually gets hit since it's possible for IIS fire unmanaged requests only for static pages (depending how your pipeline is configured).What about AppDomain Restarts?In addition to full Worker Process recycles at the IIS level, ASP.NET also has to deal with AppDomain shutdowns which can occur for a variety of reasons:Files are updated in the BIN folderWeb Deploy to your siteweb.config is changedHard application crashThese operations don't cause the worker process to restart, but they do cause ASP.NET to unload the current AppDomain and start up a new one. Because the features above only apply to Application Pool restarts, AppDomain restarts could also cause your 'ASP.NET service' to stop processing in the background.In order to keep the app running on AppDomain recycles, you can resort to a simple ping in the Application_End event:protected void Application_End() { var client = new WebClient(); var url = App.AdminConfiguration.MonitorHostUrl + "ping.aspx"; client.DownloadString(url); Trace.WriteLine("Application Shut Down Ping: " + url); }which fires any ASP.NET url to the current site at the very end of the pipeline shutdown which in turn ensures that the site immediately starts back up.Manual Configuration in ApplicationHost.configThe above UI corresponds to the following ApplicationHost.config settings. If you're using IIS 7, there's no UI for these flags so you'll have to manually edit them.When you install the Application Initialization component into IIS it should auto-configure the module into ApplicationHost.config. Unfortunately for me, with Mr. Murphy in his best form for me, the module registration did not occur and I had to manually add it.<globalModules> <add name="ApplicationInitializationModule" image="%windir%\System32\inetsrv\warmup.dll" /> </globalModules>Most likely you won't need ever need to add this, but if things are not working it's worth to check if the module is actually registered.Next you need to configure the ApplicationPool and the Web site. The following are the two relevant entries in ApplicationHost.config.<system.applicationHost> <applicationPools> <add name="West Wind West Wind Web Connection" autoStart="true" startMode="AlwaysRunning" managedRuntimeVersion="v4.0" managedPipelineMode="Integrated"> <processModel identityType="LocalSystem" setProfileEnvironment="true" /> </add> </applicationPools> <sites> <site name="Default Web Site" id="1"> <application path="/MPress.Workflow.WebQueueMessageManager" applicationPool="West Wind West Wind Web Connection" preloadEnabled="true"> <virtualDirectory path="/" physicalPath="C:\Clients\…" /> </application> </site> </sites> </system.applicationHost>On the Application Pool make sure to set the autoStart and startMode flags to true and AlwaysRunning respectively. On the site make sure to set the preloadEnabled flag to true.And that's all you should need. You can still set the web.config settings described above as well.ASP.NET as a Service?In the particular application I'm working on currently, we have a queue manager that runs as standalone service that polls a database queue and picks out jobs and processes them on several threads. The service can spin up any number of threads and keep these threads alive in the background while IIS is running doing its own thing. These threads are newly created threads, so they sit completely outside of the IIS thread pool. In order for this service to work all it needs is a long running reference that keeps it alive for the life time of the application.In this particular app there are two components that run in the background on their own threads: A scheduler that runs various scheduled tasks and handles things like picking up emails to send out outside of IIS's scope and the QueueManager. Here's what this looks like in global.asax:public class Global : System.Web.HttpApplication { private static ApplicationScheduler scheduler; private static ServiceLauncher launcher; protected void Application_Start(object sender, EventArgs e) { // Pings the service and ensures it stays alive scheduler = new ApplicationScheduler() { CheckFrequency = 600000 }; scheduler.Start(); launcher = new ServiceLauncher(); launcher.Start(); // register so shutdown is controlled HostingEnvironment.RegisterObject(launcher); }}By keeping these objects around as static instances that are set only once on startup, they survive the lifetime of the application. The code in these classes is essentially unchanged from the Windows Service code except that I could remove the various overrides required for the Windows Service interface (OnStart,OnStop,OnResume etc.). Otherwise the behavior and operation is very similar.In this application ASP.NET serves two purposes: It acts as the host for SignalR and provides the administration interface which allows remote management of the 'service'. I can start and stop the service remotely by shutting down the ApplicationScheduler very easily. I can also very easily feed stats from the queue out directly via a couple of Web requests or (as we do now) through the SignalR service.Registering a Background Object with ASP.NETNotice also the use of the HostingEnvironment.RegisterObject(). This function registers an object with ASP.NET to let it know that it's a background task that should be notified if the AppDomain shuts down. RegisterObject() requires an interface with a Stop() method that's fired and allows your code to respond to a shutdown request. Here's what the IRegisteredObject::Stop() method looks like on the launcher:public void Stop(bool immediate = false) { LogManager.Current.LogInfo("QueueManager Controller Stopped."); Controller.StopProcessing(); Controller.Dispose(); Thread.Sleep(1500); // give background threads some time HostingEnvironment.UnregisterObject(this); }Implementing IRegisterObject should help with reliability on AppDomain shutdowns. Thanks to Justin Van Patten for pointing this out to me on Twitter.RegisterObject() is not required but I would highly recommend implementing it on whatever object controls your background processing to all clean shutdowns when the AppDomain shuts down.Testing it outI'm still in the testing phase with this particular service to see if there are any side effects. But so far it doesn't look like it. With about 50 lines of code I was able to replace the Windows service startup to Web start up - everything else just worked as is. An honorable mention goes to SignalR 2.0's oWin hosting, because with the new oWin based hosting no code changes at all were required, merely a couple of configuration file settings and an assembly directive needed, to point at the SignalR startup class. Sweet!It also seems like SignalR is noticeably faster running inside of IIS compared to self-host. Startup feels faster because of the preload.Starting and Stopping the 'Service'Because the application is running as a Web Server, it's easy to have a Web interface for starting and stopping the services running inside of the service. For our queue manager the SignalR service and front monitoring app has a play and stop button for toggling the queue.If you want more administrative control and have it work more like a Windows Service you can also stop the application pool explicitly from the command line which would be equivalent to stopping and restarting a service.To start and stop from the command line you can use the IIS appCmd tool. To stop:> %windir%\system32\inetsrv\appcmd stop apppool /apppool.name:"Weblog"and to start> %windir%\system32\inetsrv\appcmd start apppool /apppool.name:"Weblog"Note that when you explicitly force the AppPool to stop running either in the UI (on the ApplicationPools page use Start/Stop) or via command line tools, the application pool will not auto-restart immediately. You have to manually start it back up.What's not to like?There are certainly a lot of benefits to running a background service in IIS, but… ASP.NET applications do have more overhead in terms of memory footprint and startup time is a little slower, but generally for server applications this is not a big deal. If the application is stable the service should fire up and stay running indefinitely. A lot of times this kind of service interface can simply be attached to an existing Web application, or if scalability requires be offloaded to its own Web server.Easier to work withBut the ultimate benefit here is that it's much easier to work with a Web app as opposed to a service. While developing I can simply turn off the auto-launch features and launch the service on demand through IIS simply by hitting a page on the site. If I want to shut down an IISRESET -stop will shut down the service easily enough. I can then attach a debugger anywhere I want and this works like any other ASP.NET application. Yes you end up on a background thread for debugging but Visual Studio handles that just fine and if you stay on a single thread this is no different than debugging any other code.SummaryUsing ASP.NET to run background service operations is probably not a super common scenario, but it probably should be something that is considered carefully when building services. Many applications have service like features and with the auto-start functionality of the Application Initialization module, it's easy to build this functionality into ASP.NET. Especially when combined with the notification features of SignalR it becomes very, very easy to create rich services that can also communicate their status easily to the outside world.Whether it's existing applications that need some background processing for scheduling related tasks, or whether you just create a separate site altogether just to host your service it's easy to do and you can leverage the same tool chain you're already using for other Web projects. If you have lots of service projects it's worth considering… give it some thought…© Rick Strahl, West Wind Technologies, 2005-2013Posted in ASP.NET  SignalR  IIS   Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

    Read the article

  • .NET Code Evolution

    - by Alois Kraus
    Originally posted on: http://geekswithblogs.net/akraus1/archive/2013/07/24/153504.aspxAt my day job I do look at a lot of code written by other people. Most of the code is quite good and some is even a masterpiece. And there is also code which makes you think WTF… oh it was written by me. Hm not so bad after all. There are many excuses reasons for bad code. Most often it is time pressure followed by not enough ambition (who cares) or insufficient training. Normally I do care about code quality quite a lot which makes me a (perceived) slow worker who does write many tests and refines the code quite a lot because of the design deficiencies. Most of the deficiencies I do find by putting my design under stress while checking for invariants. It does also help a lot to step into the code with a debugger (sometimes also Windbg). I do this much more often when my tests are red. That way I do get a much better understanding what my code really does and not what I think it should be doing. This time I do want to show you how code can evolve over the years with different .NET Framework versions. Once there was  time where .NET 1.1 was new and many C++ programmers did switch over to get rid of not initialized pointers and memory leaks. There were also nice new data structures available such as the Hashtable which is fast lookup table with O(1) time complexity. All was good and much code was written since then. At 2005 a new version of the .NET Framework did arrive which did bring many new things like generics and new data structures. The “old” fashioned way of Hashtable were coming to an end and everyone used the new Dictionary<xx,xx> type instead which was type safe and faster because the object to type conversion (aka boxing) was no longer necessary. I think 95% of all Hashtables and dictionaries use string as key. Often it is convenient to ignore casing to make it easy to look up values which the user did enter. An often followed route is to convert the string to upper case before putting it into the Hashtable. Hashtable Table = new Hashtable(); void Add(string key, string value) { Table.Add(key.ToUpper(), value); } This is valid and working code but it has problems. First we can pass to the Hashtable a custom IEqualityComparer to do the string matching case insensitive. Second we can switch over to the now also old Dictionary type to become a little faster and we can keep the the original keys (not upper cased) in the dictionary. Dictionary<string, string> DictTable = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase); void AddDict(string key, string value) { DictTable.Add(key, value); } Many people do not user the other ctors of Dictionary because they do shy away from the overhead of writing their own comparer. They do not know that .NET has for strings already predefined comparers at hand which you can directly use. Today in the many core area we do use threads all over the place. Sometimes things break in subtle ways but most of the time it is sufficient to place a lock around the offender. Threading has become so mainstream that it may sound weird that in the year 2000 some guy got a huge incentive for the idea to reduce the time to process calibration data from 12 hours to 6 hours by using two threads on a dual core machine. Threading does make it easy to become faster at the expense of correctness. Correct and scalable multithreading can be arbitrarily hard to achieve depending on the problem you are trying to solve. Lets suppose we want to process millions of items with two threads and count the processed items processed by all threads. A typical beginners code might look like this: int Counter; void IJustLearnedToUseThreads() { var t1 = new Thread(ThreadWorkMethod); t1.Start(); var t2 = new Thread(ThreadWorkMethod); t2.Start(); t1.Join(); t2.Join(); if (Counter != 2 * Increments) throw new Exception("Hmm " + Counter + " != " + 2 * Increments); } const int Increments = 10 * 1000 * 1000; void ThreadWorkMethod() { for (int i = 0; i < Increments; i++) { Counter++; } } It does throw an exception with the message e.g. “Hmm 10.222.287 != 20.000.000” and does never finish. The code does fail because the assumption that Counter++ is an atomic operation is wrong. The ++ operator is just a shortcut for Counter = Counter + 1 This does involve reading the counter from a memory location into the CPU, incrementing value on the CPU and writing the new value back to the memory location. When we do look at the generated assembly code we will see only inc dword ptr [ecx+10h] which is only one instruction. Yes it is one instruction but it is not atomic. All modern CPUs have several layers of caches (L1,L2,L3) which try to hide the fact how slow actual main memory accesses are. Since cache is just another word for redundant copy it can happen that one CPU does read a value from main memory into the cache, modifies it and write it back to the main memory. The problem is that at least the L1 cache is not shared between CPUs so it can happen that one CPU does make changes to values which did change in meantime in the main memory. From the exception you can see we did increment the value 20 million times but half of the changes were lost because we did overwrite the already changed value from the other thread. This is a very common case and people do learn to protect their  data with proper locking.   void Intermediate() { var time = Stopwatch.StartNew(); Action acc = ThreadWorkMethod_Intermediate; var ar1 = acc.BeginInvoke(null, null); var ar2 = acc.BeginInvoke(null, null); ar1.AsyncWaitHandle.WaitOne(); ar2.AsyncWaitHandle.WaitOne(); if (Counter != 2 * Increments) throw new Exception(String.Format("Hmm {0:N0} != {1:N0}", Counter, 2 * Increments)); Console.WriteLine("Intermediate did take: {0:F1}s", time.Elapsed.TotalSeconds); } void ThreadWorkMethod_Intermediate() { for (int i = 0; i < Increments; i++) { lock (this) { Counter++; } } } This is better and does use the .NET Threadpool to get rid of manual thread management. It does give the expected result but it can result in deadlocks because you do lock on this. This is in general a bad idea since it can lead to deadlocks when other threads use your class instance as lock object. It is therefore recommended to create a private object as lock object to ensure that nobody else can lock your lock object. When you read more about threading you will read about lock free algorithms. They are nice and can improve performance quite a lot but you need to pay close attention to the CLR memory model. It does make quite weak guarantees in general but it can still work because your CPU architecture does give you more invariants than the CLR memory model. For a simple counter there is an easy lock free alternative present with the Interlocked class in .NET. As a general rule you should not try to write lock free algos since most likely you will fail to get it right on all CPU architectures. void Experienced() { var time = Stopwatch.StartNew(); Task t1 = Task.Factory.StartNew(ThreadWorkMethod_Experienced); Task t2 = Task.Factory.StartNew(ThreadWorkMethod_Experienced); t1.Wait(); t2.Wait(); if (Counter != 2 * Increments) throw new Exception(String.Format("Hmm {0:N0} != {1:N0}", Counter, 2 * Increments)); Console.WriteLine("Experienced did take: {0:F1}s", time.Elapsed.TotalSeconds); } void ThreadWorkMethod_Experienced() { for (int i = 0; i < Increments; i++) { Interlocked.Increment(ref Counter); } } Since time does move forward we do not use threads explicitly anymore but the much nicer Task abstraction which was introduced with .NET 4 at 2010. It is educational to look at the generated assembly code. The Interlocked.Increment method must be called which does wondrous things right? Lets see: lock inc dword ptr [eax] The first thing to note that there is no method call at all. Why? Because the JIT compiler does know very well about CPU intrinsic functions. Atomic operations which do lock the memory bus to prevent other processors to read stale values are such things. Second: This is the same increment call prefixed with a lock instruction. The only reason for the existence of the Interlocked class is that the JIT compiler can compile it to the matching CPU intrinsic functions which can not only increment by one but can also do an add, exchange and a combined compare and exchange operation. But be warned that the correct usage of its methods can be tricky. If you try to be clever and look a the generated IL code and try to reason about its efficiency you will fail. Only the generated machine code counts. Is this the best code we can write? Perhaps. It is nice and clean. But can we make it any faster? Lets see how good we are doing currently. Level Time in s IJustLearnedToUseThreads Flawed Code Intermediate 1,5 (lock) Experienced 0,3 (Interlocked.Increment) Master 0,1 (1,0 for int[2]) That lock free thing is really a nice thing. But if you read more about CPU cache, cache coherency, false sharing you can do even better. int[] Counters = new int[12]; // Cache line size is 64 bytes on my machine with an 8 way associative cache try for yourself e.g. 64 on more modern CPUs void Master() { var time = Stopwatch.StartNew(); Task t1 = Task.Factory.StartNew(ThreadWorkMethod_Master, 0); Task t2 = Task.Factory.StartNew(ThreadWorkMethod_Master, Counters.Length - 1); t1.Wait(); t2.Wait(); Counter = Counters[0] + Counters[Counters.Length - 1]; if (Counter != 2 * Increments) throw new Exception(String.Format("Hmm {0:N0} != {1:N0}", Counter, 2 * Increments)); Console.WriteLine("Master did take: {0:F1}s", time.Elapsed.TotalSeconds); } void ThreadWorkMethod_Master(object number) { int index = (int) number; for (int i = 0; i < Increments; i++) { Counters[index]++; } } The key insight here is to use for each core its own value. But if you simply use simply an integer array of two items, one for each core and add the items at the end you will be much slower than the lock free version (factor 3). Each CPU core has its own cache line size which is something in the range of 16-256 bytes. When you do access a value from one location the CPU does not only fetch one value from main memory but a complete cache line (e.g. 16 bytes). This means that you do not pay for the next 15 bytes when you access them. This can lead to dramatic performance improvements and non obvious code which is faster although it does have many more memory reads than another algorithm. So what have we done here? We have started with correct code but it was lacking knowledge how to use the .NET Base Class Libraries optimally. Then we did try to get fancy and used threads for the first time and failed. Our next try was better but it still had non obvious issues (lock object exposed to the outside). Knowledge has increased further and we have found a lock free version of our counter which is a nice and clean way which is a perfectly valid solution. The last example is only here to show you how you can get most out of threading by paying close attention to your used data structures and CPU cache coherency. Although we are working in a virtual execution environment in a high level language with automatic memory management it does pay off to know the details down to the assembly level. Only if you continue to learn and to dig deeper you can come up with solutions no one else was even considering. I have studied particle physics which does help at the digging deeper part. Have you ever tried to solve Quantum Chromodynamics equations? Compared to that the rest must be easy ;-). Although I am no longer working in the Science field I take pride in discovering non obvious things. This can be a very hard to find bug or a new way to restructure data to make something 10 times faster. Now I need to get some sleep ….

    Read the article

  • SQL SERVER – CXPACKET – Parallelism – Usual Solution – Wait Type – Day 6 of 28

    - by pinaldave
    CXPACKET has to be most popular one of all wait stats. I have commonly seen this wait stat as one of the top 5 wait stats in most of the systems with more than one CPU. Books On-Line: Occurs when trying to synchronize the query processor exchange iterator. You may consider lowering the degree of parallelism if contention on this wait type becomes a problem. CXPACKET Explanation: When a parallel operation is created for SQL Query, there are multiple threads for a single query. Each query deals with a different set of the data (or rows). Due to some reasons, one or more of the threads lag behind, creating the CXPACKET Wait Stat. There is an organizer/coordinator thread (thread 0), which takes waits for all the threads to complete and gathers result together to present on the client’s side. The organizer thread has to wait for the all the threads to finish before it can move ahead. The Wait by this organizer thread for slow threads to complete is called CXPACKET wait. Note that not all the CXPACKET wait types are bad. You might experience a case when it totally makes sense. There might also be cases when this is unavoidable. If you remove this particular wait type for any query, then that query may run slower because the parallel operations are disabled for the query. Reducing CXPACKET wait: We cannot discuss about reducing the CXPACKET wait without talking about the server workload type. OLTP: On Pure OLTP system, where the transactions are smaller and queries are not long but very quick usually, set the “Maximum Degree of Parallelism” to 1 (one). This way it makes sure that the query never goes for parallelism and does not incur more engine overhead. EXEC sys.sp_configure N'cost threshold for parallelism', N'1' GO RECONFIGURE WITH OVERRIDE GO Data-warehousing / Reporting server: As queries will be running for long time, it is advised to set the “Maximum Degree of Parallelism” to 0 (zero). This way most of the queries will utilize the parallel processor, and long running queries get a boost in their performance due to multiple processors. EXEC sys.sp_configure N'cost threshold for parallelism', N'0' GO RECONFIGURE WITH OVERRIDE GO Mixed System (OLTP & OLAP): Here is the challenge. The right balance has to be found. I have taken a very simple approach. I set the “Maximum Degree of Parallelism” to 2, which means the query still uses parallelism but only on 2 CPUs. However, I keep the “Cost Threshold for Parallelism” very high. This way, not all the queries will qualify for parallelism but only the query with higher cost will go for parallelism. I have found this to work best for a system that has OLTP queries and also where the reporting server is set up. Here, I am setting ‘Cost Threshold for Parallelism’ to 25 values (which is just for illustration); you can choose any value, and you can find it out by experimenting with the system only. In the following script, I am setting the ‘Max Degree of Parallelism’ to 2, which indicates that the query that will have a higher cost (here, more than 25) will qualify for parallel query to run on 2 CPUs. This implies that regardless of the number of CPUs, the query will select any two CPUs to execute itself. EXEC sys.sp_configure N'cost threshold for parallelism', N'25' GO EXEC sys.sp_configure N'max degree of parallelism', N'2' GO RECONFIGURE WITH OVERRIDE GO Read all the post in the Wait Types and Queue series. Additionally a must read comment of Jonathan Kehayias. Note: The information presented here is from my experience and I no way claim it to be accurate. I suggest you all to read the online book for further clarification. All the discussion of Wait Stats over here is generic and it varies from system to system. It is recommended that you test this on the development server before implementing on the production server. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: DMV, Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

    Read the article

  • Query on simple C++ threadpool implementation

    - by ticketman
    Stackoverflow has been a tremendous help to me and I'd to give something back to the community. I have been implementing a simple threadpool using the tinythread C++ portable thread library, using what I have learnt from Stackoverflow. I am new to thread programming, so not that comfortable with mutexes, etc. I have a question best asked after presenting the code (which runs quite well under Linux): // ThreadPool.h class ThreadPool { public: ThreadPool(); ~ThreadPool(); // Creates a pool of threads and gets them ready to be used void CreateThreads(int numOfThreads); // Assigns a job to a thread in the pool, but doesn't start the job // Each SubmitJob call will use up one thread of the pool. // This operation can only be undone by calling StartJobs and // then waiting for the jobs to complete. On completion, // new jobs may be submitted. void SubmitJob( void (*workFunc)(void *), void *workData ); // Begins execution of all the jobs in the pool. void StartJobs(); // Waits until all jobs have completed. // The wait will block the caller. // On completion, new jobs may be submitted. void WaitForJobsToComplete(); private: enum typeOfWorkEnum { e_work, e_quit }; class ThreadData { public: bool ready; // thread has been created and is ready for work bool haveWorkToDo; typeOfWorkEnum typeOfWork; // Pointer to the work function each thread has to call. void (*workFunc)(void *); // Pointer to work data void *workData; ThreadData() : ready(false), haveWorkToDo(false) { }; }; struct ThreadArgStruct { ThreadPool *threadPoolInstance; int threadId; }; // Data for each thread ThreadData *m_ThreadData; ThreadPool(ThreadPool const&); // copy ctor hidden ThreadPool& operator=(ThreadPool const&); // assign op. hidden // Static function that provides the function pointer that a thread can call // By including the ThreadPool instance in the void * parameter, // we can use it to access other data and methods in the ThreadPool instance. static void ThreadFuncWrapper(void *arg) { ThreadArgStruct *threadArg = static_cast<ThreadArgStruct *>(arg); threadArg->threadPoolInstance->ThreadFunc(threadArg->threadId); } // The function each thread calls void ThreadFunc( int threadId ); // Called by the thread pool destructor void DestroyThreadPool(); // Total number of threads available // (fixed on creation of thread pool) int m_numOfThreads; int m_NumOfThreadsDoingWork; int m_NumOfThreadsGivenJobs; // List of threads std::vector<tthread::thread *> m_ThreadList; // Condition variable to signal each thread has been created and executing tthread::mutex m_ThreadReady_mutex; tthread::condition_variable m_ThreadReady_condvar; // Condition variable to signal each thread to start work tthread::mutex m_WorkToDo_mutex; tthread::condition_variable m_WorkToDo_condvar; // Condition variable to signal the main thread that // all threads in the pool have completed their work tthread::mutex m_WorkCompleted_mutex; tthread::condition_variable m_WorkCompleted_condvar; }; cpp file: // // ThreadPool.cpp // #include "ThreadPool.h" // This is the thread function for each thread. // All threads remain in this function until // they are asked to quit, which only happens // when terminating the thread pool. void ThreadPool::ThreadFunc( int threadId ) { ThreadData *myThreadData = &m_ThreadData[threadId]; std::cout << "Hello world: Thread " << threadId << std::endl; // Signal that this thread is ready m_ThreadReady_mutex.lock(); myThreadData->ready = true; m_ThreadReady_condvar.notify_one(); // notify the main thread m_ThreadReady_mutex.unlock(); while(true) { //tthread::lock_guard<tthread::mutex> guard(m); m_WorkToDo_mutex.lock(); while(!myThreadData->haveWorkToDo) // check for work to do m_WorkToDo_condvar.wait(m_WorkToDo_mutex); // if no work, wait here myThreadData->haveWorkToDo = false; // need to do this before unlocking the mutex m_WorkToDo_mutex.unlock(); // Do the work switch(myThreadData->typeOfWork) { case e_work: std::cout << "Thread " << threadId << ": Woken with work to do\n"; // Do work myThreadData->workFunc(myThreadData->workData); std::cout << "#Thread " << threadId << ": Work is completed\n"; break; case e_quit: std::cout << "Thread " << threadId << ": Asked to quit\n"; return; // ends the thread } // Now to signal the main thread that my work is completed m_WorkCompleted_mutex.lock(); m_NumOfThreadsDoingWork--; // Unsure if this 'if' would make the program more efficient // if(NumOfThreadsDoingWork == 0) m_WorkCompleted_condvar.notify_one(); // notify the main thread m_WorkCompleted_mutex.unlock(); } } ThreadPool::ThreadPool() { m_numOfThreads = 0; m_NumOfThreadsDoingWork = 0; m_NumOfThreadsGivenJobs = 0; } ThreadPool::~ThreadPool() { if(m_numOfThreads) { DestroyThreadPool(); delete [] m_ThreadData; } } void ThreadPool::CreateThreads(int numOfThreads) { // Check a thread pool has already been created if(m_numOfThreads > 0) return; m_NumOfThreadsGivenJobs = 0; m_NumOfThreadsDoingWork = 0; m_numOfThreads = numOfThreads; m_ThreadData = new ThreadData[m_numOfThreads]; ThreadArgStruct threadArg; for(int i=0; i<m_numOfThreads; ++i) { threadArg.threadId = i; threadArg.threadPoolInstance = this; // Creates the thread and save in a list so we can destroy it later m_ThreadList.push_back( new tthread::thread( ThreadFuncWrapper, (void *)&threadArg ) ); // It takes a little time for a thread to get established. // Best wait until it gets established before creating the next thread. m_ThreadReady_mutex.lock(); while(!m_ThreadData[i].ready) // Check if thread is ready m_ThreadReady_condvar.wait(m_ThreadReady_mutex); // If not, wait here m_ThreadReady_mutex.unlock(); } } // Adds a job to the batch, but doesn't start the job void ThreadPool::SubmitJob(void (*workFunc)(void *), void *workData) { // Check that the thread pool has been created if(!m_numOfThreads) return; if(m_NumOfThreadsGivenJobs >= m_numOfThreads) return; m_ThreadData[m_NumOfThreadsGivenJobs].workFunc = workFunc; m_ThreadData[m_NumOfThreadsGivenJobs].workData = workData; std::cout << "Submitted job " << m_NumOfThreadsGivenJobs << std::endl; m_NumOfThreadsGivenJobs++; } void ThreadPool::StartJobs() { // Check that the thread pool has been created // and some jobs have been assigned if(!m_numOfThreads || !m_NumOfThreadsGivenJobs) return; // Set 'haveworkToDo' flag for all threads m_WorkToDo_mutex.lock(); for(int i=0; i<m_NumOfThreadsGivenJobs; ++i) m_ThreadData[i].haveWorkToDo = true; m_NumOfThreadsDoingWork = m_NumOfThreadsGivenJobs; // Reset this counter so we can resubmit jobs later m_NumOfThreadsGivenJobs = 0; // Notify all threads they have work to do m_WorkToDo_condvar.notify_all(); m_WorkToDo_mutex.unlock(); } void ThreadPool::WaitForJobsToComplete() { // Check that a thread pool has been created if(!m_numOfThreads) return; m_WorkCompleted_mutex.lock(); while(m_NumOfThreadsDoingWork > 0) // Check if all threads have completed their work m_WorkCompleted_condvar.wait(m_WorkCompleted_mutex); // If not, wait here m_WorkCompleted_mutex.unlock(); } void ThreadPool::DestroyThreadPool() { std::cout << "Ask threads to quit\n"; m_WorkToDo_mutex.lock(); for(int i=0; i<m_numOfThreads; ++i) { m_ThreadData[i].haveWorkToDo = true; m_ThreadData[i].typeOfWork = e_quit; } m_WorkToDo_condvar.notify_all(); m_WorkToDo_mutex.unlock(); // As each thread terminates, catch them here for(int i=0; i<m_numOfThreads; ++i) { tthread::thread *t = m_ThreadList[i]; // Wait for thread to complete t->join(); } m_numOfThreads = 0; } Example of usage: (this calculates pi-squared/6) struct CalculationDataStruct { int inputVal; double outputVal; }; void LongCalculation( void *theSums ) { CalculationDataStruct *sums = (CalculationDataStruct *)theSums; int terms = sums->inputVal; double sum; for(int i=1; i<terms; i++) sum += 1.0/( double(i)*double(i) ); sums->outputVal = sum; } int main(int argc, char** argv) { int numThreads = 10; // Create pool ThreadPool threadPool; threadPool.CreateThreads(numThreads); // Create thread workspace CalculationDataStruct sums[numThreads]; // Set up jobs for(int i=0; i<numThreads; i++) { sums[i].inputVal = 3000*(i+1); threadPool.SubmitJob(LongCalculation, &sums[i]); } // Run the jobs threadPool.StartJobs(); threadPool.WaitForJobsToComplete(); // Print results for(int i=0; i<numThreads; i++) std::cout << "Sum of " << sums[i].inputVal << " terms is " << sums[i].outputVal << std::endl; return 0; } Question: In the ThreadPool::ThreadFunc method, would better performance be obtained if the following if statement if(NumOfThreadsDoingWork == 0) was included? Also, I'd be grateful of criticisms and ways to improve the code. At the same time, I hope the code is of use to others.

    Read the article

  • SPARC T4-4 Delivers World Record First Result on PeopleSoft Combined Benchmark

    - by Brian
    Oracle's SPARC T4-4 servers running Oracle's PeopleSoft HCM 9.1 combined online and batch benchmark achieved World Record 18,000 concurrent users while executing a PeopleSoft Payroll batch job of 500,000 employees in 43.32 minutes and maintaining online users response time at < 2 seconds. This world record is the first to run online and batch workloads concurrently. This result was obtained with a SPARC T4-4 server running Oracle Database 11g Release 2, a SPARC T4-4 server running PeopleSoft HCM 9.1 application server and a SPARC T4-2 server running Oracle WebLogic Server in the web tier. The SPARC T4-4 server running the application tier used Oracle Solaris Zones which provide a flexible, scalable and manageable virtualization environment. The average CPU utilization on the SPARC T4-2 server in the web tier was 17%, on the SPARC T4-4 server in the application tier it was 59%, and on the SPARC T4-4 server in the database tier was 35% (online and batch) leaving significant headroom for additional processing across the three tiers. The SPARC T4-4 server used for the database tier hosted Oracle Database 11g Release 2 using Oracle Automatic Storage Management (ASM) for database files management with I/O performance equivalent to raw devices. This is the first three tier mixed workload (online and batch) PeopleSoft benchmark also processing PeopleSoft payroll batch workload. Performance Landscape PeopleSoft HR Self-Service and Payroll Benchmark Systems Users Ave Response Search (sec) Ave Response Save (sec) Batch Time (min) Streams SPARC T4-2 (web) SPARC T4-4 (app) SPARC T4-2 (db) 18,000 0.944 0.503 43.32 64 Configuration Summary Application Configuration: 1 x SPARC T4-4 server with 4 x SPARC T4 processors, 3.0 GHz 512 GB memory 5 x 300 GB SAS internal disks 1 x 100 GB and 2 x 300 GB internal SSDs 2 x 10 Gbe HBA Oracle Solaris 11 11/11 PeopleTools 8.52 PeopleSoft HCM 9.1 Oracle Tuxedo, Version 10.3.0.0, 64-bit, Patch Level 031 Java Platform, Standard Edition Development Kit 6 Update 32 Database Configuration: 1 x SPARC T4-4 server with 4 x SPARC T4 processors, 3.0 GHz 256 GB memory 3 x 300 GB SAS internal disks Oracle Solaris 11 11/11 Oracle Database 11g Release 2 Web Tier Configuration: 1 x SPARC T4-2 server with 2 x SPARC T4 processors, 2.85 GHz 256 GB memory 2 x 300 GB SAS internal disks 1 x 100 GB internal SSD Oracle Solaris 11 11/11 PeopleTools 8.52 Oracle WebLogic Server 10.3.4 Java Platform, Standard Edition Development Kit 6 Update 32 Storage Configuration: 1 x Sun Server X2-4 as a COMSTAR head for data 4 x Intel Xeon X7550, 2.0 GHz 128 GB memory 1 x Sun Storage F5100 Flash Array (80 flash modules) 1 x Sun Storage F5100 Flash Array (40 flash modules) 1 x Sun Fire X4275 as a COMSTAR head for redo logs 12 x 2 TB SAS disks with Niwot Raid controller Benchmark Description This benchmark combines PeopleSoft HCM 9.1 HR Self Service online and PeopleSoft Payroll batch workloads to run on a unified database deployed on Oracle Database 11g Release 2. The PeopleSoft HRSS benchmark kit is a Oracle standard benchmark kit run by all platform vendors to measure the performance. It's an OLTP benchmark where DB SQLs are moderately complex. The results are certified by Oracle and a white paper is published. PeopleSoft HR SS defines a business transaction as a series of HTML pages that guide a user through a particular scenario. Users are defined as corporate Employees, Managers and HR administrators. The benchmark consist of 14 scenarios which emulate users performing typical HCM transactions such as viewing paycheck, promoting and hiring employees, updating employee profile and other typical HCM application transactions. All these transactions are well-defined in the PeopleSoft HR Self-Service 9.1 benchmark kit. This benchmark metric is the weighted average response search/save time for all the transactions. The PeopleSoft 9.1 Payroll (North America) benchmark demonstrates system performance for a range of processing volumes in a specific configuration. This workload represents large batch runs typical of a ERP environment during a mass update. The benchmark measures five application business process run times for a database representing large organization. They are Paysheet Creation, Payroll Calculation, Payroll Confirmation, Print Advice forms, and Create Direct Deposit File. The benchmark metric is the cumulative elapsed time taken to complete the Paysheet Creation, Payroll Calculation and Payroll Confirmation business application processes. The benchmark metrics are taken for each respective benchmark while running simultaneously on the same database back-end. Specifically, the payroll batch processes are started when the online workload reaches steady state (the maximum number of online users) and overlap with online transactions for the duration of the steady state. Key Points and Best Practices Two Oracle PeopleSoft Domain sets with 200 application servers each on a SPARC T4-4 server were hosted in 2 separate Oracle Solaris Zones to demonstrate consolidation of multiple application servers, ease of administration and performance tuning. Each Oracle Solaris Zone was bound to a separate processor set, each containing 15 cores (total 120 threads). The default set (1 core from first and third processor socket, total 16 threads) was used for network and disk interrupt handling. This was done to improve performance by reducing memory access latency by using the physical memory closest to the processors and offload I/O interrupt handling to default set threads, freeing up cpu resources for Application Servers threads and balancing application workload across 240 threads. See Also Oracle PeopleSoft Benchmark White Papers oracle.com SPARC T4-2 Server oracle.com OTN SPARC T4-4 Server oracle.com OTN PeopleSoft Enterprise Human Capital Management oracle.com OTN PeopleSoft Enterprise Human Capital Management (Payroll) oracle.com OTN Oracle Solaris oracle.com OTN Oracle Database 11g Release 2 Enterprise Edition oracle.com OTN Disclosure Statement Oracle's PeopleSoft HR and Payroll combined benchmark, www.oracle.com/us/solutions/benchmark/apps-benchmark/peoplesoft-167486.html, results 09/30/2012.

    Read the article

  • Suggestions for duplicate file finder algorithm (using C)

    - by Andrei Ciobanu
    Hello, I wanted to write a program that test if two files are duplicates (have exactly the same content). First I test if the files have the same sizes, and if they have i start to compare their contents. My first idea, was to "split" the files into fixed size blocks, then start a thread for every block, fseek to startup character of every block and continue the comparisons in parallel. When a comparison from a thread fails, the other working threads are canceled, and the program exits out of the thread spawning loop. The code looks like this: dupf.h #ifndef __NM__DUPF__H__ #define __NM__DUPF__H__ #define NUM_THREADS 15 #define BLOCK_SIZE 8192 /* Thread argument structure */ struct thread_arg_s { const char *name_f1; /* First file name */ const char *name_f2; /* Second file name */ int cursor; /* Where to seek in the file */ }; typedef struct thread_arg_s thread_arg; /** * 'arg' is of type thread_arg. * Checks if the specified file blocks are * duplicates. */ void *check_block_dup(void *arg); /** * Checks if two files are duplicates */ int check_dup(const char *name_f1, const char *name_f2); /** * Returns a valid pointer to a file. * If the file (given by the path/name 'fname') cannot be opened * in 'mode', the program is interrupted an error message is shown. **/ FILE *safe_fopen(const char *name, const char *mode); #endif dupf.c #include <errno.h> #include <pthread.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/types.h> #include <sys/stat.h> #include <unistd.h> #include "dupf.h" FILE *safe_fopen(const char *fname, const char *mode) { FILE *f = NULL; f = fopen(fname, mode); if (f == NULL) { char emsg[255]; sprintf(emsg, "FOPEN() %s\t", fname); perror(emsg); exit(-1); } return (f); } void *check_block_dup(void *arg) { const char *name_f1 = NULL, *name_f2 = NULL; /* File names */ FILE *f1 = NULL, *f2 = NULL; /* Streams */ int cursor = 0; /* Reading cursor */ char buff_f1[BLOCK_SIZE], buff_f2[BLOCK_SIZE]; /* Character buffers */ int rchars_1, rchars_2; /* Readed characters */ /* Initializing variables from 'arg' */ name_f1 = ((thread_arg*)arg)->name_f1; name_f2 = ((thread_arg*)arg)->name_f2; cursor = ((thread_arg*)arg)->cursor; /* Opening files */ f1 = safe_fopen(name_f1, "r"); f2 = safe_fopen(name_f2, "r"); /* Setup cursor in files */ fseek(f1, cursor, SEEK_SET); fseek(f2, cursor, SEEK_SET); /* Initialize buffers */ rchars_1 = fread(buff_f1, 1, BLOCK_SIZE, f1); rchars_2 = fread(buff_f2, 1, BLOCK_SIZE, f2); if (rchars_1 != rchars_2) { /* fread failed to read the same portion. * program cannot continue */ perror("ERROR WHEN READING BLOCK"); exit(-1); } while (rchars_1-->0) { if (buff_f1[rchars_1] != buff_f2[rchars_1]) { /* Different characters */ fclose(f1); fclose(f2); pthread_exit("notdup"); } } /* Close streams */ fclose(f1); fclose(f2); pthread_exit("dup"); } int check_dup(const char *name_f1, const char *name_f2) { int num_blocks = 0; /* Number of 'blocks' to check */ int num_tsp = 0; /* Number of threads spawns */ int tsp_iter = 0; /* Iterator for threads spawns */ pthread_t *tsp_threads = NULL; thread_arg *tsp_threads_args = NULL; int tsp_threads_iter = 0; int thread_c_res = 0; /* Thread creation result */ int thread_j_res = 0; /* Thread join res */ int loop_res = 0; /* Function result */ int cursor; struct stat buf_f1; struct stat buf_f2; if (name_f1 == NULL || name_f2 == NULL) { /* Invalid input parameters */ perror("INVALID FNAMES\t"); return (-1); } if (stat(name_f1, &buf_f1) != 0 || stat(name_f2, &buf_f2) != 0) { /* Stat fails */ char emsg[255]; sprintf(emsg, "STAT() ERROR: %s %s\t", name_f1, name_f2); perror(emsg); return (-1); } if (buf_f1.st_size != buf_f2.st_size) { /* File have different sizes */ return (1); } /* Files have the same size, function exec. is continued */ num_blocks = (buf_f1.st_size / BLOCK_SIZE) + 1; num_tsp = (num_blocks / NUM_THREADS) + 1; cursor = 0; for (tsp_iter = 0; tsp_iter < num_tsp; tsp_iter++) { loop_res = 0; /* Create threads array for this spawn */ tsp_threads = malloc(NUM_THREADS * sizeof(*tsp_threads)); if (tsp_threads == NULL) { perror("TSP_THREADS ALLOC FAILURE\t"); return (-1); } /* Create arguments for every thread in the current spawn */ tsp_threads_args = malloc(NUM_THREADS * sizeof(*tsp_threads_args)); if (tsp_threads_args == NULL) { perror("TSP THREADS ARGS ALLOCA FAILURE\t"); return (-1); } /* Initialize arguments and create threads */ for (tsp_threads_iter = 0; tsp_threads_iter < NUM_THREADS; tsp_threads_iter++) { if (cursor >= buf_f1.st_size) { break; } tsp_threads_args[tsp_threads_iter].name_f1 = name_f1; tsp_threads_args[tsp_threads_iter].name_f2 = name_f2; tsp_threads_args[tsp_threads_iter].cursor = cursor; thread_c_res = pthread_create( &tsp_threads[tsp_threads_iter], NULL, check_block_dup, (void*)&tsp_threads_args[tsp_threads_iter]); if (thread_c_res != 0) { perror("THREAD CREATION FAILURE"); return (-1); } cursor+=BLOCK_SIZE; } /* Join last threads and get their status */ while (tsp_threads_iter-->0) { void *thread_res = NULL; thread_j_res = pthread_join(tsp_threads[tsp_threads_iter], &thread_res); if (thread_j_res != 0) { perror("THREAD JOIN FAILURE"); return (-1); } if (strcmp((char*)thread_res, "notdup")==0) { loop_res++; /* Closing other threads and exiting by condition * from loop. */ while (tsp_threads_iter-->0) { pthread_cancel(tsp_threads[tsp_threads_iter]); } } } free(tsp_threads); free(tsp_threads_args); if (loop_res > 0) { break; } } return (loop_res > 0) ? 1 : 0; } The function works fine (at least for what I've tested). Still, some guys from #C (freenode) suggested that the solution is overly complicated, and it may perform poorly because of parallel reading on hddisk. What I want to know: Is the threaded approach flawed by default ? Is fseek() so slow ? Is there a way to somehow map the files to memory and then compare them ?

    Read the article

  • linux thread synchronization

    - by johnnycrash
    I am new to linux and linux threads. I have spent some time googling to try to understand the differences between all the functions available for thread synchronization. I still have some questions. I have found all of these different types of synchronizations, each with a number of functions for locking, unlocking, testing the lock, etc. gcc atomic operations futexes mutexes spinlocks seqlocks rculocks conditions semaphores My current (but probably flawed) understanding is this: semaphores are process wide, involve the filesystem (virtually I assume), and are probably the slowest. Futexes might be the base locking mechanism used by mutexes, spinlocks, seqlocks, and rculocks. Futexes might be faster than the locking mechanisms that are based on them. Spinlocks dont block and thus avoid context swtiches. However they avoid the context switch at the expense of consuming all the cycles on a CPU until the lock is released (spinning). They should only should be used on multi processor systems for obvious reasons. Never sleep in a spinlock. The seq lock just tells you when you finished your work if a writer changed the data the work was based on. You have to go back and repeat the work in this case. Atomic operations are the fastest synch call, and probably are used in all the above locking mechanisms. You do not want to use atomic operations on all the fields in your shared data. You want to use a lock (mutex, futex, spin, seq, rcu) or a single atomic opertation on a lock flag when you are accessing multiple data fields. My questions go like this: Am I right so far with my assumptions? Does anyone know the cpu cycle cost of the various options? I am adding parallelism to the app so we can get better wall time response at the expense of running fewer app instances per box. Performances is the utmost consideration. I don't want to consume cpu with context switching, spinning, or lots of extra cpu cycles to read and write shared memory. I am absolutely concerned with number of cpu cycles consumed. Which (if any) of the locks prevent interruption of a thread by the scheduler or interrupt...or am I just an idiot and all synchonization mechanisms do this. What kinds of interruption are prevented? Can I block all threads or threads just on the locking thread's CPU? This question stems from my fear of interrupting a thread holding a lock for a very commonly used function. I expect that the scheduler might schedule any number of other workers who will likely run into this function and then block because it was locked. A lot of context switching would be wasted until the thread with the lock gets rescheduled and finishes. I can re-write this function to minimize lock time, but still it is so commonly called I would like to use a lock that prevents interruption...across all processors. I am writing user code...so I get software interrupts, not hardware ones...right? I should stay away from any functions (spin/seq locks) that have the word "irq" in them. Which locks are for writing kernel or driver code and which are meant for user mode? Does anyone think using an atomic operation to have multiple threads move through a linked list is nuts? I am thinking to atomicly change the current item pointer to the next item in the list. If the attempt works, then the thread can safely use the data the current item pointed to before it was moved. Other threads would now be moved along the list. futexes? Any reason to use them instead of mutexes? Is there a better way than using a condition to sleep a thread when there is no work? When using gcc atomic ops, specifically the test_and_set, can I get a performance increase by doing a non atomic test first and then using test_and_set to confirm? *I know this will be case specific, so here is the case. There is a large collection of work items, say thousands. Each work item has a flag that is initialized to 0. When a thread has exclusive access to the work item, the flag will be one. There will be lots of worker threads. Any time a thread is looking for work, they can non atomicly test for 1. If they read a 1, we know for certain that the work is unavailable. If they read a zero, they need to perform the atomic test_and_set to confirm. So if the atomic test_and_set is 500 cpu cycles because it is disabling pipelining, causes cpu's to communicate and L2 caches to flush/fill .... and a simple test is 1 cycle .... then as long as I had a better ratio of 500 to 1 when it came to stumbling upon already completed work items....this would be a win.* I hope to use mutexes or spinlocks to sparilngly protect sections of code that I want only one thread on the SYSTEM (not jsut the CPU) to access at a time. I hope to sparingly use gcc atomic ops to select work and minimize use of mutexes and spinlocks. For instance: a flag in a work item can be checked to see if a thread has worked it (0=no, 1=yes or in progress). A simple test_and_set tells the thread if it has work or needs to move on. I hope to use conditions to wake up threads when there is work. Thanks!

    Read the article

  • ASP.NET request queue priority

    - by dan
    I'm on IIS 7 and .NET 4.0. My understanding is that IIS takes requests and passes them off to ASP.NET worker threads. If all the threads are in use, the request goes into a queue and is processed once a thread becomes available. If the queue goes over a certain size, all new requests get a 503 until there is room in the queue again. Is there a way to prioritize the order in which queued requests are served? For example, I have consumer traffic and infrastructure traffic coming to the same server. If there are no available threads, I'd like for the consumer requests to be served first, even if they have arrived after infrastructure requests. Basically I want to replace the request queue with a priority queue. Is this possible with IIS?

    Read the article

  • How to use supervisord to run a PHP script as a daemon?

    - by Alasdair
    I need to have 8 threads of the same PHP script running in the background on a server continuously (as a daemon), and each script need to be automatically restarted if it exits for any reason. I've been advised to use supervisord to do this, but I don't at all understand their documentation, which seems very complicated to me. I also want each of the 8 threads of the script to be initially started at 2 minute intervals (2 minutes in between each launch) but then after this all 8 threads of the same PHP script should continue running on the server forever (and restarting if any exit for any reason). Could someone please explain how to do this with supervisord, or any other easy way of doing it? I'm on CentOS 6. Thank you!

    Read the article

  • How many VPS' can I create on my server?

    - by user197692
    I need to create as many VPS's on my dedicated server (KVM or OpenVZ) in order to sell them but I really don't know the answer. RAM calculation is simple, it's more about CPU resources, how many VPS's can hold. I'am talking about Intel i7-2600 (4 cores, 8 Threads). I need to deploy as many VPS's. It's all about the nr threads? i.e. 8 threads = maximum 8 x 1vCPU or maximum 4 x 2vCPU? I'am planning to use 1Gb and 2Gb memory on each VPS, the server has 16Gb (but I can raise RAM if need it. So, can I create 8 KVM VPS's with 4 vCPU and 2Gb ram each ? How about 20 VPS's with 1Gb ram and 4vCPU each? How is this decision affected by the hypervisor (KVM, OpenVZ, VMware)?

    Read the article

  • Restrict whole system on certain cores except a few process?

    - by icando
    Hi I am running some latency sensitive program on a Linux machine (more specifically, CentOS 6), and I don't want the threads of the process being preempted. So in my plan, the first step is to set cpu affinity of the threads so that threads are running on separate cores, so they don't preempt each other. Then the second step is to make sure other processes in the system not running on these cores. So my question is: is it possible to restrict the whole system running on certain cores, except this process? This should apply to any newly created processes in the future.

    Read the article

  • How do I make time?

    - by SystemNetworks
    I wanted to output a text for a certain amount of time. One way is to use threads. Are there any other ways? I can't use threads for slick2d. This is my code when I use threads for slick: package javagame; import org.newdawn.slick.GameContainer; import org.newdawn.slick.Graphics; import org.newdawn.slick.Image; import java.util.Random; import org.newdawn.slick.Input; import org.newdawn.slick.*; import org.newdawn.slick.state.*; import org.lwjgl.input.Mouse; public class thread1 implements Runnable { String showUp; int timeLeft; public thread1(String s) { s = showUp; } public void run(Graphics g) { try { g.drawString("%s is sleeping %d", 500, 500); Thread.sleep(timeLeft); g.drawString("%s is awake", 600,600); } catch(Exception e) { } } @Override public void run() { // TODO Auto-generated method stub run(); } } It auto generates a new run() And also when I call it to my main class it has stack overflow!

    Read the article

  • Matrix Multiplication with C++ AMP

    - by Daniel Moth
    As part of our API tour of C++ AMP, we looked recently at parallel_for_each. I ended that post by saying we would revisit parallel_for_each after introducing array and array_view. Now is the time, so this is part 2 of parallel_for_each, and also a post that brings together everything we've seen until now. The code for serial and accelerated Consider a naïve (or brute force) serial implementation of matrix multiplication  0: void MatrixMultiplySerial(std::vector<float>& vC, const std::vector<float>& vA, const std::vector<float>& vB, int M, int N, int W) 1: { 2: for (int row = 0; row < M; row++) 3: { 4: for (int col = 0; col < N; col++) 5: { 6: float sum = 0.0f; 7: for(int i = 0; i < W; i++) 8: sum += vA[row * W + i] * vB[i * N + col]; 9: vC[row * N + col] = sum; 10: } 11: } 12: } We notice that each loop iteration is independent from each other and so can be parallelized. If in addition we have really large amounts of data, then this is a good candidate to offload to an accelerator. First, I'll just show you an example of what that code may look like with C++ AMP, and then we'll analyze it. It is assumed that you included at the top of your file #include <amp.h> 13: void MatrixMultiplySimple(std::vector<float>& vC, const std::vector<float>& vA, const std::vector<float>& vB, int M, int N, int W) 14: { 15: concurrency::array_view<const float,2> a(M, W, vA); 16: concurrency::array_view<const float,2> b(W, N, vB); 17: concurrency::array_view<concurrency::writeonly<float>,2> c(M, N, vC); 18: concurrency::parallel_for_each(c.grid, 19: [=](concurrency::index<2> idx) restrict(direct3d) { 20: int row = idx[0]; int col = idx[1]; 21: float sum = 0.0f; 22: for(int i = 0; i < W; i++) 23: sum += a(row, i) * b(i, col); 24: c[idx] = sum; 25: }); 26: } First a visual comparison, just for fun: The beginning and end is the same, i.e. lines 0,1,12 are identical to lines 13,14,26. The double nested loop (lines 2,3,4,5 and 10,11) has been transformed into a parallel_for_each call (18,19,20 and 25). The core algorithm (lines 6,7,8,9) is essentially the same (lines 21,22,23,24). We have extra lines in the C++ AMP version (15,16,17). Now let's dig in deeper. Using array_view and extent When we decided to convert this function to run on an accelerator, we knew we couldn't use the std::vector objects in the restrict(direct3d) function. So we had a choice of copying the data to the the concurrency::array<T,N> object, or wrapping the vector container (and hence its data) with a concurrency::array_view<T,N> object from amp.h – here we used the latter (lines 15,16,17). Now we can access the same data through the array_view objects (a and b) instead of the vector objects (vA and vB), and the added benefit is that we can capture the array_view objects in the lambda (lines 19-25) that we pass to the parallel_for_each call (line 18) and the data will get copied on demand for us to the accelerator. Note that line 15 (and ditto for 16 and 17) could have been written as two lines instead of one: extent<2> e(M, W); array_view<const float, 2> a(e, vA); In other words, we could have explicitly created the extent object instead of letting the array_view create it for us under the covers through the constructor overload we chose. The benefit of the extent object in this instance is that we can express that the data is indeed two dimensional, i.e a matrix. When we were using a vector object we could not do that, and instead we had to track via additional unrelated variables the dimensions of the matrix (i.e. with the integers M and W) – aren't you loving C++ AMP already? Note that the const before the float when creating a and b, will result in the underling data only being copied to the accelerator and not be copied back – a nice optimization. A similar thing is happening on line 17 when creating array_view c, where we have indicated that we do not need to copy the data to the accelerator, only copy it back. The kernel dispatch On line 18 we make the call to the C++ AMP entry point (parallel_for_each) to invoke our parallel loop or, as some may say, dispatch our kernel. The first argument we need to pass describes how many threads we want for this computation. For this algorithm we decided that we want exactly the same number of threads as the number of elements in the output matrix, i.e. in array_view c which will eventually update the vector vC. So each thread will compute exactly one result. Since the elements in c are organized in a 2-dimensional manner we can organize our threads in a two-dimensional manner too. We don't have to think too much about how to create the first argument (a grid) since the array_view object helpfully exposes that as a property. Note that instead of c.grid we could have written grid<2>(c.extent) or grid<2>(extent<2>(M, N)) – the result is the same in that we have specified M*N threads to execute our lambda. The second argument is a restrict(direct3d) lambda that accepts an index object. Since we elected to use a two-dimensional extent as the first argument of parallel_for_each, the index will also be two-dimensional and as covered in the previous posts it represents the thread ID, which in our case maps perfectly to the index of each element in the resulting array_view. The kernel itself The lambda body (lines 20-24), or as some may say, the kernel, is the code that will actually execute on the accelerator. It will be called by M*N threads and we can use those threads to index into the two input array_views (a,b) and write results into the output array_view ( c ). The four lines (21-24) are essentially identical to the four lines of the serial algorithm (6-9). The only difference is how we index into a,b,c versus how we index into vA,vB,vC. The code we wrote with C++ AMP is much nicer in its indexing, because the dimensionality is a first class concept, so you don't have to do funny arithmetic calculating the index of where the next row starts, which you have to do when working with vectors directly (since they store all the data in a flat manner). I skipped over describing line 20. Note that we didn't really need to read the two components of the index into temporary local variables. This mostly reflects my personal choice, in some algorithms to break down the index into local variables with names that make sense for the algorithm, i.e. in this case row and col. In other cases it may i,j,k or x,y,z, or M,N or whatever. Also note that we could have written line 24 as: c(idx[0], idx[1])=sum  or  c(row, col)=sum instead of the simpler c[idx]=sum Targeting a specific accelerator Imagine that we had more than one hardware accelerator on a system and we wanted to pick a specific one to execute this parallel loop on. So there would be some code like this anywhere before line 18: vector<accelerator> accs = MyFunctionThatChoosesSuitableAccelerators(); accelerator acc = accs[0]; …and then we would modify line 18 so we would be calling another overload of parallel_for_each that accepts an accelerator_view as the first argument, so it would become: concurrency::parallel_for_each(acc.default_view, c.grid, ...and the rest of your code remains the same… how simple is that? Comments about this post by Daniel Moth welcome at the original blog.

    Read the article

< Previous Page | 35 36 37 38 39 40 41 42 43 44 45 46  | Next Page >