Search Results

Search found 5945 results on 238 pages for 'green threads'.

Page 57/238 | < Previous Page | 53 54 55 56 57 58 59 60 61 62 63 64 | Next Page >

Thread mutex behaviour

- by Alberteddu

Hi there, I'm learning C. I'm writing an application with multiple threads; I know that when a variable is shared between two or more threads, it is better to lock/unlock using a mutex to avoid deadlock and inconsistency of variables. This is very clear when I want to change or view one variable. int i = 0; /** Global */ static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; /** Thread 1. */ pthread_mutex_lock(&mutex); i++; pthread_mutex_unlock(&mutex); /** Thread 2. */ pthread_mutex_lock(&mutex); i++; pthread_mutex_unlock(&mutex); This is correct, I think. The variable i, at the end of the executions, contains the integer 2. Anyway, there are some situations in which I don't know exactly where to put the two function calls. For example, suppose you have a function obtain(), which returns a global variable. I need to call that function from within the two threads. I have also two other threads that call the function set(), defined with a few arguments; this function will set the same global variable. The two functions are necessary when you need to do something before getting/setting the var. /** (0) */ /** Thread 1, or 2, or 3... */ if(obtain() == something) { if(obtain() == somethingElse) { // Do this, sometimes obtain() and sometimes set(random number) (1) } else { // Do that, just obtain(). (2) } } else { // Do this and do that (3) // If # of thread * 3 > 10, then set(3*10) For example. (4) } /** (5) */ Where I have to lock, and where I have to unlock? The situation can be, I think, even more complex. I will appreciate an exhaustive answer. Thank you in advance. —Alberto

Read the article
How do I configure encodings (UTF-8) for code executed by Quartz scheduled Jobs in Spring framework

- by Martin

I wonder how to configure Quartz scheduled job threads to reflect proper encoding. Code which otherwise executes fine within Springframework injection loaded webapps (java) will get encoding issues when run in threads scheduled by quartz. Is there anyone who can help me out? All source is compiled using maven2 with source and file encodings configured as UTF-8. In the quartz threads any string will have encoding errors if outside ISO 8859-1 characters: Example config <bean name="jobDetail" class="org.springframework.scheduling.quartz.JobDetailBean"> <property name="jobClass" value="example.ExampleJob" /> </bean> <bean id="jobTrigger" class="org.springframework.scheduling.quartz.SimpleTriggerBean"> <property name="jobDetail" ref="jobDetail" /> <property name="startDelay" value="1000" /> <property name="repeatCount" value="0" /> <property name="repeatInterval" value="1" /> </bean> <bean class="org.springframework.scheduling.quartz.SchedulerFactoryBean"> <property name="triggers"> <list> <ref bean="jobTrigger"/> </list> </property> </bean> Example implementation public class ExampleJob extends QuartzJobBean { private Log log = LogFactory.getLog(ExampleJob.class); protected void executeInternal(JobExecutionContext ctx) throws JobExecutionException { log.info("ÅÄÖ"); log.info(Charset.defaultCharset()); } } Example output 2010-05-20 17:04:38,285 1342 INFO [QuartzScheduler_Worker-9] ExampleJob - vÖvÑvñ 2010-05-20 17:04:38,286 1343 INFO [QuartzScheduler_Worker-9] ExampleJob - UTF-8 The same lines of code executed within spring injected beans referenced by servlets in the web-container will output proper encoding. What is it that make Quartz threads encoding dependent?

Read the article
Design for fastest page download

- by mexxican

I have a file with millions of URLs/IPs and have to write a program to download the pages really fast. The connection rate should be at least 6000/s and file download speed at least 2000 with avg. 15kb file size. The network bandwidth is 1 Gbps. My approach so far has been: Creating 600 socket threads with each having 60 sockets and using WSAEventSelect to wait for data to read. As soon as a file download is complete, add that memory address(of the downloaded file) to a pipeline( a simple vector ) and fire another request. When the total download is more than 50Mb among all socket threads, write all the files downloaded to the disk and free the memory. So far, this approach has been not very successful with the rate at which I could hit not shooting beyond 2900 connections/s and downloaded data rate even less. Can somebody suggest an alternative approach which could give me better stats. Also I am working windows server 2008 machine with 8 Gig of memory. Also, do we need to hack the kernel so as we could use more threads and memory. Currently I can create a max. of 1500 threads and memory usage not going beyond 2 gigs [ which technically should be much more as this is a 64-bit machine ]. And IOCP is out of question as I have no experience in that so far and have to fix this application today. Thanks Guys!

Read the article
perl multithreading issue for autoincrement

- by user3446683

I'm writing a multi threaded perl script and storing the output in a csv file. I'm trying to insert a field called sl.no. in the csv file for each row entered but as I'm using threads, the sl. no. overlaps in most. Below is an idea of my code snippet. for ( my $count = 1 ; $count <= 10 ; $count++ ) { my $t = threads->new( \&sub1, $count ); push( @threads, $t ); } foreach (@threads) { my $num = $_->join; } sub sub1 { my $num = shift; my $start = '...'; #distributing data based on an internal logic my $end = '...'; #distributing data based on an internal logic my $next; for ( my $x = $start ; $x <= $end ; $x++ ) { my $count = $x + 1; #part of code from which I get @data which has name and age my $j = 0; if ( $x != 0 ) { $count = $next; } foreach (@data) { #j is required here for some extra code flock( OUTPUT, LOCK_EX ); print OUTPUT $count . "," . $name . "," . $age . "\n"; flock( OUTPUT, LOCK_UN ); $j++; $count++; } $next = $count; } return $num; } I need the count to be incremented which is the serial number for the rows that would be inserted in the csv file. Any help would be appreciated.

Read the article
How to read from multiple queues in real-world?

- by Leon Cullens

Here's a theoretical question: When I'm building an application using message queueing, I'm going to need multiple queues support different data types for different purposes. Let's assume I have 20 queues (e.g. one to create new users, one to process new orders, one to edit user settings, etc.). I'm going to deploy this to Windows Azure using the 'minimum' of 1 web role and 1 worker role. How does one read from all those 20 queues in a proper way? This is what I had in mind, but I have little or no real-world practical experience with this: Create a class that spawns 20 threads in the worker role 'main' class. Let each of these threads execute a method to poll a different queue, and let all those threads sleep between each poll (of course with a back-off mechanism that increases the sleep time). This leads to have 20 threads (or 21?), and 20 queues that are being actively polled, resulting in a lot of wasted messages (each time you poll an empty queue it's being billed as a message). How do you solve this problem?

Read the article
The cross-thread usage of "HttpContext.Current" property and related things

- by smwikipedia

I read from < Essential ASP.NET with Examples in C# the following statement: Another useful property to know about is the static Current property of the HttpContext class. This property always points to the current instance of the HttpContext class for the request being serviced. This can be convenient if you are writing helper classes that will be used from pages or other pipeline classes and may need to access the context for whatever reason. By using the static Current property to retrieve the context, you can avoid passing a reference to it to helper classes. For example, the class shown in Listing 4-1 uses the Current property of the context to access the QueryString and print something to the current response buffer. Note that for this static property to be correctly initialized, the caller must be executing on the original request thread, so if you have spawned additional threads to perform work during a request, you must take care to provide access to the context class yourself. I am wondering about the root cause of the bold part, and one thing leads to another, here is my thoughts: We know that a process can have multiple threads. Each of these threads have their own stacks, respectively. These threads also have access to a shared memory area, the heap. The stack then, as I understand it, is kind of where all the context for that thread is stored. For a thread to access something in the heap it must use a pointer, and the pointer is stored on its stack. So when we make some cross-thread calls, we must make sure that all the necessary context info is passed from the caller thread's stack to the callee thread's stack. But I am not quite sure if I made any mistake. Any comments will be deeply appreciated. Thanks. ADD Here the stack is limited to user stack.

Read the article
How should a multi-threaded C application handle a failed malloc()?

- by user294463

A part of an application I'm working on is a simple pthread-based server that communicates over a TCP/IP socket. I am writing it in C because it's going to be running in a memory constrained environment. My question is: what should the program do if one of the threads encounters a malloc() that returns NULL? Possibilities I've come up with so far: No special handling. Let malloc() return NULL and let it be dereferenced so that the whole thing segfaults. Exit immediately on a failed malloc(), by calling abort() or exit(-1). Assume that the environment will clean everything up. Jump out of the main event loop and attempt to pthread_join() all the threads, then shut down. The first option is obviously the easiest, but seems very wrong. The second one also seems wrong since I don't know exactly what will happen. The third option seems tempting except for two issues: first, all of the threads need not be joined back to the main thread under normal circumstances and second, in order to complete the thread execution, most of the remaining threads will have to call malloc() again anyway. What shall I do?

Read the article
SPARC T4-4 Beats 8-CPU IBM POWER7 on TPC-H @3000GB Benchmark

- by Brian

Oracle's SPARC T4-4 server delivered a world record TPC-H @3000GB benchmark result for systems with four processors. This result beats eight processor results from IBM (POWER7) and HP (x86). The SPARC T4-4 server also delivered better performance per core than these eight processor systems from IBM and HP. Comparisons below are based upon system to system comparisons, highlighting Oracle's complete software and hardware solution. This database world record result used Oracle's Sun Storage 2540-M2 arrays (rotating disk) connected to a SPARC T4-4 server running Oracle Solaris 11 and Oracle Database 11g Release 2 demonstrating the power of Oracle's integrated hardware and software solution. The SPARC T4-4 server based configuration achieved a TPC-H scale factor 3000 world record for four processor systems of 205,792 QphH@3000GB with price/performance of $4.10/QphH@3000GB. The SPARC T4-4 server with four SPARC T4 processors (total of 32 cores) is 7% faster than the IBM Power 780 server with eight POWER7 processors (total of 32 cores) on the TPC-H @3000GB benchmark. The SPARC T4-4 server is 36% better in price performance compared to the IBM Power 780 server on the TPC-H @3000GB Benchmark. The SPARC T4-4 server is 29% faster than the IBM Power 780 for data loading. The SPARC T4-4 server is up to 3.4 times faster than the IBM Power 780 server for the Refresh Function. The SPARC T4-4 server with four SPARC T4 processors is 27% faster than the HP ProLiant DL980 G7 server with eight x86 processors on the TPC-H @3000GB benchmark. The SPARC T4-4 server is 52% faster than the HP ProLiant DL980 G7 server for data loading. The SPARC T4-4 server is up to 3.2 times faster than the HP ProLiant DL980 G7 for the Refresh Function. The SPARC T4-4 server achieved a peak IO rate from the Oracle database of 17 GB/sec. This rate was independent of the storage used, as demonstrated by the TPC-H @3000TB benchmark which used twelve Sun Storage 2540-M2 arrays (rotating disk) and the TPC-H @1000TB benchmark which used four Sun Storage F5100 Flash Array devices (flash storage). [*] The SPARC T4-4 server showed linear scaling from TPC-H @1000GB to TPC-H @3000GB. This demonstrates that the SPARC T4-4 server can handle the increasingly larger databases required of DSS systems. [*] The SPARC T4-4 server benchmark results demonstrate a complete solution of building Decision Support Systems including data loading, business questions and refreshing data. Each phase usually has a time constraint and the SPARC T4-4 server shows superior performance during each phase. [*] The TPC believes that comparisons of results published with different scale factors are misleading and discourages such comparisons. Performance Landscape The table lists the leading TPC-H @3000GB results for non-clustered systems. TPC-H @3000GB, Non-Clustered Systems System Processor P/C/T – Memory Composite(QphH) $/perf($/QphH) Power(QppH) Throughput(QthH) Database Available SPARC Enterprise M9000 3.0 GHz SPARC64 VII+ 64/256/256 – 1024 GB 386,478.3 $18.19 316,835.8 471,428.6 Oracle 11g R2 09/22/11 SPARC T4-4 3.0 GHz SPARC T4 4/32/256 – 1024 GB 205,792.0 $4.10 190,325.1 222,515.9 Oracle 11g R2 05/31/12 SPARC Enterprise M9000 2.88 GHz SPARC64 VII 32/128/256 – 512 GB 198,907.5 $15.27 182,350.7 216,967.7 Oracle 11g R2 12/09/10 IBM Power 780 4.1 GHz POWER7 8/32/128 – 1024 GB 192,001.1 $6.37 210,368.4 175,237.4 Sybase 15.4 11/30/11 HP ProLiant DL980 G7 2.27 GHz Intel Xeon X7560 8/64/128 – 512 GB 162,601.7 $2.68 185,297.7 142,685.6 SQL Server 2008 10/13/10 P/C/T = Processors, Cores, Threads QphH = the Composite Metric (bigger is better) $/QphH = the Price/Performance metric in USD (smaller is better) QppH = the Power Numerical Quantity QthH = the Throughput Numerical Quantity The following table lists data load times and refresh function times during the power run. TPC-H @3000GB, Non-Clustered Systems Database Load & Database Refresh System Processor Data Loading(h:m:s) T4Advan RF1(sec) T4Advan RF2(sec) T4Advan SPARC T4-4 3.0 GHz SPARC T4 04:08:29 1.0x 67.1 1.0x 39.5 1.0x IBM Power 780 4.1 GHz POWER7 05:51:50 1.5x 147.3 2.2x 133.2 3.4x HP ProLiant DL980 G7 2.27 GHz Intel Xeon X7560 08:35:17 2.1x 173.0 2.6x 126.3 3.2x Data Loading = database load time RF1 = power test first refresh transaction RF2 = power test second refresh transaction T4 Advan = the ratio of time to T4 time Complete benchmark results found at the TPC benchmark website http://www.tpc.org. Configuration Summary and Results Hardware Configuration: SPARC T4-4 server 4 x SPARC T4 3.0 GHz processors (total of 32 cores, 128 threads) 1024 GB memory 8 x internal SAS (8 x 300 GB) disk drives External Storage: 12 x Sun Storage 2540-M2 array storage, each with 12 x 15K RPM 300 GB drives, 2 controllers, 2 GB cache Software Configuration: Oracle Solaris 11 11/11 Oracle Database 11g Release 2 Enterprise Edition Audited Results: Database Size: 3000 GB (Scale Factor 3000) TPC-H Composite: 205,792.0 QphH@3000GB Price/performance: $4.10/QphH@3000GB Available: 05/31/2012 Total 3 year Cost: $843,656 TPC-H Power: 190,325.1 TPC-H Throughput: 222,515.9 Database Load Time: 4:08:29 Benchmark Description The TPC-H benchmark is a performance benchmark established by the Transaction Processing Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB, 10000GB, 30000GB and 100000GB) are not allowed by the TPC. TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system. The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single and multiple user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years maintenance to the QphH. A secondary metric is the storage efficiency, which is the ratio of total configured disk space in GB to the scale factor. Key Points and Best Practices Twelve Sun Storage 2540-M2 arrays were used for the benchmark. Each Sun Storage 2540-M2 array contains 12 15K RPM drives and is connected to a single dual port 8Gb FC HBA using 2 ports. Each Sun Storage 2540-M2 array showed 1.5 GB/sec for sequential read operations and showed linear scaling, achieving 18 GB/sec with twelve Sun Storage 2540-M2 arrays. These were stand alone IO tests. The peak IO rate measured from the Oracle database was 17 GB/sec. Oracle Solaris 11 11/11 required very little system tuning. Some vendors try to make the point that storage ratios are of customer concern. However, storage ratio size has more to do with disk layout and the increasing capacities of disks – so this is not an important metric in which to compare systems. The SPARC T4-4 server and Oracle Solaris efficiently managed the system load of over one thousand Oracle Database parallel processes. Six Sun Storage 2540-M2 arrays were mirrored to another six Sun Storage 2540-M2 arrays on which all of the Oracle database files were placed. IO performance was high and balanced across all the arrays. The TPC-H Refresh Function (RF) simulates periodical refresh portion of Data Warehouse by adding new sales and deleting old sales data. Parallel DML (parallel insert and delete in this case) and database log performance are a key for this function and the SPARC T4-4 server outperformed both the IBM POWER7 server and HP ProLiant DL980 G7 server. (See the RF columns above.) See Also Transaction Processing Performance Council (TPC) Home Page Ideas International Benchmark Page SPARC T4-4 Server oracle.com OTN Oracle Solaris oracle.com OTN Oracle Database 11g Release 2 Enterprise Edition oracle.com OTN Sun Storage 2540-M2 Array oracle.com OTN Disclosure Statement TPC-H, QphH, $/QphH are trademarks of Transaction Processing Performance Council (TPC). For more information, see www.tpc.org. SPARC T4-4 205,792.0 QphH@3000GB, $4.10/QphH@3000GB, available 5/31/12, 4 processors, 32 cores, 256 threads; IBM Power 780 QphH@3000GB, 192,001.1 QphH@3000GB, $6.37/QphH@3000GB, available 11/30/11, 8 processors, 32 cores, 128 threads; HP ProLiant DL980 G7 162,601.7 QphH@3000GB, $2.68/QphH@3000GB available 10/13/10, 8 processors, 64 cores, 128 threads.

Read the article
Improved Performance on PeopleSoft Combined Benchmark using SPARC T4-4

- by Brian

Oracle's SPARC T4-4 server running Oracle's PeopleSoft HCM 9.1 combined online and batch benchmark achieved a world record 18,000 concurrent users experiencing subsecond response time while executing a PeopleSoft Payroll batch job of 500,000 employees in 32.4 minutes. This result was obtained with a SPARC T4-4 server running Oracle Database 11g Release 2, a SPARC T4-4 server running PeopleSoft HCM 9.1 application server and a SPARC T4-2 server running Oracle WebLogic Server in the web tier. The SPARC T4-4 server running the application tier used Oracle Solaris Zones which provide a flexible, scalable and manageable virtualization environment. The average CPU utilization on the SPARC T4-2 server in the web tier was 17%, on the SPARC T4-4 server in the application tier it was 59%, and on the SPARC T4-4 server in the database tier was 47% (online and batch) leaving significant headroom for additional processing across the three tiers. The SPARC T4-4 server used for the database tier hosted Oracle Database 11g Release 2 using Oracle Automatic Storage Management (ASM) for database files management with I/O performance equivalent to raw devices. Performance Landscape Results are presented for the PeopleSoft HRMS Self-Service and Payroll combined benchmark. The new result with 128 streams shows significant improvement in the payroll batch processing time with little impact on the self-service component response time. PeopleSoft HRMS Self-Service and Payroll Benchmark Systems Users Ave Response Search (sec) Ave Response Save (sec) Batch Time (min) Streams SPARC T4-2 (web) SPARC T4-4 (app) SPARC T4-4 (db) 18,000 0.988 0.539 32.4 128 SPARC T4-2 (web) SPARC T4-4 (app) SPARC T4-4 (db) 18,000 0.944 0.503 43.3 64 The following results are for the PeopleSoft HRMS Self-Service benchmark that was previous run. The results are not directly comparable with the combined results because they do not include the payroll component. PeopleSoft HRMS Self-Service 9.1 Benchmark Systems Users Ave Response Search (sec) Ave Response Save (sec) Batch Time (min) Streams SPARC T4-2 (web) SPARC T4-4 (app) 2x SPARC T4-2 (db) 18,000 1.048 0.742 N/A N/A The following results are for the PeopleSoft Payroll benchmark that was previous run. The results are not directly comparable with the combined results because they do not include the self-service component. PeopleSoft Payroll (N.A.) 9.1 - 500K Employees (7 Million SQL PayCalc, Unicode) Systems Users Ave Response Search (sec) Ave Response Save (sec) Batch Time (min) Streams SPARC T4-4 (db) N/A N/A N/A 30.84 96 Configuration Summary Application Configuration: 1 x SPARC T4-4 server with 4 x SPARC T4 processors, 3.0 GHz 512 GB memory Oracle Solaris 11 11/11 PeopleTools 8.52 PeopleSoft HCM 9.1 Oracle Tuxedo, Version 10.3.0.0, 64-bit, Patch Level 031 Java Platform, Standard Edition Development Kit 6 Update 32 Database Configuration: 1 x SPARC T4-4 server with 4 x SPARC T4 processors, 3.0 GHz 256 GB memory Oracle Solaris 11 11/11 Oracle Database 11g Release 2 PeopleTools 8.52 Oracle Tuxedo, Version 10.3.0.0, 64-bit, Patch Level 031 Micro Focus Server Express (COBOL v 5.1.00) Web Tier Configuration: 1 x SPARC T4-2 server with 2 x SPARC T4 processors, 2.85 GHz 256 GB memory Oracle Solaris 11 11/11 PeopleTools 8.52 Oracle WebLogic Server 10.3.4 Java Platform, Standard Edition Development Kit 6 Update 32 Storage Configuration: 1 x Sun Server X2-4 as a COMSTAR head for data 4 x Intel Xeon X7550, 2.0 GHz 128 GB memory 1 x Sun Storage F5100 Flash Array (80 flash modules) 1 x Sun Storage F5100 Flash Array (40 flash modules) 1 x Sun Fire X4275 as a COMSTAR head for redo logs 12 x 2 TB SAS disks with Niwot Raid controller Benchmark Description This benchmark combines PeopleSoft HCM 9.1 HR Self Service online and PeopleSoft Payroll batch workloads to run on a unified database deployed on Oracle Database 11g Release 2. The PeopleSoft HRSS benchmark kit is a Oracle standard benchmark kit run by all platform vendors to measure the performance. It's an OLTP benchmark where DB SQLs are moderately complex. The results are certified by Oracle and a white paper is published. PeopleSoft HR SS defines a business transaction as a series of HTML pages that guide a user through a particular scenario. Users are defined as corporate Employees, Managers and HR administrators. The benchmark consist of 14 scenarios which emulate users performing typical HCM transactions such as viewing paycheck, promoting and hiring employees, updating employee profile and other typical HCM application transactions. All these transactions are well-defined in the PeopleSoft HR Self-Service 9.1 benchmark kit. This benchmark metric is the weighted average response search/save time for all the transactions. The PeopleSoft 9.1 Payroll (North America) benchmark demonstrates system performance for a range of processing volumes in a specific configuration. This workload represents large batch runs typical of a ERP environment during a mass update. The benchmark measures five application business process run times for a database representing large organization. They are Paysheet Creation, Payroll Calculation, Payroll Confirmation, Print Advice forms, and Create Direct Deposit File. The benchmark metric is the cumulative elapsed time taken to complete the Paysheet Creation, Payroll Calculation and Payroll Confirmation business application processes. The benchmark metrics are taken for each respective benchmark while running simultaneously on the same database back-end. Specifically, the payroll batch processes are started when the online workload reaches steady state (the maximum number of online users) and overlap with online transactions for the duration of the steady state. Key Points and Best Practices Two PeopleSoft Domain sets with 200 application servers each on a SPARC T4-4 server were hosted in 2 separate Oracle Solaris Zones to demonstrate consolidation of multiple application servers, ease of administration and performance tuning. Each Oracle Solaris Zone was bound to a separate processor set, each containing 15 cores (total 120 threads). The default set (1 core from first and third processor socket, total 16 threads) was used for network and disk interrupt handling. This was done to improve performance by reducing memory access latency by using the physical memory closest to the processors and offload I/O interrupt handling to default set threads, freeing up cpu resources for Application Servers threads and balancing application workload across 240 threads. A total of 128 PeopleSoft streams server processes where used on the database node to complete payroll batch job of 500,000 employees in 32.4 minutes. See Also Oracle PeopleSoft Benchmark White Papers oracle.com SPARC T4-2 Server oracle.com OTN SPARC T4-4 Server oracle.com OTN PeopleSoft Enterprise Human Capital Managementoracle.com OTN PeopleSoft Enterprise Human Capital Management (Payroll) oracle.com OTN Oracle Solaris oracle.com OTN Oracle Database 11g Release 2 oracle.com OTN Disclosure Statement Copyright 2012, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 8 November 2012.

Read the article
NUMA-aware placement of communication variables

- by Dave

For classic NUMA-aware programming I'm typically most concerned about simple cold, capacity and compulsory misses and whether we can satisfy the miss by locally connected memory or whether we have to pull the line from its home node over the coherent interconnect -- we'd like to minimize channel contention and conserve interconnect bandwidth. That is, for this style of programming we're quite aware of where memory is homed relative to the threads that will be accessing it. Ideally, a page is collocated on the node with the thread that's expected to most frequently access the page, as simple misses on the page can be satisfied without resorting to transferring the line over the interconnect. The default "first touch" NUMA page placement policy tends to work reasonable well in this regard. When a virtual page is first accessed, the operating system will attempt to provision and map that virtual page to a physical page allocated from the node where the accessing thread is running. It's worth noting that the node-level memory interleaving granularity is usually a multiple of the page size, so we can say that a given page P resides on some node N. That is, the memory underlying a page resides on just one node. But when thinking about accesses to heavily-written communication variables we normally consider what caches the lines underlying such variables might be resident in, and in what states. We want to minimize coherence misses and cache probe activity and interconnect traffic in general. I don't usually give much thought to the location of the home NUMA node underlying such highly shared variables. On a SPARC T5440, for instance, which consists of 4 T2+ processors connected by a central coherence hub, the home node and placement of heavily accessed communication variables has very little impact on performance. The variables are frequently accessed so likely in M-state in some cache, and the location of the home node is of little consequence because a requester can use cache-to-cache transfers to get the line. Or at least that's what I thought. Recently, though, I was exploring a simple shared memory point-to-point communication model where a client writes a request into a request mailbox and then busy-waits on a response variable. It's a simple example of delegation based on message passing. The server polls the request mailbox, and having fetched a new request value, performs some operation and then writes a reply value into the response variable. As noted above, on a T5440 performance is insensitive to the placement of the communication variables -- the request and response mailbox words. But on a Sun/Oracle X4800 I noticed that was not the case and that NUMA placement of the communication variables was actually quite important. For background an X4800 system consists of 8 Intel X7560 Xeons . Each package (socket) has 8 cores with 2 contexts per core, so the system is 8x8x2. Each package is also a NUMA node and has locally attached memory. Every package has 3 point-to-point QPI links for cache coherence, and the system is configured with a twisted ladder "mobius" topology. The cache coherence fabric is glueless -- there's not central arbiter or coherence hub. The maximum distance between any two nodes is just 2 hops over the QPI links. For any given node, 3 other nodes are 1 hop distant and the remaining 4 nodes are 2 hops distant. Using a single request (client) thread and a single response (server) thread, a benchmark harness explored all permutations of NUMA placement for the two threads and the two communication variables, measuring the average round-trip-time and throughput rate between the client and server. In this benchmark the server simply acts as a simple transponder, writing the request value plus 1 back into the reply field, so there's no particular computation phase and we're only measuring communication overheads. In addition to varying the placement of communication variables over pairs of nodes, we also explored variations where both variables were placed on one page (and thus on one node) -- either on the same cache line or different cache lines -- while varying the node where the variables reside along with the placement of the threads. The key observation was that if the client and server threads were on different nodes, then the best placement of variables was to have the request variable (written by the client and read by the server) reside on the same node as the client thread, and to place the response variable (written by the server and read by the client) on the same node as the server. That is, if you have a variable that's to be written by one thread and read by another, it should be homed with the writer thread. For our simple client-server model that means using split request and response communication variables with unidirectional message flow on a given page. This can yield up to twice the throughput of less favorable placement strategies. Our X4800 uses the QPI 1.0 protocol with source-based snooping. Briefly, when node A needs to probe a cache line it fires off snoop requests to all the nodes in the system. Those recipients then forward their response not to the original requester, but to the home node H of the cache line. H waits for and collects the responses, adjudicates and resolves conflicts and ensures memory-model ordering, and then sends a definitive reply back to the original requester A. If some node B needed to transfer the line to A, it will do so by cache-to-cache transfer and let H know about the disposition of the cache line. A needs to wait for the authoritative response from H. So if a thread on node A wants to write a value to be read by a thread on node B, the latency is dependent on the distances between A, B, and H. We observe the best performance when the written-to variable is co-homed with the writer A. That is, we want H and A to be the same node, as the writer doesn't need the home to respond over the QPI link, as the writer and the home reside on the very same node. With architecturally informed placement of communication variables we eliminate at least one QPI hop from the critical path. Newer Intel processors use the QPI 1.1 coherence protocol with home-based snooping. As noted above, under source-snooping a requester broadcasts snoop requests to all nodes. Those nodes send their response to the home node of the location, which provides memory ordering, reconciles conflicts, etc., and then posts a definitive reply to the requester. In home-based snooping the snoop probe goes directly to the home node and are not broadcast. The home node can consult snoop filters -- if present -- and send out requests to retrieve the line if necessary. The 3rd party owner of the line, if any, can respond either to the home or the original requester (or even to both) according to the protocol policies. There are myriad variations that have been implemented, and unfortunately vendor terminology doesn't always agree between vendors or with the academic taxonomy papers. The key is that home-snooping enables the use of a snoop filter to reduce interconnect traffic. And while home-snooping might have a longer critical path (latency) than source-based snooping, it also may require fewer messages and less overall bandwidth. It'll be interesting to reprise these experiments on a platform with home-based snooping. While collecting data I also noticed that there are placement concerns even in the seemingly trivial case when both threads and both variables reside on a single node. Internally, the cores on each X7560 package are connected by an internal ring. (Actually there are multiple contra-rotating rings). And the last-level on-chip cache (LLC) is partitioned in banks or slices, which with each slice being associated with a core on the ring topology. A hardware hash function associates each physical address with a specific home bank. Thus we face distance and topology concerns even for intra-package communications, although the latencies are not nearly the magnitude we see inter-package. I've not seen such communication distance artifacts on the T2+, where the cache banks are connected to the cores via a high-speed crossbar instead of a ring -- communication latencies seem more regular.

Read the article
How does Windows 7 taskbar "color hot-tracking" feature calculate the colour to use?

- by theyetiman

This has intrigued me for quite some time. Does anyone know the algorithm Windows 7 Aero uses to determine the colour to use as the hot-tracking hover highlight on taskbar buttons for currently-running apps? It is definitely based on the icon of the app, but I can't see a specific pattern of where it's getting the colour value from. It doesn't seem to be any of the following: An average colour value from the entire icon, otherwise you would get brown all the time with multi-coloured icons like Chrome. The colour used the most in the image, otherwise you'd get yellow for the SQL Server Management Studio icon (6th from left). Also, the Chrome icon used red, green and yellow in equal measure. A colour located at certain pixel coordinates within the icon, because Chrome is red -indicating the top of the icon - and Notepad++ (2nd from right) is green - indicating the bottom of the icon. I asked this question on ux.stackoverflow.com and it got closed as off-topic, but someone answered with the following: As described by Raymond Chen in this MSDN blog article: Some people ask how it's done. It's really nothing special. The code just looks for the predominant color in the icon. (And, since visual designers are sticklers for this sort of thing, black, white, and shades of gray are not considered "colors" for the purpose of this calculation.) However I wasn't really satisfied with that answer because it doesn't explain how the "predominant" colour is calculated. Surely on the SQL Management Studio icon, the predominant colour, to my eyes at least, is yellow. Yet the highlight is green. I want to know, specifically, what the algorithm is.

Read the article
problem transferring Win 7 operating system hard drive to be used as external hard drive

- by itserich

Win 7 Asus MA479 8GB Ram hard drives are 500GB The operating system was a Caviar Green, and I tried to exchange it for a Caviar Blue. The Caviar Blue took the install correctly, but the Green, the prior operating system hard drive, will not allow itself to be used as an external hard drive. I use TrueCrypt and tried to format the Green and it freezes each time at the very end of the encyrption of the partition. I took the Blue out of the system and tried to encrypt it, and same problem. I think there must be something on the hard drive that shows it was a system drive and it causes a conflict. I have tried writing over the system hard drive to fully erase everything but that does not work, it still freezes. The drives will work in a different pc, i.e. other pc where it never was the system drive. The external hard drive is connected esata through Thermaltake. I have used TrueCrypt on various pc's for years including this Win7 with no problem. Thank you.

Read the article
Overlap problem with Android ListView selector

- by virsir

I am trying to style my ListView with two 9-patch background images (16px * 9px), one dark image for default state and another green image for selected and pressed state. It works except for just one problem that when I select or press one list item, it seems that the selected item overlap the next one a little bit as I can see some pixels of the green background image is on the top of next item. How to fix that?

Read the article
Business enum to DatContract Enum conversion in WCF

- by chugh97

I have an enum namespace Business { public enum Color { Red,Green,Blue } } namespace DataContract { [DataContract] public enum Color { [EnumMember] Red, [EnumMember] Green, [EnumMember] Blue } } I have the same enum as a datacontract in WCF with same values. I need to convert the Business enum to the DataContract enum using a translator. Hoe can I achieve this?

Read the article
Create array of objects based on another array?

- by xckpd7

I want to take an array like this: var food = [ { name: 'strawberry', type: 'fruit', color: 'red', id: 3483 }, { name: 'apple', type: 'fruit', color: 'red', id: 3418 }, { name: 'banana', type: 'fruit', color: 'yellow', id: 3458 }, { name: 'brocolli', type: 'vegetable', color: 'green', id: 1458 }, { name: 'steak', type: 'meat', color: 'brown', id: 2458 }, ] And I want to create something like this dynamically: var foodCategories = [ { name: 'fruit', items: [ { name: 'apple', type: 'fruit', color: 'red', id: 3418 }, { name: 'banana', type: 'fruit', color: 'yellow', id: 3458 } ] }, { name: 'vegetable', items: [ { name: 'brocolli', type: 'vegetable', color: 'green', id: 1458 }, ] }, { name: 'meat', items: [ { name: 'steak', type: 'meat', color: 'brown', id: 2458 } ] } ] What's the best way to go about doing this?

Read the article
How to convert between Enums where values share the same names ?

- by Ross Watson

Hi, if I want to convert between two Enum types, the values of which, I hope, have the same names, is there a neat way, or do I have to do it like this...? enum colours_a { red, blue, green } enum colours_b { yellow, red, blue, green } static void Main(string[] args) { colours_a a = colours_a.red; colours_b b; //b = a; b = (colours_b)Enum.Parse(typeof(colours_b), a.ToString()); } Thanks, Ross

Read the article
adding opacity's of the PhoneAccentBrush to a Silverlight toolbox pie chart

- by Doug

Hi there, i am working on a windows phone application and i am using the pie chart - i am relatively new to the silverlight control toolbox. i want to make my pie chart use different opacity's of the PhoneAccentBrush as its colour pallete. (ie if the accent colour is green then i use the green and then a 0.8,0.6,0.4,0.2 opacity of the colour as my pie charts pallete) i have tried a few things, but none of them worked - has anyone accomplished this and if so how? thanks in advance Doug

Read the article
PHP form values after POST in dropdown

- by FFish

I have a form with 'selected' values pulled from the database. Now I want the user to edit the values. When the data is send I want to show the new values. When I submit my form I always get the 'green' value? What am I doing wrong here? <?php // pulled from db $color = "blue"; // update if (isset($_POST['Submit'])) { echo "write to db: " . $_POST['name'] . " + " . $_POST['color']; } ?> <html> <form name="form1" method="post" action="<?php echo $_SERVER['PHP_SELF']; ?>"> <label for="name">Name:</label> <input type="text" name="name" size="30" value="<?php echo (isset($_POST['name'])) ? $_POST['name'] : ""; ?>"> <br /> <label for="color">Color:</label> <select name="color"> <option <?php echo (isset($_POST['color']) || $color == "red") ? 'selected="selected"' : ''; ?> value="red">red</option> <option <?php echo (isset($_POST['color']) || $color == "blue") ? 'selected="selected"' : ''; ?> value="blue">blue</option> <option <?php echo (isset($_POST['color']) || $color == "green") ? 'selected="selected"' : ''; ?> value="green">green</option> </select> <br /> <input type="submit" name="Submit" value="Update"> </form> </html>

Read the article
Split html text in a SEO friendly manner

- by al nik

I've some html text like <h1>GreenWhiteRed</h1> Is it SEO friendly to split this text in something like <h1><span class="green">Green</span><span class="white">White</span><span class="red">Red</span></h1> Is the text still ranking well and is it interpreted as a single word 'GreenWhiteRed'?

Read the article
Winforms Checkbox : CheckState property Indeterminate renders differently

- by joedotnot

In C# environment, setting a checkbox's CheckState property to Indeterminate displays a "green square" inside the checkbox. In VB environment, this displays as a "grayed out check" (which is less intuitive, even for "dummy" users). How do i make Indeterminate state look like a "green square" in VB.NET ? Btw, i am using VS2008, Winforms 2.0. (Btw2: I tried to create two tags CheckState Indeterminate, which is more appropriate to my question, but disallowed by StackOverflow due to points!)

Read the article
finding shortest valid path in a colored-edge graphs

- by user1067083

Given a directed graph G, with edges colored either green or purple, and a vertex S in G, I must find an algorithm that finds the shortest path from s to each vertex in G so the path includes at most two purple edges (and green as much as needed). I thought of BFS on G after removing all the purple edges, and for every vertex that the shortest path is still infinity, do something to try to find it, but I'm kinda stuck, and it takes alot of the running time as well... Any other suggestions? Thanks in advance

Read the article
UIColor app crashing

- by coure06

i have a global variable UIColor *textColor; I am update this variable by the code textColor = [UIColor colorWithRed:fr green:fg blue:fb alpha:1.0]; then assigning this color to Label like this myLabel.textColor = textColor; It only work once, when i again call with updated values and assign label new values app crashes... textColor = [UIColor colorWithRed:fr green:fg blue:fb alpha:1.0]; myLabel.textColor = textColor;

Read the article
Groovy sorting string asc

- by srinath

How to sort string names in array on ascending order . I tried sort method but it fails to sort on name basis . def words = ["orange", "blue", "apple", "violet", "green"] I need to achieve like this : ["apple", "blue", "green", "orange", "violet" ] thanks in advance.

Read the article
excel change 4 rows / 48 col to 48 rows / 4 col

- by GoodOlPete

Hi, I've selected 4 database records of 48 fields into excel as below: FirstName LastName Age Address1 ....................... Andy smith 23 53 high st billy ball 43 23 the avenue charles brown 76 rose cottage dave green 43 station rd I want to display them as firstname andy billy charles dave lastname smith ball brown green age 23 43 76 43 address1.............................. Can anyone suggest how to do this?

Read the article
GUI design for color blindness

- by Phillip Ngan

It is common to represent status of an item in a GUI using the colors: red, yellow, green, to mean error, warning, and OK (or something equivalent). However, 7-10% of men have difficulty distinguishing between red and green because of color blindness. So far I've looked at Color Scheme Designer which simulates how people with different color blindnesses would perceive a set of colors, but I'm interested in hearing how you have approached this problem and how successful it was.

Read the article

< Previous Page | 53 54 55 56 57 58 59 60 61 62 63 64 | Next Page >