Search Results

Search found 740 results on 30 pages for 'processors'.

Page 26/30 | < Previous Page | 22 23 24 25 26 27 28 29 30  | Next Page >

  • How can I speed up a 1800-line PHP include? It's slowing my pageload down to 10sec/view

    - by somerandomguy
    I designed my code to put all important functions in a single PHP file that's now 1800 lines long. I call it in other PHP files--AJAX processors, for example--with a simple "require_once("codeBank.php")". I'm discovering that it takes about 10 seconds to load up all those functions, even though I have nothing more than a few global arrays and a bunch of other functions involved. The main AJAX processor code, for example, is taking 8 seconds just to do a simple syntax verification (whose operational function is stored in codeBank.php). When I comment out the require_once, my AJAX response time speeds up from 10sec to 40ms, so it's pretty clear that PHP's trying to do something with those 1800 lines of functions. That's even with APC installed, which is surprising. What should I do to get my code speed back to the sub-100ms level? Am I failing to get the cache's benefit somehow? Do I need to cut that single function bank file into different pieces? Are there other subtle things that I could be doing to screw up my response time? Or barring all that, what are some tools to dig further into which PHP operations are hitting speed bumps?

    Read the article

  • MPI Odd/Even Compare-Split Deadlock

    - by erebel55
    I'm trying to write an MPI version of a program that runs an odd/even compare-split operation on n randomly generated elements. Process 0 should generated the elements and send nlocal of them to the other processes, (keeping the first nlocal for itself). From here, process 0 should print out it's results after running the CompareSplit algorithm. Then, receive the results from the other processes run of the algorithm. Finally, print out the results that it has just received. I have a large chunk of this already done, but I'm getting a deadlock that I can't seem to fix. I would greatly appreciate any hints that people could give me. Here is my code http://pastie.org/3742474 Right now I'm pretty sure that the deadlock is coming from the Send/Recv at lines 134 and 151. I've tried changing the Send to use "tag" instead of myrank for the tag parameter..but when I did that I just keep getting a "MPI_ERR_TAG: invalid tag" for some reason. Obviously I would also run the algorithm within the processors 0 but I took that part out for now, until I figure out what is going wrong. Any help is appreciated.

    Read the article

  • Emulating a computer running MS-DOS

    - by Richard
    Writing emulators has always fascinated me. Now I want to write an emulator for an IBM PC and run MS-DOS on it (I've got the floppy image files). I have good experience in C++ and C and basic knowledge of assembler and the architecture of a CPU. I also know that there are thousands of emulators out there doing exactly what I want to do, but I'd be doing this for pure joy only. How much work do I have to expect? (If my goal is to boot DOS and create a text file with it, all emulated) What CPU should I emulate ? Where can I find documentation on how the machine code is organized and which opcodes mean what, so I can unpack and execute them correctly with my emulator? Does MS-DOS still run on the newest generations of processors? Would it theoretically be able to natively run on a 64-bit AMD Phenom 2 processor w/ a modern mainboard, HDD, RAM, etc.? What else, besides emulating the CPU, could be an important factor (in terms of difficulty)? I would only aim for outputting / inputting text to the system via the host system's console, no sound or other more advanced IO etc. Have you written an emulator yet? What was your first one for? How hard was it? Do you have any special tips for me? Thanks in advance

    Read the article

  • What's the "correct way" to organize this project?

    - by user571747
    I'm working on a project that allows multiple users to submit large data files and perform operations on them. The "backend" which performs these operations is written in Perl while the "frontend" uses PHP to load HTML template files and determines which content to deliver. Data is stored in a database (MySQL, SQLite, Oracle) and while there is data which has not yet been acted upon, Perl adds it to a running queue which delivers data to other threads based on system load. In addition, there may be pre- and post-processing of the data before and after the main Perl script operates (the specifications are unclear) so I may want to allow these processors to be user-selectable plugins. I had been writing this project in a more procedural fashion but I am quickly realizing the benefit of separating concerns as to limit the scope one change has on the rest of the project. I'm quite unexperienced with design patterns and am curious what the best way to proceed is. I've heard MVC thrown around quite a bit but I am unsure of how to apply it. Specifically, what are some good options to structure this code (in terms of design patterns and folder hierarchy)? How can I achieve this with both PHP and Perl while minimizing duplicated code between languages? Should I keep my PHP files in the top level so I don't have ugly paths in the URL? Also, if I want to provide interchangeable databases, does each table need its own DAO implementation?

    Read the article

  • Parallel doseq for Clojure

    - by andrew cooke
    I haven't used multithreading in Clojure at all so am unsure where to start. I have a doseq whose body can run in parallel. What I'd like is for there always to be 3 threads running (leaving 1 core free) that evaluate the body in parallel until the range is exhausted. There's no shared state, nothing complicated - the equivalent of Python's multiprocessing would be just fine. So something like: (dopar 3 [i (range 100)] ; repeated 100 times in 3 parallel threads... ...) Where should I start looking? Is there a command for this? A standard package? A good reference? So far I have found pmap, and could use that (how do I restrict to 3 at a time? looks like it uses 32 at a time - no, source says 2 + number of processors), but it seems like this is a basic primitive that should already exist somewhere. clarification: I really would like to control the number of threads. I have processes that are long-running and use a fair amount of memory, so creating a large number and hoping things work out OK isn't a good approach (example which uses a significant chunk available mem). update: Starting to write a macro that does this, and I need a semaphore (or a mutex, or an atom i can wait on). Do semaphores exist in Clojure? Or should I use a ThreadPoolExecutor? It seems odd to have to pull so much in from Java - I thought parallel programming in Clojure was supposed to be easy... Maybe I am thinking about this completely the wrong way? Hmmm. Agents?

    Read the article

  • What is an appropriate way to separate lifecycle events in the logging system?

    - by Hanno Fietz
    I have an application with many different parts, it runs on OSGi, so there's the bundle lifecycles, there's a number of message processors and plugin components that all can die, can be started and stopped, have their setup changed etc. I want a way to get a good picture of the current system status, what components are up, which have problems, how long they have been running for etc. I think that logging, especially in combination with custom appenders (I'm using log4j), is a good part of the solution and does help ad-hoc analysis as well as live monitoring. Normally, I would classify lifecycle events as INFO level, but what I really want is to have them separate from what else is going on in INFO. I could create my own level, LIFECYCLE. The lifecycle events happen in various different areas and on various levels in the application hierarchy, also they happen in the same areas as other events that I want to separate them from. I could introduce some common lifecycle management and use that to distinguish the events from others. For instance, all components that have a lifecycle could implement a particular interface and I log by its name. Are there good examples of how this is done elsewhere? What are considerations?

    Read the article

  • How do software events work internally?

    - by Duddle
    Hello! I am a student of Computer Science and have learned many of the basic concepts of what is going on "under the hood" while a computer program is running. But recently I realized that I do not understand how software events work efficiently. In hardware, this is easy: instead of the processor "busy waiting" to see if something happened, the component sends an interrupt request. But how does this work in, for example, a mouse-over event? My guess is as follows: if the mouse sends a signal ("moved"), the operating system calculates its new position p, then checks what program is being drawn on the screen, tells that program position p, then the program itself checks what object is at p, checks if any event handlers are associated with said object and finally fires them. That sounds terribly inefficient to me, since a tiny mouse movement equates to a lot of cpu context switches (which I learned are relatively expensive). And then there are dozens of background applications that may want to do stuff of their own as well. Where is my intuition failing me? I realize that even "slow" 500MHz processors do 500 million operations per second, but still it seems too much work for such a simple event. Thanks in advance!

    Read the article

  • C Population Count of unsigned 64-bit integer with a maximum value of 15

    - by BitTwiddler1011
    I use a population count (hamming weight) function intensively in a windows c application and have to optimize it as much as possible in order to boost performance. More than half the cases where I use the function I only need to know the value to a maximum of 15. The software will run on a wide range of processors, both old and new. I already make use of the POPCNT instruction when Intel's SSE4.2 or AMD's SSE4a is present, but would like to optimize the software implementation (used as a fall back if no SSE4 is present) as much as possible. Currently I have the following software implementation of the function: inline int population_count64(unsigned __int64 w) { w -= (w 1) & 0x5555555555555555ULL; w = (w & 0x3333333333333333ULL) + ((w 2) & 0x3333333333333333ULL); w = (w + (w 4)) & 0x0f0f0f0f0f0f0f0fULL; return int(w * 0x0101010101010101ULL) 56; } So to summarize: (1) I would like to know if it is possible to optimize this for the case when I only want to know the value to a maximum of 15. (2) Is there a faster software implementation (for both Intel and AMD CPU's) than the function above?

    Read the article

  • Parallel programming, are we not learning from history again?

    - by mezmo
    I started programming because I was a hardware guy that got bored, I thought the problems being solved in the software side of things were much more interesting than those in hardware. At that time, most of the electrical buses I dealt with were serial, some moving data as fast as 1.5 megabit!! ;) Over the years these evolved into parallel buses in order to speed communication up, after all, transferring 8/16/32/64, whatever bits at a time incredibly speeds up the transfer. Well, our ability to create and detect state changes got faster and faster, to the point where we could push data so fast that interference between parallel traces or cable wires made cleaning the signal too expensive to continue, and we still got reasonable performance from serial interfaces, heck some graphics interfaces are even happening over USB for a while now. I think I'm seeing a like trend in software now, our processors were getting faster and faster, so we got good at building "serial" software. Now we've hit a speed bump in raw processor speed, so we're adding cores, or "traces" to the mix, and spending a lot of time and effort on learning how to properly use those. But I'm also seeing what I feel are advances in things like optical switching and even quantum computing that could take us far more quickly that I was expecting back to the point where "serial programming" again makes the most sense. What are your thoughts?

    Read the article

  • Calculating with a variable outside of its bounds in C

    - by aquanar
    If I make a calculation with a variable where an intermediate part of the calculation goes higher then the bounds of that variable type, is there any hazard that some platforms may not like? This is an example of what I'm asking: int a, b; a=30000; b=(a*32000)/32767; I have compiled this, and it does give the correct answer of 29297 (well, within truncating error, anyway). But the part that worries me is that 30,000*32,000 = 960,000,000, which is a 30-bit number, and thus cannot be stored in a 16-bit int. The end result is well within the bounds of an int, but I was expecting that whatever working part of memory would have the same size allocated as the largest source variables did, so an overflow error would occur. This is just a small example to show my problem, I am trying to avoid using floating points by making the fraction be a fraction of the max amount able to be stored in that variable (in this case, a signed integer, so 32767 on the positive side), because the embedded system I'm using I believe does not have an FPU. So how do most processors handle calculations out of the bounds of the source and destination variables?

    Read the article

  • Why doesn't my processor have built-in BigInt support?

    - by ol
    As far as I understood it, BigInts are usually implemented in most programming languages as strings containing numbers, where, eg.: when adding two of them, each digit is added one after another like we know it from school, e.g.: 246 816 * * ---- 1062 Where * marks that there was an overflow. I learned it this way at school and all BigInt adding functions I've implemented work similar to the example above. So we all know that our processors can only natively manage ints from 0 to 2^32 / 2^64. That means that most scripting languages in order to be high-level and offer arithmetics with big integers, have to implement/use BigInt libraries that work with integers as strings like above. But of course this means that they'll be far slower than the processor. So what I've asked myself is: Why doesn't my processor have a built-in BigInt function? It would work like any other BigInt library, only (a lot) faster and at a lower level: Processor fetches one digit from the cache/RAM, adds it, and writes the result back again. Seems like a fine idea to me, so why isn't there something like that?

    Read the article

  • ASP.NET MVC: How to show value in a label from selected Drop Down List item?

    - by Lillie
    Hi! I'm trying to show a value of selected Drop Down List item in a label. I managed to make this work with Web Forms but with MVC I'm totally lost. My Index looks like this: [...] <% using (Html.BeginForm()) { %> <table> <tr> <td>Processor</td> <td><%= Html.DropDownList("lstProcessor1", new SelectList((IEnumerable)ViewData["Processor1List"], "product_price", "product_description")) %></td> </tr> <tr> <td>Total Amount</td> <td>0,00 €</td> </tr> </table> <input type="submit" value="Submit" /> <% } %> [...] And my HomeController starts with: using System; using System.Collections.Generic; using System.Linq; using System.Web; using System.Web.Mvc; using System.Web.Mvc.Ajax; using MvcApplication1.Models; namespace MvcApplication1.Controllers { [HandleError] public class HomeController : Controller { // Connect database DB50DataContext _ctx = new DB50DataContext(); // GET: /Home/ public ActionResult Index() { // Search: Processors var products = from prod in _ctx.products where prod.product_searchcode == "processor1" select prod; ViewData["Processort1List"] = products; return View(); } I would like the product_price to show on the second line of the table, where it now says 0,00 €. It should also update the price automatically when the item from the Drop Down List is changed. I guess I should use JQuery but I have no idea how. Could someone please give me some tips how to do this?

    Read the article

  • Trouble understanding the semantics of volatile in Java

    - by HungryTux
    I've been reading up about the use of volatile variables in Java. I understand that they ensure instant visibility of their latest updates to all the threads running in the system on different cores/processors. However no atomicity of the operations that caused these updates is ensured. I see the following literature being used frequently A write to a volatile field happens-before every read of that same field . This is where I am a little confused. Here's a snippet of code which should help me better explain my query. volatile int x = 0; volatile int y = 0; Thread-0: | Thread-1: | if (x==1) { | if (y==1) { return false; | return false; } else { | } else { y=1; | x=1; return true; | return true; } | } Since x & y are both volatile, we have the following happens-before edges between the write of y in Thread-0 and read of y in Thread-1 between the write of x in Thread-1 and read of x in Thread-0 Does this imply that, at any point of time, only one of the threads can be in its 'else' block(since a write would happen before the read)? It may well be possible that Thread-0 starts, loads x, finds it value as 0 and right before it is about to write y in the else-block, there's a context switch to Thread-1 which loads y finds it value as 0 and thus enters the else-block too. Does volatile guard against such context switches (seems very unlikely)?

    Read the article

  • Oracle Announces Oracle Exadata X3 Database In-Memory Machine

    - by jgelhaus
    Fourth Generation Exadata X3 Systems are Ideal for High-End OLTP, Large Data Warehouses, and Database Clouds; Eighth-Rack Configuration Offers New Low-Cost Entry Point ORACLE OPENWORLD, SAN FRANCISCO – October 1, 2012 News Facts During his opening keynote address at Oracle OpenWorld, Oracle CEO, Larry Ellison announced the Oracle Exadata X3 Database In-Memory Machine - the latest generation of its Oracle Exadata Database Machines. The Oracle Exadata X3 Database In-Memory Machine is a key component of the Oracle Cloud. Oracle Exadata X3-2 Database In-Memory Machine and Oracle Exadata X3-8 Database In-Memory Machine can store up to hundreds of Terabytes of compressed user data in Flash and RAM memory, virtually eliminating the performance overhead of reads and writes to slow disk drives, making Exadata X3 systems the ideal database platforms for the varied and unpredictable workloads of cloud computing. In order to realize the highest performance at the lowest cost, the Oracle Exadata X3 Database In-Memory Machine implements a mass memory hierarchy that automatically moves all active data into Flash and RAM memory, while keeping less active data on low-cost disks. With a new Eighth-Rack configuration, the Oracle Exadata X3-2 Database In-Memory Machine delivers a cost-effective entry point for smaller workloads, testing, development and disaster recovery systems, and is a fully redundant system that can be used with mission critical applications. Next-Generation Technologies Deliver Dramatic Performance Improvements Oracle Exadata X3 Database In-Memory Machines use a combination of scale-out servers and storage, InfiniBand networking, smart storage, PCI Flash, smart memory caching, and Hybrid Columnar Compression to deliver extreme performance and availability for all Oracle Database Workloads. Oracle Exadata X3 Database In-Memory Machine systems leverage next-generation technologies to deliver significant performance enhancements, including: Four times the Flash memory capacity of the previous generation; with up to 40 percent faster response times and 100 GB/second data scan rates. Combined with Exadata’s unique Hybrid Columnar Compression capabilities, hundreds of Terabytes of user data can now be managed entirely within Flash; 20 times more capacity for database writes through updated Exadata Smart Flash Cache software. The new Exadata Smart Flash Cache software also runs on previous generation Exadata systems, increasing their capacity for writes tenfold; 33 percent more database CPU cores in the Oracle Exadata X3-2 Database In-Memory Machine, using the latest 8-core Intel® Xeon E5-2600 series of processors; Expanded 10Gb Ethernet connectivity to the data center in the Oracle Exadata X3-2 provides 40 10Gb network ports per rack for connecting users and moving data; Up to 30 percent reduction in power and cooling. Configured for Your Business, Available Today Oracle Exadata X3-2 Database In-Memory Machine systems are available in a Full-Rack, Half-Rack, Quarter-Rack, and the new low-cost Eighth-Rack configuration to satisfy the widest range of applications. Oracle Exadata X3-8 Database In-Memory Machine systems are available in a Full-Rack configuration, and both X3 systems enable multi-rack configurations for virtually unlimited scalability. Oracle Exadata X3-2 and X3-8 Database In-Memory Machines are fully compatible with prior Exadata generations and existing systems can also be upgraded with Oracle Exadata X3-2 servers. Oracle Exadata X3 Database In-Memory Machine systems can be used immediately with any application certified with Oracle Database 11g R2 and Oracle Real Application Clusters, including SAP, Oracle Fusion Applications, Oracle’s PeopleSoft, Oracle’s Siebel CRM, the Oracle E-Business Suite, and thousands of other applications. Supporting Quotes “Forward-looking enterprises are moving towards Cloud Computing architectures,” said Andrew Mendelsohn, senior vice president, Oracle Database Server Technologies. “Oracle Exadata’s unique ability to run any database application on a fully scale-out architecture using a combination of massive memory for extreme performance and low-cost disk for high capacity delivers the ideal solution for Cloud-based database deployments today.” Supporting Resources Oracle Press Release Oracle Exadata Database Machine Oracle Exadata X3-2 Database In-Memory Machine Oracle Exadata X3-8 Database In-Memory Machine Oracle Database 11g Follow Oracle Database via Blog, Facebook and Twitter Oracle OpenWorld 2012 Oracle OpenWorld 2012 Keynotes Like Oracle OpenWorld on Facebook Follow Oracle OpenWorld on Twitter Oracle OpenWorld Blog Oracle OpenWorld on LinkedIn Mark Hurd's keynote with Andy Mendelsohn and Juan Loaiza - - watch for the replay to be available soon at http://www.youtube.com/user/Oracle or http://www.oracle.com/openworld/live/on-demand/index.html

    Read the article

  • Oracle’s New Memory-Optimized x86 Servers: Getting the Most Out of Oracle Database In-Memory

    - by Josh Rosen, x86 Product Manager-Oracle
    With the launch of Oracle Database In-Memory, it is now possible to perform real-time analytics operations on your business data as it exists at that moment – in the DRAM of the server – and immediately return completely current and consistent data. The Oracle Database In-Memory option dramatically accelerates the performance of analytics queries by storing data in a highly optimized columnar in-memory format.  This is a truly exciting advance in database technology.As Larry Ellison mentioned in his recent webcast about Oracle Database In-Memory, queries run 100 times faster simply by throwing a switch.  But in order to get the most from the Oracle Database In-Memory option, the underlying server must also be memory-optimized. This week Oracle announced new 4-socket and 8-socket x86 servers, the Sun Server X4-4 and Sun Server X4-8, both of which have been designed specifically for Oracle Database In-Memory.  These new servers use the fastest Intel® Xeon® E7 v2 processors and each subsystem has been designed to be the best for Oracle Database, from the memory, I/O and flash technologies right down to the system firmware.Amongst these subsystems, one of the most important aspects we have optimized with the Sun Server X4-4 and Sun Server X4-8 are their memory subsystems.  The new In-Memory option makes it possible to select which parts of the database should be memory optimized.  You can choose to put a single column or table in memory or, if you can, put the whole database in memory.  The more, the better.  With 3 TB and 6 TB total memory capacity on the Sun Server X4-4 and Sun Server X4-8, respectively, you can memory-optimize more, if not your entire database.   Sun Server X4-8 CMOD with 24 DIMM slots per socket (up to 192 DIMM slots per server) But memory capacity is not the only important factor in selecting the best server platform for Oracle Database In-Memory.  As you put more of your database in memory, a critical performance metric known as memory bandwidth comes into play.  The total memory bandwidth for the server will dictate the rate in which data can be stored and retrieved from memory.  In order to achieve real-time analysis of your data using Oracle Database In-Memory, even under heavy load, the server must be able to handle extreme memory workloads.  With that in mind, the Sun Server X4-8 was designed with the maximum possible memory bandwidth, providing over a terabyte per second of total memory bandwidth.  Likewise, the Sun Server X4-4 also provides extreme memory bandwidth in an even more compact form factor with over half a terabyte per second, providing customers with scalability and choice depending on the size of the database.Beyond the memory subsystem, Oracle’s Sun Server X4-4 and Sun Server X4-8 systems provide other key technologies that enable Oracle Database to run at its best.  The Sun Server X4-4 allows for up 4.8 TB of internal, write-optimized PCIe flash while the Sun Server X4-8 allows for up to 6.4 TB of PCIe flash.  This enables dramatic acceleration of data inserts and updates to Oracle Database.  And with the new elastic computing capability of Oracle’s new x86 servers, server performance can be adapted to your specific Oracle Database workload to ensure that every last bit of processing power is utilized.Because Oracle designs and tests its x86 servers specifically for Oracle workloads, we provide the highest possible performance and reliability when running Oracle Database.  To learn more about Sun Server X4-4 and Sun Server X4-8, you can find more details including data sheets and white papers here. Josh Rosen is a Principal Product Manager for Oracle’s x86 servers, focusing on Oracle’s operating systems and software.  He previously spent more than a decade as a developer and architect of system management software. Josh has worked on system management for many of Oracle's hardware products ranging from the earliest blade systems to the latest Oracle x86 servers. 

    Read the article

  • Sun Fire X4800 M2 Posts World Record x86 SPECjEnterprise2010 Result

    - by Brian
    Oracle's Sun Fire X4800 M2 using the Intel Xeon E7-8870 processor and Sun Fire X4470 M2 using the Intel Xeon E7-4870 processor, produced a world record single application server SPECjEnterprise2010 benchmark result of 27,150.05 SPECjEnterprise2010 EjOPS. The Sun Fire X4800 M2 server ran the application tier and the Sun Fire X4470 M2 server was used for the database tier. The Sun Fire X4800 M2 server demonstrated 63% better performance compared to IBM P780 server result of 16,646.34 SPECjEnterprise2010 EjOPS. The Sun Fire X4800 M2 server demonstrated 4% better performance than the Cisco UCS B440 M2 result, both results used the same number of processors. This result used Oracle WebLogic Server 12c, Java HotSpot(TM) 64-Bit Server 1.7.0_02, and Oracle Database 11g. This result was produced using Oracle Linux. Performance Landscape Complete benchmark results are at the SPEC website, SPECjEnterprise2010 Results. The table below compares against the best results from IBM and Cisco. SPECjEnterprise2010 Performance Chart as of 3/12/2012 Submitter EjOPS* Application Server Database Server Oracle 27,150.05 1x Sun Fire X4800 M2 8x 2.4 GHz Intel Xeon E7-8870 Oracle WebLogic 12c 1x Sun Fire X4470 M2 4x 2.4 GHz Intel Xeon E7-4870 Oracle Database 11g (11.2.0.2) Cisco 26,118.67 2x UCS B440 M2 Blade Server 4x 2.4 GHz Intel Xeon E7-4870 Oracle WebLogic 11g (10.3.5) 1x UCS C460 M2 Blade Server 4x 2.4 GHz Intel Xeon E7-4870 Oracle Database 11g (11.2.0.2) IBM 16,646.34 1x IBM Power 780 8x 3.86 GHz POWER 7 WebSphere Application Server V7 1x IBM Power 750 Express 4x 3.55 GHz POWER 7 IBM DB2 9.7 Workgroup Server Edition FP3a * SPECjEnterprise2010 EjOPS, bigger is better. Configuration Summary Application Server: 1 x Sun Fire X4800 M2 8 x 2.4 GHz Intel Xeon processor E7-8870 256 GB memory 4 x 10 GbE NIC 2 x FC HBA Oracle Linux 5 Update 6 Oracle WebLogic Server 11g Release 1 (10.3.5) Java HotSpot(TM) 64-Bit Server VM on Linux, version 1.7.0_02 (Java SE 7 Update 2) Database Server: 1 x Sun Fire X4470 M2 4 x 2.4 GHz Intel Xeon E7-4870 512 GB memory 4 x 10 GbE NIC 2 x FC HBA 2 x Sun StorageTek 2540 M2 4 x Sun Fire X4270 M2 4 x Sun Storage F5100 Flash Array Oracle Linux 5 Update 6 Oracle Database 11g Enterprise Edition Release 11.2.0.2 Benchmark Description SPECjEnterprise2010 is the third generation of the SPEC organization's J2EE end-to-end industry standard benchmark application. The SPECjEnterprise2010 benchmark has been designed and developed to cover the Java EE 5 specification's significantly expanded and simplified programming model, highlighting the major features used by developers in the industry today. This provides a real world workload driving the Application Server's implementation of the Java EE specification to its maximum potential and allowing maximum stressing of the underlying hardware and software systems. The workload consists of an end to end web based order processing domain, an RMI and Web Services driven manufacturing domain and a supply chain model utilizing document based Web Services. The application is a collection of Java classes, Java Servlets, Java Server Pages, Enterprise Java Beans, Java Persistence Entities (pojo's) and Message Driven Beans. The SPECjEnterprise2010 benchmark heavily exercises all parts of the underlying infrastructure that make up the application environment, including hardware, JVM software, database software, JDBC drivers, and the system network. The primary metric of the SPECjEnterprise2010 benchmark is jEnterprise Operations Per Second ("SPECjEnterprise2010 EjOPS"). This metric is calculated by adding the metrics of the Dealership Management Application in the Dealer Domain and the Manufacturing Application in the Manufacturing Domain. There is no price/performance metric in this benchmark. Key Points and Best Practices Sixteen Oracle WebLogic server instances were started using numactl, binding 2 instances per chip. Eight Oracle database listener processes were started, binding 2 instances per chip using taskset. Additional tuning information is in the report at http://spec.org. See Also Oracle Press Release -- SPECjEnterprise2010 Results Page Sun Fire X4800 M2 Server oracle.com OTN Sun Fire X4270 M2 Server oracle.com OTN Sun Storage 2540-M2 Array oracle.com OTN Oracle Linux oracle.com OTN Oracle Database 11g Release 2 Enterprise Edition oracle.com OTN WebLogic Suite oracle.com OTN Disclosure Statement SPEC and the benchmark name SPECjEnterprise are registered trademarks of the Standard Performance Evaluation Corporation. Sun Fire X4800 M2, 27,150.05 SPECjEnterprise2010 EjOPS; IBM Power 780, 16,646.34 SPECjEnterprise2010 EjOPS; Cisco UCS B440 M2, 26,118.67 SPECjEnterprise2010 EjOPS. Results from www.spec.org as of 3/27/2012.

    Read the article

  • Algorithm to Find the Aggregate Mass of "Granola Bar"-Like Structures?

    - by Stuart Robbins
    I'm a planetary science researcher and one project I'm working on is N-body simulations of Saturn's rings. The goal of this particular study is to watch as particles clump together under their own self-gravity and measure the aggregate mass of the clumps versus the mean velocity of all particles in the cell. We're trying to figure out if this can explain some observations made by the Cassini spacecraft during the Saturnian summer solstice when large structures were seen casting shadows on the nearly edge-on rings. Below is a screenshot of what any given timestep looks like. (Each particle is 2 m in diameter and the simulation cell itself is around 700 m across.) The code I'm using already spits out the mean velocity at every timestep. What I need to do is figure out a way to determine the mass of particles in the clumps and NOT the stray particles between them. I know every particle's position, mass, size, etc., but I don't know easily that, say, particles 30,000-40,000 along with 102,000-105,000 make up one strand that to the human eye is obvious. So, the algorithm I need to write would need to be a code with as few user-entered parameters as possible (for replicability and objectivity) that would go through all the particle positions, figure out what particles belong to clumps, and then calculate the mass. It would be great if it could do it for "each" clump/strand as opposed to everything over the cell, but I don't think I actually need it to separate them out. The only thing I was thinking of was doing some sort of N2 distance calculation where I'd calculate the distance between every particle and if, say, the closest 100 particles were within a certain distance, then that particle would be considered part of a cluster. But that seems pretty sloppy and I was hoping that you CS folks and programmers might know of a more elegant solution? Edited with My Solution: What I did was to take a sort of nearest-neighbor / cluster approach and do the quick-n-dirty N2 implementation first. So, take every particle, calculate distance to all other particles, and the threshold for in a cluster or not was whether there were N particles within d distance (two parameters that have to be set a priori, unfortunately, but as was said by some responses/comments, I wasn't going to get away with not having some of those). I then sped it up by not sorting distances but simply doing an order N search and increment a counter for the particles within d, and that sped stuff up by a factor of 6. Then I added a "stupid programmer's tree" (because I know next to nothing about tree codes). I divide up the simulation cell into a set number of grids (best results when grid size ˜7 d) where the main grid lines up with the cell, one grid is offset by half in x and y, and the other two are offset by 1/4 in ±x and ±y. The code then divides particles into the grids, then each particle N only has to have distances calculated to the other particles in that cell. Theoretically, if this were a real tree, I should get order N*log(N) as opposed to N2 speeds. I got somewhere between the two, where for a 50,000-particle sub-set I got a 17x increase in speed, and for a 150,000-particle cell, I got a 38x increase in speed. 12 seconds for the first, 53 seconds for the second, 460 seconds for a 500,000-particle cell. Those are comparable speeds to how long the code takes to run the simulation 1 timestep forward, so that's reasonable at this point. Oh -- and it's fully threaded, so it'll take as many processors as I can throw at it.

    Read the article

  • Measuring ASP.NET and SharePoint output cache

    - by DigiMortal
    During ASP.NET output caching week in my local blog I wrote about how to measure ASP.NET output cache. As my posting was based on real work and real-life results then I thought that this posting is maybe interesting to you too. So here you can read what I did, how I did and what was the result. Introduction Caching is not effective without measuring it. As MVP Henn Sarv said in one of his sessions then you will get what you measure. And right he is. Lately I measured caching on local Microsoft community portal to make sure that our caching strategy is good enough in environment where this system lives. In this posting I will show you how to start measuring the cache of your web applications. Although the application measured is built on SharePoint Server publishing infrastructure, all those counters have same meaning as similar counters under pure ASP.NET applications. Measured counters I used Performance Monitor and the following performance counters (their names are similar on ASP.NET and SharePoint WCMS): Total number of objects added – how much objects were added to output cache. Total object discards – how much objects were deleted from output cache. Cache hit count – how many times requests were served by cache. Cache hit ratio – percent of requests served from cache. The first three counters are cumulative while last one is coefficient. You can use also other counters to measure the full effect of caching (memory, processor, disk I/O, network load etc before and after caching). Measuring process The measuring I describe here started from freshly restarted web server. I measured application during 12 hours that covered also time ranges when users are most active. The time range does not include late evening hours and night because there is nothing to measure during these hours. During measuring we performed no maintenance or administrative tasks on server. All tasks performed were related to usual daily content management and content monitoring. Also we had no advertisement campaigns or other promotions running at same time. The results You can see the results on following graphic.   Total number of objects added   Total object discards   Cache hit count   Cache hit ratio You can see that adds and discards are growing in same tempo. It is good because cache expires and not so popular items are not kept in memory. If there are more popular content then the these lines may have bigger distance between them. Cache hit count grows faster and this shows that more and more content is served from cache. In current case it shows that cache is filled optimally and we can do even better if we tune caches more. The site contains also pages that are discarded when some subsite changes (page was added/modified/deleted) and one modification may affect about four or five pages. This may also decrease cache hit count because during day the site gets about 5-10 new pages. Cache hit ratio is currently extremely good. The suggested minimum is about 85% but after some tuning and measuring I achieved 98.7% as a result. This is due to the fact that new pages are most often requested and after new pages are added the older ones are requested only sometimes. So they get discarded from cache and only some of these will return sometimes back to cache. Although this may also indicate the need for additional SEO work the result is very well in technical means. Conclusion Measuring ASP.NET output cache is not complex thing to do and you can start by measuring performance of cache as a start. Later you can move on and measure caching effect to other counters such as disk I/O, network, processors etc. What you have to achieve is optimal cache that is not full of items asked only couple of times per day (you can avoid this by not using too long cache durations). After some tuning you should be able to boost cache hit ratio up to at least 85%.

    Read the article

  • Unleash the Power of Cryptography on SPARC T4

    - by B.Koch
    by Rob Ludeman Oracle’s SPARC T4 systems are architected to deliver enhanced value for customer via the inclusion of many integrated features.  One of the best examples of this approach is demonstrated in the on-chip cryptographic support that delivers wire speed encryption capabilities without any impact to application performance.  The Evolution of SPARC Encryption SPARC T-Series systems have a long history of providing this capability, dating back to the release of the first T2000 systems that featured support for on-chip RSA encryption directly in the UltraSPARC T1 processor.  Successive generations have built on this approach by support for additional encryption ciphers that are tightly coupled with the Oracle Solaris 10 and Solaris 11 encryption framework.  While earlier versions of this technology were implemented using co-processors, the SPARC T4 was redesigned with new crypto instructions to eliminate some of the performance overhead associated with the former approach, resulting in much higher performance for encrypted workloads. The Superiority of the SPARC T4 Approach to Crypto As companies continue to engage in more and more e-commerce, the need to provide greater degrees of security for these transactions is more critical than ever before.  Traditional methods of securing data in transit by applications have a number of drawbacks that are addressed by the SPARC T4 cryptographic approach. 1. Performance degradation – cryptography is highly compute intensive and therefore, there is a significant cost when using other architectures without embedded crypto functionality.  This performance penalty impacts the entire system, slowing down performance of web servers (SSL), for example, and potentially bogging down the speed of other business applications.  The SPARC T4 processor enables customers to deliver high levels of security to internal and external customers while not incurring an impact to overall SLAs in their IT environment. 2. Added cost – one of the methods to avoid performance degradation is the addition of add-in cryptographic accelerator cards or external offload engines in other systems.  While these solutions provide a brute force mechanism to avoid the problem of slower system performance, it usually comes at an added cost.  Customers looking to encrypt datacenter traffic without the overhead and expenditure of extra hardware can rely on SPARC T4 systems to deliver the performance necessary without the need to purchase other hardware or add-on cards. 3. Higher complexity – the addition of cryptographic cards or leveraging load balancers to perform encryption tasks results in added complexity from a management standpoint.  With SPARC T4, encryption keys and the framework built into Solaris 10 and 11 means that administrators generally don’t need to spend extra cycles determining how to perform cryptographic functions.  In fact, many of the instructions are built-in and require no user intervention to be utilized.  For example, For OpenSSL on Solaris 11, SPARC T4 crypto is available directly with a new built-in OpenSSL 1.0 engine, called the "t4 engine."  For a deeper technical dive into the new instructions included in SPARC T4, consult Dan Anderson’s blog. Conclusion In summary, SPARC T4 systems offer customers much more value for applications than just increased performance. The integration of key virtualization technologies, embedded encryption, and a true Enterprise Operating System, Oracle Solaris, provides direct business benefits that supersedes the commodity approach to data center computing.   SPARC T4 removes the roadblocks to secure computing by offering integrated crypto accelerators that can save IT organizations in operating cost while delivering higher levels of performance and meeting objectives around compliance. For more on the SPARC T4 family of products, go to here.

    Read the article

  • Low level programming - what's in it for me?

    - by back2dos
    For years I have considered digging into what I consider "low level" languages. For me this means C and assembly. However I had no time for this yet, nor has it EVER been neccessary. Now because I don't see any neccessity arising, I feel like I should either just schedule some point in time when I will study the subject or drop the plan forever. My Position For the past 4 years I have focused on "web technologies", which may change, and I am an application developer, which is unlikely to change. In application development, I think usability is the most important thing. You write applications to be "consumed" by users. The more usable those applications are, the more value you have produced. In order to achieve good usability, I believe the following things are viable Good design: Well-thought-out features accessible through a well-thought-out user interface. Correctness: The best design isn't worth anything, if not implemented correctly. Flexibility: An application A should constantly evolve, so that its users need not switch to a different application B, that has new features, that A could implement. Applications addressing the same problem should not differ in features but in philosophy. Performance: Performance contributes to a good user experience. An application is ideally always responsive and performs its tasks reasonably fast (based on their frequency). The value of performance optimization beyond the point where it is noticeable by the user is questionable. I think low level programming is not going to help me with that, except for performance. But writing a whole app in a low level language for the sake of performance is premature optimization to me. My Question What could low level programming teach me, what other languages wouldn't teach me? Am I missing something, or is it just a skill, that is of very little use for application development? Please understand, that I am not questioning the value of C and assembly. It's just that in my everyday life, I am quite happy that all the intricacies of that world are abstracted away and managed for me (mostly by layers written in C/C++ and assembly themselves). I just don't see any concepts, that could be new to me, only details I would have to stuff my head with. So what's in it for me? My Conclusion Thanks to everyone for their answers. I must say, nobody really surprised me, but at least now I am quite sure I will drop this area of interest until any need for it arises. To my understanding, writing assembly these days for processors as they are in use in today's CPUs is not only unneccesarily complicated, but risks to result in poorer runtime performance than a C counterpart. Optimizing by hand is nearly impossible due to OOE, while you do not get all kinds of optimizations a compiler can do automatically. Also, the code is either portable, because it uses a small subset of available commands, or it is optimized, but then it probably works on one architecture only. Writing C is not nearly as neccessary anymore, as it was in the past. If I were to write an application in C, I would just as much use tested and established libraries and frameworks, that would spare me implementing string copy routines, sorting algorithms and other kind of stuff serving as exercise at university. My own code would execute faster at the cost of type safety. I am neither keen on reeinventing the wheel in the course of normal app development, nor trying to debug by looking at core dumps :D I am currently experimenting with languages and interpreters, so if there is anything I would like to publish, I suppose I'd port a working concept to C, although C++ might just as well do the trick. Again, thanks to everyone for your answers and your insight.

    Read the article

  • SPARC T4-2 Produces World Record Oracle Essbase Aggregate Storage Benchmark Result

    - by Brian
    Significance of Results Oracle's SPARC T4-2 server configured with a Sun Storage F5100 Flash Array and running Oracle Solaris 10 with Oracle Database 11g has achieved exceptional performance for the Oracle Essbase Aggregate Storage Option benchmark. The benchmark has upwards of 1 billion records, 15 dimensions and millions of members. Oracle Essbase is a multi-dimensional online analytical processing (OLAP) server and is well-suited to work well with SPARC T4 servers. The SPARC T4-2 server (2 cpus) running Oracle Essbase 11.1.2.2.100 outperformed the previous published results on Oracle's SPARC Enterprise M5000 server (4 cpus) with Oracle Essbase 11.1.1.3 on Oracle Solaris 10 by 80%, 32% and 2x performance improvement on Data Loading, Default Aggregation and Usage Based Aggregation, respectively. The SPARC T4-2 server with Sun Storage F5100 Flash Array and Oracle Essbase running on Oracle Solaris 10 achieves sub-second query response times for 20,000 users in a 15 dimension database. The SPARC T4-2 server configured with Oracle Essbase was able to aggregate and store values in the database for a 15 dimension cube in 398 minutes with 16 threads and in 484 minutes with 8 threads. The Sun Storage F5100 Flash Array provides more than a 20% improvement out-of-the-box compared to a mid-size fiber channel disk array for default aggregation and user-based aggregation. The Sun Storage F5100 Flash Array with Oracle Essbase provides the best combination for large Oracle Essbase databases leveraging Oracle Solaris ZFS and taking advantage of high bandwidth for faster load and aggregation. Oracle Fusion Middleware provides a family of complete, integrated, hot pluggable and best-of-breed products known for enabling enterprise customers to create and run agile and intelligent business applications. Oracle Essbase's performance demonstrates why so many customers rely on Oracle Fusion Middleware as their foundation for innovation. Performance Landscape System Data Size(millions of items) Database Load(minutes) Default Aggregation(minutes) Usage Based Aggregation(minutes) SPARC T4-2, 2 x SPARC T4 2.85 GHz 1000 149 398* 55 Sun M5000, 4 x SPARC64 VII 2.53 GHz 1000 269 526 115 Sun M5000, 4 x SPARC64 VII 2.4 GHz 400 120 448 18 * – 398 mins with CALCPARALLEL set to 16; 484 mins with CALCPARALLEL threads set to 8 Configuration Summary Hardware Configuration: 1 x SPARC T4-2 2 x 2.85 GHz SPARC T4 processors 128 GB memory 2 x 300 GB 10000 RPM SAS internal disks Storage Configuration: 1 x Sun Storage F5100 Flash Array 40 x 24 GB flash modules SAS HBA with 2 SAS channels Data Storage Scheme Striped - RAID 0 Oracle Solaris ZFS Software Configuration: Oracle Solaris 10 8/11 Installer V 11.1.2.2.100 Oracle Essbase Client v 11.1.2.2.100 Oracle Essbase v 11.1.2.2.100 Oracle Essbase Administration services 64-bit Oracle Database 11g Release 2 (11.2.0.3) HP's Mercury Interactive QuickTest Professional 9.5.0 Benchmark Description The objective of the Oracle Essbase Aggregate Storage Option benchmark is to showcase the ability of Oracle Essbase to scale in terms of user population and data volume for large enterprise deployments. Typical administrative and end-user operations for OLAP applications were simulated to produce benchmark results. The benchmark test results include: Database Load: Time elapsed to build a database including outline and data load. Default Aggregation: Time elapsed to build aggregation. User Based Aggregation: Time elapsed of the aggregate views proposed as a result of tracked retrieval queries. Summary of the data used for this benchmark: 40 flat files, each of size 1.2 GB, 49.4 GB in total 10 million rows per file, 1 billion rows total 28 columns of data per row Database outline has 15 dimensions (five of them are attribute dimensions) Customer dimension has 13.3 million members 3 rule files Key Points and Best Practices The Sun Storage F5100 Flash Array has been used to accelerate the application performance. Setting data load threads (DLTHREADSPREPARE) to 64 and Load Buffer to 6 improved dataloading by about 9%. Factors influencing aggregation materialization performance are "Aggregate Storage Cache" and "Number of Threads" (CALCPARALLEL) for parallel view materialization. The optimal values for this workload on the SPARC T4-2 server were: Aggregate Storage Cache: 32 GB CALCPARALLEL: 16   See Also Oracle Essbase Aggregate Storage Option Benchmark on Oracle's SPARC T4-2 Server oracle.com Oracle Essbase oracle.com OTN SPARC T4-2 Server oracle.com OTN Oracle Solaris oracle.com OTN Oracle Database 11g Release 2 Enterprise Edition oracle.com OTN Disclosure Statement Copyright 2012, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 28 August 2012.

    Read the article

  • Impressions of my ASUS eee slate EP121 - Dual core 4GB, 64GB SSD

    - by tonyrogerson
    This thing is lovely, very nice bluetooth keyboard that has nice feedback on the keypress, there is no mouse but you can use the stylus or get yourself a bluetooth mouse, me, I've opted for a Microsoft ARC mouse which is a delight to use, the USB doors are a pain to open for the first time if like me you don't have any finger nails. It came as a suprise that the slate shows four processors, Dual Core with multi-threading, I didn't really look at the processor I was more interested in the amount of memory and the SSD; you don't get the full 4GB even with the 64 bit version of Windows 7 installed (which I immediately upgraded to Ultimate through my MSDN subscription). The box is extremely responsive - extremely, it loads Winword in literally a second. I've got office 2010 and onenote 2010 on there now; one problem is that on applying all (43) windows updates since the upgrade the machine is still sat on step 3 of 3 on the start up configuring updates screen after about an hour, you can't turn this machine off without using a paper clip to reset it and as I have just found you need a paper clip :). Installing Windows 7 SP1 was effortless. One of the first things I did on it was to reduce the size of the font, by default its set at 125%, my eye sight is ok :) so I've set that back down. Amazon Kindle for the PC works really well, plenty of text on the screen when viewed portrait, the case it comes with also allows the slate to stand up in various positions - portrait, horizontal - seems stable enough. The wireless works well, seems to have a better signal than my other two laptop machines which is good news. The gadget passed the pose test at work :). I use offline files to keep a copy of all my work stuff locally, I'm not sure what it is, well, its probably my server but whenever I try and sync it runs for a couple of minutes then fails with network name no longer contactable, funnily enough its fine from my big laptop so I can only guess this may be a driver type issue on the EP121 itself - very odd and very annoying. I do a lot of presenting and need to plug into a VGA project because most sites that's all that is offered, the EP121 has a mini-hdmi output which is great except for this scenario, hdmi is digital, vga is analogue, you will struggle to find a cost effective solution, I found HDFury and also a device HP do, however, a better solution appears to be getting a USB graphics adapter for instance the one I've ordered is the ClimaxDigital USB 2.0 to DVI,VGA or HDMI Adaptor which gives everything I need - VGA and DVI output and great resolution as well - ok, so fingers crossed because I'm presenting next Wednesday in Edinburgh and not taking my 300kg lenovo w700 (I'm sure my back just sighed in relief) - it certainly works really well on my LED TV, the install was simple - it just works! One of the several reasons for buying this piece of kit was to use it on my LED TV to remote into my main machine to check stuff whilst sat in my living room, also to watch webcasts and lecture videos in comfort away from my office, because of the wireless speed and limitation I'm opting for a USB network adapter from Belkin - that will also allow me to take advantage of my home gigabit network, there are only 2 usb ports on the slate so I'm going to knock up a hub so connecting it in is straight forward and simple, I'm also going to purchase a second power supply so I don't have to faff about with that either.I now have the developer x64 edition of SQL Server 2008 R2, yes everything :) - about 16GB left to play with on the machine now but that will be fine, I'll put AdventureWorks on there so I can play and demo stuff which is all I'm after from this, my development machine is significantly more powerful and meets my storage needs too.Travel test this weekend and next week, I'm in Dundee for my final exam for the masters degree.

    Read the article

  • Organization &amp; Architecture UNISA Studies &ndash; Chap 6

    - by MarkPearl
    Learning Outcomes Discuss the physical characteristics of magnetic disks Describe how data is organized and accessed on a magnetic disk Discuss the parameters that play a role in the performance of magnetic disks Describe different optical memory devices Magnetic Disk The way data is stored on and retried from magnetic disks Data is recorded on and later retrieved form the disk via a conducting coil named the head (in many systems there are two heads) The writ mechanism exploits the fact that electricity flowing through a coil produces a magnetic field. Electric pulses are sent to the write head, and the resulting magnetic patterns are recorded on the surface below with different patterns for positive and negative currents The physical characteristics of a magnetic disk   Summarize from book   The factors that play a role in the performance of a disk Seek time – the time it takes to position the head at the track Rotational delay / latency – the time it takes for the beginning of the sector to reach the head Access time – the sum of the seek time and rotational delay Transfer time – the time it takes to transfer data RAID The rate of improvement in secondary storage performance has been considerably less than the rate for processors and main memory. Thus secondary storage has become a bit of a bottleneck. RAID works on the concept that if one disk can be pushed so far, additional gains in performance are to be had by using multiple parallel components. Points to note about RAID… RAID is a set of physical disk drives viewed by the operating system as a single logical drive Data is distributed across the physical drives of an array in a scheme known as striping Redundant disk capacity is used to store parity information, which guarantees data recoverability in case of a disk failure (not supported by RAID 0 or RAID 1) Interesting to note that the increase in the number of drives, increases the probability of failure. To compensate for this decreased reliability RAID makes use of stored parity information that enables the recovery of data lost due to a disk failure.   The RAID scheme consists of 7 levels…   Category Level Description Disks Required Data Availability Large I/O Data Transfer Capacity Small I/O Request Rate Striping 0 Non Redundant N Lower than single disk Very high Very high for both read and write Mirroring 1 Mirrored 2N Higher than RAID 2 – 5 but lower than RAID 6 Higher than single disk Up to twice that of a signle disk for read Parallel Access 2 Redundant via Hamming Code N + m Much higher than single disk Highest of all listed alternatives Approximately twice that of a single disk Parallel Access 3 Bit interleaved parity N + 1 Much higher than single disk Highest of all listed alternatives Approximately twice that of a single disk Independent Access 4 Block interleaved parity N + 1 Much higher than single disk Similar to RAID 0 for read, significantly lower than single disk for write Similar to RAID 0 for read, significantly lower than single disk for write Independent Access 5 Block interleaved parity N + 1 Much higher than single disk Similar to RAID 0 for read, lower than single disk for write Similar to RAID 0 for read, generally  lower than single disk for write Independent Access 6 Block interleaved parity N + 2 Highest of all listed alternatives Similar to RAID 0 for read; lower than RAID 5 for write Similar to RAID 0 for read, significantly lower than RAID 5  for write   Read page 215 – 221 for detailed explanation on RAID levels Optical Memory There are a variety of optical-disk systems available. Read through the table on page 222 – 223 Some of the devices include… CD CD-ROM CD-R CD-RW DVD DVD-R DVD-RW Blue-Ray DVD Magnetic Tape Most modern systems use serial recording – data is lade out as a sequence of bits along each track. The typical recording used in serial is referred to as serpentine recording. In this technique when data is being recorded, the first set of bits is recorded along the whole length of the tape. When the end of the tape is reached the heads are repostioned to record a new track, and the tape is again recorded on its whole length, this time in the opposite direction. That process continued back and forth until the tape is full. To increase speed, the read-write head is capable of reading and writing a number of adjacent tracks simultaneously. Data is still recorded serially along individual tracks, but blocks in sequence are stored on adjacent tracks as suggested. A tape drive is a sequential access device. Magnetic tape was the first kind of secondary memory. It is still widely used as the lowest-cost, slowest speed member of the memory hierarchy.

    Read the article

  • E-Business Suite : Role of CHUNK_SIZE in Oracle Payroll

    - by Giri Mandalika
    Different batch processes in Oracle Payroll flow have the ability to spawn multiple child processes (or threads) to complete the work in hand. The number of child processes to fork is controlled by the THREADS parameter in APPS.PAY_ACTION_PARAMETERS view. THREADS parameter The default value for THREADS parameter is 1, which is fine for a single-processor system but not optimal for the modern multi-core multi-processor systems. Setting the THREADS parameter to a value equal to or less than the total number of [virtual] processors available on the system may improve the performance of payroll processing. However on the down side, since multiple child processes operate against the same set of payroll tables in HR schema, database may experience undesired consequences such as buffer busy waits and index contention, which results in giving up some of the gains achieved by using multiple child processes/threads to process the work. Couple of other action parameters, CHUNK_SIZE and CHUNK_SHUFFLE, help alleviate the database contention. eg., Set a value for THREADS parameter as shown below. CONNECT APPS/APPS_PASSWORD UPDATE PAY_ACTION_PARAMETERS SET PARAMETER_VALUE = DESIRED_VALUE WHERE PARAMETER_NAME = 'THREADS'; COMMIT; (I am not aware of any maximum value for THREADS parameter) CHUNK_SIZE parameter The size of each commit unit for the batch process is controlled by the CHUNK_SIZE action parameter. In other words, chunking is the act of splitting the assignment actions into commit groups of desired size represented by the CHUNK_SIZE parameter. The default value is 20, and each thread processes one chunk at a time -- which means each child process inserts or processes 20 assignment actions at any time. When multiple threads are configured, each thread picks up a chunk to process, completes the assignment actions and then picks up another chunk. This is repeated until all the chunks are exhausted. It is possible to use different chunk sizes in different batch processes. During the initial phase of processing, CHUNK_SIZE number of assignment actions are inserted into relevant table(s). When multiple child processes are inserting data at the same time into the same set of tables, as explained earlier, database may experience contention. The default value of 20 is mostly optimal in such a case. Experiment with different values for the initial phase by +/-10 for CHUNK_SIZE parameter and observe the performance impact. A larger value may make sense during the main processing phase. Again experimentation is the key in finding the suitable value for your environment. Start with a large value such as 2000 for the chunk size, then increment or decrement the size by 500 at a time until an optimal value is found. eg., Set a value for CHUNK_SIZE parameter as shown below. CONNECT APPS/APPS_PASSWORD UPDATE PAY_ACTION_PARAMETERS SET PARAMETER_VALUE = DESIRED_VALUE WHERE PARAMETER_NAME = 'CHUNK_SIZE'; COMMIT; CHUNK_SIZE action parameter accepts a value that is as low as 1 or as high as 16000. CHUNK SHUFFLE parameter By default, chunks of assignment actions are processed sequentially by all threads - which may not be a good thing especially given that all child processes/threads performing similar actions against the same set of tables almost at the same time. By saying not a good thing, I mean to say that the default behavior leads to contention in the database (in data blocks, for example). It is possible to relieve some of that database contention by randomizing the processing order of chunks of assignment actions. This behavior is controlled by the CHUNK SHUFFLE action parameter. Chunk processing is not randomized unless explicitly configured. eg., Set chunk shuffling as shown below. CONNECT APPS/APPS_PASSWORD UPDATE PAY_ACTION_PARAMETERS SET PARAMETER_VALUE = 'Y' WHERE PARAMETER_NAME = 'CHUNK SHUFFLE'; COMMIT; Finally I recommend checking the following document out for additional details and additional pay action tunable parameters that may speed up the processing of Oracle Payroll.     My Oracle Support Doc ID: 226987.1 Oracle 11i & R12 Human Resources (HRMS) & Benefits (BEN) Tuning & System Health Checks Also experiment with different combinations of parameters and values until the right set of action parameters and values are found for your deployment.

    Read the article

  • About the K computer

    - by nospam(at)example.com (Joerg Moellenkamp)
    Okay ? after getting yet another mail because of the new #1 on the Top500 list, I want to add some comments from my side: Yes, the system is using SPARC processor. And that is great news for a SPARC fan like me. It is using the SPARC VIIIfx processor from Fujitsu clocked at 2 GHz. No, it isn't the only one. Most people are saying there are two in the Top500 list using SPARC (#77 JAXA and #1 K) but in fact there are three. The Tianhe-1 (#2 on the Top500 list) super computer contains 2048 Galaxy "FT-1000" 1 GHz 8-core processors. Don't know it? The FeiTeng-1000 ? this proc is a 8 core, 8 threads per core, 1 ghz processor made in China. And it's SPARC based. By the way ? this sounds really familiar to me ? perhaps the people just took the opensourced UltraSPARC-T2 design, because some of the parameters sound just to similar. However it looks like that Tianhe-1 is using the SPARCs as input nodes and not as compute notes. No, I don't see it as the next M-series processor. Simple reason: You can't create SMP systems out of them ? it simply hasn't the functionality to do so. Even when there are multiple CPUs on a single board, they are not connected like an SMP/NUMA machine to a shared memory machine ? they are connected with the cluster interconnect (in this case the Tofu interconnect) and work like a large cluster. Yes, it has a lot of oomph in Linpack ? however I assume a lot came from the extensions to the SPARCv9 standard. No, Linpack has no relevance for any commercial workload ? Linpack is such a special load, that even some HPC people are arguing that it isn't really a good benchmark for HPC. It's embarrassingly parallel, it can work with relatively small interconnects compared to the interconnects in SMP systems (however we get in spheres SMP interconnects where a few years ago). Amdahl isn't hitting that hard when running Linpack. Yes, it's a good move to use SPARC. At some time in the last 10 years, there was an interesting twist in perception: SPARC was considered as proprietary architecture and x86 was the open architecture. However it's vice versa ? try to create a x86 clone and you have a lot of intellectual property problems, create a SPARC clone and you have to spend 100 bucks or so to get the specification from the SPARC Foundation and develop your own SPARC processor. Fujitsu is doing this for a long time now. So they had their own processor, their own know-how. So why was SPARC a good choice? Well ? essentially Fujitsu can do what they want with their core as it is their core, for example adding the extensions to the SPARCv9 chipset ? getting Intel to create extensions to x86 to help you with your product is a little bit harder. So Fujitsu could do they needed to do with their processor in order to create such a supercomputer. No, the K is really using no FPGA or GPU as accelerators. The K is really using the CPU at doing this job. Yes, it has a significantly enhanced FPU capable to execute 8 instructions in parallel. No, it doesn't run Solaris. Yes, it uses Linux. No, it doesn't hurt me ... as my colleague Roland Rambau (he knows a lot about HPC) said once to me ... it doesn't matter which OS is staying out of the way of the workload in HPC.

    Read the article

< Previous Page | 22 23 24 25 26 27 28 29 30  | Next Page >