Search Results

Search found 5976 results on 240 pages for 'cpu cycles'.


  • Good DBAs Do Baselines

    - by Louis Davidson
    One morning, you wake up and feel funny. You can’t quite put your finger on it, but something isn’t quite right. What now? Unless you happen to be a hypochondriac, you likely drag yourself out of bed, get on with the day and gather more “evidence”. You check your symptoms over the next few days; do you feel the same, better, worse? If better, then great, it was some temporary issue, perhaps caused by an allergic reaction to some suspiciously spicy chicken. If the same or worse, then you go to the doctor for some health advice, but armed with some data to share, and having ruled out certain possible causes that are fixed with a bit of rest and perhaps an antacid. Whether you realize it or not, in comparing how you feel one day to the next, you have taken baseline measurements. In much the same way, a DBA uses baselines to gauge the health of their database servers. Of course, while SQL Server is very willing to share data regarding its health and activities, it has almost no idea of the difference between good and bad. Over time, experienced DBAs develop “mental” baselines with which they can gauge the health of their servers almost as easily as that of their own body. They accumulate knowledge of the daily, natural state of each part of their database system, and so know instinctively when one of their databases “feels funny”. Equally, they know when an “issue” is just a passing tremor. They see their SQL Server with all four of its CPU cores running close to 100% and don’t panic anymore. Why? It’s 5PM and every day the same thing occurs when the end-of-day reports, which are very CPU intensive, are running. Equally, they know when to respond in earnest the first time they hear about an issue, even if it has been happening every day. Nevertheless, no DBA can retain mental baselines for every characteristic of their systems, so we need to collect physical baselines too. In my experience, surprisingly few DBAs do this very well.
Part of the problem is that SQL Server provides a lot of instrumentation. If you look, you will find an almost overwhelming amount of data regarding user activity on your SQL Server instances, and use and abuse of the available CPU, I/O and memory. It seems like a huge task even to work out which data you need to collect, let alone start collecting it on a regular basis, managing its storage over time, and performing detailed comparative analysis. Without baselines, though, it is very difficult to pinpoint what ails a server just by looking at a single snapshot of the data, or to spot retrospectively what caused the problem by examining aggregated data for the server, collected over many months. It isn’t as hard as you think to get started. You’ve probably already established some troubleshooting queries of the type SELECT Value FROM SomeSystemTableOrView. Capturing a set of baseline values for such a query can be as easy as changing it as follows: INSERT INTO BaseLine.SomeSystemTable (value, captureTime) SELECT Value, SYSDATETIME() FROM SomeSystemTableOrView; Of course, there are monitoring tools that will collect and manage this baseline data for you, automatically, and allow you to compare metrics over different periods. However, to get yourself started and to prove to yourself (or perhaps the person who writes the checks for tools) the value of baselines, stick something similar to the above query into an agent job, running every hour or so, and you are on your way with no excuses! Then, the next time you investigate a slow server, and see x open transactions, y users logged in, and z rows added per hour in the Orders table, compare to your baselines and see immediately what, if anything, has changed!
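The capture-then-compare idea above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for SQL Server, the table names some_system_table_or_view and baseline_some_system_table are hypothetical counterparts of the placeholders in the article, and the capture times are passed in by hand rather than taken from SYSDATETIME().

```python
import sqlite3


def make_demo_db():
    """Create an in-memory stand-in for the source view and the baseline table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE some_system_table_or_view (value INTEGER)")
    conn.execute(
        "CREATE TABLE baseline_some_system_table (value INTEGER, capture_time TEXT)"
    )
    return conn


def capture_baseline(conn, capture_time):
    """Copy the current values into the baseline table, stamped with capture_time."""
    conn.execute(
        "INSERT INTO baseline_some_system_table (value, capture_time) "
        "SELECT value, ? FROM some_system_table_or_view",
        (capture_time,),
    )
    conn.commit()


def totals_by_capture(conn):
    """Return {capture_time: summed value} so two captures can be compared."""
    cursor = conn.execute(
        "SELECT capture_time, SUM(value) FROM baseline_some_system_table "
        "GROUP BY capture_time ORDER BY capture_time"
    )
    return dict(cursor.fetchall())


conn = make_demo_db()
conn.execute("INSERT INTO some_system_table_or_view VALUES (12)")
capture_baseline(conn, "2012-07-01T09:00")   # first hourly capture
conn.execute("UPDATE some_system_table_or_view SET value = 40")
capture_baseline(conn, "2012-07-01T10:00")   # second hourly capture
baselines = totals_by_capture(conn)
print(baselines)
```

In a real deployment the two capture calls would be replaced by a scheduled job, as the article suggests, and the comparison query would run only when something looks wrong.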

    Read the article

  • How to get faster graphics in KVM? VNC is painfully slow with Haiku OS guest, Spice won't install and SDL doesn't work

    - by Don Quixote
    I've been coming up to speed on the Haiku operating system, an Open Source clone of BeOS 5 Pro. I'm using an Apple MacBook Pro as my development machine. Apple's BootCamp BIOS does not support more than four partitions on the internal hard drive. While I can set up extended and logical partitions, doing so will prevent any of the installed operating systems from booting. To run Haiku directly on the iron, I boot it off a USB stick. Using external storage is also helpful because I am perpetually out of filesystem space. While VirtualBox is documented to allow access to physical drives, I could not actually get it to work. Also VirtualBox can only use one of the host CPU's cores. While VB guests can be configured for more than one CPU, they are only emulated. A full build of the Haiku OS takes 4.5 under VB. I had the hope of reducing build times by using KVM instead, but it's not working nearly as well as VirtualBox did. The Linux Kernel-based Virtual Machine is broken in all manner of fundamental ways as seen from Haiku. But I'm a coder; maybe I could contribute to fixing some of those problems. The first problem I've got is that Haiku's video in virt-manager is quite painfully slow. When I drag Haiku windows around the desktop, they lag quite far behind where my mouse is. It's quite difficult to move a window to a precise position on the screen. Just imagine that the mouse was connected to the window title bar with a really stretchy spring. Also Haiku's mouse pointer lags quite far behind where I have moved it. I found lots of Personal Package Archives that enable Spice for QEMU / KVM at the Ubuntu Personal Package Archives. I tried a few of the PPAs but none of them worked; with one of them, the command "add-apt-repository" crashed with a traceback. There is a Wiki page about Spice, but it says that it only works on 64-bit. My Early 2006 MacBook Pro is 32-bit. Its Apple Model Identifier is MacBookPro1,1; these use Core Duos, NOT Core 2 Duos.
I don't mind building a source deb for 32-bit if I can expect it to work. Is there some reason that Spice should be 64-bit only? Does it need features of the x86_64 Instruction Set Architecture that x86 does not have? When I try using SDL from virt-manager, the configuration for Local SDL Window says "Xauth: /home/mike/.Xauthority". When I try to start my guest, virt-manager emits an error. When I Googled the error message, the usual solution was to make ~/.Xauthority readable. However, .Xauthority does not exist in my home directory. Instead I have a $XAUTHORITY environment variable. There is no way to configure SDL in virt-manager to use $XAUTHORITY instead of ~/.Xauthority. Neither does it work to copy the value of $XAUTHORITY into the file. I am ready to scream, because I've spent five fscking days trying to make KVM work for Haiku development. There is a whole lot more that is broken than the slow video. All I really want to do for now is speed up my full builds of Haiku by using "jam -j2" to use both cores in my CPU. I may try Xen next, but the last time I monkeyed with Xen it was far, far more broken than I am finding KVM to be. Just for now, I would be satisfied if there were some way to use my USB stick as a drive in VirtualBox. VB does allow me to configure /dev/sdb as a drive, but it always causes a fatal error when I try to launch the guest. Thank You For Any Advice You Can Give Me.

    Read the article

  • Using Appendbuffers in unity for terrain generation

    - by Wardy
    Like many others I figured I would try and make the most of the monster processing power of the GPU, but I'm having trouble getting the basics in place.

    CPU code:

        using UnityEngine;
        using System.Collections;

        public class Test : MonoBehaviour
        {
            public ComputeShader Generator;
            public MeshTopology Topology;

            void OnEnable()
            {
                var computedMeshPoints = ComputeMesh();
                CreateMeshFrom(computedMeshPoints);
            }

            private Vector3[] ComputeMesh()
            {
                var size = (32*32) * 4; // 4 points added for each x,z pos
                var buffer = new ComputeBuffer(size, 12, ComputeBufferType.Append);
                Generator.SetBuffer(0, "vertexBuffer", buffer);
                Generator.Dispatch(0, 1, 1, 1);
                var results = new Vector3[size];
                buffer.GetData(results);
                buffer.Dispose();
                return results;
            }

            private void CreateMeshFrom(Vector3[] generatedPoints)
            {
                var filter = GetComponent<MeshFilter>();
                var renderer = GetComponent<MeshRenderer>();
                if (generatedPoints.Length > 0)
                {
                    var mesh = new Mesh { vertices = generatedPoints };
                    var colors = new Color[generatedPoints.Length];
                    var indices = new int[generatedPoints.Length];
                    //TODO: build this different based on topology of the mesh being generated
                    for (int i = 0; i < indices.Length; i++)
                    {
                        indices[i] = i;
                        colors[i] = Color.blue;
                    }
                    mesh.SetIndices(indices, Topology, 0);
                    mesh.colors = colors;
                    mesh.RecalculateNormals();
                    mesh.Optimize();
                    mesh.RecalculateBounds();
                    filter.sharedMesh = mesh;
                }
                else
                {
                    filter.sharedMesh = null;
                }
            }
        }

    GPU code:

        #pragma kernel Generate

        AppendStructuredBuffer<float3> vertexBuffer : register(u0);

        void genVertsAt(uint2 xzPos)
        {
            //TODO: put some height generation code here.
            // could even run marching cubes / dual contouring code.
            float3 corner1 = float3( xzPos[0], 0, xzPos[1] );
            float3 corner2 = float3( xzPos[0] + 1, 0, xzPos[1] );
            float3 corner3 = float3( xzPos[0], 0, xzPos[1] + 1);
            float3 corner4 = float3( xzPos[0] + 1, 0, xzPos[1] + 1 );
            vertexBuffer.Append(corner1);
            vertexBuffer.Append(corner2);
            vertexBuffer.Append(corner3);
            vertexBuffer.Append(corner4);
        }

        [numthreads(32, 1, 32)]
        void Generate (uint3 threadId : SV_GroupThreadID, uint3 groupId : SV_GroupID)
        {
            uint2 currentXZ = uint2( groupId.x * 32 + threadId.x, groupId.z * 32 + threadId.z);
            genVertsAt(currentXZ);
        }

    Can anyone explain why, when I call "buffer.GetData(results);" on the CPU after the compute dispatch call, my buffer is full of Vector3(0,0,0)? I'm not expecting any y values yet, but I would expect a bunch of thread indexes in the x,z values of the Vector3 array. I'm not getting any errors in any of this code, which suggests it's correct syntax-wise, but maybe the issue is a logical bug. Also: yes, I know I'm generating 4,000 Vector3's and then basically round tripping them. However, the purpose of this code is purely to learn how round tripping works between CPU and GPU in Unity.

    Read the article

  • Columnstore Case Study #1: MSIT SONAR Aggregations

    - by aspiringgeek
    Preamble

    This is the first in a series of posts documenting big wins encountered using columnstore indexes in SQL Server 2012 & 2014. Many of these can be found in this deck along with details such as internals, best practices, caveats, etc. The purpose of sharing the case studies in this context is to provide an easy-to-consume quick-reference alternative.

    Why Columnstore?

    If we’re looking for a subset of columns from one or a few rows, given the right indexes, SQL Server can do a superlative job of providing an answer. If we’re asking a question which by design needs to hit lots of rows (DW, reporting, aggregations, grouping, scans, etc.), SQL Server has never had a good mechanism, until columnstore. Columnstore indexes were introduced in SQL Server 2012. However, they're still largely unknown. Some adoption blockers existed; yet columnstore was nonetheless a game changer for many apps. In SQL Server 2014, potential blockers have been largely removed & columnstore indexes are going to profoundly change the way we interact with our data. The purpose of this series is to share the performance benefits of columnstore & to document that columnstore is a compelling reason to upgrade to SQL Server 2014.

    App: MSIT SONAR Aggregations

    At MSIT, performance & configuration data is captured by SCOM. We archive much of the data in a partitioned data warehouse table in SQL Server 2012 for reporting via an application called SONAR. By definition, this is a primary use case for columnstore: report queries requiring aggregation over large numbers of rows. New data is refreshed each night by an automated table partitioning mechanism, a best practices scenario for columnstore.

    The Win

    Compared with classic indexing, which produced the expected query plan including partition elimination, the SQL Server 2012 nonclustered columnstore index improved query performance significantly. Logical reads were reduced by a factor of more than 50; both CPU & duration improved by factors of 20 or more. Other than creating the columnstore index, no special modifications or tweaks to the app or database schema were necessary to achieve the performance improvements. Existing nonclustered indexes were rendered superfluous & were deleted, thus mitigating maintenance challenges such as defragging as well as conserving disk capacity.

    Details

    The table provides the raw data & summarizes the performance deltas.

                                      Logical Reads (8K pages)   CPU (ms)   Durn (ms)
        Columnstore                                    160,323     20,360       9,786
        Conventional Table & Indexes                 9,053,423    549,608     193,903
        Δ                                                  x56        x27         x20

    The charts provide additional perspective on this data. "Conventional vs. Columnstore Metrics" documents the raw data. Note on this linear display the magnitude of the conventional index performance vs. columnstore. The “Metrics (Δ)” chart expresses these values as a ratio.

    Summary

    For DW, reports, & other BI workloads, columnstore often provides significant performance enhancements relative to conventional indexing. Here, in the first of a series of reports on columnstore implementations, I have documented results from an initial implementation at MSIT in which logical reads were reduced by a factor of more than 50; both CPU & duration improved by factors of 20 or more. Subsequent posts in this series document performance enhancements that are even more significant.
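As a quick check of the ratios quoted above, the improvement factors can be recomputed from the raw figures in the table. This is only an arithmetic sketch; the dictionary keys and the helper name improvement_factor are made up for illustration, and rounding to the nearest whole number is an assumption about how the x56, x27 and x20 values were derived.

```python
# Raw figures from the table: (columnstore, conventional table & indexes).
metrics = {
    "logical_reads_8k_pages": (160_323, 9_053_423),
    "cpu_ms": (20_360, 549_608),
    "duration_ms": (9_786, 193_903),
}


def improvement_factor(columnstore, conventional):
    """How many times smaller the columnstore figure is than the conventional one."""
    return conventional / columnstore


# Round to the nearest whole number, matching the style of the x56/x27/x20 row.
factors = {
    name: round(improvement_factor(columnstore, conventional))
    for name, (columnstore, conventional) in metrics.items()
}

for name, factor in factors.items():
    print(f"{name}: x{factor}")
```

Note that the duration ratio is slightly under 20 before rounding, so "factors of 20 or more" for duration holds only after rounding.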

    Read the article

  • The best computer ever

    - by Jeff
    (This is a repost from my personal blog… wow… I need to write more technical stuff!) About three years and three months ago, I bought a 17" MacBook Pro, and it turned out to be the best computer I've ever owned. You might think that every computer with better specs is automatically better than the last, but that hasn't been my experience. My first one was a Sony, back in the Pentium III days, and it cost an astonishing $2,500. That was even more ridiculous in 1999 dollars. It had a dial-up modem, and a CD-ROM, built-in! It may have even played DVD's. A few years later I bought an HP, and it ended up being a pile of shit. The power connector inside came loose from the board, and on occasion would even short. In 2005, I bought a Dell, and it wasn't bad. It had a really high resolution screen (complete with dead pixels, a problem in those days), and it was the first laptop I felt I could do real work on. When 2006 rolled around, Apple started making computers with Intel CPU's, and I bought the very first one the week it came out. I used Boot Camp to run Windows. I still have it in its box somewhere, and I used it for three years. The current 17" was new in 2009. The goodness was largely rooted in having a big screen with lots of dots. This computer has been the source of hundreds of blog posts, tens of thousands of lines of code, video and photo editing, and of course, a whole lot of Web surfing. It connected to corpnet at Microsoft, WiFi in Hawaii and has presented many a deck. It has traveled with me tens of thousands of miles. Last year, I put a solid state drive in it, and it was like getting a new computer. I can boot up a Windows 7 VM in about 19 seconds. Having 8 gigs of RAM has always been fantastic. Everything about it has been fast and fun. When new, the battery (when not using VM's) could get as much as 10 hours. I can still do 7 without much trouble. After 460 charge cycles, the battery health is still between 85 and 90%. 
The only real negative has been the size and weight. It's only an inch thick, but naturally it's pretty big with a 17" screen. You don't get battery life like that without a huge battery, either, so it's heavy. It was never a deal breaker, but sometimes, on a long haul across a large airport, you know you're carrying it. Today, Apple announced a new, thinner and lighter 15" laptop, with twice the RAM and CPU cores, and four times the screen resolution. It basically handles my size and weight issues while retaining the resolution, and it still costs less than my 17" did. So I ordered one. Three years is an excellent run, but I kind of budgeted for a new workhorse this year anyway. So if you're interested in a 17" MacBook Pro with a Core 2 Duo 2.66 GHz CPU, 8 gigs of RAM and a 320 gig hard drive (sorry, I'm keeping the SSD), I have one to sell. They've apparently discontinued the 17", which is going to piss off the video community. It's in excellent condition, with a few minor scratches, but I take care of my stuff.

    Read the article

  • Faster Memory Allocation Using vmtasks

    - by Steve Sistare
    You may have noticed a new system process called "vmtasks" on Solaris 11 systems:

        % pgrep vmtasks
        8
        % prstat -p 8
           PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
             8 root        0K    0K sleep   99  -20   9:10:59 0.0% vmtasks/32

    What is vmtasks, and why should you care? In a nutshell, vmtasks accelerates creation, locking, and destruction of pages in shared memory segments. This is particularly helpful for locked memory, as creating a page of physical memory is much more expensive than creating a page of virtual memory. For example, an ISM segment (shmflg & SHM_SHARE_MMU) is locked in memory on the first shmat() call, and a DISM segment (shmflg & SHM_PAGEABLE) is locked using mlock() or memcntl(). Segment operations such as creation and locking are typically single threaded, performed by the thread making the system call. In many applications, the size of a shared memory segment is a large fraction of total physical memory, and the single-threaded initialization is a scalability bottleneck which increases application startup time. To break the bottleneck, we apply parallel processing, harnessing the power of the additional CPUs that are always present on modern platforms. For sufficiently large segments, as many as 16 threads of vmtasks are employed to assist an application thread during creation, locking, and destruction operations. The segment is implicitly divided at page boundaries, and each thread is given a chunk of pages to process. The per-page processing time can vary, so for dynamic load balancing, the number of chunks is greater than the number of threads, and threads grab chunks dynamically as they finish their work. Because the threads modify a single application address space in a compressed time interval, contention on the locks protecting VM data structures was a problem, and we had to re-scale a number of VM locks to get good parallel efficiency.
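The chunk-grabbing scheme described above can be illustrated with a small user-space sketch. This is not the Solaris implementation: it is a Python analogy using threading and queue, the function name process_segment and the ratio of four chunks per thread are assumptions chosen for illustration, and touch_page is a hypothetical stand-in for the real per-page work.

```python
import queue
import threading


def process_segment(num_pages, num_threads=16, chunks_per_thread=4,
                    touch_page=lambda page: None):
    """Split a segment into more chunks than threads; workers grab chunks as they finish."""
    num_chunks = num_threads * chunks_per_thread
    # Ceiling division, with a floor of one page per chunk for tiny segments.
    chunk_size = max(1, -(-num_pages // num_chunks))

    work = queue.Queue()
    for start in range(0, num_pages, chunk_size):
        work.put(range(start, min(start + chunk_size, num_pages)))

    processed = [0] * num_threads  # each worker writes only its own slot

    def worker(index):
        while True:
            try:
                chunk = work.get_nowait()  # grab the next unclaimed chunk
            except queue.Empty:
                return                      # nothing left to do
            for page in chunk:
                touch_page(page)
            processed[index] += len(chunk)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return processed


pages_per_thread = process_segment(10_000)
print(sum(pages_per_thread), "pages processed by", len(pages_per_thread), "threads")
```

Because a fast thread simply comes back for another chunk, uneven per-page cost is absorbed without any up-front scheduling, which is the point of having more chunks than threads.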
The vmtasks process has 1 thread per CPU and may accelerate multiple segment operations simultaneously, but each operation gets at most 16 helper threads to avoid monopolizing CPU resources. We may reconsider this limit in the future. Acceleration using vmtasks is enabled out of the box, with no tuning required, and works for all Solaris platform architectures (SPARC sun4u, SPARC sun4v, x86). The following tables show the time to create + lock + destroy a large segment, normalized as milliseconds per gigabyte, before and after the introduction of vmtasks:

    ISM
    system   ncpu   before   after   speedup
    ------   ----   ------   -----   -------
    x4600      32     1386     245        6X
    X7560      64     1016     153        7X
    M9000     512     1196     206        6X
    T5240     128     2506     234       11X
    T4-2      128     1197     107       11X

    DISM
    system   ncpu   before   after   speedup
    ------   ----   ------   -----   -------
    x4600      32     1582     265        6X
    X7560      64     1116     158        7X
    M9000     512     1165     152        8X
    T5240     128     2796     198       14X

(I am missing the data for T4 DISM, for no good reason; it works fine). The following table separates the creation and destruction times:

    ISM, T4-2
              before   after
              ------   -----
    create       702      64
    destroy      495      43

To put this in perspective, consider creating a 512 GB ISM segment on T4-2. Creating the segment would take 6 minutes with the old code, and only 33 seconds with the new. If this is your Oracle SGA, you save over 5 minutes when starting the database, and you also save when shutting it down prior to a restart. Those minutes go directly to your bottom line for service availability.
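The 6-minute and 33-second figures follow directly from the T4-2 creation times in the last table (milliseconds per gigabyte). A short sketch of the arithmetic, with a made-up helper name:

```python
def segment_time_seconds(ms_per_gb, size_gb):
    """Scale a per-gigabyte time (in milliseconds) to a whole segment, in seconds."""
    return ms_per_gb * size_gb / 1000.0


SEGMENT_GB = 512  # the 512 GB ISM segment from the example

create_before = segment_time_seconds(702, SEGMENT_GB)  # old code, T4-2 "create" row
create_after = segment_time_seconds(64, SEGMENT_GB)    # with vmtasks

print(f"before: {create_before / 60:.1f} minutes")
print(f"after:  {create_after:.0f} seconds")
print(f"saved:  {(create_before - create_after) / 60:.1f} minutes")
```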

    Read the article

  • Using WKA in Large Coherence Clusters (Disabling Multicast)

    - by jpurdy
    Disabling hardware multicast (by configuring well-known addresses aka WKA) will place significant stress on the network. For messages that must be sent to multiple servers, rather than having a server send a single packet to the switch and having the switch broadcast that packet to the rest of the cluster, the server must send a packet to each of the other servers. While hardware varies significantly, consider that a server with a single gigabit connection can send at most ~70,000 packets per second. To continue with some concrete numbers, in a cluster with 500 members, that means that each server can send at most 140 cluster-wide messages per second. And if there are 10 cluster members on each physical machine, that number shrinks to 14 cluster-wide messages per second (or with only mild hyperbole, roughly zero). It is also important to keep in mind that network I/O is not only expensive in terms of the network itself, but also in terms of the CPU required to send (or receive) a message (due to things like copying the packet bytes, processing an interrupt, etc). Fortunately, Coherence is designed to rely primarily on point-to-point messages, but there are some features that are inherently one-to-many:

      - Announcing the arrival or departure of a member
      - Updating partition assignment maps across the cluster
      - Creating or destroying a NamedCache
      - Invalidating a cache entry from a large number of client-side near caches
      - Distributing a filter-based request across the full set of cache servers (e.g. queries, aggregators and entry processors)
      - Invoking clear() on a NamedCache

    The first few of these are operations that are primarily routed through a single senior member, and also occur infrequently, so they usually are not a primary consideration. There are cases, however, where the load from introducing new members can be substantial (to the point of destabilizing the cluster).
Consider the case where the cluster in the first paragraph grows from 500 members to 1000 members (holding the number of physical machines constant). During this period, there will be 500 new member introductions, each of which may consist of several cluster-wide operations (for the cluster membership itself as well as the partitioned cache services, replicated cache services, invocation services, management services, etc). Note that all of these introductions will route through that one senior member, which is sharing its network bandwidth with several other members (which will be communicating to a lesser degree with other members throughout this process). While each service may have a distinct senior member, there's a good chance during initial startup that a single member will be the senior for all services (if those services start on the senior before the second member joins the cluster). It's obvious that this could cause CPU and/or network starvation. In the current release of Coherence (3.7.1.3 as of this writing), the pure unicast code path also has less sophisticated flow-control for cluster-wide messages (compared to the multicast-enabled code path), which may also result in significant heap consumption on the senior member's JVM (from the message backlog). This is almost never a problem in practice, but with sufficient CPU or network starvation, it could become critical. For the non-operational concerns (near caches, queries, etc), the application itself will determine how much load is placed on the cluster. Applications intended for deployment in a pure unicast environment should be careful to avoid excessive dependence on these features. Even in an environment with multicast support, these operations may scale poorly since, even with a constant request rate, the underlying workload will increase at roughly the same rate as resources are added. Unless there is an infrastructural requirement to the contrary, multicast should be enabled.
If it can't be enabled, care should be taken to ensure the added overhead doesn't lead to performance or stability issues. This is particularly crucial in large clusters.
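The packet budget used in the article can be reproduced with a few lines of arithmetic. The sketch below follows the article's simplification of dividing by the full member count (rather than by the number of other members), and the function name is made up for illustration. The last figure, for the 1000-member cluster, is an extrapolation that assumes the same 50 physical machines now host 20 members each; it is not a number quoted in the article.

```python
def cluster_wide_messages_per_second(packets_per_second, cluster_members,
                                     members_per_machine=1):
    """Upper bound on cluster-wide messages per member when each one is sent as unicast.

    One cluster-wide message costs roughly one packet per member, and members on
    the same physical machine share that machine's packet budget.
    """
    per_machine = packets_per_second / cluster_members
    return per_machine / members_per_machine


PACKETS_PER_SECOND = 70_000  # approximate limit for a single gigabit connection

alone_on_machine = cluster_wide_messages_per_second(PACKETS_PER_SECOND, 500)
ten_per_machine = cluster_wide_messages_per_second(PACKETS_PER_SECOND, 500, 10)
# Assumption: growing to 1000 members on the same machines means 20 members per machine.
after_growth = cluster_wide_messages_per_second(PACKETS_PER_SECOND, 1000, 20)

print(alone_on_machine, ten_per_machine, after_growth)
```

Doubling the membership on fixed hardware cuts the per-member budget by a factor of four in this model, because both the fan-out and the number of members sharing each connection double.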

    Read the article

  • installer can't find partition, but fdisk can find them

    - by pxd
    I'm installing Ubuntu 12.04. My system has two operating systems installed: Windows XP and Ubuntu 10.10. I want to upgrade Ubuntu to 12.04, so I'm installing 12.04 from a USB disk. However, the installer can't find the partitions on my hard disk, although fdisk can find them. Can you help me? What should I do?

    ubuntu@ubuntu:~$ sudo lshw -short
    H/W path           Device      Class       Description
                                   system      HP 2230s (NN868PA#AB2)
    /0                             bus         3037
    /0/9                           memory      64KiB BIOS
    /0/0                           processor   Intel(R) Core(TM)2 Duo CPU T6570 @ 2.10GHz
    /0/0/1                         memory      2MiB L2 cache
    /0/0/3                         memory      32KiB L1 cache
    /0/0/0.1                       processor   Logical CPU
    /0/0/0.2                       processor   Logical CPU
    /0/2                           memory      32KiB L1 cache
    /0/4                           memory      2GiB System Memory
    /0/4/0                         memory      SODIMM [empty]
    /0/4/1                         memory      2GiB SODIMM DDR2 Synchronous 800 MHz (1.2 ns)
    /0/100                         bridge      Mobile 4 Series Chipset Memory Controller Hub
    /0/100/2                       display     Mobile 4 Series Chipset Integrated Graphics Controller
    /0/100/2.1                     display     Mobile 4 Series Chipset Integrated Graphics Controller
    /0/100/1a                      bus         82801I (ICH9 Family) USB UHCI Controller #4
    /0/100/1a.1                    bus         82801I (ICH9 Family) USB UHCI Controller #5
    /0/100/1a.2                    bus         82801I (ICH9 Family) USB UHCI Controller #6
    /0/100/1a.7                    bus         82801I (ICH9 Family) USB2 EHCI Controller #2
    /0/100/1b                      multimedia  82801I (ICH9 Family) HD Audio Controller
    /0/100/1c                      bridge      82801I (ICH9 Family) PCI Express Port 1
    /0/100/1c.1                    bridge      82801I (ICH9 Family) PCI Express Port 2
    /0/100/1c.1/0      wlan1       network     PRO/Wireless 5100 AGN [Shiloh] Network Connection
    /0/100/1c.2                    bridge      82801I (ICH9 Family) PCI Express Port 3
    /0/100/1c.4                    bridge      82801I (ICH9 Family) PCI Express Port 5
    /0/100/1c.5                    bridge      82801I (ICH9 Family) PCI Express Port 6
    /0/100/1c.5/0      eth1        network     88E8072 PCI-E Gigabit Ethernet Controller
    /0/100/1d                      bus         82801I (ICH9 Family) USB UHCI Controller #1
    /0/100/1d.1                    bus         82801I (ICH9 Family) USB UHCI Controller #2
    /0/100/1d.2                    bus         82801I (ICH9 Family) USB UHCI Controller #3
    /0/100/1d.7                    bus         82801I (ICH9 Family) USB2 EHCI Controller #1
    /0/100/1e                      bridge      82801 Mobile PCI Bridge
    /0/100/1f                      bridge      ICH9M LPC Interface Controller
    /0/100/1f.2        scsi0       storage     82801IBM/IEM (ICH9M/ICH9M-E) 4 port SATA Controller [AHCI mode]
    /0/100/1f.2/0      /dev/sda    disk        500GB WDC WD5000BEVT-0
    /0/100/1f.2/0/1    /dev/sda1   volume      48GiB Windows NTFS volume
    /0/100/1f.2/0/2    /dev/sda2   volume      416GiB Extended partition
    /0/100/1f.2/0/2/5  /dev/sda5   volume      97GiB HPFS/NTFS partition
    /0/100/1f.2/0/2/6  /dev/sda6   volume      198GiB HPFS/NTFS partition
    /0/100/1f.2/0/2/7  /dev/sda7   volume      27GiB Linux filesystem partition
    /0/100/1f.2/0/2/8  /dev/sda8   volume      93GiB Linux filesystem partition
    /0/100/1f.2/1      /dev/cdrom  disk        CDDVDW TS-L633M
    /0/1               scsi6       storage
    /0/1/0.0.0         /dev/sdb    disk        15GB STORAGE DEVICE
    /0/1/0.0.0/0       /dev/sdb    disk        15GB
    /0/1/0.0.0/0/1     /dev/sdb1   volume      14GiB Windows FAT volume
    /1                             power       HZ04037
    ubuntu@ubuntu:~$

    ubuntu@ubuntu:~$ sudo fdisk -l

    Disk /dev/sda: 500.1 GB, 500107862016 bytes
    255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x31263125

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *          63   102277727    51138832+   7  HPFS/NTFS/exFAT
    /dev/sda2       102277728   976784129   437253201    f  W95 Ext'd (LBA)
    /dev/sda5       102277791   307078127   102400168+   7  HPFS/NTFS/exFAT
    /dev/sda6       307078191   724141151   208531480+   7  HPFS/NTFS/exFAT
    /dev/sda7       724142080   781459455    28658688   83  Linux
    /dev/sda8       781461504   976771071    97654784   83  Linux

    Disk /dev/sdb: 15.9 GB, 15931539456 bytes
    64 heads, 32 sectors/track, 15193 cylinders, total 31116288 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x0009eb92

       Device Boot      Start         End      Blocks   Id  System
    /dev/sdb1   *          32    31115263    15557616    c  W95 FAT32 (LBA)

    The Ubuntu 12.04 installer can't find the partitions on my hard disk; it only finds the device /dev/sda. (Sorry, I'm a new user, so I can't post an image.)

    Read the article

  • High Resolution Timeouts

    - by user12607257
    The default resolution of application timers and timeouts is now 1 msec in Solaris 11.1, down from 10 msec in previous releases. This improves out-of-the-box performance of polling and event based applications, such as ticker applications, and even the Oracle rdbms log writer. More on that in a moment. As a simple example, the poll() system call takes a timeout argument in units of msec:

        System Calls                                              poll(2)

        NAME
             poll - input/output multiplexing

        SYNOPSIS
             int poll(struct pollfd fds[], nfds_t nfds, int timeout);

    In Solaris 11, a call to poll(NULL,0,1) returns in 10 msec, because even though a 1 msec interval is requested, the implementation rounds to the system clock resolution of 10 msec. In Solaris 11.1, this call returns in 1 msec. In specification lawyer terms, the resolution of CLOCK_REALTIME, introduced by POSIX.1b real time extensions, is now 1 msec. The function clock_getres(CLOCK_REALTIME,&res) returns 1 msec, and any library call whose man page explicitly mentions CLOCK_REALTIME, such as nanosleep(), is subject to the new resolution. Additionally, many legacy functions that pre-date POSIX.1b and do not explicitly mention a clock domain, such as poll(), are subject to the new resolution. Here is a fairly comprehensive list:

        nanosleep
        pthread_mutex_timedlock, pthread_mutex_reltimedlock_np
        pthread_rwlock_timedrdlock, pthread_rwlock_reltimedrdlock_np
        pthread_rwlock_timedwrlock, pthread_rwlock_reltimedwrlock_np
        mq_timedreceive, mq_reltimedreceive_np
        mq_timedsend, mq_reltimedsend_np
        sem_timedwait, sem_reltimedwait_np
        poll, select, pselect
        _lwp_cond_timedwait, _lwp_cond_reltimedwait
        semtimedop, sigtimedwait
        aiowait, aio_waitn, aio_suspend
        port_get, port_getn
        cond_timedwait, cond_reltimedwait
        setitimer (ITIMER_REAL)
        misc rpc calls, misc ldap calls

    This change in resolution was made feasible because we made the implementation of timeouts more efficient a few years back when we re-architected the callout subsystem of Solaris.
Previously, timeouts were tested and expired by the kernel's clock thread which ran 100 times per second, yielding a resolution of 10 msec. This did not scale, as timeouts could be posted by every CPU, but were expired by only a single thread. The resolution could be changed by setting hires_tick=1 in /etc/system, but this caused the clock thread to run at 1000 Hz, which made the potential scalability problem worse. Given enough CPUs posting enough timeouts, the clock thread could be a performance bottleneck. We fixed that by re-implementing the timeout as a per-CPU timer interrupt (using the cyclic subsystem, for those familiar with Solaris internals). This decoupled the clock thread frequency from timeout resolution, and allowed us to improve default timeout resolution without adding CPU overhead in the clock thread. Here are some exceptions for which the default resolution is still 10 msec.

      - The thread scheduler's time quantum is 10 msec by default, because preemption is driven by the clock thread (plus helper threads for scalability). See for example dispadmin, priocntl, fx_dptbl, rt_dptbl, and ts_dptbl. This may be changed using hires_tick.
      - The resolution of the clock_t data type, primarily used in DDI functions, is 10 msec. It may be changed using hires_tick. These functions are only used by developers writing kernel modules.
      - A few functions that pre-date POSIX CLOCK_REALTIME mention _SC_CLK_TCK, CLK_TCK, "system clock", or no clock domain. These functions are still driven by the clock thread, and their resolution is 10 msec. They include alarm, pcsample, times, clock, and setitimer for ITIMER_VIRTUAL and ITIMER_PROF. Their resolution may be changed using hires_tick.

Now back to the database. How does this help the Oracle log writer? Foreground processes post a redo record to the log writer, which releases them after the redo has committed.
When a large number of foregrounds are waiting, the release step can slow down the log writer, so under heavy load, the foregrounds switch to a mode where they poll for completion. This scales better because every foreground can poll independently, but at the cost of waiting the minimum polling interval. That was 10 msec, but is now 1 msec in Solaris 11.1, so the foregrounds process transactions faster under load. Pretty cool.
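
    The rounding behaviour described for poll() can be illustrated with a small sketch. This is a simplified model, not the Solaris implementation: effective_timeout_ms and measure_poll_ms are hypothetical helper names, the model assumes a requested timeout is simply rounded up to the next multiple of the clock resolution, and the measurement helper assumes a Unix-like host where Python's select.poll is available.

```python
import math
import select
import time


def effective_timeout_ms(requested_ms: int, resolution_ms: int) -> int:
    """Round a requested timeout up to the next multiple of the clock resolution.

    Simplified model of the behaviour described above, not the kernel code.
    """
    if requested_ms <= 0:
        return 0
    return math.ceil(requested_ms / resolution_ms) * resolution_ms


def measure_poll_ms(timeout_ms: int = 1, rounds: int = 50) -> float:
    """Average wall-clock duration of an empty poll() call with the given timeout."""
    poller = select.poll()  # no descriptors registered, so only the timeout can fire
    start = time.perf_counter()
    for _ in range(rounds):
        poller.poll(timeout_ms)
    return (time.perf_counter() - start) * 1000.0 / rounds


if __name__ == "__main__":
    # 1 msec request with a 10 msec tick versus a 1 msec tick
    print(effective_timeout_ms(1, 10), effective_timeout_ms(1, 1))
    print(f"measured: {measure_poll_ms():.2f} msec per poll with a 1 msec timeout")
```

    Running the measurement on a given system gives a rough idea of the timeout resolution actually delivered; the result depends on the operating system and load.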

    Read the article

  • ?SPARC T4?????????????·???? : Netra SPARC T4-1

    - by user13138700
    ?SPARC T4???????????????·??????? Netra SPARC T4-1 ???? Netra SPARC T4-2 ?2012?1?10??????????3?15??????????????(????) ?????????? Netra SPARC T4-1 ? 4core ???( T4 ???????? 4core ???)(*)???????????????????????????(*)( Netra SPARC T4-1 ?????? 4core ???? 8core ????????) ??? prtdiag ????? pginfo ??????????????? 8????/1core ???? prtdiag ????????4core=32???????????????pginfo ?????????????????core ???????????????????? # ./prtdiag -v System Configuration: Oracle Corporation sun4v Netra SPARC T4-1 ???????: 130560 M ??? ================================ ?? CPU ================================ CPU ID Frequency Implementation Status ------ --------- ---------------------- ------- 0 2848 MHz SPARC-T4 on-line 1 2848 MHz SPARC-T4 on-line 2 2848 MHz SPARC-T4 on-line 3 2848 MHz SPARC-T4 on-line 4 2848 MHz SPARC-T4 on-line 5 2848 MHz SPARC-T4 on-line 6 2848 MHz SPARC-T4 on-line 7 2848 MHz SPARC-T4 on-line 8 2848 MHz SPARC-T4 on-line 9 2848 MHz SPARC-T4 on-line 10 2848 MHz SPARC-T4 on-line 11 2848 MHz SPARC-T4 on-line 12 2848 MHz SPARC-T4 on-line 13 2848 MHz SPARC-T4 on-line 14 2848 MHz SPARC-T4 on-line 15 2848 MHz SPARC-T4 on-line 16 2848 MHz SPARC-T4 on-line 17 2848 MHz SPARC-T4 on-line 18 2848 MHz SPARC-T4 on-line 19 2848 MHz SPARC-T4 on-line 20 2848 MHz SPARC-T4 on-line 21 2848 MHz SPARC-T4 on-line 22 2848 MHz SPARC-T4 on-line 23 2848 MHz SPARC-T4 on-line 24 2848 MHz SPARC-T4 on-line 25 2848 MHz SPARC-T4 on-line 26 2848 MHz SPARC-T4 on-line 27 2848 MHz SPARC-T4 on-line 28 2848 MHz SPARC-T4 on-line 29 2848 MHz SPARC-T4 on-line 30 2848 MHz SPARC-T4 on-line 31 2848 MHz SPARC-T4 on-line ======================= Physical Memory Configuration ======================== ???? 
# pginfo -p -T 0 (System [system,chip]) CPUs: 0-31 `-- 3 (Data_Pipe_to_memory [system,chip]) CPUs: 0-31 |-- 2 (Floating_Point_Unit [core]) CPUs: 0-7 | `-- 1 (Integer_Pipeline [core]) CPUs: 0-7 |-- 5 (Floating_Point_Unit [core]) CPUs: 8-15 | `-- 4 (Integer_Pipeline [core]) CPUs: 8-15 |-- 7 (Floating_Point_Unit [core]) CPUs: 16-23 | `-- 6 (Integer_Pipeline [core]) CPUs: 16-23 `-- 9 (Floating_Point_Unit [core]) CPUs: 24-31 `-- 8 (Integer_Pipeline [core]) CPUs: 24-31 T4 ????????????????????????????????????????????????? T3 ?????(S2 core)?????T4 ?????(S3 core)?????????????5???????????? T3 ?????(S2 core)?????????????????????????(????????)?????????????????????????????????????????????·???????????????????????????????????????? ????T4 ????????????????????????????T4 ??????????·??????? Netra SPARC T4-1 4core ????????????????????????????????????T3 ???????????????????????????? ?????????Netra SPARC T4-1 ??????????????? Netra SPARC T4-1 ?? Computing 1 x SPARC T4 4?? 32???? or 8 ?? 64 ???? 2.85GHz CPU (1?????8????) 16 x DDR3 DIMM (?? 256GB ?????16GB DIMM ???) I/O and Storage 3 x Low Profile PCI-Express Gen2 ???? (2 x 10Gb Ethernet XAUI ???????) 2 x Full-height Half-length PCI-Express Gen2 ???? 4 x 10/100/1000 Ethernet ???????? 4 x 2.5” SAS2 HDD 4 x USB ??? (?? 2, ?? 2) RAS and Management and Power Supply ???? (RAID????), ????PSU ?????????? ILOM?????????????? 2N (1+1) , AC ???? DC ?? Support OS Oracle Solaris 10 10/9, 9/10, 8/11, Oracle Solaris 11 11/11 Oracle VM Server for SPARC 2.1 (LDoms) ???? ??? NEBS Level3?? ??????21” 19”(EIA-310D),23”,24”,600mm????? ?????(?????)????????? ????SPARC T4 ????????SPARC T4 ?????????????????????????(4???)???????????? Oracle OpenWorld Tokyo 2012 ?3??(4/4(?)?4/5(?)?4/6(?))?????????????????????&?????????????????SPARC T4 ?????????????????????????????????·?????????????????SPARC T4 ???????????????????!? Oracle OpenWorld Tokyo 2012 http://www.oracle.com/openworld/jp-ja/index.html ????·???????????? 4/6(?) Develop D3-13 (14:00 - 14:45) ???????????49 ??? ?????? 
7264 ???????????????

    Read the article

  • MSAcpi_ThermalZoneTemperature class not showing actual temperature

    - by jchoudhury
    I want to fetch CPU performance data in real time, including temperature. I used the following code to get the CPU temperature: try { ManagementObjectSearcher searcher = new ManagementObjectSearcher("root\\WMI", "SELECT * FROM MSAcpi_ThermalZoneTemperature"); foreach (ManagementObject queryObj in searcher.Get()) { double temp = Convert.ToDouble(queryObj["CurrentTemperature"].ToString()); double temp_critical = Convert.ToDouble(queryObj["CriticalTripPoint"].ToString()); double temp_cel = (temp/10 - 273.15); double temp_critical_cel = temp_critical / 10 - 273.15; lblCurrentTemp.Text = temp_cel.ToString(); lblCriticalTemp.Text = temp_critical_cel.ToString(); } } catch (ManagementException e) { MessageBox.Show("An error occurred while querying for WMI data: " + e.Message); } But this code does not show the correct temperature. It usually shows 49.5-50.5 degrees centigrade, whereas "OpenHardwareMonitor" reports a CPU temperature of over 71 degrees centigrade, with the fractional part changing over time. Is there anything I am missing in the code? I run the above code in the timer_click event at a 500 ms interval to refresh the temperature reading, but it always shows the same temperature from the beginning of execution. That is, if you run this application and it shows 49 degrees, then after a one-hour session it will still show 49 degrees. Where is the problem? Please help. Thanks in advance.
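
    The unit conversion in the code above (divide by 10, subtract 273.15) treats the raw value as tenths of a kelvin. A minimal sketch of just that arithmetic, with a hypothetical function name and a made-up sample reading, shows how a raw value maps onto the roughly 49.5 degree figure reported in the question:

```python
def decikelvin_to_celsius(raw: float) -> float:
    """Convert a reading in tenths of a kelvin to degrees Celsius."""
    return raw / 10.0 - 273.15


if __name__ == "__main__":
    sample = 3227  # hypothetical raw CurrentTemperature value, not taken from the post
    print(round(decikelvin_to_celsius(sample), 2))
```

    The conversion itself matches the code in the question, so a reading that never changes points at the value the query returns rather than at the arithmetic.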

    Read the article

  • Context migration in CUDA.NET

    - by Vyacheslav
    I'm currently using the CUDA.NET library by GASS. I need to initialize CUDA arrays (actually CUBLAS vectors, but it doesn't matter) in one CPU thread and use them in another CPU thread. But the CUDA context, which holds all initialized arrays and loaded functions, can be attached to only one CPU thread. There is a mechanism called the context migration API to detach a context from one thread and attach it to another, but I don't know how to use it properly in CUDA.NET. I tried something like this: class Program { private static float[] vector1, vector2; private static CUDA cuda; private static CUBLAS cublas; private static CUdeviceptr ptr; static void Main(string[] args) { cuda = new CUDA(false); cublas = new CUBLAS(cuda); cuda.Init(); cuda.CreateContext(0); AllocateVectors(); cuda.DetachContext(); CUcontext context = cuda.PopCurrentContext(); GetVectorFromDeviceAsync(context); } private static void AllocateVectors() { vector1 = new float[]{1f, 2f, 3f, 4f, 5f}; ptr = cublas.Allocate(vector1.Length, sizeof (float)); cublas.SetVector(vector1, ptr); vector2 = new float[5]; } private static void GetVectorFromDevice(object objContext) { CUcontext localContext = (CUcontext) objContext; cuda.PushCurrentContext(localContext); cuda.AttachContext(localContext); //change vector somehow vector1[0] = -1; //copy changed vector to device cublas.SetVector(vector1, ptr); cublas.GetVector(ptr, vector2); CUDADriver.cuCtxPopCurrent(ref localContext); } private static void GetVectorFromDeviceAsync(CUcontext cUcontext) { Thread thread = new Thread(GetVectorFromDevice); thread.IsBackground = false; thread.Start(cUcontext); } } But execution fails on the attempt to copy the changed vector to the device, apparently because the context is not attached. Any ideas how I can get it to work?

    Read the article

  • How to get Processor and Motherboard Id ?

    - by Frank
    I use the code from http://www.rgagnon.com/javadetails/java-0580.html to get the motherboard ID, but the result is "null". 1) How can that be? 2) I also modified the code a bit to get the processor ID, like this: "Set objWMIService = GetObject(\"winmgmts:\\\\.\\root\\cimv2\")\n"+ "Set colItems = objWMIService.ExecQuery _ \n"+ " (\"Select * from Win32_Processor\") \n"+ "For Each objItem in colItems \n"+ " Wscript.Echo objItem.ProcessorId \n"+ " exit for ' do the first cpu only! \n"+ "Next \n"; The result is something like: ProcessorId = BFEBFBFF00010676 On http://msdn.microsoft.com/en-us/library/aa389273%28VS.85%29.aspx it says: ProcessorId : Processor information that describes the processor features. For an x86 class CPU, the field format depends on the processor support of the CPUID instruction. If the instruction is supported, the property contains 2 (two) DWORD formatted values. The first is an offset of 08h-0Bh, which is the EAX value that a CPUID instruction returns with input EAX set to 1. The second is an offset of 0Ch-0Fh, which is the EDX value that the instruction returns. Only the first two bytes of the property are significant and contain the contents of the DX register at CPU reset—all others are set to 0 (zero), and the contents are in DWORD format. I don't quite understand it. In plain English, is it unique, or is it just a number for this class of processors, so that, for instance, all Intel Core2 Duo P8400 CPUs will have this number? Frank
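
    The quoted documentation describes the value as two DWORDs from CPUID with EAX set to 1. As a sketch only, the sample value can be split and decoded as follows, assuming that the upper eight hex digits of the string are EDX (feature flags), that the lower eight are EAX (the processor signature), and that EAX uses the standard x86 signature layout of stepping, model and family fields; decode_processor_id is a hypothetical helper name.

```python
def decode_processor_id(processor_id: str) -> dict:
    """Split a 16-hex-digit ProcessorId into CPUID(1) EDX/EAX and decode EAX."""
    value = int(processor_id, 16)
    edx = value >> 32          # assumed: upper DWORD holds the feature flags
    eax = value & 0xFFFFFFFF   # assumed: lower DWORD holds the processor signature

    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # Extended fields only contribute for certain base family values.
    family = base_family + ext_family if base_family == 0xF else base_family
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model

    return {
        "feature_flags": edx,
        "family": family,
        "model": model,
        "stepping": stepping,
    }


if __name__ == "__main__":
    print(decode_processor_id("BFEBFBFF00010676"))
```

    Under these assumptions the value encodes only feature flags plus a family, model and stepping, which suggests it identifies a class of processors rather than an individual chip.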

    Read the article

  • Winforms: How to speed up Invalidate()?

    - by Pedery
    I'm developing a retained mode drawing application in GDI+. The application can draw simple shapes to a canvas and perform basic editing. The math that does this is optimized to the last byte and is not an issue. I'm drawing on a panel that is using the built-in ControlStyles.DoubleBuffer. Now, my problem arises if I run my app maximized on a big monitor (HD in my case). If I try to draw a line from one corner of the (big) canvas to the diagonally opposite one, it will start to lag and the CPU usage goes up sharply. Each graphical object in my app has a bounding box. Thus, when I invalidate the bounding box of a line that goes from one corner of the maximized app to the diagonally opposite one, that bounding box is virtually as big as the canvas. When a user is drawing a line, this invalidation of the bounding box happens on the mousemove event, and there is a clearly visible lag. This lag also exists if the line is the only object on the canvas. I've tried to optimize this in many ways. If I draw a shorter line, the CPU usage and the lag go down. If I remove the Invalidate() and keep all other code, the app is quick. If I use a Region (that only spans the figure) to invalidate instead of the bounding box, it is just as slow. If I split the bounding box into a range of smaller boxes that lie back to back, thus reducing the invalidation area, no visible performance gain can be seen. Thus I'm at a loss here. How can I speed up the invalidation? On a side note, both Paint.Net and Mspaint suffer from the same shortcomings. Word and PowerPoint, however, seem to be able to paint a line as described above with no lag and no CPU load at all. Thus it's possible to achieve the desired results; the question is how?
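
    The "smaller boxes that lie back to back" idea can be quantified with a short sketch. It only does the area arithmetic and is not WinForms code; segment_boxes and total_area are hypothetical names, and the 1920x1080 canvas is an assumed example size.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (left, top, width, height)


def segment_boxes(x0: float, y0: float, x1: float, y1: float, pieces: int) -> List[Rect]:
    """Cut a line into equal segments and return one bounding box per segment."""
    boxes = []
    for i in range(pieces):
        ax = x0 + (x1 - x0) * i / pieces
        ay = y0 + (y1 - y0) * i / pieces
        bx = x0 + (x1 - x0) * (i + 1) / pieces
        by = y0 + (y1 - y0) * (i + 1) / pieces
        boxes.append((min(ax, bx), min(ay, by), abs(bx - ax), abs(by - ay)))
    return boxes


def total_area(boxes: List[Rect]) -> float:
    """Sum of the areas of all boxes."""
    return sum(w * h for _, _, w, h in boxes)


if __name__ == "__main__":
    whole = segment_boxes(0, 0, 1920, 1080, 1)
    tiled = segment_boxes(0, 0, 1920, 1080, 10)
    print(total_area(whole), total_area(tiled))
```

    Splitting a diagonal into n boxes cuts the summed area by a factor of n. Since the question reports no visible gain from doing this, the cost may lie elsewhere, for example in how the invalid rectangles are merged before painting; that is a guess, not something the post confirms.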

    Read the article

  • Surprising results with .NET multi-threading algorithm

    - by Myles J
    Hi, I recently wrote a C# console timetabling algorithm that is based on a combination of a genetic algorithm with a few brute force routines thrown in. The initial results were promising, but I figured I could improve the performance by splitting the brute force routines up to run in parallel on multi-processor architectures. To do this I used the well documented Producer/Consumer model (as documented in this fantastic article http://www.albahari.com/threading/part2.aspx#_ProducerConsumerQWaitHandle). I changed my code to create one thread per logical processor during the brute force routines. The performance gains on my workstation were very pleasing. I am running Windows XP on the following hardware: Intel Core 2 Quad CPU 2.33 GHz 3.49 GB RAM Initial tests indicated average performance gains of approx 40% when using 4 threads. The next step was to deploy the new multi-threading version of the algorithm to our higher spec UAT server. Here is the spec of our UAT server: Windows 2003 Server R2 Enterprise x64 8 cpu (Quad-Core) AMD Opteron 2.70 GHz 255 GB RAM After running the first round of tests we were all extremely surprised to find that the algorithm actually runs slower on the high spec W2003 server than on my local XP workstation! In fact the tests seem to indicate that it doesn't matter how many threads are generated (tests were run with the app spawning between 2 and 32 threads). The algorithm always runs significantly slower on the UAT W2003 server. How could this be? Surely the app should run faster on an 8 cpu (Quad-Core) server than on my Core 2 Quad workstation? Why are we seeing no performance gains with the multi-threading on the W2003 server whilst the XP workstation tests show gains of up to 40%? Any help or pointers would be appreciated. Regards Myles
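
    For readers unfamiliar with the pattern, the shape of "one consumer thread per logical processor fed from a shared queue" can be sketched as below. This is not the poster's C# code: run_workers is a hypothetical name, the work function is a stand-in, and in CPython the global interpreter lock means threads like these will not speed up CPU-bound work, so the sketch shows structure only.

```python
import os
import queue
import threading


def run_workers(items, work, worker_count=None):
    """Feed items through a queue to a pool of consumer threads and collect results."""
    worker_count = worker_count or os.cpu_count() or 1
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def consumer():
        while True:
            item = tasks.get()
            if item is None:  # sentinel: no more work for this thread
                break
            value = work(item)
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=consumer) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for item in items:          # producer side: enqueue the work items
        tasks.put(item)
    for _ in threads:           # one sentinel per consumer
        tasks.put(None)
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(sorted(run_workers(range(10), lambda n: n * n)))
```

    Timing a harness like this with different worker counts on both machines is one way to separate the cost of the threading scaffolding from the cost of the work itself.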

    Read the article

  • Removing a platform from Configuration Manager

    - by demoncodemonkey
    I have a solution containing C# and C++/CLI projects. There are 3 platforms in my solution: Any CPU Win32 Mixed Platforms I never want to "just build the C# ones" or "just build the C++ ones", I always want to build all projects. So the platforms metaphor is meaningless to me, I'll leave it on Mixed Platforms or whatever as long as they all build. Now VS sometimes automatically switches the current platform to Any CPU (I'm not sure when or why). This means that pressing F7 will only try to build the C# projects, which is obviously no good. So I have to switch back to Mixed Platforms and try again. So how to workaround this irritating problem? I have tried 2 ways: In Configuration Manager, remove Any CPU and Win32 platforms. This worked until I added a new project and Visual Studio very kindly added them back in... :/ In Configuration Manager, check all checkboxes for all projects in all configurations in all platforms. This becomes a nightmare to manage with many projects in the solution. Any other ideas?

    Read the article

  • SQL Server Blocking Issue

    - by Robin Weston
    We currently have an issue that occurs roughly once a day on a SQL 2005 database server, although the time it happens is not consistent. Basically, the database grinds to a halt and starts refusing connections with the following error message. This includes logging into SSMS: A connection was successfully established with the server, but then an error occurred during the login process. (provider: TCP Provider, error: 0 - The specified network name is no longer available.) Our CPU usage for SQL is usually around 15%, but when the DB is in its broken state it's around 70%, so it's clearly doing something, even if no-one can connect. Even if I disable the web app that uses the database, the CPU usage still doesn't go down. I am unable to restart the SQLSERVER process as it is unresponsive, so I end up having to kill the process manually, which then puts the DB into Suspect/Recovery mode (which I can fix, but it's a pain). Below are some PerfMon stats I gathered when the DB was in its broken state which might help. I have a bunch more if people want to request them: Active Transactions: 2 (Never Changes) Logical Connections: 34 (NC) Process Blocked: 16 (NC) User Connections: 30 (NC) Batch Request: 0 (NC) Active Jobs: 2 (NC) Log Truncations: 596 (NC) Log Shrinks: 24 (NC) Longest Running Transaction Time: 99 (NC) I guess the key is finding out what the DB is using its CPU on, but as I can't even log into SSMS this isn't possible with the standard methods. Disturbingly, I can't even use the dedicated admin connection to get into SSMS. I get the same timeout as with all other requests. Any advice, recommendations, or even sympathy, is much appreciated!

    Read the article

  • Simple description of worker and I/O threads in .NET

    - by Konstantin
    It's very hard to find a detailed but simple description of worker and I/O threads in .NET. What's clear to me regarding this topic (but may not be technically precise): Worker threads are threads that should employ the CPU for their work; I/O threads (also called "completion port threads") should employ device drivers for their work and essentially "do nothing", only monitoring the completion of non-CPU operations. What is not clear: Although the method ThreadPool.GetAvailableThreads returns the number of available threads of both types, it seems there is no public API to schedule work for an I/O thread. Can you only manually create worker threads in .NET? It seems that a single I/O thread can monitor multiple I/O operations. Is that true? If so, why does ThreadPool have so many available I/O threads by default? In some texts I read that the callback triggered after an I/O operation completes is performed by an I/O thread. Is that true? Isn't this a job for a worker thread, considering that this callback is a CPU operation? To be more specific: do ASP.NET asynchronous pages use I/O threads? What exactly is the performance benefit of switching I/O work to a separate thread instead of increasing the maximum number of worker threads? Is it because a single I/O thread monitors multiple operations? Or does Windows do more efficient context switching when using I/O threads?

    Read the article

  • Java Random Slowdowns on Mac OS cont'd

    - by javajustice
    I asked this question a few weeks ago, but I'm still having the problem and I have some new hints. The original question is here: http://stackoverflow.com/questions/1651887/java-random-slowdowns-on-mac-os Basically, I have a Java application that splits a job into independent pieces and runs them in separate threads. The threads have no synchronization or shared memory items. The only resources they do share are data files on the hard disk, with each thread having an open file channel. Most of the time it runs very fast, but occasionally it will run very slowly for no apparent reason. If I attach a CPU profiler to it, then it will start running quickly again. If I take a CPU snapshot, it says it's spending most of its time in "self time" in a function that doesn't do anything except check a few (unshared, unsynchronized) booleans. I don't know how this could be accurate because 1, it makes no sense, and 2, attaching the profiler seems to knock the threads out of whatever mode they're in and fix the problem. Also, regardless of whether it runs fast or slow, it always finishes and gives the same output, and it never dips in total CPU usage (in this case ~1500%), implying that the threads aren't getting blocked. I have tried different garbage collectors, different sizings of the parts of the memory space, writing data output to non-RAID drives, and putting all data output in threads separate from the main worker threads. Does anyone have any idea what kind of problem this could be? Could it be the operating system (OS X 10.6.2)? I have not been able to duplicate it on a Windows machine, but I don't have one with a similar hardware configuration.

    Read the article

  • New projects not built when target platform is set explicitly

    - by stiank81
    I create a new solution with one project, and then change the target platform from "Any CPU" to "x86". After this, newly added projects don't get built by default, and their target platform doesn't follow the global settings. Why?! Looking at the configuration manager, newly added projects are not checked to "Build", and they get target platform "Any CPU" instead of the globally set x86. Why is this happening? I expect new projects too to get the globally set and defined x86 target platform. Some things I've tried: Toggle the global platform back to Any CPU, and then to x86 again. No change. Choosing the platform explicitly for the new project. x86 is not available in the list, and when I say <New..> and try adding it I'm not allowed, as ".. a solution platform with the same name already exists.". On the build properties for the new project I can't change the platform in the Configuration section, but I can set "Platform target" to x86 in the General section. It is however not clear whether this actually makes a difference, and it wouldn't respond if I changed the target platform globally later. Initially I thought this was a problem from converting my solution from VS2008 to VS2010, but the problem applies in both places. I.e. when I create a solution in VS2008 and just stay in VS2008 I still get the problem.

    Read the article

  • Did I implement clock drift properly?

    - by David Titarenco
    I couldn't find any clock drift RNG code for Windows anywhere so I attempted to implement it myself. I haven't run the numbers through ent or DIEHARD yet, and I'm just wondering if this is even remotely correct... void QueryRDTSC(__int64* tick) { __asm { xor eax, eax cpuid rdtsc mov edi, dword ptr tick mov dword ptr [edi], eax mov dword ptr [edi+4], edx } } __int64 clockDriftRNG() { __int64 CPU_start, CPU_end, OS_start, OS_end; // get CPU ticks -- uses RDTSC on the Processor QueryRDTSC(&CPU_start); Sleep(1); QueryRDTSC(&CPU_end); // get OS ticks -- uses the Motherboard clock QueryPerformanceCounter((LARGE_INTEGER*)&OS_start); Sleep(1); QueryPerformanceCounter((LARGE_INTEGER*)&OS_end); // CPU clock is ~1000x faster than mobo clock // return raw return ((CPU_end - CPU_start)/(OS_end - OS_start)); // or // return a random number from 0 to 9 // return ((CPU_end - CPU_start)/(OS_end - OS_start)%10); } If you're wondering why I Sleep(1), it's because if I don't, OS_end - OS_start returns 0 consistently (because of the bad timer resolution, I presume). Basically, (CPU_end - CPU_start)/(OS_end - OS_start) always returns around 1000 with a slight variation based on the entropy of CPU load, maybe temperature, quartz crystal vibration imperfections, etc. Anyway, the numbers have a pretty decent distribution, but this could be totally wrong. I have no idea.

    Read the article

  • How long is the time frame between context switches on Windows?

    - by mattcodes
    Reading CLR via C# 2.0 (I don't have 3.0 with me at the moment). Is this still the case: If there is only one CPU in a computer, only one thread can run at any one time. Windows has to keep track of the thread objects, and every so often, Windows has to decide which thread to schedule next to go to the CPU. This is additional code that has to execute once every 20 milliseconds or so. When Windows makes a CPU stop executing one thread's code and start executing another thread's code, we call this a context switch. A context switch is fairly expensive because the operating system has to: So, circa CLR via C# 2.0, let's say we are on a Pentium 4 2.4 GHz, single core, non-HT, running XP. Every 20 milliseconds? Where a CLR thread or Java thread is mapped to an OS thread, only a maximum of 50 threads per second may get a chance to run? I've read here on SO that context switching is very fast, on the order of microseconds, but how often, roughly (order-of-magnitude guesses), will, say, a modest 5 year old Windows 2003 server with a single-core Pentium Xeon give the OS the opportunity to context switch? Is 20 ms in the right area? I don't need exact figures, I just want to be sure that's in the right area; it seems rather long to me.
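
    The "50 threads per second" figure follows from dividing one second by the time slice. A tiny sketch of that arithmetic, assuming every thread runs for its full slice and never blocks early (slices_per_second is a hypothetical name):

```python
def slices_per_second(quantum_ms: float, cpus: int = 1) -> float:
    """Upper bound on full time slices handed out per second across all CPUs."""
    return cpus * 1000.0 / quantum_ms


if __name__ == "__main__":
    print(slices_per_second(20))      # the roughly 20 msec figure quoted above
    print(slices_per_second(20, 4))   # same quantum on a hypothetical 4-core machine
```

    This is a ceiling on full-length slices only. Threads that block on I/O or a lock before their slice ends give up the CPU early, so the number of switches per second can be much higher than this in practice.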

    Read the article

  • My program is spending most of its time in objc_msgSend. Does that mean that Objective-C has bad per

    - by Paperflyer
    Hello Stackoverflow. I have written an application that has a number of custom views and generally draws a lot of lines and bitmaps. Since performance is somewhat critical for the application, I spent a good amount of time optimizing draw performance. Now, Activity Monitor tells me that my application is usually using about 12% CPU, and Instruments (the profiler) says that a whopping 10% CPU is spent in objc_msgSend (mostly in drawing-related system calls). On the one hand, I am glad about this, since it means that my drawing is about as fast as it gets and my optimizations were a huge success. On the other hand, it seems to imply that the only thing that is still using my CPU is the Objective-C overhead for messages (objc_msgSend), and hence that if I had written the application in, say, Carbon, its performance would be drastically better. Now I am tempted to conclude that Objective-C is a language with bad performance, even though Cocoa seems to be awfully efficient, since it can apparently draw faster than Objective-C can send messages. So, is Objective-C really a language with bad performance? What do you think about that?

    Read the article

  • How expensive is a context switch? Is it better to implement a manual task switch than to rely on OS

    - by Vilx-
    The title says it all. Imagine I have two (three, four, whatever) tasks that have to run in parallel. Now, the easy way to do this would be to create separate threads and forget about it. But on a plain old single-core CPU that would mean a lot of context switching, and we all know that context switching is big, bad, slow, and generally simply Evil. It should be avoided, right? On that note, if I'm writing the software from the ground up anyway, I could go the extra mile and implement my own task-switching: split each task into parts, save the state in between, and then switch among them within a single thread. Or, if I detect that there are multiple CPU cores, I could just give each task to a separate thread and all would be well. The second solution does have the advantage of adapting to the number of available CPU cores, but will the manual task-switch really be faster than the one in the OS core? Especially if I'm trying to make the whole thing generic with a TaskManager and an ITask, etc?
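
    The "split each task into parts, save the state in between, and switch among them within a single thread" approach can be sketched with generators, which keep each task's local state between steps. This is an illustration only, not the TaskManager/ITask design mentioned in the question; count_task and run_round_robin are hypothetical names.

```python
from collections import deque


def count_task(name, steps, log):
    """A task split into steps; each yield hands control back to the scheduler."""
    for i in range(steps):
        log.append((name, i))
        yield


def run_round_robin(tasks):
    """Run generator-based tasks in turn on a single thread until all finish."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # resume the task until its next yield
        except StopIteration:
            continue            # task finished: do not reschedule it
        ready.append(task)


if __name__ == "__main__":
    log = []
    run_round_robin([count_task("a", 2, log), count_task("b", 3, log)])
    print(log)
```

    A switch here is just a function return and a resume, with no kernel involvement, but a task that never yields starves the others, which is the trade-off against preemptive OS threads.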

    Read the article

  • Emulating a computer running MS-DOS

    - by Richard
    Writing emulators has always fascinated me. Now I want to write an emulator for an IBM PC and run MS-DOS on it (I've got the floppy image files). I have good experience in C++ and C and basic knowledge of assembler and the architecture of a CPU. I also know that there are thousands of emulators out there doing exactly what I want to do, but I'd be doing this for pure joy only. How much work do I have to expect? (If my goal is to boot DOS and create a text file with it, all emulated) What CPU should I emulate ? Where can I find documentation on how the machine code is organized and which opcodes mean what, so I can unpack and execute them correctly with my emulator? Does MS-DOS still run on the newest generations of processors? Would it theoretically be able to natively run on a 64-bit AMD Phenom 2 processor w/ a modern mainboard, HDD, RAM, etc.? What else, besides emulating the CPU, could be an important factor (in terms of difficulty)? I would only aim for outputting / inputting text to the system via the host system's console, no sound or other more advanced IO etc. Have you written an emulator yet? What was your first one for? How hard was it? Do you have any special tips for me? Thanks in advance
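
    To give a feel for the core of such an emulator, here is a deliberately tiny fetch-decode-execute loop. It is a sketch under heavy assumptions: it handles only four 8086 opcodes (MOV r16, imm16; INC r16; NOP; HLT), ignores flags, segments, memory and interrupts, and the run function and its max_steps guard are hypothetical.

```python
REG_NAMES = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]  # 8086 register encoding order


def run(code: bytes, max_steps: int = 1000) -> dict:
    """Execute a flat byte string of 8086 machine code and return the registers."""
    regs = dict.fromkeys(REG_NAMES, 0)
    ip = 0
    for _ in range(max_steps):
        opcode = code[ip]                      # fetch
        ip += 1
        if 0xB8 <= opcode <= 0xBF:             # MOV r16, imm16 (little-endian immediate)
            regs[REG_NAMES[opcode - 0xB8]] = code[ip] | (code[ip + 1] << 8)
            ip += 2
        elif 0x40 <= opcode <= 0x47:           # INC r16 (flags not modelled here)
            name = REG_NAMES[opcode - 0x40]
            regs[name] = (regs[name] + 1) & 0xFFFF
        elif opcode == 0x90:                   # NOP
            pass
        elif opcode == 0xF4:                   # HLT
            break
        else:
            raise NotImplementedError(f"opcode {opcode:#04x} not handled in this sketch")
    return regs


if __name__ == "__main__":
    # MOV AX,0x1234; INC AX; MOV BX,0xFFFF; INC BX; HLT
    program = bytes([0xB8, 0x34, 0x12, 0x40, 0xBB, 0xFF, 0xFF, 0x43, 0xF4])
    print(run(program))
```

    A real emulator grows this loop to cover the full instruction set, ModR/M decoding, flags, segmented memory, and the BIOS and hardware interrupts that DOS relies on, which is where most of the effort goes.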

    Read the article
