Search Results

Search found 5954 results on 239 pages for 'cpu cores'.


  • Using Appendbuffers in unity for terrain generation

    - by Wardy
    Like many others, I figured I would try to make the most of the monster processing power of the GPU, but I'm having trouble getting the basics in place.

    CPU code:

        using UnityEngine;
        using System.Collections;

        public class Test : MonoBehaviour
        {
            public ComputeShader Generator;
            public MeshTopology Topology;

            void OnEnable()
            {
                var computedMeshPoints = ComputeMesh();
                CreateMeshFrom(computedMeshPoints);
            }

            private Vector3[] ComputeMesh()
            {
                var size = (32*32) * 4; // 4 points added for each x,z pos
                var buffer = new ComputeBuffer(size, 12, ComputeBufferType.Append);
                Generator.SetBuffer(0, "vertexBuffer", buffer);
                Generator.Dispatch(0, 1, 1, 1);
                var results = new Vector3[size];
                buffer.GetData(results);
                buffer.Dispose();
                return results;
            }

            private void CreateMeshFrom(Vector3[] generatedPoints)
            {
                var filter = GetComponent<MeshFilter>();
                var renderer = GetComponent<MeshRenderer>();

                if (generatedPoints.Length > 0)
                {
                    var mesh = new Mesh { vertices = generatedPoints };
                    var colors = new Color[generatedPoints.Length];
                    var indices = new int[generatedPoints.Length];

                    //TODO: build this different based on topology of the mesh being generated
                    for (int i = 0; i < indices.Length; i++)
                    {
                        indices[i] = i;
                        colors[i] = Color.blue;
                    }

                    mesh.SetIndices(indices, Topology, 0);
                    mesh.colors = colors;
                    mesh.RecalculateNormals();
                    mesh.Optimize();
                    mesh.RecalculateBounds();
                    filter.sharedMesh = mesh;
                }
                else
                {
                    filter.sharedMesh = null;
                }
            }
        }

    GPU code:

        #pragma kernel Generate

        AppendStructuredBuffer<float3> vertexBuffer : register(u0);

        void genVertsAt(uint2 xzPos)
        {
            //TODO: put some height generation code here.
            // could even run marching cubes / dual contouring code.
            float3 corner1 = float3( xzPos[0], 0, xzPos[1] );
            float3 corner2 = float3( xzPos[0] + 1, 0, xzPos[1] );
            float3 corner3 = float3( xzPos[0], 0, xzPos[1] + 1);
            float3 corner4 = float3( xzPos[0] + 1, 0, xzPos[1] + 1 );
            vertexBuffer.Append(corner1);
            vertexBuffer.Append(corner2);
            vertexBuffer.Append(corner3);
            vertexBuffer.Append(corner4);
        }

        [numthreads(32, 1, 32)]
        void Generate (uint3 threadId : SV_GroupThreadID, uint3 groupId : SV_GroupID)
        {
            uint2 currentXZ = uint2( groupId.x * 32 + threadId.x, groupId.z * 32 + threadId.z);
            genVertsAt(currentXZ);
        }

    Can anyone explain why, when I call "buffer.GetData(results);" on the CPU after the compute dispatch call, my buffer is full of Vector3(0,0,0)? I'm not expecting any y values yet, but I would expect a bunch of thread indexes in the x,z values of the Vector3 array. I'm not getting any errors in any of this code, which suggests it's correct syntax-wise, but maybe the issue is a logical bug. Also: yes, I know I'm generating 4,000 Vector3's and then basically round tripping them. However, the purpose of this code is purely to learn how round tripping works between CPU and GPU in Unity.

    Read the article

  • How to get faster graphics in KVM? VNC is painfully slow with Haiku OS guest, Spice won't install and SDL doesn't work

    - by Don Quixote
    I've been coming up to speed on the Haiku operating system, an Open Source clone of BeOS 5 Pro. I'm using an Apple MacBook Pro as my development machine. Apple's BootCamp BIOS does not support more than four partitions on the internal hard drive. While I can set up extended and logical partitions, doing so will prevent any of the installed operating systems from booting. To run Haiku directly on the iron, I boot it off a USB stick. Using external storage is also helpful because I am perpetually out of filesystem space.

    While VirtualBox is documented to allow access to physical drives, I could not actually get it to work. Also, VirtualBox can only use one of the host CPU's cores. While VB guests can be configured for more than one CPU, they are only emulated. A full build of the Haiku OS takes 4.5 under VB. I had hoped to reduce build times by using KVM instead, but it's not working nearly as well as VirtualBox did. The Linux Kernel-based Virtual Machine is broken in all manner of fundamental ways as seen from Haiku. But I'm a coder; maybe I could contribute to fixing some of those problems.

    The first problem I've got is that Haiku's video in virt-manager is quite painfully slow. When I drag Haiku windows around the desktop, they lag quite far behind where my mouse is. It's quite difficult to move a window to a precise position on the screen. Just imagine that the mouse was connected to the window title bar with a really stretchy spring. Haiku's mouse pointer also lags quite far behind where I have moved it.

    I found lots of Personal Package Archives that enable Spice for QEMU / KVM at the Ubuntu Personal Package Archives. I tried a few of the PPAs but none of them worked; with one of them, the command "add-apt-repository" crashed with a traceback. There is a Wiki page about Spice, but it says that it only works on 64-bit. My Early 2006 MacBook Pro is 32-bit. Its Apple Model Identifier is MacBookPro1,1; these use Core Duos, NOT Core 2 Duos. I don't mind building a source deb for 32-bit if I can expect it to work. Is there some reason that Spice should be 64-bit only? Does it need features of the x86_64 Instruction Set Architecture that x86 does not have?

    When I try using SDL from virt-manager, the configuration for Local SDL Window says "Xauth: /home/mike/.Xauthority". When I try to start my guest, virt-manager emits an error. When I Googled the error message, the usual solution was to make ~/.Xauthority readable. However, .Xauthority does not exist in my home directory. Instead I have an $XAUTHORITY environment variable. There is no way to configure SDL in virt-manager to use $XAUTHORITY instead of ~/.Xauthority. Neither does it work to copy the value of $XAUTHORITY into the file.

    I am ready to scream, because I've spent five fscking days trying to make KVM work for Haiku development. There is a whole lot more that is broken than the slow video. All I really want to do for now is speed up my full builds of Haiku by using "jam -j2" to use both cores in my CPU. I may try Xen next, but the last time I monkeyed with Xen it was far, far more broken than I am finding KVM to be. Just for now, I would be satisfied if there were some way to use my USB stick as a drive in VirtualBox. VB does allow me to configure /dev/sdb as a drive, but it always causes a fatal error when I try to launch the guest. Thank you for any advice you can give me.

    Read the article

  • 4.8M wasn't enough so we went for 5.055M tpmc with Unbreakable Enterprise Kernel r2 :-)

    - by wcoekaer
    We released a new set of benchmarks today. One is an updated tpc-c from a few months ago where we had just over 4.8M tpmc at $0.98 and we just updated it to go to 5.05M and $0.89. The other one is related to Java Middleware performance. You can find the press release here. Now, I don't want to talk about the actual relevance of the benchmark numbers, as I am not in the benchmark team. I want to talk about why these numbers and these efforts, unrelated to what they mean to your workload, matter to customers. The actual benchmark effort is a very big, long, expensive undertaking where many groups work together as a big virtual team. Having the virtual team be within a single company of course helps tremendously... We already start with a very big server setup with tons of storage, many disks, lots of ram, lots of cpu's, cores, threads, large database setups. Getting the whole setup going to start tuning, by itself, is no easy task, but then the real fun starts with tuning the system for optimal performance -and- stability. A benchmark is not just revving an engine at high rpm, it's actually hitting the circuit. The tests require long runs, require surviving availability tests, such as surviving crashes -and- recovery under load. In the TPC-C example, the x4800 system had 4TB ram, 160 threads (8 sockets, hyperthreaded, 10 cores/socket), tons of storage attached, tons of luns visible to the OS. flash storage, non flash storage... many things at high scale that all have to be perfectly synchronized. During this process, we find bugs, we fix bugs, we find performance issues, we fix performance issues, we find interesting potential features to investigate for the future, we start new development projects for future releases and all this goes back into the products. As more and more customers, for Oracle Linux, are running larger and larger, faster and faster, more mission critical, higher available databases..., these things are just absolutely critical. 
Unrelated to what anyone's specific opinion is about tpc-c or tpc-h or specjenterprise etc, there is a ton of effort that the customer benefits from. All this work makes Oracle Linux and/or Oracle Solaris better platforms. Whether it's faster, more stable, more scalable, more resilient. It helps. Another point that I always like to re-iterate around UEK and UEK2 : we have our kernel source git repository online. Complete changelog of the mainline kernel, and our changes, easy to pull, easy to dissect, easy to know what went in when, why and where. No need to go log into a website and manually click through pages to hopefully discover changes or patches. No need to untar 2 tar balls and run a diff.

    Read the article

  • Columnstore Case Study #1: MSIT SONAR Aggregations

    - by aspiringgeek
    Preamble
    This is the first in a series of posts documenting big wins encountered using columnstore indexes in SQL Server 2012 & 2014.  Many of these can be found in this deck along with details such as internals, best practices, caveats, etc.  The purpose of sharing the case studies in this context is to provide an easy-to-consume quick-reference alternative.

    Why Columnstore?
    If we’re looking for a subset of columns from one or a few rows, given the right indexes, SQL Server can do a superlative job of providing an answer. If we’re asking a question which by design needs to hit lots of rows (DW, reporting, aggregations, grouping, scans, etc.), SQL Server has never had a good mechanism, until columnstore. Columnstore indexes were introduced in SQL Server 2012. However, they're still largely unknown. Some adoption blockers existed; yet columnstore was nonetheless a game changer for many apps.  In SQL Server 2014, potential blockers have been largely removed & columnstore indexes are going to profoundly change the way we interact with our data.  The purpose of this series is to share the performance benefits of columnstore & to document why columnstore is a compelling reason to upgrade to SQL Server 2014.

    App: MSIT SONAR Aggregations
    At MSIT, performance & configuration data is captured by SCOM. We archive much of the data in a partitioned data warehouse table in SQL Server 2012 for reporting via an application called SONAR.  By definition, this is a primary use case for columnstore: report queries requiring aggregation over large numbers of rows.  New data is refreshed each night by an automated table partitioning mechanism, a best practices scenario for columnstore.

    The Win
    Compared with classic indexing, which resulted in the expected query plan selection including partition elimination, the SQL Server 2012 nonclustered columnstore index increased query performance significantly.  Logical reads were reduced by over a factor of 50; both CPU & duration improved by factors of 20 or more.  Other than creating the columnstore index, no special modifications or tweaks to the app or database schema were necessary to achieve the performance improvements.  Existing nonclustered indexes were rendered superfluous & were deleted, thus mitigating maintenance challenges such as defragging as well as conserving disk capacity.

    Details
    The table provides the raw data & summarizes the performance deltas.

                                      Logical Reads (8K pages)    CPU (ms)    Durn (ms)
        Columnstore                                    160,323      20,360        9,786
        Conventional Table & Indexes                 9,053,423     549,608      193,903
        Δ                                                  x56         x27          x20

    The charts provide additional perspective on this data.  "Conventional vs. Columnstore Metrics" documents the raw data.  Note on this linear display the magnitude of the conventional index performance vs. columnstore.  The “Metrics (Δ)” chart expresses these values as a ratio.

    Summary
    For DW, reports, & other BI workloads, columnstore often provides significant performance enhancements relative to conventional indexing.  In this first report in a series on columnstore implementations, I have documented results from an initial implementation at MSIT in which logical reads were reduced by over a factor of 50; both CPU & duration improved by factors of 20 or more.  Subsequent features in this series document performance enhancements that are even more significant.
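
    As a quick check on the deltas quoted in the Details section, the ratios can be recomputed from the raw figures. This is only a sketch: the dictionary keys are names chosen here for illustration, and the numbers are the ones given in the post.

```python
# Raw figures from the post: (conventional table & indexes, columnstore).
# The key names are illustrative labels, not identifiers from the original.
metrics = {
    "logical_reads_8k_pages": (9_053_423, 160_323),
    "cpu_ms": (549_608, 20_360),
    "duration_ms": (193_903, 9_786),
}

# Improvement factor = conventional / columnstore, rounded to a whole number.
ratios = {name: round(conv / col) for name, (conv, col) in metrics.items()}

for name, factor in ratios.items():
    print(f"{name}: x{factor}")
```

    The rounded factors agree with the x56, x27 and x20 row of the table.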

    Read the article

  • overheating and shutdown problems when adobe flash runs?

    - by hamid
    I'm a new Ubuntu user with a Dell Latitude D630. When I browse to sites that have Flash animation (mostly advertisements), the temperature of the cores increases dramatically (I checked with sensors; in the worst case it was 104C for one core and 93C for the other), and if I don't close the website the laptop shuts down. Do you have any suggestions or a solution for that? PS: as an example of a site that triggers the crash, see "tabnak.ir", a news website with lots of ads.

    Read the article

  • SQL Server Connections Fall 2011 - Demos

    - by Adam Machanic
    Today is the last day of the annual SQL Server Connections show in Vegas, and I've just completed my third and final talk. (Now off to find a frosty beverage or two.) This year I did three sessions: SQL302: Parallelism and Performance: Are You Getting Full Return on Your CPU Investment? Over the past five years, multi-core processors have made the jump from semi-obscure to commonplace in the data center. While servers with 16, 32, or even 64 cores were once an out-of-reach choice for all except the...(read more)

    Read the article

  • New SQL Server 2012 per core licensing – Thank you Microsoft

    - by jchang
    Many of us have probably seen the new SQL Server 2012 per core licensing, with Enterprise Edition at $6,874 per core superseding the $27,495 per socket of SQL Server 2008 R2 (discounted to $19,188 for 4-way and $23,370 for 2-way in TPC benchmark reports) with Software Assurance at $6,874 per processor. Datacenter was $57,498 per processor, so the new per-core licensing puts 2012 EE on par with 2008 R2 DC at 8 cores per socket. This is a significant increase for EE licensing on the 2-way Xeon 5600...(read more)

    Read the article

  • Ubuntu 12.10 running slow

    - by andrew
    I pasted my syslog; perhaps someone can spot trouble that might need attention. The system is running slower than I would expect. Opening apps and web pages just takes forever. http://paste.ubuntu.com/1303211/ System Specs: Oct 24 12:42:55 ubuntu kernel: [ 1.369735] powernow-k8: Found 1 AMD V140 Processor (1 cpu cores) (version 2.20.00) http://en.wikipedia.org/wiki/List_of_AMD_Phenom_microprocessors#.22Champlain.22_.2845_nm.2C_Single-core.29_2

    Read the article

  • The best computer ever

    - by Jeff
    (This is a repost from my personal blog… wow… I need to write more technical stuff!) About three years and three months ago, I bought a 17" MacBook Pro, and it turned out to be the best computer I've ever owned. You might think that every computer with better specs is automatically better than the last, but that hasn't been my experience. My first one was a Sony, back in the Pentium III days, and it cost an astonishing $2,500. That was even more ridiculous in 1999 dollars. It had a dial-up modem, and a CD-ROM, built-in! It may have even played DVD's. A few years later I bought an HP, and it ended up being a pile of shit. The power connector inside came loose from the board, and on occasion would even short. In 2005, I bought a Dell, and it wasn't bad. It had a really high resolution screen (complete with dead pixels, a problem in those days), and it was the first laptop I felt I could do real work on. When 2006 rolled around, Apple started making computers with Intel CPU's, and I bought the very first one the week it came out. I used Boot Camp to run Windows. I still have it in its box somewhere, and I used it for three years. The current 17" was new in 2009. The goodness was largely rooted in having a big screen with lots of dots. This computer has been the source of hundreds of blog posts, tens of thousands of lines of code, video and photo editing, and of course, a whole lot of Web surfing. It connected to corpnet at Microsoft, WiFi in Hawaii and has presented many a deck. It has traveled with me tens of thousands of miles. Last year, I put a solid state drive in it, and it was like getting a new computer. I can boot up a Windows 7 VM in about 19 seconds. Having 8 gigs of RAM has always been fantastic. Everything about it has been fast and fun. When new, the battery (when not using VM's) could get as much as 10 hours. I can still do 7 without much trouble. After 460 charge cycles, the battery health is still between 85 and 90%. 
The only real negative has been the size and weight. It's only an inch thick, but naturally it's pretty big with a 17" screen. You don't get battery life like that without a huge battery, either, so it's heavy. It was never a deal breaker, but sometimes a long haul across a large airport, you know you're carrying it. Today, Apple announced a new, thinner and lighter 15" laptop, with twice the RAM and CPU cores, and four times the screen resolution. It basically handles my size and weight issues while retaining the resolution, and it still costs less than my 17" did. So I ordered one. Three years is an excellent run, but I kind of budgeted for a new workhorse this year anyway. So if you're interested in a 17" MacBook Pro with a Core 2 Duo 2.66 GHz CPU, 8 gigs of RAM and a 320 gig hard drive (sorry, I'm keeping the SSD), I have one to sell. They've apparently discontinued the 17", which is going to piss off the video community. It's in excellent condition, with a few minor scratches, but I take care of my stuff.

    Read the article

  • Faster Memory Allocation Using vmtasks

    - by Steve Sistare
    You may have noticed a new system process called "vmtasks" on Solaris 11 systems:

        % pgrep vmtasks
        8
        % prstat -p 8
           PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
             8 root        0K    0K sleep   99  -20   9:10:59 0.0% vmtasks/32

    What is vmtasks, and why should you care? In a nutshell, vmtasks accelerates creation, locking, and destruction of pages in shared memory segments. This is particularly helpful for locked memory, as creating a page of physical memory is much more expensive than creating a page of virtual memory. For example, an ISM segment (shmflg & SHM_SHARE_MMU) is locked in memory on the first shmat() call, and a DISM segment (shmflg & SHM_PAGEABLE) is locked using mlock() or memcntl().

    Segment operations such as creation and locking are typically single threaded, performed by the thread making the system call. In many applications, the size of a shared memory segment is a large fraction of total physical memory, and the single-threaded initialization is a scalability bottleneck which increases application startup time.

    To break the bottleneck, we apply parallel processing, harnessing the power of the additional CPUs that are always present on modern platforms. For sufficiently large segments, as many as 16 threads of vmtasks are employed to assist an application thread during creation, locking, and destruction operations. The segment is implicitly divided at page boundaries, and each thread is given a chunk of pages to process. The per-page processing time can vary, so for dynamic load balancing, the number of chunks is greater than the number of threads, and threads grab chunks dynamically as they finish their work. Because the threads modify a single application address space in a compressed time interval, contention on the locks protecting VM data structures was a problem, and we had to re-scale a number of VM locks to get good parallel efficiency.
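
    The chunk-grabbing scheme described above can be sketched in user-level Python. This is only an illustration of the load-balancing idea, not the Solaris kernel implementation: the function name, the `chunks_per_thread` factor of 4 and the no-op `work` callback are all assumptions made for the example.

```python
import queue
import threading

def process_in_chunks(num_pages, num_threads=16, chunks_per_thread=4,
                      work=lambda page: None):
    """Split [0, num_pages) into more chunks than threads and let each
    worker pull the next chunk as soon as it finishes the previous one."""
    num_chunks = num_threads * chunks_per_thread       # hypothetical ratio
    chunk_size = max(1, -(-num_pages // num_chunks))   # ceiling division
    chunks = queue.Queue()
    for start in range(0, num_pages, chunk_size):
        chunks.put((start, min(start + chunk_size, num_pages)))

    processed = [0] * num_threads   # each worker writes only its own slot

    def worker(tid):
        while True:
            try:
                start, end = chunks.get_nowait()   # grab the next chunk
            except queue.Empty:
                return                             # no work left
            for page in range(start, end):
                work(page)
            processed[tid] += end - start

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed

counts = process_in_chunks(10_000)
print(sum(counts), "pages processed by", len(counts), "workers")
```

    Because chunks outnumber workers, a worker that draws cheap pages simply comes back for more, so uneven per-page cost does not leave the other workers idle.
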
    The vmtasks process has 1 thread per CPU and may accelerate multiple segment operations simultaneously, but each operation gets at most 16 helper threads to avoid monopolizing CPU resources. We may reconsider this limit in the future. Acceleration using vmtasks is enabled out of the box, with no tuning required, and works for all Solaris platform architectures (SPARC sun4u, SPARC sun4v, x86).

    The following tables show the time to create + lock + destroy a large segment, normalized as milliseconds per gigabyte, before and after the introduction of vmtasks:

        ISM
        system  ncpu  before  after  speedup
        ------  ----  ------  -----  -------
        x4600     32    1386    245       6X
        X7560     64    1016    153       7X
        M9000    512    1196    206       6X
        T5240    128    2506    234      11X
        T4-2     128    1197    107      11X

        DISM
        system  ncpu  before  after  speedup
        ------  ----  ------  -----  -------
        x4600     32    1582    265       6X
        X7560     64    1116    158       7X
        M9000    512    1165    152       8X
        T5240    128    2796    198      14X

    (I am missing the data for T4 DISM, for no good reason; it works fine.)

    The following table separates the creation and destruction times:

        ISM, T4-2
                 before  after
                 ------  -----
        create      702     64
        destroy     495     43

    To put this in perspective, consider creating a 512 GB ISM segment on T4-2. Creating the segment would take 6 minutes with the old code, and only 33 seconds with the new. If this is your Oracle SGA, you save over 5 minutes when starting the database, and you also save when shutting it down prior to a restart. Those minutes go directly to your bottom line for service availability.
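
    The 512 GB figures follow directly from the per-gigabyte creation times quoted for the T4-2. A small worked calculation, with variable and function names chosen here purely for illustration:

```python
# Creation times for ISM on T4-2, in milliseconds per gigabyte (from the post).
CREATE_MS_PER_GB_BEFORE = 702
CREATE_MS_PER_GB_AFTER = 64

def create_seconds(segment_gb, ms_per_gb):
    """Total creation time in seconds for a segment of the given size."""
    return segment_gb * ms_per_gb / 1000.0

before_s = create_seconds(512, CREATE_MS_PER_GB_BEFORE)  # 512 * 702 ms
after_s = create_seconds(512, CREATE_MS_PER_GB_AFTER)    # 512 * 64 ms
saved_s = before_s - after_s

print(f"before: {before_s / 60:.1f} min, after: {after_s:.0f} s, "
      f"saved: {saved_s / 60:.1f} min")
```

    That is roughly 6 minutes before, 33 seconds after, and a saving of a little over 5 minutes, matching the numbers in the text.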

    Read the article

  • Using WKA in Large Coherence Clusters (Disabling Multicast)

    - by jpurdy
    Disabling hardware multicast (by configuring well-known addresses, aka WKA) will place significant stress on the network. For messages that must be sent to multiple servers, rather than having a server send a single packet to the switch and having the switch broadcast that packet to the rest of the cluster, the server must send a packet to each of the other servers. While hardware varies significantly, consider that a server with a single gigabit connection can send at most ~70,000 packets per second. To continue with some concrete numbers, in a cluster with 500 members, that means that each server can send at most 140 cluster-wide messages per second. And if there are 10 cluster members on each physical machine, that number shrinks to 14 cluster-wide messages per second (or with only mild hyperbole, roughly zero).

    It is also important to keep in mind that network I/O is not only expensive in terms of the network itself, but also in the consumption of CPU required to send (or receive) a message (due to things like copying the packet bytes, processing an interrupt, etc).

    Fortunately, Coherence is designed to rely primarily on point-to-point messages, but there are some features that are inherently one-to-many:

      * Announcing the arrival or departure of a member
      * Updating partition assignment maps across the cluster
      * Creating or destroying a NamedCache
      * Invalidating a cache entry from a large number of client-side near caches
      * Distributing a filter-based request across the full set of cache servers (e.g. queries, aggregators and entry processors)
      * Invoking clear() on a NamedCache

    The first few of these are operations that are primarily routed through a single senior member, and also occur infrequently, so they usually are not a primary consideration. There are cases, however, where the load from introducing new members can be substantial (to the point of destabilizing the cluster).
    Consider the case where the cluster in the first paragraph grows from 500 members to 1000 members (holding the number of physical machines constant). During this period, there will be 500 new member introductions, each of which may consist of several cluster-wide operations (for the cluster membership itself as well as the partitioned cache services, replicated cache services, invocation services, management services, etc). Note that all of these introductions will route through that one senior member, which is sharing its network bandwidth with several other members (which will be communicating to a lesser degree with other members throughout this process). While each service may have a distinct senior member, there's a good chance during initial startup that a single member will be the senior for all services (if those services start on the senior before the second member joins the cluster). It's obvious that this could cause CPU and/or network starvation.

    In the current release of Coherence (3.7.1.3 as of this writing), the pure unicast code path also has less sophisticated flow-control for cluster-wide messages (compared to the multicast-enabled code path), which may also result in significant heap consumption on the senior member's JVM (from the message backlog). This is almost never a problem in practice, but with sufficient CPU or network starvation, it could become critical.

    For the non-operational concerns (near caches, queries, etc), the application itself will determine how much load is placed on the cluster. Applications intended for deployment in a pure unicast environment should be careful to avoid excessive dependence on these features. Even in an environment with multicast support, these operations may scale poorly since even with a constant request rate, the underlying workload will increase at roughly the same rate as the underlying resources are added. Unless there is an infrastructural requirement to the contrary, multicast should be enabled. 
If it can't be enabled, care should be taken to ensure the added overhead doesn't lead to performance or stability issues. This is particularly crucial in large clusters.
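
    The packet-budget arithmetic from the opening paragraph can be written out as a small calculation. This is a back-of-the-envelope sketch, not a Coherence API: the function name is made up for the example, and like the article it approximates "every other member" as the full member count.

```python
def cluster_wide_messages_per_second(nic_packets_per_second,
                                     cluster_members,
                                     members_per_machine=1):
    """Upper bound on cluster-wide messages one member can send per second
    when every such message costs one unicast packet per member and the
    members on a machine share a single network connection."""
    per_machine = nic_packets_per_second / cluster_members
    return per_machine / members_per_machine

# Figures from the article: ~70,000 packets/s on a single gigabit link.
print(cluster_wide_messages_per_second(70_000, 500))      # one member per machine
print(cluster_wide_messages_per_second(70_000, 500, 10))  # ten members per machine
```

    The two calls give the article's 140 and 14 messages per second. Doubling the membership to 1000 on the same machines (20 members per machine) would, under the same assumptions, drop the bound to 3.5.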

    Read the article

  • installer can't find partition, but fdisk can find them

    - by pxd
    I'm installing Ubuntu 12.04. My system has two operating systems installed: Windows XP and Ubuntu 10.10. Now I want to upgrade Ubuntu to 12.04, and I'm installing from a USB disk. However, the installer cannot find the partitions on my hard disk, although fdisk can. Can you help me? What should I do?

        ubuntu@ubuntu:~$ sudo lshw -short
        H/W path           Device      Class       Description
                                       system      HP 2230s (NN868PA#AB2)
        /0                             bus         3037
        /0/9                           memory      64KiB BIOS
        /0/0                           processor   Intel(R) Core(TM)2 Duo CPU T6570 @ 2.10GHz
        /0/0/1                         memory      2MiB L2 cache
        /0/0/3                         memory      32KiB L1 cache
        /0/0/0.1                       processor   Logical CPU
        /0/0/0.2                       processor   Logical CPU
        /0/2                           memory      32KiB L1 cache
        /0/4                           memory      2GiB System Memory
        /0/4/0                         memory      SODIMM [empty]
        /0/4/1                         memory      2GiB SODIMM DDR2 Synchronous 800 MHz (1.2 ns)
        /0/100                         bridge      Mobile 4 Series Chipset Memory Controller Hub
        /0/100/2                       display     Mobile 4 Series Chipset Integrated Graphics Controller
        /0/100/2.1                     display     Mobile 4 Series Chipset Integrated Graphics Controller
        /0/100/1a                      bus         82801I (ICH9 Family) USB UHCI Controller #4
        /0/100/1a.1                    bus         82801I (ICH9 Family) USB UHCI Controller #5
        /0/100/1a.2                    bus         82801I (ICH9 Family) USB UHCI Controller #6
        /0/100/1a.7                    bus         82801I (ICH9 Family) USB2 EHCI Controller #2
        /0/100/1b                      multimedia  82801I (ICH9 Family) HD Audio Controller
        /0/100/1c                      bridge      82801I (ICH9 Family) PCI Express Port 1
        /0/100/1c.1                    bridge      82801I (ICH9 Family) PCI Express Port 2
        /0/100/1c.1/0      wlan1       network     PRO/Wireless 5100 AGN [Shiloh] Network Connection
        /0/100/1c.2                    bridge      82801I (ICH9 Family) PCI Express Port 3
        /0/100/1c.4                    bridge      82801I (ICH9 Family) PCI Express Port 5
        /0/100/1c.5                    bridge      82801I (ICH9 Family) PCI Express Port 6
        /0/100/1c.5/0      eth1        network     88E8072 PCI-E Gigabit Ethernet Controller
        /0/100/1d                      bus         82801I (ICH9 Family) USB UHCI Controller #1
        /0/100/1d.1                    bus         82801I (ICH9 Family) USB UHCI Controller #2
        /0/100/1d.2                    bus         82801I (ICH9 Family) USB UHCI Controller #3
        /0/100/1d.7                    bus         82801I (ICH9 Family) USB2 EHCI Controller #1
        /0/100/1e                      bridge      82801 Mobile PCI Bridge
        /0/100/1f                      bridge      ICH9M LPC Interface Controller
        /0/100/1f.2        scsi0       storage     82801IBM/IEM (ICH9M/ICH9M-E) 4 port SATA Controller [AHCI mode]
        /0/100/1f.2/0      /dev/sda    disk        500GB WDC WD5000BEVT-0
        /0/100/1f.2/0/1    /dev/sda1   volume      48GiB Windows NTFS volume
        /0/100/1f.2/0/2    /dev/sda2   volume      416GiB Extended partition
        /0/100/1f.2/0/2/5  /dev/sda5   volume      97GiB HPFS/NTFS partition
        /0/100/1f.2/0/2/6  /dev/sda6   volume      198GiB HPFS/NTFS partition
        /0/100/1f.2/0/2/7  /dev/sda7   volume      27GiB Linux filesystem partition
        /0/100/1f.2/0/2/8  /dev/sda8   volume      93GiB Linux filesystem partition
        /0/100/1f.2/1      /dev/cdrom  disk        CDDVDW TS-L633M
        /0/1               scsi6       storage
        /0/1/0.0.0         /dev/sdb    disk        15GB STORAGE DEVICE
        /0/1/0.0.0/0       /dev/sdb    disk        15GB
        /0/1/0.0.0/0/1     /dev/sdb1   volume      14GiB Windows FAT volume
        /1                             power       HZ04037
        ubuntu@ubuntu:~$

        ubuntu@ubuntu:~$ sudo fdisk -l

        Disk /dev/sda: 500.1 GB, 500107862016 bytes
        255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
        Units = sectors of 1 * 512 = 512 bytes
        Sector size (logical/physical): 512 bytes / 512 bytes
        I/O size (minimum/optimal): 512 bytes / 512 bytes
        Disk identifier: 0x31263125

           Device Boot      Start         End      Blocks   Id  System
        /dev/sda1   *          63   102277727    51138832+   7  HPFS/NTFS/exFAT
        /dev/sda2       102277728   976784129   437253201    f  W95 Ext'd (LBA)
        /dev/sda5       102277791   307078127   102400168+   7  HPFS/NTFS/exFAT
        /dev/sda6       307078191   724141151   208531480+   7  HPFS/NTFS/exFAT
        /dev/sda7       724142080   781459455    28658688   83  Linux
        /dev/sda8       781461504   976771071    97654784   83  Linux

        Disk /dev/sdb: 15.9 GB, 15931539456 bytes
        64 heads, 32 sectors/track, 15193 cylinders, total 31116288 sectors
        Units = sectors of 1 * 512 = 512 bytes
        Sector size (logical/physical): 512 bytes / 512 bytes
        I/O size (minimum/optimal): 512 bytes / 512 bytes
        Disk identifier: 0x0009eb92

           Device Boot      Start         End      Blocks   Id  System
        /dev/sdb1   *          32    31115263    15557616    c  W95 FAT32 (LBA)

    The Ubuntu 12.04 installer can't find the partitions on my hard disk; it only finds the device /dev/sda. (Sorry, I'm a new user, so I can't post an image.)

    Read the article

  • Queries barely over the Cost Threshold for Parallelism

    - by jchang
    I had discussed SQL Server parallelism in Oct 2010, with my thoughts on the best settings for: Cost Threshold for Parallelism (CTP) and Max Degrees of Parallelism (MAXDOP) in Parallelism Strategy and Comments . At the time, I had intended to follow up with detailed measurements. So now a mere 2 years later, here it is. The general thought was that CTP should be raised from the default value of 5, and MAXDOP should be changed from unrestricted, on modern systems with very many cores, and most especially...(read more)

    Read the article

  • Queries barely over the Cost Threshold for Parallelism

    - by jchang
    Previously I had discussed SQL Server parallelism, with my thoughts on the best settings for: Cost Threshold for Parallelism (CTP) and Max Degrees of Parallelism (MAXDOP) in Parallelism Strategy and Comments . At the time, I had intended to follow up with detailed measurements. So now a mere 2 years later, here it is. The general thought was that CTP should be raised from the default value of 5, and MAXDOP should be changed from unrestricted, on modern systems with very many cores, and most especially...(read more)

    Read the article

  • High Resolution Timeouts

    - by user12607257
    The default resolution of application timers and timeouts is now 1 msec in Solaris 11.1, down from 10 msec in previous releases. This improves out-of-the-box performance of polling and event based applications, such as ticker applications, and even the Oracle rdbms log writer. More on that in a moment.

    As a simple example, the poll() system call takes a timeout argument in units of msec:

        System Calls                                              poll(2)

        NAME
             poll - input/output multiplexing

        SYNOPSIS
             int poll(struct pollfd fds[], nfds_t nfds, int timeout);

    In Solaris 11, a call to poll(NULL,0,1) returns in 10 msec, because even though a 1 msec interval is requested, the implementation rounds to the system clock resolution of 10 msec. In Solaris 11.1, this call returns in 1 msec.

    In specification lawyer terms, the resolution of CLOCK_REALTIME, introduced by POSIX.1b real time extensions, is now 1 msec. The function clock_getres(CLOCK_REALTIME,&res) returns 1 msec, and any library call whose man page explicitly mentions CLOCK_REALTIME, such as nanosleep(), is subject to the new resolution. Additionally, many legacy functions that pre-date POSIX.1b and do not explicitly mention a clock domain, such as poll(), are subject to the new resolution. Here is a fairly comprehensive list:

        nanosleep
        pthread_mutex_timedlock
        pthread_mutex_reltimedlock_np
        pthread_rwlock_timedrdlock
        pthread_rwlock_reltimedrdlock_np
        pthread_rwlock_timedwrlock
        pthread_rwlock_reltimedwrlock_np
        mq_timedreceive
        mq_reltimedreceive_np
        mq_timedsend
        mq_reltimedsend_np
        sem_timedwait
        sem_reltimedwait_np
        poll
        select
        pselect
        _lwp_cond_timedwait
        _lwp_cond_reltimedwait
        semtimedop
        sigtimedwait
        aiowait
        aio_waitn
        aio_suspend
        port_get
        port_getn
        cond_timedwait
        cond_reltimedwait
        setitimer (ITIMER_REAL)
        misc rpc calls, misc ldap calls

    This change in resolution was made feasible because we made the implementation of timeouts more efficient a few years back when we re-architected the callout subsystem of Solaris.
Previously, timeouts were tested and expired by the kernel's clock thread which ran 100 times per second, yielding a resolution of 10 msec. This did not scale, as timeouts could be posted by every CPU, but were expired by only a single thread. The resolution could be changed by setting hires_tick=1 in /etc/system, but this caused the clock thread to run at 1000 Hz, which made the potential scalability problem worse. Given enough CPUs posting enough timeouts, the clock thread could be a performance bottleneck. We fixed that by re-implementing the timeout as a per-CPU timer interrupt (using the cyclic subsystem, for those familiar with Solaris internals). This decoupled the clock thread frequency from timeout resolution, and allowed us to improve default timeout resolution without adding CPU overhead in the clock thread. Here are some exceptions for which the default resolution is still 10 msec. The thread scheduler's time quantum is 10 msec by default, because preemption is driven by the clock thread (plus helper threads for scalability). See for example dispadmin, priocntl, fx_dptbl, rt_dptbl, and ts_dptbl. This may be changed using hires_tick. The resolution of the clock_t data type, primarily used in DDI functions, is 10 msec. It may be changed using hires_tick. These functions are only used by developers writing kernel modules. A few functions that pre-date POSIX CLOCK_REALTIME mention _SC_CLK_TCK, CLK_TCK, "system clock", or no clock domain. These functions are still driven by the clock thread, and their resolution is 10 msec. They include alarm, pcsample, times, clock, and setitimer for ITIMER_VIRTUAL and ITIMER_PROF. Their resolution may be changed using hires_tick. Now back to the database. How does this help the Oracle log writer? Foreground processes post a redo record to the log writer, which releases them after the redo has committed. 
When a large number of foregrounds are waiting, the release step can slow down the log writer, so under heavy load, the foregrounds switch to a mode where they poll for completion. This scales better because every foreground can poll independently, but at the cost of waiting the minimum polling interval. That was 10 msec, but is now 1 msec in Solaris 11.1, so the foregrounds process transactions faster under load. Pretty cool.
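The effect described above can be checked from user space. The following is a minimal Python sketch, not part of the original article: it reads the resolution the OS reports for CLOCK_REALTIME and times a 1 msec poll() with no file descriptors. It assumes a POSIX system where Python exposes `time.clock_getres` and `select.poll`. The helper names `realtime_resolution_ms` and `timed_poll_ms` are made up for illustration.

```python
import select
import time


def realtime_resolution_ms() -> float:
    """Return the resolution the OS reports for CLOCK_REALTIME, in msec."""
    return time.clock_getres(time.CLOCK_REALTIME) * 1000.0


def timed_poll_ms(timeout_ms: int = 1, rounds: int = 20) -> float:
    """Average wall time of poll() with no descriptors and the given timeout."""
    poller = select.poll()        # no fds registered, so only the timeout fires
    start = time.monotonic()
    for _ in range(rounds):
        poller.poll(timeout_ms)   # timeout is in msec, like poll(2)
    elapsed = time.monotonic() - start
    return elapsed * 1000.0 / rounds


if __name__ == "__main__":
    print(f"CLOCK_REALTIME resolution: {realtime_resolution_ms():.6f} msec")
    print(f"poll with 1 msec timeout:  {timed_poll_ms():.3f} msec on average")
```

Going by the article, one would expect roughly 10 msec per call before Solaris 11.1 and about 1 msec from 11.1 on. Other systems report their own values, and the averages vary with load.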

    Read the article

  • Kernel Log: Linux 2.6.34 goes into testing

    The H Open: "Improvements include graphics drivers for recent Radeon GPUs and for the graphics cores of some Intel processors that are only expected to be released early next year. Another new addition is the LogFS SSD file system."

    Read the article

  • Detecting Hyper-Threading state

    - by jchang
    To interpret performance counters and execution statistics correctly, it is necessary to know the state of Hyper-Threading. In principle, at low overall CPU utilization, for non-parallel execution plans, it should not matter whether HT is enabled or not. Of course, DBA life is never that simple. The state of HT does matter at high overall utilization and in parallel execution plans, depending on the DOP. SQL Server does seem to try to allocate threads on distinct physical cores at intermediate DOP (DOP less...(read more)

    Read the article

  • ?SPARC T4?????????????·???? : Netra SPARC T4-1

    - by user13138700
    ?SPARC T4???????????????·??????? Netra SPARC T4-1 ???? Netra SPARC T4-2 ?2012?1?10??????????3?15??????????????(????) ?????????? Netra SPARC T4-1 ? 4core ???( T4 ???????? 4core ???)(*)???????????????????????????(*)( Netra SPARC T4-1 ?????? 4core ???? 8core ????????) ??? prtdiag ????? pginfo ??????????????? 8????/1core ???? prtdiag ????????4core=32???????????????pginfo ?????????????????core ???????????????????? # ./prtdiag -v System Configuration: Oracle Corporation sun4v Netra SPARC T4-1 ???????: 130560 M ??? ================================ ?? CPU ================================ CPU ID Frequency Implementation Status ------ --------- ---------------------- ------- 0 2848 MHz SPARC-T4 on-line 1 2848 MHz SPARC-T4 on-line 2 2848 MHz SPARC-T4 on-line 3 2848 MHz SPARC-T4 on-line 4 2848 MHz SPARC-T4 on-line 5 2848 MHz SPARC-T4 on-line 6 2848 MHz SPARC-T4 on-line 7 2848 MHz SPARC-T4 on-line 8 2848 MHz SPARC-T4 on-line 9 2848 MHz SPARC-T4 on-line 10 2848 MHz SPARC-T4 on-line 11 2848 MHz SPARC-T4 on-line 12 2848 MHz SPARC-T4 on-line 13 2848 MHz SPARC-T4 on-line 14 2848 MHz SPARC-T4 on-line 15 2848 MHz SPARC-T4 on-line 16 2848 MHz SPARC-T4 on-line 17 2848 MHz SPARC-T4 on-line 18 2848 MHz SPARC-T4 on-line 19 2848 MHz SPARC-T4 on-line 20 2848 MHz SPARC-T4 on-line 21 2848 MHz SPARC-T4 on-line 22 2848 MHz SPARC-T4 on-line 23 2848 MHz SPARC-T4 on-line 24 2848 MHz SPARC-T4 on-line 25 2848 MHz SPARC-T4 on-line 26 2848 MHz SPARC-T4 on-line 27 2848 MHz SPARC-T4 on-line 28 2848 MHz SPARC-T4 on-line 29 2848 MHz SPARC-T4 on-line 30 2848 MHz SPARC-T4 on-line 31 2848 MHz SPARC-T4 on-line ======================= Physical Memory Configuration ======================== ???? 
# pginfo -p -T 0 (System [system,chip]) CPUs: 0-31 `-- 3 (Data_Pipe_to_memory [system,chip]) CPUs: 0-31 |-- 2 (Floating_Point_Unit [core]) CPUs: 0-7 | `-- 1 (Integer_Pipeline [core]) CPUs: 0-7 |-- 5 (Floating_Point_Unit [core]) CPUs: 8-15 | `-- 4 (Integer_Pipeline [core]) CPUs: 8-15 |-- 7 (Floating_Point_Unit [core]) CPUs: 16-23 | `-- 6 (Integer_Pipeline [core]) CPUs: 16-23 `-- 9 (Floating_Point_Unit [core]) CPUs: 24-31 `-- 8 (Integer_Pipeline [core]) CPUs: 24-31 T4 ????????????????????????????????????????????????? T3 ?????(S2 core)?????T4 ?????(S3 core)?????????????5???????????? T3 ?????(S2 core)?????????????????????????(????????)?????????????????????????????????????????????·???????????????????????????????????????? ????T4 ????????????????????????????T4 ??????????·??????? Netra SPARC T4-1 4core ????????????????????????????????????T3 ???????????????????????????? ?????????Netra SPARC T4-1 ??????????????? Netra SPARC T4-1 ?? Computing 1 x SPARC T4 4?? 32???? or 8 ?? 64 ???? 2.85GHz CPU (1?????8????) 16 x DDR3 DIMM (?? 256GB ?????16GB DIMM ???) I/O and Storage 3 x Low Profile PCI-Express Gen2 ???? (2 x 10Gb Ethernet XAUI ???????) 2 x Full-height Half-length PCI-Express Gen2 ???? 4 x 10/100/1000 Ethernet ???????? 4 x 2.5” SAS2 HDD 4 x USB ??? (?? 2, ?? 2) RAS and Management and Power Supply ???? (RAID????), ????PSU ?????????? ILOM?????????????? 2N (1+1) , AC ???? DC ?? Support OS Oracle Solaris 10 10/9, 9/10, 8/11, Oracle Solaris 11 11/11 Oracle VM Server for SPARC 2.1 (LDoms) ???? ??? NEBS Level3?? ??????21” 19”(EIA-310D),23”,24”,600mm????? ?????(?????)????????? ????SPARC T4 ????????SPARC T4 ?????????????????????????(4???)???????????? Oracle OpenWorld Tokyo 2012 ?3??(4/4(?)?4/5(?)?4/6(?))?????????????????????&?????????????????SPARC T4 ?????????????????????????????????·?????????????????SPARC T4 ???????????????????!? Oracle OpenWorld Tokyo 2012 http://www.oracle.com/openworld/jp-ja/index.html ????·???????????? 4/6(?) Develop D3-13 (14:00 - 14:45) ???????????49 ??? ?????? 
7264 ???????????????

    Read the article

  • MSAcpi_ThermalZoneTemperature class not showing actual temperature

    - by jchoudhury
    I want to fetch CPU performance data in real time, including temperature. I used the following code to get the CPU temperature: try { ManagementObjectSearcher searcher = new ManagementObjectSearcher("root\\WMI", "SELECT * FROM MSAcpi_ThermalZoneTemperature"); foreach (ManagementObject queryObj in searcher.Get()) { double temp = Convert.ToDouble(queryObj["CurrentTemperature"].ToString()); double temp_critical = Convert.ToDouble(queryObj["CriticalTripPoint"].ToString()); double temp_cel = (temp/10 - 273.15); double temp_critical_cel = temp_critical / 10 - 273.15; lblCurrentTemp.Text = temp_cel.ToString(); lblCriticalTemp.Text = temp_critical_cel.ToString(); } } catch (ManagementException e) { MessageBox.Show("An error occurred while querying for WMI data: " + e.Message); } But this code does not show the correct temperature. It usually shows 49.5-50.5 degrees centigrade, whereas OpenHardwareMonitor reports a CPU temperature of over 71 degrees centigrade, with the fractional part changing over time. Is there anything I am missing in the code? I run the above code in a timer_click event every 500 ms to refresh the temperature reading, but it always shows the same temperature it showed at the start of execution. That is, if you run this application and it shows 49 degrees, then after a one-hour session it will still show 49 degrees. Where is the problem? Please help. Thanks in advance.
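The conversion used in the question can be checked in isolation. The following is a minimal Python sketch of the same arithmetic, assuming (as the question's code does) that the raw reading is in tenths of a kelvin. The function name and the sample value 3232 are hypothetical, chosen only to land in the range the poster reports.

```python
def deci_kelvin_to_celsius(raw: float) -> float:
    """Convert a reading in tenths of a kelvin to degrees Celsius."""
    return raw / 10.0 - 273.15


if __name__ == "__main__":
    sample = 3232  # hypothetical raw CurrentTemperature value
    print(f"{deci_kelvin_to_celsius(sample):.2f} C")  # prints 50.05 C
```

This only confirms that the formula itself is consistent with a reading around 50 degrees. It says nothing about which sensor the WMI class reports or how often that value is refreshed.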

    Read the article

  • Context migration in CUDA.NET

    - by Vyacheslav
    I'm currently using the CUDA.NET library by GASS. I need to initialize CUDA arrays (actually CUBLAS vectors, but it doesn't matter) in one CPU thread and use them in another CPU thread. But the CUDA context, which holds all initialized arrays and loaded functions, can be attached to only one CPU thread. There is a mechanism called the context migration API to detach a context from one thread and attach it to another, but I don't know how to use it properly in CUDA.NET. I tried something like this: class Program { private static float[] vector1, vector2; private static CUDA cuda; private static CUBLAS cublas; private static CUdeviceptr ptr; static void Main(string[] args) { cuda = new CUDA(false); cublas = new CUBLAS(cuda); cuda.Init(); cuda.CreateContext(0); AllocateVectors(); cuda.DetachContext(); CUcontext context = cuda.PopCurrentContext(); GetVectorFromDeviceAsync(context); } private static void AllocateVectors() { vector1 = new float[]{1f, 2f, 3f, 4f, 5f}; ptr = cublas.Allocate(vector1.Length, sizeof (float)); cublas.SetVector(vector1, ptr); vector2 = new float[5]; } private static void GetVectorFromDevice(object objContext) { CUcontext localContext = (CUcontext) objContext; cuda.PushCurrentContext(localContext); cuda.AttachContext(localContext); //change vector somehow vector1[0] = -1; //copy changed vector to device cublas.SetVector(vector1, ptr); cublas.GetVector(ptr, vector2); CUDADriver.cuCtxPopCurrent(ref localContext); } private static void GetVectorFromDeviceAsync(CUcontext cUcontext) { Thread thread = new Thread(GetVectorFromDevice); thread.IsBackground = false; thread.Start(cUcontext); } } But execution fails on the attempt to copy the changed vector to the device, apparently because the context is not attached. Any ideas how I can get it to work?

    Read the article

  • How to get Processor and Motherboard Id ?

    - by Frank
    I use the code from http://www.rgagnon.com/javadetails/java-0580.html to get the motherboard ID, but the result is "null". (1) How can that be? (2) I also modified the code a bit, to look like this, to get the processor ID: "Set objWMIService = GetObject(\"winmgmts:\\\\.\\root\\cimv2\")\n"+ "Set colItems = objWMIService.ExecQuery _ \n"+ " (\"Select * from Win32_Processor\") \n"+ "For Each objItem in colItems \n"+ " Wscript.Echo objItem.ProcessorId \n"+ " exit for ' do the first cpu only! \n"+ "Next \n"; The result is something like: ProcessorId = BFEBFBFF00010676. On http://msdn.microsoft.com/en-us/library/aa389273%28VS.85%29.aspx it says: ProcessorId: Processor information that describes the processor features. For an x86 class CPU, the field format depends on the processor support of the CPUID instruction. If the instruction is supported, the property contains 2 (two) DWORD formatted values. The first is an offset of 08h-0Bh, which is the EAX value that a CPUID instruction returns with input EAX set to 1. The second is an offset of 0Ch-0Fh, which is the EDX value that the instruction returns. Only the first two bytes of the property are significant and contain the contents of the DX register at CPU reset; all others are set to 0 (zero), and the contents are in DWORD format. I don't quite understand it. In plain English, is it unique, or is it just a number for this class of processors, so that, for instance, every Intel Core2 Duo P8400 will have this number? Frank
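The two DWORDs in the quoted documentation can be pulled apart to see what the value encodes. The following is a minimal Python sketch, assuming the 16 hex digits show the EDX feature flags in the upper half and the EAX processor signature in the lower half. That ordering is an assumption inferred from the sample value, not something the quoted text states. The function name is hypothetical. The EAX bit layout is the standard family/model/stepping encoding returned by CPUID with EAX set to 1.

```python
def decode_processor_id(processor_id: str) -> dict:
    """Split a 16-hex-digit ProcessorId into EDX flags and EAX signature fields."""
    value = int(processor_id, 16)
    edx = (value >> 32) & 0xFFFFFFFF   # assumed: feature flags in the upper DWORD
    eax = value & 0xFFFFFFFF           # assumed: signature in the lower DWORD

    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # Extended fields only contribute for certain base family values.
    family = base_family + ext_family if base_family == 0xF else base_family
    model = (ext_model << 4) | base_model if base_family in (0x6, 0xF) else base_model

    return {"edx": edx, "eax": eax, "family": family,
            "model": model, "stepping": stepping}


if __name__ == "__main__":
    info = decode_processor_id("BFEBFBFF00010676")
    print(info["family"], info["model"], info["stepping"])  # prints 6 23 6
```

Under these assumptions the value describes a family, model, stepping and feature set rather than an individual chip, so processors of the same model and stepping would report the same ProcessorId.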

    Read the article

  • Winforms: How to speed up Invalidate()?

    - by Pedery
    I'm developing a retained-mode drawing application in GDI+. The application can draw simple shapes to a canvas and perform basic editing. The math that does this is optimized to the last byte and is not an issue. I'm drawing on a panel that uses the built-in ControlStyles.DoubleBuffer. My problem arises if I run my app maximized on a big monitor (HD in my case). If I try to draw a line from one corner of the (big) canvas to the diagonally opposite one, it starts to lag and CPU usage climbs. Each graphical object in my app has a bounding box. Thus, when I invalidate the bounding box of a line that goes from one corner of the maximized app to the diagonally opposite one, that bounding box is virtually as big as the canvas. When a user is drawing a line, this invalidation of the bounding box happens on the MouseMove event, and there is a clearly visible lag. This lag also exists if the line is the only object on the canvas. I've tried to optimize this in many ways. If I draw a shorter line, the CPU usage and the lag go down. If I remove the Invalidate() and keep all other code, the app is quick. If I use a Region (that only spans the figure) to invalidate instead of the bounding box, it is just as slow. If I split the bounding box into a range of smaller boxes that lie back to back, thus reducing the invalidation area, no visible performance gain can be seen. Thus I'm at a loss here. How can I speed up the invalidation? On a side note, both Paint.Net and Mspaint suffer from the same shortcomings. Word and PowerPoint, however, seem to be able to paint a line as described above with no lag and no CPU load at all. Thus it's possible to achieve the desired results; the question is how?
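The area argument behind splitting the bounding box can be made concrete. The following is a minimal Python sketch of that arithmetic only: it cuts a line segment into equal pieces and compares the summed area of the per-piece boxes with the single bounding box. The function names and the 1920x1080 canvas size are assumptions for illustration, and line width and padding are ignored.

```python
def full_box_area(x0, y0, x1, y1):
    """Area of the single bounding box around the whole segment."""
    return abs(x1 - x0) * abs(y1 - y0)


def split_boxes(x0, y0, x1, y1, pieces):
    """Bounding boxes (left, top, width, height) of equal sub-segments."""
    boxes = []
    for i in range(pieces):
        ax = x0 + (x1 - x0) * i / pieces
        ay = y0 + (y1 - y0) * i / pieces
        bx = x0 + (x1 - x0) * (i + 1) / pieces
        by = y0 + (y1 - y0) * (i + 1) / pieces
        boxes.append((min(ax, bx), min(ay, by), abs(bx - ax), abs(by - ay)))
    return boxes


if __name__ == "__main__":
    whole = full_box_area(0, 0, 1920, 1080)
    parts = split_boxes(0, 0, 1920, 1080, 16)
    covered = sum(w * h for _, _, w, h in parts)
    print(whole, covered)  # covered is whole / 16 for a corner-to-corner line
```

For a corner-to-corner line, n pieces cover 1/n of the original area. Whether the toolkit then repaints only those boxes depends on how it combines the invalidated areas, which this sketch does not model and which may explain why the poster saw no gain.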

    Read the article
