Search Results

Search found 5756 results on 231 pages for 'cpu utilization'.

Page 102/231 | < Previous Page | 98 99 100 101 102 103 104 105 106 107 108 109  | Next Page >

  • High performance text file parsing in .net

    - by diamandiev
    Here is the situation: I am making a small prog to parse server log files. I tested it with a log file with several thousand requests (between 10000 - 20000 don't know exactly) What i have to do is to load the log text files into memory so that i can query them. This is taking the most resources. The methods that take the most cpu time are those (worst culprits first): string.split - splits the line values into a array of values string.contains - checking if the user agent contains a specific agent string. (determine browser ID) string.tolower - various purposes streamreader.readline - to read the log file line by line. string.startswith - determine if line is a column definition line or a line with values there were some others that i was able to replace. For example the dictionary getter was taking lots of resources too. Which i had not expected since its a dictionary and should have its keys indexed. I replaced it with a multidimensional array and saved some cpu time. Now i am running on a fast dual core and the total time it takes to load the file i mentioned is about 1 sec. Now this is really bad. Imagine a site that has tens of thousands of visits a day. It's going to take minutes to load the log file. So what are my alternatives? If any, cause i think this is just a .net limitation and i can't do much about it.

    Read the article

  • Google App Engine - Caching generated HTML

    - by Alexander
    I have written a Google App Engine application that programatically generates a bunch of HTML code that is really the same output for each user who logs into my system, and I know that this is going to be in-efficient when the code goes into production. So, I am trying to figure out the best way to cache the generated pages. The most probable option is to generate the pages and write them into the database, and then check the time of the database put operation for a given page against the time that the code was last updated. Then, if the code is newer than the last put to the database (for a particular HTML request), new HTML will be generated and served, and cached to the database. If the code is older than the last put to the database, then I will just get the HTML direct from the database and serve it (therefore avoiding all the CPU wastage of generating the HTML). I am not only looking to minimize load times, but to minimize CPU usage. However, one issue that I am having is that I can't figure out how to programatically check when the version of code uploaded to the app engine was updated. I am open to any suggestions on this approach, or other approaches for caching generated html. Note that while memcache could help in this situation, I believe that it is not the final solution since I really only need to re-generate html when the code is updated (as opposed to every time the memcache expires). Kind Regards, and thank you in advance for any suggestions you may be able to offer. -Alex

    Read the article

  • What does flushing thread local memory to global memory mean?

    - by Jack Griffith
    Hi, I am aware that the purpose of volatile variables in Java is that writes to such variables are immediately visible to other threads. I am also aware that one of the effects of a synchronized block is to flush thread-local memory to global memory. I have never fully understood the references to 'thread-local' memory in this context. I understand that data which only exists on the stack is thread-local, but when talking about objects on the heap my understanding becomes hazy. I was hoping that to get comments on the following points: When executing on a machine with multiple processors, does flushing thread-local memory simply refer to the flushing of the CPU cache into RAM? When executing on a uniprocessor machine, does this mean anything at all? If it is possible for the heap to have the same variable at two different memory locations (each accessed by a different thread), under what circumstances would this arise? What implications does this have to garbage collection? How aggressively do VMs do this kind of thing? Overall, I think am trying to understand whether thread-local means memory that is physically accessible by only one CPU or if there is logical thread-local heap partitioning done by the VM? Any links to presentations or documentation would be immensely helpful. I have spent time researching this, and although I have found lots of nice literature, I haven't been able to satisfy my curiosity regarding the different situations & definitions of thread-local memory. Thanks very much.

    Read the article

  • How to clean sys.conversation_endpoints

    - by Manjoor
    I have a table, a trigger on the table implemented using service broker. More than Half million records are inserted daily into the table. The asynchronous SP is used to check sveral condition by using inserted data and update other tables. It was running fine for last 1 month and the SP was get executed withing 2-3 seconds of insertion of record. But now it take more than 90 minute. At present sys.conversation_endpoints have too much records. (Note that all the table are truncated daily as I do not need those records day after) Other database activities are normal (average 60% CPU Utilization). Now where i need to look?? I can re-create database without any problem but i don't think it is a good way to resolve the problem

    Read the article

  • Performance degrades for more than 2 threads on Xeon X5355

    - by zoolii
    Hi All, I am writing an application using boost threads and using boost barriers to synchronize the threads. I have two machines to test the application. Machine 1 is a core2 duo (T8300) cpu machine (windows XP professional - 4GB RAM) where I am getting following performance figures : Number of threads :1 , TPS :21 Number of threads :2 , TPS :35 (66 % improvement) further increase in number of threads decreases the TPS but that is understandable as the machine has only two cores. Machine 2 is a 2 quad core ( Xeon X5355) cpu machine (windows 2003 server with 4GB RAM) and has 8 effective cores. Number of threads :1 , TPS :21 Number of threads :2 , TPS :27 (28 % improvement) Number of threads :4 , TPS :25 Number of threads :8 , TPS :24 As you can see, performance is degrading after 2 threads (though it has 8 cores). If the program has some bottle neck , then for 2 thread also it should have degraded. Any idea? , Explanations ? , Does the OS has some role in performance ? - It seems like the Core2duo (2.4GHz) scales better than Xeon X5355 (2.66GHz) though it has better clock speed. Thank you -Zoolii

    Read the article

  • Powershell - Splitting variable into chunks

    - by Andrew
    I have written a query in Powershell interrogating a F5 BIG-IP box through it's iControl API to bring back CPU usage etc. Using this code (see below) I can return the data back into a CSV format which is fine. However the $csvdata variable contains all the data. I need to be able to take this variable and for each line split each column of data into a seperate variable. The output currently looks like this: timestamp,"Utilization" 1276181160,2.3282800000e+00 Any advice would be most welcome $SystemStats = (Get-F5.iControl).SystemStatistics ### Allocate a new Query Object and add the inputs needed $Query = New-Object -TypeName iControl.SystemStatisticsPerformanceStatisticQuery $Query.object_name = $i $Query.start_time = $startTime $Query.end_time = 0 $Query.interval = $interval $Query.maximum_rows = 0 ### Make method call passing in an array of size one with the specified query $ReportData = $SystemStats.get_performance_graph_csv_statistics( (,$Query) ) ### Allocate a new encoder and turn the byte array into a string $ASCII = New-Object -TypeName System.Text.ASCIIEncoding $csvdata = $ASCII.GetString($ReportData[0].statistic_data)

    Read the article

  • Java runs out of memory, even though I give it plenty!

    - by spitzanator
    Hey, folks. So, I'm running a java server (specifically Winstone: http://winstone.sourceforge.net/ ) Like this: java -server -Xmx12288M -jar /usr/share/java/winstone-0.9.10.jar --useSavedSessions=false --webappsDir=/var/servlets --commonLibFolder=/usr/share/java This has worked fine in the past, but now it needs to load a bunch more stuff into memory than it has before. The odd part is that, according to 'top', it has 15.0g of VIRT(ual memory) and it's RES(ident set) is 8.4g. Once it hits 8.4g, the CPU hangs at 100% (even though it's loading from disk), and eventually, I get Java's OutOfMemoryError. Presumably, the CPU hanging at 100% is Java doing garbage collection. So, my question is, what gives? I gave it 12 gigs of memory! And it's only using 8.2 gigs before it throws in the towel. What am I doing wrong? Oh, and I'm using java version "1.6.0_07" Java(TM) SE Runtime Environment (build 1.6.0_07-b06) Java HotSpot(TM) 64-Bit Server VM (build 10.0-b23, mixed mode) on Linux. Thanks, Matt

    Read the article

  • Thoughts on GoGrid vs EC2

    - by Jason
    I am currently hosting my SaaS application at GoGrid (Microsoft stack). Here's what I have: Database Server - physical box, 12 GB RAM, 2 X Quad Core CPU (2.13 GHz Xeon E5506) 2 Web / App servers - cloud servers, 2 GB RAM, 2 VCPUs 300 GB monthly bandwidth I am paying around $900 / month for this. My web / app servers are busting at the seams and need to be upgraded to 4 GB of RAM. I also need a firewall, and GoGrid just added this service for an additional $200. After the upgrade, I will be paying around $1,400. I started looking at Amazon EC2, specifically this config: Database server - "High Memory Double Extra Large Instance" - 34 GB RAM, 13 EC2 compute units 2 Web / App servers - "Large Instance" - 7.5 GB RAM, 4 EC2 compute units If I go with 1 year reserved instances, my upfront cost would be $4,500 and my monthly would be $700. This comes to $1,075 / month when amortized. Amazon also includes a firewall for free. Here are my questions: Do any of you have experience running a database (especially SQL Server) on an EC2 instance? How did it perform compared to a dedicated machine? One of my major concerns is with disk I/O. Amazon's description of a compute unit is fairly vague. Any ideas on how the CPU performance on the database servers would compare? I am hoping that the Amazon solution will provide significantly better performance than my current or even improved GoGrid setup. Having a virtual database server would also be nice in terms of availability. Right now I would be in serious trouble if I had any hardware issues. Thanks for any insight...

    Read the article

  • How to improve performance

    - by Ram
    Hi, In one of mine applications I am dealing with graphics objects. I am using open source GPC library to clip/merge two shapes. To improve accuracy I am sampling (adding multiple points between two edges) existing shapes. But before displaying back the merged shape I need to remove all the points between two edges. But I am not able to find an efficient algorithm that will remove all points between two edges which has same slope with minimum CPU utilization. Currently all points are of type PointF Any pointer on this will be a great help.

    Read the article

  • scala REPL is slow on vista

    - by Jacques René Mesrine
    I installed scala-2.8.0.RC3 by extracting the tgz file into my cygwin (vista) home directory. I made sure to set $PATH to scala-2.8.0.RC3/bin. I start the REPL by typing: $ scala Welcome to Scala version 2.8.0.RC3 (Java HotSpot(TM) Client VM, Java 1.6.0_20). Type in expressions to have them evaluated. Type :help for more information. scala> Now when I tried to enter an expression scala> 1 + 'a' the cursor hangs there without any response. Granted that I have chrome open with a million tabs and VLC playing in the background, but CPU utilization was 12% and virtual memory was about 75% utilized. What's going on ? Do I have to set the CLASSPATH or perform other steps.

    Read the article

  • How to know if a device can be disabled or not?

    - by user326498
    I use the following code to enable/disable a device installed on my computer: SP_PROPCHANGE_PARAMS params; memset(&params, 0, sizeof(params)); devParams.cbSize = sizeof(devParams); params.ClassInstallHeader.cbSize = sizeof(params.ClassInstallHeader); params.ClassInstallHeader.InstallFunction = DIF_PROPERTYCHANGE; params.Scope = DICS_FLAG_GLOBAL; params.StateChange = DICS_DISABLE ; params.HwProfile = 0; // current profile if(!SetupDiSetClassInstallParams(m_hDev, &m_hDevInfo,&params.ClassInstallHeader,sizeof(SP_PROPCHANGE_PARAMS))) { dwErr = GetLastError(); return FALSE; } if(!SetupDiCallClassInstaller(DIF_PROPERTYCHANGE,m_hDev,&m_hDevInfo)) { dwErr = GetLastError(); return FALSE; } return TRUE; This code works perfectly only for those devices that can also be disabled by using Windows Device Manager, and won't work for some un-disabled devices such as my cpu device: Intel(R) Pentium(R) Dual CPU E2160 @ 1.80GHz. So the problem is how to determine if a device can be disabled or not programmatically? Is there any API to realize this goal? Thank you!

    Read the article

  • Reading same file from multiple threads in C#

    - by Gustavo Rubio
    Hi. I was googling for some advise about this and I found some links. The most obvious was this one but in the end what im wondering is how well my code is implemented. I have basically two classes. One is the Converter and the other is ConverterThread I create an instance of this Converter class that has a property ThreadNumber that tells me how many threads should be run at the same time (this is read from user) since this application will be used on multi-cpu systems (physically, like 8 cpu) so it is suppossed that this will speed up the import The Converter instance reads a file that can range from 100mb to 800mb and each line of this file is a tab-delimitted value record that is imported to another destination like a database. The ConverterThread class simply runs inside the thread (new Thread(ConverterThread.StartThread)) and has event notification so when its work is done it can notify the Converter class and then I can sum up the progress for all these threads and notify the user (in the GUI for example) about how many of these records have been imported and how many bytes have been read. It seems, however that I'm having some trouble because I get random errors about the file not being able to be read or that the sum of the progress (percentage) went above 100% which is not possible and I think that happens because threads are not being well managed and probably the information returned by the event is malformed (since it "travels" from one thread to another) Do you have any advise on better practices of implementation of threads so I can accomplish this? Thanks in advance.

    Read the article

  • Parsing data with Clojure, interval problem.

    - by Andrea Di Persio
    Hello! I'm writing a little parser in clojure for learning purpose. basically is a TSV file parser that need to be put in a database, but I added a complication. The complication itself is that in the same file there are more intervals. The file look like this: ###andreadipersio 2010-03-19 16:10:00### USER COMM PID PPID %CPU %MEM TIME root launchd 1 0 0.0 0.0 2:46.97 root DirectoryService 11 1 0.0 0.2 0:34.59 root notifyd 12 1 0.0 0.0 0:20.83 root diskarbitrationd 13 1 0.0 0.0 0:02.84` .... ###andreadipersio 2010-03-19 16:20:00### USER COMM PID PPID %CPU %MEM TIME root launchd 1 0 0.0 0.0 2:46.97 root DirectoryService 11 1 0.0 0.2 0:34.59 root notifyd 12 1 0.0 0.0 0:20.83 root diskarbitrationd 13 1 0.0 0.0 0:02.84 I ended up with this code: (defn is-header? "Return true if a line is header" [line] (> (count (re-find #"^\#{3}" line)) 0)) (defn extract-fields "Return regex matches" [line pattern] (rest (re-find pattern line))) (defn process-lines [lines] (map process-line lines)) (defn process-line [line] (if (is-header? line) (extract-fields line header-pattern)) (extract-fields line data-pattern)) My idea is that in 'process-line' interval need to be merged with data so I have something like this: ('andreadipersio', '2010-03-19', '16:10:00', 'root', 'launchd', 1, 0, 0.0, 0.0, '2:46.97') for every row till the next interval, but I can't figure how to make this happen. I tried with something like this: (def process-line [line] (if is-header? line) (def header-data (extract-fields line header-pattern))) (cons header-data (extract-fields line data-pattern))) But this doesn't work as excepted. Any hints? Thanks!

    Read the article

  • Am I a discoverer of a bug in the WPF engine?

    - by bitbonk
    We have a MFC 8 application compiled with /CLR that contains a larger amount of Windows Forms UserControls wich again contain WPF user controls using ElementHost. Due to the architecture of our software we can not use HwndHost directly. We observed an extremely strange behavior here that we can not make any sense of: When the CPU load is very high during startup of the application and there are a lot live of ElementHost instances, the whole property engine completely stops working. For example animations that usually just work fine now never update the values of the bound properties, they just stay at some random value after startup. When I set a property that is not bound to anything the value is correctly stored in the dependency property (calling the getter returns the new value) but the visual representation never reflects that. I set the background to red but the background color does not change. We tested this on a lot of different machines all running Windows XP SP2 and it is pretty reproducible. The funny thing here is, that there is in fact one situation where the bound properties actually pickup a new value from the animation and the visual gets updated based on the property values. It is when I resize the ElementHost or when I hide and reshow the parent native control. As soon as I do this, properties that are bound to an animation pickup a new value and the visuals rerender based on the new property values - but just once - if I want to see another update I have to resize the ElementHost. Do you have any explanation of what could be happening here or how I could approach this problem to find it out? What can I do to debug this? Is there a way I can get more information about what WPF actually does or where WPF might have crashed? To me it currently seems like a bug in WPF itself since it only happens at high CPU load at startup.

    Read the article

  • Update SQL Server 2000 to SQL Server 2008: Benefits please?

    - by Ciaran Archer
    Hi there I'm looking for the benefits of upgrading from SQL Server 2000 to 2008. I was wondering: What database features can we leverage with 2008 that we can't now? What new TSQL features can we look forward to using? What performance benefits can we expect to see? What else will make management go for it? And the converse: What problems can we expect to encounter? What other problems have people found when migrating? Why fix something that isn't (technically) broken? We work in a Java shop, so any .NET / CLR stuff won't rock our world. We also use Eclipse as our main development so any integration with Visual Studio won't be a plus. We do use SQL Server Management Studio however. Some background: Our main database machine is a 32bit Dell Intel Xeon MP CPU 2.0GHz, 40MB of RAM with Physical Address Extension running Windows Server 2003 Enterprise Edition. We will not be changing our hardware. Our databases in total are under a TB with some having more than 200 tables. But they are busy and during busy times we see 60-80% CPU utilisation. Apart form the fact that SQL Server 2000 is coming close to end of life, why should we upgrade? Any and all contributions are appreciated!

    Read the article

  • Unicorn: Which number of worker processes to use?

    - by blackbird07
    I am running a Ruby on Rails app on a virtual Linux server that is capped at 1GB RAM. Currently, I am constantly hitting the limit and would like to optimize memory utilization. One option I am looking at is reducing the number of unicorn workers. So what is the best way to determine the number of unicorn workers to use? The current setting is 10 workers, but the maximum number of requests per second I have seen on Google Analytics Real-Time is 3 (only scored once at a peak time; in 99% of the time not going above 1 request per second). So is it a save assumption that I can - for now - go with 4 workers, leaving room for unexpected amounts of requests? What are the metrics I should have a look at for determining the number of workers and what are the tools I can use for that on my Ubuntu machine?

    Read the article

  • AnyCPU/x86/x64 for C# application and it's C++/CLI dependency

    - by Soonts
    I'm Windows developer, I'm using Microsoft visual studio 2008 SP1. My developer machine is 64 bit. The software I'm currently working on is managed .exe written in C#. Unfortunately, I was unable to solve the whole problem solely in C#. That's why I also developed a small managed DLL in C++/CLI. Both projects are in the same solution. My C# .exe build target is "Any CPU". When my C++ DLL build target is "x86", the DLL is not loaded. As far as I understood when I googled, the reason is C++/CLI language, unlike other .NET languages, compiles to the native code, not managed code. I switched the C++ DLL build target to x64, and everything works now. However, AFAIK everything will stop working as soon as my client will install my product on a 32-bit OS. I have to support Windows Vista and 7, both 32 and 64 bit versions of each of them. I don't want to fall back to 32 bits. That 250 lines of C++ code in my DLL is only 2% of my codebase. And that DLL is only used in several places, so in the typical usage scenario it's not even loaded. My DLL implements two COM objects with ATL, so I can't use "/clr:safe" project setting. Is there way to configure the solution and the projects so that C# project builds "Any CPU" version, the C++ project builds both 32 bit and 64 bit versions, then in the runtime when the managed .EXE is starting up, it uses either 32-bit DLL or 64-bit DLL depending on the OS? Or maybe there's some better solution I'm not aware of? Thanks in advance!

    Read the article

  • Have threads run indefinitely in a java application

    - by TP
    I am trying to program a game in which I have a Table class and each person sitting at the table is a separate thread. The game involves the people passing tokens around and then stopping when the party chime sounds. how do i program the run() method so that once I start the person threads, they do not die and are alive until the end of the game One solution that I tried was having a while (true) {} loop in the run() method but that increases my CPU utilization to around 60-70 percent. Is there a better method?

    Read the article

  • Custom Controls Properties - C# , Forms - :(

    - by user353600
    Hi I m adding custom control to my flowlayoutpanel , its a sort of forex data , refresh every second , so on each timer tick , i m adding a control , changing controls button text , then adding it to flowlayout panel , i m doing it at each 100ms timer tick , it takeing tooo much CPU , here is my custom Control . public partial class UserControl1 : UserControl { public UserControl1() { InitializeComponent(); } private void UserControl1_Load(object sender, EventArgs e) { } public void displaydata(string name , string back3price , string back3 , string back2price , string back2 , string back1price , string back1 , string lay3price , string lay3 , string lay2price , string lay2 , string lay1price , string lay1 ) { lblrunnerName.Text = name.ToString(); btnback3.Text = back3.ToString() + "\n" + back3price.ToString(); btnback2.Text = back2.ToString() + "\n" + back2price.ToString(); btnback1.Text = back1.ToString() + "\n" + back1price.ToString(); btnlay1.Text = lay1.ToString() + "\n" + lay1price.ToString(); btnlay2.Text = lay2.ToString() + "\n" + lay2price.ToString(); btnlay3.Text = lay3.ToString() + "\n" + lay3price.ToString(); } and here is how i m adding control; private void timer1_Tick(object sender, EventArgs e) { localhost.marketData[] md; md = ser.getM1(); flowLayoutPanel1.Controls.Clear(); foreach (localhost.marketData item in md) { UserControl1 ur = new UserControl1(); ur.Name = item.runnerName + item.runnerID; ur.displaydata(item.runnerName, item.back3price, item.back3, item.back2price, item.back2, item.back1price, item.back1, item.lay3price, item.lay3, item.lay2price, item.lay2, item.lay1price, item.lay1); flowLayoutPanel1.SuspendLayout(); flowLayoutPanel1.Controls.Add(ur); flowLayoutPanel1.ResumeLayout(); } } now its happing on 10 times on each send , taking 60% of my Core2Duo cpu . is there any other way , i can just add contols first time , and then change the text of cutom controls buttons on runtime on each refresh or timer tick i m using c# .Net

    Read the article

  • Is there a reason why SSIS significantly slows down after a few minutes?

    - by Mark
    I'm running a fairly substantial SSIS package against SQL 2008 - and I'm getting the same results both in my dev environment (Win7-x64 + SQL-x64-Developer) and the production environment (Server 2008 x64 + SQL Std x64). The symptom is that initial data loading screams at between 50K - 500K records per second, but after a few minutes the speed drops off dramatically and eventually crawls embarrasingly slowly. The database is in Simple recovery model, the target tables are empty, and all of the prerequisites for minimally logged bulk inserts are being met. The data flow is a simple load from a RAW input file to a schema-matched table (i.e. no complex transforms of data, no sorting, no lookups, no SCDs, etc.) The problem has the following qualities and resiliences: Problem persists no matter what the target table is. RAM usage is lowish (45%) - there's plenty of spare RAM available for SSIS buffers or SQL Server to use. Perfmon shows buffers are not spooling, disk response times are normal, disk availability is high. CPU usage is low (hovers around 25% shared between sqlserver.exe and DtsDebugHost.exe) Disk activity primarily on TempDB.mdf, but I/O is very low (< 600 Kb/s) OLE DB destination and SQL Server Destination both exhibit this problem. To sum it up, I expect either disk, CPU or RAM to be exhausted before the package slows down, but instead its as if the SSIS package is taking an afternoon nap. SQL server remains responsive to other queries, and I can't find any performance counters or logged events that betray the cause of the problem. I'll gratefully reward any reasonable answers / suggestions.

    Read the article

  • Efficiently remove points with same slope

    - by Ram
    Hi, In one of mine applications I am dealing with graphics objects. I am using open source GPC library to clip/merge two shapes. To improve accuracy I am sampling (adding multiple points between two edges) existing shapes. But before displaying back the merged shape I need to remove all the points between two edges. But I am not able to find an efficient algorithm that will remove all points between two edges which has same slope with minimum CPU utilization. Currently all points are of type PointF Any pointer on this will be a great help.

    Read the article

  • Boost threading/mutexs, why does this work?

    - by Flamewires
    Code: #include <iostream> #include "stdafx.h" #include <boost/thread.hpp> #include <boost/thread/mutex.hpp> using namespace std; boost::mutex mut; double results[10]; void doubler(int x) { //boost::mutex::scoped_lock lck(mut); results[x] = x*2; } int _tmain(int argc, _TCHAR* argv[]) { boost::thread_group thds; for (int x = 10; x>0; x--) { boost::thread *Thread = new boost::thread(&doubler, x); thds.add_thread(Thread); } thds.join_all(); for (int x = 0; x<10; x++) { cout << results[x] << endl; } return 0; } Output: 0 2 4 6 8 10 12 14 16 18 Press any key to continue . . . So...my question is why does this work(as far as i can tell, i ran it about 20 times), producing the above output, even with the locking commented out? I thought the general idea was: in each thread: calculate 2*x copy results to CPU register(s) store calculation in correct part of array copy results back to main(shared) memory I would think that under all but perfect conditions this would result in some part of the results array having 0 values. Is it only copying the required double of the array to a cpu register? Or is it just too short of a calculation to get preempted before it writes the result back to ram? Thanks.

    Read the article

  • What's the most trivial function that would benfit from being computed on a GPU?

    - by hanDerPeder
    Hi. I'm just starting out learning OpenCL. I'm trying to get a feel for what performance gains to expect when moving functions/algorithms to the GPU. The most basic kernel given in most tutorials is a kernel that takes two arrays of numbers and sums the value at the corresponding indexes and adds them to a third array, like so: __kernel void add(__global float *a, __global float *b, __global float *answer) { int gid = get_global_id(0); answer[gid] = a[gid] + b[gid]; } __kernel void sub(__global float* n, __global float* answer) { int gid = get_global_id(0); answer[gid] = n[gid] - 2; } __kernel void ranksort(__global const float *a, __global float *answer) { int gid = get_global_id(0); int gSize = get_global_size(0); int x = 0; for(int i = 0; i < gSize; i++){ if(a[gid] > a[i]) x++; } answer[x] = a[gid]; } I am assuming that you could never justify computing this on the GPU, the memory transfer would out weight the time it would take computing this on the CPU by magnitudes (I might be wrong about this, hence this question). What I am wondering is what would be the most trivial example where you would expect significant speedup when using a OpenCL kernel instead of the CPU?

    Read the article

  • What OpenGL functions are not GPU accelerated?

    - by Xavier Ho
    I was shocked when I read this (from the OpenGL wiki): glTranslate, glRotate, glScale Are these hardware accelerated? No, there are no known GPUs that execute this. The driver computes the matrix on the CPU and uploads it to the GPU. All the other matrix operations are done on the CPU as well : glPushMatrix, glPopMatrix, glLoadIdentity, glFrustum, glOrtho. This is the reason why these functions are considered deprecated in GL 3.0. You should have your own math library, build your own matrix, upload your matrix to the shader. For a very, very long time I thought most of the OpenGL functions use the GPU to do computation. I'm not sure if this is a common misconception, but after a while of thinking, this makes sense. Old OpenGL functions (2.x and older) are really not suitable for real-world applications, due to too many state switches. This makes me realise that, possibly, many OpenGL functions do not use the GPU at all. So, the question is: Which OpenGL function calls don't use the GPU? I believe knowing the answer to the above question would help me become a better programmer with OpenGL. Please do share some of your insights.

    Read the article

  • What about parallelism across network using multiple PCs?

    - by MainMa
    Parallel computing is used more and more, and new framework features and shortcuts make it easier to use (for example Parallel extensions which are directly available in .NET 4). Now what about the parallelism across network? I mean, an abstraction of everything related to communications, creation of processes on remote machines, etc. Something like, in C#: NetworkParallel.ForEach(myEnumerable, () => { // Computing and/or access to web ressource or local network database here }); I understand that it is very different from the multi-core parallelism. The two most obvious differences would probably be: The fact that such parallel task will be limited to computing, without being able for example to use files stored locally (but why not a database?), or even to use local variables, because it would be rather two distinct applications than two threads of the same application, The very specific implementation, requiring not just a separate thread (which is quite easy), but spanning a process on different machines, then communicating with them over local network. Despite those differences, such parallelism is quite possible, even without speaking about distributed architecture. Do you think it will be implemented in a few years? Do you agree that it enables developers to easily develop extremely powerfull stuff with much less pain? Example: Think about a business application which extracts data from the database, transforms it, and displays statistics. Let's say this application takes ten seconds to load data, twenty seconds to transform data and ten seconds to build charts on a single machine in a company, using all the CPU, whereas ten other machines are used at 5% of CPU most of the time. In a such case, every action may be done in parallel, resulting in probably six to ten seconds for overall process instead of forty.

    Read the article

< Previous Page | 98 99 100 101 102 103 104 105 106 107 108 109  | Next Page >