Search Results

Search found 13461 results on 539 pages for 'optimizing performance'.

Page 146/539 | < Previous Page | 142 143 144 145 146 147 148 149 150 151 152 153  | Next Page >

  • How to improve performance of non-scalar aggregations on denormalized tables

    - by The Lazy DBA
    Suppose we have a denormalized table with about 80 columns, and grows at the rate of ~10 million rows (about 5GB) per month. We currently have 3 1/2 years of data (~400M rows, ~200GB). We create a clustered index to best suit retrieving data from the table on the following columns that serve as our primary key... [FileDate] ASC, [Region] ASC, [KeyValue1] ASC, [KeyValue2] ASC ... because when we query the table, we always have the entire primary key. So these queries always result in clustered index seeks and are therefore very fast, and fragmentation is kept to a minimum. However, we do have a situation where we want to get the most recent FileDate for every Region, typically for reports, i.e. SELECT [Region] , MAX([FileDate]) AS [FileDate] FROM HugeTable GROUP BY [Region] The "best" solution I can come up to this is to create a non-clustered index on Region. Although it means an additional insert on the table during loads, the hit isn't minimal (we load 4 times per day, so fewer than 100,000 additional index inserts per load). Since the table is also partitioned by FileDate, results to our query come back quickly enough (200ms or so), and that result set is cached until the next load. However I'm guessing that someone with more data warehousing experience might have a solution that's more optimal, as this, for some reason, doesn't "feel right".

    Read the article

  • NSURLConnection on simulator and iphone performance issues

    - by Nava Carmon
    I'm experiencing a weird problem and i wonder if anybody else has noticed this: I'm using NSURLConnection as it appears in apple's examples to get xml files from a certain server - pretty straight forward. And most of time it works, but sometimes it's just stuck after initialing and don't get into connection's delegate methods. I'm working with WiFi & 3G and the same server all the time. When it comes to didFailWithError i see that mostly it was a timeout error. When I enter same link in Safari it takes a second to bring data. And after another trial I can access the link. What might be the reason for such a weird behavior? How can I improve it? What is the role of cache policy with NSURLConnection? Thanks, Nava

    Read the article

  • Very slow remote page load performance in QtWebKit (Windows)

    - by Gil
    I get very slow load times on QtWebKit on Windows 7, through ADSL. I am using the Qt Demo Browser, on a Core2 Quad, 64 bit Windows 7, 4GB ram, 2gb processor. through a VPN. Simplest example: google search page takes ~18 seconds to load, compared to 2.5 on Chrome (cash cleared). On larger pages, with scripts etc. it is worse. I tried Qt 4.6 and also the Qt 4.7 beta, but don't see any difference. I see the same results with Arora browser. Are there any settings, or patches that can be applied to fix this? Thanks

    Read the article

  • boost multi_index_container and erase performance

    - by rjoshi
    I have a boost multi_index_container declared as below which is indexed by hash_unique id(unsigned long) and hash_non_unique transaction id(long). Insertion and retrieval of elements is fast but when I delete elements, it is much slower. I was expecting it to be constant time as key is hashed. e.g To erase elements from container for 10,000 elements, it takes around 2.53927016 seconds for 15,000 elements, it takes around 7.137100068 seconds for 20,000 elements, it takes around 21.391720757 seconds Is it something I am missing or is it expected behavior? class Session { public: Session() { //increment unique id static unsigned long counter = 0; boost::mutex::scoped_lock guard(mx); counter++; m_nId = counter; } unsigned long GetId() { return m_nId; } long GetTransactionHandle(){ return m_nTransactionHandle; } .... private: unsigned long m_nId; long m_nTransactionHandle; boost::mutext mx; .... }; typedef multi_index_container< Session*, indexed_by< hashed_unique< mem_fun<Session,unsigned long,&Session::GetId> >, hashed_non_unique< mem_fun<Session,unsigned long,&Session::GetTransactionHandle> > > //end indexed_by > SessionContainer; typedef SessionContainer::nth_index<0>::type SessionById; int main() { ... SessionContainer container; SessionById *pSessionIdView = &get<0>(container); unsigned counter = atoi(argv[1]); vector<Session*> vSes(counter); //insert for(unsigned i = 0; i < counter; i++) { Session *pSes = new Session(); container.insert(pSes); vSes.push_back(pSes); } timespec ts; lock_settime(CLOCK_PROCESS_CPUTIME_ID, &ts); //erase for(unsigned i = 0; i < counter; i++) { pSessionIdView->erase(vSes[i]->getId()); delete vSes[i]; } lock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts); std::cout << "Total time taken for erase:" << ts.tv_sec << "." << ts.tv_nsec << "\n"; return (EXIST_SUCCESS); }

    Read the article

  • IE performance issues with offsetHeight and offsetWidth

    - by Paul
    I have a site that grabs the response text from an AJAX call and does 'innerHTML' on a div that is going to contain it. After I do the 'innerHTML' I process the DIV by traversing the whole hierarchy of nodes and grabbing their [offsetWidth/offsetHeight] to do some operations with it. Why not css style width/height? because sometimes those values are not available since I don't control what is coming from the AJAX response, plus I want the real box dimensions including borders/scrolls/padding. On large injections (let's say 7,000 new DOM elements) IE takes way longer time than FF/Safari just to get this [offsetWidth/offsetHeight], actually if I wasn't doing injection but just render the contents of the HTML in the browser and processing it, it would be much faster. But that is not an option since I have to inject it on a div that will contain it. Anybody has deal with this kind of issue before? is there an alternative to innerHTML, I have try using documentFragment to inject and process and the move it to the div and still I don't see much gain. How can I get the values that are available with [offsetWidth/offsetHeight]? Thanks a bunch for any suggestions. Paul

    Read the article

  • Is I/O Completion ports(Windows) or Asynchronous I/O (AIO) will improve performance of multithreaded

    - by Naga
    Hi, I want to use I/O Completion ports for Windows and Asynchronous I/O (AIO) for solaris and Linux versions of my server application. The application server is multithreaded and it can accept lot of concurrent TCP connections and can process many requests per conenction. Every request will be handled by seperate detached thread. Is this criteria well enough to use the latest AIO?. Is there any standardization using which one code can be used to all platforms. Thanks, Naga

    Read the article

  • Unexpected performance curve from CPython merge sort

    - by vkazanov
    I have implemented a naive merge sorting algorithm in Python. Algorithm and test code is below: import time import random import matplotlib.pyplot as plt import math from collections import deque def sort(unsorted): if len(unsorted) <= 1: return unsorted to_merge = deque(deque([elem]) for elem in unsorted) while len(to_merge) > 1: left = to_merge.popleft() right = to_merge.popleft() to_merge.append(merge(left, right)) return to_merge.pop() def merge(left, right): result = deque() while left or right: if left and right: elem = left.popleft() if left[0] > right[0] else right.popleft() elif not left and right: elem = right.popleft() elif not right and left: elem = left.popleft() result.append(elem) return result LOOP_COUNT = 100 START_N = 1 END_N = 1000 def test(fun, test_data): start = time.clock() for _ in xrange(LOOP_COUNT): fun(test_data) return time.clock() - start def run_test(): timings, elem_nums = [], [] test_data = random.sample(xrange(100000), END_N) for i in xrange(START_N, END_N): loop_test_data = test_data[:i] elapsed = test(sort, loop_test_data) timings.append(elapsed) elem_nums.append(len(loop_test_data)) print "%f s --- %d elems" % (elapsed, len(loop_test_data)) plt.plot(elem_nums, timings) plt.show() run_test() As much as I can see everything is OK and I should get a nice N*logN curve as a result. But the picture differs a bit: Things I've tried to investigate the issue: PyPy. The curve is ok. Disabled the GC using the gc module. Wrong guess. Debug output showed that it doesn't even run until the end of the test. Memory profiling using meliae - nothing special or suspicious. ` I had another implementation (a recursive one using the same merge function), it acts the similar way. The more full test cycles I create - the more "jumps" there are in the curve. So how can this behaviour be explained and - hopefully - fixed? UPD: changed lists to collections.deque UPD2: added the full test code UPD3: I use Python 2.7.1 on a Ubuntu 11.04 OS, using a quad-core 2Hz notebook. I tried to turn of most of all other processes: the number of spikes went down but at least one of them was still there.

    Read the article

  • SQL Server insert performance

    - by Jose
    I have an insert query that gets generated like this INSERT INTO InvoiceDetail (LegacyId,InvoiceId,DetailTypeId,Fee,FeeTax,Investigatorid,SalespersonId,CreateDate,CreatedById,IsChargeBack,Expense,RepoAgentId,PayeeName,ExpensePaymentId,AdjustDetailId) VALUES(1,1,2,1500.0000,0.0000,163,1002,'11/30/2001 12:00:00 AM',1116,0,550.0000,850,NULL,@ExpensePay1,NULL); DECLARE @InvDetail1 INT; SET @InvDetail1 = (SELECT @@IDENTITY); This query is generated for only 110K rows. It takes 30 minutes for all of these query's to execute I checked the query plan and the largest % nodes are A Clustered Index Insert at 57% query cost which has a long xml that I don't want to post. A Table Spool which is 38% query cost <RelOp AvgRowSize="35" EstimateCPU="5.01038E-05" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="1" LogicalOp="Eager Spool" NodeId="80" Parallel="false" PhysicalOp="Table Spool" EstimatedTotalSubtreeCost="0.0466109"> <OutputList> <ColumnReference Database="[SkipPro]" Schema="[dbo]" Table="[InvoiceDetail]" Column="InvoiceId" /> <ColumnReference Database="[SkipPro]" Schema="[dbo]" Table="[InvoiceDetail]" Column="InvestigatorId" /> <ColumnReference Column="Expr1054" /> <ColumnReference Column="Expr1055" /> </OutputList> <Spool PrimaryNodeId="3" /> </RelOp> So my question is what is there that I can do to improve the speed of this thing? I already run ALTER TABLE TABLENAME NOCHECK CONSTRAINTS ALL Before the queries and then ALTER TABLE TABLENAME NOCHECK CONSTRAINTS ALL after the queries. And that didn't shave off hardly anything off of the time. Know I am running these queries in a .NET application that uses a SqlCommand object to send the query. I then tried to output the sql commands to a file and then execute it using sqlcmd, but I wasn't getting any updates on how it was doing, so I gave up on that. Any ideas or hints or help?

    Read the article

  • FLV performance and garbage collection

    - by justinbach
    I'm building a large flash site (AS3) that uses huge FLVs as transition videos from section to section. The FLVs are 1280x800 and are being scaled to 1680x1050 (much of which is not displayed to users with smaller screens), and are around 5-8 seconds apiece. I'm encoding the videos using On2's hi-def codec, VP6-S, and playback is pretty good with native FLV players, Perian-equipped Quicktime, and simple proof-of-concept FLV playback apps built in AS3. The problem I'm having is that in the context of the actual site, playback isn't as smooth; the framerate isn't quite as good as it should be, and more problematically, there's occasional jerkiness and dropped frames (sometimes pausing the video for as long as a quarter of a second or so). My guess is that this is being caused by garbage collection in the Flash player, which happens nondeterministically and is therefore hard to test and control for. I'm using a single instance of FLVPlayback to play the videos; I originally was using NetStream objects and so forth directly but switched to FLVPlayback for this reason. Has anyone experienced this sort of jerkiness with FLVPlayback (or more generally, with hi-def Flash video)? Am I right about GC being the culprit here, and if so, is there any way to prevent it during playback of these system-intensive transitions?

    Read the article

  • MYSQL: Simplify this Query for better performance

    - by Treby
    How can i simplify this code. coz this uses subquerying SELECT ub.id_product as c_pid,DATE(ub.datetime_prchs)AS datePurchased,cb.bookname, (SELECT GROUP_CONCAT(c.userid ORDER BY c.userid ASC SEPARATOR ', ') FROM user_books ub INNER JOIN campus_bookinfo cb ON ub.id_product=cb.idx_campus_bookinfo LEFT JOIN customer c ON ub.id_customer=c.id_customer WHERE ub.id_product = c_pid )as buyer, cb.iAmount FROM user_books ub INNER JOIN campus_bookinfo cb ON ub.id_product=cb.idx_campus_bookinfo LEFT JOIN customer c ON ub.id_customer=c.id_customer WHERE ub.id_customer = 29 GROUP BY bookname ORDER BY ub.datetime_prchs I need a better code for the same output.. Thanks in advance

    Read the article

  • Do Blob properties on entities affect query performance?

    - by Jaroslav Záruba
    Hello I'm trying to make my mind on whether to store a binary representation of an entity as its Blob property, or whether I better keep the blobs in some separate 'wrapping' class. Possible impact on memory heap and/or a query execution time are my concerns in the first case, complexity votes against the other one. I know Blobs are not indexed, i.e. index size is not what I'm worrying about. Also I assume for blobs Datastore puts defaultFetchGroup to false, but does it mean that blobs don't make a difference in queries? Regards J. Záruba

    Read the article

  • Important question about linq to SQL performance on high loaded web applications

    - by Alex
    I started working with linq to SQL several weeks ago. I got really tired of working with SQL server directly through the SQL queries (sqldatareader, sqlcommand and all this good stuff).  After hearing about linq to SQL and mvc I quickly moved all my projects to these technologies. I expected linq to SQL work slower but it suprisongly turned out to be pretty fast, primarily because I always forgot to close my connections when using datareaders. Now I don't have to worry about it. But there's one problem that really bothers me. There's one page that's requested thousands of times a day. The system gets data in the beginning, works with it and updates it. Primarily the updates are ++ @ -- (increase and decrease values). I used to do it like this UPDATE table SET value=value+1 WHERE ID=@I'd It worked with no problems obviously. But with linq to SQL the data is taken in the beginning, moved to the class, changed and then saved. Stats.registeredusers++; Db.submitchanges(); Let's say there were 100 000 users. Linq will say "let it be 100 001" instead of "let it be increased by 1". But if there value of users has already been increased (that happens in my site all the time) then linq will be like oops, this value is already 100 001. Whatever I'll throw an exception" You can change this behavior so that it won't throw an exception but it still will not set the value to 100 002. Like I said, it happened with me all the time. The stas value was increased twice a second on average. I simply had to rewrite this chunk of code with classic ado net. So my question is how can you solve the problem with linq

    Read the article

  • Adobe Livecycle performance

    - by Fabio
    I have installed Adobe Livecycle in order to convert MSWORD files to PDF from a web-app. Specifically, I use the DocConverter tool. Previously I have used OpenOffice UNO SDK, but I have found some problems with particular documents. Now, the conversion is ok, but the conversion time is huge. These are the times to convert documents of different sizes via Openoffice and via Livecycle. Could you suggest anything? SIZE (bytes) Openoffice (sec) Adobe LiveCycle (sec) 24064 1 8 50688 0 3 100864 0 3 253952 0 5 509440 1 5 1017856 5 18 2098688 8 10 4042240 19 45 6281216 0 9 8212480 32 125

    Read the article

  • jquery performance of binding multiple click events.

    - by DA
    I have a situation where I need to bind a click event to an object multiple times. For instance: for(i=0;i<=100;i++){ $myObject.click(function(){ window.location = "myurl"+i+".html"; }) ...do other stuff... } Via that markup, does $myObject end up with 100 click events attached to it? Should I be unbinding the click event first each time? for(i=0;i<=100;i++){ $myObject.unbind('click').click(function(){ window.location = "myurl"+i+".html"; }) ...do other stuff... }

    Read the article

  • SSL certificate performance issue

    - by sparagi
    There are some cheaper SSL certificates out there. Would a certificate from Verisign perform better/faster than a certificate from a discount provider? My gut is telling me that it does not make a difference b/c ultimately the certificate is installed on the server.

    Read the article

  • .Net 3.5 Asynchronous Socket Server Performance Problem

    - by iBrAaAa
    I'm developing an Asynchronous Game Server using .Net Socket Asynchronous Model( BeginAccept/EndAccept...etc.) The problem I'm facing is described like that: When I have only one client connected, the server response time is very fast but once a second client connects, the server response time increases too much. I've measured the time from a client sends a message to the server until it gets the reply in both cases. I found that the average time in case of one client is about 17ms and in case of 2 clients about 280ms!!! What I really see is that: When 2 clients are connected and only one of them is moving(i.e. requesting service from the server) it is equivalently equal to the case when only one client is connected(i.e. fast response). However, when the 2 clients move at the same time(i.e. requests service from the server at the same time) their motion becomes very slow (as if the server replies each one of them in order i.e. not simultaneously). Basically, what I am doing is that: When a client requests a permission for motion from the server and the server grants him the request, the server then broadcasts the new position of the client to all the players. So if two clients are moving in the same time, the server is eventually trying to broadcast to both clients the new position of each of them at the same time. EX: Client1 asks to go to position (2,2) Client2 asks to go to position (5,5) Server sends to each of Client1 & Client2 the same two messages: message1: "Client1 at (2,2)" message2: "Client2 at (5,5)" I believe that the problem comes from the fact that Socket class is thread safe according MSDN documentation http://msdn.microsoft.com/en-us/library/system.net.sockets.socket.aspx. (NOT SURE THAT IT IS THE PROBLEM) Below is the code for the server: /// /// This class is responsible for handling packet receiving and sending /// public class NetworkManager { /// /// An integer to hold the server port number to be used for the connections. Its default value is 5000. /// private readonly int port = 5000; /// /// hashtable contain all the clients connected to the server. /// key: player Id /// value: socket /// private readonly Hashtable connectedClients = new Hashtable(); /// /// An event to hold the thread to wait for a new client /// private readonly ManualResetEvent resetEvent = new ManualResetEvent(false); /// /// keeps track of the number of the connected clients /// private int clientCount; /// /// The socket of the server at which the clients connect /// private readonly Socket mainSocket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp); /// /// The socket exception that informs that a client is disconnected /// private const int ClientDisconnectedErrorCode = 10054; /// /// The only instance of this class. /// private static readonly NetworkManager networkManagerInstance = new NetworkManager(); /// /// A delegate for the new client connected event. /// /// the sender object /// the event args public delegate void NewClientConnected(Object sender, SystemEventArgs e); /// /// A delegate for the position update message reception. /// /// the sender object /// the event args public delegate void PositionUpdateMessageRecieved(Object sender, PositionUpdateEventArgs e); /// /// The event which fires when a client sends a position message /// public PositionUpdateMessageRecieved PositionUpdateMessageEvent { get; set; } /// /// keeps track of the number of the connected clients /// public int ClientCount { get { return clientCount; } } /// /// A getter for this class instance. /// /// only instance. public static NetworkManager NetworkManagerInstance { get { return networkManagerInstance; } } private NetworkManager() {} /// Starts the game server and holds this thread alive /// public void StartServer() { //Bind the mainSocket to the server IP address and port mainSocket.Bind(new IPEndPoint(IPAddress.Any, port)); //The server starts to listen on the binded socket with max connection queue //1024 mainSocket.Listen(1024); //Start accepting clients asynchronously mainSocket.BeginAccept(OnClientConnected, null); //Wait until there is a client wants to connect resetEvent.WaitOne(); } /// /// Receives connections of new clients and fire the NewClientConnected event /// private void OnClientConnected(IAsyncResult asyncResult) { Interlocked.Increment(ref clientCount); ClientInfo newClient = new ClientInfo { WorkerSocket = mainSocket.EndAccept(asyncResult), PlayerId = clientCount }; //Add the new client to the hashtable and increment the number of clients connectedClients.Add(newClient.PlayerId, newClient); //fire the new client event informing that a new client is connected to the server if (NewClientEvent != null) { NewClientEvent(this, System.EventArgs.Empty); } newClient.WorkerSocket.BeginReceive(newClient.Buffer, 0, BasePacket.GetMaxPacketSize(), SocketFlags.None, new AsyncCallback(WaitForData), newClient); //Start accepting clients asynchronously again mainSocket.BeginAccept(OnClientConnected, null); } /// Waits for the upcoming messages from different clients and fires the proper event according to the packet type. /// /// private void WaitForData(IAsyncResult asyncResult) { ClientInfo sendingClient = null; try { //Take the client information from the asynchronous result resulting from the BeginReceive sendingClient = asyncResult.AsyncState as ClientInfo; // If client is disconnected, then throw a socket exception // with the correct error code. if (!IsConnected(sendingClient.WorkerSocket)) { throw new SocketException(ClientDisconnectedErrorCode); } //End the pending receive request sendingClient.WorkerSocket.EndReceive(asyncResult); //Fire the appropriate event FireMessageTypeEvent(sendingClient.ConvertBytesToPacket() as BasePacket); // Begin receiving data from this client sendingClient.WorkerSocket.BeginReceive(sendingClient.Buffer, 0, BasePacket.GetMaxPacketSize(), SocketFlags.None, new AsyncCallback(WaitForData), sendingClient); } catch (SocketException e) { if (e.ErrorCode == ClientDisconnectedErrorCode) { // Close the socket. if (sendingClient.WorkerSocket != null) { sendingClient.WorkerSocket.Close(); sendingClient.WorkerSocket = null; } // Remove it from the hash table. connectedClients.Remove(sendingClient.PlayerId); if (ClientDisconnectedEvent != null) { ClientDisconnectedEvent(this, new ClientDisconnectedEventArgs(sendingClient.PlayerId)); } } } catch (Exception e) { // Begin receiving data from this client sendingClient.WorkerSocket.BeginReceive(sendingClient.Buffer, 0, BasePacket.GetMaxPacketSize(), SocketFlags.None, new AsyncCallback(WaitForData), sendingClient); } } /// /// Broadcasts the input message to all the connected clients /// /// public void BroadcastMessage(BasePacket message) { byte[] bytes = message.ConvertToBytes(); foreach (ClientInfo client in connectedClients.Values) { client.WorkerSocket.BeginSend(bytes, 0, bytes.Length, SocketFlags.None, SendAsync, client); } } /// /// Sends the input message to the client specified by his ID. /// /// /// The message to be sent. /// The id of the client to receive the message. public void SendToClient(BasePacket message, int id) { byte[] bytes = message.ConvertToBytes(); (connectedClients[id] as ClientInfo).WorkerSocket.BeginSend(bytes, 0, bytes.Length, SocketFlags.None, SendAsync, connectedClients[id]); } private void SendAsync(IAsyncResult asyncResult) { ClientInfo currentClient = (ClientInfo)asyncResult.AsyncState; currentClient.WorkerSocket.EndSend(asyncResult); } /// Fires the event depending on the type of received packet /// /// The received packet. void FireMessageTypeEvent(BasePacket packet) { switch (packet.MessageType) { case MessageType.PositionUpdateMessage: if (PositionUpdateMessageEvent != null) { PositionUpdateMessageEvent(this, new PositionUpdateEventArgs(packet as PositionUpdatePacket)); } break; } } } The events fired are handled in a different class, here are the event handling code for the PositionUpdateMessage (Other handlers are irrelevant): private readonly Hashtable onlinePlayers = new Hashtable(); /// /// Constructor that creates a new instance of the GameController class. /// private GameController() { //Start the server server = new Thread(networkManager.StartServer); server.Start(); //Create an event handler for the NewClientEvent of networkManager networkManager.PositionUpdateMessageEvent += OnPositionUpdateMessageReceived; } /// /// this event handler is called when a client asks for movement. /// private void OnPositionUpdateMessageReceived(object sender, PositionUpdateEventArgs e) { Point currentLocation = ((PlayerData)onlinePlayers[e.PositionUpdatePacket.PlayerId]).Position; Point locationRequested = e.PositionUpdatePacket.Position; ((PlayerData)onlinePlayers[e.PositionUpdatePacket.PlayerId]).Position = locationRequested; // Broadcast the new position networkManager.BroadcastMessage(new PositionUpdatePacket { Position = locationRequested, PlayerId = e.PositionUpdatePacket.PlayerId }); }

    Read the article

  • CUDA, more threads for same work = Longer run time despite better occupancy, Why?

    - by zenna
    I encountered a strange problem where increasing my occupancy by increasing the number of threads reduced performance. I created the following program to illustrate the problem: #include <stdio.h> #include <stdlib.h> #include <cuda_runtime.h> __global__ void less_threads(float * d_out) { int num_inliers; for (int j=0;j<800;++j) { //Do 12 computations num_inliers += threadIdx.x*1; num_inliers += threadIdx.x*2; num_inliers += threadIdx.x*3; num_inliers += threadIdx.x*4; num_inliers += threadIdx.x*5; num_inliers += threadIdx.x*6; num_inliers += threadIdx.x*7; num_inliers += threadIdx.x*8; num_inliers += threadIdx.x*9; num_inliers += threadIdx.x*10; num_inliers += threadIdx.x*11; num_inliers += threadIdx.x*12; } if (threadIdx.x == -1) d_out[blockIdx.x*blockDim.x+threadIdx.x] = num_inliers; } __global__ void more_threads(float *d_out) { int num_inliers; for (int j=0;j<800;++j) { // Do 4 computations num_inliers += threadIdx.x*1; num_inliers += threadIdx.x*2; num_inliers += threadIdx.x*3; num_inliers += threadIdx.x*4; } if (threadIdx.x == -1) d_out[blockIdx.x*blockDim.x+threadIdx.x] = num_inliers; } int main(int argc, char* argv[]) { float *d_out = NULL; cudaMalloc((void**)&d_out,sizeof(float)*25000); more_threads<<<780,128>>>(d_out); less_threads<<<780,32>>>(d_out); return 0; } Note both kernels should do the same amount of work in total, the (if threadIdx.x == -1 is a trick to stop the compiler optimising everything out and leaving an empty kernel). The work should be the same as more_threads is using 4 times as many threads but with each thread doing 4 times less work. Significant results form the profiler results are as followsL: more_threads: GPU runtime = 1474 us,reg per thread = 6,occupancy=1,branch=83746,divergent_branch = 26,instructions = 584065,gst request=1084552 less_threads: GPU runtime = 921 us,reg per thread = 14,occupancy=0.25,branch=20956,divergent_branch = 26,instructions = 312663,gst request=677381 As I said previously, the run time of the kernel using more threads is longer, this could be due to the increased number of instructions. Why are there more instructions? Why is there any branching, let alone divergent branching, considering there is no conditional code? Why are there any gst requests when there is no global memory access? What is going on here! Thanks

    Read the article

  • Generics and Performance question.

    - by Tarmon
    Hey Everyone, I was wondering if anyone could look over a class I wrote, I am receiving generic warnings in Eclipse and I am just wondering if it could be cleaned up at all. All of the warnings I received are surrounded in ** in my code below. The class takes a list of strings in the form of (hh:mm AM/PM) and converts them into HourMinute objects in order to find the first time in the list that comes after the current time. I am also curious about if there are more efficient ways to do this. This works fine but the student in me just wants to find out how I could do this better. public class FindTime { private String[] hourMinuteStringArray; public FindTime(String[] hourMinuteStringArray){ this.hourMinuteStringArray = hourMinuteStringArray; } public int findTime(){ HourMinuteList hourMinuteList = convertHMStringArrayToHMArray(hourMinuteStringArray); Calendar calendar = new GregorianCalendar(); int hour = calendar.get(Calendar.HOUR_OF_DAY); int minute = calendar.get(Calendar.MINUTE); HourMinute now = new HourMinute(hour,minute); int nearestTimeIndex = findNearestTimeIndex(hourMinuteList, now); return nearestTimeIndex; } private int findNearestTimeIndex(HourMinuteList hourMinuteList, HourMinute now){ HourMinute current; int position = 0; Iterator<HourMinute> iterator = **hourMinuteList.iterator()**; while(iterator.hasNext()){ current = (HourMinute) iterator.next(); if(now.compareTo(current) == -1){ return position; } position++; } return position; } private static HourMinuteList convertHMStringArrayToHMArray(String[] times){ FindTime s = new FindTime(new String[1]); HourMinuteList list = s.new HourMinuteList(); String[] splitTime = new String[3]; for(String time : times ){ String[] tempFirst = time.split(":"); String[] tempSecond = tempFirst[1].split(" "); splitTime[0] = tempFirst[0]; splitTime[1] = tempSecond[0]; splitTime[2] = tempSecond[1]; int hour = Integer.parseInt(splitTime[0]); int minute = Integer.parseInt(splitTime[1]); HourMinute hm; if(splitTime[2] == "AM"){ hm = s.new HourMinute(hour,minute); } else if((splitTime[2].equals("PM")) && (hour < 12)){ hm = s.new HourMinute(hour + 12,minute); } else{ hm = s.new HourMinute(hour,minute); } **list.add(hm);** } return list; } class **HourMinuteList** extends **ArrayList** implements RandomAccess{ } class HourMinute implements **Comparable** { int hour; int minute; public HourMinute(int hour, int minute) { setHour(hour); setMinute(minute); } int getMinute() { return this.minute; } String getMinuteString(){ if(this.minute < 10){ return "0" + this.minute; }else{ return "" + this.minute; } } int getHour() { return this.hour; } void setHour(int hour) { this.hour = hour; } void setMinute(int minute) { this.minute = minute; } @Override public int compareTo(Object aThat) { if (aThat instanceof HourMinute) { HourMinute that = (HourMinute) aThat; if (this.getHour() == that.getHour()) { if (this.getMinute() > that.getMinute()) { return 1; } else if (this.getMinute() < that.getMinute()) { return -1; } else { return 0; } } else if (this.getHour() > that.getHour()) { return 1; } else if (this.getHour() < that.getHour()) { return -1; } else { return 0; } } return 0; } } If you have any questions let me know. Thanks, Rob

    Read the article

  • Mssql dilemma, performance

    - by Woland
    Hello I am creating app where user can save options witch one is better? 1) to save into user table varchar feeld smthing like ('1,23,4354,34,3') query for this is select * from data where CHARINDEX ( 'L', Providers , 0 ) 0 2) create other table where user options are and just add rows select * from data where Providers in (select Providers from userdata where userid=100) thanks for help

    Read the article

  • Poor DB4O performance - What am I doing wrong?

    - by Jon
    I read Rob Conery's post about Object databases and thought I'd give it a go. I have a class lets say Book with 15 properties mostly string, some int, one Datetime and 2 decimals. I used Rob's code to open the database file etc to test on a Winforms project not a Web project. So I did the following: for(int i = 0; i < 1000000; i++) { var Book = new Book(); //Assign variables s.Save(Book); } s.CommitChanges(); This took at least a couple of minutes. I then tried to retrieve all data to test speed: var spp = s.All<Passports>(); This one line was quick, however, then doing: sppp.Count() as well as a seperate test doing a foreach loop and incrementing a counter took a couple of minutes. This seems quite slow to me. Am I doing something wrong or am I expecting too much?

    Read the article

  • dynamics CRM performance question

    - by tomo
    Hello Dynamics CRM gurus :) My boss asked me to do a research on available CMSes on market because cms we are using currently is rather a mess. For me as a .NET developer it would be great to choose and implement Dynamics CRM because of extensibility and perfect integration with .NET environment and well-known tools. All marketing blahbla sounds great but I'd like to know about common DISADVANTAGES, ISSUES concerning this system. The most important is how it is performing in a company with about 150 concurrent and very active users. I heard that is't really slow comparing to competitors system. Thanks in advance Best regards, Tomasz.

    Read the article

  • General SQL Server query performance

    - by Kiril
    Hey guys, This might be stupid, but databases are not my thing :) Imagine the following scenario. A user can create a post and other users can reply to his post, thus forming a thread. Everything goes in a single table called Posts. All the posts that form a thread are connected with each other through a generated key called ThreadID. This means that when user #1 creates a new post, a ThreadID is generated, and every reply that follows has a ThreadID pointing to the initial post (created by user #1). What I am trying to do is limit the number of replies to let's say 20 per thread. I'm wondering which of the approaches bellow is faster: 1 I add a new integer column (e.x. Counter) to Posts. After a user replies to the initial post, I update the initial post's Counter field. If it reaches 20 I lock the thread. 2 After a user replies to the initial post, I select all the posts that have the same ThreadID. If this collection has more than 20 items, I lock the thread. For further information: I am using SQL Server database and Linq-to-SQL entity model. I'd be glad if you tell me your opinions on the two approaches or share another, faster approach. Best Regards, Kiril

    Read the article

< Previous Page | 142 143 144 145 146 147 148 149 150 151 152 153  | Next Page >