Search Results

Search found 3828 results on 154 pages for 'mathematical optimization'.


  • What calls trigger a new batch?

    - by sebf
    My project is starting to show performance degradation, and I need to optimize it. The answer to my previous question and this presentation from NVidia have helped greatly in understanding the performance characteristics of code using the GPU, but there are a couple of things that are still unclear and that I need to know to optimize my drawing. Specifically, which calls mark the boundary between batches. I know that any state change causes a new batch, so that includes:

    - Render state changes
    - Buffer changes
    - Shader changes
    - Render target changes

    Correct? What else counts as a 'state change'? Does each Draw**Primitive() call constitute a new batch, even if I issue the same call twice with no state changes, or call it once on one part of the buffer and then again on another? If I update a buffer, but don't change the bindings, is that a new batch? That presentation and a DX9 page suggest using all of the texture slots available, which I take to mean loading multiple objects in 'parallel' by mapping their buffers/shaders/textures to slots 1-16. But I am not sure how this works: surely to do this you would need to change the buffer binding, and that would count as a state change? (Or is it a case of "you do, but it saves 16 calls, so it's OK"?)
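
    As I read the NVidia material, a "batch" is essentially one Draw*Primitive() call, and state changes don't start a batch by themselves so much as make the next draw call cost more. A minimal Direct3D 9-style sketch of that reading (the function and its parameters are assumptions for illustration, not code from the question; the D3D9 calls themselves are standard):

        #include <d3d9.h>

        void drawTwoBatches(IDirect3DDevice9* dev, IDirect3DVertexBuffer9* vb,
                            IDirect3DTexture9* tex, IDirect3DVertexShader9* vs,
                            UINT stride, UINT triCount)
        {
            dev->SetRenderState(D3DRS_ZENABLE, TRUE);   // state change
            dev->SetTexture(0, tex);                    // state change
            dev->SetStreamSource(0, vb, 0, stride);     // state change (binding)
            dev->SetVertexShader(vs);                   // state change

            dev->DrawPrimitive(D3DPT_TRIANGLELIST, 0, triCount);  // batch 1
            dev->DrawPrimitive(D3DPT_TRIANGLELIST, 0, triCount);  // batch 2: a second
            // draw call is a second batch even with no state change in between
        }

    Updating a buffer without rebinding it is not itself a draw call, but if the GPU is still reading the buffer the driver may have to stall or copy behind the scenes, so update patterns carry a cost of their own.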

  • Object pools for efficient resource management

    - by GameDevEnthusiast
    How can I avoid using the default new() to create each object? My previous demo had very unpleasant framerate hiccups during dynamic memory allocations (usually when arrays are resized), and creating lots of small objects, each often containing just one pointer to some DirectX resource, seems like an awful lot of waste. I'm thinking about:

    - Creating a master look-up table that refers to objects by handles (for safety and ease of serialization), much like the EntityList in the Source engine.
    - Creating a templated object pool, which stores items contiguously (more cache-friendly, fast iteration, etc.), with the stored elements accessed by external systems via the global lookup table. The object pool would use the swap-with-last trick for fast removal (invoking the object's destructor first) and would update the corresponding indices in the global table accordingly (when growing/shrinking/moving elements). The elements would be copied via plain memcpy().

    Is this a good idea? Will it be safe to store objects of non-POD types (e.g. pointers, vtable) in such containers? Related post: Dynamic Memory Allocation and Memory Management
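
    A minimal sketch of the swap-with-last pool plus handle table described above (the names are illustrative, not from any engine). Note that it moves elements with std::move rather than memcpy(), because memcpy() is only safe for trivially copyable types, and that it omits the generation counters a real handle table needs to catch stale handles:

        #include <cstdint>
        #include <utility>
        #include <vector>

        using Handle = std::uint32_t;

        template <typename T>
        class Pool {
        public:
            Handle add(T item) {
                Handle h = static_cast<Handle>(slots_.size());
                slots_.push_back(static_cast<std::uint32_t>(items_.size()));
                owner_.push_back(h);                 // handle that owns each dense slot
                items_.push_back(std::move(item));   // contiguous, cache-friendly storage
                return h;
            }

            T& get(Handle h) { return items_[slots_[h]]; }

            void remove(Handle h) {
                std::uint32_t i = slots_[h];
                slots_[owner_.back()] = i;           // last element's handle now maps here
                std::swap(items_[i], items_.back()); // swap-with-last: O(1) removal
                std::swap(owner_[i], owner_.back());
                items_.pop_back();                   // the removed object's destructor runs here
                owner_.pop_back();
            }

        private:
            std::vector<T> items_;              // dense array of live objects
            std::vector<std::uint32_t> slots_;  // handle -> index into items_
            std::vector<Handle> owner_;         // index into items_ -> owning handle
        };

    The move/destructor path is what makes non-POD elements safe here; a memcpy() version would be fine only for trivially copyable types.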

  • Which opcodes are faster at the CPU level?

    - by Geotarget
    In every programming language there are sets of opcodes that are recommended over others. I've tried to list them here, in order of speed:

    - Bitwise
    - Integer addition / subtraction
    - Integer multiplication / division
    - Comparison
    - Control flow
    - Float addition / subtraction
    - Float multiplication / division

    Where you need high-performance code, C++ can be hand-optimized in assembly to use SIMD instructions or more efficient control flow, data types, etc. So I'm trying to understand whether the data type (int32 / float32 / float64) or the operation used (*, +, &) affects performance at the CPU level. Is a single multiply slower on the CPU than an addition? In MCU theory you learn that the speed of an opcode is determined by the number of CPU cycles it takes to execute. So does that mean multiply takes 4 cycles and add takes 2? Exactly what are the speed characteristics of the basic math and control flow opcodes? If two opcodes take the same number of cycles to execute, can both be used interchangeably without any performance gain / loss? Any other technical details you can share regarding x86 CPU performance would be appreciated.
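
    For a rough sense of scale on recent x86 cores (in the spirit of Agner Fog's instruction tables, the standard reference here): integer add/subtract takes about 1 cycle of latency, integer multiply around 3, integer divide somewhere in the tens, float add/multiply around 3-5, and float divide around 10-15. Latency and throughput are separate numbers, because the CPU pipelines several instructions at once. One way to probe latency yourself is to time a chain of dependent operations, as in this sketch (an illustration, not a rigorous benchmark; compile with optimizations and expect results to vary by CPU):

        #include <chrono>
        #include <cstdio>

        int main() {
            const long N = 200000000L;
            volatile int seed = 3;    // volatile read keeps the compiler from
            int acc = seed;           // folding the whole loop to a constant
            auto t0 = std::chrono::steady_clock::now();
            for (long i = 0; i < N; ++i)
                acc = acc * 3 + 1;    // each iteration depends on the last,
                                      // so this approximates multiply latency
            auto t1 = std::chrono::steady_clock::now();
            double ns = std::chrono::duration<double, std::nano>(t1 - t0).count();
            std::printf("%d: %.2f ns per iteration\n", acc, ns / N);
        }

    Swapping the multiply for an add (or int for double) and comparing runs gives a feel for the relative costs the question asks about.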

  • Game has noticeable frame drops, but when run through a profiler it always runs smoothly

    - by felipedrl
    I'm trying to optimize my PC game, but I can't find the bottleneck, since every time I run it through a profiler (gDEBugger) it runs smoothly. When running outside gDEBugger I get these annoying hiccups, and it's not just the graphics: the sound also gets choppy. The drops are inconsistent across runs, i.e. sometimes I run the same scenario and get no drops at all, sometimes I get a few drops, and other times the game is consistently slow. The only constant is that when running through gDEBugger I ALWAYS get a smooth run. I suspect something outside my game is interfering and causing these drops, but what in the hell does gDEBugger do that nullifies them? A higher process priority? Any ideas? Thanks in advance.

  • Multithreading for a mixed-genre game in Python?

    - by arrogantc
    So here's the situation. I'm making a game that mixes two genres: arcade shooter and puzzler. They don't intertwine TOO much; all the interaction that really goes on is that every time an enemy is destroyed, a block is created. The blocks aren't even a part of the main collision detection system; they have their own, more suited to their needs. What I want to ask is this: might it be a good idea to have the arcade shooter portion run on one thread and the puzzle game portion run on another?

  • What's a good starting point to learn about JIT compilers?

    - by davidk01
    I've spent the past few months learning about stack-based virtual machines, parsers, compilers, and some elementary things about hardware architecture. I've also written a few parsers and compilers for C-like languages to understand the generic parser/compiler pipeline. Now I'd like to take my understanding further by learning about optimizing compilers and JIT compilers, but I'm having a hard time finding material at the right level. I don't yet understand enough to dive into a code base like PyPy or LuaJIT, but I also know more than what most introductory compiler books have to offer. So what are some good books for an intermediate beginner like me to look into?

  • Recasting and Drawing in SDL

    - by user1078123
    I have some code that essentially draws a column of a wall on the screen in a raycasting-type 3D engine. I am trying to optimize it, as it takes about 10 milliseconds to draw a million pixels this way, and the vast majority of game time is spent in this loop. However, I don't quite understand what's occurring, particularly the recasting (I modified the "pixel manipulation" sample code from the SDL documentation). "canvas" is the surface I am drawing to, and "hello" is the surface containing the texture for the column.

        int c = curcol * canvas->format->BytesPerPixel;   // byte offset of this screen column
        void *canvaspixels = canvas->pixels;
        Uint16 texpitch = hello->pitch;                   // bytes per texture row
        // One past the last destination address: end row, plus the column offset.
        // (Casting pointers through int only works on 32-bit builds; Uint8*
        // arithmetic would be the portable way to write this.)
        int lim = (drawheight + startdraw) * canvpitch + c + (int) canvaspixels;
        // k points at the first texel of texture column 'hit'.
        Uint8 *k = (Uint8 *) hello->pixels + hit * hello->format->BytesPerPixel;
        // j walks down the screen one row (canvpitch bytes) at a time.
        for (int j = startdraw * canvpitch + c + (int) canvaspixels; j < lim; j += canvpitch) {
            // h is the texture row as a float; truncate it to index into the column.
            Uint8 *q = (Uint8 *) ((int) h * texpitch + k);
            *(Uint32 *) j = *(Uint32 *) q;                // copy one 32-bit pixel
            h += s;                                       // step the texture coordinate
        }

    We have void pointers, 8-, 16- and 32-bit ints (h and s are floats), all being intermingled, and while it works, it is quite confusing.

  • Why is my fresh install of 12.04 running slow?

    - by user75129
    Hey guys, I'm a new Linux user. I figured it would be best for the laptop I just purchased, because it's said to be faster than Windows 7. I'm currently dual-booting Windows 7 Professional and Ubuntu 12.04. The laptop I am using is the LG X Note P210. Specs:

    - Intel Core i5 470UM dual core clocked at 1.33GHz
    - 12.5" HD LED LCD screen at 1366 by 768
    - 4GB DDR3 RAM at 1333MHz
    - Integrated Intel HD graphics
    - 4-cell battery with 3150mAh

    It came loaded with Windows 7 Home Premium 64-bit and runs fine on that, but my Ubuntu 12.04 runs slower, and I don't understand why; it definitely has decent specs to run even a 64-bit operating system and do some gaming. Granted, I know it's not the best, but for a laptop it does the job, so Ubuntu should work, especially since it's said to make older units with worse specs run even better. I'm not all that familiar with coding, so what are things I can do to optimize speed without overclocking? Boot-up is fine; it's program response time, I believe. Once I'm in the actual OS, it lags, slows down, apps stop working and take forever to load.

  • Fast software color interpolating triangle rasterization technique

    - by Belgin
    I'm implementing a software renderer with this rasterization method; however, I was wondering if there is a way to improve it, or an alternative technique that is much faster. I'm specifically interested in rendering small triangles, like the ones from this 100k-poly dragon render. The method I'm using is not perfect either, as it leaves small gaps from time to time (at least I think that's what's happening). I don't mind using assembly optimizations. Pseudocode or actual code (C/C++ or similar) is appreciated. Thanks in advance.
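
    The usual alternative to scanline methods is the half-space (edge-function) rasterizer, whose behaviour on shared edges is governed entirely by the fill rule, which is typically where gaps come from. A C++ sketch under stated assumptions (counter-clockwise winding, and a putPixel() that is assumed to exist elsewhere):

        #include <algorithm>

        void putPixel(int x, int y, float r, float g, float b);  // assumed elsewhere

        struct Vert { int x, y; float r, g, b; };

        // Twice the signed area of (a, b, c); the sign says which side c is on.
        static int orient2d(const Vert& a, const Vert& b, const Vert& c) {
            return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
        }

        // The edge weights double as (unnormalized) barycentric coordinates,
        // so the per-vertex colors interpolate almost for free.
        void fillTriangle(const Vert& v0, const Vert& v1, const Vert& v2) {
            float area = (float) orient2d(v0, v1, v2);
            if (area <= 0) return;   // degenerate, or wound the wrong way
            int minX = std::min({v0.x, v1.x, v2.x}), maxX = std::max({v0.x, v1.x, v2.x});
            int minY = std::min({v0.y, v1.y, v2.y}), maxY = std::max({v0.y, v1.y, v2.y});
            for (int y = minY; y <= maxY; ++y)
                for (int x = minX; x <= maxX; ++x) {
                    Vert p{x, y, 0, 0, 0};
                    int w0 = orient2d(v1, v2, p);   // weight of v0
                    int w1 = orient2d(v2, v0, p);   // weight of v1
                    int w2 = orient2d(v0, v1, p);   // weight of v2
                    // ">= 0" draws shared edges twice; a top-left fill rule
                    // (bias non-top-left edges by -1) paints each edge pixel
                    // exactly once, which removes both gaps and overlaps.
                    if (w0 >= 0 && w1 >= 0 && w2 >= 0) {
                        putPixel(x, y,
                                 (w0 * v0.r + w1 * v1.r + w2 * v2.r) / area,
                                 (w0 * v0.g + w1 * v1.g + w2 * v2.g) / area,
                                 (w0 * v0.b + w1 * v1.b + w2 * v2.b) / area);
                    }
                }
        }

    For lots of small triangles the bounding-box loop wastes little work, the inner tests reduce to incremental adds, and the whole thing vectorizes well with SIMD.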

  • How to optimise mesh data

    - by Wardy
    So I have some procedurally generated mesh data and I want to reduce it down to its minimum number of verts. In case it matters, this is a Unity project. Working from a simple example, let's assume a typical flat surface of points, 2 by 3. The point / vertex at [1,1] is used in many triangles. I've generated mesh for a voxel-type engine that adds verts to a list based on face visibility, and now I want to remove all the duplicates. Can anyone come up with an efficient way of doing this, because what I have is sooo bad it's not even funny (and I don't even think it's logically correct) ...

        private void Optimize()
        {
            Vector3 v;
            Vector3 v2;
            for (int i = 0; i < Vertices.Count; i++)
            {
                v = Vertices[i];
                for (int j = i + 1; j < Vertices.Count; j++)
                {
                    v2 = Vertices[j];
                    if (v.x == v2.x && v.y == v2.y && v.z == v2.z)
                    {
                        // Re-point every index that used j at i, and shift down
                        // the indices above j to account for the removal.
                        for (int ind = 0; ind < Indices.Count; ind++)
                        {
                            if (Indices[ind] == j)
                                Indices[ind] = i;
                            else if (Indices[ind] > j)
                                Indices[ind]--;
                        }
                        Vertices.RemoveAt(j);
                        Uvs.RemoveAt(j);
                        Normals.RemoveAt(j);
                        j--;   // stay on this slot: RemoveAt shifted the next element into it
                    }
                }
            }
        }

    EDIT: OK, I managed to get this (code sample above updated) to render an "optimised" set of verts, but the UV data is all wrong now, which would make sense, because I'm basically just removing any UV vector that represents a UV coord for a removed vert and not actually considering what I need to do to "fix the tri", so to speak. The code now seemingly does work, but it's quite time-consuming; I'm still looking to optimise it further.

  • SEO on an existing platform

    - by Simon
    I've been given the task of increasing user visits and conversions for a recruitment website. Conversions would be interested job seekers submitting their CVs. The manager would first like to increase the organic search results and optimize the website before starting with targeted campaigns. The problem is, they are using a proprietary recruitment software platform which I can barely make changes to. For example, the URLs all look like dynamic URLs without any semantic meaning, and the markup is almost completely built automatically by that platform. I'm also confident that the lack of submitted CVs is due to a bad user experience on the website (no incentives or clear CTA to register). Besides optimizing the static texts and page titles, is there anything I can do? Thanks

  • Boolean checks with a single quadtree, or multiple quadtrees?

    - by Djentleman
    I'm currently developing a 2D sidescrolling shooter game for PC (think Metroidvania, but with a lot more happening at once), using XNA. I'm utilising quadtrees for my spatial partitioning system. All objects will be encompassed by standard bounding geometry (box or sphere), with possible pixel-perfect collision detection implemented after geometry collision (depends on how optimised I can get it). These are my collision scenarios, with <> representing object overlap (multiplayer co-op is the reason for the player <> player scenario):

    Collision scenarios (true = collision occurs):

    Player <> Player = false
    Enemy <> Enemy = false
    Player <> Enemy = true
    PlayerBullet <> Enemy = true
    PlayerBullet <> Player = false
    PlayerBullet <> EnemyBullet = true
    PlayerBullet <> PlayerBullet = false
    EnemyBullet <> Player = true
    EnemyBullet <> Enemy = false
    EnemyBullet <> EnemyBullet = false
    Player <> Environment = true
    Enemy <> Environment = true
    PlayerBullet <> Environment = true
    EnemyBullet <> Environment = true

    Going off this information, and the fact that there will likely be several hundred objects rendering on-screen at any given time, my question is as follows. Which method is likely to be the most efficient/optimised, and why:

    - Using a single quadtree with boolean checks for collision between the different types of objects.
    - Using three quadtrees at once (player, enemy, environment), only testing the player and enemy trees against each other while testing both the player and enemy trees against the environment tree.
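
    For the single-tree option, the table above collapses into a category/mask test; a sketch of how that could look (written in C++ here for concreteness, though the idea ports straight to XNA/C#):

        #include <cstdint>

        // Each object carries a category bit plus a mask of categories it can hit.
        enum Category : std::uint8_t {
            Player       = 1 << 0,
            Enemy        = 1 << 1,
            PlayerBullet = 1 << 2,
            EnemyBullet  = 1 << 3,
            Environment  = 1 << 4,
        };

        struct Collider {
            std::uint8_t category;
            std::uint8_t mask;   // categories this object collides with
        };

        // These masks encode the table above exactly.
        constexpr Collider kPlayer       { Player,       Enemy | EnemyBullet | Environment };
        constexpr Collider kEnemy        { Enemy,        Player | PlayerBullet | Environment };
        constexpr Collider kPlayerBullet { PlayerBullet, Enemy | EnemyBullet | Environment };
        constexpr Collider kEnemyBullet  { EnemyBullet,  Player | PlayerBullet | Environment };
        constexpr Collider kEnvironment  { Environment,  Player | Enemy | PlayerBullet | EnemyBullet };

        // Cheap filter to run on every candidate pair the quadtree produces.
        inline bool shouldCollide(const Collider& a, const Collider& b) {
            return (a.category & b.mask) && (b.category & a.mask);
        }

    Since the filter is two ANDs, the single-tree design mostly pays for tree traversal rather than for the boolean checks themselves.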

  • Hosting advice for a write-heavy dynamic website

    - by Rahul Rawat
    I have built a website using PHP and MySQL and now I am looking for a hosting service. Within a week's time I am expecting about 1000 users registering and about 5-10k pageviews per day. So which host should I opt for? The site lets users submit content stored as blobs, and upload around 10 pictures per user. I hope that traffic will increase, so can JustHost's or Bluehost's shared hosting serve that purpose, or should I go for something more dedicated? Basically the site is write-heavy, there are on average 2-3 MySQL queries per page, and it is quite dynamic. So given these requirements, which web hosting would be optimal for me?

  • Adsense ads are not a good fit for my site

    - by Ryan Grush
    I run an academic network where college students at particular universities can communicate, and we run Google AdSense. The site pulls in a decent amount for a side project, but our CTR is horrible (<0.2%) and our RPM is equally low. The problem lies in the fact that Google pegs us as an education site (which we are) but shows our users ads for U of Phoenix, DeVry and other for-profit universities. All of our users are students at higher-caliber institutions and therefore have no use for these ads. I've known about this problem for some time, but I don't know what to do to show more relevant ads instead (e.g. spring break, school apparel, poker, sports, etc.). What would be the best way to change this?

  • Can I get a C++ compiler to instantiate objects at compile time?

    - by gam3
    I am writing some code that has a very large number of reasonably simple objects, and I would like them to be created at compile time. I would think that a compiler would be able to do this, but I have not been able to figure out how. In C I could do the following:

        #include <stdio.h>

        typedef struct data_s {
            int a;
            int b;
            char *c;
        } info;

        info list[] = {
            { 1, 2, "a" },
            { 3, 4, "b" },
        };

        int main() {
            int i;
            for (i = 0; i < sizeof(list)/sizeof(*list); i++) {
                printf("%d %s\n", i, list[i].c);
            }
        }

    In C++, each object has its constructor called rather than just being laid out in memory:

        #include <iostream>
        using std::cout;
        using std::endl;

        class Info {
            const int a;
            const int b;
            const char *c;
        public:
            Info(const int, const int, const char *);
            int get_a() const { return a; }
            int get_b() const { return b; }
            const char *get_c() const { return c; }
        };

        Info::Info(const int a, const int b, const char *c) : a(a), b(b), c(c) {}

        Info list[] = {
            Info(1, 2, "a"),
            Info(3, 4, "b"),
        };

        int main() {
            for (int i = 0; i < sizeof(list)/sizeof(*list); i++) {
                cout << i << " " << list[i].get_c() << endl;
            }
        }

    I just don't see what information is unavailable to the compiler to completely instantiate these objects at compile time, so I assume I am missing something.
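
    For what it's worth, C++11's constexpr addresses exactly this case; a sketch, assuming a C++11 compiler:

        struct Info {
            int a, b;
            const char *c;
            constexpr Info(int a_, int b_, const char *c_) : a(a_), b(b_), c(c_) {}
            constexpr int get_a() const { return a; }
        };

        // With a constexpr constructor and constant arguments, the array can be
        // emitted as statically initialized data, with no runtime constructor calls.
        constexpr Info list[] = { Info(1, 2, "a"), Info(3, 4, "b") };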

  • How much configurability to give to users regarding concurrency?

    - by rwong
    This question is a narrowing-down of these related questions:

    - How much effort should we spend to programming for multiple cores?
    - Concurrency: How do you approach the design and debug the implementation?

    Given that each user's computer may have different performance characteristics with respect to calculations, memory, disk I/O bandwidth and network I/O bandwidth, and that it is difficult to implement an automated self-tuning system in your software, how much configurability should we give to end users so that they can find ways (by trial and error?) to improve our software's efficiency? If we give users the ability to change these settings, how do we give visual feedback to users so they can measure the performance changes?
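
    One common middle ground (my framing, not from the question): auto-detect a sensible default for each knob but let a config value override it, so any trial-and-error happens on top of a working baseline. A sketch:

        #include <algorithm>
        #include <thread>

        struct PerfConfig {
            unsigned worker_threads = 0;   // 0 = auto-detect
        };

        unsigned pickWorkerThreads(const PerfConfig& cfg) {
            if (cfg.worker_threads > 0)    // explicit user override wins
                return cfg.worker_threads;
            // hardware_concurrency() may return 0 when unknown, hence the floor of 1.
            return std::max(1u, std::thread::hardware_concurrency());
        }

    Logging the chosen values alongside simple throughput counters is one way to give users the visual feedback the question asks about.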

  • Which social sign-in (Google, twitter, fb, etc) is most often used (if I could only choose one, which would statistically retain the most users)?

    - by David
    I am working with a startup which is about to do its launch in maybe 2-3 weeks. In order to see the primary features of the site, the user has to register, or sign in if they have already registered. We quickly decided we wanted to incorporate social plugins as alternatives to a conventional sign-up, just like Stack Exchange does. But seeing that we are strapped for time and fairly amateur developers, I'm trying to justify choosing just one or two social sign-ins to start with for the launch and then maybe adding more later. Based on my experience as a user, I'm guessing that Twitter and Google (in no particular order of importance) would probably be the most important social sign-ins for retaining as many users as possible, but I have absolutely no statistics to back that up other than my own anecdotal experience. This question hasn't been visibly asked on the internet, so I figured I'd hop on Stack Exchange and give it a punt. Thanks.

  • How to generate portal zones?

    - by Meow
    I'm developing a portal-based scene manager. Basically all it does is check the portals against the camera frustum and render their associated portal zones accordingly. Is there any way my editor can generate portal zones automatically, with the user having to set only the portals themselves? For example, the Max Payne 1/2 engine ("Max-FX") only required you to set the portal quads, unlike the C4 engine where you also have to explicitly set the portal zones.

  • Are there jobs which are oriented towards optimisation programming or assembly?

    - by jokoon
    3D engine programmers have to care a little about execution speed, but what about the programmers at ATI and NVIDIA? How much do they need to optimize their drivers? Are there jobs out there whose only purpose is execution speed and optimisation, or jobs where people program only in assembly? Please, no flame war about "premature optimisation is the root of all evil"; I just want to know if such jobs exist. Maybe in security? In kernel programming? Where? Not at all?

  • How can I best implement 'cache until further notice' with memcache in multiple tiers?

    - by ajreal
    the term "client" used here is not referring to client's browser, but client server Before cache workflow 1. client make a HTTP request --> 2. server process --> 3. store parsed results into memcache for next use (cache indefinitely) --> 4. return results to client --> 5. client get the result, store into client's local memcache with TTL After cache workflow 1. another client make a HTTP request --> 2. memcache found return memcache results to client --> 3. client get the result, store into client's local memcache with TTL TTL = time to live Is possible for me to know when the data was updated, and to expire relevant memcache(s) accordingly. However, the pitfalls on client site cache TTL Any data update before the TTL is not pick-up by client memcache. In reverse manner, where there is no update, client memcache still expire after the TTL First request (or concurrent requests) after cache TTL will get throttle as it need to repeat the "Before cache workflow" In the event where client require several HTTP requests on a single web page, it could be very bad in performance. Ideal solution should be client to cache indefinitely until further notice. Here are the three proposals about futher notice Proposal 1 : Make use on HTTP header (current implementation) 1. client sent HTTP request last modified time header 2. server check if last data modified time=last cache time return status 304 3. client based on header to decide further processing GOOD? ---- - save some parsing for client - lesser data transfer BAD? ---- - fire a HTTP request is still slow - server end still need to process lots of requests Proposal 2 : Consistently issue a HTTP request to check all data group last modified time 1. client fire a HTTP request 2. server to return last modified time for all data group 3. client compare local last cache time with the result 4. if data group last cache time < server last modified time then request again for that data group only GOOD? ---- - only fetch what is no up-to-date - less requests for server BAD? ---- - every web page require a HTTP request Proposal 3 : Tell client when new data is available (Push) 1. when server end notice there is a change on a data group 2. notify clients on the changes 3. help clients to fetch again data 4. then reset client local memcache after data is parsed GOOD? ---- - let the cache act/behave like a true cache BAD? ---- - encourage race condition My preference is on proposal 3, and something like Gearman could be ideal Where there is a change, Gearman server to sent the task to multiple clients (workers). Am I crazy? (I know my first question is a bit crazy)

  • Images Loading Very Slowly

    - by Vecta
    I'm currently working on optimizing my site to try to decrease load time, using Pingdom tools. I seem to be having some difficulty with long load times on images. For example, the body background for my site is a 29kb file but takes almost 500 ms to load, the majority of which is spent connecting to the server. This one seems to take the longest, but other images take a lot of time as well, again mostly spent connecting to the server. The timing also fluctuates: I've seen the same image load in 500 ms one minute and ten minutes later load in 1.5 seconds. My site is using the MODX CMS, but I'm not sure if that would affect this at all. Is it more likely that this is a server issue? Is there anything I should check or do to help alleviate these inflated 'connect' times?

  • Grid based collision - How many cells?

    - by Fibericon
    The game I'm creating is a bullet hell game, so there can be quite a few objects on the screen at any given time. It probably maxes out at about 40 enemies and 200 or so bullets. That being said, I'm splitting up the playing field into a grid for my collision checking. Right now, it's only 8 cells. How many would be optimal? I'm worried that if I use too many, I'll be wasting CPU power. My main concern is processing power, to make the game run smoothly. RAM is not a big concern for me.
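
    A common rule of thumb (a heuristic, not from the question): make cells roughly the size of your largest common object, so each object overlaps at most about four cells; the per-frame cost then scales with occupied cells rather than total cells, so a finer grid mostly costs memory, not CPU. A sketch with assumed numbers:

        // Assumed: 1280x720 playfield, largest common object about 64 px across.
        const int kCellSize = 64;
        const int kCols = (1280 + kCellSize - 1) / kCellSize;  // 20
        const int kRows = (720  + kCellSize - 1) / kCellSize;  // 12

        // Map a position to its cell; objects bigger than a cell get inserted
        // into every cell their bounding box touches.
        inline int cellIndex(float x, float y) {
            int cx = (int) x / kCellSize;
            int cy = (int) y / kCellSize;
            return cy * kCols + cx;
        }

    By that heuristic, 8 cells is probably too coarse for ~240 objects, while something in the 20x12 range would still be tiny compared to the pair tests it saves.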

  • How can I get the best performance for playing games?

    - by Oli
    I've been playing a couple of Wine games today and decided to switch to Metacity to see what the performance difference was like. If you've never done it before, you just run metacity --replace (but don't do that if you use Unity!). Anyway, surprise surprise, it was like playing on a dedicated Windows gaming machine. Playing under Metacity today was bliss: much higher framerates and just a fluidity that you'd expect from a native game. I'm not sure I can go back. Switching to Metacity is no hardship, but I wonder if there's anything else in the WM landscape that I should try out. I'm essentially looking for suggestions for the best way to play games. Mix up WMs, dedicated X sessions, whatever... as long as it makes Wine games run faster.

    Small print:

    - One process per answer (e.g. new X session + OpenBox).
    - We should probably land on a benchmark so we can show percentage improvement over a stock Compiz desktop. I'm open to suggestions in the comments.
    - If people could test a proposal and report in the comments how much it improves things for them, that would give others a good idea of whether it's worth the pain.

  • What are some efficient ways to set up my environment when working on a remote site?

    - by Prefix
    Hello fellow programmers, I am still a relatively new programmer and have recently gotten my first on-campus programming position. I am the sole dev responsible for 8 domains as well as 3 small PHP web apps. The campus has its web environment divided into staging and live servers: we develop on the staging server via SFTP and then push the updates to the live server through a web GUI. I currently use Sublime Text 2 and the Sublime SFTP plugin for all my dev work (it's my preferred editor). If I am just making an edit to a page, I'll open that individual file via the FTP browser. If I am working on the PHP web app projects, I have the app directory mapped to a local folder so that when I save locally, the file is auto-uploaded through Sublime SFTP. I feel like this workflow is slow and sub-optimal. How can I improve my workflow for working with remote content? I'd love to set up a local environment on my machine, as that would eliminate the constant SFTP upload/download, but as I said there are many sites, the space required for a local copy of everything would be quite large, and keeping it updated with whatever is latest on the staging server would be a nightmare. Does anyone know how I can improve my general web dev workflow from what I've described? I'd really like to cut out constantly editing over FTP, but I'm not sure where to start other than ripping the entire directory and dumping it into XAMPP.
