Search Results

Search found 13634 results on 546 pages for 'memory cards'.

Page 153/546 | < Previous Page | 149 150 151 152 153 154 155 156 157 158 159 160 | Next Page >

Does LINQ require significantly more processing cycles and memory than lower-level data iteration techniques?

- by Matthew Patrick Cashatt

Background I am recently in the process of enduring grueling tech interviews for positions that use the .NET stack, some of which include silly questions like this one, and some questions that are more valid. I recently came across an issue that may be valid but I want to check with the community here to be sure. When asked by an interviewer how I would count the frequency of words in a text document and rank the results, I answered that I would Use a stream object put the text file in memory as a string. Split the string into an array on spaces while ignoring punctuation. Use LINQ against the array to .GroupBy() and .Count(), then OrderBy() said count. I got this answer wrong for two reasons: Streaming an entire text file into memory could be disasterous. What if it was an entire encyclopedia? Instead I should stream one block at a time and begin building a hash table. LINQ is too expensive and requires too many processing cycles. I should have built a hash table instead and, for each iteration, only added a word to the hash table if it didn't otherwise exist and then increment it's count. The first reason seems, well, reasonable. But the second gives me more pause. I thought that one of the selling points of LINQ is that it simply abstracts away lower-level operations like hash tables but that, under the veil, it is still the same implementation. Question Aside from a few additional processing cycles to call any abstracted methods, does LINQ require significantly more processing cycles to accomplish a given data iteration task than a lower-level task (such as building a hash table) would?

Read the article
Creating foreign words' learning site with memory technique (Web 2.0)? Will it work?

- by Michal P.

I would like to earn a little money for realizing a good, simple project. My idea is to build a website for learning of chosen by me language (for users knowing English) using mnemonics. Users would be encourage to enter English words with translation to another language and describing the way, how to remember a foreign language word (an association link). Example: if I choose learning Spanish for people who knows English well, it would look like that: every user would be encourage to enter a way to remember a chosen by him/her Spanish word. So he/she would enter to the dictionary (my site database) ,e.g., English word: beach - playa (Spanish word). Then he/she would describe the method to remember Spanish word, e.g., "Image that U r on the beach and U play volleyball" - we have the word play and recall playa (mnemonics). I would like to give possibility of pic hotlinks, encourage for fun or little shocking memory links which is -- in the art of memory -- good. I would choose a language to take a niche of Google Search. The big question is if I don't lose my time on it?? (Maybe I need to find prototype way to check that idea?)

Read the article
How Can I Improve This Card-Game AI?

- by James Burgess

Let me get this out there before anything else: this is a learning exercise for me. I am not a game developer by trade or hobby (at least, not seriously) and am purely delving into some AI- and 3D-related topics to broaden my horizons a bit. As part of the learning experience, I thought I'd have a go at developing a basic card game AI. I selected Pit as the card game I was going to attempt to emulate (specifically, the 'bull and bear' variation of the game as mentioned in the link above). Unfortunately, the rule-set that I'm used to playing with (an older version of the game) isn't described. The basics of it are: The number of commodities played with is equal to the number of players. The bull and bear cards are included. All but two players receive 8 cards, two receive 9 cards. A player can win the round with 7 + bull, 8, or 8 + bull (receiving double points). The bear is a penalty card. You can trade up to a maximum of 4 cards at a time. They must all be of the same type, but can optionally include the bull or bear (so, you could trade A, A, A, Bull - but not A, B, A, Bull). For those who have played the card game, it will probably have been as obvious to you as it was to me that given the nature of the game, gameplay would seem to resemble a greedy algorithm. With this in mind, I thought it might simplify my AI experience somewhat. So, here's what I've come up with for a basic AI player to play Pit... and I'd really just like any form of suggestion (from improvements to reading materials) relating to it. Here it is in something vaguely pseudo-code-ish ;) While AI does not hold 7 similar + bull, 8 similar, or 8 similar + bull, do: 1. Establish 'target' hand, by seeing which card AI holds the most of. 2. Prepare to trade next-most-numerous card type in a trade (max. held, or 4, whichever is fewer) 3. If holding the bear, add to (if trading <=3 cards) or replace in (if trading 4 cards) hand. 4. Offer cards for trade. 5. If cards are accepted for trade within X turns, continue (clearing 'failed card types'). Otherwise: a. If only one card remains in the trade, go to #6. Otherwise: i. Remove one non-penalty card from the trade. ii. Return to #5. 6. Add card type to temporary list of failed card types. 7. Repeat from #2 (excluding 'failed card types'). I'm aware this is likely to be a sub-optimal way of solving the problem, but that's why I'm posting this question. Are there any AI- or algorithm-related concepts that I've missed and should be incorporating to make a better AI? Additionally, what are the flaws with my AI at present (I'm well aware it's probably far from complete)? Thanks in advance!

Read the article
What DX level does my graphics card support? Does it go to 11?

- by Daniel Moth

Recently I run into a situation that I have run into quite a few times. Someone encounters a machine and the question arises: "Is there a DirectX 11 card in this machine?". Typically the reason you are interested in that is because cards with DirectX 11 drivers fully support DirectCompute (and by extension C++ AMP) for GPGPU programming. The driver specifically is WDDM (1.1 on Windows 7 and Windows 8 introduces WDDM 1.2 with cool new capabilities). There are many ways for figuring out if you have a DirectX11 card, so here are the approaches that you can use, with a bonus right at the end of the post. Run DxDiag WindowsKey + R, type DxDiag and hit Enter. That is the DirectX diagnostic tool, which unfortunately, only tells you on the "System" tab what is the highest version of DirectX installed on your machine. So if it reports DirectX 11, that doesn't mean you have a DX11 driver! The "Display" tab has a promising "DDI version" label, but unfortunately that doesn't seem to be accurate on the machines I've tested it with (or I may be misinterpreting its use). Either way, this tool is not the one you want for this purpose, although it is good for telling you the WDDM version among other things. Use the Microsoft hardware page There is a Microsoft Windows 7 compatibility center, that lists all hardware (tip: use the advanced search) and you could try and locate your device there… good luck. Use Wikipedia or the hardware vendor's website Use the Wikipedia page for the vendor cards, for both nvidia and amd. Often this information will also be in the specifications for the cards on the IHV site, but is is nice that wikipedia has a single page per vendor that you can search etc. There is a column in the tables for API support where you can see the DirectX version. Check if it is one of these recommended DX11 cards You may not have a DirectX 11 card and are interested in purchasing one. While I am in no position to make recommendations, I will list here some cards from two big IHVs that we know are DirectX 11 capable. Some AMD (aka ATI) cards Low end, inexpensive DX11 hardware: Radeon 5450, 5550, 6450, 6570 Mid range (decent perf, single precision): Radeon 5750, 5770, 6770, 6790 High end (capable of double precision): Radeon 5850, 5870, 6950, 6970 Single precision APUs: AMD E-Series APUs AMD A-Series APUs Some NVIDIA cards Low end, inexpensive DX11 hardware: GeForce GT430, GT 440, GT520, GTS 450 Quadro 400, 600 Mid-range (decent perf, single precision): GeForce GTX 460, GTX 550 Ti, GTX 560, GTX 560 Ti Quadro 2000 High end (capable of double precision): GeForce GTX 480, GTX 570, GTX 580, GTX 590, GTX 595 Quadro 4000, 5000, 6000 Tesla C2050, C2070, C2075 Get the DirectX SDK and run DirectX Caps Viewer Download and install the June 2010 DirectX SDK. As part of that you now have the DirectX Capabilities Viewer utility (find it in your start menu by searching for "DirectX Caps Viewer", the filename is DXCapsViewer.exe). It will list all your devices (emulated, and real hardware ones) under the first node. Expand the hardware entries and then expand again the Direct3D 11 folder. If you see D3D_FEATURE_LEVEL_11_ under that, then your card supports feature level 11 which means it supports DirectCompute and C++ AMP. In the following screenshot of one of my old laptops, the card only goes to feature level 10. Run a utility from the web that just tells you! Of course, writing some C++ AMP code that enumerates accelerators and lists the ones that are capable is trivial. However that requires that you have redistributed the runtime, so a more broadly applicable approach is to use the DX APIs directly to enumerate the DX11 capable cards. That is exactly what the development lead for C++ AMP has done and he describes and shares that utility at this post. Comments about this post by Daniel Moth welcome at the original blog.

Read the article
What DX level does my graphics card support? Does it go to 11?

- by Daniel Moth

Recently I run into a situation that I have run into quite a few times. Someone encounters a machine and the question arises: "Is there a DirectX 11 card in this machine?". Typically the reason you are interested in that is because cards with DirectX 11 drivers fully support DirectCompute (and by extension C++ AMP) for GPGPU programming. The driver specifically is WDDM (1.1 on Windows 7 and Windows 8 introduces WDDM 1.2 with cool new capabilities). There are many ways for figuring out if you have a DirectX11 card, so here are the approaches that you can use, with a bonus right at the end of the post. Run DxDiag WindowsKey + R, type DxDiag and hit Enter. That is the DirectX diagnostic tool, which unfortunately, only tells you on the "System" tab what is the highest version of DirectX installed on your machine. So if it reports DirectX 11, that doesn't mean you have a DX11 driver! The "Display" tab has a promising "DDI version" label, but unfortunately that doesn't seem to be accurate on the machines I've tested it with (or I may be misinterpreting its use). Either way, this tool is not the one you want for this purpose, although it is good for telling you the WDDM version among other things. Use the Microsoft hardware page There is a Microsoft Windows 7 compatibility center, that lists all hardware (tip: use the advanced search) and you could try and locate your device there… good luck. Use Wikipedia or the hardware vendor's website Use the Wikipedia page for the vendor cards, for both nvidia and amd. Often this information will also be in the specifications for the cards on the IHV site, but is is nice that wikipedia has a single page per vendor that you can search etc. There is a column in the tables for API support where you can see the DirectX version. Check if it is one of these recommended DX11 cards You may not have a DirectX 11 card and are interested in purchasing one. While I am in no position to make recommendations, I will list here some cards from two big IHVs that we know are DirectX 11 capable. Some AMD (aka ATI) cards Low end, inexpensive DX11 hardware: Radeon 5450, 5550, 6450, 6570 Mid range (decent perf, single precision): Radeon 5750, 5770, 6770, 6790 High end (capable of double precision): Radeon 5850, 5870, 6950, 6970 Single precision APUs: AMD E-Series APUs AMD A-Series APUs Some NVIDIA cards Low end, inexpensive DX11 hardware: GeForce GT430, GT 440, GT520, GTS 450 Quadro 400, 600 Mid-range (decent perf, single precision): GeForce GTX 460, GTX 550 Ti, GTX 560, GTX 560 Ti Quadro 2000 High end (capable of double precision): GeForce GTX 480, GTX 570, GTX 580, GTX 590, GTX 595 Quadro 4000, 5000, 6000 Tesla C2050, C2070, C2075 Get the DirectX SDK and run DirectX Caps Viewer Download and install the June 2010 DirectX SDK. As part of that you now have the DirectX Capabilities Viewer utility (find it in your start menu by searching for "DirectX Caps Viewer", the filename is DXCapsViewer.exe). It will list all your devices (emulated, and real hardware ones) under the first node. Expand the hardware entries and then expand again the Direct3D 11 folder. If you see D3D_FEATURE_LEVEL_11_ under that, then your card supports feature level 11 which means it supports DirectCompute and C++ AMP. In the following screenshot of one of my old laptops, the card only goes to feature level 10. Run a utility from the web that just tells you! Of course, writing some C++ AMP code that enumerates accelerators and lists the ones that are capable is trivial. However that requires that you have redistributed the runtime, so a more broadly applicable approach is to use the DX APIs directly to enumerate the DX11 capable cards. That is exactly what the development lead for C++ AMP has done and he describes and shares that utility at this post. Comments about this post by Daniel Moth welcome at the original blog.

Read the article
UPOS RFIDScanner data format

- by Robert Snyder

A lot of work that I do currently is based in the OPOS/UPOS world. My company has a device that can read 13.56Mhz tags (RFID), Smart Cards, and Mag Stripe cards. Up until somewhat recently I have only been working with RFID for a very specific scenario. That was to read UltraLight C and Desfire cards. These cards were all setup very specifically so that I could take the data read from those cards and force it into a MSR track2 format. The past couple of weeks, however, I have been working on reading RFID credit cards (since I have a Visa card I've been using mine), and Smart Card credit cards. (The visa card I have has both) In learning how to communicate with SmartCard and reading ISO7816 and EMVCO documents I became a little more familiar with how info is stored. But now I have a question regarding UPOS. The RFID data on my Visa is stored (and read) very similar to how the data is stored and read from the Smart Card on my Visa. Cool. Well in the UPOS spec for SmartCardRW the ReadData method returns a byte array. That's cool, I can just return all that data and then parse it as my heart desires. The RFID though has a LinkedList of Tags. Well this makes sense in terms of my Visa card (reminds me of a question I have in regards to SmartCard, but that is for another question) but what about ULC and Desfire, or for that matter any Mifare card. Pages, Files, Purses don't exactly fit the Tag profile. For instance lets just say I read pages 4-12 on my ULC card. Each page I read is 4 bytes long. Does this mean I have 9 tags in my LinkedList? Is my Tag id the page number? Or then how does that translate to Desfire? I open application 123456 and read file 1 and file 2, Do I have 2 tags? and if so what is my tag id? At least with my Visa I think that I have to use the Tag id (ex 5F24 for my expiration date) and value of {0x15, 0x10, 0x31} Part of me says yes..that makes sense. Another part of me says, "well if that is the case then why doesn't SmartCardRW have Tags?" So that is my question. How do I format my data from those different types of media? or is that the job of my Control Object (the application)? Is so how does it know? The only protocols I have are: // Summary: // Enumerates the available predefined RFID tag protocols the device supports. [Flags] public enum RFIDProtocols { EpcClass0 = 1, RFIDSdt0Plus = 2, EpcClass1 = 4, EpcClass1Gen2 = 8, EpcClass2 = 16, Iso14443A = 4096, Iso14443B = 8192, Iso15693 = 12288, Iso180006B = 16384, Other = 16777216, All = 1073741824, } If I use that well all of my cards that I have are all Iso14443A. I use the ATQA and the SAK to know what type of card I really have. There is no RFID property that lets me specify that. So I'm lost.

Read the article
How do I convert a simple ruby flashcard program into a ROR app?

- by Mark Wilbur

What I'm trying to do is make a basic flashcard app on rails. At this point, all I'm looking for is the functionality to iterate through a list of flashcards, quiz the user and let the user know if they were right or not. In ruby, it didn't take me long to write: class Card attr_accessor :answer, :question def initialize(answer = "", question="") @answer = answer @question = question end def quiz puts "What does #@question mean?" answer = gets.chomp if answer == @answer puts "Right" return true else puts "Wrong" return answer end end end class Cardlist attr_accessor :Cards def initialize(Cards = []) @Cards = Cards end def quiz Cards.each do |w| w.quiz end end end The problem I'm having with rails is figuring out where to put the logic to loop through all the cards in the list. I've made models specifying that Card belongs_to cardlist and that Cardlist has_many cards. I know application logic should go in the controller, but if I were to make a "quiz" action for my Cardlist controller, how would I make it iterate through all the cards? After each "quiz" page generated, I'd need to get an answer back from the user, respond (maybe flash) whether it was right or not and then continue onto the next question. Would any of that logic have to go into the view in order to make sure it's sending back the user inputted answers to the controller?

Read the article
SAP dévoile Business Object 4.0, la nouvelle version de sa solution BI intègre la mobilité, les réseaux sociaux et le « in-memory »

SAP dévoile Business Object 4.0 La nouvelle version de sa solution BI intègre la mobilité, les réseaux sociaux et le « in-memory » SAP vient de dévoiler Business Object 4.0, la prochaine version de sa plate-forme de nouvelle génération de Business Intelligence et de Gestion d'Information d'Entreprise (EIM). [IMG]http://ftp-developpez.com/gordon-fowler/SAP/Slide-5-SAP-BusinessObjects-4.0-Event-Insight2.jpg[/IMG] Après SAP ByDesign 2.6, sa suite ERP en mode SaaS (qui arrive avec un tout nouveau SDK), Business Object 4.0 est la deuxième très grosse annonce de cette année 2011 que Nicolas Sekkaki, Direc...

Read the article
C# Performance Pitfall – Interop Scenarios Change the Rules

- by Reed

C# and .NET, overall, really do have fantastic performance in my opinion. That being said, the performance characteristics dramatically differ from native programming, and take some relearning if you’re used to doing performance optimization in most other languages, especially C, C++, and similar. However, there are times when revisiting tricks learned in native code play a critical role in performance optimization in C#. I recently ran across a nasty scenario that illustrated to me how dangerous following any fixed rules for optimization can be… The rules in C# when optimizing code are very different than C or C++. Often, they’re exactly backwards. For example, in C and C++, lifting a variable out of loops in order to avoid memory allocations often can have huge advantages. If some function within a call graph is allocating memory dynamically, and that gets called in a loop, it can dramatically slow down a routine. This can be a tricky bottleneck to track down, even with a profiler. Looking at the memory allocation graph is usually the key for spotting this routine, as it’s often “hidden” deep in call graph. For example, while optimizing some of my scientific routines, I ran into a situation where I had a loop similar to: for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i]); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This loop was at a fairly high level in the call graph, and often could take many hours to complete, depending on the input data. As such, any performance optimization we could achieve would be greatly appreciated by our users. After a fair bit of profiling, I noticed that a couple of function calls down the call graph (inside of ProcessElement), there was some code that effectively was doing: // Allocate some data required DataStructure* data = new DataStructure(num); // Call into a subroutine that passed around and manipulated this data highly CallSubroutine(data); // Read and use some values from here double values = data->Foo; // Cleanup delete data; // ... return bar; Normally, if “DataStructure” was a simple data type, I could just allocate it on the stack. However, it’s constructor, internally, allocated it’s own memory using new, so this wouldn’t eliminate the problem. In this case, however, I could change the call signatures to allow the pointer to the data structure to be passed into ProcessElement and through the call graph, allowing the inner routine to reuse the same “data” memory instead of allocating. At the highest level, my code effectively changed to something like: DataStructure* data = new DataStructure(numberToProcess); for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i], data); } delete data; Granted, this dramatically reduced the maintainability of the code, so it wasn’t something I wanted to do unless there was a significant benefit. In this case, after profiling the new version, I found that it increased the overall performance dramatically – my main test case went from 35 minutes runtime down to 21 minutes. This was such a significant improvement, I felt it was worth the reduction in maintainability. In C and C++, it’s generally a good idea (for performance) to: Reduce the number of memory allocations as much as possible, Use fewer, larger memory allocations instead of many smaller ones, and Allocate as high up the call stack as possible, and reuse memory I’ve seen many people try to make similar optimizations in C# code. For good or bad, this is typically not a good idea. The garbage collector in .NET completely changes the rules here. In C#, reallocating memory in a loop is not always a bad idea. In this scenario, for example, I may have been much better off leaving the original code alone. The reason for this is the garbage collector. The GC in .NET is incredibly effective, and leaving the allocation deep inside the call stack has some huge advantages. First and foremost, it tends to make the code more maintainable – passing around object references tends to couple the methods together more than necessary, and overall increase the complexity of the code. This is something that should be avoided unless there is a significant reason. Second, (unlike C and C++) memory allocation of a single object in C# is normally cheap and fast. Finally, and most critically, there is a large advantage to having short lived objects. If you lift a variable out of the loop and reuse the memory, its much more likely that object will get promoted to Gen1 (or worse, Gen2). This can cause expensive compaction operations to be required, and also lead to (at least temporary) memory fragmentation as well as more costly collections later. As such, I’ve found that it’s often (though not always) faster to leave memory allocations where you’d naturally place them – deep inside of the call graph, inside of the loops. This causes the objects to stay very short lived, which in turn increases the efficiency of the garbage collector, and can dramatically improve the overall performance of the routine as a whole. In C#, I tend to: Keep variable declarations in the tightest scope possible Declare and allocate objects at usage While this tends to cause some of the same goals (reducing unnecessary allocations, etc), the goal here is a bit different – it’s about keeping the objects rooted for as little time as possible in order to (attempt) to keep them completely in Gen0, or worst case, Gen1. It also has the huge advantage of keeping the code very maintainable – objects are used and “released” as soon as possible, which keeps the code very clean. It does, however, often have the side effect of causing more allocations to occur, but keeping the objects rooted for a much shorter time. Now – nowhere here am I suggesting that these rules are hard, fast rules that are always true. That being said, my time spent optimizing over the years encourages me to naturally write code that follows the above guidelines, then profile and adjust as necessary. In my current project, however, I ran across one of those nasty little pitfalls that’s something to keep in mind – interop changes the rules. In this case, I was dealing with an API that, internally, used some COM objects. In this case, these COM objects were leading to native allocations (most likely C++) occurring in a loop deep in my call graph. Even though I was writing nice, clean managed code, the normal managed code rules for performance no longer apply. After profiling to find the bottleneck in my code, I realized that my inner loop, a innocuous looking block of C# code, was effectively causing a set of native memory allocations in every iteration. This required going back to a “native programming” mindset for optimization. Lifting these variables and reusing them took a 1:10 routine down to 0:20 – again, a very worthwhile improvement. Overall, the lessons here are: Always profile if you suspect a performance problem – don’t assume any rule is correct, or any code is efficient just because it looks like it should be Remember to check memory allocations when profiling, not just CPU cycles Interop scenarios often cause managed code to act very differently than “normal” managed code. Native code can be hidden very cleverly inside of managed wrappers

Read the article
tile_static, tile_barrier, and tiled matrix multiplication with C++ AMP

- by Daniel Moth

We ended the previous post with a mechanical transformation of the C++ AMP matrix multiplication example to the tiled model and in the process introduced tiled_index and tiled_grid. This is part 2. tile_static memory You all know that in regular CPU code, static variables have the same value regardless of which thread accesses the static variable. This is in contrast with non-static local variables, where each thread has its own copy. Back to C++ AMP, the same rules apply and each thread has its own value for local variables in your lambda, whereas all threads see the same global memory, which is the data they have access to via the array and array_view. In addition, on an accelerator like the GPU, there is a programmable cache, a third kind of memory type if you'd like to think of it that way (some call it shared memory, others call it scratchpad memory). Variables stored in that memory share the same value for every thread in the same tile. So, when you use the tiled model, you can have variables where each thread in the same tile sees the same value for that variable, that threads from other tiles do not. The new storage class for local variables introduced for this purpose is called tile_static. You can only use tile_static in restrict(direct3d) functions, and only when explicitly using the tiled model. What this looks like in code should be no surprise, but here is a snippet to confirm your mental image, using a good old regular C array // each tile of threads has its own copy of locA, // shared among the threads of the tile tile_static float locA[16][16]; Note that tile_static variables are scoped and have the lifetime of the tile, and they cannot have constructors or destructors. tile_barrier In amp.h one of the types introduced is tile_barrier. You cannot construct this object yourself (although if you had one, you could use a copy constructor to create another one). So how do you get one of these? You get it, from a tiled_index object. Beyond the 4 properties returning index objects, tiled_index has another property, barrier, that returns a tile_barrier object. The tile_barrier class exposes a single member, the method wait. 15: // Given a tiled_index object named t_idx 16: t_idx.barrier.wait(); 17: // more code …in the code above, all threads in the tile will reach line 16 before a single one progresses to line 17. Note that all threads must be able to reach the barrier, i.e. if you had branchy code in such a way which meant that there is a chance that not all threads could reach line 16, then the code above would be illegal. Tiled Matrix Multiplication Example – part 2 So now that we added to our understanding the concepts of tile_static and tile_barrier, let me obfuscate rewrite the matrix multiplication code so that it takes advantage of tiling. Before you start reading this, I suggest you get a cup of your favorite non-alcoholic beverage to enjoy while you try to fully understand the code. 01: void MatrixMultiplyTiled(vector<float>& vC, const vector<float>& vA, const vector<float>& vB, int M, int N, int W) 02: { 03: static const int TS = 16; 04: array_view<const float,2> a(M, W, vA); 05: array_view<const float,2> b(W, N, vB); 06: array_view<writeonly<float>,2> c(M,N,vC); 07: parallel_for_each(c.grid.tile< TS, TS >(), 08: [=] (tiled_index< TS, TS> t_idx) restrict(direct3d) 09: { 10: int row = t_idx.local[0]; int col = t_idx.local[1]; 11: float sum = 0.0f; 12: for (int i = 0; i < W; i += TS) { 13: tile_static float locA[TS][TS], locB[TS][TS]; 14: locA[row][col] = a(t_idx.global[0], col + i); 15: locB[row][col] = b(row + i, t_idx.global[1]); 16: t_idx.barrier.wait(); 17: for (int k = 0; k < TS; k++) 18: sum += locA[row][k] * locB[k][col]; 19: t_idx.barrier.wait(); 20: } 21: c[t_idx.global] = sum; 22: }); 23: } Notice that all the code up to line 9 is the same as per the changes we made in part 1 of tiling introduction. If you squint, the body of the lambda itself preserves the original algorithm on lines 10, 11, and 17, 18, and 21. The difference being that those lines use new indexing and the tile_static arrays; the tile_static arrays are declared and initialized on the brand new lines 13-15. On those lines we copy from the global memory represented by the array_view objects (a and b), to the tile_static vanilla arrays (locA and locB) – we are copying enough to fit a tile. Because in the code that follows on line 18 we expect the data for this tile to be in the tile_static storage, we need to synchronize the threads within each tile with a barrier, which we do on line 16 (to avoid accessing uninitialized memory on line 18). We also need to synchronize the threads within a tile on line 19, again to avoid the race between lines 14, 15 (retrieving the next set of data for each tile and overwriting the previous set) and line 18 (not being done processing the previous set of data). Luckily, as part of the awesome C++ AMP debugger in Visual Studio there is an option that helps you find such races, but that is a story for another blog post another time. May I suggest reading the next section, and then coming back to re-read and walk through this code with pen and paper to really grok what is going on, if you haven't already? Cool. Why would I introduce this tiling complexity into my code? Funny you should ask that, I was just about to tell you. There is only one reason we tiled our extent, had to deal with finding a good tile size, ensure the number of threads we schedule are correctly divisible with the tile size, had to use a tiled_index instead of a normal index, and had to understand tile_barrier and to figure out where we need to use it, and double the size of our lambda in terms of lines of code: the reason is to be able to use tile_static memory. Why do we want to use tile_static memory? Because accessing tile_static memory is around 10 times faster than accessing the global memory on an accelerator like the GPU, e.g. in the code above, if you can get 150GB/second accessing data from the array_view a, you can get 1500GB/second accessing the tile_static array locA. And since by definition you are dealing with really large data sets, the savings really pay off. We have seen tiled implementations being twice as fast as their non-tiled counterparts. Now, some algorithms will not have performance benefits from tiling (and in fact may deteriorate), e.g. algorithms that require you to go only once to global memory will not benefit from tiling, since with tiling you already have to fetch the data once from global memory! Other algorithms may benefit, but you may decide that you are happy with your code being 150 times faster than the serial-version you had, and you do not need to invest to make it 250 times faster. Also algorithms with more than 3 dimensions, which C++ AMP supports in the non-tiled model, cannot be tiled. Also note that in future releases, we may invest in making the non-tiled model, which already uses tiling under the covers, go the extra step and use tile_static memory on your behalf, but it is obviously way to early to commit to anything like that, and we certainly don't do any of that today. Comments about this post by Daniel Moth welcome at the original blog.

Read the article
What's the difference between "Flash Drive" and "Flash Memory"?

- by Clive D

I have a problem with a Blu ray disk I bought. I talked to a Sony technician who advised me to plug a "USB Flash Memory Stick" into the Blu-ray player. Is this something specific? Is there a difference between the following two? "USB Flash Drive" "USB Flash Memory" When I go to Curry's or other sites that sell USB Sticks, they only talk about "USB Flash Drives". I've been in computing for many years and know the basics, but "memory" and "drive" are different things to me.

Read the article
Fatal error: Out of memory (allocated ...) (tried to allocate ... bytes) not due to memory_limit setting

- by Lorenz Meyer

Since a few days, I get the following error on my server: Fatal error: Out of memory (allocated 262144) (tried to allocate 393216 bytes) Usually this error is due to a memory consumption that is exceeding the configured memory_limit, but in my case there is no relation. The memory_limit is set to 128MB, and in this case, we not even reach 1MB. Also the server does not have a big load, in fact it is an intranet server, and there are just a few people conected to it. System: Windows Server 2003, 1Go RAM, only 600 MB used. Apache 2.2.4 PHP 5.2.3 This error is appearing randomly. The memory limit reached also is randomly between a few kB to a few MB. Sometimes restarting Apache is required to get rid of the error, sometimes it disapears itself. Restarting Apache or the entire server helps temporarily. Where could this problem come from ? How could I narrow down the error source ?

Read the article
How do I take some RAM and use it towards Dedicated video memory for my Nvidia graphics card?

- by Noah Rainey

I have a Nividia GeForce 6150SE nForce 430 graphics card (so it's quite old), it only gets 64MB of dedicated memory by default. I went into the bios and see if I can increase it, but it wouldn't let me. However, from the Nividia control panel I see I have up to 1071MB of total available graphics memory. I'm not sure what that means and I'm not sure how I can harness this memory and use some RAM for my graphics card. Can someone explain if this is possible and if so, how?

Read the article
TechEd 2014 Day 3

- by John Paul Cook

There is some confusion about durability of data stored in SQL Server in-memory tables, so some review of the concepts is appropriate. The in-memory option is enabled at the database level. Enabling it at the database level only gives you the option to specify the in-memory feature on a table by table basis. No existing tables or new tables will by default become in-memory tables when you enable the feature at the database level. If you choose to make a table an in-memory table, by default it is...(read more)

Read the article
TechEd 2014 Day 3

- by John Paul Cook

There is some confusion about durability of data stored in SQL Server in-memory tables, so some review of the concepts is appropriate. The in-memory option is enabled at the database level. Enabling it at the database level only gives you the option to specify the in-memory feature on a table by table basis. No existing tables or new tables will by default become in-memory tables when you enable the feature at the database level. If you choose to make a table an in-memory table, by default it is...(read more)

Read the article
How long does it take in practice to warm up large in-memory databases?

- by Sim

Companies such as Peak Hosting are offering 64 core machines with 512Gb RAM for $2K/month. This is a very interesting choice for in-memory databases such as Memcached/Redis as well as databases whose performance degrades rapidly when the data & indexes don't fit in RAM, such as MongoDB. My main concern with monster machines such as these is the time it takes to warm up an in-memory database. In my experience, theoretical metrics, e.g., that SATA can load 100Mb/sec, fall short of what happens in practice. Even at that rate, 100Mb/sec means that loading up 512Gb RAM machine from SATA disks can take over 1 1/2 hours (!). I am looking for real-world reports of warm-up times for machines with very large memory. Please, share details of the software on the machine, data size, storage configuration, e.g., SATA or SSD, network, hosting/cloud provider, if relevant, etc.

Read the article
Why does using 2 memory sticks cause my computer to crash?

- by hi

My computer randomly crashes when playing games, but if I remove one memory stick (it does not matter which one I remove), it does not crash anymore. Memory tests do not find errors, I just put in a new power supply (650W), I only have 1 graphics card, so why is this happening? BTW, they are the same memory, same vendor same specs, everything I bought it together (2x2GB) My motherboard is a Asus P5Q Pro, so it supports both dual channel and more than 4gb. Switching slots does nothing, as long as I don't use more than 1 I'm fine.

Read the article
Why OS X use swap when there is lots of "inactive memory"?

- by Balchev

I am using OS X from few months (Lion and now Mountain Lion). I have 8 GB on my mini and almost daily now it get close to that. On Windows 7 machine with 8 GB I never had that kind of problem. Anyway, I read over the net, that the inactive memory is app cache of the programs that are recently closed and can be used for faster reopening.And this inactive memory can be released to a new app if needed. It is not released. Instead OS X starts swapping. So my question is why OS X use swap when there is lots of "inactive memory"? Here a screen that shows what I mean: I really hope there is a away to make OS X to use those 2.69 GB before start swapping.I really do.

Read the article
I asked this yesterday, after the input given I'm still having trouble implementing..

- by Josh

I'm not sure how to fix this or what I did wrong, but whenever I enter in a value it just closes out the run prompt. So, seems I do have a problem somewhere in my coding. Whenever I run the program and input a variable, it always returns the same answer.."The content at location 76 is 0." On that note, someone told me that "I don't know, but I suspect that Program A incorrectly has a fixed address being branched to on instructions 10 and 11." - mctylr but I'm not sure how to fix that.. I'm trying to figure out how to incorporate this idea from R Samuel Klatchko.. I'm still not sure what I'm missing but I can't get it to work.. const int OP_LOAD = 3; const int OP_STORE = 4; const int OP_ADD = 5; ... const int OP_LOCATION_MULTIPLIER = 100; mem[0] = OP_LOAD * OP_LOCATION_MULTIPLIER + ...; mem[1] = OP_ADD * OP_LOCATION_MULTIPLIER + ...; operand = memory[ j ] % OP_LOCATION_MULTIPLIER; operation = memory[ j ] / OP_LOCATION_MULTIPLIER; I'm new to programming, I'm not the best, so I'm going for simplicity. Also this is an SML program. Anyway, this IS a homework assignment and I'm wanting a good grade on this. So I was looking for input and making sure this program will do what I'm hoping they are looking for. Anyway, here are the instructions: Write SML (Simpletron Machine language) programs to accomplish each of the following task: A) Use a sentinel-controlled loop to read positive number s and compute and print their sum. Terminate input when a neg number is entered. B) Use a counter-controlled loop to read seven numbers, some positive and some negative, and compute + print the avg. C) Read a series of numbers, and determine and print the largest number. The first number read indicates how many numbers should be processed. Without further a due, here is my program. All together. int main() { const int READ = 10; const int WRITE = 11; const int LOAD = 20; const int STORE = 21; const int ADD = 30; const int SUBTRACT = 31; const int DIVIDE = 32; const int MULTIPLY = 33; const int BRANCH = 40; const int BRANCHNEG = 41; const int BRANCHZERO = 41; const int HALT = 43; int mem[100] = {0}; //Making it 100, since simpletron contains a 100 word mem. int operation; //taking the rest of these variables straight out of the book seeing as how they were italisized. int operand; int accum = 0; // the special register is starting at 0 int j; // This is for part a, it will take in positive variables in a sent-controlled loop and compute + print their sum. Variables from example in text. memory [0] = 1010; memory [01] = 2009; memory [02] = 3008; memory [03] = 2109; memory [04] = 1109; memory [05] = 4300; memory [06] = 1009; j = 0; //Makes the variable j start at 0. while ( true ) { operand = memory[ j ]%100; // Finds the op codes from the limit on the memory (100) operation = memory[ j ]/100; //using a switch loop to set up the loops for the cases switch ( operation ){ case 10: //reads a variable into a word from loc. Enter in -1 to exit cout <<"\n Input a positive variable: "; cin >> memory[ operand ]; break; case 11: // takes a word from location cout << "\n\nThe content at location " << operand << "is " << memory[operand]; break; case 20:// loads accum = memory[ operand ]; break; case 21: //stores memory[ operand ] = accum; break; case 30: //adds accum += mem[operand]; break; case 31: // subtracts accum-= memory[ operand ]; break; case 32: //divides accum /=(memory[ operand ]); break; case 33: // multiplies accum*= memory [ operand ]; break; case 40: // Branches to location j = -1; break; case 41: //branches if acc. is < 0 if (accum < 0) j = 5; break; case 42: //branches if acc = 0 if (accum == 0) j = 5; break; case 43: // Program ends exit(0); break; } j++; } return 0; }

Read the article
Is there a way to set the amount of memory available in the iPhone Simulator?

- by Rob Segal

Does anyone know if its possible to set the amount of memory available in the simulator? I'm assuming the simulator will use as much memory as possible from the system but this makes it more difficult to recreate certain low memory crashes/bugs.

Read the article
How does the linux kernel manage less than 1GB physical memory ?

- by TheLoneJoker

I'm learning the linux kernel internals and while reading "Understanding Linux Kernel", quite a few memory related questions struck me. One of them is, how the Linux kernel handles the memory mapping if the physical memory of say only 512 MB is installed on my system. As I read, kernel maps 0(or 16) MB-896MB physical RAM into 0xC0000000 linear address and can directly address it. So, in the above described case where I only have 512 MB: How can the kernel map 896 MB from only 512 MB ? What about user mode processes in this situation? Where are user mode processes in phys RAM? Every article explains only the situation, when you've installed 4 GB of memory and the kernel maps the 1 GB into kernel space and user processes uses the remaining amount of RAM. I would appreciate any help in improving my understanding. Thanks..!

Read the article
How to check memory performance in Activity Monitor in instruments of Xcode and and How to slove it?

- by Madan Mohan

Hi Guys, I have to check memory performance of my application I have solved Leaks now I want to improve the performance of memory. So, Please tell me the process how can I fix and improve the memory performance. Thank You.

Read the article
Why 2 GB memory limit when running in 64 bit Windows ?

- by Roland Bengtsson

I'm a member in a team that develop a Delphi application. The memory requirements are huge. 500 MB is normal but in some cases it got out of memory exception. The memory allocated in that cases is typically between 1000 - 1700 MB. We of course want 64-bits compiler but that won't happen now (and if it happens we also must convert to unicode, but that is another story...). My question is why is there a 2 GB memory limit per process when running in a 64 bit environment. The pointer is 32 bit so I think 4 GB would be the right limit. I use Delphi 2007.

Read the article
PHP Memory limit problem while creating xml of magento products..

- by Jitendra

Hello Masters, Thanks in advance, I need help in solving php memory problem, I have created a script in php that automatically fetch magento product data,the problem is that when there is large number of product in database, the script gives memory fatal error i have changed the memory limit to 256M in my php.ini but still the script not executing totally. i have checked the script its working fine if there is number of product is not too much but if there is larger number my script not working.. Please help... -Thanks Jitendra Dhobi

Read the article
Problem in migrating to LAMP from XAMPP.. Memory limit error

- by Geshan

I was using XAMPP for my local machine but as I wanted to run applications like mysql work bench and some test frameworks I decided to switch to LAMP self install. I'm using ubuntu and followed the instructions at: https://help.ubuntu.com/community/ApacheMySQLPHP But the problem is LAMP is consuming too much of my memory (RAM) I've allocated 124 MB currently but still it gives me memory exhausted error when I run Drush (Drupal command line). When I do drush cc to clear cache it give me the following: Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 122880 bytes) in /var/www/----/sites/all/modules/ubercart/uc_order/uc_order.order_pane.inc on line 150 Call Stack: 0.0020 185624 1. {main}() /opt/drush/drush.php:0 0.0254 1303672 2. drush_main() /opt/drush/drush.php:37 0.2674 5107784 3. drush_bootstrap() /opt/drush/drush.php:71 0.2676 5109872 4. _drush_bootstrap_drupal_full() /opt/drush/includes/environment.inc:173 0.2676 5151032 5. drupal_bootstrap() /opt/drush/includes/environment.inc:655 0.3030 7739048 6. _drupal_bootstrap() /var/www/missmoti/includes/bootstrap.inc:989 0.3122 8855792 7. _drupal_bootstrap_full() /var/www/missmoti/includes/bootstrap.inc:1078 0.3445 12387320 8. module_load_all() /var/www/missmoti/includes/common.inc:2608 0.5194 32586544 9. drupal_load() /var/www/missmoti/includes/module.inc:14 0.5251 33361112 10. include_once('/var/www/missmoti/sites/all/modules/ubercart/uc_order/uc_order.module') /var/www/-----/includes/bootstrap.inc:617 Drush command could not be completed. In each error it shows me a back trace and I guess this default debugger I'm not aware of in Apache or my PHP config it eating up the memory. If anyone can help I"d be glad. Another error below: Fatal error: Call to undefined function dsm() in /var/www/-----/sites/all/modules/custom/gtpath/gtpath.module on line 180 Call Stack # Time Memory Function Location 1 0.0002 120144 {main}( ) ../index.php:0 2 1.7604 68224112 theme( ) ../index.php:36 3 2.0188 77346112 call_user_func_array ( ) ../theme.inc:658 4 2.0188 77347024 gtpath_preprocess_page( ) ../theme.inc:0 how do I deal with this default debugger? how do I turn it off??

Read the article

Search Results

Search found 13634 results on 546 pages for 'memory cards'.

Page 153/546 | < Previous Page | 149 150 151 152 153 154 155 156 157 158 159 160 | Next Page >

- by Matthew Patrick Cashatt

- by Michal P.

- by James Burgess

- by Daniel Moth

- by Daniel Moth

- by Robert Snyder

- by Mark Wilbur

- by Reed

- by Daniel Moth

- by Clive D

- by Lorenz Meyer

- by Noah Rainey

- by John Paul Cook

- by John Paul Cook

- by Sim

- by hi

- by Balchev

- by Josh

- by Rob Segal

- by TheLoneJoker

- by Madan Mohan

- by Roland Bengtsson

- by Jitendra

- by Geshan

< Previous Page | 149 150 151 152 153 154 155 156 157 158 159 160 | Next Page >