Search Results

Search found 751 results on 31 pages for 'quad'.

Page 27/31 | < Previous Page | 23 24 25 26 27 28 29 30 31 | Next Page >

WPF drawing performance with large numbers of geometries

- by MyFaJoArCo

Hello, I have problems with WPF drawing performance. There are a lot of small EllipseGeometry objects (1024 ellipses, for example), which are added to three separate GeometryGroups with different foreground brushes. After, I render it all on simple Image control. Code: DrawingGroup tmpDrawing = new DrawingGroup(); GeometryGroup onGroup = new GeometryGroup(); GeometryGroup offGroup = new GeometryGroup(); GeometryGroup disabledGroup = new GeometryGroup(); for (int x = 0; x < DisplayWidth; ++x) { for (int y = 0; y < DisplayHeight; ++y) { if (States[x, y] == true) onGroup.Children.Add(new EllipseGeometry(new Rect((double)x * EDGE, (double)y * EDGE, EDGE, EDGE))); else if (States[x, y] == false) offGroup.Children.Add(new EllipseGeometry(new Rect((double)x * EDGE, (double)y * EDGE, EDGE, EDGE))); else disabledGroup.Children.Add(new EllipseGeometry(new Rect((double)x * EDGE, (double)y * EDGE, EDGE, EDGE))); } } tmpDrawing.Children.Add(new GeometryDrawing(OnBrush, null, onGroup)); tmpDrawing.Children.Add(new GeometryDrawing(OffBrush, null, offGroup)); tmpDrawing.Children.Add(new GeometryDrawing(DisabledBrush, null, disabledGroup)); DisplayImage.Source = new DrawingImage(tmpDrawing); It works fine, but takes too much time - 0.5s on Core 2 Quad, 2s on Pentium 4. I need <0.1s everywhere. All Ellipses, how you can see, are equal. Background of control, where is my DisplayImage, is solid (black, for example), so we can use this fact. I tried to use 1024 Ellipse elements instead of Image with EllipseGeometries, and it was working much faster (~0.5s), but not enough. How to speed up it? Regards, Oleg Eremeev P.S. Sorry for my English.

Read the article
Bilinear interpolation - DirectX vs. GDI+

- by holtavolt

I have a C# app for which I've written GDI+ code that uses Bitmap/TextureBrush rendering to present 2D images, which can have various image processing functions applied. This code is a new path in an application that mimics existing DX9 code, and they share a common library to perform all vector and matrix (e.g. ViewToWorld/WorldToView) operations. My test bed consists of DX9 output images that I compare against the output of the new GDI+ code. A simple test case that renders to a viewport that matches the Bitmap dimensions (i.e. no zoom or pan) does match pixel-perfect (no binary diff) - but as soon as the image is zoomed up (magnified), I get very minor differences in 5-10% of the pixels. The magnitude of the difference is 1 (occasionally 2)/256. I suspect this is due to interpolation differences. Question: For a DX9 ortho projection (and identity world space), with a camera perpendicular and centered on a textured quad, is it reasonable to expect DirectX.Direct3D.TextureFilter.Linear to generate identical output to a GDI+ TextureBrush filled rectangle/polygon when using the System.Drawing.Drawing2D.InterpolationMode.Bilinear setting? For this (magnification) case, the DX9 code is using this (MinFilter,MipFilter set similarly): Device.SetSamplerState(0, SamplerStageStates.MagFilter, (int)TextureFilter.Linear); and the GDI+ path is using: g.InterpolationMode = InterpolationMode.Bilinear; I thought that "Bilinear Interpolation" was a fairly specific filter definition, but then I noticed that there is another option in GDI+ for "HighQualityBilinear" (which I've tried, with no difference - which makes sense given the description of "added prefiltering for shrinking") Followup Question: Is it reasonable to expect pixel-perfect output matching between DirectX and GDI+ (assuming all external coordinates passed in are equal)? If not, why not? Finally, there are a number of other APIs I could be using (Direct2D, WPF, GDI, etc.) - and this question generally applies to comparing the output of "equivalent" bilinear interpolated output images across any two of these. Thanks!

Read the article
OpenGL pixels drawn with each horizontal pair swapped

- by Tim Kane

I'm somewhat new to OpenGL though I'm fairly sure my problem lies in the pixel format being used, or how my texture is being generated... I'm drawing a texture onto a flat 2D quad using a 16bit RGB5_A1 pixel format, though I don't make use of any alpha at this stage. The problem I'm having is that each pair of horizontal pixel values have been swapped. That is... if the pixels positions should be in this order (assume 8x2 image) 0 1 2 3 4 5 6 7 they are instead drawn as 1 0 3 2 5 4 7 6 Or, more clearly from this image (below). Left is what I get... Right is what I should get. . The question is... How have I ended up with this? Is there something wrong with the pixel format? Unlikely since the colours all appear correct, and I would expect all kinds of nasty if it were down to endian-ness. Suggestions greatly appreciated. Update: Turns out the problem was in my source renderer. Interestingly, I've avoided the problem entirely by using 32-bit textures (haven't tried 24-bit at this point).

Read the article
changing the serialization procedure for a graph of objects (.net framework)

- by pierusch

Hello I'm developing a scientific application using .net framework. The application depends heavily upon a large data structure (a tree like structure) that has been serialized using a standard binaryformatter object. The graph structure looks like this: <serializable()>Public class BigObjet inherits list(of smallObject) end class <serializable()>public class smallObject inherits list(of otherSmallerObjects) end class ... The binaryFormatter object does a nice job but it's not optimized at all and the entire data structure reaches around 100Mb on my filesystem. Deserialization works too but it's pretty slow (around 30seconds on my quad core). I've found a nice .dll on codeproject (see "optimizing serialization...") so I wrote a modified version of the classes above overriding the default serialization/deserialization procedure reaching very good results. The problem is this: I can't lose the data previosly serialized with the old version and I'd like to be able to use the new serialization/deserialization method. I have some ideas but I'm pretty sure someone will be able to give me a proper and better advice ! use an "helper" graph of objects who takes care of the entire serialization/deserialization procedure reading data from the old format and converting them into the classes I nedd. This could work but the binaryformatter "needs" to know the types being serialized so........ :( modify the "old" graph to include a modified version of serialization procedure...so I'll be able to deserialize old file and save them with the new format......this doesn't sound too good imho. well any help will be higly highly appreciated :)

Read the article
What database strategy to choose for a large web application

- by Snoopy

I have to rewrite a large database application, running on 32 servers. The hardware is up to date, each machine has two quad core Xeon and 32 GByte RAM. The database is multi-tenant, each customer has his own file, around 5 to 10 GByte each. I run around 50 databases on this hardware. The app is open to the web, so I have no control on the load. There are no really complex queries, so SQL is not required if there is a better solution. The databases get updated via FTP every day at midnight. The database is read-only. C# is my favourite language and I want to use ASP.NET MVC. I thought about the following options: Use two big SQL servers running SQL Server 2012 to serve the 32 servers with data. On the 32 servers running IIS hosting providing REST services. Denormalize the database and use Redis on each webserver. Use booksleeve as a Redis client. Use a combination of SQL Server and Redis Use SQL Server 2012 together with Hadoop Use Hadoop without SQL Server What is the best way for a read-only database, to get the best performance without loosing maintainability? Does Map-Reduce make sense at all in such a scenario? The reason for the rewrite is, the old app written in C++ with ISAM technology is too slow, the interfaces are old fashioned and not nice to use from an website, especially when using ajax. The app uses a relational datamodel with many tables, but it is possible to write one accerlerator table where all queries can be performed on, and all other information from the other tables are possible by a simple key lookup.

Read the article
gcc compilations (sometimes) result in cpu underload

- by confusedCoder

I have a larger C++ program which starts out by reading thousands of small text files into memory and storing data in stl containers. This takes about a minute. Periodically, a compilation will exhibit behavior where that initial part of the program will run at about 22-23% CPU load. Once that step is over, it goes back to ~100% CPU. It is more likely to happen with O2 flag turned on but not consistently. It happens even less often with the -p flag which makes it almost impossible to profile. I did capture it once but the gprof output wasn't helpful - everything runs with the same relative speed just at low cpu usage. I am quite certain that this has nothing to do with multiple cores. I do have a quad-core cpu, and most of the code is multi-threaded, but I tested this issue running a single thread. Also, when I run the problematic step in multiple threads, each thread only runs at ~20% CPU. I apologize ahead of time for the vagueness of the question but I have run out of ideas as to how to troubleshoot it further, so any hints might be helpful. UPDATE: Just to make sure it's clear, the problematic part of the code does sometimes (~30-40% of the compilations) run at 100% CPU, so it's hard to buy the (otherwise reasonable) argument that I/O is the bottleneck

Read the article
How to detect .NET WPF memory leak or GC long run?

- by Néstor Sánchez A.

I have the next very strange situation and problem: .NET 4.0 application for diagram editing (WPF). Runs ok in my PC: 8GM RAM, 3.0GHz, i7 quad-core. While creating objects (mostly diagram nodes and connectors, plus all the undo/redo information) the TaskManager show, as expected, some memory usage "jumps" (up and down). These mem-usage "jumps" also remains executing AFTER user interaction ended. Maybe this is the GC cleaning/regorganizing memory? To see what is going on, I've used the Ants mem profiler, but somewhat it prevents those "jumps" to happen after user interaction. PROBLEM: It Freezes/Hangs after seconds or minutes of usage in some slow/weak laptos/netbooks of my beta testers (under 2GHz of speed and under 2GB of RAM). I was thinking of a memory leak, but... EDIT: Also, there is the case that the memory usage grows and grows until collapse (only in slow machines). In a Windows XP Mode machine (VM in Win 7) with only 512MB of RAM Assigned it works fine without mem-usage "jumps" after user interaction (no GC cleaning?!). So, I really have a big trouble because I cannot reproduce the error, only see these strange behaviour (mem jumps), and the tool supposed to show me what is happening is hiding the problem (like the "observer's paradox"). Any ideas on what's happening and how to solve it?

Read the article
I need to make a multithreading program (python)

- by Andreawu98

import multiprocessing import time from itertools import product out_file = open("test.txt", 'w') P = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p','q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',] N = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] M = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] c = int(input("Insert the number of digits you want: ")) n = int(input("If you need number press 1: ")) m = int(input("If you need upper letters press 1: ")) i = [] if n == 1: P = P + N if m == 1: P = P + M then = time.time() def worker(): for i in product(P, repeat=c): #check every possibilities k = '' for z in range(0, c): # k = k + str(i[z]) # print each possibility in a txt without parentesis or comma out_file.write( k + '\n') # out_file.close() now = time.time() diff = str(now - then) # To see how long does it take print(diff) worker() time.sleep(10) # just to check console The code check every single possibility and print it out in a test.txt file. It works but I really can't understand how can I speed it up. I saw it use 1 core out of my quad core CPU so I thought Multi-threading might work even though I don't know how. Please help me. Sorry for my English, I am from Italy.

Read the article
Object disapear/don't scale in the Z-AXIS of OPENGL.

- by user315684

This code is susposed to have a QUAD orbit around a center point in basically a circle. The problem is while it does the X rotation fine it disapears when it moves in Z axis and doesn't appear to change in size. It feel like it's rendering everything in Orthagraphic view or something. This is my first OpenGL project. OPENGL CODE START HERE glClearColor(0.0f, 0.0f, 0.0f, 0.0f); glClear(GL_COLOR_BUFFER_BIT); glMatrixMode (GL_PROJECTION); glPushMatrix(); //glRotatef(theta, 0.0f, 0.0f, 1.0f); glScalef(0.75f, 0.75f, 0.75f); glTranslatef(planeX, -0.0f, 0.0f); glBegin(GL_QUADS); glColor3f(1.0f, 0.0f, 0.0f); glVertex3f(0.0f, 0.0f, planeZ); glColor3f(0.0f, 1.0f, 0.0f); glVertex3f(0.0f, 1.0f, planeZ); glColor3f(0.0f, 0.0f, 1.0f); glVertex3f(1.0f, 1.0f, planeZ); glColor3f(0.0f, 0.0f, 1.0f); glVertex3f(1.0f, 0.0f, planeZ); glEnd(); glPopMatrix(); SwapBuffers(hDC); theta += 1.0f; planeX = (sin(0.0314159265f*theta)); planeZ = (cos(0.0314159265f*theta)); Sleep (1); ENDS HERE

Read the article
how to solve a weired swig python c++ interfacing type error

- by user2981648

I want to use swig to switch a simple cpp function to python and use "scipy.integrate.quadrature" function to calculate the integration. But python 2.7 reports a type error. Do you guys know what is going on here? Thanks a lot. Furthermore, "scipy.integrate.quad" runs smoothly. So is there something special for "scipy.integrate.quadrature" function? The code is in the following: File "testfunctions.h": #ifndef TESTFUNCTIONS_H #define TESTFUNCTIONS_H double test_square(double x); #endif File "testfunctions.cpp": #include "testfunctions.h" double test_square(double x) { return x * x; } File "swig_test.i" : /* File : swig_test.i */ %module swig_test %{ #include "testfunctions.h" %} /* Let's just grab the original header file here */ %include "testfunctions.h" File "test.py": import scipy.integrate import _swig_test print scipy.integrate.quadrature(_swig_test.test_square, 0., 1.) error info: UMD has deleted: _swig_test Traceback (most recent call last): File "<stdin>", line 1, in <module> File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 523, in runfile execfile(filename, namespace) File "D:\data\haitaliu\Desktop\Projects\swig_test\Release\test.py", line 4, in <module> print scipy.integrate.quadrature(_swig_test.test_square, 0., 1.) File "C:\Python27\lib\site-packages\scipy\integrate\quadrature.py", line 161, in quadrature newval = fixed_quad(vfunc, a, b, (), n)[0] File "C:\Python27\lib\site-packages\scipy\integrate\quadrature.py", line 61, in fixed_quad return (b-a)/2.0*sum(w*func(y,*args),0), None File "C:\Python27\lib\site-packages\scipy\integrate\quadrature.py", line 90, in vfunc return func(x, *args) TypeError: in method 'test_square', argument 1 of type 'double'

Read the article
How to make this JavaScript much faster?

- by Ralph

Still trying to answer this question, and I think I finally found a solution, but it runs too slow. var $div = $('<div>') .css({ 'border': '1px solid red', 'position': 'absolute', 'z-index': '65535' }) .appendTo('body'); $('body *').live('mousemove', function(e) { var topElement = null; $('body *').each(function() { if(this == $div[0]) return true; var $elem = $(this); var pos = $elem.offset(); var width = $elem.width(); var height = $elem.height(); if(e.pageX > pos.left && e.pageY > pos.top && e.pageX < (pos.left + width) && e.pageY < (pos.top + height)) { var zIndex = document.defaultView.getComputedStyle(this, null).getPropertyValue('z-index'); if(zIndex == 'auto') zIndex = $elem.parents().length; if(topElement == null || zIndex > topElement.zIndex) { topElement = { 'node': $elem, 'zIndex': zIndex }; } } }); if(topElement != null ) { var $elem = topElement.node; $div.offset($elem.offset()).width($elem.width()).height($elem.height()); } }); It basically loops through all the elements on the page and finds the top-most element beneath the cursor. Is there maybe some way I could use a quad-tree or something and segment the page so the loop runs faster?

Read the article
Optimize Duplicate Detection

- by Dave Jarvis

Background This is an optimization problem. Oracle Forms XML files have elements such as: <Trigger TriggerName="name" TriggerText="SELECT * FROM DUAL" ... /> Where the TriggerText is arbitrary SQL code. Each SQL statement has been extracted into uniquely named files such as: sql/module=DIAL_ACCESS+trigger=KEY-LISTVAL+filename=d_access.fmb.sql sql/module=REP_PAT_SEEN+trigger=KEY-LISTVAL+filename=rep_pat_seen.fmb.sql I wrote a script to generate a list of exact duplicates using a brute force approach. Problem There are 37,497 files to compare against each other; it takes 8 minutes to compare one file against all the others. Logically, if A = B and A = C, then there is no need to check if B = C. So the problem is: how do you eliminate the redundant comparisons? The script will complete in approximately 208 days. Script Source Code The comparison script is as follows: #!/bin/bash echo Loading directory ... for i in $(find sql/ -type f -name \*.sql); do echo Comparing $i ... for j in $(find sql/ -type f -name \*.sql); do if [ "$i" = "$j" ]; then continue; fi # Case insensitive compare, ignore spaces diff -IEbwBaq $i $j > /dev/null # 0 = no difference (i.e., duplicate code) if [ $? = 0 ]; then echo $i :: $j >> clones.txt fi done done Question How would you optimize the script so that checking for cloned code is a few orders of magnitude faster? System Constraints Using a quad-core CPU with an SSD; trying to avoid using cloud services if possible. The system is a Windows-based machine with Cygwin installed -- algorithms or solutions in other languages are welcome. Thank you!

Read the article
The most efficient way to calculate an integral in a dataset range

- by Annalisa

I have an array of 10 rows by 20 columns. Each columns corresponds to a data set that cannot be fitted with any sort of continuous mathematical function (it's a series of numbers derived experimentally). I would like to calculate the integral of each column between row 4 and row 8, then store the obtained result in a new array (20 rows x 1 column). I have tried using different scipy.integrate modules (e.g. quad, trpz,...). The problem is that, from what I understand, scipy.integrate must be applied to functions, and I am not sure how to convert each column of my initial array into a function. As an alternative, I thought of calculating the average of each column between row 4 and row 8, then multiply this number by 4 (i.e. 8-4=4, the x-interval) and then store this into my final 20x1 array. The problem is...ehm...that I don't know how to calculate the average over a given range. The question I am asking are: Which method is more efficient/straightforward? Can integrals be calculated over a data set like the one that I have described? How do I calculate the average over a range of rows?

Read the article
List of values as keys for a Map

- by thr

I have lists of variable length where each item can be one of four unique, that I need to use as keys for another object in a map. Assume that each value can be either 0, 1, 2 or 3 (it's not integer in my real code, but a lot easier to explain this way) so a few examples of key lists could be: [1, 0, 2, 3] [3, 2, 1] [1, 0, 0, 1, 1, 3] [2, 3, 1, 1, 2] [1, 2] So, to re-iterate: each item in the list can be either 0, 1, 2 or 3 and there can be any number of items in a list. My first approach was to try to hash the contents of the array, using the built in GetHashCode() in .NET to combine the hash of each element. But since this would return an int I would have to deal with collisions manually (two equal int values are identical to a Dictionary). So my second approach was to use a quad tree, breaking down each item in the list into a Node that has four pointers (one for each possible value) to the next four possible values (with the root node representing [], an empty list), inserting [1, 0, 2] => Foo, [1, 3] => Bar and [1, 0] => Baz into this tree would look like this: Grey nodes nodes being unused pointers/nodes. Though I worry about the performance of this setup, but there will be no need to deal with hash collisions and the tree won't become to deep (there will mostly be lists with 2-6 items stored, rarely over 6). Is there some other magic way to store items with lists of values as keys that I have missed?

Read the article
BizTalk 2009 - BizTalk Benchmark Wizard: Running a Test

- by StuartBrierley

The BizTalk Benchmark Wizard is a ultility that can be used to gain some validation of a BizTalk installation, giving a level of guidance on whether it is performing as might be expected. It should be used after BizTalk Server has been installed and before any solutions are deployed to the environment. This will ensure that you are getting consistent and clean results from the BizTalk Benchmark Wizard. The BizTalk Benchmark Wizard applies load to the BizTalk Server environment under a choice of specific scenarios. During these scenarios performance counter information is collected and assessed against statistics that are appropriate to the BizTalk Server environment. For details on installing the Benchmark Wizard see my previous post. The BizTalk Benchmarking Wizard provides two simple test scenarios, one for messaging and one for Orchestrations, which can be used to test your BizTalk implementation. Messaging Loadgen generates a new XML message and sends it over NetTCP A WCF-NetTCP Receive Location receives a the xml document from Loadgen. The PassThruReceive pipeline performs no processing and the message is published by the EPM to the MessageBox. The WCF One-Way Send Port, which is the only subscriber to the message, retrieves the message from the MessageBox The PassThruTransmit pipeline provides no additional processing The message is delivered to the back end WCF service by the WCF NetTCP adapter Orchestrations Loadgen generates a new XML message and sends it over NetTCP A WCF-NetTCP Receive Location receives a the xml document from Loadgen. The XMLReceive pipeline performs no processing and the message is published by the EPM to the MessageBox. The message is delivered to a simple Orchestration which consists of a receive location and a send port The WCF One-Way Send Port, which is the only subscriber to the Orchestration message, retrieves the message from the MessageBox The PassThruTransmit pipeline provides no additional processing The message is delivered to the back end WCF service by the WCF NetTCP adapter Below is a quick outline of how to run the BizTalk Benchmark Wizard on a single server, although it should be noted that this is not ideal as this server is then both generating and processing the load. In order to separate this load out you should run the "Indigo" service on a seperate server. To start the BizTalk Benchmark Wizard click Start > All Programs > BizTalk Benchmark Wizard > BizTalk Benchmark Wizard. On this screen click next, you will then get the following pop up window. Check the server and database names and check the "check prerequsites" check-box before pressing ok. The wizard will then check that the appropriate test scenarios are installed. You should then choose the test scenario that wish to run (messaging or orchestration) and the architecture that most closely matches your environment. You will then be asked to confirm the host server for each of the host instances. Next you will be presented with the prepare screen. You will need to start the indigo service before pressing the Test Indigo Service Button. If you are running the indigo service on a separate server you can enter the server name here. To start the indigo service click Start > All Programs > BizTalk Benchmark Wizard > Start Indigo Service. While the test is running you will be presented with two speed dial type displays - one for the received messages per second and one for the processed messages per second. The green dial shows the current rate and the red dial shows the overall average rate. Optionally you can view the CPU usage of the various servers involved in processing the tests. For my development environment I expected low results and this is what I got. Although looking at the online high scores table and comparing to the quad core system listed, the results are perhaps not really that bad. At some time I may look at what improvements I can make to this score, but if you are interested in that now take a look at Benchmark your BizTalk Server (Part 3).

Read the article
GLSL subroutine not being used

- by amoffat

I'm using a gaussian blur fragment shader. In it, I thought it would be concise to include 2 subroutines: one for selecting the horizontal texture coordinate offsets, and another for the vertical texture coordinate offsets. This way, I just have one gaussian blur shader to manage. Here is the code for my shader. The {{NAME}} bits are template placeholders that I substitute in at shader compile time: #version 420 subroutine vec2 sample_coord_type(int i); subroutine uniform sample_coord_type sample_coord; in vec2 texcoord; out vec3 color; uniform sampler2D tex; uniform int texture_size; const float offsets[{{NUM_SAMPLES}}] = float[]({{SAMPLE_OFFSETS}}); const float weights[{{NUM_SAMPLES}}] = float[]({{SAMPLE_WEIGHTS}}); subroutine(sample_coord_type) vec2 vertical_coord(int i) { return vec2(0.0, offsets[i] / texture_size); } subroutine(sample_coord_type) vec2 horizontal_coord(int i) { //return vec2(offsets[i] / texture_size, 0.0); return vec2(0.0, 0.0); // just for testing if this subroutine gets used } void main(void) { color = vec3(0.0); for (int i=0; i<{{NUM_SAMPLES}}; i++) { color += texture(tex, texcoord + sample_coord(i)).rgb * weights[i]; color += texture(tex, texcoord - sample_coord(i)).rgb * weights[i]; } } Here is my code for selecting the subroutine: blur_program->start(); blur_program->set_subroutine("sample_coord", "vertical_coord", GL_FRAGMENT_SHADER); blur_program->set_int("texture_size", width); blur_program->set_texture("tex", *deferred_output); blur_program->draw(); // draws a quad for the fragment shader to run on and: void ShaderProgram::set_subroutine(constr name, constr routine, GLenum target) { GLuint routine_index = glGetSubroutineIndex(id, target, routine.c_str()); GLuint uniform_index = glGetSubroutineUniformLocation(id, target, name.c_str()); glUniformSubroutinesuiv(target, 1, &routine_index); // debugging int num_subs; glGetActiveSubroutineUniformiv(id, target, uniform_index, GL_NUM_COMPATIBLE_SUBROUTINES, &num_subs); std::cout << uniform_index << " " << routine_index << " " << num_subs << "\n"; } I've checked for errors, and there are none. When I pass in vertical_coord as the routine to use, my scene is blurred vertically, as it should be. The routine_index variable is also 1 (which is weird, because vertical_coord subroutine is the first listed in the shader code...but no matter, maybe the compiler is switching things around) However, when I pass in horizontal_coord, my scene is STILL blurred vertically, even though the value of routine_index is 0, suggesting that a different subroutine is being used. Yet the horizontal_coord subroutine explicitly does not blur. What's more is, whichever subroutine comes first in the shader, is the subroutine that the shader uses permanently. Right now, vertical_coord comes first, so the shader blurs vertically always. If I put horizontal_coord first, the scene is unblurred, as expected, but then I cannot select the vertical_coord subroutine! :) Also, the value of num_subs is 2, suggesting that there are 2 subroutines compatible with my sample_coord subroutine uniform. Just to re-iterate, all of my return values are fine, and there are no glGetError() errors happening. Any ideas?

Read the article
Speeding up procedural texture generation

- by FalconNL

Recently I've begun working on a game that takes place in a procedurally generated solar system. After a bit of a learning curve (having neither worked with Scala, OpenGL 2 ES or Libgdx before), I have a basic tech demo going where you spin around a single procedurally textured planet: The problem I'm running into is the performance of the texture generation. A quick overview of what I'm doing: a planet is a cube that has been deformed to a sphere. To each side, a n x n (e.g. 256 x 256) texture is applied, which are bundled in one 8n x n texture that is sent to the fragment shader. The last two spaces are not used, they're only there to make sure the width is a power of 2. The texture is currently generated on the CPU, using the updated 2012 version of the simplex noise algorithm linked to in the paper 'Simplex noise demystified'. The scene I'm using to test the algorithm contains two spheres: the planet and the background. Both use a greyscale texture consisting of six octaves of 3D simplex noise, so for example if we choose 128x128 as the texture size there are 128 x 128 x 6 x 2 x 6 = about 1.2 million calls to the noise function. The closest you will get to the planet is about what's shown in the screenshot and since the game's target resolution is 1280x720 that means I'd prefer to use 512x512 textures. Combine that with the fact the actual textures will of course be more complicated than basic noise (There will be a day and night texture, blended in the fragment shader based on sunlight, and a specular mask. I need noise for continents, terrain color variation, clouds, city lights, etc.) and we're looking at something like 512 x 512 x 6 x 3 x 15 = 70 million noise calls for the planet alone. In the final game, there will be activities when traveling between planets, so a wait of 5 or 10 seconds, possibly 20, would be acceptable since I can calculate the texture in the background while traveling, though obviously the faster the better. Getting back to our test scene, performance on my PC isn't too terrible, though still too slow considering the final result is going to be about 60 times worse: 128x128 : 0.1s 256x256 : 0.4s 512x512 : 1.7s This is after I moved all performance-critical code to Java, since trying to do so in Scala was a lot worse. Running this on my phone (a Samsung Galaxy S3), however, produces a more problematic result: 128x128 : 2s 256x256 : 7s 512x512 : 29s Already far too long, and that's not even factoring in the fact that it'll be minutes instead of seconds in the final version. Clearly something needs to be done. Personally, I see a few potential avenues, though I'm not particularly keen on any of them yet: Don't precalculate the textures, but let the fragment shader calculate everything. Probably not feasible, because at one point I had the background as a fullscreen quad with a pixel shader and I got about 1 fps on my phone. Use the GPU to render the texture once, store it and use the stored texture from then on. Upside: might be faster than doing it on the CPU since the GPU is supposed to be faster at floating point calculations. Downside: effects that cannot (easily) be expressed as functions of simplex noise (e.g. gas planet vortices, moon craters, etc.) are a lot more difficult to code in GLSL than in Scala/Java. Calculate a large amount of noise textures and ship them with the application. I'd like to avoid this if at all possible. Lower the resolution. Buys me a 4x performance gain, which isn't really enough plus I lose a lot of quality. Find a faster noise algorithm. If anyone has one I'm all ears, but simplex is already supposed to be faster than perlin. Adopt a pixel art style, allowing for lower resolution textures and fewer noise octaves. While I originally envisioned the game in this style, I've come to prefer the realistic approach. I'm doing something wrong and the performance should already be one or two orders of magnitude better. If this is the case, please let me know. If anyone has any suggestions, tips, workarounds, or other comments regarding this problem I'd love to hear them.

Read the article
Problems implementing a screen space shadow ray tracing shader

- by Grieverheart

Here I previously asked for the possibility of ray tracing shadows in screen space in a deferred shader. Several problems were pointed out. One of the most important problem is that only visible objects can cast shadows and objects between the camera and the shadow caster can interfere. Still I thought it'd be a fun experiment. The idea is to calculate the view coordinates of pixels and cast a ray to the light. The ray is then traced pixel by pixel to the light and its depth is compared with the depth at the pixel. If a pixel is in front of the ray, a shadow is casted at the original pixel. At first I thought that I could use the DDA algorithm in 2D to calculate the distance 't' (in p = o + t d, where o origin, d direction) to the next pixel and use it in the 3D ray equation to find the ray's z coordinate at that pixel's position. For the 2D ray, I would use the projected and biased 3D ray direction and origin. The idea was that 't' would be the same in both 2D and 3D equations. Unfortunately, this is not the case since the projection matrix is 4D. Thus, some tweak needs to be done to make this work this way. I would like to ask if someone knows of a way to do what I described above, i.e. from a 2D ray in texture coordinate space to get the 3D ray in screen space. I did implement a simple version of the idea which you can see in the following video: video here Shadows may seem a bit pixelated, but that's mostly because of the size of the step in 't' I chose. And here is the shader: #version 330 core uniform sampler2D DepthMap; uniform vec2 projAB; uniform mat4 projectionMatrix; const vec3 light_p = vec3(-30.0, 30.0, -10.0); noperspective in vec2 pass_TexCoord; smooth in vec3 viewRay; layout(location = 0) out float out_AO; vec3 CalcPosition(void){ float depth = texture(DepthMap, pass_TexCoord).r; float linearDepth = projAB.y / (depth - projAB.x); vec3 ray = normalize(viewRay); ray = ray / ray.z; return linearDepth * ray; } void main(void){ vec3 origin = CalcPosition(); if(origin.z < -60) discard; vec2 pixOrigin = pass_TexCoord; //tex coords vec3 dir = normalize(light_p - origin); vec2 texel_size = vec2(1.0 / 600.0); float t = 0.1; ivec2 pixIndex = ivec2(pixOrigin / texel_size); out_AO = 1.0; while(true){ vec3 ray = origin + t * dir; vec4 temp = projectionMatrix * vec4(ray, 1.0); vec2 texCoord = (temp.xy / temp.w) * 0.5 + 0.5; ivec2 newIndex = ivec2(texCoord / texel_size); if(newIndex != pixIndex){ float depth = texture(DepthMap, texCoord).r; float linearDepth = projAB.y / (depth - projAB.x); if(linearDepth > ray.z + 0.1){ out_AO = 0.2; break; } pixIndex = newIndex; } t += 0.5; if(texCoord.x < 0 || texCoord.x > 1.0 || texCoord.y < 0 || texCoord.y > 1.0) break; } } As you can see, here I just increment 't' by some arbitrary factor, calculate the 3D ray and project it to get the pixel coordinates, which is not really optimal. Hopefully, I would like to optimize the code as much as possible and compare it with shadow mapping and how it scales with the number of lights. PS: Keep in mind that I reconstruct position from depth by interpolating rays through a full screen quad.

Read the article
SDL_image & OpenGL Problem

- by Dylan

i've been following tutorials online to load textures using SDL and display them on a opengl quad. but ive been getting weird results that no one else on the internet seems to be getting... so when i render the texture in opengl i get something like this. http://www.kiddiescissors.com/after.png when the original .bmp file is this: http://www.kiddiescissors.com/before.bmp ive tried other images too, so its not that this particular image is corrupt. it seems like my rgb channels are all jumbled or something. im pulling my hair out at this point. heres the relevant code from my init() function if ( SDL_Init(SDL_INIT_VIDEO) != 0 ) { return 1; } SDL_GL_SetAttribute(SDL_GL_DOUBLEBUFFER, 1); SDL_GL_SetAttribute( SDL_GL_RED_SIZE, 8 ); SDL_GL_SetAttribute( SDL_GL_GREEN_SIZE, 8 ); SDL_GL_SetAttribute( SDL_GL_BLUE_SIZE, 8 ); SDL_GL_SetAttribute( SDL_GL_ALPHA_SIZE, 8 ); SDL_GL_SetAttribute(SDL_GL_MULTISAMPLEBUFFERS, 1); SDL_GL_SetAttribute(SDL_GL_MULTISAMPLESAMPLES, 4); SDL_SetVideoMode(WINDOW_WIDTH, WINDOW_HEIGHT, 32, SDL_HWSURFACE | SDL_GL_DOUBLEBUFFER | SDL_OPENGL); glMatrixMode(GL_PROJECTION); glLoadIdentity(); gluPerspective(50, (GLfloat)WINDOW_WIDTH/WINDOW_HEIGHT, 1, 50); glMatrixMode(GL_MODELVIEW); glLoadIdentity(); glEnable(GL_DEPTH_TEST); glEnable(GL_MULTISAMPLE); glEnable(GL_TEXTURE_2D); glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE); glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA); glEnable(GL_BLEND); heres the code that is called when my main player object (the one with which this sprite is associated) is initialized texture = 0; SDL_Surface* surface = IMG_Load("i.bmp"); glGenTextures(1, &texture); glBindTexture(GL_TEXTURE_2D, texture); glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, surface->w, surface->h, 0, GL_RGB, GL_UNSIGNED_BYTE, surface->pixels); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR); SDL_FreeSurface(surface); and then heres the relevant code from my display function glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT); glLoadIdentity(); glColor4f(1, 1, 1, 1); glPushMatrix(); glBindTexture(GL_TEXTURE_2D, texture); glTranslatef(getCenter().x, getCenter().y, 0); glRotatef(getAngle()*(180/M_PI), 0, 0, 1); glTranslatef(-getCenter().x, -getCenter().y, 0); glBegin(GL_QUADS); glTexCoord2f(0, 0); glVertex3f(getTopLeft().x, getTopLeft().y, 0); glTexCoord2f(0, 1); glVertex3f(getTopLeft().x, getTopLeft().y + size.y, 0); glTexCoord2f(1, 1); glVertex3f(getTopLeft().x + size.x, getTopLeft().y + size.y, 0); glTexCoord2f(1, 0); glVertex3f(getTopLeft().x + size.x, getTopLeft().y, 0); glEnd(); glPopMatrix(); let me know if i left out anything important... or if you need more info from me. thanks a ton, -Dylan

Read the article
How the number of indexes built on a table can impact performances?

- by Davide Mauri

We all know that putting too many indexes (I’m talking of non-clustered index only, of course) on table may produce performance problems due to the overhead that each index bring to all insert/update/delete operations on that table. But how much? I mean, we all agree – I think – that, generally speaking, having many indexes on a table is “bad”. But how bad it can be? How much the performance will degrade? And on a concurrent system how much this situation can also hurts SELECT performances? If SQL Server take more time to update a row on a table due to the amount of indexes it also has to update, this also means that locks will be held for more time, slowing down the perceived performance of all queries involved. I was quite curious to measure this, also because when teaching it’s by far more impressive and effective to show to attended a chart with the measured impact, so that they can really “feel” what it means! To do the tests, I’ve create a script that creates a table (that has a clustered index on the primary key which is an identity column) , loads 1000 rows into the table (inserting 1000 row using only one insert, instead of issuing 1000 insert of one row, in order to minimize the overhead needed to handle the transaction, that would have otherwise ), and measures the time taken to do it. The process is then repeated 16 times, each time adding a new index on the table, using columns from table in a round-robin fashion. Test are done against different row sizes, so that it’s possible to check if performance changes depending on row size. The result are interesting, although expected. This is the chart showing how much time it takes to insert 1000 on a table that has from 0 to 16 non-clustered indexes. Each test has been run 20 times in order to have an average value. The value has been cleaned from outliers value due to unpredictable performance fluctuations due to machine activity. The test shows that in a table with a row size of 80 bytes, 1000 rows can be inserted in 9,05 msec if no indexes are present on the table, and the value grows up to 88 (!!!) msec when you have 16 indexes on it This means a impact on performance of 975%. That’s *huge*! Now, what happens if we have a bigger row size? Say that we have a table with a row size of 1520 byte. Here’s the data, from 0 to 16 indexes on that table: In this case we need near 22 msec to insert 1000 in a table with no indexes, but we need more that 500msec if the table has 16 active indexes! Now we’re talking of a 2410% impact on performance! Now we can have a tangible idea of what’s the impact of having (too?) many indexes on a table and also how the size of a row also impact performances. That’s why the golden rule of OLTP databases “few indexes, but good” is so true! (And in fact last week I saw a database with tables with 1700bytes row size and 23 (!!!) indexes on them!) This also means that a too heavy denormalization is really not a good idea (we’re always talking about OLTP systems, keep it in mind), since the performance get worse with the increase of the row size. So, be careful out there, and keep in mind the “equilibrium” is the key world of a database professional: equilibrium between read and write performance, between normalization and denormalization, between to few and too may indexes. PS Tests are done on a VMWare Workstation 7 VM with 2 CPU and 4 GB of Memory. Host machine is a Dell Precsioni M6500 with i7 Extreme X920 Quad-Core HT 2.0Ghz and 16Gb of RAM. Database is stored on a SSD Intel X-25E Drive, Simple Recovery Model, running on SQL Server 2008 R2. If you also want to to tests on your own, you can download the test script here: Open TestIndexPerformance.sql

Read the article
Virtualized data centre–Part three: Architecture

- by marc dekeyser

Having the basics (like discussed in the previous articles) is all good and well, but how do we get started on this?! It can be quite daunting after all! From my own point of view I can absolutely confirm your worries and concerns, but also tell you that it is not as hard as it seems! Deciding on what kind of motherboard to buy, processor and how much memory is an activity you will spend quite some time doing research on. And that is not even mentioning storage! All in all it comes down to setting you expectations and your budget. Probably adjusting your expectations according to your budget :). Processors As a rule of thumb you want VT-D (virtualization) technology built in to the processor allowing you to have 64 bit machines running on your host. Memory The more the better! If you are building a home lab don’t bother with ECC unless you are going to run machines that absolutely should be on all the time and your comfort depends on it! Motherboard Depends on what you are going to do with storage: If you are going the NAS way then the number of SATA port/RAID capabilities do not really matter. If you decide to have a single server with lots of dedicated storage it obviously matters how much SATA ports you will have, alternatively you could use a RAID controller (but these set you back a pretty penny if you want one. DELL 6i’s are usually available for a good bargain if you can find one!). Easiest is to get one with a built-in graphics card (on-board) as you are just adding more heat, power usage and possible points of failure. Networking Just like your choice of motherboard the networking side tends to depend on how you want to go. A single virtualization host with local storage can usually get away with having a single network card, a cluster or server which uses iSCSI storage tends to have more than one teamed up :). Storage The dreaded beast from the dark! The horror which lives in the forest! The most difficult decision you are going to make in the building of your lab. Why you might ask? Simple my friend, having the right choice of storage can make or break your virtualization solution. The performance of you storage choice will have an important impact on the responsiveness of your virtual machines and the deployment of new machines. It also makes a run with your budget! If you decide to go the NAS route you will be dropping a lot more money than if you would be having just a bunch of disks sitting in a server and manually distributing the virtual machines over the disks. Platform I’m a Microsoftee so Hyper-V is a dead giveaway for me. If you are interested in using VMware I won’t stop you but the rest of my posts will be oriented on Server 2012 Hyper-V (aka 3.0)! What did I use? Before someone asks me this in the comments I’ll give you a quick run down of what I am using. - Intel 2.4 quad core processors (i something something) - 24 GB DDR3 Memory - Single disk in each server (might look at this as I move the servers to 2012) - Synology DS1812+ NAS - 3 network interfaces where possible - HP1800 procurve managed switch I decided to spring for the NAS as I will also be using it for backups and media storage (which is working out quite nicely with my Xbox 360 I must say). At the time of building my 2 boxes (over a year and a half ago) these set me back about 900 euros each so I can image you can build the same or better for a lower price. Next article will be diagramming what I want to achieve and starting a build on the Hyper V 3.0 cluster!

Read the article
Projective texture and deferred lighting

- by Vodácek

In my previous question, I asked whether it is possible to do projective texturing with deferred lighting. Now (more than half a year later) I have a problem with my implementation of the same thing. I am trying to apply this technique in light pass. (my projector doesn't affect albedo). I have this projector View a Projection matrix: Matrix projection = Matrix.CreateOrthographicOffCenter(-halfWidth * Scale, halfWidth * Scale, -halfHeight * Scale, halfHeight * Scale, 1, 100000); Matrix view = Matrix.CreateLookAt(Position, Target, Vector3.Up); Where halfWidth and halfHeight is are half of the texture's width and height, Position is the Projector's position and target is the projector's target. This seems to be ok. I am drawing full screen quad with this shader: float4x4 InvViewProjection; texture2D DepthTexture; texture2D NormalTexture; texture2D ProjectorTexture; float4x4 ProjectorViewProjection; sampler2D depthSampler = sampler_state { texture = <DepthTexture>; minfilter = point; magfilter = point; mipfilter = point; }; sampler2D normalSampler = sampler_state { texture = <NormalTexture>; minfilter = point; magfilter = point; mipfilter = point; }; sampler2D projectorSampler = sampler_state { texture = <ProjectorTexture>; AddressU = Clamp; AddressV = Clamp; }; float viewportWidth; float viewportHeight; // Calculate the 2D screen position of a 3D position float2 postProjToScreen(float4 position) { float2 screenPos = position.xy / position.w; return 0.5f * (float2(screenPos.x, -screenPos.y) + 1); } // Calculate the size of one half of a pixel, to convert // between texels and pixels float2 halfPixel() { return 0.5f / float2(viewportWidth, viewportHeight); } struct VertexShaderInput { float4 Position : POSITION0; }; struct VertexShaderOutput { float4 Position :POSITION0; float4 PositionCopy : TEXCOORD1; }; VertexShaderOutput VertexShaderFunction(VertexShaderInput input) { VertexShaderOutput output; output.Position = input.Position; output.PositionCopy=output.Position; return output; } float4 PixelShaderFunction(VertexShaderOutput input) : COLOR0 { float2 texCoord =postProjToScreen(input.PositionCopy) + halfPixel(); // Extract the depth for this pixel from the depth map float4 depth = tex2D(depthSampler, texCoord); //return float4(depth.r,0,0,1); // Recreate the position with the UV coordinates and depth value float4 position; position.x = texCoord.x * 2 - 1; position.y = (1 - texCoord.y) * 2 - 1; position.z = depth.r; position.w = 1.0f; // Transform position from screen space to world space position = mul(position, InvViewProjection); position.xyz /= position.w; //compute projection float3 projection=tex2D(projectorSampler,postProjToScreen(mul(position,ProjectorViewProjection)) + halfPixel()); return float4(projection,1); } In first part of pixel shader is recovered position from G-buffer (this code I am using in other shaders without any problem) and then is tranformed to projector viewprojection space. Problem is that projection doesn't appear. Here is an image of my situation: The green lines are the rendered projector frustum. Where is my mistake hidden? I am using XNA 4. Thanks for advice and sorry for my English. EDIT: Shader above is working but projection was too small. When I changed the Scale property to a large value (e.g. 100), the projection appears. But when the camera moves toward the projection, the projection expands, as can bee seen on this YouTube video.

Read the article
Best triple head display setup

- by dgel

I'm currently running Ubuntu 12.04 with a darn good triple head display setup. I've got a VisionTek 900530 Radeon HD 5450 512MB DDR3 PCI Express video card that has two DVI outputs and one Mini DisplayPort that I have connected to a HDMI adapter. I'm running three identical Asus 1920x1080 monitors that each have a DVI, VGA, and HDMI input. I'm using the xorg-edgers ppa, so I'm using the open source radeon driver version 6.99.99. I tried using the ATI binary fglrx driver, but I wasn't able to get the three monitors working properly- the monitor connected via HDMI / DisplayPort wouldn't run at full resolution. The setup is almost perfect: Compiz runs fine and is quite snappy. I'm not able to use that great compiz feature where you can drag a window to the side of a display and it will half maximize. I occasionally experience display corruption weirdness with Unity and need to restart it. When I use a dropdown menu in LibreOffice it often pops the menu down in another window. For example, if I'm using the center monitor and click the Insert menu, the menu pulls down on the monitor to my right, forcing me to chase it. If I chase down the menu and choose Manual Break, the dialog appears over on my left monitor. This absurdity is mildly entertaining but has lost its novelty. I've decided to build a new system and have spared no expense- latest i7 processor, SSD, etc. I really like the performance of the Nvidia binary drivers, so I put two ZOTAC ZT-40707-10L GeForce GT 440 in the system, figuring I'd have four DVI outputs and an awesome triple (or even eventually quad) head setup. Unfortunately it appears that I didn't do sufficient research before my purchase. It seems that Nvidia TwinView only supports two monitors on one card (I guess that's why they call it TwinView...). I messed around with running two X servers, but I really don't want that- being able to drag windows to any monitor is critical. It doesn't sound like Xinerama is an option because from what I understand it simply doesn't support Compiz. I've seen a BaseMosaic option that can be used with the Nvidia drivers that appears to support an almost unlimited number of heads- unfortunately me cheap little cards don't support it. I'm also not sure whether you'll still have all nice maximizing and snapping that TwinView provides, or whether Ubuntu will only see it as one massive display. I put my old trusty ATI card into my new system and installed 12.10. I'm using the opensource radeon drivers again because even in 12.10 I can't get the fglrx binary drivers to do triple head. Unfortunately, even with an unbelievably powerful system the experience is extremely sluggish (much more so than my experience in 12.04). The menu scattering problem appears to be fixed, but I get a lot of nasty Unity display corruption. So finally, my question is this: What hardware / drivers should I use? I'm willing to buy (almost) any video card(s). I have two PCI-Express 3.0 slots on my motherboard (which has an integrated Intel HD card). I'm willing to use ATI or Nvidia cards and willing to run Ubuntu 12.04.1 or 12.10. I'm not a gamer, but do want beautiful and snappy Compiz effects. Does anyone out there have the perfect triple head setup in 12.04 or 12.10? What hardware / drivers are you using? I have those two Nvidia cards but will probably be returning them unless someone knows a way to use them together for a triple head setup. Since I'm having pretty good luck with a single ATI card providing three displays, should I just buy a beefier one with the hopes that it will fix the horrible sluggishness I'm experiencing in 12.10?

Read the article
Best approach to depth streaming via existing codec

- by Kevin

I'm working on a development system (and game) intended for games set mostly in static third-person views. We produce our scenery by CG and photographic techniques. Our background art is rendered off-line by a production-grade renderer. To allow the runtime imagery to properly interact with the background art, I wrote a program to convert from depth output by Mental Ray into a texture, and a pixel shader to draw a quad such that the Z data comes from the texture. This technique is working out very well, but now we've decided that some of the camera angle changes between scenes should be animated. The animation itself is straightforward to produce from our CG models. We intend to encode it to some HD video codec such as H.264. The problem is that in order to maintain our runtime imagery on the screen, the depth buffer will need to be loaded for each video frame. Due to the bandwidth, the video's depth data will need to be compressed efficiently. I've looked into methods for performing temporal compression of depth info and found an interesting research paper here: http://web4.cs.ucl.ac.uk/staff/j.kautz/publications/depth-streaming.pdf The method establishes a mapping between 16-bit depth values and YCbCr values. The mapping is tuned to the properties of existing video codecs in order to maximize precision of the decoded depths after the YCbCr has undergone video compression. It allows an existing, unmodified video codec to be used on the backend. I'm looking at how to pull this off with the least possible work. (This design change was unplanned.) Our game engine itself is native C++, presently for Win32 and DirectX, although we've worked hard to keep platform dependence segregated because we intend other ports. We don't have motion video facilities in the engine yet but will ultimately need that anyway for cinematics. I was planning on using some off-the-shelf motion video solution we can plug into our engine, and haven't chosen one yet. This new added requirement makes selecting one harder since, among other things, we'll now need to bypass colourspace conversion on one of the streams, and also will need to be playing two streams simultaneously in lockstep, on top of in some cases audio on one of them (for the cinematics). I'm also wondering if it's possible (or even useful) to do the conversion from YCbCr to depth in a pixel shader, or if it's better to just do it in CPU and separately load the resulting depth values into a locked tex. The conversion unfortunately does involve branching logic per-pixel. (There are more naive mappings that don't need branching, but they produce inferior results.) It could be reduced to a table lookup but the table would be 32MB. Programming is second-nature to me but I'm not that experienced with pix shaders and have zero knowledge of off-the-shelf video solutions. I'd therefore be interested in advice from others who may have dealt more with depth streaming, pixel shaders, and/or off-the-shelf codecs, regarding how feasible the proposed application is and what off-the-shelf video systems out there would best get along with this usage case.

Read the article
With a little effort you can “SEMI”-protect your C# assemblies with obfuscation.

- by mbcrump

This method will not protect your assemblies from a experienced hacker. Everyday we see new keygens, cracks, serials being released that contain ways around copy protection from small companies. This is a simple process that will make a lot of hackers quit because so many others use nothing. If you were a thief would you pick the house that has security signs and an alarm or one that has nothing? To so begin: Obfuscation is the concealment of meaning in communication, making it confusing and harder to interpret. Lets begin by looking at the cartoon below: You are probably familiar with the term and probably ignored this like most programmers ignore user security. Today, I’m going to show you reflection and a way to obfuscate it. Please understand that I am aware of ways around this, but I believe some security is better than no security. In this sample program below, the code appears exactly as it does in Visual Studio. When the program runs, you get either a true or false in a console window. Sample Program. using System; using System.Diagnostics; using System.Linq; namespace ObfuscateMe { class Program { static void Main(string[] args) { Console.WriteLine(IsProcessOpen("notepad")); //Returns a True or False depending if you have notepad running. Console.ReadLine(); } public static bool IsProcessOpen(string name) { return Process.GetProcesses().Any(clsProcess => clsProcess.ProcessName.Contains(name)); } } } Pretend, that this is a commercial application. The hacker will only have the executable and maybe a few config files, etc. After reviewing the executable, he can determine if it was produced in .NET by examing the file in ILDASM or Redgate’s Reflector. We are going to examine the file using RedGate’s Reflector. Upon launch, we simply drag/drop the exe over to the application. We have the following for the Main method: and for the IsProcessOpen method: Without any other knowledge as to how this works, the hacker could export the exe and get vs project build or copy this code in and our application would run. Using Reflector output. using System; using System.Diagnostics; using System.Linq; namespace ObfuscateMe { class Program { static void Main(string[] args) { Console.WriteLine(IsProcessOpen("notepad")); Console.ReadLine(); } public static bool IsProcessOpen(string name) { return Process.GetProcesses().Any<Process>(delegate(Process clsProcess) { return clsProcess.ProcessName.Contains(name); }); } } } The code is not identical, but returns the same value. At this point, with a little bit of effort you could prevent the hacker from reverse engineering your code so quickly by using Eazfuscator.NET. Eazfuscator.NET is just one of many programs built for this. Visual Studio ships with a community version of Dotfoscutor. So download and load Eazfuscator.NET and drag/drop your exectuable/project into the window. It will work for a few minutes depending if you have a quad-core or not. After it finishes, open the executable in RedGate Reflector and you will get the following: Main After Obfuscation IsProcessOpen Method after obfuscation: As you can see with the jumbled characters, it is not as easy as the first example. I am aware of methods around this, but it takes more effort and unless the hacker is up for the challenge, they will just pick another program. This is also helpful if you are a consultant and make clients pay a yearly license fee. This would prevent the average software developer from jumping into your security routine after you have left. I hope this article helped someone. If you have any feedback, please leave it in the comments below.

Read the article

Search Results

Search found 751 results on 31 pages for 'quad'.

Page 27/31 | < Previous Page | 23 24 25 26 27 28 29 30 31 | Next Page >

- by MyFaJoArCo

- by holtavolt

- by Tim Kane

- by pierusch

- by Snoopy

- by confusedCoder

- by Néstor Sánchez A.

- by Andreawu98

- by user315684

- by user2981648

- by Ralph

- by Dave Jarvis

- by Annalisa

- by thr

- by StuartBrierley

- by amoffat

- by FalconNL

- by Grieverheart

- by Dylan

- by Davide Mauri

- by marc dekeyser

- by Vodácek

- by dgel

- by Kevin

- by mbcrump

< Previous Page | 23 24 25 26 27 28 29 30 31 | Next Page >