Search Results

Search found 5070 results on 203 pages for 'algorithm'.

Page 202/203 | < Previous Page | 198 199 200 201 202 203  | Next Page >

  • Hard to append a table with many records into another without generating duplicates

    - by Bill Mudry
    I may seem to be a bit wordy at first but for the hope it will be easier for all of you to understand what I am doing in the first place. I have an uncommon but enjoyable activity of collecting as many species of wood from around the world as I can (over 2,900 so far). Ok, that is the real world. Meanwhile I have spent over 8 years compiling over 5.8 meg of text data on all the woods of the world. That got so large that learning some basic PHP and MySQL was most welcome so I could build a new database driven home for all this research. I am still slow at it but getting there. The original premise was to find evidence of as many species of woods in the world I can. The more names identified, the more successful the project. I have named the project TAXA for ease of conversation (short for Taxonomy). You are most welcome to take a look at what I have so far at www.prowebcanada.com/taxa. It is 95% dynamically driven. So far I am reporting about 6,500 botanical wood names and, as said above, the more I can report, the more successful is the project. I have a file of all the woods in the second largest wood collection in the world, the Tervuren wood collection in the Netherlands with over 11,300 wood names even after cleaning out all duplicates. That is almost twice the number I am reporting now so porting all the new wood names from Tervuren to the 'species' table where I keep the reported data would be a major desirable advancement in the project. At one point I was able to add all the Tervuren records to the species table but over 3,000 duplicates also formed. They were not in the Tervuren file in the first place but represent the same wood names common to both files. It is common sense that there would be woods common to both that when merged would create new duplicates. At one point and with the help of others from another forum, I may very well have finally got the proper SQL statement. When I ran it, though, the system said (semi-amusingly at first) ----- that it had gone away! After looking up on the Net what could have have done this, one reason is that the MySQL timeout lapses and probably because of the large size of files I am running. I am running this on a rented account on Godaddy so I cannot go about trying to adjust any config file. For safety, I copied the tervuren.sql file as tervuren_target.sql and the species.sql file as species_master.sql tp use as working files just to make sure I protect the original files from destruction or damage. Later I can name the species_master back to just species.sql once I am happy all worked well. The species file has about 18 columns in it but only 5 columns match the columns in the Tervuren file (name for name and collation also). The rest of the columns are just along for the ride, so to speak. The common key in both is the 'species_name" columns in both. I am not sure it is at all proper to call one a primary key and the other a foreign key since there really is no relational connection to them. One is just more data for the other and can disappear after, never to be referred to the working code in the application. I have been very surprised and flabbergasted on how hard it can be to append records from one large table into another (with same column names plus others) without generating NEW duplicates in the first place. Watch out thinking that a SELECT DISTINCT statement may do the job because absolutely NO records in the species table must get destroyed in the process and there is no way (well, that I know of) to tell the 'DISTINCT" command this. Yes, the original 'species' table has duplicates in it even before all this but, trust me ---- they have to be removed the long hard way manually record by record or I will lose precious information. It is more important to just make sure no NEW duplicates form through bringing in new names in the tervuren_target.species_name into species.species_name. I am hoping and thinking that a straight SQL solution should work --- except for that nasty timeout. How do I get past that? Could it mean that I may have to turn to a PHP plus SQL method?? Or ..... would I have to break up the Tervuren files into a few smaller ones and run them independently (hope not....)" So far, what seems should be easy has proven to be unexpectedly tricky. I appreciate any help you can give but start from the assumption that this may be harder to do right than it may seem on the surface. By the way --- I am running a quad 64 bit system with Windows 7, so at least I have some fairly hefty power on the client end. I have a direct ethernet cable feeding a cable connection to the Internet. Once I get an algorithm and code working for this, I also have many other lists to process that could make the 'species' table grow even more. It could be equivalent to (ahem) lighting a rocket under my project (especially compared to do this record by record manually)! This is my first time in this forum, so I do not know how I can receive any replies. Do I have to to come back here periodically or are replies emailed out also? It would be great if you CC'd copies to me at billmudry at rogers.com :-) Much thanks for your patience and help, Bill Mudry Mississauga, Ontario Canada (next to Toronto).

    Read the article

  • How to match ColdFusion encryption with Java 1.4.2?

    - by JohnTheBarber
    * sweet - thanks to Edward Smith for the CF Technote that indicated the key from ColdFusion was Base64 encoded. See generateKey() for the 'fix' My task is to use Java 1.4.2 to match the results a given ColdFusion code sample for encryption. Known/given values: A 24-byte key A 16-byte salt (IVorSalt) Encoding is Hex Encryption algorithm is AES/CBC/PKCS5Padding A sample clear-text value The encrypted value of the sample clear-text after going through the ColdFusion code Assumptions: Number of iterations not specified in the ColdFusion code so I assume only one iteration 24-byte key so I assume 192-bit encryption Given/working ColdFusion encryption code sample: <cfset ThisSalt = "16byte-salt-here"> <cfset ThisAlgorithm = "AES/CBC/PKCS5Padding"> <cfset ThisKey = "a-24byte-key-string-here"> <cfset thisAdjustedNow = now()> <cfset ThisDateTimeVar = DateFormat( thisAdjustedNow , "yyyymmdd" )> <cfset ThisDateTimeVar = ThisDateTimeVar & TimeFormat( thisAdjustedNow , "HHmmss" )> <cfset ThisTAID = ThisDateTimeVar & "|" & someOtherData> <cfset ThisTAIDEnc = Encrypt( ThisTAID , ThisKey , ThisAlgorithm , "Hex" , ThisSalt)> My Java 1.4.2 encryption/decryption code swag: package so.example; import java.security.*; import javax.crypto.Cipher; import javax.crypto.spec.IvParameterSpec; import javax.crypto.spec.SecretKeySpec; import org.apache.commons.codec.binary.*; public class SO_AES192 { private static final String _AES = "AES"; private static final String _AES_CBC_PKCS5Padding = "AES/CBC/PKCS5Padding"; private static final String KEY_VALUE = "a-24byte-key-string-here"; private static final String SALT_VALUE = "16byte-salt-here"; private static final int ITERATIONS = 1; private static IvParameterSpec ivParameterSpec; public static String encryptHex(String value) throws Exception { Key key = generateKey(); Cipher c = Cipher.getInstance(_AES_CBC_PKCS5Padding); ivParameterSpec = new IvParameterSpec(SALT_VALUE.getBytes()); c.init(Cipher.ENCRYPT_MODE, key, ivParameterSpec); String valueToEncrypt = null; String eValue = value; for (int i = 0; i < ITERATIONS; i++) { // valueToEncrypt = SALT_VALUE + eValue; // pre-pend salt - Length > sample length valueToEncrypt = eValue; // don't pre-pend salt Length = sample length byte[] encValue = c.doFinal(valueToEncrypt.getBytes()); eValue = Hex.encodeHexString(encValue); } return eValue; } public static String decryptHex(String value) throws Exception { Key key = generateKey(); Cipher c = Cipher.getInstance(_AES_CBC_PKCS5Padding); ivParameterSpec = new IvParameterSpec(SALT_VALUE.getBytes()); c.init(Cipher.DECRYPT_MODE, key, ivParameterSpec); String dValue = null; char[] valueToDecrypt = value.toCharArray(); for (int i = 0; i < ITERATIONS; i++) { byte[] decordedValue = Hex.decodeHex(valueToDecrypt); byte[] decValue = c.doFinal(decordedValue); // dValue = new String(decValue).substring(SALT_VALUE.length()); // when salt is pre-pended dValue = new String(decValue); // when salt is not pre-pended valueToDecrypt = dValue.toCharArray(); } return dValue; } private static Key generateKey() throws Exception { // Key key = new SecretKeySpec(KEY_VALUE.getBytes(), _AES); // this was wrong Key key = new SecretKeySpec(new BASE64Decoder().decodeBuffer(keyValueString), _AES); // had to un-Base64 the 'known' 24-byte key. return key; } } I cannot create a matching encrypted value nor decrypt a given encrypted value. My guess is it's something to do with how I'm handling the initial vector/salt. I'm not very crypto-savvy but I'm thinking I should be able to take the sample clear-text and produce the same encrypted value in Java as ColdFusion produced. I am able to encrypt/decrypt my own data with my Java code (so I'm consistent) but I cannot match nor decrypt the ColdFusion sample encrypted value. I have access to a local webservice that can test the encrypted output. The given ColdFusion output sample passes/decrypts fine (of course). If I try to decrypt the same sample with my Java code (using the actual key and salt) I get a "Given final block not properly padded" error. I get the same net result when I pass my attempt at encryption (using the actual key and salt) to the test webservice. Any Ideas?

    Read the article

  • Exception - Illegal Block size during decryption(Android)

    - by Vamsi
    I am writing an application which encrypts and decrypts the user notes based on the user set password. i used the following algorithms for encryption/decryption 1. PBEWithSHA256And256BitAES-CBC-BC 2. PBEWithMD5And128BitAES-CBC-OpenSSL e_Cipher = Cipher.getInstance(PBEWithSHA256And256BitAES-CBC-BC); d_Cipher = Cipher.getInstance(PBEWithSHA256And256BitAES-CBC-BC); e_Cipher.init() d_Cipher.init() encryption is working well, but when trying to decrypt it gives Exception - Illegal Block size after encryption i am converting the cipherText to HEX and storing it in a sqlite database. i am retrieving correct values from the sqlite database during decyption but when calling d_Cipher.dofinal() it throws the Exception. I thought i missed to specify the padding and tried to check what are the other available cipher algorithms but i was unable to found. so request you to please give the some knowledge on what are the cipher algorithms and padding that are supported by Android? if the algorithm which i used can be used for padding, how should i specify the padding mechanism? I am pretty new to Encryption so tried a couple of algorithms which are available in BouncyCastle.java but unsuccessful. As requested here is the code public class CryptoHelper { private static final String TAG = "CryptoHelper"; //private static final String PBEWithSHA256And256BitAES = "PBEWithSHA256And256BitAES-CBC-BC"; //private static final String PBEWithSHA256And256BitAES = "PBEWithMD5And128BitAES-CBC-OpenSSL"; private static final String PBEWithSHA256And256BitAES = "PBEWithMD5And128BitAES-CBC-OpenSSLPBEWITHSHA1AND3-KEYTRIPLEDES-CB"; private static final String randomAlgorithm = "SHA1PRNG"; public static final int SALT_LENGTH = 8; public static final int SALT_GEN_ITER_COUNT = 20; private final static String HEX = "0123456789ABCDEF"; private Cipher e_Cipher; private Cipher d_Cipher; private SecretKey secretKey; private byte salt[]; public CryptoHelper(String password) throws InvalidKeyException, NoSuchAlgorithmException, NoSuchPaddingException, InvalidAlgorithmParameterException, InvalidKeySpecException { char[] cPassword = password.toCharArray(); PBEKeySpec pbeKeySpec = new PBEKeySpec(cPassword); PBEParameterSpec pbeParamSpec = new PBEParameterSpec(salt, SALT_GEN_ITER_COUNT); SecretKeyFactory keyFac = SecretKeyFactory.getInstance(PBEWithSHA256And256BitAES); secretKey = keyFac.generateSecret(pbeKeySpec); SecureRandom saltGen = SecureRandom.getInstance(randomAlgorithm); this.salt = new byte[SALT_LENGTH]; saltGen.nextBytes(this.salt); e_Cipher = Cipher.getInstance(PBEWithSHA256And256BitAES); d_Cipher = Cipher.getInstance(PBEWithSHA256And256BitAES); e_Cipher.init(Cipher.ENCRYPT_MODE, secretKey, pbeParamSpec); d_Cipher.init(Cipher.DECRYPT_MODE, secretKey, pbeParamSpec); } public String encrypt(String cleartext) throws IllegalBlockSizeException, BadPaddingException { byte[] encrypted = e_Cipher.doFinal(cleartext.getBytes()); return convertByteArrayToHex(encrypted); } public String decrypt(String cipherString) throws IllegalBlockSizeException { byte[] plainText = decrypt(convertStringtobyte(cipherString)); return(new String(plainText)); } public byte[] decrypt(byte[] ciphertext) throws IllegalBlockSizeException { byte[] retVal = {(byte)0x00}; try { retVal = d_Cipher.doFinal(ciphertext); } catch (BadPaddingException e) { Log.e(TAG, e.toString()); } return retVal; } public String convertByteArrayToHex(byte[] buf) { if (buf == null) return ""; StringBuffer result = new StringBuffer(2*buf.length); for (int i = 0; i < buf.length; i++) { appendHex(result, buf[i]); } return result.toString(); } private static void appendHex(StringBuffer sb, byte b) { sb.append(HEX.charAt((b>>4)&0x0f)).append(HEX.charAt(b&0x0f)); } private static byte[] convertStringtobyte(String hexString) { int len = hexString.length()/2; byte[] result = new byte[len]; for (int i = 0; i < len; i++) { result[i] = Integer.valueOf(hexString.substring(2*i, 2*i+2), 16).byteValue(); } return result; } public byte[] getSalt() { return salt; } public SecretKey getSecretKey() { return secretKey; } public static SecretKey createSecretKey(char[] password) throws NoSuchAlgorithmException, InvalidKeySpecException { PBEKeySpec pbeKeySpec = new PBEKeySpec(password); SecretKeyFactory keyFac = SecretKeyFactory.getInstance(PBEWithSHA256And256BitAES); return keyFac.generateSecret(pbeKeySpec); } } I will call mCryptoHelper.decrypt(String str) then this results in Illegal block size exception My Env: Android 1.6 on Eclipse

    Read the article

  • How would you go about tackling this problem? [SOLVED in C++]

    - by incrediman
    Intro: EDIT: See solution at the bottom of this question (c++) I have a programming contest coming up in about half a week, and I've been prepping :) I found a bunch of questions from this canadian competition, they're great practice: http://cemc.math.uwaterloo.ca/contests/computing/2009/stage2/day1.pdf I'm looking at problem B ("Dinner"). Any idea where to start? I can't really think of anything besides the naive approach (ie. trying all permutations) which would take too long to be a valid answer. Btw, the language there says c++ and pascal I think, but i don't care what language you use - I mean really all I want is a hint as to the direction I should proceed in, and perhpas a short explanation to go along with it. It feels like I'm missing something obvious... Of course extended speculation is more than welcome, but I just wanted to clarify that I'm not looking for a full solution here :) Short version of the question: You have a binary string N of length 1-100 (in the question they use H's and G's instead of one's and 0's). You must remove all of the digits from it, in the least number of steps possible. In each step you may remove any number of adjacent digits so long as they are the same. That is, in each step you can remove any number of adjacent G's, or any number of adjacent H's, but you can't remove H's and G's in one step. Example: HHHGHHGHH Solution to the example: 1. HHGGHH (remove middle Hs) 2. HHHH (remove middle Gs) 3. Done (remove Hs) -->Would return '3' as the answer. Note that there can also be a limit placed on how large adjacent groups have to be when you remove them. For example it might say '2', and then you can't remove single digits (you'd have to remove pairs or larger groups at a time). Solution I took Mark Harrison's main algorithm, and Paradigm's grouping idea and used them to create the solution below. You can try it out on the official test cases if you want. //B.cpp //include debug messages? #define DEBUG false #include <iostream> #include <stdio.h> #include <vector> using namespace std; #define FOR(i,n) for (int i=0;i<n;i++) #define FROM(i,s,n) for (int i=s;i<n;i++) #define H 'H' #define G 'G' class String{ public: int num; char type; String(){ type=H; num=0; } String(char type){ this->type=type; num=1; } }; //n is the number of bits originally in the line //k is the minimum number of people you can remove at a time //moves is the counter used to determine how many moves we've made so far int n, k, moves; int main(){ /*Input from File*/ scanf("%d %d",&n,&k); char * buffer = new char[200]; scanf("%s",buffer); /*Process input into a vector*/ //the 'line' is a vector of 'String's (essentially contigious groups of identical 'bits') vector<String> line; line.push_back(String()); FOR(i,n){ //if the last String is of the correct type, simply increment its count if (line.back().type==buffer[i]) line.back().num++; //if the last String is of the wrong type but has a 0 count, correct its type and set its count to 1 else if (line.back().num==0){ line.back().type=buffer[i]; line.back().num=1; } //otherwise this is the beginning of a new group, so create the new group at the back with the correct type, and a count of 1 else{ line.push_back(String(buffer[i])); } } /*Geedily remove groups until there are at most two groups left*/ moves=0; int I;//the position of the best group to remove int bestNum;//the size of the newly connected group the removal of group I will create while (line.size()>2){ /*START DEBUG*/ if (DEBUG){ cout<<"\n"<<moves<<"\n----\n"; FOR(i,line.size()) printf("%d %c \n",line[i].num,line[i].type); cout<<"----\n"; } /*END DEBUG*/ I=1; bestNum=-1; FROM(i,1,line.size()-1){ if (line[i-1].num+line[i+1].num>bestNum && line[i].num>=k){ bestNum=line[i-1].num+line[i+1].num; I=i; } } //remove the chosen group, thus merging the two adjacent groups line[I-1].num+=line[I+1].num; line.erase(line.begin()+I);line.erase(line.begin()+I); moves++; } /*START DEBUG*/ if (DEBUG){ cout<<"\n"<<moves<<"\n----\n"; FOR(i,line.size()) printf("%d %c \n",line[i].num,line[i].type); cout<<"----\n"; cout<<"\n\nFinal Answer: "; } /*END DEBUG*/ /*Attempt the removal of the last two groups, and output the final result*/ if (line.size()==2 && line[0].num>=k && line[1].num>=k) cout<<moves+2;//success else if (line.size()==1 && line[0].num>=k) cout<<moves+1;//success else cout<<-1;//not everyone could dine. /*START DEBUG*/ if (DEBUG){ cout<<" moves."; } /*END DEBUG*/ }

    Read the article

  • More elegant way to make a C++ member function change different member variables based on template p

    - by Eric Moyer
    Today, I wrote some code that needed to add elements to different container variables depending on the type of a template parameter. I solved it by writing a friend helper class specialized on its own template parameter which had a member variable of the original class. It saved me a few hundred lines of repeating myself without adding much complexity. However, it seemed kludgey. I would like to know if there is a better, more elegant way. The code below is a greatly simplified example illustrating the problem and my solution. It compiles in g++. #include <vector> #include <algorithm> #include <iostream> namespace myNS{ template<class Elt> struct Container{ std::vector<Elt> contents; template<class Iter> void set(Iter begin, Iter end){ contents.erase(contents.begin(), contents.end()); std::copy(begin, end, back_inserter(contents)); } }; struct User; namespace WkNS{ template<class Elt> struct Worker{ User& u; Worker(User& u):u(u){} template<class Iter> void set(Iter begin, Iter end); }; }; struct F{ int x; explicit F(int x):x(x){} }; struct G{ double x; explicit G(double x):x(x){} }; struct User{ Container<F> a; Container<G> b; template<class Elt> void doIt(Elt x, Elt y){ std::vector<Elt> v; v.push_back(x); v.push_back(y); Worker<Elt>(*this).set(v.begin(), v.end()); } }; namespace WkNS{ template<class Elt> template<class Iter> void Worker<Elt>::set(Iter begin, Iter end){ std::cout << "Set a." << std::endl; u.a.set(begin, end); } template<> template<class Iter> void Worker<G>::set(Iter begin, Iter end){ std::cout << "Set b." << std::endl; u.b.set(begin, end); } }; }; int main(){ using myNS::F; using myNS::G; myNS::User u; u.doIt(F(1),F(2)); u.doIt(G(3),G(4)); } User is the class I was writing. Worker is my helper class. I have it in its own namespace because I don't want it causing trouble outside myNS. Container is a container class whose definition I don't want to modify, but is used by User in its instance variables. doIt<F> should modify a. doIt<G> should modify b. F and G are open to limited modification if that would produce a more elegant solution. (As an example of one such modification, in the real application F's constructor takes a dummy parameter to make it look like G's constructor and save me from repeating myself.) In the real code, Worker is a friend of User and member variables are private. To make the example simpler to write, I made everything public. However, a solution that requires things to be public really doesn't answer my question. Given all these caveats, is there a better way to write User::doIt?

    Read the article

  • How to make negate_unary work with any type?

    - by Chan
    Hi, Following this question: How to negate a predicate function using operator ! in C++? I want to create an operator ! can work with any functor that inherited from unary_function. I tried: template<typename T> inline std::unary_negate<T> operator !( const T& pred ) { return std::not1( pred ); } The compiler complained: Error 5 error C2955: 'std::unary_function' : use of class template requires template argument list c:\program files\microsoft visual studio 10.0\vc\include\xfunctional 223 1 Graphic Error 7 error C2451: conditional expression of type 'std::unary_negate<_Fn1>' is illegal c:\program files\microsoft visual studio 10.0\vc\include\ostream 529 1 Graphic Error 3 error C2146: syntax error : missing ',' before identifier 'argument_type' c:\program files\microsoft visual studio 10.0\vc\include\xfunctional 222 1 Graphic Error 4 error C2065: 'argument_type' : undeclared identifier c:\program files\microsoft visual studio 10.0\vc\include\xfunctional 222 1 Graphic Error 2 error C2039: 'argument_type' : is not a member of 'std::basic_ostream<_Elem,_Traits>::sentry' c:\program files\microsoft visual studio 10.0\vc\include\xfunctional 222 1 Graphic Error 6 error C2039: 'argument_type' : is not a member of 'std::basic_ostream<_Elem,_Traits>::sentry' c:\program files\microsoft visual studio 10.0\vc\include\xfunctional 230 1 Graphic Any idea? Update Follow "templatetypedef" solution, I got new error: Error 3 error C2831: 'operator !' cannot have default parameters c:\visual studio 2010 projects\graphic\graphic\main.cpp 39 1 Graphic Error 2 error C2808: unary 'operator !' has too many formal parameters c:\visual studio 2010 projects\graphic\graphic\main.cpp 39 1 Graphic Error 4 error C2675: unary '!' : 'is_prime' does not define this operator or a conversion to a type acceptable to the predefined operator c:\visual studio 2010 projects\graphic\graphic\main.cpp 52 1 Graphic Update 1 Complete code: #include <iostream> #include <functional> #include <utility> #include <cmath> #include <algorithm> #include <iterator> #include <string> #include <boost/assign.hpp> #include <boost/assign/std/vector.hpp> #include <boost/assign/std/map.hpp> #include <boost/assign/std/set.hpp> #include <boost/assign/std/list.hpp> #include <boost/assign/std/stack.hpp> #include <boost/assign/std/deque.hpp> struct is_prime : std::unary_function<int, bool> { bool operator()( int n ) const { if( n < 2 ) return 0; if( n == 2 || n == 3 ) return 1; if( n % 2 == 0 || n % 3 == 0 ) return 0; int upper_bound = std::sqrt( static_cast<double>( n ) ); for( int pf = 5, step = 2; pf <= upper_bound; ) { if( n % pf == 0 ) return 0; pf += step; step = 6 - step; } return 1; } }; /* template<typename T> inline std::unary_negate<T> operator !( const T& pred, typename T::argument_type* dummy = 0 ) { return std::not1<T>( pred ); } */ inline std::unary_negate<is_prime> operator !( const is_prime& pred ) { return std::not1( pred ); } template<typename T> inline void print_con( const T& con, const std::string& ms = "", const std::string& sep = ", " ) { std::cout << ms << '\n'; std::copy( con.begin(), con.end(), std::ostream_iterator<typename T::value_type>( std::cout, sep.c_str() ) ); std::cout << "\n\n"; } int main() { using namespace boost::assign; std::vector<int> nums; nums += 1, 3, 5, 7, 9; nums.erase( remove_if( nums.begin(), nums.end(), !is_prime() ), nums.end() ); print_con( nums, "After remove all primes" ); } Thanks, Chan Nguyen

    Read the article

  • Auto not being recognised by the compiler, what would be the best replacement?

    - by user1719605
    So I have wrote a program that uses auto however the compiler doesn't seem to recognize it, probably it is an earlier compiler. I was wondering for my code, with are suitable variables to fix my code so that I do not need to use the auto keyword? I'm thinking a pointer to a string? or a string iterator, though I am not sure. #include <cstdlib> #include <string> #include <iostream> #include <unistd.h> #include <algorithm> using namespace std; int main(int argc, char* argv[]) { enum MODE { WHOLE, PREFIX, SUFFIX, ANYWHERE, EMBEDDED } mode = WHOLE; bool reverse_match = false; int c; while ((c = getopt(argc, argv, ":wpsaev")) != -1) { switch (c) { case 'w': // pattern matches whole word mode = WHOLE; break; case 'p': // pattern matches prefix mode = PREFIX; break; case 'a': // pattern matches anywhere mode = ANYWHERE; break; case 's': // pattern matches suffix mode = SUFFIX; break; case 'e': // pattern matches anywhere mode = EMBEDDED; break; case 'v': // reverse sense of match reverse_match = true; break; } } argc -= optind; argv += optind; string pattern = argv[0]; string word; int matches = 0; while (cin >> word) { switch (mode) { case WHOLE: if (reverse_match) { if (pattern != word) { matches += 1; cout << word << endl; } } else if (pattern == word) { matches += 1; cout << word << endl; } break; case PREFIX: if (pattern.size() <= word.size()) { auto res = mismatch(pattern.begin(), pattern.end(), word.begin()); if (reverse_match) { if (res.first != word.end()) { matches += 1; cout << word << endl; } } else if (res.first == word.end()) { matches += 1; cout << word << endl; } } break; case ANYWHERE: if (reverse_match) { if (!word.find(pattern) != string::npos) { matches += 1; cout << word << endl; } } else if (word.find(pattern) != string::npos) { matches += 1; cout << word << endl; } break; case SUFFIX: if (pattern.size() <= word.size()) { auto res = mismatch(pattern.rbegin(), pattern.rend(), word.rbegin()); if (reverse_match) { if (res.first != word.rend()) { matches = +1; cout << word << endl; } } else if (res.first == word.rend()) { matches = +1; cout << word << endl; } } break; case EMBEDDED: if (reverse_match) { if (!pattern.find(word) != string::npos) { matches += 1; cout << word << endl;} } else if (pattern.find(word) != string::npos) { matches += 1; cout << word << endl; } break; } } return (matches == 0) ? 1 : 0; } Thanks in advance!

    Read the article

  • Problems configuring nameserver in plesk

    - by Saif Bechan
    Hello, i have some troubles with setting up a nameserver in PLESK for months now. I have tried all possible scenario's but i can not get this to work. I am really in need for some help, and if you can i will really appreciate it. Basically what i want is to just set up a nameserver in PLESK. I have a primary IP, and my host gave me a secondary nameserver i can use. My host is leaseweb in the netherlands. I have made some screenshots of the important parts in my opinion, maybe you guys can see some errors in them. To use the secondary nameserver provided by leaseweb i had to enable ACL on that account, i did so and made a screenshot of that too. The DNS recursion is set to localnets. These settings have not changed for months, so the dns should be fully updated everywhere. The check i run is the following: https://www.sidn.nl/over-nl/aanvraag...-server-check/ Domeinnaam (inclusief .nl): rdshosting.nl Eerste Nameserver: ns1.rdshosting.nl Eerste IP: 62.212.66.33 Tweede Nameserver: ns7.leaseweb.net Tweede ip: 62.212.76.50 If i run the dns check of the netherlands it gives me the following errors: primary name server "ns1.rdshosting.nl." Error: specified name server is not listed as NS record. All public name servers for a domain must also be listed as NS records in the zone of the domain. This domain was specified explicitly as a name server, but not found in the zone description of the primary name server. TE.6a rdshosting.nl. 86400 IN SOA ns1.rdspartners.nl. saif2k.hotmail.com. (2010031102 12H 1H 7D 3H) Error: the MNAME in SOA says "ns1.rdspartners.nl." is the primary name server. The MNAME field in the SOA record (first parameter) lists a different primary name server from the one specified for this check. RFC1035 section 3.3.13 rdshosting.nl. 86400 IN NS ns1.rdspartners.nl. Warning: hidden name server "ns1.rdspartners.nl." never used for first contact. The zone contains an NS record for a host which is not in the list of specified name servers. Hence, this name server will not be used to initiate contact to the domain. It may be used in sequential lookups, so it may still be useful. secondary name server "ns1.rdspartners.nl." [BROKEN] [HIDDEN] Failure: name server at 77.232.85.129 cannot be reached: (unknown error) The name server could not be contacted, which may be due to temporary technical problems or global DNS configuration mistakes. The internal error is shown, but not always clear about the cause. secondary name server "ns7.leaseweb.net." Info: name server looks correctly configured. I have the content of the file etc/named.conf also: // $Id: named.conf,v 1.1.1.1 2001/10/15 07:44:36 kap Exp $ // // Refer to the named(8) man page for details. If you are ever going // to setup a primary server, make sure you've understood the hairy // details of how DNS is working. Even with simple mistakes, you can // break connectivity for affected parties, or cause huge amount of // useless Internet traffic. options { allow-recursion { localnets; }; directory "/var"; auth-nxdomain no; pid-file "/var/run/named/named.pid"; // In addition to the "forwarders" clause, you can force your name // server to never initiate queries of its own, but always ask its // forwarders only, by enabling the following line: // // forward only; // If you've got a DNS server around at your upstream provider, enter // its IP address here, and enable the line below. This will make you // benefit from its cache, thus reduce overall DNS traffic in the Internet. /* forwarders { 127.0.0.1; }; */ /* * If there is a firewall between you and nameservers you want * to talk to, you might need to uncomment the query-source * directive below. Previous versions of BIND always asked * questions using port 53, but BIND 8.1 uses an unprivileged * port by default. */ // query-source address * port 53; /* * If running in a sandbox, you may have to specify a different * location for the dumpfile. */ // dump-file "s/named_dump.db"; }; //Use with the following in named.conf, adjusting the allow list as needed: key "rndc-key" { algorithm hmac-md5; secret "CeMgS23y0oWE20nyv0x40Q=="; }; controls { inet 127.0.0.1 port 953 allow { 127.0.0.1; } keys { "rndc-key"; }; }; // Note: the following will be supported in a future release. /* host { any; } { topology { 127.0.0.0/8; }; }; */ // Setting up secondaries is way easier and the rough picture for this // is explained below. // // If you enable a local name server, don't forget to enter 127.0.0.1 // into your /etc/resolv.conf so this server will be queried first. // Also, make sure to enable it in /etc/rc.conf. zone "." { type hint; file "named.root"; }; zone "0.0.127.IN-ADDR.ARPA" { type master; file "localhost.rev"; }; // NB: Do not use the IP addresses below, they are faked, and only // serve demonstration/documentation purposes! // // Example secondary config entries. It can be convenient to become // a secondary at least for the zone where your own domain is in. Ask // your network administrator for the IP address of the responsible // primary. // // Never forget to include the reverse lookup (IN-ADDR.ARPA) zone! // (This is the first bytes of the respective IP address, in reverse // order, with ".IN-ADDR.ARPA" appended.) // // Before starting to setup a primary zone, better make sure you fully // understand how DNS and BIND works, however. There are sometimes // unobvious pitfalls. Setting up a secondary is comparably simpler. // // NB: Don't blindly enable the examples below. :-) Use actual names // and addresses instead. // // NOTE!!! FreeBSD runs bind in a sandbox (see named_flags in rc.conf). // The directory containing the secondary zones must be write accessible // to bind. The following sequence is suggested: // // mkdir /etc/namedb/s // chown bind.bind /etc/namedb/s // chmod 750 /etc/namedb/s zone "rdshosting.nl" { type master; file "rdshosting.nl"; allow-transfer { 77.232.85.129; 62.212.76.50; common-allow-transfer; }; }; zone "66.212.62.in-addr.arpa" { type master; file "66.212.62.in-addr.arpa"; allow-transfer { common-allow-transfer; }; }; acl common-allow-transfer { 62.212.76.50; }; As i mentioned i made some screenshots of some parts: First the dns settings in plesk: http://www.freeimagehosting.net/uploads/2480faed5e.jpg Second the acl settings in plesk: http://www.freeimagehosting.net/uploads/777f5e69b0.jpg Third my settings at leaseweb: http://www.freeimagehosting.net/uploads/de7122b19c.jpg And last the secondary nameserver settings from leaseweb: http://www.freeimagehosting.net/uploads/fd1da38a8f.jpg If someone has anysuggestion at all on this this will be highly appriciated. Thank you for your time! PS. I am dutch so dutch answers are welcome aswell

    Read the article

  • Varnish default.vcl grace period

    - by Vladimir
    These are my settings for a grace period (/etc/varnish/default.vcl) sub vcl_recv { .... set req.grace = 360000s; ... } sub vcl_fetch { ... set beresp.grace = 360000s; ... } I tested Varnish using localhost and nodejs as a server. I started localhost, the site was up. Then I disconnected server and the site got disconnected in less than 2 min. It says: Error 503 Service Unavailable Service Unavailable Guru Meditation: XID: 1890127100 Varnish cache server Could you tell me what could be the problem? sub vcl_fetch { if (beresp.ttl < 120s) { ##std.log("Adjusting TTL"); set beresp.ttl = 36000s; ##120s; } # Do not cache the object if the backend application does not want us to. if (beresp.http.Cache-Control ~ "(no-cache|no-store|private|must-revalidate)") { return(hit_for_pass); } # Do not cache the object if the status is not in the 200s if (beresp.status >= 300) { # Remove the Set-Cookie header #remove beresp.http.Set-Cookie; return(hit_for_pass); } # # Everything below here should be cached # # Remove the Set-Cookie header ####remove beresp.http.Set-Cookie; # Set the grace time ## set beresp.grace = 1s; //change this to minutes in case of app shutdown set beresp.grace = 360000s; ## 10 hour - reduce if it has negative impact # Static assets - browser caches tpiphem for a long time. if (req.url ~ "\.(css|js|.js|jpg|jpeg|gif|ico|png)\??\d*$") { /* Remove Expires from backend, it's not long enough */ unset beresp.http.expires; /* Set the clients TTL on this object */ set beresp.http.cache-control = "public, max-age=31536000"; /* marker for vcl_deliver to reset Age: */ set beresp.http.magicmarker = "1"; } else { set beresp.http.Cache-Control = "private, max-age=0, must-revalidate"; set beresp.http.Pragma = "no-cache"; } if (req.url ~ "\.(css|js|min|)\??\d*$") { set beresp.do_gzip = true; unset beresp.http.expires; set beresp.http.cache-control = "public, max-age=31536000"; set beresp.http.expires = beresp.ttl; set beresp.http.age = "0"; } ##do not duplicate these settings if (req.url ~ ".css") { set beresp.do_gzip = true; unset beresp.http.expires; set beresp.http.cache-control = "public, max-age=31536000"; set beresp.http.expires = beresp.ttl; set beresp.http.age = "0"; } if (req.url ~ ".js") { set beresp.do_gzip = true; unset beresp.http.expires; set beresp.http.cache-control = "public, max-age=31536000"; set beresp.http.expires = beresp.ttl; set beresp.http.age = "0"; } if (req.url ~ ".min") { set beresp.do_gzip = true; unset beresp.http.expires; set beresp.http.cache-control = "public, max-age=31536000"; set beresp.http.expires = beresp.ttl; set beresp.http.age = "0"; } ## If the request to the backend returns a code other than 200, restart the loop ## If the number of restarts reaches the value of the parameter max_restarts, ## the request will be error'ed. max_restarts defaults to 4. This prevents ## an eternal loop in the event that, e.g., the object does not exist at all. if (beresp.status != 200 && beresp.status != 403 && beresp.status != 404) { return(restart); } if (beresp.status == 302) { return(deliver); } # Never cache posts if (req.url ~ "\/post\/" || req.url ~ "\/submit\/" || req.url ~ "\/ask\/" || req.url ~ "\/add\/") { return(hit_for_pass); } ##check this setting to ensure that it does not cause issues for browsers with no gzip if (beresp.http.content-type ~ "text") { set beresp.do_gzip = true; } if (beresp.http.Set-Cookie) { return(deliver); } ##if (req.url == "/index.html") { set beresp.do_esi = true; ##} ## check if this is needed or should be used # return(deliver); the object return(deliver); } sub vcl_recv { ##avoid leeching of images call hot_link; set req.grace = 360000s; ##2m ## if one backend is down - use another if (req.restarts == 0) { set req.backend = cache_director; ##we can specify individual VMs } else if (req.restarts == 1) { set req.backend = cache_director; } ## post calls should not be cached - add cookie for these requests if using micro-caching # Pass requests that are not GET or HEAD if (req.request != "GET" && req.request != "HEAD") { return(pass); ## return(pass) goes to backend - not cache } # Don't cache the result of a redirect if (req.http.Referer ~ "redir" || req.http.Origin ~ "jumpto") { return(pass); } # Don't cache the result of a redirect (asking for logon) if (req.http.Referer ~ "post" || req.http.Referer ~ "submit" || req.http.Referer ~ "add" || req.http.Referer ~ "ask") { return(pass); } # Never cache posts - ensure that we do not use these strings in our URLs' that need to be cached if (req.url ~ "\/post\/" || req.url ~ "\/submit\/" || req.url ~ "\/ask\/" || req.url ~ "\/add\/") { return(pass); } ## if (req.http.Authorization || req.http.Cookie) { if (req.http.Authorization) { /* Not cacheable by default */ return (pass); } # Handle compression correctly. Different browsers send different # "Accept-Encoding" headers, even though they mostly all support the same # compression mechanisms. By consolidating these compression headers into # a consistent format, we can reduce the size of the cache and get more hits. # @see: http:// varnish.projects.linpro.no/wiki/FAQ/Compression if (req.http.Accept-Encoding) { if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg|ico)$") { # No point in compressing these remove req.http.Accept-Encoding; } else if (req.http.Accept-Encoding ~ "gzip") { # If the browser supports it, we'll use gzip. set req.http.Accept-Encoding = "gzip"; } else if (req.http.Accept-Encoding ~ "deflate") { # Next, try deflate if it is supported. set req.http.Accept-Encoding = "deflate"; } else { # Unknown algorithm. Remove it and send unencoded. unset req.http.Accept-Encoding; } } # lookup graphics, css, js & ico files in the cache if (req.url ~ "\.(png|gif|jpg|jpeg|css|.js|ico)$") { return(lookup); } ##added on 0918 - check if it causes issues with user specific content if (req.request == "GET" && req.http.cookie) { return(lookup); } # Pipe requests that are non-RFC2616 or CONNECT which is weird. if (req.request != "GET" && req.request != "HEAD" && req.request != "PUT" && req.request != "POST" && req.request != "TRACE" && req.request != "OPTIONS" && req.request != "DELETE") { ##closing connection and calling pipe return(pipe); } ##purge content via localhost only if (req.request == "PURGE") { if (!client.ip ~ purge) { error 405 "Not allowed."; } return(lookup); } ## do we need this? ## return(lookup); }

    Read the article

  • mdadm: Win7-install created a boot partition on one of my RAID6 drives. How to rebuild?

    - by EXIT_FAILURE
    My problem happened when I attempted to install Windows 7 on it's own SSD. The Linux OS I used which has knowledge of the software RAID system is on a SSD that I disconnected prior to the install. This was so that windows (or I) wouldn't inadvertently mess it up. However, and in retrospect, foolishly, I left the RAID disks connected, thinking that windows wouldn't be so ridiculous as to mess with a HDD that it sees as just unallocated space. Boy was I wrong! After copying over the installation files to the SSD (as expected and desired), it also created an ntfs partition on one of the RAID disks. Both unexpected and totally undesired! . I changed out the SSDs again, and booted up in linux. mdadm didn't seem to have any problem assembling the array as before, but if I tried to mount the array, I got the error message: mount: wrong fs type, bad option, bad superblock on /dev/md0, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so dmesg: EXT4-fs (md0): ext4_check_descriptors: Block bitmap for group 0 not in group (block 1318081259)! EXT4-fs (md0): group descriptors corrupted! I then used qparted to delete the newly created ntfs partition on /dev/sdd so that it matched the other three /dev/sd{b,c,e}, and requested a resync of my array with echo repair > /sys/block/md0/md/sync_action This took around 4 hours, and upon completion, dmesg reports: md: md0: requested-resync done. A bit brief after a 4-hour task, though I'm unsure as to where other log files exist (I also seem to have messed up my sendmail configuration). In any case: No change reported according to mdadm, everything checks out. mdadm -D /dev/md0 still reports: Version : 1.2 Creation Time : Wed May 23 22:18:45 2012 Raid Level : raid6 Array Size : 3907026848 (3726.03 GiB 4000.80 GB) Used Dev Size : 1953513424 (1863.02 GiB 2000.40 GB) Raid Devices : 4 Total Devices : 4 Persistence : Superblock is persistent Update Time : Mon May 26 12:41:58 2014 State : clean Active Devices : 4 Working Devices : 4 Failed Devices : 0 Spare Devices : 0 Layout : left-symmetric Chunk Size : 4K Name : okamilinkun:0 UUID : 0c97ebf3:098864d8:126f44e3:e4337102 Events : 423 Number Major Minor RaidDevice State 0 8 16 0 active sync /dev/sdb 1 8 32 1 active sync /dev/sdc 2 8 48 2 active sync /dev/sdd 3 8 64 3 active sync /dev/sde Trying to mount it still reports: mount: wrong fs type, bad option, bad superblock on /dev/md0, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so and dmesg: EXT4-fs (md0): ext4_check_descriptors: Block bitmap for group 0 not in group (block 1318081259)! EXT4-fs (md0): group descriptors corrupted! I'm a bit unsure where to proceed from here, and trying stuff "to see if it works" is a bit too risky for me. This is what I suggest I should attempt to do: Tell mdadm that /dev/sdd (the one that windows wrote into) isn't reliable anymore, pretend it is newly re-introduced to the array, and reconstruct its content based on the other three drives. I also could be totally wrong in my assumptions, that the creation of the ntfs partition on /dev/sdd and subsequent deletion has changed something that cannot be fixed this way. My question: Help, what should I do? If I should do what I suggested , how do I do that? From reading documentation, etc, I would think maybe: mdadm --manage /dev/md0 --set-faulty /dev/sdd mdadm --manage /dev/md0 --remove /dev/sdd mdadm --manage /dev/md0 --re-add /dev/sdd However, the documentation examples suggest /dev/sdd1, which seems strange to me, as there is no partition there as far as linux is concerned, just unallocated space. Maybe these commands won't work without. Maybe it makes sense to mirror the partition table of one of the other raid devices that weren't touched, before --re-add. Something like: sfdisk -d /dev/sdb | sfdisk /dev/sdd Bonus question: Why would the Windows 7 installation do something so st...potentially dangerous? Update I went ahead and marked /dev/sdd as faulty, and removed it (not physically) from the array: # mdadm --manage /dev/md0 --set-faulty /dev/sdd # mdadm --manage /dev/md0 --remove /dev/sdd However, attempting to --re-add was disallowed: # mdadm --manage /dev/md0 --re-add /dev/sdd mdadm: --re-add for /dev/sdd to /dev/md0 is not possible --add, was fine. # mdadm --manage /dev/md0 --add /dev/sdd mdadm -D /dev/md0 now reports the state as clean, degraded, recovering, and /dev/sdd as spare rebuilding. /proc/mdstat shows the recovery progress: md0 : active raid6 sdd[4] sdc[1] sde[3] sdb[0] 3907026848 blocks super 1.2 level 6, 4k chunk, algorithm 2 [4/3] [UU_U] [>....................] recovery = 2.1% (42887780/1953513424) finish=348.7min speed=91297K/sec nmon also shows expected output: ¦sdb 0% 87.3 0.0| > |¦ ¦sdc 71% 109.1 0.0|RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR > |¦ ¦sdd 40% 0.0 87.3|WWWWWWWWWWWWWWWWWWWW > |¦ ¦sde 0% 87.3 0.0|> || It looks good so far. Crossing my fingers for another five+ hours :) Update 2 The recovery of /dev/sdd finished, with dmesg output: [44972.599552] md: md0: recovery done. [44972.682811] RAID conf printout: [44972.682815] --- level:6 rd:4 wd:4 [44972.682817] disk 0, o:1, dev:sdb [44972.682819] disk 1, o:1, dev:sdc [44972.682820] disk 2, o:1, dev:sdd [44972.682821] disk 3, o:1, dev:sde Attempting mount /dev/md0 reports: mount: wrong fs type, bad option, bad superblock on /dev/md0, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so And on dmesg: [44984.159908] EXT4-fs (md0): ext4_check_descriptors: Block bitmap for group 0 not in group (block 1318081259)! [44984.159912] EXT4-fs (md0): group descriptors corrupted! I'm not sure what do do now. Suggestions? Output of dumpe2fs /dev/md0: dumpe2fs 1.42.8 (20-Jun-2013) Filesystem volume name: Atlas Last mounted on: /mnt/atlas Filesystem UUID: e7bfb6a4-c907-4aa0-9b55-9528817bfd70 Filesystem magic number: 0xEF53 Filesystem revision #: 1 (dynamic) Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize Filesystem flags: signed_directory_hash Default mount options: user_xattr acl Filesystem state: clean Errors behavior: Continue Filesystem OS type: Linux Inode count: 244195328 Block count: 976756712 Reserved block count: 48837835 Free blocks: 92000180 Free inodes: 243414877 First block: 0 Block size: 4096 Fragment size: 4096 Reserved GDT blocks: 791 Blocks per group: 32768 Fragments per group: 32768 Inodes per group: 8192 Inode blocks per group: 512 RAID stripe width: 2 Flex block group size: 16 Filesystem created: Thu May 24 07:22:41 2012 Last mount time: Sun May 25 23:44:38 2014 Last write time: Sun May 25 23:46:42 2014 Mount count: 341 Maximum mount count: -1 Last checked: Thu May 24 07:22:41 2012 Check interval: 0 (<none>) Lifetime writes: 4357 GB Reserved blocks uid: 0 (user root) Reserved blocks gid: 0 (group root) First inode: 11 Inode size: 256 Required extra isize: 28 Desired extra isize: 28 Journal inode: 8 Default directory hash: half_md4 Directory Hash Seed: e177a374-0b90-4eaa-b78f-d734aae13051 Journal backup: inode blocks dumpe2fs: Corrupt extent header while reading journal super block

    Read the article

  • Toorcon 15 (2013)

    - by danx
    The Toorcon gang (senior staff): h1kari (founder), nfiltr8, and Geo Introduction to Toorcon 15 (2013) A Tale of One Software Bypass of MS Windows 8 Secure Boot Breaching SSL, One Byte at a Time Running at 99%: Surviving an Application DoS Security Response in the Age of Mass Customized Attacks x86 Rewriting: Defeating RoP and other Shinanighans Clowntown Express: interesting bugs and running a bug bounty program Active Fingerprinting of Encrypted VPNs Making Attacks Go Backwards Mask Your Checksums—The Gorry Details Adventures with weird machines thirty years after "Reflections on Trusting Trust" Introduction to Toorcon 15 (2013) Toorcon 15 is the 15th annual security conference held in San Diego. I've attended about a third of them and blogged about previous conferences I attended here starting in 2003. As always, I've only summarized the talks I attended and interested me enough to write about them. Be aware that I may have misrepresented the speaker's remarks and that they are not my remarks or opinion, or those of my employer, so don't quote me or them. Those seeking further details may contact the speakers directly or use The Google. For some talks, I have a URL for further information. A Tale of One Software Bypass of MS Windows 8 Secure Boot Andrew Furtak and Oleksandr Bazhaniuk Yuri Bulygin, Oleksandr ("Alex") Bazhaniuk, and (not present) Andrew Furtak Yuri and Alex talked about UEFI and Bootkits and bypassing MS Windows 8 Secure Boot, with vendor recommendations. They previously gave this talk at the BlackHat 2013 conference. MS Windows 8 Secure Boot Overview UEFI (Unified Extensible Firmware Interface) is interface between hardware and OS. UEFI is processor and architecture independent. Malware can replace bootloader (bootx64.efi, bootmgfw.efi). Once replaced can modify kernel. Trivial to replace bootloader. Today many legacy bootkits—UEFI replaces them most of them. MS Windows 8 Secure Boot verifies everything you load, either through signatures or hashes. UEFI firmware relies on secure update (with signed update). You would think Secure Boot would rely on ROM (such as used for phones0, but you can't do that for PCs—PCs use writable memory with signatures DXE core verifies the UEFI boat loader(s) OS Loader (winload.efi, winresume.efi) verifies the OS kernel A chain of trust is established with a root key (Platform Key, PK), which is a cert belonging to the platform vendor. Key Exchange Keys (KEKs) verify an "authorized" database (db), and "forbidden" database (dbx). X.509 certs with SHA-1/SHA-256 hashes. Keys are stored in non-volatile (NV) flash-based NVRAM. Boot Services (BS) allow adding/deleting keys (can't be accessed once OS starts—which uses Run-Time (RT)). Root cert uses RSA-2048 public keys and PKCS#7 format signatures. SecureBoot — enable disable image signature checks SetupMode — update keys, self-signed keys, and secure boot variables CustomMode — allows updating keys Secure Boot policy settings are: always execute, never execute, allow execute on security violation, defer execute on security violation, deny execute on security violation, query user on security violation Attacking MS Windows 8 Secure Boot Secure Boot does NOT protect from physical access. Can disable from console. Each BIOS vendor implements Secure Boot differently. There are several platform and BIOS vendors. It becomes a "zoo" of implementations—which can be taken advantage of. Secure Boot is secure only when all vendors implement it correctly. Allow only UEFI firmware signed updates protect UEFI firmware from direct modification in flash memory protect FW update components program SPI controller securely protect secure boot policy settings in nvram protect runtime api disable compatibility support module which allows unsigned legacy Can corrupt the Platform Key (PK) EFI root certificate variable in SPI flash. If PK is not found, FW enters setup mode wich secure boot turned off. Can also exploit TPM in a similar manner. One is not supposed to be able to directly modify the PK in SPI flash from the OS though. But they found a bug that they can exploit from User Mode (undisclosed) and demoed the exploit. It loaded and ran their own bootkit. The exploit requires a reboot. Multiple vendors are vulnerable. They will disclose this exploit to vendors in the future. Recommendations: allow only signed updates protect UEFI fw in ROM protect EFI variable store in ROM Breaching SSL, One Byte at a Time Yoel Gluck and Angelo Prado Angelo Prado and Yoel Gluck, Salesforce.com CRIME is software that performs a "compression oracle attack." This is possible because the SSL protocol doesn't hide length, and because SSL compresses the header. CRIME requests with every possible character and measures the ciphertext length. Look for the plaintext which compresses the most and looks for the cookie one byte-at-a-time. SSL Compression uses LZ77 to reduce redundancy. Huffman coding replaces common byte sequences with shorter codes. US CERT thinks the SSL compression problem is fixed, but it isn't. They convinced CERT that it wasn't fixed and they issued a CVE. BREACH, breachattrack.com BREACH exploits the SSL response body (Accept-Encoding response, Content-Encoding). It takes advantage of the fact that the response is not compressed. BREACH uses gzip and needs fairly "stable" pages that are static for ~30 seconds. It needs attacker-supplied content (say from a web form or added to a URL parameter). BREACH listens to a session's requests and responses, then inserts extra requests and responses. Eventually, BREACH guesses a session's secret key. Can use compression to guess contents one byte at-a-time. For example, "Supersecret SupersecreX" (a wrong guess) compresses 10 bytes, and "Supersecret Supersecret" (a correct guess) compresses 11 bytes, so it can find each character by guessing every character. To start the guess, BREACH needs at least three known initial characters in the response sequence. Compression length then "leaks" information. Some roadblocks include no winners (all guesses wrong) or too many winners (multiple possibilities that compress the same). The solutions include: lookahead (guess 2 or 3 characters at-a-time instead of 1 character). Expensive rollback to last known conflict check compression ratio can brute-force first 3 "bootstrap" characters, if needed (expensive) block ciphers hide exact plain text length. Solution is to align response in advance to block size Mitigations length: use variable padding secrets: dynamic CSRF tokens per request secret: change over time separate secret to input-less servlets Future work eiter understand DEFLATE/GZIP HTTPS extensions Running at 99%: Surviving an Application DoS Ryan Huber Ryan Huber, Risk I/O Ryan first discussed various ways to do a denial of service (DoS) attack against web services. One usual method is to find a slow web page and do several wgets. Or download large files. Apache is not well suited at handling a large number of connections, but one can put something in front of it Can use Apache alternatives, such as nginx How to identify malicious hosts short, sudden web requests user-agent is obvious (curl, python) same url requested repeatedly no web page referer (not normal) hidden links. hide a link and see if a bot gets it restricted access if not your geo IP (unless the website is global) missing common headers in request regular timing first seen IP at beginning of attack count requests per hosts (usually a very large number) Use of captcha can mitigate attacks, but you'll lose a lot of genuine users. Bouncer, goo.gl/c2vyEc and www.github.com/rawdigits/Bouncer Bouncer is software written by Ryan in netflow. Bouncer has a small, unobtrusive footprint and detects DoS attempts. It closes blacklisted sockets immediately (not nice about it, no proper close connection). Aggregator collects requests and controls your web proxies. Need NTP on the front end web servers for clean data for use by bouncer. Bouncer is also useful for a popularity storm ("Slashdotting") and scraper storms. Future features: gzip collection data, documentation, consumer library, multitask, logging destroyed connections. Takeaways: DoS mitigation is easier with a complete picture Bouncer designed to make it easier to detect and defend DoS—not a complete cure Security Response in the Age of Mass Customized Attacks Peleus Uhley and Karthik Raman Peleus Uhley and Karthik Raman, Adobe ASSET, blogs.adobe.com/asset/ Peleus and Karthik talked about response to mass-customized exploits. Attackers behave much like a business. "Mass customization" refers to concept discussed in the book Future Perfect by Stan Davis of Harvard Business School. Mass customization is differentiating a product for an individual customer, but at a mass production price. For example, the same individual with a debit card receives basically the same customized ATM experience around the world. Or designing your own PC from commodity parts. Exploit kits are another example of mass customization. The kits support multiple browsers and plugins, allows new modules. Exploit kits are cheap and customizable. Organized gangs use exploit kits. A group at Berkeley looked at 77,000 malicious websites (Grier et al., "Manufacturing Compromise: The Emergence of Exploit-as-a-Service", 2012). They found 10,000 distinct binaries among them, but derived from only a dozen or so exploit kits. Characteristics of Mass Malware: potent, resilient, relatively low cost Technical characteristics: multiple OS, multipe payloads, multiple scenarios, multiple languages, obfuscation Response time for 0-day exploits has gone down from ~40 days 5 years ago to about ~10 days now. So the drive with malware is towards mass customized exploits, to avoid detection There's plenty of evicence that exploit development has Project Manager bureaucracy. They infer from the malware edicts to: support all versions of reader support all versions of windows support all versions of flash support all browsers write large complex, difficult to main code (8750 lines of JavaScript for example Exploits have "loose coupling" of multipe versions of software (adobe), OS, and browser. This allows specific attacks against specific versions of multiple pieces of software. Also allows exploits of more obscure software/OS/browsers and obscure versions. Gave examples of exploits that exploited 2, 3, 6, or 14 separate bugs. However, these complete exploits are more likely to be buggy or fragile in themselves and easier to defeat. Future research includes normalizing malware and Javascript. Conclusion: The coming trend is that mass-malware with mass zero-day attacks will result in mass customization of attacks. x86 Rewriting: Defeating RoP and other Shinanighans Richard Wartell Richard Wartell The attack vector we are addressing here is: First some malware causes a buffer overflow. The malware has no program access, but input access and buffer overflow code onto stack Later the stack became non-executable. The workaround malware used was to write a bogus return address to the stack jumping to malware Later came ASLR (Address Space Layout Randomization) to randomize memory layout and make addresses non-deterministic. The workaround malware used was to jump t existing code segments in the program that can be used in bad ways "RoP" is Return-oriented Programming attacks. RoP attacks use your own code and write return address on stack to (existing) expoitable code found in program ("gadgets"). Pinkie Pie was paid $60K last year for a RoP attack. One solution is using anti-RoP compilers that compile source code with NO return instructions. ASLR does not randomize address space, just "gadgets". IPR/ILR ("Instruction Location Randomization") randomizes each instruction with a virtual machine. Richard's goal was to randomize a binary with no source code access. He created "STIR" (Self-Transofrming Instruction Relocation). STIR disassembles binary and operates on "basic blocks" of code. The STIR disassembler is conservative in what to disassemble. Each basic block is moved to a random location in memory. Next, STIR writes new code sections with copies of "basic blocks" of code in randomized locations. The old code is copied and rewritten with jumps to new code. the original code sections in the file is marked non-executible. STIR has better entropy than ASLR in location of code. Makes brute force attacks much harder. STIR runs on MS Windows (PEM) and Linux (ELF). It eliminated 99.96% or more "gadgets" (i.e., moved the address). Overhead usually 5-10% on MS Windows, about 1.5-4% on Linux (but some code actually runs faster!). The unique thing about STIR is it requires no source access and the modified binary fully works! Current work is to rewrite code to enforce security policies. For example, don't create a *.{exe,msi,bat} file. Or don't connect to the network after reading from the disk. Clowntown Express: interesting bugs and running a bug bounty program Collin Greene Collin Greene, Facebook Collin talked about Facebook's bug bounty program. Background at FB: FB has good security frameworks, such as security teams, external audits, and cc'ing on diffs. But there's lots of "deep, dark, forgotten" parts of legacy FB code. Collin gave several examples of bountied bugs. Some bounty submissions were on software purchased from a third-party (but bounty claimers don't know and don't care). We use security questions, as does everyone else, but they are basically insecure (often easily discoverable). Collin didn't expect many bugs from the bounty program, but they ended getting 20+ good bugs in first 24 hours and good submissions continue to come in. Bug bounties bring people in with different perspectives, and are paid only for success. Bug bounty is a better use of a fixed amount of time and money versus just code review or static code analysis. The Bounty program started July 2011 and paid out $1.5 million to date. 14% of the submissions have been high priority problems that needed to be fixed immediately. The best bugs come from a small % of submitters (as with everything else)—the top paid submitters are paid 6 figures a year. Spammers like to backstab competitors. The youngest sumitter was 13. Some submitters have been hired. Bug bounties also allows to see bugs that were missed by tools or reviews, allowing improvement in the process. Bug bounties might not work for traditional software companies where the product has release cycle or is not on Internet. Active Fingerprinting of Encrypted VPNs Anna Shubina Anna Shubina, Dartmouth Institute for Security, Technology, and Society (I missed the start of her talk because another track went overtime. But I have the DVD of the talk, so I'll expand later) IPsec leaves fingerprints. Using netcat, one can easily visually distinguish various crypto chaining modes just from packet timing on a chart (example, DES-CBC versus AES-CBC) One can tell a lot about VPNs just from ping roundtrips (such as what router is used) Delayed packets are not informative about a network, especially if far away from the network More needed to explore about how TCP works in real life with respect to timing Making Attacks Go Backwards Fuzzynop FuzzyNop, Mandiant This talk is not about threat attribution (finding who), product solutions, politics, or sales pitches. But who are making these malware threats? It's not a single person or group—they have diverse skill levels. There's a lot of fat-fingered fumblers out there. Always look for low-hanging fruit first: "hiding" malware in the temp, recycle, or root directories creation of unnamed scheduled tasks obvious names of files and syscalls ("ClearEventLog") uncleared event logs. Clearing event log in itself, and time of clearing, is a red flag and good first clue to look for on a suspect system Reverse engineering is hard. Disassembler use takes practice and skill. A popular tool is IDA Pro, but it takes multiple interactive iterations to get a clean disassembly. Key loggers are used a lot in targeted attacks. They are typically custom code or built in a backdoor. A big tip-off is that non-printable characters need to be printed out (such as "[Ctrl]" "[RightShift]") or time stamp printf strings. Look for these in files. Presence is not proof they are used. Absence is not proof they are not used. Java exploits. Can parse jar file with idxparser.py and decomile Java file. Java typially used to target tech companies. Backdoors are the main persistence mechanism (provided externally) for malware. Also malware typically needs command and control. Application of Artificial Intelligence in Ad-Hoc Static Code Analysis John Ashaman John Ashaman, Security Innovation Initially John tried to analyze open source files with open source static analysis tools, but these showed thousands of false positives. Also tried using grep, but tis fails to find anything even mildly complex. So next John decided to write his own tool. His approach was to first generate a call graph then analyze the graph. However, the problem is that making a call graph is really hard. For example, one problem is "evil" coding techniques, such as passing function pointer. First the tool generated an Abstract Syntax Tree (AST) with the nodes created from method declarations and edges created from method use. Then the tool generated a control flow graph with the goal to find a path through the AST (a maze) from source to sink. The algorithm is to look at adjacent nodes to see if any are "scary" (a vulnerability), using heuristics for search order. The tool, called "Scat" (Static Code Analysis Tool), currently looks for C# vulnerabilities and some simple PHP. Later, he plans to add more PHP, then JSP and Java. For more information see his posts in Security Innovation blog and NRefactory on GitHub. Mask Your Checksums—The Gorry Details Eric (XlogicX) Davisson Eric (XlogicX) Davisson Sometimes in emailing or posting TCP/IP packets to analyze problems, you may want to mask the IP address. But to do this correctly, you need to mask the checksum too, or you'll leak information about the IP. Problem reports found in stackoverflow.com, sans.org, and pastebin.org are usually not masked, but a few companies do care. If only the IP is masked, the IP may be guessed from checksum (that is, it leaks data). Other parts of packet may leak more data about the IP. TCP and IP checksums both refer to the same data, so can get more bits of information out of using both checksums than just using one checksum. Also, one can usually determine the OS from the TTL field and ports in a packet header. If we get hundreds of possible results (16x each masked nibble that is unknown), one can do other things to narrow the results, such as look at packet contents for domain or geo information. With hundreds of results, can import as CSV format into a spreadsheet. Can corelate with geo data and see where each possibility is located. Eric then demoed a real email report with a masked IP packet attached. Was able to find the exact IP address, given the geo and university of the sender. Point is if you're going to mask a packet, do it right. Eric wouldn't usually bother, but do it correctly if at all, to not create a false impression of security. Adventures with weird machines thirty years after "Reflections on Trusting Trust" Sergey Bratus Sergey Bratus, Dartmouth College (and Julian Bangert and Rebecca Shapiro, not present) "Reflections on Trusting Trust" refers to Ken Thompson's classic 1984 paper. "You can't trust code that you did not totally create yourself." There's invisible links in the chain-of-trust, such as "well-installed microcode bugs" or in the compiler, and other planted bugs. Thompson showed how a compiler can introduce and propagate bugs in unmodified source. But suppose if there's no bugs and you trust the author, can you trust the code? Hell No! There's too many factors—it's Babylonian in nature. Why not? Well, Input is not well-defined/recognized (code's assumptions about "checked" input will be violated (bug/vunerabiliy). For example, HTML is recursive, but Regex checking is not recursive. Input well-formed but so complex there's no telling what it does For example, ELF file parsing is complex and has multiple ways of parsing. Input is seen differently by different pieces of program or toolchain Any Input is a program input executes on input handlers (drives state changes & transitions) only a well-defined execution model can be trusted (regex/DFA, PDA, CFG) Input handler either is a "recognizer" for the inputs as a well-defined language (see langsec.org) or it's a "virtual machine" for inputs to drive into pwn-age ELF ABI (UNIX/Linux executible file format) case study. Problems can arise from these steps (without planting bugs): compiler linker loader ld.so/rtld relocator DWARF (debugger info) exceptions The problem is you can't really automatically analyze code (it's the "halting problem" and undecidable). Only solution is to freeze code and sign it. But you can't freeze everything! Can't freeze ASLR or loading—must have tables and metadata. Any sufficiently complex input data is the same as VM byte code Example, ELF relocation entries + dynamic symbols == a Turing Complete Machine (TM). @bxsays created a Turing machine in Linux from relocation data (not code) in an ELF file. For more information, see Rebecca "bx" Shapiro's presentation from last year's Toorcon, "Programming Weird Machines with ELF Metadata" @bxsays did same thing with Mach-O bytecode Or a DWARF exception handling data .eh_frame + glibc == Turning Machine X86 MMU (IDT, GDT, TSS): used address translation to create a Turning Machine. Page handler reads and writes (on page fault) memory. Uses a page table, which can be used as Turning Machine byte code. Example on Github using this TM that will fly a glider across the screen Next Sergey talked about "Parser Differentials". That having one input format, but two parsers, will create confusion and opportunity for exploitation. For example, CSRs are parsed during creation by cert requestor and again by another parser at the CA. Another example is ELF—several parsers in OS tool chain, which are all different. Can have two different Program Headers (PHDRs) because ld.so parses multiple PHDRs. The second PHDR can completely transform the executable. This is described in paper in the first issue of International Journal of PoC. Conclusions trusting computers not only about bugs! Bugs are part of a problem, but no by far all of it complex data formats means bugs no "chain of trust" in Babylon! (that is, with parser differentials) we need to squeeze complexity out of data until data stops being "code equivalent" Further information See and langsec.org. USENIX WOOT 2013 (Workshop on Offensive Technologies) for "weird machines" papers and videos.

    Read the article

  • Configuring Oracle iPlanet WebServer / Oracle Traffic Director to use crypto accelerators on T4-1 servers

    - by mv
    Configuring Oracle iPlanet Web Server / Oracle Traffic Director to use crypto accelerators on T4-1 servers Jyri had written a technical article on Configuring Solaris Cryptographic Framework and Sun Java System Web Server 7 on Systems With UltraSPARC T1 Processors. I tried to find out what has changed since then in T4. I have used a T4-1 SPARC system with Solaris 10. Results slightly vary for Solaris 11.  For Solaris 11, the T4 optimization was implemented in libsoftcrypto.so while it was in pkcs11_softtoken_extra.so for Solaris 10. Overview of T4 processors is here in this blog. Many thanx to Chi-Chang Lin and Julien for their help. 1. Install Oracle iPlanet Web Server / Oracle Traffic Director.  Go to instance/config directory.  # cd /opt/oracle/webserver7/https-hostname.fqdn/config 2. List default PKCS#11 Modules # ../../bin/modutil -dbdir . -listListing of PKCS #11 Modules-----------------------------------------------------------1. NSS Internal PKCS #11 Moduleslots: 2 slots attachedstatus: loadedslot: NSS Internal Cryptographic Servicestoken: NSS Generic Crypto Servicesslot: NSS User Private Key and Certificate Servicestoken: NSS Certificate DB2. Root Certslibrary name: libnssckbi.soslots: 1 slot attachedstatus: loadedslot: NSS Builtin Objectstoken: Builtin Object Token----------------------------------------------------------- 3. Initialize the soft token data store in the $HOME/.sunw/pkcs11_softtoken/ directory # pktool setpin keystore=pkcs11Enter token passphrase: olderpasswordCreate new passphrase: passwordRe-enter new passphrase: passwordPassphrase changed. 4. Offload crypto operations to Solaris Crypto Framework on T4 $ ../../bin/modutil -dbdir . -nocertdb -add SCF -libfile /usr/lib/libpkcs11.so -mechanisms RSA:AES:SHA1:MD5 Module "SCF" added to database. Note that -nocertdb means modutil won't try to open the NSS softoken key database. It doesn't even have to be present. PKCS#11 library used is /usr/lib/libpkcs11.so. If the server is running in 64 bit mode, we have to use /usr/lib/64/libpkcs11.so Unlike T1 and T2, in T4 we do not have to disable mechanisms in softtoken provider using cryptoadm. 5. List again to check that a new module SCF is added # ../../bin/modutil -dbdir . -list Listing of PKCS #11 Modules-----------------------------------------------------------1. NSS Internal PKCS #11 Moduleslots: 2 slots attachedstatus: loadedslot: NSS Internal Cryptographic Servicestoken: NSS Generic Crypto Servicesslot: NSS User Private Key and Certificate Servicestoken: NSS Certificate DB2. SCFlibrary name: /usr/lib/libpkcs11.soslots: 2 slots attachedstatus: loadedslot: Sun Metaslottoken: Sun Metaslotslot: n2rng/0 SUNW_N2_Random_Number_Generator token: n2rng/0 SUNW_N2_RNG 3. Root Certs library name: libnssckbi.so slots: 1 slot attached status: loaded slot: NSS Builtin Objects token: Builtin Object Token----------------------------------------------------------- 6.  Create certificate in “Sun Metaslot” : I have used certutil, but you must use Admin Server CLI / GUI # ../../bin/certutil -S -x -n "Server-Cert" -t "CT,CT,CT" -s "CN=*.fqdn" -d . -h "Sun Metaslot"Enter Password or Pin for "Sun Metaslot": password 7. Verify that the certificate is created properly in “Sun Metslaot” # ../../bin/certutil -L -d . -h "Sun Metaslot"Certificate Nickname Trust AttributesSSL,S/MIME,JAR/XPIEnter Password or Pin for "Sun Metaslot": passwordSun Metaslot:Server-Cert CTu,Cu,Cu# 8. Associate this newly created certificate to http listener using Admin CLI/GUI. After that server.xml should have <http-listener> ...    <ssl>        <server-cert-nickname>Sun Metaslot:Server-Cert</server-cert-nicknamer>    </ssl> Note the prefix "Sun Metaslot" 9. Disable PKCS#11 bypass To use the accelerated AES algorithm, turn off PKCS#11 bypass, and configure modutil to have the AES mechanism go to the Metaslot. After you disable PKCS#11 bypasss using Admin GUI/CLI,  check that server.xml should have <server> ....    <pkcs11>         <enabled>1</enabled>         <allow-bypass>0</allow-bypass>     </pkcs11> With PKCS#11 bypass enabled, Oracle iPlanet Web Server will only use the RSA capability of the T4, provided certificate and key are stored in the T4 slot (Metaslot). Actually, the RSA op is never bypassed in NSS, it's always done with PKCS#11 calls. So the bypass settings won't affect the behavior of the probes for RSA at all. The only thing that matters if where the RSA key and certificate live, ie. which PKCS#11 token, and thus which PKCS#11 module gets called to do the work. If your certificate/key are in the NSS certificate/key db, you will see libsoftokn3/libfreebl libraries doing the RSA work. If they are in the Sun Metaslot, it should be the Solaris code. 10. Start the server instance # ../bin/startserv Oracle iPlanet Web Server 7.0.16 B09/14/2012 03:33Please enter the PIN for the "Sun Metaslot" token: password...info: HTTP3072: http-listener-1: https://hostname.fqdn:80 ready to accept requestsinfo: CORE3274: successful server startup 11. Figure out which process to run this DTrace script on # ps -eaf | grep webservd | grep -v dogwebservd 18224 18223 0 13:17:25 ? 0:07 webservd -d /opt/oracle/webserver7/https-hostname.fqdn/config -r /opt/root 18225 18224 0 13:17:25 ? 0:00 webservd -d /opt/oracle/webserver7/https-hostname.fqdn/config -r /opt/ (For Oracle Traffic Director look for process named "trafficd") We see that the child process id is “18225” 12. Clients for testing : You can use any browser. I used NSS tool tstclnt for testing $cat > req.txtGET /index.html HTTP/1.0 For checking both RSA and AES, I used cipher “:0035” which is TLS_RSA_WITH_AES_256_CBC_SHA $./tstclnt -h hostname -p 80 -d . -T -f -o -v -c “:0035” < req.txt 13. How do I make sure that crypto accelerator is being used 13.1 Create DTrace script The following D script should be able to uncover whether T4-specific crypto routine are being called or not. It also displays stats per second. # cat > t4crypto.d#!/usr/sbin/dtrace -spid$target::*rsa*:entry,pid$target::*yf*:entry{    @ops[probemod, probefunc] = count();}tick-1sec{    printa(@ops);    trunc(@ops);} Invoke with './t4crypto.d -p <pid> ' 13.2 EXPECTED PROBES FOR Solaris 10 : If offloading to T4 HW are correctly set up, the expected DTrace output would have these probes and libraries library Operations PROBES pkcs11_softtoken_extra.so RSA soft_decrypt_rsa_pkcs_decode, soft_encrypt_rsa_pkcs_encode soft_rsa_crypt_init_common soft_rsa_decrypt, soft_rsa_encrypt soft_rsa_decrypt_common, soft_rsa_encrypt_common AES yf_aes_instructions_present yf_aes_expand256, yf_aes256_cbc_decrypt, yf_aes256_cbc_encrypt, yf_aes256_load_keys_for_decrypt, yf_aes256_load_keys_for_encrypt, Note that these are for 256, same for 128, 192... these are for cbc, same for ecb, ctr, cfb128... DES yf_des_expand, yf_des_instructions_present yf_des_encrypt libmd_psr.so MD5 yf_md5_multiblock, yf_md5_instruction_present SHA1 yf_sha1_instruction_present, yf_sha1_multibloc 13.3 SAMPLE OUTPUT FOR CIPHER TLS_RSA_WITH_AES_256_CBC_SHA (0x0035) ON T4 SPARC SOLARIS 10 WITHOUT PKCS#11 BYPASS # ./t4crypto.d -p 18225 pkcs11_softtoken_extra.so.1   soft_decrypt_rsa_pkcs_decode    1 pkcs11_softtoken_extra.so.1   soft_rsa_crypt_init_common      1 pkcs11_softtoken_extra.so.1   soft_rsa_decrypt                1 pkcs11_softtoken_extra.so.1   big_mp_mul_yf                   2 pkcs11_softtoken_extra.so.1   mpm_yf_mpmul                    2 pkcs11_softtoken_extra.so.1   mpmul_arr_yf                    2 pkcs11_softtoken_extra.so.1   rijndael_key_setup_enc_yf       2 pkcs11_softtoken_extra.so.1   soft_rsa_decrypt_common         2 pkcs11_softtoken_extra.so.1   yf_aes_expand256                2 pkcs11_softtoken_extra.so.1   yf_aes256_cbc_decrypt           3 pkcs11_softtoken_extra.so.1   yf_aes256_load_keys_for_decrypt 3 pkcs11_softtoken_extra.so.1   big_mont_mul_yf                 6 pkcs11_softtoken_extra.so.1   mm_yf_montmul                   6 pkcs11_softtoken_extra.so.1   yf_des_instructions_present     6 pkcs11_softtoken_extra.so.1   yf_aes256_cbc_encrypt           8 pkcs11_softtoken_extra.so.1   yf_aes256_load_keys_for_encrypt 8 pkcs11_softtoken_extra.so.1   yf_mpmul_present                8 pkcs11_softtoken_extra.so.1   yf_aes_instructions_present    13 pkcs11_softtoken_extra.so.1   yf_des_encrypt                 18 libmd_psr.so.1                yf_md5_multiblock              41 libmd_psr.so.1                yf_md5_instruction_present     72 libmd_psr.so.1                yf_sha1_instruction_present    82 libmd_psr.so.1                yf_sha1_multiblock             82 This indicates that both RSA and AES ops are done in Solaris Crypto Framework. 13.4 SAMPLE OUTPUT FOR CIPHER TLS_RSA_WITH_AES_256_CBC_SHA (0x0035) ON T4 SPARC SOLARIS 10 WITH PKCS#11 BYPASS # ./t4crypto.d -p 18225 pkcs11_softtoken_extra.so.1   soft_decrypt_rsa_pkcs_decode 1 pkcs11_softtoken_extra.so.1   soft_rsa_crypt_init_common   1 pkcs11_softtoken_extra.so.1   soft_rsa_decrypt             1 pkcs11_softtoken_extra.so.1   soft_rsa_decrypt_common      1 pkcs11_softtoken_extra.so.1   big_mp_mul_yf                2 pkcs11_softtoken_extra.so.1   mpm_yf_mpmul                 2 pkcs11_softtoken_extra.so.1   mpmul_arr_yf                 2 pkcs11_softtoken_extra.so.1   big_mont_mul_yf              6 pkcs11_softtoken_extra.so.1   mm_yf_montmul                6 pkcs11_softtoken_extra.so.1   yf_mpmul_present             8 For this cipher, when I enable PKCS#11 bypass, Only RSA probes are being hit AES probes are not being hit. 13.5 ustack() for RSA operations / probefunc == "soft_rsa_decrypt" / Shows that libnss3.so is calling C_* functions of libpkcs11.so which is calling functions of pkcs11_softtoken_extra.so for both cases with and without bypass. When PKCS#11 bypass is disabled (allow-bypass is 0) pkcs11_softtoken_extra.so.1`soft_rsa_decrypt pkcs11_softtoken_extra.so.1`soft_rsa_decrypt_common+0x94 pkcs11_softtoken_extra.so.1`soft_unwrapkey+0x258 pkcs11_softtoken_extra.so.1`C_UnwrapKey+0x1ec libpkcs11.so.1`meta_unwrap_key+0x17c libpkcs11.so.1`meta_UnwrapKey+0xc4 libpkcs11.so.1`C_UnwrapKey+0xfc libnss3.so`pk11_AnyUnwrapKey+0x6b8 libnss3.so`PK11_PubUnwrapSymKey+0x8c libssl3.so`ssl3_HandleRSAClientKeyExchange+0x1a0 libssl3.so`ssl3_HandleClientKeyExchange+0x154 libssl3.so`ssl3_HandleHandshakeMessage+0x440 libssl3.so`ssl3_HandleHandshake+0x11c libssl3.so`ssl3_HandleRecord+0x5e8 libssl3.so`ssl3_GatherCompleteHandshake+0x5c libssl3.so`ssl_GatherRecord1stHandshake+0x30 libssl3.so`ssl_Do1stHandshake+0xec libssl3.so`ssl_SecureRecv+0x1c8 libssl3.so`ssl_Recv+0x9c libns-httpd40.so`__1cNDaemonSessionDrun6M_v_+0x2dc When PKCS#11 bypass is enabled (allow-bypass is 1) pkcs11_softtoken_extra.so.1`soft_rsa_decrypt pkcs11_softtoken_extra.so.1`soft_rsa_decrypt_common+0x94 pkcs11_softtoken_extra.so.1`C_Decrypt+0x164 libpkcs11.so.1`meta_do_operation+0x27c libpkcs11.so.1`meta_Decrypt+0x4c libpkcs11.so.1`C_Decrypt+0xcc libnss3.so`PK11_PrivDecryptPKCS1+0x1ac libssl3.so`ssl3_HandleRSAClientKeyExchange+0xe4 libssl3.so`ssl3_HandleClientKeyExchange+0x154 libssl3.so`ssl3_HandleHandshakeMessage+0x440 libssl3.so`ssl3_HandleHandshake+0x11c libssl3.so`ssl3_HandleRecord+0x5e8 libssl3.so`ssl3_GatherCompleteHandshake+0x5c libssl3.so`ssl_GatherRecord1stHandshake+0x30 libssl3.so`ssl_Do1stHandshake+0xec libssl3.so`ssl_SecureRecv+0x1c8 libssl3.so`ssl_Recv+0x9c libns-httpd40.so`__1cNDaemonSessionDrun6M_v_+0x2dc libnsprwrap.so`ThreadMain+0x1c libnspr4.so`_pt_root+0xe8 13.6 ustack() FOR AES operations / probefunc == "yf_aes256_cbc_encrypt" / When PKCS#11 bypass is disabled (allow-bypass is 0) pkcs11_softtoken_extra.so.1`yf_aes256_cbc_encrypt pkcs11_softtoken_extra.so.1`aes_block_process_contiguous_whole_blocks+0xb4 pkcs11_softtoken_extra.so.1`aes_crypt_contiguous_blocks+0x1cc pkcs11_softtoken_extra.so.1`soft_aes_encrypt_common+0x22c pkcs11_softtoken_extra.so.1`C_EncryptUpdate+0x10c libpkcs11.so.1`meta_do_operation+0x1fc libpkcs11.so.1`meta_EncryptUpdate+0x4c libpkcs11.so.1`C_EncryptUpdate+0xcc libnss3.so`PK11_CipherOp+0x1a0 libssl3.so`ssl3_CompressMACEncryptRecord+0x264 libssl3.so`ssl3_SendRecord+0x300 libssl3.so`ssl3_FlushHandshake+0x54 libssl3.so`ssl3_SendFinished+0x1fc libssl3.so`ssl3_HandleFinished+0x314 libssl3.so`ssl3_HandleHandshakeMessage+0x4ac libssl3.so`ssl3_HandleHandshake+0x11c libssl3.so`ssl3_HandleRecord+0x5e8 libssl3.so`ssl3_GatherCompleteHandshake+0x5c libssl3.so`ssl_GatherRecord1stHandshake+0x30 libssl3.so`ssl_Do1stHandshake+0xec Shows that libnss3.so is calling C_* functions of libpkcs11.so which is calling functions of pkcs11_softtoken_extra.so However when PKCS#11 bypass is disabled (allow-bypass is 1) this stack isn't getting called. 14. LIST OF ALL THE PROBES MATCHED BY D SCRIPT FOR REFERENCE # ./t4crypto.d -p 18225 -l ID PROVIDER MODULE FUNCTION NAME ... 55720 pid18225 libmd_psr.so.1 yf_md5_instruction_present entry 55721 pid18225 libmd_psr.so.1 yf_sha256_instruction_present entry 55722 pid18225 libmd_psr.so.1 yf_sha512_instruction_present entry 55723 pid18225 libmd_psr.so.1 yf_sha1_instruction_present entry 55724 pid18225 libmd_psr.so.1 yf_sha256 entry 55725 pid18225 libmd_psr.so.1 yf_sha256_multiblock entry 55726 pid18225 libmd_psr.so.1 yf_sha512 entry 55727 pid18225 libmd_psr.so.1 yf_sha512_multiblock entry 55728 pid18225 libmd_psr.so.1 yf_sha1 entry 55729 pid18225 libmd_psr.so.1 yf_sha1_multiblock entry 55730 pid18225 libmd_psr.so.1 yf_md5 entry 55731 pid18225 libmd_psr.so.1 yf_md5_multiblock entry 55732 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_instructions_present entry 55733 pid18225 pkcs11_softtoken_extra.so.1 rijndael_key_setup_enc_yf entry 55734 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_expand128 entry 55735 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_encrypt128 entry 55736 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_decrypt128 entry 55737 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_expand192 entry 55738 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_encrypt192 entry 55739 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_decrypt192 entry 55740 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_expand256 entry 55741 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_encrypt256 entry 55742 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_decrypt256 entry 55743 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_load_keys_for_encrypt entry 55744 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_load_keys_for_encrypt entry 55745 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_load_keys_for_encrypt entry 55746 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_ecb_encrypt entry 55747 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_ecb_encrypt entry 55748 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_ecb_encrypt entry 55749 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_cbc_encrypt entry 55750 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_cbc_encrypt entry 55751 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_cbc_encrypt entry 55752 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_ctr_crypt entry 55753 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_ctr_crypt entry 55754 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_ctr_crypt entry 55755 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_cfb128_encrypt entry 55756 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_cfb128_encrypt entry 55757 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_cfb128_encrypt entry 55758 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_load_keys_for_decrypt entry 55759 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_load_keys_for_decrypt entry 55760 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_load_keys_for_decrypt entry 55761 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_ecb_decrypt entry 55762 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_ecb_decrypt entry 55763 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_ecb_decrypt entry 55764 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_cbc_decrypt entry 55765 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_cbc_decrypt entry 55766 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_cbc_decrypt entry 55767 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_cfb128_decrypt entry 55768 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_cfb128_decrypt entry 55769 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_cfb128_decrypt entry 55771 pid18225 pkcs11_softtoken_extra.so.1 yf_des_instructions_present entry 55772 pid18225 pkcs11_softtoken_extra.so.1 yf_des_expand entry 55773 pid18225 pkcs11_softtoken_extra.so.1 yf_des_encrypt entry 55774 pid18225 pkcs11_softtoken_extra.so.1 yf_mpmul_present entry 55775 pid18225 pkcs11_softtoken_extra.so.1 yf_montmul_present entry 55776 pid18225 pkcs11_softtoken_extra.so.1 mm_yf_montmul entry 55777 pid18225 pkcs11_softtoken_extra.so.1 mm_yf_montsqr entry 55778 pid18225 pkcs11_softtoken_extra.so.1 mm_yf_restore_func entry 55779 pid18225 pkcs11_softtoken_extra.so.1 mm_yf_ret_from_mont_func entry 55780 pid18225 pkcs11_softtoken_extra.so.1 mm_yf_execute_slp entry 55781 pid18225 pkcs11_softtoken_extra.so.1 big_modexp_ncp_yf entry 55782 pid18225 pkcs11_softtoken_extra.so.1 big_mont_mul_yf entry 55783 pid18225 pkcs11_softtoken_extra.so.1 mpmul_arr_yf entry 55784 pid18225 pkcs11_softtoken_extra.so.1 big_mp_mul_yf entry 55785 pid18225 pkcs11_softtoken_extra.so.1 mpm_yf_mpmul entry 55786 pid18225 libns-httpd40.so nsapi_rsa_set_priv_fn entry ... 55795 pid18225 libnss3.so prepare_rsa_priv_key_export_for_asn1 entry 55796 pid18225 libresolv.so.2 sunw_dst_rsaref_init entry 55797 pid18225 libnssutil3.so NSS_Get_SEC_UniversalStringTemplate entry ... 55813 pid18225 libsoftokn3.so prepare_low_rsa_priv_key_for_asn1 entry 55814 pid18225 libsoftokn3.so rsa_FormatOneBlock entry 55815 pid18225 libsoftokn3.so rsa_FormatBlock entry 55816 pid18225 libnssdbm3.so lg_prepare_low_rsa_priv_key_for_asn1 entry 55817 pid18225 libfreebl_32fpu_3.so rsa_build_from_primes entry 55818 pid18225 libfreebl_32fpu_3.so rsa_is_prime entry 55819 pid18225 libfreebl_32fpu_3.so rsa_get_primes_from_exponents entry 55820 pid18225 libfreebl_32fpu_3.so rsa_PrivateKeyOpNoCRT entry 55821 pid18225 libfreebl_32fpu_3.so rsa_PrivateKeyOpCRTNoCheck entry 55822 pid18225 libfreebl_32fpu_3.so rsa_PrivateKeyOpCRTCheckedPubKey entry 55823 pid18225 pkcs11_kernel.so.1 key_gen_rsa_by_value entry 55824 pid18225 pkcs11_kernel.so.1 get_rsa_private_key entry 55825 pid18225 pkcs11_kernel.so.1 get_rsa_public_key entry 55826 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_encrypt entry 55827 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_decrypt entry 55828 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_crypt_init_common entry 55829 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_encrypt_common entry 55830 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_decrypt_common entry 55831 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_sign_verify_init_common entry 55832 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_sign_common entry 55833 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_verify_common entry 55834 pid18225 pkcs11_softtoken_extra.so.1 generate_rsa_key entry 55835 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_genkey_pair entry 55836 pid18225 pkcs11_softtoken_extra.so.1 get_rsa_sha1_prefix entry 55837 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_digest_sign_common entry 55838 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_digest_verify_common entry 55839 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_verify_recover entry 55840 pid18225 pkcs11_softtoken_extra.so.1 rsa_pri_to_asn1 entry 55841 pid18225 pkcs11_softtoken_extra.so.1 asn1_to_rsa_pri entry 55842 pid18225 pkcs11_softtoken_extra.so.1 soft_encrypt_rsa_pkcs_encode entry 55843 pid18225 pkcs11_softtoken_extra.so.1 soft_decrypt_rsa_pkcs_decode entry 55844 pid18225 pkcs11_softtoken_extra.so.1 soft_sign_rsa_pkcs_encode entry 55845 pid18225 pkcs11_softtoken_extra.so.1 soft_verify_rsa_pkcs_decode entry 55770 profile tick-1sec

    Read the article

  • ANTS CLR and Memory Profiler In Depth Review (Part 1 of 2 &ndash; CLR Profiler)

    - by ToStringTheory
    One of the things that people might not know about me, is my obsession to make my code as efficient as possible.  Many people might not realize how much of a task or undertaking that this might be, but it is surely a task as monumental as climbing Mount Everest, except this time it is a challenge for the mind…  In trying to make code efficient, there are many different factors that play a part – size of project or solution, tiers, language used, experience and training of the programmer, technologies used, maintainability of the code – the list can go on for quite some time. I spend quite a bit of time when developing trying to determine what is the best way to implement a feature to accomplish the efficiency that I look to achieve.  One program that I have recently come to learn about – Red Gate ANTS Performance (CLR) and Memory profiler gives me tools to accomplish that job more efficiently as well.  In this review, I am going to cover some of the features of the ANTS profiler set by compiling some hideous example code to test against. Notice As a member of the Geeks With Blogs Influencers program, one of the perks is the ability to review products, in exchange for a free license to the program.  I have not let this affect my opinions of the product in any way, and Red Gate nor Geeks With Blogs has tried to influence my opinion regarding this product in any way. Introduction The ANTS Profiler pack provided by Red Gate was something that I had not heard of before receiving an email regarding an offer to review it for a license.  Since I look to make my code efficient, it was a no brainer for me to try it out!  One thing that I have to say took me by surprise is that upon downloading the program and installing it you fill out a form for your usual contact information.  Sure enough within 2 hours, I received an email from a sales representative at Red Gate asking if she could help me to achieve the most out of my trial time so it wouldn’t go to waste.  After replying to her and explaining that I was looking to review its feature set, she put me in contact with someone that setup a demo session to give me a quick rundown of its features via an online meeting.  After having dealt with a massive ordeal with one of my utility companies and their complete lack of customer service, Red Gates friendly and helpful representatives were a breath of fresh air, and something I was thankful for. ANTS CLR Profiler The ANTS CLR profiler is the thing I want to focus on the most in this post, so I am going to dive right in now. Install was simple and took no time at all.  It installed both the profiler for the CLR and Memory, but also visual studio extensions to facilitate the usage of the profilers (click any images for full size images): The Visual Studio menu options (under ANTS menu) Starting the CLR Performance Profiler from the start menu yields this window If you follow the instructions after launching the program from the start menu (Click File > New Profiling Session to start a new project), you are given a dialog with plenty of options for profiling: The New Session dialog.  Lots of options.  One thing I noticed is that the buttons in the lower right were half-covered by the panel of the application.  If I had to guess, I would imagine that this is caused by my DPI settings being set to 125%.  This is a problem I have seen in other applications as well that don’t scale well to different dpi scales. The profiler options give you the ability to profile: .NET Executable ASP.NET web application (hosted in IIS) ASP.NET web application (hosted in IIS express) ASP.NET web application (hosted in Cassini Web Development Server) SharePoint web application (hosted in IIS) Silverlight 4+ application Windows Service COM+ server XBAP (local XAML browser application) Attach to an already running .NET 4 process Choosing each option provides a varying set of other variables/options that one can set including options such as application arguments, operating path, record I/O performance performance counters to record (43 counters in all!), etc…  All in all, they give you the ability to profile many different .Net project types, and make it simple to do so.  In most cases of my using this application, I would be using the built in Visual Studio extensions, as they automatically start a new profiling project in ANTS with the options setup, and start your program, however RedGate has made it easy enough to profile outside of Visual Studio as well. On the flip side of this, as someone who lives most of their work life in Visual Studio, one thing I do wish is that instead of opening an entirely separate application/gui to perform profiling after launching, that instead they would provide a Visual Studio panel with the information, and integrate more of the profiling project information into Visual Studio.  So, now that we have an idea of what options that the profiler gives us, its time to test its abilities and features. Horrendous Example Code – Prime Number Generator One of my interests besides development, is Physics and Math – what I went to college for.  I have especially always been interested in prime numbers, as they are something of a mystery…  So, I decided that I would go ahead and to test the abilities of the profiler, I would write a small program, website, and library to generate prime numbers in the quantity that you ask for.  I am going to start off with some terrible code, and show how I would see the profiler being used as a development tool. First off, the IPrimes interface (all code is downloadable at the end of the post): interface IPrimes { IEnumerable<int> GetPrimes(int retrieve); } Simple enough, right?  Anything that implements the interface will (hopefully) provide an IEnumerable of int, with the quantity specified in the parameter argument.  Next, I am going to implement this interface in the most basic way: public class DumbPrimes : IPrimes { public IEnumerable<int> GetPrimes(int retrieve) { //store a list of primes already found var _foundPrimes = new List<int>() { 2, 3 }; //if i ask for 1 or two primes, return what asked for if (retrieve <= _foundPrimes.Count()) return _foundPrimes.Take(retrieve); //the next number to look at int _analyzing = 4; //since I already determined I don't have enough //execute at least once, and until quantity is sufficed do { //assume prime until otherwise determined bool isPrime = true; //start dividing at 2 //divide until number is reached, or determined not prime for (int i = 2; i < _analyzing && isPrime; i++) { //if (i) goes into _analyzing without a remainder, //_analyzing is NOT prime if (_analyzing % i == 0) isPrime = false; } //if it is prime, add to found list if (isPrime) _foundPrimes.Add(_analyzing); //increment number to analyze next _analyzing++; } while (_foundPrimes.Count() < retrieve); return _foundPrimes; } } This is the simplest way to get primes in my opinion.  Checking each number by the straight definition of a prime – is it divisible by anything besides 1 and itself. I have included this code in a base class library for my solution, as I am going to use it to demonstrate a couple of features of ANTS.  This class library is consumed by a simple non-MVVM WPF application, and a simple MVC4 website.  I will not post the WPF code here inline, as it is simply an ObservableCollection<int>, a label, two textbox’s, and a button. Starting a new Profiling Session So, in Visual Studio, I have just completed my first stint developing the GUI and DumbPrimes IPrimes class, so now I want to check my codes efficiency by profiling it.  All I have to do is build the solution (surprised initiating a profiling session doesn’t do this, but I suppose I can understand it), and then click the ANTS menu, followed by Profile Performance.  I am then greeted by the profiler starting up and already monitoring my program live: You are provided with a realtime graph at the top, and a pane at the bottom giving you information on how to proceed.  I am going to start by asking my program to show me the first 15000 primes: After the program finally began responding again (I did all the work on the main UI thread – how bad!), I stopped the profiler, which did kill the process of my program too.  One important thing to note, is that the profiler by default wants to give you a lot of detail about the operation – line hit counts, time per line, percent time per line, etc…  The important thing to remember is that this itself takes a lot of time.  When running my program without the profiler attached, it can generate the 15000 primes in 5.18 seconds, compared to 74.5 seconds – almost a 1500 percent increase.  While this may seem like a lot, remember that there is a trade off.  It may be WAY more inefficient, however, I am able to drill down and make improvements to specific problem areas, and then decrease execution time all around. Analyzing the Profiling Session After clicking ‘Stop Profiling’, the process running my application stopped, and the entire execution time was automatically selected by ANTS, and the results shown below: Now there are a number of interesting things going on here, I am going to cover each in a section of its own: Real Time Performance Counter Bar (top of screen) At the top of the screen, is the real time performance bar.  As your application is running, this will constantly update with the currently selected performance counters status.  A couple of cool things to note are the fact that you can drag a selection around specific time periods to drill down the detail views in the lower 2 panels to information pertaining to only that period. After selecting a time period, you can bookmark a section and name it, so that it is easy to find later, or after reloaded at a later time.  You can also zoom in, out, or fit the graph to the space provided – useful for drilling down. It may be hard to see, but at the top of the processor time graph below the time ticks, but above the red usage graph, there is a green bar. This bar shows at what times a method that is selected in the ‘Call tree’ panel is called. Very cool to be able to click on a method and see at what times it made an impact. As I said before, ANTS provides 43 different performance counters you can hook into.  Click the arrow next to the Performance tab at the top will allow you to change between different counters if you have them selected: Method Call Tree, ADO.Net Database Calls, File IO – Detail Panel Red Gate really hit the mark here I think. When you select a section of the run with the graph, the call tree populates to fill a hierarchical tree of method calls, with information regarding each of the methods.   By default, methods are hidden where the source is not provided (framework type code), however, Red Gate has integrated Reflector into ANTS, so even if you don’t have source for something, you can select a method and get the source if you want.  Methods are also hidden where the impact is seen as insignificant – methods that are only executed for 1% of the time of the overall calling methods time; in other words, working on making them better is not where your efforts should be focused. – Smart! Source Panel – Detail Panel The source panel is where you can see line level information on your code, showing the code for the currently selected method from the Method Call Tree.  If the code is not available, Reflector takes care of it and shows the code anyways! As you can notice, there does seem to be a problem with how ANTS determines what line is the actual line that a call is completed on.  I have suspicions that this may be due to some of the inline code optimizations that the CLR applies upon compilation of the assembly.  In a method with comments, the problem is much more severe: As you can see here, apparently the most offending code in my base library was a comment – *gasp*!  Removing the comments does help quite a bit, however I hope that Red Gate works on their counter algorithm soon to improve the logic on positioning for statistics: I did a small test just to demonstrate the lines are correct without comments. For me, it isn’t a deal breaker, as I can usually determine the correct placements by looking at the application code in the region and determining what makes sense, but it is something that would probably build up some irritation with time. Feature – Suggest Method for Optimization A neat feature to really help those in need of a pointer, is the menu option under tools to automatically suggest methods to optimize/improve: Nice feature – clicking it filters the call tree and stars methods that it thinks are good candidates for optimization.  I do wish that they would have made it more visible for those of use who aren’t great on sight: Process Integration I do think that this could have a place in my process.  After experimenting with the profiler, I do think it would be a great benefit to do some development, testing, and then after all the bugs are worked out, use the profiler to check on things to make sure nothing seems like it is hogging more than its fair share.  For example, with this program, I would have developed it, ran it, tested it – it works, but slowly. After looking at the profiler, and seeing the massive amount of time spent in 1 method, I might go ahead and try to re-implement IPrimes (I actually would probably rewrite the offending code, but so that I can distribute both sets of code easily, I’m just going to make another implementation of IPrimes).  Using two pieces of knowledge about prime numbers can make this method MUCH more efficient – prime numbers fall into two buckets 6k+/-1 , and a number is prime if it is not divisible by any other primes before it: public class SmartPrimes : IPrimes { public IEnumerable<int> GetPrimes(int retrieve) { //store a list of primes already found var _foundPrimes = new List<int>() { 2, 3 }; //if i ask for 1 or two primes, return what asked for if (retrieve <= _foundPrimes.Count()) return _foundPrimes.Take(retrieve); //the next number to look at int _k = 1; //since I already determined I don't have enough //execute at least once, and until quantity is sufficed do { //assume prime until otherwise determined bool isPrime = true; int potentialPrime; //analyze 6k-1 //assign the value to potential potentialPrime = 6 * _k - 1; //if there are any primes that divise this, it is NOT a prime number //using PLINQ for quick boost isPrime = !_foundPrimes.AsParallel() .Any(prime => potentialPrime % prime == 0); //if it is prime, add to found list if (isPrime) _foundPrimes.Add(potentialPrime); if (_foundPrimes.Count() == retrieve) break; //analyze 6k+1 //assign the value to potential potentialPrime = 6 * _k + 1; //if there are any primes that divise this, it is NOT a prime number //using PLINQ for quick boost isPrime = !_foundPrimes.AsParallel() .Any(prime => potentialPrime % prime == 0); //if it is prime, add to found list if (isPrime) _foundPrimes.Add(potentialPrime); //increment k to analyze next _k++; } while (_foundPrimes.Count() < retrieve); return _foundPrimes; } } Now there are definitely more things I can do to help make this more efficient, but for the scope of this example, I think this is fine (but still hideous)! Profiling this now yields a happy surprise 27 seconds to generate the 15000 primes with the profiler attached, and only 1.43 seconds without.  One important thing I wanted to call out though was the performance graph now: Notice anything odd?  The %Processor time is above 100%.  This is because there is now more than 1 core in the operation.  A better label for the chart in my mind would have been %Core time, but to each their own. Another odd thing I noticed was that the profiler seemed to be spot on this time in my DumbPrimes class with line details in source, even with comments..  Odd. Profiling Web Applications The last thing that I wanted to cover, that means a lot to me as a web developer, is the great amount of work that Red Gate put into the profiler when profiling web applications.  In my solution, I have a simple MVC4 application setup with 1 page, a single input form, that will output prime values as my WPF app did.  Launching the profiler from Visual Studio as before, nothing is really different in the profiler window, however I did receive a UAC prompt for a Red Gate helper app to integrate with the web server without notification. After requesting 500, 1000, 2000, and 5000 primes, and looking at the profiler session, things are slightly different from before: As you can see, there are 4 spikes of activity in the processor time graph, but there is also something new in the call tree: That’s right – ANTS will actually group method calls by get/post operations, so it is easier to find out what action/page is giving the largest problems…  Pretty cool in my mind! Overview Overall, I think that Red Gate ANTS CLR Profiler has a lot to offer, however I think it also has a long ways to go.  3 Biggest Pros: Ability to easily drill down from time graph, to method calls, to source code Wide variety of counters to choose from when profiling your application Excellent integration/grouping of methods being called from web applications by request – BRILLIANT! 3 Biggest Cons: Issue regarding line details in source view Nit pick – Processor time vs. Core time Nit pick – Lack of full integration with Visual Studio Ratings Ease of Use (7/10) – I marked down here because of the problems with the line level details and the extra work that that entails, and the lack of better integration with Visual Studio. Effectiveness (10/10) – I believe that the profiler does EXACTLY what it purports to do.  Especially with its large variety of performance counters, a definite plus! Features (9/10) – Besides the real time performance monitoring, and the drill downs that I’ve shown here, ANTS also has great integration with ADO.Net, with the ability to show database queries run by your application in the profiler.  This, with the line level details, the web request grouping, reflector integration, and various options to customize your profiling session I think create a great set of features! Customer Service (10/10) – My entire experience with Red Gate personnel has been nothing but good.  their people are friendly, helpful, and happy! UI / UX (8/10) – The interface is very easy to get around, and all of the options are easy to find.  With a little bit of poking around, you’ll be optimizing Hello World in no time flat! Overall (8/10) – Overall, I am happy with the Performance Profiler and its features, as well as with the service I received when working with the Red Gate personnel.  I WOULD recommend you trying the application and seeing if it would fit into your process, BUT, remember there are still some kinks in it to hopefully be worked out. My next post will definitely be shorter (hopefully), but thank you for reading up to here, or skipping ahead!  Please, if you do try the product, drop me a message and let me know what you think!  I would love to hear any opinions you may have on the product. Code Feel free to download the code I used above – download via DropBox

    Read the article

  • C#/.NET Little Wonders: The Generic Func Delegates

    - by James Michael Hare
    Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. Back in one of my three original “Little Wonders” Trilogy of posts, I had listed generic delegates as one of the Little Wonders of .NET.  Later, someone posted a comment saying said that they would love more detail on the generic delegates and their uses, since my original entry just scratched the surface of them. Last week, I began our look at some of the handy generic delegates built into .NET with a description of delegates in general, and the Action family of delegates.  For this week, I’ll launch into a look at the Func family of generic delegates and how they can be used to support generic, reusable algorithms and classes. Quick Delegate Recap Delegates are similar to function pointers in C++ in that they allow you to store a reference to a method.  They can store references to either static or instance methods, and can actually be used to chain several methods together in one delegate. Delegates are very type-safe and can be satisfied with any standard method, anonymous method, or a lambda expression.  They can also be null as well (refers to no method), so care should be taken to make sure that the delegate is not null before you invoke it. Delegates are defined using the keyword delegate, where the delegate’s type name is placed where you would typically place the method name: 1: // This delegate matches any method that takes string, returns nothing 2: public delegate void Log(string message); This delegate defines a delegate type named Log that can be used to store references to any method(s) that satisfies its signature (whether instance, static, lambda expression, etc.). Delegate instances then can be assigned zero (null) or more methods using the operator = which replaces the existing delegate chain, or by using the operator += which adds a method to the end of a delegate chain: 1: // creates a delegate instance named currentLogger defaulted to Console.WriteLine (static method) 2: Log currentLogger = Console.Out.WriteLine; 3:  4: // invokes the delegate, which writes to the console out 5: currentLogger("Hi Standard Out!"); 6:  7: // append a delegate to Console.Error.WriteLine to go to std error 8: currentLogger += Console.Error.WriteLine; 9:  10: // invokes the delegate chain and writes message to std out and std err 11: currentLogger("Hi Standard Out and Error!"); While delegates give us a lot of power, it can be cumbersome to re-create fairly standard delegate definitions repeatedly, for this purpose the generic delegates were introduced in various stages in .NET.  These support various method types with particular signatures. Note: a caveat with generic delegates is that while they can support multiple parameters, they do not match methods that contains ref or out parameters. If you want to a delegate to represent methods that takes ref or out parameters, you will need to create a custom delegate. We’ve got the Func… delegates Just like it’s cousin, the Action delegate family, the Func delegate family gives us a lot of power to use generic delegates to make classes and algorithms more generic.  Using them keeps us from having to define a new delegate type when need to make a class or algorithm generic. Remember that the point of the Action delegate family was to be able to perform an “action” on an item, with no return results.  Thus Action delegates can be used to represent most methods that take 0 to 16 arguments but return void.  You can assign a method The Func delegate family was introduced in .NET 3.5 with the advent of LINQ, and gives us the power to define a function that can be called on 0 to 16 arguments and returns a result.  Thus, the main difference between Action and Func, from a delegate perspective, is that Actions return nothing, but Funcs return a result. The Func family of delegates have signatures as follows: Func<TResult> – matches a method that takes no arguments, and returns value of type TResult. Func<T, TResult> – matches a method that takes an argument of type T, and returns value of type TResult. Func<T1, T2, TResult> – matches a method that takes arguments of type T1 and T2, and returns value of type TResult. Func<T1, T2, …, TResult> – and so on up to 16 arguments, and returns value of type TResult. These are handy because they quickly allow you to be able to specify that a method or class you design will perform a function to produce a result as long as the method you specify meets the signature. For example, let’s say you were designing a generic aggregator, and you wanted to allow the user to define how the values will be aggregated into the result (i.e. Sum, Min, Max, etc…).  To do this, we would ask the user of our class to pass in a method that would take the current total, the next value, and produce a new total.  A class like this could look like: 1: public sealed class Aggregator<TValue, TResult> 2: { 3: // holds method that takes previous result, combines with next value, creates new result 4: private Func<TResult, TValue, TResult> _aggregationMethod; 5:  6: // gets or sets the current result of aggregation 7: public TResult Result { get; private set; } 8:  9: // construct the aggregator given the method to use to aggregate values 10: public Aggregator(Func<TResult, TValue, TResult> aggregationMethod = null) 11: { 12: if (aggregationMethod == null) throw new ArgumentNullException("aggregationMethod"); 13:  14: _aggregationMethod = aggregationMethod; 15: } 16:  17: // method to add next value 18: public void Aggregate(TValue nextValue) 19: { 20: // performs the aggregation method function on the current result and next and sets to current result 21: Result = _aggregationMethod(Result, nextValue); 22: } 23: } Of course, LINQ already has an Aggregate extension method, but that works on a sequence of IEnumerable<T>, whereas this is designed to work more with aggregating single results over time (such as keeping track of a max response time for a service). We could then use this generic aggregator to find the sum of a series of values over time, or the max of a series of values over time (among other things): 1: // creates an aggregator that adds the next to the total to sum the values 2: var sumAggregator = new Aggregator<int, int>((total, next) => total + next); 3:  4: // creates an aggregator (using static method) that returns the max of previous result and next 5: var maxAggregator = new Aggregator<int, int>(Math.Max); So, if we were timing the response time of a web method every time it was called, we could pass that response time to both of these aggregators to get an idea of the total time spent in that web method, and the max time spent in any one call to the web method: 1: // total will be 13 and max 13 2: int responseTime = 13; 3: sumAggregator.Aggregate(responseTime); 4: maxAggregator.Aggregate(responseTime); 5:  6: // total will be 20 and max still 13 7: responseTime = 7; 8: sumAggregator.Aggregate(responseTime); 9: maxAggregator.Aggregate(responseTime); 10:  11: // total will be 40 and max now 20 12: responseTime = 20; 13: sumAggregator.Aggregate(responseTime); 14: maxAggregator.Aggregate(responseTime); The Func delegate family is useful for making generic algorithms and classes, and in particular allows the caller of the method or user of the class to specify a function to be performed in order to generate a result. What is the result of a Func delegate chain? If you remember, we said earlier that you can assign multiple methods to a delegate by using the += operator to chain them.  So how does this affect delegates such as Func that return a value, when applied to something like the code below? 1: Func<int, int, int> combo = null; 2:  3: // What if we wanted to aggregate the sum and max together? 4: combo += (total, next) => total + next; 5: combo += Math.Max; 6:  7: // what is the result? 8: var comboAggregator = new Aggregator<int, int>(combo); Well, in .NET if you chain multiple methods in a delegate, they will all get invoked, but the result of the delegate is the result of the last method invoked in the chain.  Thus, this aggregator would always result in the Math.Max() result.  The other chained method (the sum) gets executed first, but it’s result is thrown away: 1: // result is 13 2: int responseTime = 13; 3: comboAggregator.Aggregate(responseTime); 4:  5: // result is still 13 6: responseTime = 7; 7: comboAggregator.Aggregate(responseTime); 8:  9: // result is now 20 10: responseTime = 20; 11: comboAggregator.Aggregate(responseTime); So remember, you can chain multiple Func (or other delegates that return values) together, but if you do so you will only get the last executed result. Func delegates and co-variance/contra-variance in .NET 4.0 Just like the Action delegate, as of .NET 4.0, the Func delegate family is contra-variant on its arguments.  In addition, it is co-variant on its return type.  To support this, in .NET 4.0 the signatures of the Func delegates changed to: Func<out TResult> – matches a method that takes no arguments, and returns value of type TResult (or a more derived type). Func<in T, out TResult> – matches a method that takes an argument of type T (or a less derived type), and returns value of type TResult(or a more derived type). Func<in T1, in T2, out TResult> – matches a method that takes arguments of type T1 and T2 (or less derived types), and returns value of type TResult (or a more derived type). Func<in T1, in T2, …, out TResult> – and so on up to 16 arguments, and returns value of type TResult (or a more derived type). Notice the addition of the in and out keywords before each of the generic type placeholders.  As we saw last week, the in keyword is used to specify that a generic type can be contra-variant -- it can match the given type or a type that is less derived.  However, the out keyword, is used to specify that a generic type can be co-variant -- it can match the given type or a type that is more derived. On contra-variance, if you are saying you need an function that will accept a string, you can just as easily give it an function that accepts an object.  In other words, if you say “give me an function that will process dogs”, I could pass you a method that will process any animal, because all dogs are animals.  On the co-variance side, if you are saying you need a function that returns an object, you can just as easily pass it a function that returns a string because any string returned from the given method can be accepted by a delegate expecting an object result, since string is more derived.  Once again, in other words, if you say “give me a method that creates an animal”, I can pass you a method that will create a dog, because all dogs are animals. It really all makes sense, you can pass a more specific thing to a less specific parameter, and you can return a more specific thing as a less specific result.  In other words, pay attention to the direction the item travels (parameters go in, results come out).  Keeping that in mind, you can always pass more specific things in and return more specific things out. For example, in the code below, we have a method that takes a Func<object> to generate an object, but we can pass it a Func<string> because the return type of object can obviously accept a return value of string as well: 1: // since Func<object> is co-variant, this will access Func<string>, etc... 2: public static string Sequence(int count, Func<object> generator) 3: { 4: var builder = new StringBuilder(); 5:  6: for (int i=0; i<count; i++) 7: { 8: object value = generator(); 9: builder.Append(value); 10: } 11:  12: return builder.ToString(); 13: } Even though the method above takes a Func<object>, we can pass a Func<string> because the TResult type placeholder is co-variant and accepts types that are more derived as well: 1: // delegate that's typed to return string. 2: Func<string> stringGenerator = () => DateTime.Now.ToString(); 3:  4: // This will work in .NET 4.0, but not in previous versions 5: Sequence(100, stringGenerator); Previous versions of .NET implemented some forms of co-variance and contra-variance before, but .NET 4.0 goes one step further and allows you to pass or assign an Func<A, BResult> to a Func<Y, ZResult> as long as A is less derived (or same) as Y, and BResult is more derived (or same) as ZResult. Sidebar: The Func and the Predicate A method that takes one argument and returns a bool is generally thought of as a predicate.  Predicates are used to examine an item and determine whether that item satisfies a particular condition.  Predicates are typically unary, but you may also have binary and other predicates as well. Predicates are often used to filter results, such as in the LINQ Where() extension method: 1: var numbers = new[] { 1, 2, 4, 13, 8, 10, 27 }; 2:  3: // call Where() using a predicate which determines if the number is even 4: var evens = numbers.Where(num => num % 2 == 0); As of .NET 3.5, predicates are typically represented as Func<T, bool> where T is the type of the item to examine.  Previous to .NET 3.5, there was a Predicate<T> type that tended to be used (which we’ll discuss next week) and is still supported, but most developers recommend using Func<T, bool> now, as it prevents confusion with overloads that accept unary predicates and binary predicates, etc.: 1: // this seems more confusing as an overload set, because of Predicate vs Func 2: public static SomeMethod(Predicate<int> unaryPredicate) { } 3: public static SomeMethod(Func<int, int, bool> binaryPredicate) { } 4:  5: // this seems more consistent as an overload set, since just uses Func 6: public static SomeMethod(Func<int, bool> unaryPredicate) { } 7: public static SomeMethod(Func<int, int, bool> binaryPredicate) { } Also, even though Predicate<T> and Func<T, bool> match the same signatures, they are separate types!  Thus you cannot assign a Predicate<T> instance to a Func<T, bool> instance and vice versa: 1: // the same method, lambda expression, etc can be assigned to both 2: Predicate<int> isEven = i => (i % 2) == 0; 3: Func<int, bool> alsoIsEven = i => (i % 2) == 0; 4:  5: // but the delegate instances cannot be directly assigned, strongly typed! 6: // ERROR: cannot convert type... 7: isEven = alsoIsEven; 8:  9: // however, you can assign by wrapping in a new instance: 10: isEven = new Predicate<int>(alsoIsEven); 11: alsoIsEven = new Func<int, bool>(isEven); So, the general advice that seems to come from most developers is that Predicate<T> is still supported, but we should use Func<T, bool> for consistency in .NET 3.5 and above. Sidebar: Func as a Generator for Unit Testing One area of difficulty in unit testing can be unit testing code that is based on time of day.  We’d still want to unit test our code to make sure the logic is accurate, but we don’t want the results of our unit tests to be dependent on the time they are run. One way (of many) around this is to create an internal generator that will produce the “current” time of day.  This would default to returning result from DateTime.Now (or some other method), but we could inject specific times for our unit testing.  Generators are typically methods that return (generate) a value for use in a class/method. For example, say we are creating a CacheItem<T> class that represents an item in the cache, and we want to make sure the item shows as expired if the age is more than 30 seconds.  Such a class could look like: 1: // responsible for maintaining an item of type T in the cache 2: public sealed class CacheItem<T> 3: { 4: // helper method that returns the current time 5: private static Func<DateTime> _timeGenerator = () => DateTime.Now; 6:  7: // allows internal access to the time generator 8: internal static Func<DateTime> TimeGenerator 9: { 10: get { return _timeGenerator; } 11: set { _timeGenerator = value; } 12: } 13:  14: // time the item was cached 15: public DateTime CachedTime { get; private set; } 16:  17: // the item cached 18: public T Value { get; private set; } 19:  20: // item is expired if older than 30 seconds 21: public bool IsExpired 22: { 23: get { return _timeGenerator() - CachedTime > TimeSpan.FromSeconds(30.0); } 24: } 25:  26: // creates the new cached item, setting cached time to "current" time 27: public CacheItem(T value) 28: { 29: Value = value; 30: CachedTime = _timeGenerator(); 31: } 32: } Then, we can use this construct to unit test our CacheItem<T> without any time dependencies: 1: var baseTime = DateTime.Now; 2:  3: // start with current time stored above (so doesn't drift) 4: CacheItem<int>.TimeGenerator = () => baseTime; 5:  6: var target = new CacheItem<int>(13); 7:  8: // now add 15 seconds, should still be non-expired 9: CacheItem<int>.TimeGenerator = () => baseTime.AddSeconds(15); 10:  11: Assert.IsFalse(target.IsExpired); 12:  13: // now add 31 seconds, should now be expired 14: CacheItem<int>.TimeGenerator = () => baseTime.AddSeconds(31); 15:  16: Assert.IsTrue(target.IsExpired); Now we can unit test for 1 second before, 1 second after, 1 millisecond before, 1 day after, etc.  Func delegates can be a handy tool for this type of value generation to support more testable code.  Summary Generic delegates give us a lot of power to make truly generic algorithms and classes.  The Func family of delegates is a great way to be able to specify functions to calculate a result based on 0-16 arguments.  Stay tuned in the weeks that follow for other generic delegates in the .NET Framework!   Tweet Technorati Tags: .NET, C#, CSharp, Little Wonders, Generics, Func, Delegates

    Read the article

  • Ball bouncing at a certain angle and efficiency computations

    - by X Y
    I would like to make a pong game with a small twist (for now). Every time the ball bounces off one of the paddles i want it to be under a certain angle (between a min and a max). I simply can't wrap my head around how to actually do it (i have some thoughts and such but i simply cannot implement them properly - i feel i'm overcomplicating things). Here's an image with a small explanation . One other problem would be that the conditions for bouncing have to be different for every edge. For example, in the picture, on the two small horizontal edges i do not want a perfectly vertical bounce when in the middle of the edge but rather a constant angle (pi/4 maybe) in either direction depending on the collision point (before the middle of the edge, or after). All of my collisions are done with the Separating Axes Theorem (and seem to work fine). I'm looking for something efficient because i want to add a lot of things later on (maybe polygons with many edges and such). So i need to keep to a minimum the amount of checking done every frame. The collision algorithm begins testing whenever the bounding boxes of the paddle and the ball intersect. Is there something better to test for possible collisions every frame? (more efficient in the long run,with many more objects etc, not necessarily easy to code). I'm going to post the code for my game: Paddle Class public class Paddle : Microsoft.Xna.Framework.DrawableGameComponent { #region Private Members private SpriteBatch spriteBatch; private ContentManager contentManager; private bool keybEnabled; private bool isLeftPaddle; private Texture2D paddleSprite; private Vector2 paddlePosition; private float paddleSpeedY; private Vector2 paddleScale = new Vector2(1f, 1f); private const float DEFAULT_Y_SPEED = 150; private Vector2[] Normals2Edges; private Vector2[] Vertices = new Vector2[4]; private List<Vector2> lst = new List<Vector2>(); private Vector2 Edge; #endregion #region Properties public float Speed { get {return paddleSpeedY; } set { paddleSpeedY = value; } } public Vector2[] Normal2EdgesVector { get { NormalsToEdges(this.isLeftPaddle); return Normals2Edges; } } public Vector2[] VertexVector { get { return Vertices; } } public Vector2 Scale { get { return paddleScale; } set { paddleScale = value; NormalsToEdges(this.isLeftPaddle); } } public float X { get { return paddlePosition.X; } set { paddlePosition.X = value; } } public float Y { get { return paddlePosition.Y; } set { paddlePosition.Y = value; } } public float Width { get { return (Scale.X == 1f ? (float)paddleSprite.Width : paddleSprite.Width * Scale.X); } } public float Height { get { return ( Scale.Y==1f ? (float)paddleSprite.Height : paddleSprite.Height*Scale.Y ); } } public Texture2D GetSprite { get { return paddleSprite; } } public Rectangle Boundary { get { return new Rectangle((int)paddlePosition.X, (int)paddlePosition.Y, (int)this.Width, (int)this.Height); } } public bool KeyboardEnabled { get { return keybEnabled; } } #endregion private void NormalsToEdges(bool isLeftPaddle) { Normals2Edges = null; Edge = Vector2.Zero; lst.Clear(); for (int i = 0; i < Vertices.Length; i++) { Edge = Vertices[i + 1 == Vertices.Length ? 0 : i + 1] - Vertices[i]; if (Edge != Vector2.Zero) { Edge.Normalize(); //outer normal to edge !! (origin in top-left) lst.Add(new Vector2(Edge.Y, -Edge.X)); } } Normals2Edges = lst.ToArray(); } public float[] ProjectPaddle(Vector2 axis) { if (Vertices.Length == 0 || axis == Vector2.Zero) return (new float[2] { 0, 0 }); float min, max; min = Vector2.Dot(axis, Vertices[0]); max = min; for (int i = 1; i < Vertices.Length; i++) { float p = Vector2.Dot(axis, Vertices[i]); if (p < min) min = p; else if (p > max) max = p; } return (new float[2] { min, max }); } public Paddle(Game game, bool isLeftPaddle, bool enableKeyboard = true) : base(game) { contentManager = new ContentManager(game.Services); keybEnabled = enableKeyboard; this.isLeftPaddle = isLeftPaddle; } public void setPosition(Vector2 newPos) { X = newPos.X; Y = newPos.Y; } public override void Initialize() { base.Initialize(); this.Speed = DEFAULT_Y_SPEED; X = 0; Y = 0; NormalsToEdges(this.isLeftPaddle); } protected override void LoadContent() { spriteBatch = new SpriteBatch(GraphicsDevice); paddleSprite = contentManager.Load<Texture2D>(@"Content\pongBar"); } public override void Update(GameTime gameTime) { //vertices array Vertices[0] = this.paddlePosition; Vertices[1] = this.paddlePosition + new Vector2(this.Width, 0); Vertices[2] = this.paddlePosition + new Vector2(this.Width, this.Height); Vertices[3] = this.paddlePosition + new Vector2(0, this.Height); // Move paddle, but don't allow movement off the screen if (KeyboardEnabled) { float moveDistance = Speed * (float)gameTime.ElapsedGameTime.TotalSeconds; KeyboardState newKeyState = Keyboard.GetState(); if (newKeyState.IsKeyDown(Keys.Down) && Y + paddleSprite.Height + moveDistance <= Game.GraphicsDevice.Viewport.Height) { Y += moveDistance; } else if (newKeyState.IsKeyDown(Keys.Up) && Y - moveDistance >= 0) { Y -= moveDistance; } } else { if (this.Y + this.Height > this.GraphicsDevice.Viewport.Height) { this.Y = this.Game.GraphicsDevice.Viewport.Height - this.Height - 1; } } base.Update(gameTime); } public override void Draw(GameTime gameTime) { spriteBatch.Begin(SpriteSortMode.Texture,null); spriteBatch.Draw(paddleSprite, paddlePosition, null, Color.White, 0f, Vector2.Zero, Scale, SpriteEffects.None, 0); spriteBatch.End(); base.Draw(gameTime); } } Ball Class public class Ball : Microsoft.Xna.Framework.DrawableGameComponent { #region Private Members private SpriteBatch spriteBatch; private ContentManager contentManager; private const float DEFAULT_SPEED = 50; private float speedIncrement = 0; private Vector2 ballScale = new Vector2(1f, 1f); private const float INCREASE_SPEED = 50; private Texture2D ballSprite; //initial texture private Vector2 ballPosition; //position private Vector2 centerOfBall; //center coords private Vector2 ballSpeed = new Vector2(DEFAULT_SPEED, DEFAULT_SPEED); //speed #endregion #region Properties public float DEFAULTSPEED { get { return DEFAULT_SPEED; } } public Vector2 ballCenter { get { return centerOfBall; } } public Vector2 Scale { get { return ballScale; } set { ballScale = value; } } public float SpeedX { get { return ballSpeed.X; } set { ballSpeed.X = value; } } public float SpeedY { get { return ballSpeed.Y; } set { ballSpeed.Y = value; } } public float X { get { return ballPosition.X; } set { ballPosition.X = value; } } public float Y { get { return ballPosition.Y; } set { ballPosition.Y = value; } } public Texture2D GetSprite { get { return ballSprite; } } public float Width { get { return (Scale.X == 1f ? (float)ballSprite.Width : ballSprite.Width * Scale.X); } } public float Height { get { return (Scale.Y == 1f ? (float)ballSprite.Height : ballSprite.Height * Scale.Y); } } public float SpeedIncreaseIncrement { get { return speedIncrement; } set { speedIncrement = value; } } public Rectangle Boundary { get { return new Rectangle((int)ballPosition.X, (int)ballPosition.Y, (int)this.Width, (int)this.Height); } } #endregion public Ball(Game game) : base(game) { contentManager = new ContentManager(game.Services); } public void Reset() { ballSpeed.X = DEFAULT_SPEED; ballSpeed.Y = DEFAULT_SPEED; ballPosition.X = Game.GraphicsDevice.Viewport.Width / 2 - ballSprite.Width / 2; ballPosition.Y = Game.GraphicsDevice.Viewport.Height / 2 - ballSprite.Height / 2; } public void SpeedUp() { if (ballSpeed.Y < 0) ballSpeed.Y -= (INCREASE_SPEED + speedIncrement); else ballSpeed.Y += (INCREASE_SPEED + speedIncrement); if (ballSpeed.X < 0) ballSpeed.X -= (INCREASE_SPEED + speedIncrement); else ballSpeed.X += (INCREASE_SPEED + speedIncrement); } public float[] ProjectBall(Vector2 axis) { if (axis == Vector2.Zero) return (new float[2] { 0, 0 }); float min, max; min = Vector2.Dot(axis, this.ballCenter) - this.Width/2; //center - radius max = min + this.Width; //center + radius return (new float[2] { min, max }); } public void ChangeHorzDirection() { ballSpeed.X *= -1; } public void ChangeVertDirection() { ballSpeed.Y *= -1; } public override void Initialize() { base.Initialize(); ballPosition.X = Game.GraphicsDevice.Viewport.Width / 2 - ballSprite.Width / 2; ballPosition.Y = Game.GraphicsDevice.Viewport.Height / 2 - ballSprite.Height / 2; } protected override void LoadContent() { spriteBatch = new SpriteBatch(GraphicsDevice); ballSprite = contentManager.Load<Texture2D>(@"Content\ball"); } public override void Update(GameTime gameTime) { if (this.Y < 1 || this.Y > GraphicsDevice.Viewport.Height - this.Height - 1) this.ChangeVertDirection(); centerOfBall = new Vector2(ballPosition.X + this.Width / 2, ballPosition.Y + this.Height / 2); base.Update(gameTime); } public override void Draw(GameTime gameTime) { spriteBatch.Begin(); spriteBatch.Draw(ballSprite, ballPosition, null, Color.White, 0f, Vector2.Zero, Scale, SpriteEffects.None, 0); spriteBatch.End(); base.Draw(gameTime); } } Main game class public class gameStart : Microsoft.Xna.Framework.Game { GraphicsDeviceManager graphics; SpriteBatch spriteBatch; public gameStart() { graphics = new GraphicsDeviceManager(this); Content.RootDirectory = "Content"; this.Window.Title = "Pong game"; } protected override void Initialize() { ball = new Ball(this); paddleLeft = new Paddle(this,true,false); paddleRight = new Paddle(this,false,true); Components.Add(ball); Components.Add(paddleLeft); Components.Add(paddleRight); this.Window.AllowUserResizing = false; this.IsMouseVisible = true; this.IsFixedTimeStep = false; this.isColliding = false; base.Initialize(); } #region MyPrivateStuff private Ball ball; private Paddle paddleLeft, paddleRight; private int[] bit = { -1, 1 }; private Random rnd = new Random(); private int updates = 0; enum nrPaddle { None, Left, Right }; private nrPaddle PongBar = nrPaddle.None; private ArrayList Axes = new ArrayList(); private Vector2 MTV; //minimum translation vector private bool isColliding; private float overlap; //smallest distance after projections private Vector2 overlapAxis; //axis of overlap #endregion protected override void LoadContent() { spriteBatch = new SpriteBatch(GraphicsDevice); paddleLeft.setPosition(new Vector2(0, this.GraphicsDevice.Viewport.Height / 2 - paddleLeft.Height / 2)); paddleRight.setPosition(new Vector2(this.GraphicsDevice.Viewport.Width - paddleRight.Width, this.GraphicsDevice.Viewport.Height / 2 - paddleRight.Height / 2)); paddleLeft.Scale = new Vector2(1f, 2f); //scale left paddle } private bool ShapesIntersect(Paddle paddle, Ball ball) { overlap = 1000000f; //large value overlapAxis = Vector2.Zero; MTV = Vector2.Zero; foreach (Vector2 ax in Axes) { float[] pad = paddle.ProjectPaddle(ax); //pad0 = min, pad1 = max float[] circle = ball.ProjectBall(ax); //circle0 = min, circle1 = max if (pad[1] <= circle[0] || circle[1] <= pad[0]) { return false; } if (pad[1] - circle[0] < circle[1] - pad[0]) { if (Math.Abs(overlap) > Math.Abs(-pad[1] + circle[0])) { overlap = -pad[1] + circle[0]; overlapAxis = ax; } } else { if (Math.Abs(overlap) > Math.Abs(circle[1] - pad[0])) { overlap = circle[1] - pad[0]; overlapAxis = ax; } } } if (overlapAxis != Vector2.Zero) { MTV = overlapAxis * overlap; } return true; } protected override void Update(GameTime gameTime) { updates += 1; float ftime = 5 * (float)gameTime.ElapsedGameTime.TotalSeconds; if (updates == 1) { isColliding = false; int Xrnd = bit[Convert.ToInt32(rnd.Next(0, 2))]; int Yrnd = bit[Convert.ToInt32(rnd.Next(0, 2))]; ball.SpeedX = Xrnd * ball.SpeedX; ball.SpeedY = Yrnd * ball.SpeedY; ball.X += ftime * ball.SpeedX; ball.Y += ftime * ball.SpeedY; } else { updates = 100; ball.X += ftime * ball.SpeedX; ball.Y += ftime * ball.SpeedY; } //autorun :) paddleLeft.Y = ball.Y; //collision detection PongBar = nrPaddle.None; if (ball.Boundary.Intersects(paddleLeft.Boundary)) { PongBar = nrPaddle.Left; if (!isColliding) { Axes.Clear(); Axes.AddRange(paddleLeft.Normal2EdgesVector); //axis from nearest vertex to ball's center Axes.Add(FORMULAS.NormAxisFromCircle2ClosestVertex(paddleLeft.VertexVector, ball.ballCenter)); } } else if (ball.Boundary.Intersects(paddleRight.Boundary)) { PongBar = nrPaddle.Right; if (!isColliding) { Axes.Clear(); Axes.AddRange(paddleRight.Normal2EdgesVector); //axis from nearest vertex to ball's center Axes.Add(FORMULAS.NormAxisFromCircle2ClosestVertex(paddleRight.VertexVector, ball.ballCenter)); } } if (PongBar != nrPaddle.None && !isColliding) switch (PongBar) { case nrPaddle.Left: if (ShapesIntersect(paddleLeft, ball)) { isColliding = true; if (MTV != Vector2.Zero) ball.X += MTV.X; ball.Y += MTV.Y; ball.ChangeHorzDirection(); } break; case nrPaddle.Right: if (ShapesIntersect(paddleRight, ball)) { isColliding = true; if (MTV != Vector2.Zero) ball.X += MTV.X; ball.Y += MTV.Y; ball.ChangeHorzDirection(); } break; default: break; } if (!ShapesIntersect(paddleRight, ball) && !ShapesIntersect(paddleLeft, ball)) isColliding = false; ball.X += ftime * ball.SpeedX; ball.Y += ftime * ball.SpeedY; //check ball movement if (ball.X > paddleRight.X + paddleRight.Width + 2) { //IncreaseScore(Left); ball.Reset(); updates = 0; return; } else if (ball.X < paddleLeft.X - 2) { //IncreaseScore(Right); ball.Reset(); updates = 0; return; } base.Update(gameTime); } protected override void Draw(GameTime gameTime) { GraphicsDevice.Clear(Color.Aquamarine); spriteBatch.Begin(SpriteSortMode.BackToFront, BlendState.AlphaBlend); spriteBatch.End(); base.Draw(gameTime); } } And one method i've used: public static Vector2 NormAxisFromCircle2ClosestVertex(Vector2[] vertices, Vector2 circle) { Vector2 temp = Vector2.Zero; if (vertices.Length > 0) { float dist = (circle.X - vertices[0].X) * (circle.X - vertices[0].X) + (circle.Y - vertices[0].Y) * (circle.Y - vertices[0].Y); for (int i = 1; i < vertices.Length;i++) { if (dist > (circle.X - vertices[i].X) * (circle.X - vertices[i].X) + (circle.Y - vertices[i].Y) * (circle.Y - vertices[i].Y)) { temp = vertices[i]; //memorize the closest vertex dist = (circle.X - vertices[i].X) * (circle.X - vertices[i].X) + (circle.Y - vertices[i].Y) * (circle.Y - vertices[i].Y); } } temp = circle - temp; temp.Normalize(); } return temp; } Thanks in advance for any tips on the 4 issues. EDIT1: Something isn't working properly. The collision axis doesn't come out right and the interpolation also seems to have no effect. I've changed the code a bit: private bool ShapesIntersect(Paddle paddle, Ball ball) { overlap = 1000000f; //large value overlapAxis = Vector2.Zero; MTV = Vector2.Zero; foreach (Vector2 ax in Axes) { float[] pad = paddle.ProjectPaddle(ax); //pad0 = min, pad1 = max float[] circle = ball.ProjectBall(ax); //circle0 = min, circle1 = max if (pad[1] < circle[0] || circle[1] < pad[0]) { return false; } if (Math.Abs(pad[1] - circle[0]) < Math.Abs(circle[1] - pad[0])) { if (Math.Abs(overlap) > Math.Abs(-pad[1] + circle[0])) { overlap = -pad[1] + circle[0]; overlapAxis = ax * (-1); } //to get the proper axis } else { if (Math.Abs(overlap) > Math.Abs(circle[1] - pad[0])) { overlap = circle[1] - pad[0]; overlapAxis = ax; } } } if (overlapAxis != Vector2.Zero) { MTV = overlapAxis * Math.Abs(overlap); } return true; } And part of the Update method: if (ShapesIntersect(paddleRight, ball)) { isColliding = true; if (MTV != Vector2.Zero) { ball.X += MTV.X; ball.Y += MTV.Y; } //test if (overlapAxis.X == 0) //collision with horizontal edge { } else if (overlapAxis.Y == 0) //collision with vertical edge { float factor = Math.Abs(ball.ballCenter.Y - paddleRight.Y) / paddleRight.Height; if (factor > 1) factor = 1f; if (overlapAxis.X < 0) //left edge? ball.Speed = ball.DEFAULTSPEED * Vector2.Normalize(Vector2.Reflect(ball.Speed, (Vector2.Lerp(new Vector2(-1, -3), new Vector2(-1, 3), factor)))); else //right edge? ball.Speed = ball.DEFAULTSPEED * Vector2.Normalize(Vector2.Reflect(ball.Speed, (Vector2.Lerp(new Vector2(1, -3), new Vector2(1, 3), factor)))); } else //vertex collision??? { ball.Speed = -ball.Speed; } } What seems to happen is that "overlapAxis" doesn't always return the right one. So instead of (-1,0) i get the (1,0) (this happened even before i multiplied with -1 there). Sometimes there isn't even a collision registered even though the ball passes through the paddle... The interpolation also seems to have no effect as the angles barely change (or the overlapAxis is almost never (-1,0) or (1,0) but something like (0.9783473, 0.02743843)... ). What am i missing here? :(

    Read the article

  • Informed TDD &ndash; Kata &ldquo;To Roman Numerals&rdquo;

    - by Ralf Westphal
    Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/05/28/informed-tdd-ndash-kata-ldquoto-roman-numeralsrdquo.aspxIn a comment on my article on what I call Informed TDD (ITDD) reader gustav asked how this approach would apply to the kata “To Roman Numerals”. And whether ITDD wasn´t a violation of TDD´s principle of leaving out “advanced topics like mocks”. I like to respond with this article to his questions. There´s more to say than fits into a commentary. Mocks and TDD I don´t see in how far TDD is avoiding or opposed to mocks. TDD and mocks are orthogonal. TDD is about pocess, mocks are about structure and costs. Maybe by moving forward in tiny red+green+refactor steps less need arises for mocks. But then… if the functionality you need to implement requires “expensive” resource access you can´t avoid using mocks. Because you don´t want to constantly run all your tests against the real resource. True, in ITDD mocks seem to be in almost inflationary use. That´s not what you usually see in TDD demonstrations. However, there´s a reason for that as I tried to explain. I don´t use mocks as proxies for “expensive” resource. Rather they are stand-ins for functionality not yet implemented. They allow me to get a test green on a high level of abstraction. That way I can move forward in a top-down fashion. But if you think of mocks as “advanced” or if you don´t want to use a tool like JustMock, then you don´t need to use mocks. You just need to stand the sight of red tests for a little longer ;-) Let me show you what I mean by that by doing a kata. ITDD for “To Roman Numerals” gustav asked for the kata “To Roman Numerals”. I won´t explain the requirements again. You can find descriptions and TDD demonstrations all over the internet, like this one from Corey Haines. Now here is, how I would do this kata differently. 1. Analyse A demonstration of TDD should never skip the analysis phase. It should be made explicit. The requirements should be formalized and acceptance test cases should be compiled. “Formalization” in this case to me means describing the API of the required functionality. “[D]esign a program to work with Roman numerals” like written in this “requirement document” is not enough to start software development. Coding should only begin, if the interface between the “system under development” and its context is clear. If this interface is not readily recognizable from the requirements, it has to be developed first. Exploration of interface alternatives might be in order. It might be necessary to show several interface mock-ups to the customer – even if that´s you fellow developer. Designing the interface is a task of it´s own. It should not be mixed with implementing the required functionality behind the interface. Unfortunately, though, this happens quite often in TDD demonstrations. TDD is used to explore the API and implement it at the same time. To me that´s a violation of the Single Responsibility Principle (SRP) which not only should hold for software functional units but also for tasks or activities. In the case of this kata the API fortunately is obvious. Just one function is needed: string ToRoman(int arabic). And it lives in a class ArabicRomanConversions. Now what about acceptance test cases? There are hardly any stated in the kata descriptions. Roman numerals are explained, but no specific test cases from the point of view of a customer. So I just “invent” some acceptance test cases by picking roman numerals from a wikipedia article. They are supposed to be just “typical examples” without special meaning. Given the acceptance test cases I then try to develop an understanding of the problem domain. I´ll spare you that. The domain is trivial and is explain in almost all kata descriptions. How roman numerals are built is not difficult to understand. What´s more difficult, though, might be to find an efficient solution to convert into them automatically. 2. Solve The usual TDD demonstration skips a solution finding phase. Like the interface exploration it´s mixed in with the implementation. But I don´t think this is how it should be done. I even think this is not how it really works for the people demonstrating TDD. They´re simplifying their true software development process because they want to show a streamlined TDD process. I doubt this is helping anybody. Before you code you better have a plan what to code. This does not mean you have to do “Big Design Up-Front”. It just means: Have a clear picture of the logical solution in your head before you start to build a physical solution (code). Evidently such a solution can only be as good as your understanding of the problem. If that´s limited your solution will be limited, too. Fortunately, in the case of this kata your understanding does not need to be limited. Thus the logical solution does not need to be limited or preliminary or tentative. That does not mean you need to know every line of code in advance. It just means you know the rough structure of your implementation beforehand. Because it should mirror the process described by the logical or conceptual solution. Here´s my solution approach: The arabic “encoding” of numbers represents them as an ordered set of powers of 10. Each digit is a factor to multiply a power of ten with. The “encoding” 123 is the short form for a set like this: {1*10^2, 2*10^1, 3*10^0}. And the number is the sum of the set members. The roman “encoding” is different. There is no base (like 10 for arabic numbers), there are just digits of different value, and they have to be written in descending order. The “encoding” XVI is short for [10, 5, 1]. And the number is still the sum of the members of this list. The roman “encoding” thus is simpler than the arabic. Each “digit” can be taken at face value. No multiplication with a base required. But what about IV which looks like a contradiction to the above rule? It is not – if you accept roman “digits” not to be limited to be single characters only. Usually I, V, X, L, C, D, M are viewed as “digits”, and IV, IX etc. are viewed as nuisances preventing a simple solution. All looks different, though, once IV, IX etc. are taken as “digits”. Then MCMLIV is just a sum: M+CM+L+IV which is 1000+900+50+4. Whereas before it would have been understood as M-C+M+L-I+V – which is more difficult because here some “digits” get subtracted. Here´s the list of roman “digits” with their values: {1, I}, {4, IV}, {5, V}, {9, IX}, {10, X}, {40, XL}, {50, L}, {90, XC}, {100, C}, {400, CD}, {500, D}, {900, CM}, {1000, M} Since I take IV, IX etc. as “digits” translating an arabic number becomes trivial. I just need to find the values of the roman “digits” making up the number, e.g. 1954 is made up of 1000, 900, 50, and 4. I call those “digits” factors. If I move from the highest factor (M=1000) to the lowest (I=1) then translation is a two phase process: Find all the factors Translate the factors found Compile the roman representation Translation is just a look-up. Finding, though, needs some calculation: Find the highest remaining factor fitting in the value Remember and subtract it from the value Repeat with remaining value and remaining factors Please note: This is just an algorithm. It´s not code, even though it might be close. Being so close to code in my solution approach is due to the triviality of the problem. In more realistic examples the conceptual solution would be on a higher level of abstraction. With this solution in hand I finally can do what TDD advocates: find and prioritize test cases. As I can see from the small process description above, there are two aspects to test: Test the translation Test the compilation Test finding the factors Testing the translation primarily means to check if the map of factors and digits is comprehensive. That´s simple, even though it might be tedious. Testing the compilation is trivial. Testing factor finding, though, is a tad more complicated. I can think of several steps: First check, if an arabic number equal to a factor is processed correctly (e.g. 1000=M). Then check if an arabic number consisting of two consecutive factors (e.g. 1900=[M,CM]) is processed correctly. Then check, if a number consisting of the same factor twice is processed correctly (e.g. 2000=[M,M]). Finally check, if an arabic number consisting of non-consecutive factors (e.g. 1400=[M,CD]) is processed correctly. I feel I can start an implementation now. If something becomes more complicated than expected I can slow down and repeat this process. 3. Implement First I write a test for the acceptance test cases. It´s red because there´s no implementation even of the API. That´s in conformance with “TDD lore”, I´d say: Next I implement the API: The acceptance test now is formally correct, but still red of course. This will not change even now that I zoom in. Because my goal is not to most quickly satisfy these tests, but to implement my solution in a stepwise manner. That I do by “faking” it: I just “assume” three functions to represent the transformation process of my solution: My hypothesis is that those three functions in conjunction produce correct results on the API-level. I just have to implement them correctly. That´s what I´m trying now – one by one. I start with a simple “detail function”: Translate(). And I start with all the test cases in the obvious equivalence partition: As you can see I dare to test a private method. Yes. That´s a white box test. But as you´ll see it won´t make my tests brittle. It serves a purpose right here and now: it lets me focus on getting one aspect of my solution right. Here´s the implementation to satisfy the test: It´s as simple as possible. Right how TDD wants me to do it: KISS. Now for the second equivalence partition: translating multiple factors. (It´a pattern: if you need to do something repeatedly separate the tests for doing it once and doing it multiple times.) In this partition I just need a single test case, I guess. Stepping up from a single translation to multiple translations is no rocket science: Usually I would have implemented the final code right away. Splitting it in two steps is just for “educational purposes” here. How small your implementation steps are is a matter of your programming competency. Some “see” the final code right away before their mental eye – others need to work their way towards it. Having two tests I find more important. Now for the next low hanging fruit: compilation. It´s even simpler than translation. A single test is enough, I guess. And normally I would not even have bothered to write that one, because the implementation is so simple. I don´t need to test .NET framework functionality. But again: if it serves the educational purpose… Finally the most complicated part of the solution: finding the factors. There are several equivalence partitions. But still I decide to write just a single test, since the structure of the test data is the same for all partitions: Again, I´m faking the implementation first: I focus on just the first test case. No looping yet. Faking lets me stay on a high level of abstraction. I can write down the implementation of the solution without bothering myself with details of how to actually accomplish the feat. That´s left for a drill down with a test of the fake function: There are two main equivalence partitions, I guess: either the first factor is appropriate or some next. The implementation seems easy. Both test cases are green. (Of course this only works on the premise that there´s always a matching factor. Which is the case since the smallest factor is 1.) And the first of the equivalence partitions on the higher level also is satisfied: Great, I can move on. Now for more than a single factor: Interestingly not just one test becomes green now, but all of them. Great! You might say, then I must have done not the simplest thing possible. And I would reply: I don´t care. I did the most obvious thing. But I also find this loop very simple. Even simpler than a recursion of which I had thought briefly during the problem solving phase. And by the way: Also the acceptance tests went green: Mission accomplished. At least functionality wise. Now I´ve to tidy up things a bit. TDD calls for refactoring. Not uch refactoring is needed, because I wrote the code in top-down fashion. I faked it until I made it. I endured red tests on higher levels while lower levels weren´t perfected yet. But this way I saved myself from refactoring tediousness. At the end, though, some refactoring is required. But maybe in a different way than you would expect. That´s why I rather call it “cleanup”. First I remove duplication. There are two places where factors are defined: in Translate() and in Find_factors(). So I factor the map out into a class constant. Which leads to a small conversion in Find_factors(): And now for the big cleanup: I remove all tests of private methods. They are scaffolding tests to me. They only have temporary value. They are brittle. Only acceptance tests need to remain. However, I carry over the single “digit” tests from Translate() to the acceptance test. I find them valuable to keep, since the other acceptance tests only exercise a subset of all roman “digits”. This then is my final test class: And this is the final production code: Test coverage as reported by NCrunch is 100%: Reflexion Is this the smallest possible code base for this kata? Sure not. You´ll find more concise solutions on the internet. But LOC are of relatively little concern – as long as I can understand the code quickly. So called “elegant” code, however, often is not easy to understand. The same goes for KISS code – especially if left unrefactored, as it is often the case. That´s why I progressed from requirements to final code the way I did. I first understood and solved the problem on a conceptual level. Then I implemented it top down according to my design. I also could have implemented it bottom-up, since I knew some bottom of the solution. That´s the leaves of the functional decomposition tree. Where things became fuzzy, since the design did not cover any more details as with Find_factors(), I repeated the process in the small, so to speak: fake some top level, endure red high level tests, while first solving a simpler problem. Using scaffolding tests (to be thrown away at the end) brought two advantages: Encapsulation of the implementation details was not compromised. Naturally private methods could stay private. I did not need to make them internal or public just to be able to test them. I was able to write focused tests for small aspects of the solution. No need to test everything through the solution root, the API. The bottom line thus for me is: Informed TDD produces cleaner code in a systematic way. It conforms to core principles of programming: Single Responsibility Principle and/or Separation of Concerns. Distinct roles in development – being a researcher, being an engineer, being a craftsman – are represented as different phases. First find what, what there is. Then devise a solution. Then code the solution, manifest the solution in code. Writing tests first is a good practice. But it should not be taken dogmatic. And above all it should not be overloaded with purposes. And finally: moving from top to bottom through a design produces refactored code right away. Clean code thus almost is inevitable – and not left to a refactoring step at the end which is skipped often for different reasons.   PS: Yes, I have done this kata several times. But that has only an impact on the time needed for phases 1 and 2. I won´t skip them because of that. And there are no shortcuts during implementation because of that.

    Read the article

  • Async ignored on AJAX requests on Nginx server

    - by eComEvo
    Despite sending an async request to the server over AJAX, the server will not respond until the previous unrelated request has finished. The following code is only broken in this way on Nginx, but runs perfectly on Apache. This call will start a background process and it waits for it to complete so it can display the final result. $.ajax({ type: 'GET', async: true, url: $(this).data('route'), data: $('input[name=data]').val(), dataType: 'json', success: function (data) { /* do stuff */} error: function (data) { /* handle errors */} }); The below is called after the above, which on Apache requires 100ms to execute and repeats itself, showing progress for data being written in the background: checkStatusInterval = setInterval(function () { $.ajax({ type: 'GET', async: false, cache: false, url: '/process-status?process=' + currentElement.attr('id'), dataType: 'json', success: function (data) { /* update progress bar and status message */ } }); }, 1000); Unfortunately, when this script is run from nginx, the above progress request never even finishes a single request until the first AJAX request that sent the data is done. If I change the async to TRUE in the above, it executes one every interval, but none of them complete until that very first AJAX request finishes. Here is the main nginx conf file: #user nobody; worker_processes 1; #error_log logs/error.log; #error_log logs/error.log notice; #error_log logs/error.log info; #pid logs/nginx.pid; events { worker_connections 1024; } http { include mime.types; default_type application/octet-stream; server_names_hash_bucket_size 64; # configure temporary paths # nginx is started with param -p, setting nginx path to serverpack installdir fastcgi_temp_path temp/fastcgi; uwsgi_temp_path temp/uwsgi; scgi_temp_path temp/scgi; client_body_temp_path temp/client-body 1 2; proxy_temp_path temp/proxy; log_format main '$remote_addr - $remote_user [$time_local] "$request" ' '$status $body_bytes_sent "$http_referer" ' '"$http_user_agent" "$http_x_forwarded_for"'; #access_log logs/access.log main; # Sendfile copies data between one FD and other from within the kernel. # More efficient than read() + write(), since the requires transferring data to and from the user space. sendfile on; # Tcp_nopush causes nginx to attempt to send its HTTP response head in one packet, # instead of using partial frames. This is useful for prepending headers before calling sendfile, # or for throughput optimization. tcp_nopush on; # don't buffer data-sends (disable Nagle algorithm). Good for sending frequent small bursts of data in real time. tcp_nodelay on; types_hash_max_size 2048; # Timeout for keep-alive connections. Server will close connections after this time. keepalive_timeout 90; # Number of requests a client can make over the keep-alive connection. This is set high for testing. keepalive_requests 100000; # allow the server to close the connection after a client stops responding. Frees up socket-associated memory. reset_timedout_connection on; # send the client a "request timed out" if the body is not loaded by this time. Default 60. client_header_timeout 20; client_body_timeout 60; # If the client stops reading data, free up the stale client connection after this much time. Default 60. send_timeout 60; # Size Limits client_body_buffer_size 64k; client_header_buffer_size 4k; client_max_body_size 8M; # FastCGI fastcgi_connect_timeout 60; fastcgi_send_timeout 120; fastcgi_read_timeout 300; # default: 60 secs; when step debugging with XDEBUG, you need to increase this value fastcgi_buffer_size 64k; fastcgi_buffers 4 64k; fastcgi_busy_buffers_size 128k; fastcgi_temp_file_write_size 128k; # Caches information about open FDs, freqently accessed files. open_file_cache max=200000 inactive=20s; open_file_cache_valid 30s; open_file_cache_min_uses 2; open_file_cache_errors on; # Turn on gzip output compression to save bandwidth. # http://wiki.nginx.org/HttpGzipModule gzip on; gzip_disable "MSIE [1-6]\.(?!.*SV1)"; gzip_http_version 1.1; gzip_vary on; gzip_proxied any; #gzip_proxied expired no-cache no-store private auth; gzip_comp_level 6; gzip_buffers 16 8k; gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript application/javascript; # show all files and folders autoindex on; server { # access from localhost only listen 127.0.0.1:80; server_name localhost; root www; # the following default "catch-all" configuration, allows access to the server from outside. # please ensure your firewall allows access to tcp/port 80. check your "skype" config. # listen 80; # server_name _; log_not_found off; charset utf-8; access_log logs/access.log main; # handle files in the root path /www location / { index index.php index.html index.htm; } #error_page 404 /404.html; # redirect server error pages to the static page /50x.html # error_page 500 502 503 504 /50x.html; location = /50x.html { root www; } # pass the PHP scripts to FastCGI server listening on 127.0.0.1:9100 # location ~ \.php$ { try_files $uri =404; fastcgi_pass 127.0.0.1:9100; fastcgi_index index.php; fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name; include fastcgi_params; } # add expire headers location ~* ^.+.(gif|ico|jpg|jpeg|png|flv|swf|pdf|mp3|mp4|xml|txt|js|css)$ { expires 30d; } # deny access to .htaccess files (if Apache's document root concurs with nginx's one) # deny access to git & svn repositories location ~ /(\.ht|\.git|\.svn) { deny all; } } # include config files of "enabled" domains include domains-enabled/*.conf; } Here is the enabled domain conf file: access_log off; access_log C:/server/www/test.dev/logs/access.log; error_log C:/server/www/test.dev/logs/error.log; # HTTP Server server { listen 127.0.0.1:80; server_name test.dev; root C:/server/www/test.dev/public; index index.php; rewrite_log on; default_type application/octet-stream; #include /etc/nginx/mime.types; # Include common configurations. include domains-common/location.conf; } # HTTPS server server { listen 443 ssl; server_name test.dev; root C:/server/www/test.dev/public; index index.php; rewrite_log on; default_type application/octet-stream; #include /etc/nginx/mime.types; # Include common configurations. include domains-common/location.conf; include domains-common/ssl.conf; } Contents of ssl.conf: # OpenSSL for HTTPS connections. ssl on; ssl_certificate C:/server/bin/openssl/certs/cert.pem; ssl_certificate_key C:/server/bin/openssl/certs/cert.key; ssl_session_timeout 5m; ssl_protocols SSLv3 TLSv1 TLSv1.1 TLSv1.2; ssl_ciphers HIGH:!aNULL:!MD5; ssl_prefer_server_ciphers on; # Pass the PHP scripts to FastCGI server listening on 127.0.0.1:9100 location ~ \.php$ { try_files $uri =404; fastcgi_param HTTPS on; fastcgi_pass 127.0.0.1:9100; fastcgi_index index.php; fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name; include fastcgi_params; } Contents of location.conf: # Remove trailing slash to please Laravel routing system. if (!-d $request_filename) { rewrite ^/(.+)/$ /$1 permanent; } location / { try_files $uri $uri/ /index.php?$query_string; } # We don't need .ht files with nginx. location ~ /(\.ht|\.git|\.svn) { deny all; } # Added cache headers for images. location ~* \.(png|jpg|jpeg|gif)$ { expires 30d; log_not_found off; } # Only 3 hours on CSS/JS to allow me to roll out fixes during early weeks. location ~* \.(js|css)$ { expires 3h; log_not_found off; } # Add expire headers. location ~* ^.+.(gif|ico|jpg|jpeg|png|flv|swf|pdf|mp3|mp4|xml|txt)$ { expires 30d; } # Pass the PHP scripts to FastCGI server listening on 127.0.0.1:9100 location ~ \.php$ { try_files $uri /index.php =404; fastcgi_index index.php; fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name; include fastcgi_params; fastcgi_pass 127.0.0.1:9100; } Any ideas where this is going wrong?

    Read the article

  • What approach to take for SIMD optimizations

    - by goldenmean
    Hi, I am trying to optimize below code for SIMD operations (8way/4way/2way SIMD whiechever possible and if it gives gains in performance) I am tryin to analyze it first on paper to understand the algorithm used. How can i optimize it for SIMD:- void idct(uint8_t *dst, int stride, int16_t *input, int type) { int16_t *ip = input; uint8_t *cm = ff_cropTbl + MAX_NEG_CROP; int A, B, C, D, Ad, Bd, Cd, Dd, E, F, G, H; int Ed, Gd, Add, Bdd, Fd, Hd; int i; /* Inverse DCT on the rows now */ for (i = 0; i < 8; i++) { /* Check for non-zero values */ if ( ip[0] | ip[1] | ip[2] | ip[3] | ip[4] | ip[5] | ip[6] | ip[7] ) { A = M(xC1S7, ip[1]) + M(xC7S1, ip[7]); B = M(xC7S1, ip[1]) - M(xC1S7, ip[7]); C = M(xC3S5, ip[3]) + M(xC5S3, ip[5]); D = M(xC3S5, ip[5]) - M(xC5S3, ip[3]); Ad = M(xC4S4, (A - C)); Bd = M(xC4S4, (B - D)); Cd = A + C; Dd = B + D; E = M(xC4S4, (ip[0] + ip[4])); F = M(xC4S4, (ip[0] - ip[4])); G = M(xC2S6, ip[2]) + M(xC6S2, ip[6]); H = M(xC6S2, ip[2]) - M(xC2S6, ip[6]); Ed = E - G; Gd = E + G; Add = F + Ad; Bdd = Bd - H; Fd = F - Ad; Hd = Bd + H; /* Final sequence of operations over-write original inputs. */ ip[0] = (int16_t)(Gd + Cd) ; ip[7] = (int16_t)(Gd - Cd ); ip[1] = (int16_t)(Add + Hd); ip[2] = (int16_t)(Add - Hd); ip[3] = (int16_t)(Ed + Dd) ; ip[4] = (int16_t)(Ed - Dd ); ip[5] = (int16_t)(Fd + Bdd); ip[6] = (int16_t)(Fd - Bdd); } ip += 8; /* next row */ } ip = input; for ( i = 0; i < 8; i++) { /* Check for non-zero values (bitwise or faster than ||) */ if ( ip[1 * 8] | ip[2 * 8] | ip[3 * 8] | ip[4 * 8] | ip[5 * 8] | ip[6 * 8] | ip[7 * 8] ) { A = M(xC1S7, ip[1*8]) + M(xC7S1, ip[7*8]); B = M(xC7S1, ip[1*8]) - M(xC1S7, ip[7*8]); C = M(xC3S5, ip[3*8]) + M(xC5S3, ip[5*8]); D = M(xC3S5, ip[5*8]) - M(xC5S3, ip[3*8]); Ad = M(xC4S4, (A - C)); Bd = M(xC4S4, (B - D)); Cd = A + C; Dd = B + D; E = M(xC4S4, (ip[0*8] + ip[4*8])) + 8; F = M(xC4S4, (ip[0*8] - ip[4*8])) + 8; if(type==1){ //HACK E += 16*128; F += 16*128; } G = M(xC2S6, ip[2*8]) + M(xC6S2, ip[6*8]); H = M(xC6S2, ip[2*8]) - M(xC2S6, ip[6*8]); Ed = E - G; Gd = E + G; Add = F + Ad; Bdd = Bd - H; Fd = F - Ad; Hd = Bd + H; /* Final sequence of operations over-write original inputs. */ if(type==0){ ip[0*8] = (int16_t)((Gd + Cd ) >> 4); ip[7*8] = (int16_t)((Gd - Cd ) >> 4); ip[1*8] = (int16_t)((Add + Hd ) >> 4); ip[2*8] = (int16_t)((Add - Hd ) >> 4); ip[3*8] = (int16_t)((Ed + Dd ) >> 4); ip[4*8] = (int16_t)((Ed - Dd ) >> 4); ip[5*8] = (int16_t)((Fd + Bdd ) >> 4); ip[6*8] = (int16_t)((Fd - Bdd ) >> 4); }else if(type==1){ dst[0*stride] = cm[(Gd + Cd ) >> 4]; dst[7*stride] = cm[(Gd - Cd ) >> 4]; dst[1*stride] = cm[(Add + Hd ) >> 4]; dst[2*stride] = cm[(Add - Hd ) >> 4]; dst[3*stride] = cm[(Ed + Dd ) >> 4]; dst[4*stride] = cm[(Ed - Dd ) >> 4]; dst[5*stride] = cm[(Fd + Bdd ) >> 4]; dst[6*stride] = cm[(Fd - Bdd ) >> 4]; }else{ dst[0*stride] = cm[dst[0*stride] + ((Gd + Cd ) >> 4)]; dst[7*stride] = cm[dst[7*stride] + ((Gd - Cd ) >> 4)]; dst[1*stride] = cm[dst[1*stride] + ((Add + Hd ) >> 4)]; dst[2*stride] = cm[dst[2*stride] + ((Add - Hd ) >> 4)]; dst[3*stride] = cm[dst[3*stride] + ((Ed + Dd ) >> 4)]; dst[4*stride] = cm[dst[4*stride] + ((Ed - Dd ) >> 4)]; dst[5*stride] = cm[dst[5*stride] + ((Fd + Bdd ) >> 4)]; dst[6*stride] = cm[dst[6*stride] + ((Fd - Bdd ) >> 4)]; } } else { if(type==0){ ip[0*8] = ip[1*8] = ip[2*8] = ip[3*8] = ip[4*8] = ip[5*8] = ip[6*8] = ip[7*8] = ((xC4S4 * ip[0*8] + (IdctAdjustBeforeShift<<16))>>20); }else if(type==1){ dst[0*stride]= dst[1*stride]= dst[2*stride]= dst[3*stride]= dst[4*stride]= dst[5*stride]= dst[6*stride]= dst[7*stride]= cm[128 + ((xC4S4 * ip[0*8] + (IdctAdjustBeforeShift<<16))>>20)]; }else{ if(ip[0*8]){ int v= ((xC4S4 * ip[0*8] + (IdctAdjustBeforeShift<<16))>>20); dst[0*stride] = cm[dst[0*stride] + v]; dst[1*stride] = cm[dst[1*stride] + v]; dst[2*stride] = cm[dst[2*stride] + v]; dst[3*stride] = cm[dst[3*stride] + v]; dst[4*stride] = cm[dst[4*stride] + v]; dst[5*stride] = cm[dst[5*stride] + v]; dst[6*stride] = cm[dst[6*stride] + v]; dst[7*stride] = cm[dst[7*stride] + v]; } } } ip++; /* next column */ dst++; } }

    Read the article

  • program not working as expected!

    - by wilson88
    Can anyone just help spot why my program is not returning the expected output.related to my previous question.Am passing a vector by refrence, I want to see whats in the container before I copy them to another loaction.if u remove comments on loadRage, u will see bids are generated by the trader. #include <iostream> #include <vector> #include <string> #include <algorithm> #include <cstdlib> #include <iomanip> using namespace std; const int NUMSELLER = 1; const int NUMBUYER = 1; const int NUMBIDS = 20; const int MINQUANTITY = 1; const int MAXQUANTITY = 30; const int MINPRICE =100; const int MAXPRICE = 150; int s=0; int trdId; // Bid, simple container for values struct Bid { int bidId, trdId, qty, price; char type; // for sort and find. bool operator<(const Bid &other) const { return price < other.price; } bool operator==(int bidId) const { return this->bidId == bidId; } }; // alias to the list, make type consistent typedef vector<Bid> BidList; // this class generates bids! class Trader { private: int nextBidId; public: Trader(); Bid getNextBid(); Bid getNextBid(char type); // generate a number of bids void loadRange(BidList &, int size); void loadRange(BidList &, char type, int size); void setVector(); }; Trader::Trader() : nextBidId(1) {} #define RAND_RANGE(min, max) ((rand() % (max-min+1)) + min) Bid Trader::getNextBid() { char type = RAND_RANGE('A','B'); return getNextBid(type); } Bid Trader::getNextBid(char type) { for(int i = 0; i < NUMSELLER+NUMBUYER; i++) { // int trdId = RAND_RANGE(1,9); if (s<10){trdId=0;type='A';} else {trdId=1;type='B';} s++; int qty = RAND_RANGE(MINQUANTITY, MAXQUANTITY); int price = RAND_RANGE(MINPRICE, MAXPRICE); Bid bid = {nextBidId++, trdId, qty, price, type}; return bid; } } //void Trader::loadRange(BidList &list, int size) { // for (int i=0; i<size; i++) { list.push_back(getNextBid()); } //} // //void Trader::loadRange(BidList &list, char type, int size) { // for (int i=0; i<size; i++) { list.push_back(getNextBid(type)); } //} //---------------------------AUCTIONEER------------------------------------------- class Auctioneer { vector<Auctioneer> List; Trader trader; vector<Bid> list; public: Auctioneer(){}; void accept_bids(const BidList& bid); }; typedef vector<Auctioneer*> bidlist; void Auctioneer::accept_bids(const BidList& bid){ BidList list; //copy (BidList.begin(),BidList.end(),list); } //all the happy display commands void show(const Bid &bid) { cout << "\tBid\t(" << setw(3) << bid.bidId << "\t " << setw(3) << bid.trdId << "\t " << setw(3) << bid.type <<"\t " << setw(3) << bid.qty <<"\t " << setw(3) << bid.price <<")\t\n " ; } void show(const BidList &list) { cout << "\t\tBidID | TradID | Type | Qty | Price \n\n"; for(BidList::const_iterator itr=list.begin(); itr != list.end(); ++itr) { //cout <<"\t\t"; show(*itr); cout << endl; } cout << endl; } //search now checks for failure void show(const char *msg, const BidList &list) { cout << msg << endl; show(list); } void searchTest(BidList &list, int bidId) { cout << "Searching for Bid " << bidId << endl; BidList::const_iterator itr = find(list.begin(), list.end(), bidId); if (itr==list.end()) { cout << "Bid not found."; } else { cout << "Bid has been found. Its : "; show(*itr); } cout << endl; } //comparator function for price: returns true when x belongs before y bool compareBidList(Bid one, Bid two) { if (one.type == 'A' && two.type == 'B') return (one.price < two.price); return false; } void sort(BidList &bidlist) { sort(bidlist.begin(), bidlist.end(), compareBidList); } int main(int argc, char **argv) { Trader trader; BidList bidlist; Auctioneer auctioneer; //bidlist list; auctioneer.accept_bids(bidlist); //trader.loadRange(bidlist, NUMBIDS); show("Bids before sort:", bidlist); sort(bidlist); show("Bids after sort:", bidlist); system("pause"); return 0; }

    Read the article

  • Solving embarassingly parallel problems using Python multiprocessing

    - by gotgenes
    How does one use multiprocessing to tackle embarrassingly parallel problems? Embarassingly parallel problems typically consist of three basic parts: Read input data (from a file, database, tcp connection, etc.). Run calculations on the input data, where each calculation is independent of any other calculation. Write results of calculations (to a file, database, tcp connection, etc.). We can parallelize the program in two dimensions: Part 2 can run on multiple cores, since each calculation is independent; order of processing doesn't matter. Each part can run independently. Part 1 can place data on an input queue, part 2 can pull data off the input queue and put results onto an output queue, and part 3 can pull results off the output queue and write them out. This seems a most basic pattern in concurrent programming, but I am still lost in trying to solve it, so let's write a canonical example to illustrate how this is done using multiprocessing. Here is the example problem: Given a CSV file with rows of integers as input, compute their sums. Separate the problem into three parts, which can all run in parallel: Process the input file into raw data (lists/iterables of integers) Calculate the sums of the data, in parallel Output the sums Below is traditional, single-process bound Python program which solves these three tasks: #!/usr/bin/env python # -*- coding: UTF-8 -*- # basicsums.py """A program that reads integer values from a CSV file and writes out their sums to another CSV file. """ import csv import optparse import sys def make_cli_parser(): """Make the command line interface parser.""" usage = "\n\n".join(["python %prog INPUT_CSV OUTPUT_CSV", __doc__, """ ARGUMENTS: INPUT_CSV: an input CSV file with rows of numbers OUTPUT_CSV: an output file that will contain the sums\ """]) cli_parser = optparse.OptionParser(usage) return cli_parser def parse_input_csv(csvfile): """Parses the input CSV and yields tuples with the index of the row as the first element, and the integers of the row as the second element. The index is zero-index based. :Parameters: - `csvfile`: a `csv.reader` instance """ for i, row in enumerate(csvfile): row = [int(entry) for entry in row] yield i, row def sum_rows(rows): """Yields a tuple with the index of each input list of integers as the first element, and the sum of the list of integers as the second element. The index is zero-index based. :Parameters: - `rows`: an iterable of tuples, with the index of the original row as the first element, and a list of integers as the second element """ for i, row in rows: yield i, sum(row) def write_results(csvfile, results): """Writes a series of results to an outfile, where the first column is the index of the original row of data, and the second column is the result of the calculation. The index is zero-index based. :Parameters: - `csvfile`: a `csv.writer` instance to which to write results - `results`: an iterable of tuples, with the index (zero-based) of the original row as the first element, and the calculated result from that row as the second element """ for result_row in results: csvfile.writerow(result_row) def main(argv): cli_parser = make_cli_parser() opts, args = cli_parser.parse_args(argv) if len(args) != 2: cli_parser.error("Please provide an input file and output file.") infile = open(args[0]) in_csvfile = csv.reader(infile) outfile = open(args[1], 'w') out_csvfile = csv.writer(outfile) # gets an iterable of rows that's not yet evaluated input_rows = parse_input_csv(in_csvfile) # sends the rows iterable to sum_rows() for results iterable, but # still not evaluated result_rows = sum_rows(input_rows) # finally evaluation takes place as a chain in write_results() write_results(out_csvfile, result_rows) infile.close() outfile.close() if __name__ == '__main__': main(sys.argv[1:]) Let's take this program and rewrite it to use multiprocessing to parallelize the three parts outlined above. Below is a skeleton of this new, parallelized program, that needs to be fleshed out to address the parts in the comments: #!/usr/bin/env python # -*- coding: UTF-8 -*- # multiproc_sums.py """A program that reads integer values from a CSV file and writes out their sums to another CSV file, using multiple processes if desired. """ import csv import multiprocessing import optparse import sys NUM_PROCS = multiprocessing.cpu_count() def make_cli_parser(): """Make the command line interface parser.""" usage = "\n\n".join(["python %prog INPUT_CSV OUTPUT_CSV", __doc__, """ ARGUMENTS: INPUT_CSV: an input CSV file with rows of numbers OUTPUT_CSV: an output file that will contain the sums\ """]) cli_parser = optparse.OptionParser(usage) cli_parser.add_option('-n', '--numprocs', type='int', default=NUM_PROCS, help="Number of processes to launch [DEFAULT: %default]") return cli_parser def main(argv): cli_parser = make_cli_parser() opts, args = cli_parser.parse_args(argv) if len(args) != 2: cli_parser.error("Please provide an input file and output file.") infile = open(args[0]) in_csvfile = csv.reader(infile) outfile = open(args[1], 'w') out_csvfile = csv.writer(outfile) # Parse the input file and add the parsed data to a queue for # processing, possibly chunking to decrease communication between # processes. # Process the parsed data as soon as any (chunks) appear on the # queue, using as many processes as allotted by the user # (opts.numprocs); place results on a queue for output. # # Terminate processes when the parser stops putting data in the # input queue. # Write the results to disk as soon as they appear on the output # queue. # Ensure all child processes have terminated. # Clean up files. infile.close() outfile.close() if __name__ == '__main__': main(sys.argv[1:]) These pieces of code, as well as another piece of code that can generate example CSV files for testing purposes, can be found on github. I would appreciate any insight here as to how you concurrency gurus would approach this problem. Here are some questions I had when thinking about this problem. Bonus points for addressing any/all: Should I have child processes for reading in the data and placing it into the queue, or can the main process do this without blocking until all input is read? Likewise, should I have a child process for writing the results out from the processed queue, or can the main process do this without having to wait for all the results? Should I use a processes pool for the sum operations? If yes, what method do I call on the pool to get it to start processing the results coming into the input queue, without blocking the input and output processes, too? apply_async()? map_async()? imap()? imap_unordered()? Suppose we didn't need to siphon off the input and output queues as data entered them, but could wait until all input was parsed and all results were calculated (e.g., because we know all the input and output will fit in system memory). Should we change the algorithm in any way (e.g., not run any processes concurrently with I/O)?

    Read the article

  • Inverse Kinematics with OpenGL/Eigen3 : unstable jacobian pseudoinverse

    - by SigTerm
    I'm trying to implement simple inverse kinematics test using OpenGL, Eigen3 and "jacobian pseudoinverse" method. The system works fine using "jacobian transpose" algorithm, however, as soon as I attempt to use "pseudoinverse", joints become unstable and start jerking around (eventually they freeze completely - unless I use "jacobian transpose" fallback computation). I've investigated the issue and turns out that in some cases jacobian.inverse()*jacobian has zero determinant and cannot be inverted. However, I've seen other demos on the internet (youtube) that claim to use same method and they do not seem to have this problem. So I'm uncertain where is the cause of the issue. Code is attached below: *.h: struct Ik{ float targetAngle; float ikLength; VectorXf angles; Vector3f root, target; Vector3f jointPos(int ikIndex); size_t size() const; Vector3f getEndPos(int index, const VectorXf& vec); void resize(size_t size); void update(float t); void render(); Ik(): targetAngle(0), ikLength(10){ } }; *.cpp: size_t Ik::size() const{ return angles.rows(); } Vector3f Ik::getEndPos(int index, const VectorXf& vec){ Vector3f pos(0, 0, 0); while(true){ Eigen::Affine3f t; float radAngle = pi*vec[index]/180.0f; t = Eigen::AngleAxisf(radAngle, Vector3f(-1, 0, 0)) * Eigen::Translation3f(Vector3f(0, 0, ikLength)); pos = t * pos; if (index == 0) break; index--; } return pos; } void Ik::resize(size_t size){ angles.resize(size); angles.setZero(); } void drawMarker(Vector3f p){ glBegin(GL_LINES); glVertex3f(p[0]-1, p[1], p[2]); glVertex3f(p[0]+1, p[1], p[2]); glVertex3f(p[0], p[1]-1, p[2]); glVertex3f(p[0], p[1]+1, p[2]); glVertex3f(p[0], p[1], p[2]-1); glVertex3f(p[0], p[1], p[2]+1); glEnd(); } void drawIkArm(float length){ glBegin(GL_LINES); float f = 0.25f; glVertex3f(0, 0, length); glVertex3f(-f, -f, 0); glVertex3f(0, 0, length); glVertex3f(f, -f, 0); glVertex3f(0, 0, length); glVertex3f(f, f, 0); glVertex3f(0, 0, length); glVertex3f(-f, f, 0); glEnd(); glBegin(GL_LINE_LOOP); glVertex3f(f, f, 0); glVertex3f(-f, f, 0); glVertex3f(-f, -f, 0); glVertex3f(f, -f, 0); glEnd(); } void Ik::update(float t){ targetAngle += t * pi*2.0f/10.0f; while (t > pi*2.0f) t -= pi*2.0f; target << 0, 8 + 3*sinf(targetAngle), cosf(targetAngle)*4.0f+5.0f; Vector3f tmpTarget = target; Vector3f targetDiff = tmpTarget - root; float l = targetDiff.norm(); float maxLen = ikLength*(float)angles.size() - 0.01f; if (l > maxLen){ targetDiff *= maxLen/l; l = targetDiff.norm(); tmpTarget = root + targetDiff; } Vector3f endPos = getEndPos(size()-1, angles); Vector3f diff = tmpTarget - endPos; float maxAngle = 360.0f/(float)angles.size(); for(int loop = 0; loop < 1; loop++){ MatrixXf jacobian(diff.rows(), angles.rows()); jacobian.setZero(); float step = 1.0f; for (int i = 0; i < angles.size(); i++){ Vector3f curRoot = root; if (i) curRoot = getEndPos(i-1, angles); Vector3f axis(1, 0, 0); Vector3f n = endPos - curRoot; float l = n.norm(); if (l) n /= l; n = n.cross(axis); if (l) n *= l*step*pi/180.0f; //std::cout << n << "\n"; for (int j = 0; j < 3; j++) jacobian(j, i) = n[j]; } std::cout << jacobian << std::endl; MatrixXf jjt = jacobian.transpose()*jacobian; //std::cout << jjt << std::endl; float d = jjt.determinant(); MatrixXf invJ; float scale = 0.1f; if (!d /*|| true*/){ invJ = jacobian.transpose(); scale = 5.0f; std::cout << "fallback to jacobian transpose!\n"; } else{ invJ = jjt.inverse()*jacobian.transpose(); std::cout << "jacobian pseudo-inverse!\n"; } //std::cout << invJ << std::endl; VectorXf add = invJ*diff*step*scale; //std::cout << add << std::endl; float maxSpeed = 15.0f; for (int i = 0; i < add.size(); i++){ float& cur = add[i]; cur = std::max(-maxSpeed, std::min(maxSpeed, cur)); } angles += add; for (int i = 0; i < angles.size(); i++){ float& cur = angles[i]; if (i) cur = std::max(-maxAngle, std::min(maxAngle, cur)); } } } void Ik::render(){ glPushMatrix(); glTranslatef(root[0], root[1], root[2]); for (int i = 0; i < angles.size(); i++){ glRotatef(angles[i], -1, 0, 0); drawIkArm(ikLength); glTranslatef(0, 0, ikLength); } glPopMatrix(); drawMarker(target); for (int i = 0; i < angles.size(); i++) drawMarker(getEndPos(i, angles)); } Any help will be appreciated.

    Read the article

  • Binary Cosine Cofficient

    - by hairyyak
    I was given the following forumulae for calculating this sim=|QnD| / v|Q|v|D| I went ahed and implemented a class to compare strings consisting of a series of words #pragma once #include <vector> #include <string> #include <iostream> #include <vector> using namespace std; class StringSet { public: StringSet(void); StringSet( const string the_strings[], const int no_of_strings); ~StringSet(void); StringSet( const vector<string> the_strings); void add_string( const string the_string); bool remove_string( const string the_string); void clear_set(void); int no_of_strings(void) const; friend ostream& operator <<(ostream& outs, StringSet& the_strings); friend StringSet operator *(const StringSet& first, const StringSet& second); friend StringSet operator +(const StringSet& first, const StringSet& second); double binary_coefficient( const StringSet& the_second_set); private: vector<string> set; }; #include "StdAfx.h" #include "StringSet.h" #include <iterator> #include <algorithm> #include <stdexcept> #include <iostream> #include <cmath> StringSet::StringSet(void) { } StringSet::~StringSet(void) { } StringSet::StringSet( const vector<string> the_strings) { set = the_strings; } StringSet::StringSet( const string the_strings[], const int no_of_strings) { copy( the_strings, &the_strings[no_of_strings], back_inserter(set)); } void StringSet::add_string( const string the_string) { try { if( find( set.begin(), set.end(), the_string) == set.end()) { set.push_back(the_string); } else { //String is already in the set. throw domain_error("String is already in the set"); } } catch( domain_error e) { cout << e.what(); exit(1); } } bool StringSet::remove_string( const string the_string) { //Found the occurrence of the string. return it an iterator pointing to it. vector<string>::iterator iter; if( ( iter = find( set.begin(), set.end(), the_string) ) != set.end()) { set.erase(iter); return true; } return false; } void StringSet::clear_set(void) { set.clear(); } int StringSet::no_of_strings(void) const { return set.size(); } ostream& operator <<(ostream& outs, StringSet& the_strings) { vector<string>::const_iterator const_iter = the_strings.set.begin(); for( ; const_iter != the_strings.set.end(); const_iter++) { cout << *const_iter << " "; } cout << endl; return outs; } //This function returns the union of the two string sets. StringSet operator *(const StringSet& first, const StringSet& second) { vector<string> new_string_set; new_string_set = first.set; for( unsigned int i = 0; i < second.set.size(); i++) { vector<string>::const_iterator const_iter = find(new_string_set.begin(), new_string_set.end(), second.set[i]); //String is new - include it. if( const_iter == new_string_set.end() ) { new_string_set.push_back(second.set[i]); } } StringSet the_set(new_string_set); return the_set; } //This method returns the intersection of the two string sets. StringSet operator +(const StringSet& first, const StringSet& second) { //For each string in the first string look though the second and see if //there is a matching pair, in which case include the string in the set. vector<string> new_string_set; vector<string>::const_iterator const_iter = first.set.begin(); for ( ; const_iter != first.set.end(); ++const_iter) { //Then search through the entire second string to see if //there is a duplicate. vector<string>::const_iterator const_iter2 = second.set.begin(); for( ; const_iter2 != second.set.end(); const_iter2++) { if( *const_iter == *const_iter2 ) { new_string_set.push_back(*const_iter); } } } StringSet new_set(new_string_set); return new_set; } double StringSet::binary_coefficient( const StringSet& the_second_set) { double coefficient; StringSet intersection = the_second_set + set; coefficient = intersection.no_of_strings() / sqrt((double) no_of_strings()) * sqrt((double)the_second_set.no_of_strings()); return coefficient; } However when I try and calculate the coefficient using the following main function: // Exercise13.cpp : main project file. #include "stdafx.h" #include <boost/regex.hpp> #include "StringSet.h" using namespace System; using namespace System::Runtime::InteropServices; using namespace boost; //This function takes as input a string, which //is then broken down into a series of words //where the punctuaction is ignored. StringSet break_string( const string the_string) { regex re; cmatch matches; StringSet words; string search_pattern = "\\b(\\w)+\\b"; try { // Assign the regular expression for parsing. re = search_pattern; } catch( regex_error& e) { cout << search_pattern << " is not a valid regular expression: \"" << e.what() << "\"" << endl; exit(1); } sregex_token_iterator p(the_string.begin(), the_string.end(), re, 0); sregex_token_iterator end; for( ; p != end; ++p) { string new_string(p->first, p->second); String^ copy_han = gcnew String(new_string.c_str()); String^ copy_han2 = copy_han->ToLower(); char* str2 = (char*)(void*)Marshal::StringToHGlobalAnsi(copy_han2); string new_string2(str2); words.add_string(new_string2); } return words; } int main(array<System::String ^> ^args) { StringSet words = break_string("Here is a string, with some; words"); StringSet words2 = break_string("There is another string,"); cout << words.binary_coefficient(words2); return 0; } I get an index which is 1.5116 rather than a value from 0 to 1. Does anybody have a clue why this is the case? Any help would be appreciated.

    Read the article

  • migrating Solaris to RH: network latency issue, tcp window size & other tcp parameters

    - by Bastien
    Hello I have a client/server app (Java) that I'm migrating from Solaris to RH Linux. since I started running it in RH, I noticed some issues related to latency. I managed to isolate the problem that looks like this: client sends 5 messages (32 bytes each) in a row (same application timestamp) to the server. server echos messages. client receives replies and prints round trip time for each msg. in Solaris, all is well: I get ALL 5 replies at the same time, roughly 80ms after having sent original messages (client & server are several thousands miles away from each other: my ping RTT is 80ms, all normal). in RH, first 3 messages are echoed normally (they arrive 80ms after they've been sent), however the following 2 arrive 80ms later (so total 160ms RTT). the pattern is always the same. clearly looked like a TCP problem. on my solaris box, I had previously configured the tcp stack with 2 specific options: disable nagle algorithm globally set tcp_deferred_acks_max to 0 on RH, it's not possible to disable nagle globally, but I disabled it on all of my apps' sockets (TCP_NODELAY). so I started playing with tcpdump (on the server machine), and compared both outputs: SOLARIS: 22 2.085645 client server TCP 56150 > 6006 [PSH, ACK] Seq=111 Ack=106 Win=66672 Len=22 "MSG_1 RCV" 23 2.085680 server client TCP 6006 > 56150 [ACK] Seq=106 Ack=133 Win=50400 Len=0 24 2.085908 client server TCP 56150 > 6006 [PSH, ACK] Seq=133 Ack=106 Win=66672 Len=22 "MSG_2 RCV" 25 2.085925 server client TCP 6006 > 56150 [ACK] Seq=106 Ack=155 Win=50400 Len=0 26 2.086175 client server TCP 56150 > 6006 [PSH, ACK] Seq=155 Ack=106 Win=66672 Len=22 "MSG_3 RCV" 27 2.086192 server client TCP 6006 > 56150 [ACK] Seq=106 Ack=177 Win=50400 Len=0 28 2.086243 server client TCP 6006 > 56150 [PSH, ACK] Seq=106 Ack=177 Win=50400 Len=21 "MSG_1 ECHO" 29 2.086440 client server TCP 56150 > 6006 [PSH, ACK] Seq=177 Ack=106 Win=66672 Len=22 "MSG_4 RCV" 30 2.086454 server client TCP 6006 > 56150 [ACK] Seq=127 Ack=199 Win=50400 Len=0 31 2.086659 server client TCP 6006 > 56150 [PSH, ACK] Seq=127 Ack=199 Win=50400 Len=21 "MSG_2 ECHO" 32 2.086708 client server TCP 56150 > 6006 [PSH, ACK] Seq=199 Ack=106 Win=66672 Len=22 "MSG_5 RCV" 33 2.086721 server client TCP 6006 > 56150 [ACK] Seq=148 Ack=221 Win=50400 Len=0 34 2.086947 server client TCP 6006 > 56150 [PSH, ACK] Seq=148 Ack=221 Win=50400 Len=21 "MSG_3 ECHO" 35 2.087196 server client TCP 6006 > 56150 [PSH, ACK] Seq=169 Ack=221 Win=50400 Len=21 "MSG_4 ECHO" 36 2.087500 server client TCP 6006 > 56150 [PSH, ACK] Seq=190 Ack=221 Win=50400 Len=21 "MSG_5 ECHO" 37 2.165390 client server TCP 56150 > 6006 [ACK] Seq=221 Ack=148 Win=66632 Len=0 38 2.166314 client server TCP 56150 > 6006 [ACK] Seq=221 Ack=190 Win=66588 Len=0 39 2.364135 client server TCP 56150 > 6006 [ACK] Seq=221 Ack=211 Win=66568 Len=0 REDHAT: 17 2.081163 client server TCP 55879 > 6006 [PSH, ACK] Seq=111 Ack=106 Win=66672 Len=22 "MSG_1 RCV" 18 2.081178 server client TCP 6006 > 55879 [ACK] Seq=106 Ack=133 Win=5888 Len=0 19 2.081297 server client TCP 6006 > 55879 [PSH, ACK] Seq=106 Ack=133 Win=5888 Len=21 "MSG_1 ECHO" 20 2.081711 client server TCP 55879 > 6006 [PSH, ACK] Seq=133 Ack=106 Win=66672 Len=22 "MSG_2 RCV" 21 2.081761 client server TCP 55879 > 6006 [PSH, ACK] Seq=155 Ack=106 Win=66672 Len=22 "MSG_3 RCV" 22 2.081846 server client TCP 6006 > 55879 [PSH, ACK] Seq=127 Ack=177 Win=5888 Len=21 "MSG_2 ECHO" 23 2.081995 server client TCP 6006 > 55879 [PSH, ACK] Seq=148 Ack=177 Win=5888 Len=21 "MSG_3 ECHO" 24 2.082011 client server TCP 55879 > 6006 [PSH, ACK] Seq=177 Ack=106 Win=66672 Len=22 "MSG_4 RCV" 25 2.082362 client server TCP 55879 > 6006 [PSH, ACK] Seq=199 Ack=106 Win=66672 Len=22 "MSG_5 RCV" 26 2.082377 server client TCP 6006 > 55879 [ACK] Seq=169 Ack=221 Win=5888 Len=0 27 2.171003 client server TCP 55879 > 6006 [ACK] Seq=221 Ack=148 Win=66632 Len=0 28 2.171019 server client TCP 6006 > 55879 [PSH, ACK] Seq=169 Ack=221 Win=5888 Len=42 "MSG_4 ECHO + MSG_5 ECHO" 29 2.257498 client server TCP 55879 > 6006 [ACK] Seq=221 Ack=211 Win=66568 Len=0 so, I got confirmation things are not working correctly for RH: packet 28 is sent TOO LATE, it looks like the server is waiting for packet 27's ACK before doing anything. seems to me it's the most likely reason... then I realized that the "Win" parameters are different on Solaris & RH dumps: 50400 on Solaris, only 5888 on RH. that's another hint... I read the doc about the slide window & buffer window, and played around with the rcvBuffer & sendBuffer in java on my sockets, but never managed to change this 5888 value to anything else (I checked each time directly with tcpdump). does anybody know how to do this ? I'm having a hard time getting definitive information, as in some cases there's "auto-negotiation" that I might need to bypass, etc... I eventually managed to get only partially rid of my initial problem by setting the "tcp_slow_start_after_idle" parameter to 0 on RH, but it did not change the "win" parameter at all. the same problem was there for the first 4 groups of 5 messages, with TCP retransmission & TCP Dup ACK in tcpdump, then the problem disappeared altogether for all following groups of 5 messages. It doesn't seem like a very clean and/or generic solution to me. I'd really like to reproduce the exact same conditions under both OSes. I'll keep researching, but any help from TCP gurus would be greatly appreciated ! thanks !

    Read the article

  • Autocorrelation returns random results with mic input (using a high pass filter)

    - by Niall
    Hello, Sorry to ask a similar question to the one i asked before (FFT Problem (Returns random results)), but i've looked up pitch detection and autocorrelation and have found some code for pitch detection using autocorrelation. Im trying to do pitch detection of a users singing. Problem is, it keeps returning random results. I've got some code from http://code.google.com/p/yaalp/ which i've converted to C++ and modified (below). My sample rate is 2048, and data size is 1024. I'm detecting pitch of both a sine wave and mic input. The frequency of the sine wave is 726.0, and its detecting it to be 722.950820 (which im ok with), but its detecting the pitch of the mic as a random number from around 100 to around 1050. I'm now using a High pass filter to remove the DC offset, but it's not working. Am i doing it right, and if so, what else can i do to fix it? Any help would be greatly appreciated! double* doHighPassFilter(short *buffer) { // Do FFT: int bufferLength = 1024; float *real = malloc(bufferLength*sizeof(float)); float *real2 = malloc(bufferLength*sizeof(float)); for(int x=0;x<bufferLength;x++) { real[x] = buffer[x]; } fft(real, bufferLength); for(int x=0;x<bufferLength;x+=2) { real2[x] = real[x]; } for (int i=0; i < 30; i++) //Set freqs lower than 30hz to zero to attenuate the low frequencies real2[i] = 0; // Do inverse FFT: inversefft(real2,bufferLength); double* real3 = (double*)real2; return real3; } double DetectPitch(short* data) { int sampleRate = 2048; //Create sine wave double *buffer = malloc(1024*sizeof(short)); double amplitude = 0.25 * 32768; //0.25 * max length of short double frequency = 726.0; for (int n = 0; n < 1024; n++) { buffer[n] = (short)(amplitude * sin((2 * 3.14159265 * n * frequency) / sampleRate)); } doHighPassFilter(data); printf("Pitch from sine wave: %f\n",detectPitchCalculation(buffer, 50.0, 1000.0, 1, 1)); printf("Pitch from mic: %f\n",detectPitchCalculation(data, 50.0, 1000.0, 1, 1)); return 0; } // These work by shifting the signal until it seems to correlate with itself. // In other words if the signal looks very similar to (signal shifted 200 data) than the fundamental period is probably 200 data // Note that the algorithm only works well when there's only one prominent fundamental. // This could be optimized by looking at the rate of change to determine a maximum without testing all periods. double detectPitchCalculation(double* data, double minHz, double maxHz, int nCandidates, int nResolution) { //-------------------------1-------------------------// // note that higher frequency means lower period int nLowPeriodInSamples = hzToPeriodInSamples(maxHz, 2048); int nHiPeriodInSamples = hzToPeriodInSamples(minHz, 2048); if (nHiPeriodInSamples <= nLowPeriodInSamples) printf("Bad range for pitch detection."); if (1024 < nHiPeriodInSamples) printf("Not enough data."); double *results = new double[nHiPeriodInSamples - nLowPeriodInSamples]; //-------------------------2-------------------------// for (int period = nLowPeriodInSamples; period < nHiPeriodInSamples; period += nResolution) { double sum = 0; // for each sample, find correlation. (If they are far apart, small) for (int i = 0; i < 1024 - period; i++) sum += data[i] * data[i + period]; double mean = sum / 1024.0; results[period - nLowPeriodInSamples] = mean; } //-------------------------3-------------------------// // find the best indices int *bestIndices = findBestCandidates(nCandidates, results, nHiPeriodInSamples - nLowPeriodInSamples - 1); //note findBestCandidates modifies parameter // convert back to Hz double *res = new double[nCandidates]; for (int i=0; i < nCandidates;i++) res[i] = periodInSamplesToHz(bestIndices[i]+nLowPeriodInSamples, 2048); double pitch2 = res[0]; free(res); free(results); return pitch2; } /// Finds n "best" values from an array. Returns the indices of the best parts. /// (One way to do this would be to sort the array, but that could take too long. /// Warning: Changes the contents of the array!!! Do not use result array afterwards. int* findBestCandidates(int n, double* inputs,int length) { //int length = inputs.Length; if (length < n) printf("Length of inputs is not long enough."); int *res = new int[n]; double minValue = 0; for (int c = 0; c < n; c++) { // find the highest. double fBestValue = minValue; int nBestIndex = -1; for (int i = 0; i < length; i++) { if (inputs[i] > fBestValue) { nBestIndex = i; fBestValue = inputs[i]; } } // record this highest value res[c] = nBestIndex; // now blank out that index. if(nBestIndex!=-1) inputs[nBestIndex] = minValue; } return res; } int hzToPeriodInSamples(double hz, int sampleRate) { return (int)(1 / (hz / (double)sampleRate)); } double periodInSamplesToHz(int period, int sampleRate) { return 1 / (period / (double)sampleRate); } Thanks, Niall. Edit: Changed the code to implement a high pass filter with a cutoff of 30hz (from What Are High-Pass and Low-Pass Filters?, can anyone tell me how to convert the low-pass filter using convolution to a high-pass one?) but it's still returning random results. Plugging it into a VST host and using VST plugins to compare spectrums isn't an option to me unfortunately.

    Read the article

  • rotating bitmaps. In code.

    - by Marco van de Voort
    Is there a faster way to rotate a large bitmap by 90 or 270 degrees than simply doing a nested loop with inverted coordinates? The bitmaps are 8bpp and typically 2048*2400*8bpp Currently I do this by simply copying with argument inversion, roughly (pseudo code: for x = 0 to 2048-1 for y = 0 to 2048-1 dest[x][y]=src[y][x]; (In reality I do it with pointers, for a bit more speed, but that is roughly the same magnitude) GDI is quite slow with large images, and GPU load/store times for textures (GF7 cards) are in the same magnitude as the current CPU time. Any tips, pointers? An in-place algorithm would even be better, but speed is more important than being in-place. Target is Delphi, but it is more an algorithmic question. SSE(2) vectorization no problem, it is a big enough problem for me to code it in assembler Duplicates How do you rotate a two dimensional array?. Follow up to Nils' answer Image 2048x2700 - 2700x2048 Compiler Turbo Explorer 2006 with optimization on. Windows: Power scheme set to "Always on". (important!!!!) Machine: Core2 6600 (2.4 GHz) time with old routine: 32ms (step 1) time with stepsize 8 : 12ms time with stepsize 16 : 10ms time with stepsize 32+ : 9ms Meanwhile I also tested on a Athlon 64 X2 (5200+ iirc), and the speed up there was slightly more than a factor four (80 to 19 ms). The speed up is well worth it, thanks. Maybe that during the summer months I'll torture myself with a SSE(2) version. However I already thought about how to tackle that, and I think I'll run out of SSE2 registers for an straight implementation: for n:=0 to 7 do begin load r0, <source+n*rowsize> shift byte from r0 into r1 shift byte from r0 into r2 .. shift byte from r0 into r8 end; store r1, <target> store r2, <target+1*<rowsize> .. store r8, <target+7*<rowsize> So 8x8 needs 9 registers, but 32-bits SSE only has 8. Anyway that is something for the summer months :-) Note that the pointer thing is something that I do out of instinct, but it could be there is actually something to it, if your dimensions are not hardcoded, the compiler can't turn the mul into a shift. While muls an sich are cheap nowadays, they also generate more register pressure afaik. The code (validated by subtracting result from the "naieve" rotate1 implementation): const stepsize = 32; procedure rotatealign(Source: tbw8image; Target:tbw8image); var stepsx,stepsy,restx,resty : Integer; RowPitchSource, RowPitchTarget : Integer; pSource, pTarget,ps1,ps2 : pchar; x,y,i,j: integer; rpstep : integer; begin RowPitchSource := source.RowPitch; // bytes to jump to next line. Can be negative (includes alignment) RowPitchTarget := target.RowPitch; rpstep:=RowPitchTarget*stepsize; stepsx:=source.ImageWidth div stepsize; stepsy:=source.ImageHeight div stepsize; // check if mod 16=0 here for both dimensions, if so -> SSE2. for y := 0 to stepsy - 1 do begin psource:=source.GetImagePointer(0,y*stepsize); // gets pointer to pixel x,y ptarget:=Target.GetImagePointer(target.imagewidth-(y+1)*stepsize,0); for x := 0 to stepsx - 1 do begin for i := 0 to stepsize - 1 do begin ps1:=@psource[rowpitchsource*i]; // ( 0,i) ps2:=@ptarget[stepsize-1-i]; // (maxx-i,0); for j := 0 to stepsize - 1 do begin ps2[0]:=ps1[j]; inc(ps2,RowPitchTarget); end; end; inc(psource,stepsize); inc(ptarget,rpstep); end; end; // 3 more areas to do, with dimensions // - stepsy*stepsize * restx // right most column of restx width // - stepsx*stepsize * resty // bottom row with resty height // - restx*resty // bottom-right rectangle. restx:=source.ImageWidth mod stepsize; // typically zero because width is // typically 1024 or 2048 resty:=source.Imageheight mod stepsize; if restx>0 then begin // one loop less, since we know this fits in one line of "blocks" psource:=source.GetImagePointer(source.ImageWidth-restx,0); // gets pointer to pixel x,y ptarget:=Target.GetImagePointer(Target.imagewidth-stepsize,Target.imageheight-restx); for y := 0 to stepsy - 1 do begin for i := 0 to stepsize - 1 do begin ps1:=@psource[rowpitchsource*i]; // ( 0,i) ps2:=@ptarget[stepsize-1-i]; // (maxx-i,0); for j := 0 to restx - 1 do begin ps2[0]:=ps1[j]; inc(ps2,RowPitchTarget); end; end; inc(psource,stepsize*RowPitchSource); dec(ptarget,stepsize); end; end; if resty>0 then begin // one loop less, since we know this fits in one line of "blocks" psource:=source.GetImagePointer(0,source.ImageHeight-resty); // gets pointer to pixel x,y ptarget:=Target.GetImagePointer(0,0); for x := 0 to stepsx - 1 do begin for i := 0 to resty- 1 do begin ps1:=@psource[rowpitchsource*i]; // ( 0,i) ps2:=@ptarget[resty-1-i]; // (maxx-i,0); for j := 0 to stepsize - 1 do begin ps2[0]:=ps1[j]; inc(ps2,RowPitchTarget); end; end; inc(psource,stepsize); inc(ptarget,rpstep); end; end; if (resty>0) and (restx>0) then begin // another loop less, since only one block psource:=source.GetImagePointer(source.ImageWidth-restx,source.ImageHeight-resty); // gets pointer to pixel x,y ptarget:=Target.GetImagePointer(0,target.ImageHeight-restx); for i := 0 to resty- 1 do begin ps1:=@psource[rowpitchsource*i]; // ( 0,i) ps2:=@ptarget[resty-1-i]; // (maxx-i,0); for j := 0 to restx - 1 do begin ps2[0]:=ps1[j]; inc(ps2,RowPitchTarget); end; end; end; end;

    Read the article

< Previous Page | 198 199 200 201 202 203  | Next Page >