Search Results

Search found 28325 results on 1133 pages for 'test cases'.

Page 233/1133 | < Previous Page | 229 230 231 232 233 234 235 236 237 238 239 240  | Next Page >

  • Optimized OCR black/white pixel algorithm

    - by eagle
    I am writing a simple OCR solution for a finite set of characters. That is, I know the exact way all 26 letters in the alphabet will look like. I am using C# and am able to easily determine if a given pixel should be treated as black or white. I am generating a matrix of black/white pixels for every single character. So for example, the letter I (capital i), might look like the following: 01110 00100 00100 00100 01110 Note: all points, which I use later in this post, assume that the top left pixel is (0, 0), bottom right pixel is (4, 4). 1's represent black pixels, and 0's represent white pixels. I would create a corresponding matrix in C# like this: CreateLetter("I", new List<List<bool>>() { new List<bool>() { false, true, true, true, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, true, true, true, false } }); I know I could probably optimize this part by using a multi-dimensional array instead, but let's ignore that for now, this is for illustrative purposes. Every letter is exactly the same dimensions, 10px by 11px (10px by 11px is the actual dimensions of a character in my real program. I simplified this to 5px by 5px in this posting since it is much easier to "draw" the letters using 0's and 1's on a smaller image). Now when I give it a 10px by 11px part of an image to analyze with OCR, it would need to run on every single letter (26) on every single pixel (10 * 11 = 110) which would mean 2,860 (26 * 110) iterations (in the worst case) for every single character. I was thinking this could be optimized by defining the unique characteristics of every character. So, for example, let's assume that the set of characters only consists of 5 distinct letters: I, A, O, B, and L. These might look like the following: 01110 00100 00100 01100 01000 00100 01010 01010 01010 01000 00100 01110 01010 01100 01000 00100 01010 01010 01010 01000 01110 01010 00100 01100 01110 After analyzing the unique characteristics of every character, I can significantly reduce the number of tests that need to be performed to test for a character. For example, for the "I" character, I could define it's unique characteristics as having a black pixel in the coordinate (3, 0) since no other characters have that pixel as black. So instead of testing 110 pixels for a match on the "I" character, I reduced it to a 1 pixel test. This is what it might look like for all these characters: var LetterI = new OcrLetter() { Name = "I", BlackPixels = new List<Point>() { new Point (3, 0) } } var LetterA = new OcrLetter() { Name = "A", WhitePixels = new List<Point>() { new Point(2, 4) } } var LetterO = new OcrLetter() { Name = "O", BlackPixels = new List<Point>() { new Point(3, 2) }, WhitePixels = new List<Point>() { new Point(2, 2) } } var LetterB = new OcrLetter() { Name = "B", BlackPixels = new List<Point>() { new Point(3, 1) }, WhitePixels = new List<Point>() { new Point(3, 2) } } var LetterL = new OcrLetter() { Name = "L", BlackPixels = new List<Point>() { new Point(1, 1), new Point(3, 4) }, WhitePixels = new List<Point>() { new Point(2, 2) } } This is challenging to do manually for 5 characters and gets much harder the greater the amount of letters that are added. You also want to guarantee that you have the minimum set of unique characteristics of a letter since you want it to be optimized as much as possible. I want to create an algorithm that will identify the unique characteristics of all the letters and would generate similar code to that above. I would then use this optimized black/white matrix to identify characters. How do I take the 26 letters that have all their black/white pixels filled in (e.g. the CreateLetter code block) and convert them to an optimized set of unique characteristics that define a letter (e.g. the new OcrLetter() code block)? And how would I guarantee that it is the most efficient definition set of unique characteristics (e.g. instead of defining 6 points as the unique characteristics, there might be a way to do it with 1 or 2 points, as the letter "I" in my example was able to). An alternative solution I've come up with is using a hash table, which will reduce it from 2,860 iterations to 110 iterations, a 26 time reduction. This is how it might work: I would populate it with data similar to the following: Letters["01110 00100 00100 00100 01110"] = "I"; Letters["00100 01010 01110 01010 01010"] = "A"; Letters["00100 01010 01010 01010 00100"] = "O"; Letters["01100 01010 01100 01010 01100"] = "B"; Now when I reach a location in the image to process, I convert it to a string such as: "01110 00100 00100 00100 01110" and simply find it in the hash table. This solution seems very simple, however, this still requires 110 iterations to generate this string for each letter. In big O notation, the algorithm is the same since O(110N) = O(2860N) = O(N) for N letters to process on the page. However, it is still improved by a constant factor of 26, a significant improvement (e.g. instead of it taking 26 minutes, it would take 1 minute). Update: Most of the solutions provided so far have not addressed the issue of identifying the unique characteristics of a character and rather provide alternative solutions. I am still looking for this solution which, as far as I can tell, is the only way to achieve the fastest OCR processing. I just came up with a partial solution: For each pixel, in the grid, store the letters that have it as a black pixel. Using these letters: I A O B L 01110 00100 00100 01100 01000 00100 01010 01010 01010 01000 00100 01110 01010 01100 01000 00100 01010 01010 01010 01000 01110 01010 00100 01100 01110 You would have something like this: CreatePixel(new Point(0, 0), new List<Char>() { }); CreatePixel(new Point(1, 0), new List<Char>() { 'I', 'B', 'L' }); CreatePixel(new Point(2, 0), new List<Char>() { 'I', 'A', 'O', 'B' }); CreatePixel(new Point(3, 0), new List<Char>() { 'I' }); CreatePixel(new Point(4, 0), new List<Char>() { }); CreatePixel(new Point(0, 1), new List<Char>() { }); CreatePixel(new Point(1, 1), new List<Char>() { 'A', 'B', 'L' }); CreatePixel(new Point(2, 1), new List<Char>() { 'I' }); CreatePixel(new Point(3, 1), new List<Char>() { 'A', 'O', 'B' }); // ... CreatePixel(new Point(2, 2), new List<Char>() { 'I', 'A', 'B' }); CreatePixel(new Point(3, 2), new List<Char>() { 'A', 'O' }); // ... CreatePixel(new Point(2, 4), new List<Char>() { 'I', 'O', 'B', 'L' }); CreatePixel(new Point(3, 4), new List<Char>() { 'I', 'A', 'L' }); CreatePixel(new Point(4, 4), new List<Char>() { }); Now for every letter, in order to find the unique characteristics, you need to look at which buckets it belongs to, as well as the amount of other characters in the bucket. So let's take the example of "I". We go to all the buckets it belongs to (1,0; 2,0; 3,0; ...; 3,4) and see that the one with the least amount of other characters is (3,0). In fact, it only has 1 character, meaning it must be an "I" in this case, and we found our unique characteristic. You can also do the same for pixels that would be white. Notice that bucket (2,0) contains all the letters except for "L", this means that it could be used as a white pixel test. Similarly, (2,4) doesn't contain an 'A'. Buckets that either contain all the letters or none of the letters can be discarded immediately, since these pixels can't help define a unique characteristic (e.g. 1,1; 4,0; 0,1; 4,4). It gets trickier when you don't have a 1 pixel test for a letter, for example in the case of 'O' and 'B'. Let's walk through the test for 'O'... It's contained in the following buckets: // Bucket Count Letters // 2,0 4 I, A, O, B // 3,1 3 A, O, B // 3,2 2 A, O // 2,4 4 I, O, B, L Additionally, we also have a few white pixel tests that can help: (I only listed those that are missing at most 2). The Missing Count was calculated as (5 - Bucket.Count). // Bucket Missing Count Missing Letters // 1,0 2 A, O // 1,1 2 I, O // 2,2 2 O, L // 3,4 2 O, B So now we can take the shortest black pixel bucket (3,2) and see that when we test for (3,2) we know it is either an 'A' or an 'O'. So we need an easy way to tell the difference between an 'A' and an 'O'. We could either look for a black pixel bucket that contains 'O' but not 'A' (e.g. 2,4) or a white pixel bucket that contains an 'O' but not an 'A' (e.g. 1,1). Either of these could be used in combination with the (3,2) pixel to uniquely identify the letter 'O' with only 2 tests. This seems like a simple algorithm when there are 5 characters, but how would I do this when there are 26 letters and a lot more pixels overlapping? For example, let's say that after the (3,2) pixel test, it found 10 different characters that contain the pixel (and this was the least from all the buckets). Now I need to find differences from 9 other characters instead of only 1 other character. How would I achieve my goal of getting the least amount of checks as possible, and ensure that I am not running extraneous tests?

    Read the article

  • Code Golf: Seven Segments

    - by LiraNuna
    The challenge The shortest code by character count to generate seven segment display representation of a given hex number. Input Input is made out of digits [0-9] and hex characters in both lower and upper case [a-fA-F] only. There is no need to handle special cases. Output Output will be the seven segment representation of the input, using those ASCII faces: _ _ _ _ _ _ _ _ _ _ _ _ | | | _| _| |_| |_ |_ | |_| |_| |_| |_ | _| |_ |_ |_| | |_ _| | _| |_| | |_| _| | | |_| |_ |_| |_ | Restrictions The use of the following is forbidden: eval, exec, system, figlet, toilet and external libraries. Test cases: Input: deadbeef Output: _ _ _ _ _ _||_ |_| _||_ |_ |_ |_ |_||_ | ||_||_||_ |_ | Input: 4F790D59 Output: _ _ _ _ _ _ |_||_ ||_|| | _||_ |_| || | _||_||_| _| _| Code count includes input/output (i.e full program).

    Read the article

  • Why 2 GB memory limit when running in 64 bit Windows ?

    - by Roland Bengtsson
    I'm a member in a team that develop a Delphi application. The memory requirements are huge. 500 MB is normal but in some cases it got out of memory exception. The memory allocated in that cases is typically between 1000 - 1700 MB. We of course want 64-bits compiler but that won't happen now (and if it happens we also must convert to unicode, but that is another story...). My question is why is there a 2 GB memory limit per process when running in a 64 bit environment. The pointer is 32 bit so I think 4 GB would be the right limit. I use Delphi 2007.

    Read the article

  • How to detect if certain characters are at the end of an NSString?

    - by Sheehan Alam
    Let's assume I can have the following strings: "hey @john..." "@john, hello" "@john(hello)" I am tokenizing the string to get every word separated by a space: [myString componentsSeparatedByString:@" "]; My array of tokens now contain: @john... @john, @john(hello) For these cases. How can I make sure only @john is tokenized, while retaining the trailing characters: ... , (hello) Note: I would like to be able to handle all cases of characters at the end of a string. The above are just 3 examples.

    Read the article

  • Compile time string hashing

    - by Caspin
    I have read in few different places that using c++0x's new string literals it might be possible to compute a string's hash at compile time. However, no one seems to be ready to come out and say that it will be possible or how it would be done. Is this possible? What would the operator look like? I'm particularly interested use cases like this. void foo( const std::string& value ) { switch( std::hash(value) ) { case "one"_hash: one(); break; case "two"_hash: two(); break; /*many more cases*/ default: other(); break; } } Note: the compile time hash function doesn't have to look exactly as I've written it. I did my best to guess what the final solution would look like, but meta_hash<"string"_meta>::value could also be a viable solution.

    Read the article

  • OOP: how much program logic should be encapsulated within related objects/classes as methods?

    - by Andrew
    I have a simple program which can have an admin user or just a normal user. The program also has two classes: for UserAccount and AdminAccount. The things an admin will need to do (use cases) include Add_Account, Remove_Account, and so on. My question is, should I try to encapsulate these use-cases into the objects? Only someone who is an Admin, logged in with an AdminAccount, should be able to add and remove other accounts. I could have a class-less Sub-procedure that adds new UserAccount objects to the system and is called when an admin presses the 'Add Account' button. Alternatively, I could place that procedure as a method inside the AdminAccount object, and have the button event execute some code like 'Admin.AddUser(name, password).' I'm more inclined to go with the first option, but I'm not sure how far this OO business is supposed to go.

    Read the article

  • Testing install procedure of a program requiring administrative privileges

    - by Lucas Meijer
    I'm trying to write automated test, to ensure that the installer for my program works okay. The program can be installed for all users (requires admin privs), or for current user (does not require admin privs). The program can also autoupdate itself, which in some cases requires admin privileges, and in some cases doesn't. I'm looking for a way where I can have an automated test click "Yes, Allow" on the UAC dialogs, so I can write tests for all different scenarios, on many different operating systems, so that I can be confident when I make changes to the installer that I didn't break anything. Obviously, the installer process itself cannot do this. However, I control the complete machine, and could easily start some sort of daemon process with administrative rights, that the testprogram could make a socket connection to, to request it to "please click ok on the UAC now".

    Read the article

  • git- how to troubleshoot "cannot find command"

    - by Frank Schwieterman
    I need help getting git extensions to run with msysgit. I have had bad luck with extensions git-tfs and git-fetchall, in both cases it is the same problem. The addon will require a file to be placed where git can find it (git-tfs.exe and git-fetchall.sh). I understand this to mean the files need to be in a directory that is in the 'PATH' environment variable. In both cases I get stuck at this point: $ git-diffall bash: git-diffall: command not found When I run echo %PATH% from a regular command shell, it shows my path variable includes the directories where git-diffall and git-tfs are. How can I debug this, or am I missing something? Is there a way within msysgit to verify the command search path is what I expect?

    Read the article

  • ZooKeeper and RabbitMQ/Qpid together - overkill or a good combination?

    - by Chris Sears
    Greetings, I'm evaluating some components for a multi-data center distributed system. We're going to be using message queues (via either RabbitMQ or Qpid) so agents can make asynchronous requests to other agents without worrying about addressing, routing, load balancing or retransmission. In many cases, the agents will be interacting with components that were not designed for highly concurrent access, so locking and cross-agent coordination will be needed to avoid race conditions. Also, we'd like the system to automatically respond to agent or data center failures. With the above use cases in mind, ZooKeeper seemed like it might be a good fit. But I'm wondering if trying to use both ZK and message queuing is overkill. It seems like what Zookeeper does could be accomplished by my own cluster manager using AMQP messaging, but that would be hard to get really right. On the other hand, I've seen some examples where ZooKeeper was used to implement message queuing, but I think RabbitMQ/Qpid are a more natural fit for that. Has anyone out there used a combination like this? Thanks in advance, -Chris

    Read the article

  • Can Windows handle inheritance cross the 32-bit/64-bit boundary?

    - by TheBeardyMan
    Is it possible for a child process to inherit a handle from its parent process if one process is 32-bit and the other is 64-bit? HANDLE is a 64 bit type on Win64 and a 32 bit type on Win32, which suggests that even it were supposed to be possible in all cases, there would be some cases where it would fail: a 64-bit parent process, a 32-bit child process, and a handle that can't be represented in 32 bits. Or is naming the object the only way for a 32-bit process and a 64-bit process to get a handle for the same object?

    Read the article

  • Practices for Foreground/Background threads in .NET

    - by Andrei Taptunov
    I work with in-house legacy communication framework which exposes some high level abstractions. These abstractions are wrappers with some logic around .NET threads. When I looked at code I've noticed that some abstractions are wrappers around foreground threads while others are wrappers around background threads. The sad thing is that I don't see any logic why in some cases foreground threads are used and background in other cases. Are there any guidelines or patterns & practices when it's better to choose one over another on server side and client side (I believe there should be some difference)?

    Read the article

  • Precompile Lambda Expression Tree conversions as constants?

    - by Nathan
    It is fairly common to take an Expression tree, and convert it to some other form, such as a string representation (for example this question and this question, and I suspect Linq2Sql does something similar). In many cases, perhaps even most cases, the Expression tree conversion will always be the same, i.e. if I have a function public string GenerateSomeSql(Expression<Func<TResult, TProperty>> expression) then any call with the same argument will always return the same result for example: GenerateSomeSql(x => x.Age) //suppose this will always return "select Age from Person" GenerateSomeSql(x => x.Ssn) //suppose this will always return "select Ssn from Person" So, in essence, the function call with a particular argument is really just a constant, except time is wasted at runtime re-computing it continuously. Assuming, for the sake of argument, that the conversion was sufficiently complex to cause a noticeable performance hit, is there any way to pre-compile the function call into an actual constant?

    Read the article

  • Parse http GET and POST parameters from BaseHTTPHandler?

    - by ataylor
    BaseHTTPHandler from the BaseHTTPServer module doesn't seem to provide any convenient way to access http request parameters. What is the best way to parse the GET parameters from the path, and the POST parameters from the request body? Right now, I'm using this for GET: def do_GET(self): parsed_path = urlparse.urlparse(self.path) try: params = dict([p.split('=') for p in parsed_path[4].split('&')]) except: params = {} This works for most cases, but I'd like something more robust that handles encodings and cases like empty parameters properly. Ideally, I'd like something small and standalone, rather than a full web framework.

    Read the article

  • Iterables.find and Iterators.find - instead of throwing exception, get null

    - by mjlee
    I'm using google-collections and trying to find the first element that satisfies Predicate if not, return me 'null'. Unfortunately, Iterables.find and Iterators.find throws NoSuchElementException when no element is found. Now, I am forced to do Object found = null; if ( Iterators.any( newIterator(...) , my_predicate ) { found = Iterators.find( newIterator(...), my_predicate ) } I can surround by 'try/catch' and do the same thing but for my use-cases, I am going to encounter many cases where no-element is found. Is there a simpler way of doing this?

    Read the article

  • Give me a practical use-case of Multi-set

    - by Calm Storm
    I would like to know a few practical use-cases (if they are not related/tied to any programming language it will be better).I can associate Sets, Lists and Maps to practical use cases. For example if you wanted a glossary of a book where terms that you want are listed alphabetically and a location/page number is the value, you would use the collection TreeMap(OrderedMap which is a Map) Somehow, I can't associate MultiSets with any "practical" usecase. Does someone know of any uses? http://en.wikipedia.org/wiki/Multiset does not tell me enough :) PS: If you guys think this should be community-wiki'ed it is okay. The only reason I did not do it was "There is a clear objective way to answer this question".

    Read the article

  • Trying to understand strtok

    - by Karthick
    Consider the following snippet that uses strtok to split the string madddy. char* str = (char*) malloc(sizeof("Madddy")); strcpy(str,"Madddy"); char* tmp = strtok(str,"d"); std::cout<<tmp; do { std::cout<<tmp; tmp=strtok(NULL, "dddy"); }while(tmp!=NULL); It works fine, the output is Ma. But by modifying the strtok to the following, tmp=strtok(NULL, "ay"); The output becomes Madd. So how does strtok exactly work? I have this question because I expected strtok to take each and every character that is in the delimiter string to be taken as a delimiter. But in certain cases it is doing that way but in few cases, it is giving unexpected results. Could anyone help me understand this?

    Read the article

  • SQL Try catch purpose unclear

    - by PaN1C_Showt1Me
    Let's suppose I want to inform the application about what happened / returned the SQL server. Let's have this code block: BEGIN TRY -- Generate divide-by-zero error. SELECT 1/0; END TRY BEGIN CATCH SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_SEVERITY() AS ErrorSeverity, ERROR_STATE() as ErrorState, ERROR_PROCEDURE() as ErrorProcedure, ERROR_LINE() as ErrorLine, ERROR_MESSAGE() as ErrorMessage; END CATCH; GO and Let's have this code block: SELECT 1/0; My question is: Both return the division by zero error. What I don't understand clearly is that why I should surround it with the try catch clausule when I got that error in both cases ? Isn't it true that this error will be in both cases propagated to the client application ?

    Read the article

  • In Ruby, can the coerce() method know what operator it is that requires the help to coerce?

    - by Jian Lin
    In Ruby, it seems that a lot of coerce() help can be done by def coerce(something) [self, something] end that's is, when 3 + rational is needed, Fixnum 3 doesn't know how to handle adding a Rational, so it asks Rational#coerce for help by calling rational.coerce(3), and this coerce instance method will tell the caller: # I know how to handle rational + something, so I will return you the following: [self, something] # so that now you can invoke + on me, and I will deal with Fixnum to get an answer So what if most operators can use this method, but not when it is (a - b) != (b - a) situation? Can coerce() know which operator it is, and just handle those special cases, while just using the simple [self, something] to handle all the other cases where (a op b) == (b op a) ? (op is the operator).

    Read the article

  • is there a way to switch bash or zsh from emacs mode to vi mode with a keystroke

    - by Brandon
    I'd like to be able to switch temporarily from emacs mode to vi mode, since vi mode is sometimes better, but I'm usually half-way through typing something before I realize I want I don't want to switch permanently to vi mode, because I normally prefer emacs mode on the command line, mostly because it's what I'm used to, and over the years many of the keystrokes have become second nature. (As an editor I generally use emacs in viper mode, so that I can use both vi and emacs keystrokes, since I found myself accidentally using them in vi all the time, and screwing things up, and because in some cases I find vi keystrokes more memorable and handy, and in other cases emacs.)

    Read the article

< Previous Page | 229 230 231 232 233 234 235 236 237 238 239 240  | Next Page >