Search Results

Search found 552 results on 23 pages for 'hungarian notation'.

Page 23/23 | < Previous Page | 19 20 21 22 23 

  • Using the ASP.NET Cache to cache data in a Model or Business Object layer, without a dependency on System.Web in the layer - Part One.

    - by Rhames
    ASP.NET applications can make use of the System.Web.Caching.Cache object to cache data and prevent repeated expensive calls to a database or other store. However, ideally an application should make use of caching at the point where data is retrieved from the database, which typically is inside a Business Objects or Model layer. One of the key features of using a UI pattern such as Model-View-Presenter (MVP) or Model-View-Controller (MVC) is that the Model and Presenter (or Controller) layers are developed without any knowledge of the UI layer. Introducing a dependency on System.Web into the Model layer would break this independence of the Model from the View. This article gives a solution to this problem, using dependency injection to inject the caching implementation into the Model layer at runtime. This allows caching to be used within the Model layer, without any knowledge of the actual caching mechanism that will be used. Create a sample application to use the caching solution Create a test SQL Server database This solution uses a SQL Server database with the same Sales data used in my previous post on calculating running totals. The advantage of using this data is that it gives nice slow queries that will exaggerate the effect of using caching! To create the data, first create a new SQL database called CacheSample. Next run the following script to create the Sale table and populate it: USE CacheSample GO   CREATE TABLE Sale(DayCount smallint, Sales money) CREATE CLUSTERED INDEX ndx_DayCount ON Sale(DayCount) go INSERT Sale VALUES (1,120) INSERT Sale VALUES (2,60) INSERT Sale VALUES (3,125) INSERT Sale VALUES (4,40)   DECLARE @DayCount smallint, @Sales money SET @DayCount = 5 SET @Sales = 10   WHILE @DayCount < 5000  BEGIN  INSERT Sale VALUES (@DayCount,@Sales)  SET @DayCount = @DayCount + 1  SET @Sales = @Sales + 15  END Next create a stored procedure to calculate the running total, and return a specified number of rows from the Sale table, using the following script: USE [CacheSample] GO   SET ANSI_NULLS ON GO   SET QUOTED_IDENTIFIER ON GO   -- ============================================= -- Author:        Robin -- Create date: -- Description:   -- ============================================= CREATE PROCEDURE [dbo].[spGetRunningTotals]       -- Add the parameters for the stored procedure here       @HighestDayCount smallint = null AS BEGIN       -- SET NOCOUNT ON added to prevent extra result sets from       -- interfering with SELECT statements.       SET NOCOUNT ON;         IF @HighestDayCount IS NULL             SELECT @HighestDayCount = MAX(DayCount) FROM dbo.Sale                   DECLARE @SaleTbl TABLE (DayCount smallint, Sales money, RunningTotal money)         DECLARE @DayCount smallint,                   @Sales money,                   @RunningTotal money         SET @RunningTotal = 0       SET @DayCount = 0         DECLARE rt_cursor CURSOR       FOR       SELECT DayCount, Sales       FROM Sale       ORDER BY DayCount         OPEN rt_cursor         FETCH NEXT FROM rt_cursor INTO @DayCount,@Sales         WHILE @@FETCH_STATUS = 0 AND @DayCount <= @HighestDayCount        BEGIN        SET @RunningTotal = @RunningTotal + @Sales        INSERT @SaleTbl VALUES (@DayCount,@Sales,@RunningTotal)        FETCH NEXT FROM rt_cursor INTO @DayCount,@Sales        END         CLOSE rt_cursor       DEALLOCATE rt_cursor         SELECT DayCount, Sales, RunningTotal       FROM @SaleTbl   END   GO   Create the Sample ASP.NET application In Visual Studio create a new solution and add a class library project called CacheSample.BusinessObjects and an ASP.NET web application called CacheSample.UI. The CacheSample.BusinessObjects project will contain a single class to represent a Sale data item, with all the code to retrieve the sales from the database included in it for simplicity (normally I would at least have a separate Repository or other object that is responsible for retrieving data, and probably a data access layer as well, but for this sample I want to keep it simple). The C# code for the Sale class is shown below: using System; using System.Collections.Generic; using System.Data; using System.Data.SqlClient;   namespace CacheSample.BusinessObjects {     public class Sale     {         public Int16 DayCount { get; set; }         public decimal Sales { get; set; }         public decimal RunningTotal { get; set; }           public static IEnumerable<Sale> GetSales(int? highestDayCount)         {             List<Sale> sales = new List<Sale>();               SqlParameter highestDayCountParameter = new SqlParameter("@HighestDayCount", SqlDbType.SmallInt);             if (highestDayCount.HasValue)                 highestDayCountParameter.Value = highestDayCount;             else                 highestDayCountParameter.Value = DBNull.Value;               string connectionStr = System.Configuration.ConfigurationManager .ConnectionStrings["CacheSample"].ConnectionString;               using(SqlConnection sqlConn = new SqlConnection(connectionStr))             using (SqlCommand sqlCmd = sqlConn.CreateCommand())             {                 sqlCmd.CommandText = "spGetRunningTotals";                 sqlCmd.CommandType = CommandType.StoredProcedure;                 sqlCmd.Parameters.Add(highestDayCountParameter);                   sqlConn.Open();                   using (SqlDataReader dr = sqlCmd.ExecuteReader())                 {                     while (dr.Read())                     {                         Sale newSale = new Sale();                         newSale.DayCount = dr.GetInt16(0);                         newSale.Sales = dr.GetDecimal(1);                         newSale.RunningTotal = dr.GetDecimal(2);                           sales.Add(newSale);                     }                 }             }               return sales;         }     } }   The static GetSale() method makes a call to the spGetRunningTotals stored procedure and then reads each row from the returned SqlDataReader into an instance of the Sale class, it then returns a List of the Sale objects, as IEnnumerable<Sale>. A reference to System.Configuration needs to be added to the CacheSample.BusinessObjects project so that the connection string can be read from the web.config file. In the CacheSample.UI ASP.NET project, create a single web page called ShowSales.aspx, and make this the default start up page. This page will contain a single button to call the GetSales() method and a label to display the results. The html mark up and the C# code behind are shown below: ShowSales.aspx <%@ Page Language="C#" AutoEventWireup="true" CodeBehind="ShowSales.aspx.cs" Inherits="CacheSample.UI.ShowSales" %>   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">   <html xmlns="http://www.w3.org/1999/xhtml"> <head runat="server">     <title>Cache Sample - Show All Sales</title> </head> <body>     <form id="form1" runat="server">     <div>         <asp:Button ID="btnTest1" runat="server" onclick="btnTest1_Click"             Text="Get All Sales" />         &nbsp;&nbsp;&nbsp;         <asp:Label ID="lblResults" runat="server"></asp:Label>         </div>     </form> </body> </html>   ShowSales.aspx.cs using System; using System.Collections.Generic; using System.Linq; using System.Web; using System.Web.UI; using System.Web.UI.WebControls;   using CacheSample.BusinessObjects;   namespace CacheSample.UI {     public partial class ShowSales : System.Web.UI.Page     {         protected void Page_Load(object sender, EventArgs e)         {         }           protected void btnTest1_Click(object sender, EventArgs e)         {             System.Diagnostics.Stopwatch stopWatch = new System.Diagnostics.Stopwatch();             stopWatch.Start();               var sales = Sale.GetSales(null);               var lastSales = sales.Last();               stopWatch.Stop();               lblResults.Text = string.Format( "Count of Sales: {0}, Last DayCount: {1}, Total Sales: {2}. Query took {3} ms", sales.Count(), lastSales.DayCount, lastSales.RunningTotal, stopWatch.ElapsedMilliseconds);         }       } }   Finally we need to add a connection string to the CacheSample SQL Server database, called CacheSample, to the web.config file: <?xmlversion="1.0"?>   <configuration>    <connectionStrings>     <addname="CacheSample"          connectionString="data source=.\SQLEXPRESS;Integrated Security=SSPI;Initial Catalog=CacheSample"          providerName="System.Data.SqlClient" />  </connectionStrings>    <system.web>     <compilationdebug="true"targetFramework="4.0" />  </system.web>   </configuration>   Run the application and click the button a few times to see how long each call to the database takes. On my system, each query takes about 450ms. Next I shall look at a solution to use the ASP.NET caching to cache the data returned by the query, so that subsequent requests to the GetSales() method are much faster. Adding Data Caching Support I am going to create my caching support in a separate project called CacheSample.Caching, so the next step is to add a class library to the solution. We shall be using the application configuration to define the implementation of our caching system, so we need a reference to System.Configuration adding to the project. ICacheProvider<T> Interface The first step in adding caching to our application is to define an interface, called ICacheProvider, in the CacheSample.Caching project, with methods to retrieve any data from the cache or to retrieve the data from the data source if it is not present in the cache. Dependency Injection will then be used to inject an implementation of this interface at runtime, allowing the users of the interface (i.e. the CacheSample.BusinessObjects project) to be completely unaware of how the caching is actually implemented. As data of any type maybe retrieved from the data source, it makes sense to use generics in the interface, with a generic type parameter defining the data type associated with a particular instance of the cache interface implementation. The C# code for the ICacheProvider interface is shown below: using System; using System.Collections.Generic;   namespace CacheSample.Caching {     public interface ICacheProvider     {     }       public interface ICacheProvider<T> : ICacheProvider     {         T Fetch(string key, Func<T> retrieveData, DateTime? absoluteExpiry, TimeSpan? relativeExpiry);           IEnumerable<T> Fetch(string key, Func<IEnumerable<T>> retrieveData, DateTime? absoluteExpiry, TimeSpan? relativeExpiry);     } }   The empty non-generic interface will be used as a type in a Dictionary generic collection later to store instances of the ICacheProvider<T> implementation for reuse, I prefer to use a base interface when doing this, as I think the alternative of using object makes for less clear code. The ICacheProvider<T> interface defines two overloaded Fetch methods, the difference between these is that one will return a single instance of the type T and the other will return an IEnumerable<T>, providing support for easy caching of collections of data items. Both methods will take a key parameter, which will uniquely identify the cached data, a delegate of type Func<T> or Func<IEnumerable<T>> which will provide the code to retrieve the data from the store if it is not present in the cache, and absolute or relative expiry policies to define when a cached item should expire. Note that at present there is no support for cache dependencies, but I shall be showing a method of adding this in part two of this article. CacheProviderFactory Class We need a mechanism of creating instances of our ICacheProvider<T> interface, using Dependency Injection to get the implementation of the interface. To do this we shall create a CacheProviderFactory static class in the CacheSample.Caching project. This factory will provide a generic static method called GetCacheProvider<T>(), which shall return instances of ICacheProvider<T>. We can then call this factory method with the relevant data type (for example the Sale class in the CacheSample.BusinessObject project) to get a instance of ICacheProvider for that type (e.g. call CacheProviderFactory.GetCacheProvider<Sale>() to get the ICacheProvider<Sale> implementation). The C# code for the CacheProviderFactory is shown below: using System; using System.Collections.Generic;   using CacheSample.Caching.Configuration;   namespace CacheSample.Caching {     public static class CacheProviderFactory     {         private static Dictionary<Type, ICacheProvider> cacheProviders = new Dictionary<Type, ICacheProvider>();         private static object syncRoot = new object();           ///<summary>         /// Factory method to create or retrieve an implementation of the  /// ICacheProvider interface for type <typeparamref name="T"/>.         ///</summary>         ///<typeparam name="T">  /// The type that this cache provider instance will work with  ///</typeparam>         ///<returns>An instance of the implementation of ICacheProvider for type  ///<typeparamref name="T"/>, as specified by the application  /// configuration</returns>         public static ICacheProvider<T> GetCacheProvider<T>()         {             ICacheProvider<T> cacheProvider = null;             // Get the Type reference for the type parameter T             Type typeOfT = typeof(T);               // Lock the access to the cacheProviders dictionary             // so multiple threads can work with it             lock (syncRoot)             {                 // First check if an instance of the ICacheProvider implementation  // already exists in the cacheProviders dictionary for the type T                 if (cacheProviders.ContainsKey(typeOfT))                     cacheProvider = (ICacheProvider<T>)cacheProviders[typeOfT];                 else                 {                     // There is not already an instance of the ICacheProvider in       // cacheProviders for the type T                     // so we need to create one                       // Get the Type reference for the application's implementation of       // ICacheProvider from the configuration                     Type cacheProviderType = Type.GetType(CacheProviderConfigurationSection.Current. CacheProviderType);                     if (cacheProviderType != null)                     {                         // Now get a Type reference for the Cache Provider with the                         // type T generic parameter                         Type typeOfCacheProviderTypeForT = cacheProviderType.MakeGenericType(new Type[] { typeOfT });                         if (typeOfCacheProviderTypeForT != null)                         {                             // Create the instance of the Cache Provider and add it to // the cacheProviders dictionary for future use                             cacheProvider = (ICacheProvider<T>)Activator. CreateInstance(typeOfCacheProviderTypeForT);                             cacheProviders.Add(typeOfT, cacheProvider);                         }                     }                 }             }               return cacheProvider;                 }     } }   As this code uses Activator.CreateInstance() to create instances of the ICacheProvider<T> implementation, which is a slow process, the factory class maintains a Dictionary of the previously created instances so that a cache provider needs to be created only once for each type. The type of the implementation of ICacheProvider<T> is read from a custom configuration section in the application configuration file, via the CacheProviderConfigurationSection class, which is described below. CacheProviderConfigurationSection Class The implementation of ICacheProvider<T> will be specified in a custom configuration section in the application’s configuration. To handle this create a folder in the CacheSample.Caching project called Configuration, and add a class called CacheProviderConfigurationSection to this folder. This class will extend the System.Configuration.ConfigurationSection class, and will contain a single string property called CacheProviderType. The C# code for this class is shown below: using System; using System.Configuration;   namespace CacheSample.Caching.Configuration {     internal class CacheProviderConfigurationSection : ConfigurationSection     {         public static CacheProviderConfigurationSection Current         {             get             {                 return (CacheProviderConfigurationSection) ConfigurationManager.GetSection("cacheProvider");             }         }           [ConfigurationProperty("type", IsRequired=true)]         public string CacheProviderType         {             get             {                 return (string)this["type"];             }         }     } }   Adding Data Caching to the Sales Class We now have enough code in place to add caching to the GetSales() method in the CacheSample.BusinessObjects.Sale class, even though we do not yet have an implementation of the ICacheProvider<T> interface. We need to add a reference to the CacheSample.Caching project to CacheSample.BusinessObjects so that we can use the ICacheProvider<T> interface within the GetSales() method. Once the reference is added, we can first create a unique string key based on the method name and the parameter value, so that the same cache key is used for repeated calls to the method with the same parameter values. Then we get an instance of the cache provider for the Sales type, using the CacheProviderFactory, and pass the existing code to retrieve the data from the database as the retrievalMethod delegate in a call to the Cache Provider Fetch() method. The C# code for the modified GetSales() method is shown below: public static IEnumerable<Sale> GetSales(int? highestDayCount) {     string cacheKey = string.Format("CacheSample.BusinessObjects.GetSalesWithCache({0})", highestDayCount);       return CacheSample.Caching.CacheProviderFactory. GetCacheProvider<Sale>().Fetch(cacheKey,         delegate()         {             List<Sale> sales = new List<Sale>();               SqlParameter highestDayCountParameter = new SqlParameter("@HighestDayCount", SqlDbType.SmallInt);             if (highestDayCount.HasValue)                 highestDayCountParameter.Value = highestDayCount;             else                 highestDayCountParameter.Value = DBNull.Value;               string connectionStr = System.Configuration.ConfigurationManager. ConnectionStrings["CacheSample"].ConnectionString;               using (SqlConnection sqlConn = new SqlConnection(connectionStr))             using (SqlCommand sqlCmd = sqlConn.CreateCommand())             {                 sqlCmd.CommandText = "spGetRunningTotals";                 sqlCmd.CommandType = CommandType.StoredProcedure;                 sqlCmd.Parameters.Add(highestDayCountParameter);                   sqlConn.Open();                   using (SqlDataReader dr = sqlCmd.ExecuteReader())                 {                     while (dr.Read())                     {                         Sale newSale = new Sale();                         newSale.DayCount = dr.GetInt16(0);                         newSale.Sales = dr.GetDecimal(1);                         newSale.RunningTotal = dr.GetDecimal(2);                           sales.Add(newSale);                     }                 }             }               return sales;         },         null,         new TimeSpan(0, 10, 0)); }     This example passes the code to retrieve the Sales data from the database to the Cache Provider as an anonymous method, however it could also be written as a lambda. The main advantage of using an anonymous function (method or lambda) is that the code inside the anonymous function can access the parameters passed to the GetSales() method. Finally the absolute expiry is set to null, and the relative expiry set to 10 minutes, to indicate that the cache entry should be removed 10 minutes after the last request for the data. As the ICacheProvider<T> has a Fetch() method that returns IEnumerable<T>, we can simply return the results of the Fetch() method to the caller of the GetSales() method. This should be all that is needed for the GetSales() method to now retrieve data from a cache after the first time the data has be retrieved from the database. Implementing a ASP.NET Cache Provider The final step is to actually implement the ICacheProvider<T> interface, and add the implementation details to the web.config file for the dependency injection. The cache provider implementation needs to have access to System.Web. Therefore it could be placed in the CacheSample.UI project, or in its own project that has a reference to System.Web. Implementing the Cache Provider in a separate project is my favoured approach. Create a new project inside the solution called CacheSample.CacheProvider, and add references to System.Web and CacheSample.Caching to this project. Add a class to the project called AspNetCacheProvider. Make the class a generic class by adding the generic parameter <T> and indicate that the class implements ICacheProvider<T>. The C# code for the AspNetCacheProvider class is shown below: using System; using System.Collections.Generic; using System.Linq; using System.Web; using System.Web.Caching;   using CacheSample.Caching;   namespace CacheSample.CacheProvider {     public class AspNetCacheProvider<T> : ICacheProvider<T>     {         #region ICacheProvider<T> Members           public T Fetch(string key, Func<T> retrieveData, DateTime? absoluteExpiry, TimeSpan? relativeExpiry)         {             return FetchAndCache<T>(key, retrieveData, absoluteExpiry, relativeExpiry);         }           public IEnumerable<T> Fetch(string key, Func<IEnumerable<T>> retrieveData, DateTime? absoluteExpiry, TimeSpan? relativeExpiry)         {             return FetchAndCache<IEnumerable<T>>(key, retrieveData, absoluteExpiry, relativeExpiry);         }           #endregion           #region Helper Methods           private U FetchAndCache<U>(string key, Func<U> retrieveData, DateTime? absoluteExpiry, TimeSpan? relativeExpiry)         {             U value;             if (!TryGetValue<U>(key, out value))             {                 value = retrieveData();                 if (!absoluteExpiry.HasValue)                     absoluteExpiry = Cache.NoAbsoluteExpiration;                   if (!relativeExpiry.HasValue)                     relativeExpiry = Cache.NoSlidingExpiration;                   HttpContext.Current.Cache.Insert(key, value, null, absoluteExpiry.Value, relativeExpiry.Value);             }             return value;         }           private bool TryGetValue<U>(string key, out U value)         {             object cachedValue = HttpContext.Current.Cache.Get(key);             if (cachedValue == null)             {                 value = default(U);                 return false;             }             else             {                 try                 {                     value = (U)cachedValue;                     return true;                 }                 catch                 {                     value = default(U);                     return false;                 }             }         }           #endregion       } }   The two interface Fetch() methods call a private method called FetchAndCache(). This method first checks for a element in the HttpContext.Current.Cache with the specified cache key, and if so tries to cast this to the specified type (either T or IEnumerable<T>). If the cached element is found, the FetchAndCache() method simply returns it. If it is not found in the cache, the method calls the retrievalMethod delegate to get the data from the data source, and then adds this to the HttpContext.Current.Cache. The final step is to add the AspNetCacheProvider class to the relevant custom configuration section in the CacheSample.UI.Web.Config file. To do this there needs to be a <configSections> element added as the first element in <configuration>. This will match a custom section called <cacheProvider> with the CacheProviderConfigurationSection. Then we add a <cacheProvider> element, with a type property set to the fully qualified assembly name of the AspNetCacheProvider class, as shown below: <?xmlversion="1.0"?>   <configuration>  <configSections>     <sectionname="cacheProvider" type="CacheSample.Base.Configuration.CacheProviderConfigurationSection, CacheSample.Base" />  </configSections>    <connectionStrings>     <addname="CacheSample"          connectionString="data source=.\SQLEXPRESS;Integrated Security=SSPI;Initial Catalog=CacheSample"          providerName="System.Data.SqlClient" />  </connectionStrings>    <cacheProvidertype="CacheSample.CacheProvider.AspNetCacheProvider`1, CacheSample.CacheProvider, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null">  </cacheProvider>    <system.web>     <compilationdebug="true"targetFramework="4.0" />  </system.web>   </configuration>   One point to note is that the fully qualified assembly name of the AspNetCacheProvider class includes the notation `1 after the class name, which indicates that it is a generic class with a single generic type parameter. The CacheSample.UI project needs to have references added to CacheSample.Caching and CacheSample.CacheProvider so that the actual application is aware of the relevant cache provider implementation. Conclusion After implementing this solution, you should have a working cache provider mechanism, that will allow the middle and data access layers to implement caching support when retrieving data, without any knowledge of the actually caching implementation. If the UI is not ASP.NET based, if for example it is Winforms or WPF, the implementation of ICacheProvider<T> would be written around whatever technology is available. It could even be a standalone caching system that takes full responsibility for adding and removing items from a global store. The next part of this article will show how this caching mechanism may be extended to provide support for cache dependencies, such as the System.Web.Caching.SqlCacheDependency. Another possible extension would be to cache the cache provider implementations instead of storing them in a static Dictionary in the CacheProviderFactory. This would prevent a build up of seldom used cache providers in the application memory, as they could be removed from the cache if not used often enough, although in reality there are probably unlikely to be vast numbers of cache provider implementation instances, as most applications do not have a massive number of business object or model types.

    Read the article

  • Optimized OCR black/white pixel algorithm

    - by eagle
    I am writing a simple OCR solution for a finite set of characters. That is, I know the exact way all 26 letters in the alphabet will look like. I am using C# and am able to easily determine if a given pixel should be treated as black or white. I am generating a matrix of black/white pixels for every single character. So for example, the letter I (capital i), might look like the following: 01110 00100 00100 00100 01110 Note: all points, which I use later in this post, assume that the top left pixel is (0, 0), bottom right pixel is (4, 4). 1's represent black pixels, and 0's represent white pixels. I would create a corresponding matrix in C# like this: CreateLetter("I", new List<List<bool>>() { new List<bool>() { false, true, true, true, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, true, true, true, false } }); I know I could probably optimize this part by using a multi-dimensional array instead, but let's ignore that for now, this is for illustrative purposes. Every letter is exactly the same dimensions, 10px by 11px (10px by 11px is the actual dimensions of a character in my real program. I simplified this to 5px by 5px in this posting since it is much easier to "draw" the letters using 0's and 1's on a smaller image). Now when I give it a 10px by 11px part of an image to analyze with OCR, it would need to run on every single letter (26) on every single pixel (10 * 11 = 110) which would mean 2,860 (26 * 110) iterations (in the worst case) for every single character. I was thinking this could be optimized by defining the unique characteristics of every character. So, for example, let's assume that the set of characters only consists of 5 distinct letters: I, A, O, B, and L. These might look like the following: 01110 00100 00100 01100 01000 00100 01010 01010 01010 01000 00100 01110 01010 01100 01000 00100 01010 01010 01010 01000 01110 01010 00100 01100 01110 After analyzing the unique characteristics of every character, I can significantly reduce the number of tests that need to be performed to test for a character. For example, for the "I" character, I could define it's unique characteristics as having a black pixel in the coordinate (3, 0) since no other characters have that pixel as black. So instead of testing 110 pixels for a match on the "I" character, I reduced it to a 1 pixel test. This is what it might look like for all these characters: var LetterI = new OcrLetter() { Name = "I", BlackPixels = new List<Point>() { new Point (3, 0) } } var LetterA = new OcrLetter() { Name = "A", WhitePixels = new List<Point>() { new Point(2, 4) } } var LetterO = new OcrLetter() { Name = "O", BlackPixels = new List<Point>() { new Point(3, 2) }, WhitePixels = new List<Point>() { new Point(2, 2) } } var LetterB = new OcrLetter() { Name = "B", BlackPixels = new List<Point>() { new Point(3, 1) }, WhitePixels = new List<Point>() { new Point(3, 2) } } var LetterL = new OcrLetter() { Name = "L", BlackPixels = new List<Point>() { new Point(1, 1), new Point(3, 4) }, WhitePixels = new List<Point>() { new Point(2, 2) } } This is challenging to do manually for 5 characters and gets much harder the greater the amount of letters that are added. You also want to guarantee that you have the minimum set of unique characteristics of a letter since you want it to be optimized as much as possible. I want to create an algorithm that will identify the unique characteristics of all the letters and would generate similar code to that above. I would then use this optimized black/white matrix to identify characters. How do I take the 26 letters that have all their black/white pixels filled in (e.g. the CreateLetter code block) and convert them to an optimized set of unique characteristics that define a letter (e.g. the new OcrLetter() code block)? And how would I guarantee that it is the most efficient definition set of unique characteristics (e.g. instead of defining 6 points as the unique characteristics, there might be a way to do it with 1 or 2 points, as the letter "I" in my example was able to). An alternative solution I've come up with is using a hash table, which will reduce it from 2,860 iterations to 110 iterations, a 26 time reduction. This is how it might work: I would populate it with data similar to the following: Letters["01110 00100 00100 00100 01110"] = "I"; Letters["00100 01010 01110 01010 01010"] = "A"; Letters["00100 01010 01010 01010 00100"] = "O"; Letters["01100 01010 01100 01010 01100"] = "B"; Now when I reach a location in the image to process, I convert it to a string such as: "01110 00100 00100 00100 01110" and simply find it in the hash table. This solution seems very simple, however, this still requires 110 iterations to generate this string for each letter. In big O notation, the algorithm is the same since O(110N) = O(2860N) = O(N) for N letters to process on the page. However, it is still improved by a constant factor of 26, a significant improvement (e.g. instead of it taking 26 minutes, it would take 1 minute). Update: Most of the solutions provided so far have not addressed the issue of identifying the unique characteristics of a character and rather provide alternative solutions. I am still looking for this solution which, as far as I can tell, is the only way to achieve the fastest OCR processing. I just came up with a partial solution: For each pixel, in the grid, store the letters that have it as a black pixel. Using these letters: I A O B L 01110 00100 00100 01100 01000 00100 01010 01010 01010 01000 00100 01110 01010 01100 01000 00100 01010 01010 01010 01000 01110 01010 00100 01100 01110 You would have something like this: CreatePixel(new Point(0, 0), new List<Char>() { }); CreatePixel(new Point(1, 0), new List<Char>() { 'I', 'B', 'L' }); CreatePixel(new Point(2, 0), new List<Char>() { 'I', 'A', 'O', 'B' }); CreatePixel(new Point(3, 0), new List<Char>() { 'I' }); CreatePixel(new Point(4, 0), new List<Char>() { }); CreatePixel(new Point(0, 1), new List<Char>() { }); CreatePixel(new Point(1, 1), new List<Char>() { 'A', 'B', 'L' }); CreatePixel(new Point(2, 1), new List<Char>() { 'I' }); CreatePixel(new Point(3, 1), new List<Char>() { 'A', 'O', 'B' }); // ... CreatePixel(new Point(2, 2), new List<Char>() { 'I', 'A', 'B' }); CreatePixel(new Point(3, 2), new List<Char>() { 'A', 'O' }); // ... CreatePixel(new Point(2, 4), new List<Char>() { 'I', 'O', 'B', 'L' }); CreatePixel(new Point(3, 4), new List<Char>() { 'I', 'A', 'L' }); CreatePixel(new Point(4, 4), new List<Char>() { }); Now for every letter, in order to find the unique characteristics, you need to look at which buckets it belongs to, as well as the amount of other characters in the bucket. So let's take the example of "I". We go to all the buckets it belongs to (1,0; 2,0; 3,0; ...; 3,4) and see that the one with the least amount of other characters is (3,0). In fact, it only has 1 character, meaning it must be an "I" in this case, and we found our unique characteristic. You can also do the same for pixels that would be white. Notice that bucket (2,0) contains all the letters except for "L", this means that it could be used as a white pixel test. Similarly, (2,4) doesn't contain an 'A'. Buckets that either contain all the letters or none of the letters can be discarded immediately, since these pixels can't help define a unique characteristic (e.g. 1,1; 4,0; 0,1; 4,4). It gets trickier when you don't have a 1 pixel test for a letter, for example in the case of 'O' and 'B'. Let's walk through the test for 'O'... It's contained in the following buckets: // Bucket Count Letters // 2,0 4 I, A, O, B // 3,1 3 A, O, B // 3,2 2 A, O // 2,4 4 I, O, B, L Additionally, we also have a few white pixel tests that can help: (I only listed those that are missing at most 2). The Missing Count was calculated as (5 - Bucket.Count). // Bucket Missing Count Missing Letters // 1,0 2 A, O // 1,1 2 I, O // 2,2 2 O, L // 3,4 2 O, B So now we can take the shortest black pixel bucket (3,2) and see that when we test for (3,2) we know it is either an 'A' or an 'O'. So we need an easy way to tell the difference between an 'A' and an 'O'. We could either look for a black pixel bucket that contains 'O' but not 'A' (e.g. 2,4) or a white pixel bucket that contains an 'O' but not an 'A' (e.g. 1,1). Either of these could be used in combination with the (3,2) pixel to uniquely identify the letter 'O' with only 2 tests. This seems like a simple algorithm when there are 5 characters, but how would I do this when there are 26 letters and a lot more pixels overlapping? For example, let's say that after the (3,2) pixel test, it found 10 different characters that contain the pixel (and this was the least from all the buckets). Now I need to find differences from 9 other characters instead of only 1 other character. How would I achieve my goal of getting the least amount of checks as possible, and ensure that I am not running extraneous tests?

    Read the article

< Previous Page | 19 20 21 22 23