Search Results

Search found 75233 results on 3010 pages for 'data distribution service'.

Page 1/3010 | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Data Quality and Master Data Management Resources

    - by Dejan Sarka
    Many companies or organizations do regular data cleansing. When you cleanse the data, the data quality goes up to some higher level. The data quality level is determined by the amount of work invested in the cleansing. As time passes, the data quality deteriorates, and you need to repeat the cleansing process. If you spend an equal amount of effort as you did with the previous cleansing, you can expect the same level of data quality as you had after the previous cleansing. And then the data quality deteriorates over time again, and the cleansing process starts over and over again. The idea of Data Quality Services is to mitigate the cleansing process. While the amount of time you need to spend on cleansing decreases, you will achieve higher and higher levels of data quality. While cleansing, you learn what types of errors to expect, discover error patterns, find domains of correct values, etc. You don’t throw away this knowledge. You store it and use it to find and correct the same issues automatically during your next cleansing process. The following figure shows this graphically. The idea of master data management, which you can perform with Master Data Services (MDS), is to prevent data quality from deteriorating. Once you reach a particular quality level, the MDS application—together with the defined policies, people, and master data management processes—allow you to maintain this level permanently. This idea is shown in the following picture. OK, now you know what DQS and MDS are about. You can imagine the importance on maintaining the data quality. Here are some resources that help you preparing and executing the data quality (DQ) and master data management (MDM) activities. Books Dejan Sarka and Davide Mauri: Data Quality and Master Data Management with Microsoft SQL Server 2008 R2 – a general introduction to MDM, MDS, and data profiling. Matching explained in depth. Dejan Sarka, Matija Lah and Grega Jerkic: MCTS Self-Paced Training Kit (Exam 70-463): Building Data Warehouses with Microsoft SQL Server 2012 – I wrote quite a few chapters about DQ and MDM, and introduced also SQL Server 2012 DQS. Thomas Redman: Data Quality: The Field Guide – you should start with this book. Thomas Redman is the father of DQ and MDM. Tyler Graham: Microsoft SQL Server 2012 Master Data Services – MDS in depth from a product team mate. Arkady Maydanchik: Data Quality Assessment – data profiling in depth. Tamraparni Dasu, Theodore Johnson: Exploratory Data Mining and Data Cleaning – advanced data profiling with data mining. Forthcoming presentations I am presenting a DQS and MDM seminar at PASS SQL Rally Amsterdam 2013: Wednesday, November 6th, 2013: Enterprise Information Management with SQL Server 2012 – a good kick start to your first DQ and / or MDM project. Courses Data Quality and Master Data Management with SQL Server 2012 – I wrote a 2-day course for SolidQ. If you are interested in this course, which I could also deliver in a shorter seminar way, you can contact your closes SolidQ subsidiary, or, of course, me directly on addresses [email protected] or [email protected]. This course could also complement the existing courseware portfolio of training providers, which are welcome to contact me as well. Start improving the quality of your data now!

    Read the article

  • Service Discovery in WCF 4.0 – Part 1

    - by Shaun
    When designing a service oriented architecture (SOA) system, there will be a lot of services with many service contracts, endpoints and behaviors. Besides the client calling the service, in a large distributed system a service may invoke other services. In this case, one service might need to know the endpoints it invokes. This might not be a problem in a small system. But when you have more than 10 services this might be a problem. For example in my current product, there are around 10 services, such as the user authentication service, UI integration service, location service, license service, device monitor service, event monitor service, schedule job service, accounting service, player management service, etc..   Benefit of Discovery Service Since almost all my services need to invoke at least one other service. This would be a difficult task to make sure all services endpoints are configured correctly in every service. And furthermore, it would be a nightmare when a service changed its endpoint at runtime. Hence, we need a discovery service to remove the dependency (configuration dependency). A discovery service plays as a service dictionary which stores the relationship between the contracts and the endpoints for every service. By using the discovery service, when service X wants to invoke service Y, it just need to ask the discovery service where is service Y, then the discovery service will return all proper endpoints of service Y, then service X can use the endpoint to send the request to service Y. And when some services changed their endpoint address, all need to do is to update its records in the discovery service then all others will know its new endpoint. In WCF 4.0 Discovery it supports both managed proxy discovery mode and ad-hoc discovery mode. In ad-hoc mode there is no standalone discovery service. When a client wanted to invoke a service, it will broadcast an message (normally in UDP protocol) to the entire network with the service match criteria. All services which enabled the discovery behavior will receive this message and only those matched services will send their endpoint back to the client. The managed proxy discovery service works as I described above. In this post I will only cover the managed proxy mode, where there’s a discovery service. For more information about the ad-hoc mode please refer to the MSDN.   Service Announcement and Probe The main functionality of discovery service should be return the proper endpoint addresses back to the service who is looking for. In most cases the consume service (as a client) will send the contract which it wanted to request to the discovery service. And then the discovery service will find the endpoint and respond. Sometimes the contract and endpoint are not enough. It also contains versioning, extensions attributes. This post I will only cover the case includes contract and endpoint. When a client (or sometimes a service who need to invoke another service) need to connect to a target service, it will firstly request the discovery service through the “Probe” method with the criteria. Basically the criteria contains the contract type name of the target service. Then the discovery service will search its endpoint repository by the criteria. The repository might be a database, a distributed cache or a flat XML file. If it matches, the discovery service will grab the endpoint information (it’s called discovery endpoint metadata in WCF) and send back. And this is called “Probe”. Finally the client received the discovery endpoint metadata and will use the endpoint to connect to the target service. Besides the probe, discovery service should take the responsible to know there is a new service available when it goes online, as well as stopped when it goes offline. This feature is named “Announcement”. When a service started and stopped, it will announce to the discovery service. So the basic functionality of a discovery service should includes: 1, An endpoint which receive the service online message, and add the service endpoint information in the discovery repository. 2, An endpoint which receive the service offline message, and remove the service endpoint information from the discovery repository. 3, An endpoint which receive the client probe message, and return the matches service endpoints, and return the discovery endpoint metadata. WCF 4.0 discovery service just covers all these features in it's infrastructure classes.   Discovery Service in WCF 4.0 WCF 4.0 introduced a new assembly named System.ServiceModel.Discovery which has all necessary classes and interfaces to build a WS-Discovery compliant discovery service. It supports ad-hoc and managed proxy modes. For the case mentioned in this post, what we need to build is a standalone discovery service, which is the managed proxy discovery service mode. To build a managed discovery service in WCF 4.0 just create a new class inherits from the abstract class System.ServiceModel.Discovery.DiscoveryProxy. This class implemented and abstracted the procedures of service announcement and probe. And it exposes 8 abstract methods where we can implement our own endpoint register, unregister and find logic. These 8 methods are asynchronized, which means all invokes to the discovery service are asynchronously, for better service capability and performance. 1, OnBeginOnlineAnnouncement, OnEndOnlineAnnouncement: Invoked when a service sent the online announcement message. We need to add the endpoint information to the repository in this method. 2, OnBeginOfflineAnnouncement, OnEndOfflineAnnouncement: Invoked when a service sent the offline announcement message. We need to remove the endpoint information from the repository in this method. 3, OnBeginFind, OnEndFind: Invoked when a client sent the probe message that want to find the service endpoint information. We need to look for the proper endpoints by matching the client’s criteria through the repository in this method. 4, OnBeginResolve, OnEndResolve: Invoked then a client sent the resolve message. Different from the find method, when using resolve method the discovery service will return the exactly one service endpoint metadata to the client. In our example we will NOT implement this method.   Let’s create our own discovery service, inherit the base System.ServiceModel.Discovery.DiscoveryProxy. We also need to specify the service behavior in this class. Since the build-in discovery service host class only support the singleton mode, we must set its instance context mode to single. 1: using System; 2: using System.Collections.Generic; 3: using System.Linq; 4: using System.Text; 5: using System.ServiceModel.Discovery; 6: using System.ServiceModel; 7:  8: namespace Phare.Service 9: { 10: [ServiceBehavior(InstanceContextMode = InstanceContextMode.Single, ConcurrencyMode = ConcurrencyMode.Multiple)] 11: public class ManagedProxyDiscoveryService : DiscoveryProxy 12: { 13: protected override IAsyncResult OnBeginFind(FindRequestContext findRequestContext, AsyncCallback callback, object state) 14: { 15: throw new NotImplementedException(); 16: } 17:  18: protected override IAsyncResult OnBeginOfflineAnnouncement(DiscoveryMessageSequence messageSequence, EndpointDiscoveryMetadata endpointDiscoveryMetadata, AsyncCallback callback, object state) 19: { 20: throw new NotImplementedException(); 21: } 22:  23: protected override IAsyncResult OnBeginOnlineAnnouncement(DiscoveryMessageSequence messageSequence, EndpointDiscoveryMetadata endpointDiscoveryMetadata, AsyncCallback callback, object state) 24: { 25: throw new NotImplementedException(); 26: } 27:  28: protected override IAsyncResult OnBeginResolve(ResolveCriteria resolveCriteria, AsyncCallback callback, object state) 29: { 30: throw new NotImplementedException(); 31: } 32:  33: protected override void OnEndFind(IAsyncResult result) 34: { 35: throw new NotImplementedException(); 36: } 37:  38: protected override void OnEndOfflineAnnouncement(IAsyncResult result) 39: { 40: throw new NotImplementedException(); 41: } 42:  43: protected override void OnEndOnlineAnnouncement(IAsyncResult result) 44: { 45: throw new NotImplementedException(); 46: } 47:  48: protected override EndpointDiscoveryMetadata OnEndResolve(IAsyncResult result) 49: { 50: throw new NotImplementedException(); 51: } 52: } 53: } Then let’s implement the online, offline and find methods one by one. WCF discovery service gives us full flexibility to implement the endpoint add, remove and find logic. For the demo purpose we will use an internal dictionary to store the services’ endpoint metadata. In the next post we will see how to serialize and store these information in database. Define a concurrent dictionary inside the service class since our it will be used in the multiple threads scenario. 1: [ServiceBehavior(InstanceContextMode = InstanceContextMode.Single, ConcurrencyMode = ConcurrencyMode.Multiple)] 2: public class ManagedProxyDiscoveryService : DiscoveryProxy 3: { 4: private ConcurrentDictionary<EndpointAddress, EndpointDiscoveryMetadata> _services; 5:  6: public ManagedProxyDiscoveryService() 7: { 8: _services = new ConcurrentDictionary<EndpointAddress, EndpointDiscoveryMetadata>(); 9: } 10: } Then we can simply implement the logic of service online and offline. 1: protected override IAsyncResult OnBeginOnlineAnnouncement(DiscoveryMessageSequence messageSequence, EndpointDiscoveryMetadata endpointDiscoveryMetadata, AsyncCallback callback, object state) 2: { 3: _services.AddOrUpdate(endpointDiscoveryMetadata.Address, endpointDiscoveryMetadata, (key, value) => endpointDiscoveryMetadata); 4: return new OnOnlineAnnouncementAsyncResult(callback, state); 5: } 6:  7: protected override void OnEndOnlineAnnouncement(IAsyncResult result) 8: { 9: OnOnlineAnnouncementAsyncResult.End(result); 10: } 11:  12: protected override IAsyncResult OnBeginOfflineAnnouncement(DiscoveryMessageSequence messageSequence, EndpointDiscoveryMetadata endpointDiscoveryMetadata, AsyncCallback callback, object state) 13: { 14: EndpointDiscoveryMetadata endpoint = null; 15: _services.TryRemove(endpointDiscoveryMetadata.Address, out endpoint); 16: return new OnOfflineAnnouncementAsyncResult(callback, state); 17: } 18:  19: protected override void OnEndOfflineAnnouncement(IAsyncResult result) 20: { 21: OnOfflineAnnouncementAsyncResult.End(result); 22: } Regards the find method, the parameter FindRequestContext.Criteria has a method named IsMatch, which can be use for us to evaluate which service metadata is satisfied with the criteria. So the implementation of find method would be like this. 1: protected override IAsyncResult OnBeginFind(FindRequestContext findRequestContext, AsyncCallback callback, object state) 2: { 3: _services.Where(s => findRequestContext.Criteria.IsMatch(s.Value)) 4: .Select(s => s.Value) 5: .All(meta => 6: { 7: findRequestContext.AddMatchingEndpoint(meta); 8: return true; 9: }); 10: return new OnFindAsyncResult(callback, state); 11: } 12:  13: protected override void OnEndFind(IAsyncResult result) 14: { 15: OnFindAsyncResult.End(result); 16: } As you can see, we checked all endpoints metadata in repository by invoking the IsMatch method. Then add all proper endpoints metadata into the parameter. Finally since all these methods are asynchronized we need some AsyncResult classes as well. Below are the base class and the inherited classes used in previous methods. 1: using System; 2: using System.Collections.Generic; 3: using System.Linq; 4: using System.Text; 5: using System.Threading; 6:  7: namespace Phare.Service 8: { 9: abstract internal class AsyncResult : IAsyncResult 10: { 11: AsyncCallback callback; 12: bool completedSynchronously; 13: bool endCalled; 14: Exception exception; 15: bool isCompleted; 16: ManualResetEvent manualResetEvent; 17: object state; 18: object thisLock; 19:  20: protected AsyncResult(AsyncCallback callback, object state) 21: { 22: this.callback = callback; 23: this.state = state; 24: this.thisLock = new object(); 25: } 26:  27: public object AsyncState 28: { 29: get 30: { 31: return state; 32: } 33: } 34:  35: public WaitHandle AsyncWaitHandle 36: { 37: get 38: { 39: if (manualResetEvent != null) 40: { 41: return manualResetEvent; 42: } 43: lock (ThisLock) 44: { 45: if (manualResetEvent == null) 46: { 47: manualResetEvent = new ManualResetEvent(isCompleted); 48: } 49: } 50: return manualResetEvent; 51: } 52: } 53:  54: public bool CompletedSynchronously 55: { 56: get 57: { 58: return completedSynchronously; 59: } 60: } 61:  62: public bool IsCompleted 63: { 64: get 65: { 66: return isCompleted; 67: } 68: } 69:  70: object ThisLock 71: { 72: get 73: { 74: return this.thisLock; 75: } 76: } 77:  78: protected static TAsyncResult End<TAsyncResult>(IAsyncResult result) 79: where TAsyncResult : AsyncResult 80: { 81: if (result == null) 82: { 83: throw new ArgumentNullException("result"); 84: } 85:  86: TAsyncResult asyncResult = result as TAsyncResult; 87:  88: if (asyncResult == null) 89: { 90: throw new ArgumentException("Invalid async result.", "result"); 91: } 92:  93: if (asyncResult.endCalled) 94: { 95: throw new InvalidOperationException("Async object already ended."); 96: } 97:  98: asyncResult.endCalled = true; 99:  100: if (!asyncResult.isCompleted) 101: { 102: asyncResult.AsyncWaitHandle.WaitOne(); 103: } 104:  105: if (asyncResult.manualResetEvent != null) 106: { 107: asyncResult.manualResetEvent.Close(); 108: } 109:  110: if (asyncResult.exception != null) 111: { 112: throw asyncResult.exception; 113: } 114:  115: return asyncResult; 116: } 117:  118: protected void Complete(bool completedSynchronously) 119: { 120: if (isCompleted) 121: { 122: throw new InvalidOperationException("This async result is already completed."); 123: } 124:  125: this.completedSynchronously = completedSynchronously; 126:  127: if (completedSynchronously) 128: { 129: this.isCompleted = true; 130: } 131: else 132: { 133: lock (ThisLock) 134: { 135: this.isCompleted = true; 136: if (this.manualResetEvent != null) 137: { 138: this.manualResetEvent.Set(); 139: } 140: } 141: } 142:  143: if (callback != null) 144: { 145: callback(this); 146: } 147: } 148:  149: protected void Complete(bool completedSynchronously, Exception exception) 150: { 151: this.exception = exception; 152: Complete(completedSynchronously); 153: } 154: } 155: } 1: using System; 2: using System.Collections.Generic; 3: using System.Linq; 4: using System.Text; 5: using System.ServiceModel.Discovery; 6: using Phare.Service; 7:  8: namespace Phare.Service 9: { 10: internal sealed class OnOnlineAnnouncementAsyncResult : AsyncResult 11: { 12: public OnOnlineAnnouncementAsyncResult(AsyncCallback callback, object state) 13: : base(callback, state) 14: { 15: this.Complete(true); 16: } 17:  18: public static void End(IAsyncResult result) 19: { 20: AsyncResult.End<OnOnlineAnnouncementAsyncResult>(result); 21: } 22:  23: } 24:  25: sealed class OnOfflineAnnouncementAsyncResult : AsyncResult 26: { 27: public OnOfflineAnnouncementAsyncResult(AsyncCallback callback, object state) 28: : base(callback, state) 29: { 30: this.Complete(true); 31: } 32:  33: public static void End(IAsyncResult result) 34: { 35: AsyncResult.End<OnOfflineAnnouncementAsyncResult>(result); 36: } 37: } 38:  39: sealed class OnFindAsyncResult : AsyncResult 40: { 41: public OnFindAsyncResult(AsyncCallback callback, object state) 42: : base(callback, state) 43: { 44: this.Complete(true); 45: } 46:  47: public static void End(IAsyncResult result) 48: { 49: AsyncResult.End<OnFindAsyncResult>(result); 50: } 51: } 52:  53: sealed class OnResolveAsyncResult : AsyncResult 54: { 55: EndpointDiscoveryMetadata matchingEndpoint; 56:  57: public OnResolveAsyncResult(EndpointDiscoveryMetadata matchingEndpoint, AsyncCallback callback, object state) 58: : base(callback, state) 59: { 60: this.matchingEndpoint = matchingEndpoint; 61: this.Complete(true); 62: } 63:  64: public static EndpointDiscoveryMetadata End(IAsyncResult result) 65: { 66: OnResolveAsyncResult thisPtr = AsyncResult.End<OnResolveAsyncResult>(result); 67: return thisPtr.matchingEndpoint; 68: } 69: } 70: } Now we have finished the discovery service. The next step is to host it. The discovery service is a standard WCF service. So we can use ServiceHost on a console application, windows service, or in IIS as usual. The following code is how to host the discovery service we had just created in a console application. 1: static void Main(string[] args) 2: { 3: using (var host = new ServiceHost(new ManagedProxyDiscoveryService())) 4: { 5: host.Opened += (sender, e) => 6: { 7: host.Description.Endpoints.All((ep) => 8: { 9: Console.WriteLine(ep.ListenUri); 10: return true; 11: }); 12: }; 13:  14: try 15: { 16: // retrieve the announcement, probe endpoint and binding from configuration 17: var announcementEndpointAddress = new EndpointAddress(ConfigurationManager.AppSettings["announcementEndpointAddress"]); 18: var probeEndpointAddress = new EndpointAddress(ConfigurationManager.AppSettings["probeEndpointAddress"]); 19: var binding = Activator.CreateInstance(Type.GetType(ConfigurationManager.AppSettings["bindingType"], true, true)) as Binding; 20: var announcementEndpoint = new AnnouncementEndpoint(binding, announcementEndpointAddress); 21: var probeEndpoint = new DiscoveryEndpoint(binding, probeEndpointAddress); 22: probeEndpoint.IsSystemEndpoint = false; 23: // append the service endpoint for announcement and probe 24: host.AddServiceEndpoint(announcementEndpoint); 25: host.AddServiceEndpoint(probeEndpoint); 26:  27: host.Open(); 28:  29: Console.WriteLine("Press any key to exit."); 30: Console.ReadKey(); 31: } 32: catch (Exception ex) 33: { 34: Console.WriteLine(ex.ToString()); 35: } 36: } 37:  38: Console.WriteLine("Done."); 39: Console.ReadKey(); 40: } What we need to notice is that, the discovery service needs two endpoints for announcement and probe. In this example I just retrieve them from the configuration file. I also specified the binding of these two endpoints in configuration file as well. 1: <?xml version="1.0"?> 2: <configuration> 3: <startup> 4: <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.0"/> 5: </startup> 6: <appSettings> 7: <add key="announcementEndpointAddress" value="net.tcp://localhost:10010/announcement"/> 8: <add key="probeEndpointAddress" value="net.tcp://localhost:10011/probe"/> 9: <add key="bindingType" value="System.ServiceModel.NetTcpBinding, System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"/> 10: </appSettings> 11: </configuration> And this is the console screen when I ran my discovery service. As you can see there are two endpoints listening for announcement message and probe message.   Discoverable Service and Client Next, let’s create a WCF service that is discoverable, which means it can be found by the discovery service. To do so, we need to let the service send the online announcement message to the discovery service, as well as offline message before it shutdown. Just create a simple service which can make the incoming string to upper. The service contract and implementation would be like this. 1: [ServiceContract] 2: public interface IStringService 3: { 4: [OperationContract] 5: string ToUpper(string content); 6: } 1: public class StringService : IStringService 2: { 3: public string ToUpper(string content) 4: { 5: return content.ToUpper(); 6: } 7: } Then host this service in the console application. In order to make the discovery service easy to be tested the service address will be changed each time it’s started. 1: static void Main(string[] args) 2: { 3: var baseAddress = new Uri(string.Format("net.tcp://localhost:11001/stringservice/{0}/", Guid.NewGuid().ToString())); 4:  5: using (var host = new ServiceHost(typeof(StringService), baseAddress)) 6: { 7: host.Opened += (sender, e) => 8: { 9: Console.WriteLine("Service opened at {0}", host.Description.Endpoints.First().ListenUri); 10: }; 11:  12: host.AddServiceEndpoint(typeof(IStringService), new NetTcpBinding(), string.Empty); 13:  14: host.Open(); 15:  16: Console.WriteLine("Press any key to exit."); 17: Console.ReadKey(); 18: } 19: } Currently this service is NOT discoverable. We need to add a special service behavior so that it could send the online and offline message to the discovery service announcement endpoint when the host is opened and closed. WCF 4.0 introduced a service behavior named ServiceDiscoveryBehavior. When we specified the announcement endpoint address and appended it to the service behaviors this service will be discoverable. 1: var announcementAddress = new EndpointAddress(ConfigurationManager.AppSettings["announcementEndpointAddress"]); 2: var announcementBinding = Activator.CreateInstance(Type.GetType(ConfigurationManager.AppSettings["bindingType"], true, true)) as Binding; 3: var announcementEndpoint = new AnnouncementEndpoint(announcementBinding, announcementAddress); 4: var discoveryBehavior = new ServiceDiscoveryBehavior(); 5: discoveryBehavior.AnnouncementEndpoints.Add(announcementEndpoint); 6: host.Description.Behaviors.Add(discoveryBehavior); The ServiceDiscoveryBehavior utilizes the service extension and channel dispatcher to implement the online and offline announcement logic. In short, it injected the channel open and close procedure and send the online and offline message to the announcement endpoint.   On client side, when we have the discovery service, a client can invoke a service without knowing its endpoint. WCF discovery assembly provides a class named DiscoveryClient, which can be used to find the proper service endpoint by passing the criteria. In the code below I initialized the DiscoveryClient, specified the discovery service probe endpoint address. Then I created the find criteria by specifying the service contract I wanted to use and invoke the Find method. This will send the probe message to the discovery service and it will find the endpoints back to me. The discovery service will return all endpoints that matches the find criteria, which means in the result of the find method there might be more than one endpoints. In this example I just returned the first matched one back. In the next post I will show how to extend our discovery service to make it work like a service load balancer. 1: static EndpointAddress FindServiceEndpoint() 2: { 3: var probeEndpointAddress = new EndpointAddress(ConfigurationManager.AppSettings["probeEndpointAddress"]); 4: var probeBinding = Activator.CreateInstance(Type.GetType(ConfigurationManager.AppSettings["bindingType"], true, true)) as Binding; 5: var discoveryEndpoint = new DiscoveryEndpoint(probeBinding, probeEndpointAddress); 6:  7: EndpointAddress address = null; 8: FindResponse result = null; 9: using (var discoveryClient = new DiscoveryClient(discoveryEndpoint)) 10: { 11: result = discoveryClient.Find(new FindCriteria(typeof(IStringService))); 12: } 13:  14: if (result != null && result.Endpoints.Any()) 15: { 16: var endpointMetadata = result.Endpoints.First(); 17: address = endpointMetadata.Address; 18: } 19: return address; 20: } Once we probed the discovery service we will receive the endpoint. So in the client code we can created the channel factory from the endpoint and binding, and invoke to the service. When creating the client side channel factory we need to make sure that the client side binding should be the same as the service side. WCF discovery service can be used to find the endpoint for a service contract, but the binding is NOT included. This is because the binding was not in the WS-Discovery specification. In the next post I will demonstrate how to add the binding information into the discovery service. At that moment the client don’t need to create the binding by itself. Instead it will use the binding received from the discovery service. 1: static void Main(string[] args) 2: { 3: Console.WriteLine("Say something..."); 4: var content = Console.ReadLine(); 5: while (!string.IsNullOrWhiteSpace(content)) 6: { 7: Console.WriteLine("Finding the service endpoint..."); 8: var address = FindServiceEndpoint(); 9: if (address == null) 10: { 11: Console.WriteLine("There is no endpoint matches the criteria."); 12: } 13: else 14: { 15: Console.WriteLine("Found the endpoint {0}", address.Uri); 16:  17: var factory = new ChannelFactory<IStringService>(new NetTcpBinding(), address); 18: factory.Opened += (sender, e) => 19: { 20: Console.WriteLine("Connecting to {0}.", factory.Endpoint.ListenUri); 21: }; 22: var proxy = factory.CreateChannel(); 23: using (proxy as IDisposable) 24: { 25: Console.WriteLine("ToUpper: {0} => {1}", content, proxy.ToUpper(content)); 26: } 27: } 28:  29: Console.WriteLine("Say something..."); 30: content = Console.ReadLine(); 31: } 32: } Similarly, the discovery service probe endpoint and binding were defined in the configuration file. 1: <?xml version="1.0"?> 2: <configuration> 3: <startup> 4: <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.0"/> 5: </startup> 6: <appSettings> 7: <add key="announcementEndpointAddress" value="net.tcp://localhost:10010/announcement"/> 8: <add key="probeEndpointAddress" value="net.tcp://localhost:10011/probe"/> 9: <add key="bindingType" value="System.ServiceModel.NetTcpBinding, System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"/> 10: </appSettings> 11: </configuration> OK, now let’s have a test. Firstly start the discovery service, and then start our discoverable service. When it started it will announced to the discovery service and registered its endpoint into the repository, which is the local dictionary. And then start the client and type something. As you can see the client asked the discovery service for the endpoint and then establish the connection to the discoverable service. And more interesting, do NOT close the client console but terminate the discoverable service but press the enter key. This will make the service send the offline message to the discovery service. Then start the discoverable service again. Since we made it use a different address each time it started, currently it should be hosted on another address. If we enter something in the client we could see that it asked the discovery service and retrieve the new endpoint, and connect the the service.   Summary In this post I discussed the benefit of using the discovery service and the procedures of service announcement and probe. I also demonstrated how to leverage the WCF Discovery feature in WCF 4.0 to build a simple managed discovery service. For test purpose, in this example I used the in memory dictionary as the discovery endpoint metadata repository. And when finding I also just return the first matched endpoint back. I also hard coded the bindings between the discoverable service and the client. In next post I will show you how to solve the problem mentioned above, as well as some additional feature for production usage. You can download the code here.   Hope this helps, Shaun All documents and related graphics, codes are provided "AS IS" without warranty of any kind. Copyright © Shaun Ziyan Xu. This work is licensed under the Creative Commons License.

    Read the article

  • Itautec Accelerates Profitable High Tech Customer Service

    - by charles.knapp
    Itautec is a Brazilian-based global high technology products and services firm with strong performance in the global market of banking and commercial automation, with more than 2,300 global clients. It recently deployed Siebel CRM for sales, customer support, and field service. In the first year of use, Siebel CRM enabled a 30% growth in services revenue. Siebel CRM also reduced support costs. "Oracle's Siebel CRM has minimized costs and made our customer service more agile," said Adriano Rodrigues da Silva, IT Manager. "Before deployment, 95% of our customer service contacts were made by phone. Siebel CRM made it possible to expand' choices, so that now 55% of our customers contact our helpdesk through the newer communications channels." Read more here about Itautec's success, and learn more here about how Siebel CRM can help your firm to grow customer service revenues, improve service levels, and reduce costs.

    Read the article

  • Big Data – Learning Basics of Big Data in 21 Days – Bookmark

    - by Pinal Dave
    Earlier this month I had a great time to write Bascis of Big Data series. This series received great response and lots of good comments I have received, I am going to follow up this basics series with further in-depth series in near future. Here is the consolidated blog post where you can find all the 21 days blog posts together. Bookmark this page for future reference. Big Data – Beginning Big Data – Day 1 of 21 Big Data – What is Big Data – 3 Vs of Big Data – Volume, Velocity and Variety – Day 2 of 21 Big Data – Evolution of Big Data – Day 3 of 21 Big Data – Basics of Big Data Architecture – Day 4 of 21 Big Data – Buzz Words: What is NoSQL – Day 5 of 21 Big Data – Buzz Words: What is Hadoop – Day 6 of 21 Big Data – Buzz Words: What is MapReduce – Day 7 of 21 Big Data – Buzz Words: What is HDFS – Day 8 of 21 Big Data – Buzz Words: Importance of Relational Database in Big Data World – Day 9 of 21 Big Data – Buzz Words: What is NewSQL – Day 10 of 21 Big Data – Role of Cloud Computing in Big Data – Day 11 of 21 Big Data – Operational Databases Supporting Big Data – RDBMS and NoSQL – Day 12 of 21 Big Data – Operational Databases Supporting Big Data – Key-Value Pair Databases and Document Databases – Day 13 of 21 Big Data – Operational Databases Supporting Big Data – Columnar, Graph and Spatial Database – Day 14 of 21 Big DataData Mining with Hive – What is Hive? – What is HiveQL (HQL)? – Day 15 of 21 Big Data – Interacting with Hadoop – What is PIG? – What is PIG Latin? – Day 16 of 21 Big Data – Interacting with Hadoop – What is Sqoop? – What is Zookeeper? – Day 17 of 21 Big Data – Basics of Big Data Analytics – Day 18 of 21 Big Data – How to become a Data Scientist and Learn Data Science? – Day 19 of 21 Big Data – Various Learning Resources – How to Start with Big Data? – Day 20 of 21 Big Data – Final Wrap and What Next – Day 21 of 21 Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • Using a service registry that doesn’t suck part II: Dear registry, do you have to be a message broker?

    - by gsusx
    Continuing our series of posts about service registry patterns that suck, we decided to address one of the most common techniques that Service Oriented (SOA) governance tools use to enforce policies. Scenario Service registries and repositories serve typically as a mechanism for storing service policies that model behaviors such as security, trust, reliable messaging, SLAs, etc. This makes perfect sense given that SOA governance registries were conceived as a mechanism to store and manage the policies...(read more)

    Read the article

  • Data Quality Through Data Governance

    Data Quality Governance Data quality is very important to every organization, bad data cost an organization time, money, and resources that could be prevented if the proper governance was put in to place.  Data Governance Program Criteria: Support from Executive Management and all Business Units Data Stewardship Program  Cross Functional Team of Data Stewards Data Governance Committee Quality Structured Data It should go without saying but any successful project in today’s business world must get buy in from executive management and all stakeholders involved with the project. If management does not fully support a project because they see it is in there and the company’s best interest then they will remove/eliminate funding, resources and allocated time to work on the project. In essence they can render a project dead until it is official killed by the business. In addition, buy in from stake holders is also very important because they can cause delays increased spending in time, money and resources because they do not support a project. Data Stewardship programs are administered by a data steward manager who primary focus is to support, train and manage a cross functional data stewards team. A cross functional team of data stewards are pulled from various departments act to ensure that all systems work to ensure that an organization’s goals are achieved. Typically, data stewards are subject matter experts that act as mediators between their respective departments and IT. Data Quality Procedures Data Governance Committees are composed of data stewards, Upper management, IT Leadership and various subject matter experts depending on a company. The primary goal of this committee is to define strategic goals, coordinate activities, set data standards and offer data guidelines for the business. Data Quality Policies In 1997, Claudia Imhoff defined a Data Stewardship’s responsibility as to approve business naming standards, develop consistent data definitions, determine data aliases, develop standard calculations and derivations, document the business rules of the corporation, monitor the quality of the data in the data warehouse, define security requirements, and so forth. She further explains data stewards responsible for creating and enforcing polices on the following but not limited to issues. Resolving Data Integration Issues Determining Data Security Documenting Data Definitions, Calculations, Summarizations, etc. Maintaining/Updating Business Rules Analyzing and Improving Data Quality

    Read the article

  • SQL SERVER – Advanced Data Quality Services with Melissa Data – Azure Data Market

    - by pinaldave
    There has been much fanfare over the new SQL Server 2012, and especially around its new companion product Data Quality Services (DQS). Among the many new features is the addition of this integrated knowledge-driven product that enables data stewards everywhere to profile, match, and cleanse data. In addition to the homegrown rules that data stewards can design and implement, there are also connectors to third party providers that are hosted in the Azure Datamarket marketplace.  In this review, I leverage SQL Server 2012 Data Quality Services, and proceed to subscribe to a third party data cleansing product through the Datamarket to showcase this unique capability. Crucial Questions For the purposes of the review, I used a database I had in an Excel spreadsheet with name and address information. Upon a cursory inspection, there are miscellaneous problems with these records; some addresses are missing ZIP codes, others missing a city, and some records are slightly misspelled or have unparsed suites. With DQS, I can easily add a knowledge base to help standardize my values, such as for state abbreviations. But how do I know that my address is correct? And if my address is not correct, what should it be corrected to? The answer lies in a third party knowledge base by the acknowledged USPS certified address accuracy experts at Melissa Data. Reference Data Services Within DQS there is a handy feature to actually add reference data from many different third-party Reference Data Services (RDS) vendors. DQS simplifies the processes of cleansing, standardizing, and enriching data through custom rules and through service providers from the Azure Datamarket. A quick jump over to the Datamarket site shows me that there are a handful of providers that offer data directly through Data Quality Services. Upon subscribing to these services, one can attach a DQS domain or composite domain (fields in a record) to a reference data service provider, and begin using it to cleanse, standardize, and enrich that data. Besides what I am looking for (address correction and enrichment), it is possible to subscribe to a host of other services including geocoding, IP address reference, phone checking and enrichment, as well as name parsing, standardization, and genderization.  These capabilities extend the data quality that DQS has natively by quite a bit. For my current address correction review, I needed to first sign up to a reference data provider on the Azure Data Market site. For this example, I used Melissa Data’s Address Check Service. They offer free one-month trials, so if you wish to follow along, or need to add address quality to your own data, I encourage you to sign up with them. Once I subscribed to the desired Reference Data Provider, I navigated my browser to the Account Keys within My Account to view the generated account key, which I then inserted into the DQS Client – Configuration under the Administration area. Step by Step to Guide That was all it took to hook in the subscribed provider -Melissa Data- directly to my DQS Client. The next step was for me to attach and map in my Reference Data from the newly acquired reference data provider, to a domain in my knowledge base. On the DQS Client home screen, I selected “New Knowledge Base” under Knowledge Base Management on the left-hand side of the home screen. Under New Knowledge Base, I typed a Name and description of my new knowledge base, then proceeded to the Domain Management screen. Here I established a series of domains (fields) and then linked them all together as a composite domain (record set). Using the Create Domain button, I created the following domains according to the fields in my incoming data: Name Address Suite City State Zip I added a Suite column in my domain because Melissa Data has the ability to return missing Suites based on last name or company. And that’s a great benefit of using these third party providers, as they have data that the data steward would not normally have access to. The bottom line is, with these third party data providers, I can actually improve my data. Next, I created a composite domain (fulladdress) and added the (field) domains into the composite domain. This essentially groups our address fields together in a record to facilitate the full address cleansing they perform. I then selected my newly created composite domain and under the Reference Data tab, added my third party reference data provider –Melissa Data’s Address Check- and mapped in each domain that I had to the provider’s Schema. Now that my composite domain has been married to the Reference Data service, I can take the newly published knowledge base and create a project to cleanse and enrich my data. My next task was to create a new Data Quality project, mapping in my data source and matching it to the appropriate domain column, and then kick off the verification process. It took just a few minutes with some progress indicators indicating that it was working. When the process concluded, there was a helpful set of tabs that place the response records into categories: suggested; new; invalid; corrected (automatically); and correct. Accepting the suggestions provided by  Melissa Data allowed me to clean up all the records and flag the invalid ones. It is very apparent that DQS makes address data quality simplistic for any IT professional. Final Note As I have shown, DQS makes data quality very easy. Within minutes I was able to set up a data cleansing and enrichment routine within my data quality project, and ensure that my address data was clean, verified, and standardized against real reference data. As reviewed here, it’s easy to see how both SQL Server 2012 and DQS work to take what used to require a highly skilled developer, and empower an average business or database person to consume external services and clean data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Utility, T SQL, Technology Tagged: DQS

    Read the article

  • Tellago && Tellago Studios 2010

    - by gsusx
    With 2011 around the corner we, at Tellago and Tellago Studios , we have been spending a lot of times evaluating our successes and failures (yes those too ;)) of 2010 and delineating some of our goals and strategies for 2011. When I look at 2010 here are some of the things that quickly jump off the page: Growing Tellago by 300% Launching a brand new company: Tellago Studios Expanding our customer base Establishing our business intelligence practice http://tellago.com/what-we-say/events/business-intelligence...(read more)

    Read the article

  • Big Data – Is Big Data Relevant to me? – Big Data Questionnaires – Guest Post by Vinod Kumar

    - by Pinal Dave
    This guest post is by Vinod Kumar. Vinod Kumar has worked with SQL Server extensively since joining the industry over a decade ago. Working on various versions of SQL Server 7.0, Oracle 7.3 and other database technologies – he now works with the Microsoft Technology Center (MTC) as a Technology Architect. Let us read the blog post in Vinod’s own voice. I think the series from Pinal is a good one for anyone planning to start on Big Data journey from the basics. In my daily customer interactions this buzz of “Big Data” always comes up, I react generally saying – “Sir, do you really have a ‘Big Data’ problem or do you have a big Data problem?” Generally, there is a silence in the air when I ask this question. Data is everywhere in organizations – be it big data, small data, all data and for few it is bad data which is same as no data :). Wow, don’t discount me as someone who opposes “Big Data”, I am a big supporter as much as I am a critic of the abuse of this term by the people. In this post, I wanted to let my mind flow so that you can also think in the direction I want you to see these concepts. In any case, this is not an exhaustive dump of what is in my mind – but you will surely get the drift how I am going to question Big Data terms from customers!!! Is Big Data Relevant to me? Many of my customers talk to me like blank whiteboard with no idea – “why Big Data”. They want to jump into the bandwagon of technology and they want to decipher insights from their unexplored data a.k.a. unstructured data with structured data. So what are these industry scenario’s that come to mind? Here are some of them: Financials Fraud detection: Banks and Credit cards are monitoring your spending habits on real-time basis. Customer Segmentation: applies in every industry from Banking to Retail to Aviation to Utility and others where they deal with end customer who consume their products and services. Customer Sentiment Analysis: Responding to negative brand perception on social or amplify the positive perception. Sales and Marketing Campaign: Understand the impact and get closer to customer delight. Call Center Analysis: attempt to take unstructured voice recordings and analyze them for content and sentiment. Medical Reduce Re-admissions: How to build a proactive follow-up engagements with patients. Patient Monitoring: How to track Inpatient, Out-Patient, Emergency Visits, Intensive Care Units etc. Preventive Care: Disease identification and Risk stratification is a very crucial business function for medical. Claims fraud detection: There is no precise dollars that one can put here, but this is a big thing for the medical field. Retail Customer Sentiment Analysis, Customer Care Centers, Campaign Management. Supply Chain Analysis: Every sensors and RFID data can be tracked for warehouse space optimization. Location based marketing: Based on where a check-in happens retail stores can be optimize their marketing. Telecom Price optimization and Plans, Finding Customer churn, Customer loyalty programs Call Detail Record (CDR) Analysis, Network optimizations, User Location analysis Customer Behavior Analysis Insurance Fraud Detection & Analysis, Pricing based on customer Sentiment Analysis, Loyalty Management Agents Analysis, Customer Value Management This list can go on to other areas like Utility, Manufacturing, Travel, ITES etc. So as you can see, there are obviously interesting use cases for each of these industry verticals. These are just representative list. Where to start? A lot of times I try to quiz customers on a number of dimensions before starting a Big Data conversation. Are you getting the data you need the way you want it and in a timely manner? Can you get in and analyze the data you need? How quickly is IT to respond to your BI Requests? How easily can you get at the data that you need to run your business/department/project? How are you currently measuring your business? Can you get the data you need to react WITHIN THE QUARTER to impact behaviors to meet your numbers or is it always “rear-view mirror?” How are you measuring: The Brand Customer Sentiment Your Competition Your Pricing Your performance Supply Chain Efficiencies Predictive product / service positioning What are your key challenges of driving collaboration across your global business?  What the challenges in innovation? What challenges are you facing in getting more information out of your data? Note: Garbage-in is Garbage-out. Hold good for all reporting / analytics requirements Big Data POCs? A number of customers get into the realm of setting a small team to work on Big Data – well it is a great start from an understanding point of view, but I tend to ask a number of other questions to such customers. Some of these common questions are: To what degree is your advanced analytics (natural language processing, sentiment analysis, predictive analytics and classification) paired with your Big Data’s efforts? Do you have dedicated resources exploring the possibilities of advanced analytics in Big Data for your business line? Do you plan to employ machine learning technology while doing Advanced Analytics? How is Social Media being monitored in your organization? What is your ability to scale in terms of storage and processing power? Do you have a system in place to sort incoming data in near real time by potential value, data quality, and use frequency? Do you use event-driven architecture to manage incoming data? Do you have specialized data services that can accommodate different formats, security, and the management requirements of multiple data sources? Is your organization currently using or considering in-memory analytics? To what degree are you able to correlate data from your Big Data infrastructure with that from your enterprise data warehouse? Have you extended the role of Data Stewards to include ownership of big data components? Do you prioritize data quality based on the source system (that is Facebook/Twitter data has lower quality thresholds than radio frequency identification (RFID) for a tracking system)? Do your retention policies consider the different legal responsibilities for storing Big Data for a specific amount of time? Do Data Scientists work in close collaboration with Data Stewards to ensure data quality? How is access to attributes of Big Data being given out in the organization? Are roles related to Big Data (Advanced Analyst, Data Scientist) clearly defined? How involved is risk management in the Big Data governance process? Is there a set of documented policies regarding Big Data governance? Is there an enforcement mechanism or approach to ensure that policies are followed? Who is the key sponsor for your Big Data governance program? (The CIO is best) Do you have defined policies surrounding the use of social media data for potential employees and customers, as well as the use of customer Geo-location data? How accessible are complex analytic routines to your user base? What is the level of involvement with outside vendors and third parties in regard to the planning and execution of Big Data projects? What programming technologies are utilized by your data warehouse/BI staff when working with Big Data? These are some of the important questions I ask each customer who is actively evaluating Big Data trends for their organizations. These questions give you a sense of direction where to start, what to use, how to secure, how to analyze and more. Sign off Any Big data is analysis is incomplete without a compelling story. The best way to understand this is to watch Hans Rosling – Gapminder (2:17 to 6:06) videos about the third world myths. Don’t get overwhelmed with the Big Data buzz word, the destination to what your data speaks is important. In this blog post, we did not particularly look at any Big Data technologies. This is a set of questionnaire one needs to keep in mind as they embark their journey of Big Data. I did write some of the basics in my blog: Big Data – Big Hype yet Big Opportunity. Do let me know if these questions make sense?  Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • Big Data – Basics of Big Data Architecture – Day 4 of 21

    - by Pinal Dave
    In yesterday’s blog post we understood how Big Data evolution happened. Today we will understand basics of the Big Data Architecture. Big Data Cycle Just like every other database related applications, bit data project have its development cycle. Though three Vs (link) for sure plays an important role in deciding the architecture of the Big Data projects. Just like every other project Big Data project also goes to similar phases of the data capturing, transforming, integrating, analyzing and building actionable reporting on the top of  the data. While the process looks almost same but due to the nature of the data the architecture is often totally different. Here are few of the question which everyone should ask before going ahead with Big Data architecture. Questions to Ask How big is your total database? What is your requirement of the reporting in terms of time – real time, semi real time or at frequent interval? How important is the data availability and what is the plan for disaster recovery? What are the plans for network and physical security of the data? What platform will be the driving force behind data and what are different service level agreements for the infrastructure? This are just basic questions but based on your application and business need you should come up with the custom list of the question to ask. As I mentioned earlier this question may look quite simple but the answer will not be simple. When we are talking about Big Data implementation there are many other important aspects which we have to consider when we decide to go for the architecture. Building Blocks of Big Data Architecture It is absolutely impossible to discuss and nail down the most optimal architecture for any Big Data Solution in a single blog post, however, we can discuss the basic building blocks of big data architecture. Here is the image which I have built to explain how the building blocks of the Big Data architecture works. Above image gives good overview of how in Big Data Architecture various components are associated with each other. In Big Data various different data sources are part of the architecture hence extract, transform and integration are one of the most essential layers of the architecture. Most of the data is stored in relational as well as non relational data marts and data warehousing solutions. As per the business need various data are processed as well converted to proper reports and visualizations for end users. Just like software the hardware is almost the most important part of the Big Data Architecture. In the big data architecture hardware infrastructure is extremely important and failure over instances as well as redundant physical infrastructure is usually implemented. NoSQL in Data Management NoSQL is a very famous buzz word and it really means Not Relational SQL or Not Only SQL. This is because in Big Data Architecture the data is in any format. It can be unstructured, relational or in any other format or from any other data source. To bring all the data together relational technology is not enough, hence new tools, architecture and other algorithms are invented which takes care of all the kind of data. This is collectively called NoSQL. Tomorrow Next four days we will answer the Buzz Words – Hadoop. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • Big Data – Evolution of Big Data – Day 3 of 21

    - by Pinal Dave
    In yesterday’s blog post we answered what is the Big Data. Today we will understand why and how the evolution of Big Data has happened. Though the answer is very simple, I would like to tell it in the form of a history lesson. Data in Flat File In earlier days data was stored in the flat file and there was no structure in the flat file.  If any data has to be retrieved from the flat file it was a project by itself. There was no possibility of retrieving the data efficiently and data integrity has been just a term discussed without any modeling or structure around. Database residing in the flat file had more issues than we would like to discuss in today’s world. It was more like a nightmare when there was any data processing involved in the application. Though, applications developed at that time were also not that advanced the need of the data was always there and there was always need of proper data management. Edgar F Codd and 12 Rules Edgar Frank Codd was a British computer scientist who, while working for IBM, invented the relational model for database management, the theoretical basis for relational databases. He presented 12 rules for the Relational Database and suddenly the chaotic world of the database seems to see discipline in the rules. Relational Database was a promising land for all the unstructured database users. Relational Database brought into the relationship between data as well improved the performance of the data retrieval. Database world had immediately seen a major transformation and every single vendors and database users suddenly started to adopt the relational database models. Relational Database Management Systems Since Edgar F Codd proposed 12 rules for the RBDMS there were many different vendors who started them to build applications and tools to support the relationship between database. This was indeed a learning curve for many of the developer who had never worked before with the modeling of the database. However, as time passed by pretty much everybody accepted the relationship of the database and started to evolve product which performs its best with the boundaries of the RDBMS concepts. This was the best era for the databases and it gave the world extreme experts as well as some of the best products. The Entity Relationship model was also evolved at the same time. In software engineering, an Entity–relationship model (ER model) is a data model for describing a database in an abstract way. Enormous Data Growth Well, everything was going fine with the RDBMS in the database world. As there were no major challenges the adoption of the RDBMS applications and tools was pretty much universal. There was a race at times to make the developer’s life much easier with the RDBMS management tools. Due to the extreme popularity and easy to use system pretty much every data was stored in the RDBMS system. New age applications were built and social media took the world by the storm. Every organizations was feeling pressure to provide the best experience for their users based the data they had with them. While this was all going on at the same time data was growing pretty much every organization and application. Data Warehousing The enormous data growth now presented a big challenge for the organizations who wanted to build intelligent systems based on the data and provide near real time superior user experience to their customers. Various organizations immediately start building data warehousing solutions where the data was stored and processed. The trend of the business intelligence becomes the need of everyday. Data was received from the transaction system and overnight was processed to build intelligent reports from it. Though this is a great solution it has its own set of challenges. The relational database model and data warehousing concepts are all built with keeping traditional relational database modeling in the mind and it still has many challenges when unstructured data was present. Interesting Challenge Every organization had expertise to manage structured data but the world had already changed to unstructured data. There was intelligence in the videos, photos, SMS, text, social media messages and various other data sources. All of these needed to now bring to a single platform and build a uniform system which does what businesses need. The way we do business has also been changed. There was a time when user only got the features what technology supported, however, now users ask for the feature and technology is built to support the same. The need of the real time intelligence from the fast paced data flow is now becoming a necessity. Large amount (Volume) of difference (Variety) of high speed data (Velocity) is the properties of the data. The traditional database system has limits to resolve the challenges this new kind of the data presents. Hence the need of the Big Data Science. We need innovation in how we handle and manage data. We need creative ways to capture data and present to users. Big Data is Reality! Tomorrow In tomorrow’s blog post we will try to answer discuss Basics of Big Data Architecture. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • Working with Temporal Data in SQL Server

    - by Dejan Sarka
    My third Pluralsight course, Working with Temporal Data in SQL Server, is published. I am really proud on the second part of the course, where I discuss optimization of temporal queries. This was a nearly impossible task for decades. First solutions appeared only lately. I present all together six solutions (and one more that is not a solution), and I invented four of them. http://pluralsight.com/training/Courses/TableOfContents/working-with-temporal-data-sql-server

    Read the article

  • New Communications Industry Data Model with "Factory Installed" Predictive Analytics using Oracle Da

    - by charlie.berger
    Oracle Introduces Oracle Communications Data Model to Provide Actionable Insight for Communications Service Providers   We've integrated pre-installed analytical methodologies with the new Oracle Communications Data Model to deliver automated, simple, yet powerful predictive analytics solutions for customers.  Churn, sentiment analysis, identifying customer segments - all things that can be anticipated and hence, preconcieved and implemented inside an applications.  Read on for more information! TM Forum Management World, Nice, France - 18 May 2010 News Facts To help communications service providers (CSPs) manage and analyze rapidly growing data volumes cost effectively, Oracle today introduced the Oracle Communications Data Model. With the Oracle Communications Data Model, CSPs can achieve rapid time to value by quickly implementing a standards-based enterprise data warehouse that features communications industry-specific reporting, analytics and data mining. The combination of the Oracle Communications Data Model, Oracle Exadata and the Oracle Business Intelligence (BI) Foundation represents the most comprehensive data warehouse and BI solution for the communications industry. Also announced today, Hong Kong Broadband Network enhanced their data warehouse system, going live on Oracle Communications Data Model in three months. The leading provider increased its subscriber base by 37 percent in six months and reduced customer churn to less than one percent. Product Details Oracle Communications Data Model provides industry-specific schema and embedded analytics that address key areas such as customer management, marketing segmentation, product development and network health. CSPs can efficiently capture and monitor critical data and transform it into actionable information to support development and delivery of next-generation services using: More than 1,300 industry-specific measurements and key performance indicators (KPIs) such as network reliability statistics, provisioning metrics and customer churn propensity. Embedded OLAP cubes for extremely fast dimensional analysis of business information. Embedded data mining models for sophisticated trending and predictive analysis. Support for multiple lines of business, such as cable, mobile, wireline and Internet, which can be easily extended to support future requirements. With Oracle Communications Data Model, CSPs can jump start the implementation of a communications data warehouse in line with communications-industry standards including the TM Forum Information Framework (SID), formerly known as the Shared Information Model. Oracle Communications Data Model is optimized for any Oracle Database 11g platform, including Oracle Exadata, which can improve call data record query performance by 10x or more. Supporting Quotes "Oracle Communications Data Model covers a wide range of business areas that are relevant to modern communications service providers and is a comprehensive solution - with its data model and pre-packaged templates including BI dashboards, KPIs, OLAP cubes and mining models. It helps us save a great deal of time in building and implementing a customized data warehouse and enables us to leverage the advanced analytics quickly and more effectively," said Yasuki Hayashi, executive manager, NTT Comware Corporation. "Data volumes will only continue to grow as communications service providers expand next-generation networks, deploy new services and adopt new business models. They will increasingly need efficient, reliable data warehouses to capture key insights on data such as customer value, network value and churn probability. With the Oracle Communications Data Model, Oracle has demonstrated its commitment to meeting these needs by delivering data warehouse tools designed to fill communications industry-specific needs," said Elisabeth Rainge, program director, Network Software, IDC. "The TM Forum Conformance Mark provides reassurance to customers seeking standards-based, and therefore, cost-effective and flexible solutions. TM Forum is extremely pleased to work with Oracle to certify its Oracle Communications Data Model solution. Upon successful completion, this certification will represent the broadest and most complete implementation of the TM Forum Information Framework to date, with more than 130 aggregate business entities," said Keith Willetts, chairman and chief executive officer, TM Forum. Supporting Resources Oracle Communications Oracle Communications Data Model Data Sheet Oracle Communications Data Model Podcast Oracle Data Warehousing Oracle Communications on YouTube Oracle Communications on Delicious Oracle Communications on Facebook Oracle Communications on Twitter Oracle Communications on LinkedIn Oracle Database on Twitter The Data Warehouse Insider Blog

    Read the article

  • Integration Patterns with Azure Service Bus Relay, Part 1: Exposing the on-premise service

    - by Elton Stoneman
    We're in the process of delivering an enabling project to expose on-premise WCF services securely to Internet consumers. The Azure Service Bus Relay is doing the clever stuff, we register our on-premise service with Azure, consumers call into our .servicebus.windows.net namespace, and their requests are relayed and serviced on-premise. In theory it's all wonderfully simple; by using the relay we get lots of protocol options, free HTTPS and load balancing, and by integrating to ACS we get plenty of security options. Part of our delivery is a suite of sample consumers for the service - .NET, jQuery, PHP - and this set of posts will cover setting up the service and the consumers. Part 1: Exposing the on-premise service In theory, this is ultra-straightforward. In practice, and on a dev laptop it is - but in a corporate network with firewalls and proxies, it isn't, so we'll walkthrough some of the pitfalls. Note that I'm using the "old" Azure portal which will soon be out of date, but the new shiny portal should have the same steps available and be easier to use. We start with a simple WCF service which takes a string as input, reverses the string and returns it. The Part 1 version of the code is on GitHub here: on GitHub here: IPASBR Part 1. Configuring Azure Service Bus Start by logging into the Azure portal and registering a Service Bus namespace which will be our endpoint in the cloud. Give it a globally unique name, set it up somewhere near you (if you’re in Europe, remember Europe (North) is Ireland, and Europe (West) is the Netherlands), and  enable ACS integration by ticking "Access Control" as a service: Authenticating and authorizing to ACS When we try to register our on-premise service as a listener for the Service Bus endpoint, we need to supply credentials, which means only trusted service providers can act as listeners. We can use the default "owner" credentials, but that has admin permissions so a dedicated service account is better (Neil Mackenzie has a good post On Not Using owner with the Azure AppFabric Service Bus with lots of permission details). Click on "Access Control Service" for the namespace, navigate to Service Identities and add a new one. Give the new account a sensible name and description: Let ACS generate a symmetric key for you (this will be the shared secret we use in the on-premise service to authenticate as a listener), but be sure to set the expiration date to something usable. The portal defaults to expiring new identities after 1 year - but when your year is up *your identity will expire without warning* and everything will stop working. In production, you'll need governance to manage identity expiration and a process to make sure you renew identities and roll new keys regularly. The new service identity needs to be authorized to listen on the service bus endpoint. This is done through claim mapping in ACS - we'll set up a rule that says if the nameidentifier in the input claims has the value serviceProvider, in the output we'll have an action claim with the value Listen. In the ACS portal you'll see that there is already a Relying Party Application set up for ServiceBus, which has a Default rule group. Edit the rule group and click Add to add this new rule: The values to use are: Issuer: Access Control Service Input claim type: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier Input claim value: serviceProvider Output claim type: net.windows.servicebus.action Output claim value: Listen When your service namespace and identity are set up, open the Part 1 solution and put your own namespace, service identity name and secret key into the file AzureConnectionDetails.xml in Solution Items, e.g: <azure namespace="sixeyed-ipasbr">    <!-- ACS credentials for the listening service (Part1):-->   <service identityName="serviceProvider"            symmetricKey="nuR2tHhlrTCqf4YwjT2RA2BZ/+xa23euaRJNLh1a/V4="/>  </azure> Build the solution, and the T4 template will generate the Web.config for the service project with your Azure details in the transportClientEndpointBehavior:           <behavior name="SharedSecret">             <transportClientEndpointBehavior credentialType="SharedSecret">               <clientCredentials>                 <sharedSecret issuerName="serviceProvider"                               issuerSecret="nuR2tHhlrTCqf4YwjT2RA2BZ/+xa23euaRJNLh1a/V4="/>               </clientCredentials>             </transportClientEndpointBehavior>           </behavior> , and your service namespace in the Azure endpoint:         <!-- Azure Service Bus endpoints -->          <endpoint address="sb://sixeyed-ipasbr.servicebus.windows.net/net"                   binding="netTcpRelayBinding"                   contract="Sixeyed.Ipasbr.Services.IFormatService"                   behaviorConfiguration="SharedSecret">         </endpoint> The sample project is hosted in IIS, but it won't register with Azure until the service is activated. Typically you'd install AppFabric 1.1 for Widnows Server and set the service to auto-start in IIS, but for dev just navigate to the local REST URL, which will activate the service and register it with Azure. Testing the service locally As well as an Azure endpoint, the service has a WebHttpBinding for local REST access:         <!-- local REST endpoint for internal use -->         <endpoint address="rest"                   binding="webHttpBinding"                   behaviorConfiguration="RESTBehavior"                   contract="Sixeyed.Ipasbr.Services.IFormatService" /> Build the service, then navigate to: http://localhost/Sixeyed.Ipasbr.Services/FormatService.svc/rest/reverse?string=abc123 - and you should see the reversed string response: If your network allows it, you'll get the expected response as before, but in the background your service will also be listening in the cloud. Good stuff! Who needs network security? Onto the next post for consuming the service with the netTcpRelayBinding.  Setting up network access to Azure But, if you get an error, it's because your network is secured and it's doing something to stop the relay working. The Service Bus relay bindings try to use direct TCP connections to Azure, so if ports 9350-9354 are available *outbound*, then the relay will run through them. If not, the binding steps down to standard HTTP, and issues a CONNECT across port 443 or 80 to set up a tunnel for the relay. If your network security guys are doing their job, the first option will be blocked by the firewall, and the second option will be blocked by the proxy, so you'll get this error: System.ServiceModel.CommunicationException: Unable to reach sixeyed-ipasbr.servicebus.windows.net via TCP (9351, 9352) or HTTP (80, 443) - and that will probably be the start of lots of discussions. Network guys don't really like giving servers special permissions for the web proxy, and they really don't like opening ports, so they'll need to be convinced about this. The resolution in our case was to put up a dedicated box in a DMZ, tinker with the firewall and the proxy until we got a relay connection working, then run some traffic which the the network guys monitored to do a security assessment afterwards. Along the way we hit a few more issues, diagnosed mainly with Fiddler and Wireshark: System.Net.ProtocolViolationException: Chunked encoding upload is not supported on the HTTP/1.0 protocol - this means the TCP ports are not available, so Azure tries to relay messaging traffic across HTTP. The service can access the endpoint, but the proxy is downgrading traffic to HTTP 1.0, which does not support tunneling, so Azure can’t make its connection. We were using the Squid proxy, version 2.6. The Squid project is incrementally adding HTTP 1.1 support, but there's no definitive list of what's supported in what version (here are some hints). System.ServiceModel.Security.SecurityNegotiationException: The X.509 certificate CN=servicebus.windows.net chain building failed. The certificate that was used has a trust chain that cannot be verified. Replace the certificate or change the certificateValidationMode. The evocation function was unable to check revocation because the revocation server was offline. - by this point we'd given up on the HTTP proxy and opened the TCP ports. We got this error when the relay binding does it's authentication hop to ACS. The messaging traffic is TCP, but the control traffic still goes over HTTP, and as part of the ACS authentication the process checks with a revocation server to see if Microsoft’s ACS cert is still valid, so the proxy still needs some clearance. The service account (the IIS app pool identity) needs access to: www.public-trust.com mscrl.microsoft.com We still got this error periodically with different accounts running the app pool. We fixed that by ensuring the machine-wide proxy settings are set up, so every account uses the correct proxy: netsh winhttp set proxy proxy-server="http://proxy.x.y.z" - and you might need to run this to clear out your credential cache: certutil -urlcache * delete If your network guys end up grudgingly opening ports, they can restrict connections to the IP address range for your chosen Azure datacentre, which might make them happier - see Windows Azure Datacenter IP Ranges. After all that you've hopefully got an on-premise service listening in the cloud, which you can consume from pretty much any technology.

    Read the article

  • Reading data from an Entity Framework data model through a WCF Data Service

    - by nikolaosk
    This is going to be the fourth post of a series of posts regarding ASP.Net and the Entity Framework and how we can use Entity Framework to access our datastore. You can find the first one here , the second one here and the third one here . I have a post regarding ASP.Net and EntityDataSource. You can read it here .I have 3 more posts on Profiling Entity Framework applications. You can have a look at them here , here and here . Microsoft with .Net 3.0 Framework, introduced WCF. WCF is Microsoft's...(read more)

    Read the article

  • Big Data – How to become a Data Scientist and Learn Data Science? – Day 19 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned the importance of the analytics in Big Data Story. In this article we will understand how to become a Data Scientist for Big Data Story. Data Scientist is a new buzz word, everyone seems to be wanting to become Data Scientist. Let us go over a few key topics related to Data Scientist in this blog post. First of all we will understand what is a Data Scientist. In the new world of Big Data, I see pretty much everyone wants to become Data Scientist and there are lots of people I have already met who claims that they are Data Scientist. When I ask what is their role, I have got a wide variety of answers. What is Data Scientist? Data scientists are the experts who understand various aspects of the business and know how to strategies data to achieve the business goals. They should have a solid foundation of various data algorithms, modeling and statistics methodology. What do Data Scientists do? Data scientists understand the data very well. They just go beyond the regular data algorithms and builds interesting trends from available data. They innovate and resurrect the entire new meaning from the existing data. They are artists in disguise of computer analyst. They look at the data traditionally as well as explore various new ways to look at the data. Data Scientists do not wait to build their solutions from existing data. They think creatively, they think before the data has entered into the system. Data Scientists are visionary experts who understands the business needs and plan ahead of the time, this tremendously help to build solutions at rapid speed. Besides being data expert, the major quality of Data Scientists is “curiosity”. They always wonder about what more they can get from their existing data and how to get maximum out of future incoming data. Data Scientists do wonders with the data, which goes beyond the job descriptions of Data Analysist or Business Analysist. Skills Required for Data Scientists Here are few of the skills a Data Scientist must have. Expert level skills with statistical tools like SAS, Excel, R etc. Understanding Mathematical Models Hands-on with Visualization Tools like Tableau, PowerPivots, D3. j’s etc. Analytical skills to understand business needs Communication skills On the technology front any Data Scientists should know underlying technologies like (Hadoop, Cloudera) as well as their entire ecosystem (programming language, analysis and visualization tools etc.) . Remember that for becoming a successful Data Scientist one require have par excellent skills, just having a degree in a relevant education field will not suffice. Final Note Data Scientists is indeed very exciting job profile. As per research there are not enough Data Scientists in the world to handle the current data explosion. In near future Data is going to expand exponentially, and the need of the Data Scientists will increase along with it. It is indeed the job one should focus if you like data and science of statistics. Courtesy: emc Tomorrow In tomorrow’s blog post we will discuss about various Big Data Learning resources. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • Unstructured Data - The future of Data Administration

    Some have claimed that there is a problem with the way data is currently managed using the relational paradigm do to the rise of unstructured data in modern business. PCMag.com defines unstructured data as data that does not reside in a fixed location. They further explain that unstructured data refers to data in a free text form that is not bound to any specific structure. With the rise of unstructured data in the form of emails, spread sheets, images and documents the critics have a right to argue that the relational paradigm is not as effective as the object oriented data paradigm in managing this type of data. The relational paradigm relies heavily on structure and relationships in and between items of data. This type of paradigm works best in a relation database management system like Microsoft SQL, MySQL, and Oracle because data is forced to conform to a structure in the form of tables and relations can be derived from the existence of one or more tables. These critics also claim that database administrators have not kept up with reality because their primary focus in regards to data administration deals with structured data and the relational paradigm. The relational paradigm was developed in the 1970’s as a way to improve data management when compared to standard flat files. Little has changed since then, and modern database administrators need to know more than just how to handle structured data. That is why critics claim that today’s data professionals do not have the proper skills in order to store and maintain data for modern systems when compared to the skills of system designers, programmers , software engineers, and data designers  due to the industry trend of object oriented design and development. I think that they are wrong. I do not disagree that the industry is moving toward an object oriented approach to development with the potential to use more of an object oriented approach to data.   However, I think that it is business itself that is limiting database administrators from changing how data is stored because of the potential costs, and impact that might occur by altering any part of stored data. Furthermore, database administrators like all technology workers constantly are trying to improve their technical skills in order to excel in their job, so I think that accusing data professional is not just when the root cause of the lack of innovation is controlled by business, and it is business that will suffer for their inability to keep up with technology. One way for database professionals to better prepare for the future of database management is start working with data in the form of objects and so that they can extract data from the objects so that the stored information within objects can be used in relation to the data stored in a using the relational paradigm. Furthermore, I think the use of pattern matching will increase with the increased use of unstructured data because object can be selected, filtered and altered based on the existence of a pattern found within an object.

    Read the article

  • Developing an analytics's system processing large amounts of data - where to start

    - by Ryan
    Imagine you're writing some sort of Web Analytics system - you're recording raw page hits along with some extra things like tagging cookies etc and then producing stats such as Which pages got most traffic over a time period Which referers sent most traffic Goals completed (goal being a view of a particular page) And more advanced things like which referers sent the most number of vistors who later hit a goal. The naieve way of approaching this would be to throw it in a relational database and run queries over it - but that won't scale. You could pre-calculate everything (have a queue of incoming 'hits' and use to update report tables) - but what if you later change a goal - how could you efficiently re-calculate just the data that would be effected. Obviously this has been done before ;) so any tips on where to start, methods & examples, architecture, technologies etc.

    Read the article

  • Hosting WCF service in Windows Service

    - by DigiMortal
    When building Windows services we often need a way to communicate with them. The natural way to communicate to service is to send signals to it. But this is very limited communication. Usually we need more powerful communication mechanisms with services. In this posting I will show you how to use service-hosted WCF web service to communicate with Windows service. Create Windows service Suppose you have Windows service created and service class is named as MyWindowsService. This is new service and all we have is default code that Visual Studio generates. Create WCF service Add reference to System.ServiceModel assembly to Windows service project and add new interface called IMyService. This interface defines our service contracts. [ServiceContract] public interface IMyService {     [OperationContract]     string SayHello(int value); } We keep this service simple so it is easy for you to follow the code. Now let’s add service implementation: [ServiceBehavior(InstanceContextMode=InstanceContextMode.Single)] public class MyService : IMyService {     public string SayHello(int value)     {         return string.Format("Hello, : {0}", value);     } } With ServiceBehavior attribute we say that we need only one instance of WCF service to serve all requests. Usually this is more than enough for us. Hosting WCF service in Windows Service Now it’s time to host our WCF service and make it available in Windows service. Here is the code in my Windows service: public partial class MyWindowsService : ServiceBase {     private ServiceHost _host;     private MyService _server;       public MyWindowsService()     {         InitializeComponent();     }       protected override void OnStart(string[] args)     {         _server = new MyService();         _host = new ServiceHost(_server);         _host.Open();     }       protected override void OnStop()     {         _host.Close();     } } Our Windows service now hosts our WCF service. WCF service will be available when Windows service is started and it is taken down when Windows service stops. Configuring WCF service To make WCF service usable we need to configure it. Add app.config file to your Windows service project and paste the following XML there: <system.serviceModel>   <serviceHostingEnvironment aspNetCompatibilityEnabled="true" />   <services>     <service name="MyWindowsService.MyService" behaviorConfiguration="def">       <host>         <baseAddresses>           <add baseAddress="http://localhost:8732/MyService/"/>         </baseAddresses>       </host>       <endpoint address="" binding="wsHttpBinding" contract="MyWindowsService.IMyService">       </endpoint>       <endpoint address="mex" binding="mexHttpBinding" contract="IMetadataExchange"/>     </service>   </services>   <behaviors>     <serviceBehaviors>       <behavior name="def">         <serviceMetadata httpGetEnabled="True"/>         <serviceDebug includeExceptionDetailInFaults="True"/>       </behavior>     </serviceBehaviors>   </behaviors> </system.serviceModel> Now you are ready to test your service. Install Windows service and start it. Open your browser and open the following address: http://localhost:8732/MyService/ You should see your WCF service page now. Conclusion WCF is not only web applications fun. You can use WCF also as self-hosted service. Windows services that lack good communication possibilities can be saved by using WCF self-hosted service as it is the best way to talk to service. We can also revert the context and say that Windows service is good host for our WCF service.

    Read the article

  • Big Data – Beginning Big Data – Day 1 of 21

    - by Pinal Dave
    What is Big Data? I want to learn Big Data. I have no clue where and how to start learning about it. Does Big Data really means data is big? What are the tools and software I need to know to learn Big Data? I often receive questions which I mentioned above. They are good questions and honestly when we search online, it is hard to find authoritative and authentic answers. I have been working with Big Data and NoSQL for a while and I have decided that I will attempt to discuss this subject over here in the blog. In the next 21 days we will understand what is so big about Big Data. Big Data – Big Thing! Big Data is becoming one of the most talked about technology trends nowadays. The real challenge with the big organization is to get maximum out of the data already available and predict what kind of data to collect in the future. How to take the existing data and make it meaningful that it provides us accurate insight in the past data is one of the key discussion points in many of the executive meetings in organizations. With the explosion of the data the challenge has gone to the next level and now a Big Data is becoming the reality in many organizations. Big Data – A Rubik’s Cube I like to compare big data with the Rubik’s cube. I believe they have many similarities. Just like a Rubik’s cube it has many different solutions. Let us visualize a Rubik’s cube solving challenge where there are many experts participating. If you take five Rubik’s cube and mix up the same way and give it to five different expert to solve it. It is quite possible that all the five people will solve the Rubik’s cube in fractions of the seconds but if you pay attention to the same closely, you will notice that even though the final outcome is the same, the route taken to solve the Rubik’s cube is not the same. Every expert will start at a different place and will try to resolve it with different methods. Some will solve one color first and others will solve another color first. Even though they follow the same kind of algorithm to solve the puzzle they will start and end at a different place and their moves will be different at many occasions. It is  nearly impossible to have a exact same route taken by two experts. Big Market and Multiple Solutions Big Data is exactly like a Rubik’s cube – even though the goal of every organization and expert is same to get maximum out of the data, the route and the starting point are different for each organization and expert. As organizations are evaluating and architecting big data solutions they are also learning the ways and opportunities which are related to Big Data. There is not a single solution to big data as well there is not a single vendor which can claim to know all about Big Data. Honestly, Big Data is too big a concept and there are many players – different architectures, different vendors and different technology. What is Next? In this 31 days series we will be exploring many essential topics related to big data. I do not claim that you will be master of the subject after 31 days but I claim that I will be covering following topics in easy to understand language. Architecture of Big Data Big Data a Management and Implementation Different Technologies – Hadoop, Mapreduce Real World Conversations Best Practices Tomorrow In tomorrow’s blog post we will try to answer one of the very essential questions – What is Big Data? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • Big Data&rsquo;s Killer App&hellip;

    - by jean-pierre.dijcks
    Recently Keith spent  some time talking about the cloud on this blog and I will spare you my thoughts on the whole thing. What I do want to write down is something about the Big Data movement and what I think is the killer app for Big Data... Where is this coming from, ok, I confess... I spent 3 days in cloud land at the Cloud Connect conference in Santa Clara and it was quite a lot of fun. One of the nice things at Cloud Connect was that there was a track dedicated to Big Data, which prompted me to some extend to write this post. What is Big Data anyways? The most valuable point made in the Big Data track was that Big Data in itself is not very cool. Doing something with Big Data is what makes all of this cool and interesting to a business user! The other good insight I got was that a lot of people think Big Data means a single gigantic monolithic system holding gazillions of bytes or documents or log files. Well turns out that most people in the Big Data track are talking about a lot of collections of smaller data sets. So rather than thinking "big = monolithic" you should be thinking "big = many data sets". This is more than just theoretical, it is actually relevant when thinking about big data and how to process it. It is important because it means that the platform that stores data will most likely consist out of multiple solutions. You may be storing logs on something like HDFS, you may store your customer information in Oracle and you may store distilled clickstream information in some distilled form in MySQL. The big question you will need to solve is not what lives where, but how to get it all together and get some value out of all that data. NoSQL and MapReduce Nope, sorry, this is not the killer app... and no I'm not saying this because my business card says Oracle and I'm therefore biased. I think language is important, but as with storage I think pragmatic is better. In other words, some questions can be answered with SQL very efficiently, others can be answered with PERL or TCL others with MR. History should teach us that anyone trying to solve a problem will use any and all tools around. For example, most data warehouses (Big Data 1.0?) get a lot of data in flat files. Everyone then runs a bunch of shell scripts to massage or verify those files and then shoves those files into the database. We've even built shell script support into external tables to allow for this. I think the Big Data projects will do the same. Some people will use MapReduce, although I would argue that things like Cascading are more interesting, some people will use Java. Some data is stored on HDFS making Cascading the way to go, some data is stored in Oracle and SQL does do a good job there. As with storage and with history, be pragmatic and use what fits and neither NoSQL nor MR will be the one and only. Also, a language, while important, does in itself not deliver business value. So while cool it is not a killer app... Vertical Behavioral Analytics This is the killer app! And you are now thinking: "what does that mean?" Let's decompose that heading. First of all, analytics. I would think you had guessed by now that this is really what I'm after, and of course you are right. But not just analytics, which has a very large scope and means many things to many people. I'm not just after Business Intelligence (analytics 1.0?) or data mining (analytics 2.0?) but I'm after something more interesting that you can only do after collecting large volumes of specific data. That all important data is about behavior. What do my customers do? More importantly why do they behave like that? If you can figure that out, you can tailor web sites, stores, products etc. to that behavior and figure out how to be successful. Today's behavior that is somewhat easily tracked is web site clicks, search patterns and all of those things that a web site or web server tracks. that is where the Big Data lives and where these patters are now emerging. Other examples however are emerging, and one of the examples used at the conference was about prediction churn for a telco based on the social network its members are a part of. That social network is not about LinkedIn or Facebook, but about who calls whom. I call you a lot, you switch provider, and I might/will switch too. And that just naturally brings me to the next word, vertical. Vertical in this context means per industry, e.g. communications or retail or government or any other vertical. The reason for being more specific than just behavioral analytics is that each industry has its own data sources, has its own quirky logic and has its own demands and priorities. Of course, the methods and some of the software will be common and some will have both retail and service industry analytics in place (your corner coffee store for example). But the gist of it all is that analytics that can predict customer behavior for a specific focused group of people in a specific industry is what makes Big Data interesting. Building a Vertical Behavioral Analysis System Well, that is going to be interesting. I have not seen much going on in that space and if I had to have some criticism on the cloud connect conference it would be the lack of concrete user cases on big data. The telco example, while a step into the vertical behavioral part is not really on big data. It used a sample of data from the customers' data warehouse. One thing I do think, and this is where I think parts of the NoSQL stuff come from, is that we will be doing this analysis where the data is. Over the past 10 years we at Oracle have called this in-database analytics. I guess we were (too) early? Now the entire market is going there including companies like SAS. In-place btw does not mean "no data movement at all", what it means that you will do this on data's permanent home. For SAS that is kind of the current problem. Most of the inputs live in a data warehouse. So why move it into SAS and back? That all worked with 1 TB data warehouses, but when we are looking at 100TB to 500 TB of distilled data... Comments? As it is still early days with these systems, I'm very interested in seeing reactions and thoughts to some of these thoughts...

    Read the article

  • Oracle Retail Point-of-Service with Mobile Point-of-Service, Release 13.4.1

    - by Oracle Retail Documentation Team
    Oracle Retail Mobile Point-of-Service was previously released as a standalone product. Oracle Retail Mobile Point-of-Service is now a supported extension of Oracle Retail Point-of-Service, Release 13.4.1. Oracle Retail Mobile Point-of-Service provides support for using a mobile device to perform tasks such as scanning items, applying price adjustments, tendering, and looking up item information. Integration with Oracle Retail Store Inventory Management (SIM) If Oracle Retail Mobile Point-of-Service is implemented with Oracle Retail Store Inventory Management (SIM), the following Oracle Retail Store Inventory Management functionality is supported: Inventory lookup at the current store Inventory lookup at buddy stores Validation of serial numbers Technical Overview The Oracle Retail Mobile Point-of-Service server application runs in a domain on Oracle WebLogic. The server supports the mobile devices in the store. On each mobile device, the Mobile POS application is downloaded and then installed. Highlighted End User Documentation Updates and List of Documents  Oracle Retail Point-of-Service with Mobile Point-of-Service Release NotesA high-level overview is included about the release's functional, technical, and documentation enhancements. In addition, a section has been written that addresses Product Support considerations.   Oracle Retail Mobile Point-of-Service Java API ReferenceJava API documentation for Oracle Retail Mobile Point-of-Service is included as part of the Oracle Retail Mobile Point-of-Service Release 13.4.1 documentation set. Oracle Retail Point-of-Service with Mobile Point-of-Service Installation Guide - Volume 1, Oracle StackA new chapter is included with information on installing the Mobile Point-of-Service server and setting up the Mobile POS application. The installer screens for installing the server are included in a new appendix. Oracle Retail Point-of-Service with Mobile Point-of-Service User GuideA new chapter describes the functionality available on a mobile device and how to use Oracle Retail Mobile Point-of-Service on a mobile device. Oracle Retail POS Suite with Mobile Point-of-Service Configuration GuideThe Configuration Guide is updated to indicate which parameters are used for Oracle Retail Mobile Point-of-Service. Oracle Retail POS Suite with Mobile Point-of-Service Implementation Guide - Volume 5, Mobile Point-of-ServiceThis new Implementation Guide volume contains information for extending and customizing both the Mobile POS application for the mobile device and the Oracle Retail Mobile Point-of-Service server. Oracle Retail POS Suite with Mobile Point-of-Service Licensing InformationThe Licensing Information document is updated with the list of third-party open-source software used by Oracle Retail Mobile Point-of-Service. Oracle Retail POS Suite with Mobile Point-of-Service Security GuideThe Security Guide is updated with information on security for mobile devices. Oracle Retail Enhancements Summary (My Oracle Support Doc ID 1088183.1)This enterprise level document captures the major changes for all the products that are part of releases 13.2, 13.3, and 13.4. The functional, integration, and technical enhancements in the Release Notes for each product are listed in this document.

    Read the article

  • Accelerate your SOA with Data Integration - Live Webinar Tuesday!

    - by dain.hansen
    Need to put wind in your SOA sails? Organizations are turning more and more to Real-time data integration to complement their Service Oriented Architecture. The benefit? Lowering costs through consolidating legacy systems, reducing risk of bad data polluting their applications, and shortening the time to deliver new service offerings. Join us on Tuesday April 13th, 11AM PST for our live webinar on the value of combining SOA and Data Integration together. In this webcast you'll learn how to innovate across your applications swiftly and at a lower cost using Oracle Data Integration technologies: Oracle Data Integrator Enterprise Edition, Oracle GoldenGate, and Oracle Data Quality. You'll also hear: Best practices for building re-usable data services that are high performing and scalable across the enterprise How real-time data integration can maximize SOA returns while providing continuous availability for your mission critical applications Architectural approaches to speed service implementation and delivery times, with pre-integrations to CRM, ERP, BI, and other packaged applications Register now for this live webinar!

    Read the article

  • Big Data – Basics of Big Data Analytics – Day 18 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned the importance of the various components in Big Data Story. In this article we will understand what are the various analytics tasks we try to achieve with the Big Data and the list of the important tools in Big Data Story. When you have plenty of the data around you what is the first thing which comes to your mind? “What do all these data means?” Exactly – the same thought comes to my mind as well. I always wanted to know what all the data means and what meaningful information I can receive out of it. Most of the Big Data projects are built to retrieve various intelligence all this data contains within it. Let us take example of Facebook. When I look at my friends list of Facebook, I always want to ask many questions such as - On which date my maximum friends have a birthday? What is the most favorite film of my most of the friends so I can talk about it and engage them? What is the most liked placed to travel my friends? Which is the most disliked cousin for my friends in India and USA so when they travel, I do not take them there. There are many more questions I can think of. This illustrates that how important it is to have analysis of Big Data. Here are few of the kind of analysis listed which you can use with Big Data. Slicing and Dicing: This means breaking down your data into smaller set and understanding them one set at a time. This also helps to present various information in a variety of different user digestible ways. For example if you have data related to movies, you can use different slide and dice data in various formats like actors, movie length etc. Real Time Monitoring: This is very crucial in social media when there are any events happening and you wanted to measure the impact at the time when the event is happening. For example, if you are using twitter when there is a football match, you can watch what fans are talking about football match on twitter when the event is happening. Anomaly Predication and Modeling: If the business is running normal it is alright but if there are signs of trouble, everyone wants to know them early on the hand. Big Data analysis of various patterns can be very much helpful to predict future. Though it may not be always accurate but certain hints and signals can be very helpful. For example, lots of data can help conclude that if there is lots of rain it can increase the sell of umbrella. Text and Unstructured Data Analysis: unstructured data are now getting norm in the new world and they are a big part of the Big Data revolution. It is very important that we Extract, Transform and Load the unstructured data and make meaningful data out of it. For example, analysis of lots of images, one can predict that people like to use certain colors in certain months in their cloths. Big Data Analytics Solutions There are many different Big Data Analystics Solutions out in the market. It is impossible to list all of them so I will list a few of them over here. Tableau – This has to be one of the most popular visualization tools out in the big data market. SAS – A high performance analytics and infrastructure company IBM and Oracle – They have a range of tools for Big Data Analysis Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Data Scientist. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >