Search Results

Search found 27 results on 2 pages for 'aggregators'.

Page 2/2 | < Previous Page | 1 2

Running a simple integration scenario using the Oracle Big Data Connectors on Hadoop/HDFS cluster

- by hamsun

Between the elephant ( the tradional image of the Hadoop framework) and the Oracle Iron Man (Big Data..) an english setter could be seen as the link to the right data Data, Data, Data, we are living in a world where data technology based on popular applications , search engines, Webservers, rich sms messages, email clients, weather forecasts and so on, have a predominant role in our life. More and more technologies are used to analyze/track our behavior, try to detect patterns, to propose us "the best/right user experience" from the Google Ad services, to Telco companies or large consumer sites (like Amazon:) ). The more we use all these technologies, the more we generate data, and thus there is a need of huge data marts and specific hardware/software servers (as the Exadata servers) in order to treat/analyze/understand the trends and offer new services to the users. Some of these "data feeds" are raw, unstructured data, and cannot be processed effectively by normal SQL queries. Large scale distributed processing was an emerging infrastructure need and the solution seemed to be the "collocation of compute nodes with the data", which in turn leaded to MapReduce parallel patterns and the development of the Hadoop framework, which is based on MapReduce and a distributed file system (HDFS) that runs on larger clusters of rather inexpensive servers. Several Oracle products are using the distributed / aggregation pattern for data calculation ( Coherence, NoSql, times ten ) so once that you are familiar with one of these technologies, lets says with coherence aggregators, you will find the whole Hadoop, MapReduce concept very similar. Oracle Big Data Appliance is based on the Cloudera Distribution (CDH), and the Oracle Big Data Connectors can be plugged on a Hadoop cluster running the CDH distribution or equivalent Hadoop clusters. In this paper, a "lab like" implementation of this concept is done on a single Linux X64 server, running an Oracle Database 11g Enterprise Edition Release 11.2.0.4.0, and a single node Apache hadoop-1.2.1 HDFS cluster, using the SQL connector for HDFS. The whole setup is fairly simple: Install on a Linux x64 server ( or virtual box appliance) an Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 server Get the Apache Hadoop distribution from: http://mir2.ovh.net/ftp.apache.org/dist/hadoop/common/hadoop-1.2.1. Get the Oracle Big Data Connectors from: http://www.oracle.com/technetwork/bdc/big-data-connectors/downloads/index.html?ssSourceSiteId=ocomen. Check the java version of your Linux server with the command: java -version java version "1.7.0_40" Java(TM) SE Runtime Environment (build 1.7.0_40-b43) Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode) Decompress the hadoop hadoop-1.2.1.tar.gz file to /u01/hadoop-1.2.1 Modify your .bash_profile export HADOOP_HOME=/u01/hadoop-1.2.1 export PATH=$PATH:$HADOOP_HOME/bin export HIVE_HOME=/u01/hive-0.11.0 export PATH=$PATH:$HADOOP_HOME/bin:$HIVE_HOME/bin (also see my sample .bash_profile) Set up ssh trust for Hadoop process, this is a mandatory step, in our case we have to establish a "local trust" as will are using a single node configuration copy the new public keys to the list of authorized keys connect and test the ssh setup to your localhost: We will run a "pseudo-Hadoop cluster", in what is called "local standalone mode", all the Hadoop java components are running in one Java process, this is enough for our demo purposes. We need to "fine tune" some Hadoop configuration files, we have to go at our $HADOOP_HOME/conf, and modify the files: core-site.xml hdfs-site.xml mapred-site.xml check that the hadoop binaries are referenced correctly from the command line by executing: hadoop -version As Hadoop is managing our "clustered HDFS" file system we have to create "the mount point" and format it , the mount point will be declared to core-site.xml as: The layout under the /u01/hadoop-1.2.1/data will be created and used by other hadoop components (MapReduce = /mapred/...) HDFS is using the /dfs/... layout structure format the HDFS hadoop file system: Start the java components for the HDFS system As an additional check, you can use the GUI Hadoop browsers to check the content of your HDFS configurations: Once our HDFS Hadoop setup is done you can use the HDFS file system to store data ( big data : )), and plug them back and forth to Oracle Databases by the means of the Big Data Connectors ( which is the next configuration step). You can create / use a Hive db, but in our case we will make a simple integration of "raw data" , through the creation of an External Table to a local Oracle instance ( on the same Linux box, we run the Hadoop HDFS one node cluster and one Oracle DB). Download some public "big data", I use the site: http://france.meteofrance.com/france/observations, from where I can get *.csv files for my big data simulations :). Here is the data layout of my example file: Download the Big Data Connector from the OTN (oraosch-2.2.0.zip), unzip it to your local file system (see picture below) Modify your environment in order to access the connector libraries , and make the following test: [oracle@dg1 bin]$./hdfs_stream Usage: hdfs_stream locationFile [oracle@dg1 bin]$ Load the data to the Hadoop hdfs file system: hadoop fs -mkdir bgtest_data hadoop fs -put obsFrance.txt bgtest_data/obsFrance.txt hadoop fs -ls /user/oracle/bgtest_data/obsFrance.txt [oracle@dg1 bg-data-raw]$ hadoop fs -ls /user/oracle/bgtest_data/obsFrance.txt Found 1 items -rw-r--r-- 1 oracle supergroup 54103 2013-10-22 06:10 /user/oracle/bgtest_data/obsFrance.txt [oracle@dg1 bg-data-raw]$hadoop fs -ls hdfs:///user/oracle/bgtest_data/obsFrance.txt Found 1 items -rw-r--r-- 1 oracle supergroup 54103 2013-10-22 06:10 /user/oracle/bgtest_data/obsFrance.txt Check the content of the HDFS with the browser UI: Start the Oracle database, and run the following script in order to create the Oracle database user, the Oracle directories for the Oracle Big Data Connector (dg1 it’s my own db id replace accordingly yours): #!/bin/bash export ORAENV_ASK=NO export ORACLE_SID=dg1 . oraenv sqlplus /nolog <<EOF CONNECT / AS sysdba; CREATE OR REPLACE DIRECTORY osch_bin_path AS '/u01/orahdfs-2.2.0/bin'; CREATE USER BGUSER IDENTIFIED BY oracle; GRANT CREATE SESSION, CREATE TABLE TO BGUSER; GRANT EXECUTE ON sys.utl_file TO BGUSER; GRANT READ, EXECUTE ON DIRECTORY osch_bin_path TO BGUSER; CREATE OR REPLACE DIRECTORY BGT_LOG_DIR as '/u01/BG_TEST/logs'; GRANT READ, WRITE ON DIRECTORY BGT_LOG_DIR to BGUSER; CREATE OR REPLACE DIRECTORY BGT_DATA_DIR as '/u01/BG_TEST/data'; GRANT READ, WRITE ON DIRECTORY BGT_DATA_DIR to BGUSER; EOF Put the following in a file named t3.sh and make it executable, hadoop jar $OSCH_HOME/jlib/orahdfs.jar \ oracle.hadoop.exttab.ExternalTable \ -D oracle.hadoop.exttab.tableName=BGTEST_DP_XTAB \ -D oracle.hadoop.exttab.defaultDirectory=BGT_DATA_DIR \ -D oracle.hadoop.exttab.dataPaths="hdfs:///user/oracle/bgtest_data/obsFrance.txt" \ -D oracle.hadoop.exttab.columnCount=7 \ -D oracle.hadoop.connection.url=jdbc:oracle:thin:@//localhost:1521/dg1 \ -D oracle.hadoop.connection.user=BGUSER \ -D oracle.hadoop.exttab.printStackTrace=true \ -createTable --noexecute then test the creation fo the external table with it: [oracle@dg1 samples]$ ./t3.sh ./t3.sh: line 2: /u01/orahdfs-2.2.0: Is a directory Oracle SQL Connector for HDFS Release 2.2.0 - Production Copyright (c) 2011, 2013, Oracle and/or its affiliates. All rights reserved. Enter Database Password:] The create table command was not executed. The following table would be created. CREATE TABLE "BGUSER"."BGTEST_DP_XTAB" ( "C1" VARCHAR2(4000), "C2" VARCHAR2(4000), "C3" VARCHAR2(4000), "C4" VARCHAR2(4000), "C5" VARCHAR2(4000), "C6" VARCHAR2(4000), "C7" VARCHAR2(4000) ) ORGANIZATION EXTERNAL ( TYPE ORACLE_LOADER DEFAULT DIRECTORY "BGT_DATA_DIR" ACCESS PARAMETERS ( RECORDS DELIMITED BY 0X'0A' CHARACTERSET AL32UTF8 STRING SIZES ARE IN CHARACTERS PREPROCESSOR "OSCH_BIN_PATH":'hdfs_stream' FIELDS TERMINATED BY 0X'2C' MISSING FIELD VALUES ARE NULL ( "C1" CHAR(4000), "C2" CHAR(4000), "C3" CHAR(4000), "C4" CHAR(4000), "C5" CHAR(4000), "C6" CHAR(4000), "C7" CHAR(4000) ) ) LOCATION ( 'osch-20131022081035-74-1' ) ) PARALLEL REJECT LIMIT UNLIMITED; The following location files would be created. osch-20131022081035-74-1 contains 1 URI, 54103 bytes 54103 hdfs://localhost:19000/user/oracle/bgtest_data/obsFrance.txt Then remove the --noexecute flag and create the external Oracle table for the Hadoop data. Check the results: The create table command succeeded. CREATE TABLE "BGUSER"."BGTEST_DP_XTAB" ( "C1" VARCHAR2(4000), "C2" VARCHAR2(4000), "C3" VARCHAR2(4000), "C4" VARCHAR2(4000), "C5" VARCHAR2(4000), "C6" VARCHAR2(4000), "C7" VARCHAR2(4000) ) ORGANIZATION EXTERNAL ( TYPE ORACLE_LOADER DEFAULT DIRECTORY "BGT_DATA_DIR" ACCESS PARAMETERS ( RECORDS DELIMITED BY 0X'0A' CHARACTERSET AL32UTF8 STRING SIZES ARE IN CHARACTERS PREPROCESSOR "OSCH_BIN_PATH":'hdfs_stream' FIELDS TERMINATED BY 0X'2C' MISSING FIELD VALUES ARE NULL ( "C1" CHAR(4000), "C2" CHAR(4000), "C3" CHAR(4000), "C4" CHAR(4000), "C5" CHAR(4000), "C6" CHAR(4000), "C7" CHAR(4000) ) ) LOCATION ( 'osch-20131022081719-3239-1' ) ) PARALLEL REJECT LIMIT UNLIMITED; The following location files were created. osch-20131022081719-3239-1 contains 1 URI, 54103 bytes 54103 hdfs://localhost:19000/user/oracle/bgtest_data/obsFrance.txt This is the view from the SQL Developer: and finally the number of lines in the oracle table, imported from our Hadoop HDFS cluster SQL select count(*) from "BGUSER"."BGTEST_DP_XTAB"; COUNT(*) ---------- 1151 In a next post we will integrate data from a Hive database, and try some ODI integrations with the ODI Big Data connector. Our simplistic approach is just a step to show you how these unstructured data world can be integrated to Oracle infrastructure. Hadoop, BigData, NoSql are great technologies, they are widely used and Oracle is offering a large integration infrastructure based on these services. Oracle University presents a complete curriculum on all the Oracle related technologies: NoSQL: Introduction to Oracle NoSQL Database Using Oracle NoSQL Database Big Data: Introduction to Big Data Oracle Big Data Essentials Oracle Big Data Overview Oracle Data Integrator: Oracle Data Integrator 12c: New Features Oracle Data Integrator 11g: Integration and Administration Oracle Data Integrator: Administration and Development Oracle Data Integrator 11g: Advanced Integration and Development Oracle Coherence 12c: Oracle Coherence 12c: New Features Oracle Coherence 12c: Share and Manage Data in Clusters Oracle Coherence 12c: Oracle GoldenGate 11g: Fundamentals for Oracle Oracle GoldenGate 11g: Fundamentals for SQL Server Oracle GoldenGate 11g Fundamentals for Oracle Oracle GoldenGate 11g Fundamentals for DB2 Oracle GoldenGate 11g Fundamentals for Teradata Oracle GoldenGate 11g Fundamentals for HP NonStop Oracle GoldenGate 11g Management Pack: Overview Oracle GoldenGate 11g Troubleshooting and Tuning Oracle GoldenGate 11g: Advanced Configuration for Oracle Other Resources: Apache Hadoop : http://hadoop.apache.org/ is the homepage for these technologies. "Hadoop Definitive Guide 3rdEdition" by Tom White is a classical lecture for people who want to know more about Hadoop , and some active "googling " will also give you some more references. About the author: Eugene Simos is based in France and joined Oracle through the BEA-Weblogic Acquisition, where he worked for the Professional Service, Support, end Education for major accounts across the EMEA Region. He worked in the banking sector, ATT, Telco companies giving him extensive experience on production environments. Eugen currently specializes in Oracle Fusion Middleware teaching an array of courses on Weblogic/Webcenter, Content,BPM /SOA/Identity-Security/GoldenGate/Virtualisation/Unified Comm Suite) throughout the EMEA region.

Read the article
C#/.NET Little Wonders: The Generic Func Delegates

- by James Michael Hare

Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. Back in one of my three original “Little Wonders” Trilogy of posts, I had listed generic delegates as one of the Little Wonders of .NET. Later, someone posted a comment saying said that they would love more detail on the generic delegates and their uses, since my original entry just scratched the surface of them. Last week, I began our look at some of the handy generic delegates built into .NET with a description of delegates in general, and the Action family of delegates. For this week, I’ll launch into a look at the Func family of generic delegates and how they can be used to support generic, reusable algorithms and classes. Quick Delegate Recap Delegates are similar to function pointers in C++ in that they allow you to store a reference to a method. They can store references to either static or instance methods, and can actually be used to chain several methods together in one delegate. Delegates are very type-safe and can be satisfied with any standard method, anonymous method, or a lambda expression. They can also be null as well (refers to no method), so care should be taken to make sure that the delegate is not null before you invoke it. Delegates are defined using the keyword delegate, where the delegate’s type name is placed where you would typically place the method name: 1: // This delegate matches any method that takes string, returns nothing 2: public delegate void Log(string message); This delegate defines a delegate type named Log that can be used to store references to any method(s) that satisfies its signature (whether instance, static, lambda expression, etc.). Delegate instances then can be assigned zero (null) or more methods using the operator = which replaces the existing delegate chain, or by using the operator += which adds a method to the end of a delegate chain: 1: // creates a delegate instance named currentLogger defaulted to Console.WriteLine (static method) 2: Log currentLogger = Console.Out.WriteLine; 3: 4: // invokes the delegate, which writes to the console out 5: currentLogger("Hi Standard Out!"); 6: 7: // append a delegate to Console.Error.WriteLine to go to std error 8: currentLogger += Console.Error.WriteLine; 9: 10: // invokes the delegate chain and writes message to std out and std err 11: currentLogger("Hi Standard Out and Error!"); While delegates give us a lot of power, it can be cumbersome to re-create fairly standard delegate definitions repeatedly, for this purpose the generic delegates were introduced in various stages in .NET. These support various method types with particular signatures. Note: a caveat with generic delegates is that while they can support multiple parameters, they do not match methods that contains ref or out parameters. If you want to a delegate to represent methods that takes ref or out parameters, you will need to create a custom delegate. We’ve got the Func… delegates Just like it’s cousin, the Action delegate family, the Func delegate family gives us a lot of power to use generic delegates to make classes and algorithms more generic. Using them keeps us from having to define a new delegate type when need to make a class or algorithm generic. Remember that the point of the Action delegate family was to be able to perform an “action” on an item, with no return results. Thus Action delegates can be used to represent most methods that take 0 to 16 arguments but return void. You can assign a method The Func delegate family was introduced in .NET 3.5 with the advent of LINQ, and gives us the power to define a function that can be called on 0 to 16 arguments and returns a result. Thus, the main difference between Action and Func, from a delegate perspective, is that Actions return nothing, but Funcs return a result. The Func family of delegates have signatures as follows: Func<TResult> – matches a method that takes no arguments, and returns value of type TResult. Func<T, TResult> – matches a method that takes an argument of type T, and returns value of type TResult. Func<T1, T2, TResult> – matches a method that takes arguments of type T1 and T2, and returns value of type TResult. Func<T1, T2, …, TResult> – and so on up to 16 arguments, and returns value of type TResult. These are handy because they quickly allow you to be able to specify that a method or class you design will perform a function to produce a result as long as the method you specify meets the signature. For example, let’s say you were designing a generic aggregator, and you wanted to allow the user to define how the values will be aggregated into the result (i.e. Sum, Min, Max, etc…). To do this, we would ask the user of our class to pass in a method that would take the current total, the next value, and produce a new total. A class like this could look like: 1: public sealed class Aggregator<TValue, TResult> 2: { 3: // holds method that takes previous result, combines with next value, creates new result 4: private Func<TResult, TValue, TResult> _aggregationMethod; 5: 6: // gets or sets the current result of aggregation 7: public TResult Result { get; private set; } 8: 9: // construct the aggregator given the method to use to aggregate values 10: public Aggregator(Func<TResult, TValue, TResult> aggregationMethod = null) 11: { 12: if (aggregationMethod == null) throw new ArgumentNullException("aggregationMethod"); 13: 14: _aggregationMethod = aggregationMethod; 15: } 16: 17: // method to add next value 18: public void Aggregate(TValue nextValue) 19: { 20: // performs the aggregation method function on the current result and next and sets to current result 21: Result = _aggregationMethod(Result, nextValue); 22: } 23: } Of course, LINQ already has an Aggregate extension method, but that works on a sequence of IEnumerable<T>, whereas this is designed to work more with aggregating single results over time (such as keeping track of a max response time for a service). We could then use this generic aggregator to find the sum of a series of values over time, or the max of a series of values over time (among other things): 1: // creates an aggregator that adds the next to the total to sum the values 2: var sumAggregator = new Aggregator<int, int>((total, next) => total + next); 3: 4: // creates an aggregator (using static method) that returns the max of previous result and next 5: var maxAggregator = new Aggregator<int, int>(Math.Max); So, if we were timing the response time of a web method every time it was called, we could pass that response time to both of these aggregators to get an idea of the total time spent in that web method, and the max time spent in any one call to the web method: 1: // total will be 13 and max 13 2: int responseTime = 13; 3: sumAggregator.Aggregate(responseTime); 4: maxAggregator.Aggregate(responseTime); 5: 6: // total will be 20 and max still 13 7: responseTime = 7; 8: sumAggregator.Aggregate(responseTime); 9: maxAggregator.Aggregate(responseTime); 10: 11: // total will be 40 and max now 20 12: responseTime = 20; 13: sumAggregator.Aggregate(responseTime); 14: maxAggregator.Aggregate(responseTime); The Func delegate family is useful for making generic algorithms and classes, and in particular allows the caller of the method or user of the class to specify a function to be performed in order to generate a result. What is the result of a Func delegate chain? If you remember, we said earlier that you can assign multiple methods to a delegate by using the += operator to chain them. So how does this affect delegates such as Func that return a value, when applied to something like the code below? 1: Func<int, int, int> combo = null; 2: 3: // What if we wanted to aggregate the sum and max together? 4: combo += (total, next) => total + next; 5: combo += Math.Max; 6: 7: // what is the result? 8: var comboAggregator = new Aggregator<int, int>(combo); Well, in .NET if you chain multiple methods in a delegate, they will all get invoked, but the result of the delegate is the result of the last method invoked in the chain. Thus, this aggregator would always result in the Math.Max() result. The other chained method (the sum) gets executed first, but it’s result is thrown away: 1: // result is 13 2: int responseTime = 13; 3: comboAggregator.Aggregate(responseTime); 4: 5: // result is still 13 6: responseTime = 7; 7: comboAggregator.Aggregate(responseTime); 8: 9: // result is now 20 10: responseTime = 20; 11: comboAggregator.Aggregate(responseTime); So remember, you can chain multiple Func (or other delegates that return values) together, but if you do so you will only get the last executed result. Func delegates and co-variance/contra-variance in .NET 4.0 Just like the Action delegate, as of .NET 4.0, the Func delegate family is contra-variant on its arguments. In addition, it is co-variant on its return type. To support this, in .NET 4.0 the signatures of the Func delegates changed to: Func<out TResult> – matches a method that takes no arguments, and returns value of type TResult (or a more derived type). Func<in T, out TResult> – matches a method that takes an argument of type T (or a less derived type), and returns value of type TResult(or a more derived type). Func<in T1, in T2, out TResult> – matches a method that takes arguments of type T1 and T2 (or less derived types), and returns value of type TResult (or a more derived type). Func<in T1, in T2, …, out TResult> – and so on up to 16 arguments, and returns value of type TResult (or a more derived type). Notice the addition of the in and out keywords before each of the generic type placeholders. As we saw last week, the in keyword is used to specify that a generic type can be contra-variant -- it can match the given type or a type that is less derived. However, the out keyword, is used to specify that a generic type can be co-variant -- it can match the given type or a type that is more derived. On contra-variance, if you are saying you need an function that will accept a string, you can just as easily give it an function that accepts an object. In other words, if you say “give me an function that will process dogs”, I could pass you a method that will process any animal, because all dogs are animals. On the co-variance side, if you are saying you need a function that returns an object, you can just as easily pass it a function that returns a string because any string returned from the given method can be accepted by a delegate expecting an object result, since string is more derived. Once again, in other words, if you say “give me a method that creates an animal”, I can pass you a method that will create a dog, because all dogs are animals. It really all makes sense, you can pass a more specific thing to a less specific parameter, and you can return a more specific thing as a less specific result. In other words, pay attention to the direction the item travels (parameters go in, results come out). Keeping that in mind, you can always pass more specific things in and return more specific things out. For example, in the code below, we have a method that takes a Func<object> to generate an object, but we can pass it a Func<string> because the return type of object can obviously accept a return value of string as well: 1: // since Func<object> is co-variant, this will access Func<string>, etc... 2: public static string Sequence(int count, Func<object> generator) 3: { 4: var builder = new StringBuilder(); 5: 6: for (int i=0; i<count; i++) 7: { 8: object value = generator(); 9: builder.Append(value); 10: } 11: 12: return builder.ToString(); 13: } Even though the method above takes a Func<object>, we can pass a Func<string> because the TResult type placeholder is co-variant and accepts types that are more derived as well: 1: // delegate that's typed to return string. 2: Func<string> stringGenerator = () => DateTime.Now.ToString(); 3: 4: // This will work in .NET 4.0, but not in previous versions 5: Sequence(100, stringGenerator); Previous versions of .NET implemented some forms of co-variance and contra-variance before, but .NET 4.0 goes one step further and allows you to pass or assign an Func<A, BResult> to a Func<Y, ZResult> as long as A is less derived (or same) as Y, and BResult is more derived (or same) as ZResult. Sidebar: The Func and the Predicate A method that takes one argument and returns a bool is generally thought of as a predicate. Predicates are used to examine an item and determine whether that item satisfies a particular condition. Predicates are typically unary, but you may also have binary and other predicates as well. Predicates are often used to filter results, such as in the LINQ Where() extension method: 1: var numbers = new[] { 1, 2, 4, 13, 8, 10, 27 }; 2: 3: // call Where() using a predicate which determines if the number is even 4: var evens = numbers.Where(num => num % 2 == 0); As of .NET 3.5, predicates are typically represented as Func<T, bool> where T is the type of the item to examine. Previous to .NET 3.5, there was a Predicate<T> type that tended to be used (which we’ll discuss next week) and is still supported, but most developers recommend using Func<T, bool> now, as it prevents confusion with overloads that accept unary predicates and binary predicates, etc.: 1: // this seems more confusing as an overload set, because of Predicate vs Func 2: public static SomeMethod(Predicate<int> unaryPredicate) { } 3: public static SomeMethod(Func<int, int, bool> binaryPredicate) { } 4: 5: // this seems more consistent as an overload set, since just uses Func 6: public static SomeMethod(Func<int, bool> unaryPredicate) { } 7: public static SomeMethod(Func<int, int, bool> binaryPredicate) { } Also, even though Predicate<T> and Func<T, bool> match the same signatures, they are separate types! Thus you cannot assign a Predicate<T> instance to a Func<T, bool> instance and vice versa: 1: // the same method, lambda expression, etc can be assigned to both 2: Predicate<int> isEven = i => (i % 2) == 0; 3: Func<int, bool> alsoIsEven = i => (i % 2) == 0; 4: 5: // but the delegate instances cannot be directly assigned, strongly typed! 6: // ERROR: cannot convert type... 7: isEven = alsoIsEven; 8: 9: // however, you can assign by wrapping in a new instance: 10: isEven = new Predicate<int>(alsoIsEven); 11: alsoIsEven = new Func<int, bool>(isEven); So, the general advice that seems to come from most developers is that Predicate<T> is still supported, but we should use Func<T, bool> for consistency in .NET 3.5 and above. Sidebar: Func as a Generator for Unit Testing One area of difficulty in unit testing can be unit testing code that is based on time of day. We’d still want to unit test our code to make sure the logic is accurate, but we don’t want the results of our unit tests to be dependent on the time they are run. One way (of many) around this is to create an internal generator that will produce the “current” time of day. This would default to returning result from DateTime.Now (or some other method), but we could inject specific times for our unit testing. Generators are typically methods that return (generate) a value for use in a class/method. For example, say we are creating a CacheItem<T> class that represents an item in the cache, and we want to make sure the item shows as expired if the age is more than 30 seconds. Such a class could look like: 1: // responsible for maintaining an item of type T in the cache 2: public sealed class CacheItem<T> 3: { 4: // helper method that returns the current time 5: private static Func<DateTime> _timeGenerator = () => DateTime.Now; 6: 7: // allows internal access to the time generator 8: internal static Func<DateTime> TimeGenerator 9: { 10: get { return _timeGenerator; } 11: set { _timeGenerator = value; } 12: } 13: 14: // time the item was cached 15: public DateTime CachedTime { get; private set; } 16: 17: // the item cached 18: public T Value { get; private set; } 19: 20: // item is expired if older than 30 seconds 21: public bool IsExpired 22: { 23: get { return _timeGenerator() - CachedTime > TimeSpan.FromSeconds(30.0); } 24: } 25: 26: // creates the new cached item, setting cached time to "current" time 27: public CacheItem(T value) 28: { 29: Value = value; 30: CachedTime = _timeGenerator(); 31: } 32: } Then, we can use this construct to unit test our CacheItem<T> without any time dependencies: 1: var baseTime = DateTime.Now; 2: 3: // start with current time stored above (so doesn't drift) 4: CacheItem<int>.TimeGenerator = () => baseTime; 5: 6: var target = new CacheItem<int>(13); 7: 8: // now add 15 seconds, should still be non-expired 9: CacheItem<int>.TimeGenerator = () => baseTime.AddSeconds(15); 10: 11: Assert.IsFalse(target.IsExpired); 12: 13: // now add 31 seconds, should now be expired 14: CacheItem<int>.TimeGenerator = () => baseTime.AddSeconds(31); 15: 16: Assert.IsTrue(target.IsExpired); Now we can unit test for 1 second before, 1 second after, 1 millisecond before, 1 day after, etc. Func delegates can be a handy tool for this type of value generation to support more testable code. Summary Generic delegates give us a lot of power to make truly generic algorithms and classes. The Func family of delegates is a great way to be able to specify functions to calculate a result based on 0-16 arguments. Stay tuned in the weeks that follow for other generic delegates in the .NET Framework! Tweet Technorati Tags: .NET, C#, CSharp, Little Wonders, Generics, Func, Delegates

Read the article

Developer IT

aggregators - Page 2 - Developer IT

Search Results

Search found 27 results on 2 pages for 'aggregators'.

Page 2/2 | < Previous Page | 1 2

Running a simple integration scenario using the Oracle Big Data Connectors on Hadoop/HDFS cluster

- by hamsun

Read the article

C#/.NET Little Wonders: The Generic Func Delegates

- by James Michael Hare

Read the article

< Previous Page | 1 2