If that sounds confusing, let me give you an example. Let's say you expose a method to read a database of products, and instead of returning a List<Product> you return an IEnumerable<Product> in iterator form (yield return).

This accomplishes several good things. The IDataReader is not passed out of the Data Access Layer, which prevents both a leaky abstraction and a potential resource leak. You don't need to construct a full List<Product> in memory (which could be very big) if you just want to iterate forward once. And if you only want to consume up to a certain point in the list, you won't incur the database cost of looking up the other items.

This could give us an example like:

    using System.Collections.Generic;
    using System.Data;
    using System.Data.Common;
    using System.Data.SqlClient;

    // a sample data access object class to do standard CRUD operations.
    public class ProductDao
    {
        private DbProviderFactory _factory = SqlClientFactory.Instance;

        // _productsConnectionString and _getAllProductsStoredProc are assumed
        // to be string fields defined elsewhere in this class.

        // a method that would retrieve all available products
        public IEnumerable<Product> GetAvailableProducts()
        {
            // must create the connection
            using (var con = _factory.CreateConnection())
            {
                con.ConnectionString = _productsConnectionString;
                con.Open();

                // create the command
                using (var cmd = _factory.CreateCommand())
                {
                    cmd.Connection = con;
                    cmd.CommandText = _getAllProductsStoredProc;
                    cmd.CommandType = CommandType.StoredProcedure;

                    // get a reader and pass back all results
                    using (var reader = cmd.ExecuteReader())
                    {
                        while (reader.Read())
                        {
                            yield return new Product
                            {
                                Name = reader["product_name"].ToString(),
                                ...
                            };
                        }
                    }
                }
            }
        }
    }
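The third benefit (not paying for items you never consume) can be seen without a database. Here is a minimal sketch: GetProductNames and its log list are hypothetical stand-ins for the reader loop, not part of ProductDao, and exist only to record when each item is actually produced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Hypothetical stand-in for the reader loop: logs each item as it is "fetched".
    public static IEnumerable<string> GetProductNames(List<string> log)
    {
        foreach (var name in new[] { "Widget", "Gadget", "Gizmo", "Doohickey" })
        {
            log.Add("fetched " + name);
            yield return name;
        }
    }

    public static void Main()
    {
        var log = new List<string>();

        // Calling an iterator method runs none of its body yet.
        var products = GetProductNames(log);
        Console.WriteLine(log.Count);                    // prints 0

        // Only the items actually consumed are fetched.
        var firstTwo = products.Take(2).ToList();
        Console.WriteLine(string.Join(", ", firstTwo));  // prints Widget, Gadget
        Console.WriteLine(log.Count);                    // prints 2
    }
}
```

The same deferral applies to any iterator method that opens a resource inside its body: nothing in the body, including a call like con.Open(), runs until the caller starts iterating.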
The database details themselves are irrelevant. I will say, though, that I'm a big fan of using the System.Data.Common classes instead of the provider-specific counterparts (SqlCommand, OracleCommand, etc.) directly. This lets you mock your data sources easily in unit testing and also lets you swap out your provider in one line of code. In fact, one of the shared components I'm most proud of implementing was our group's DatabaseUtility library, which simplifies all the database access above into one line of code in a thread-safe and provider-neutral way.
I went with my own flavor instead of the EL because I didn't want to force internal company consumers to use the EL if they didn't want to, and it made it easy for them to mock their database for unit testing by providing a MockCommand, MockConnection, etc. that followed the System.Data.Common model. One of these days I'll blog on that if anyone's interested.
Regardless, you often have situations like the above where you are consuming and iterating through a resource that must be closed once you are finished iterating. For the reasons stated above, I didn't want to return IDataReader (that would force callers to remember to Dispose it), and I didn't want to return List<Product> (that would force them to hold all products in memory) -- but the first time I wrote this, I was worried. What if you exit the loop before consuming the last item? Are the reader, command, and connection all disposed correctly?
Of course, I was 99.999999% sure the creators of C# had already thought of this and taken care of it, but inspection in Reflector was difficult due to the nature of the state machines yield return generates. So I decided to try a quick example program to verify whether or not Dispose() is called when iteration is broken from outside the iterator itself -- i.e. before the iterator reports there are no more items.
So I wrote a quick Sequencer class with a Dispose() method and an iterator for it. Yes, it is COMPLETELY contrived:
    using System;

    // A disposable sequence of int -- yes this is completely contrived...
    internal class Sequencer : IDisposable
    {
        private int _i = 0;
        private readonly object _mutex = new object();

        // Constructs an int sequence.
        public Sequencer(int start)
        {
            _i = start;
        }

        // Gets the next integer
        public int GetNext()
        {
            lock (_mutex)
            {
                return _i++;
            }
        }

        // Dispose the sequence of integers.
        public void Dispose()
        {
            // _i holds the next number to hand out, one past the last returned
            Console.WriteLine("Disposed with last sequence number of {0}!", _i);

            // force output immediately (flush the buffer)
            Console.Out.Flush();
        }
    }
And then I created a generator (infinite-loop iterator) that did the using block for auto-Disposal:
    using System.Collections.Generic;

    // simply defines an extension method off of an int to start a sequence
    public static class SequencerExtensions
    {
        // generates an infinite sequence starting at the specified number
        public static IEnumerable<int> GetSequence(this int starter)
        {
            // note the using here, will call Dispose() when block terminated.
            using (var seq = new Sequencer(starter))
            {
                // infinite loop on this generator, means must be bounded by caller!
                while (true)
                {
                    yield return seq.GetNext();
                }
            }
        }
    }
This is really the same conundrum as the database problem originally posed. Here we are using iteration (yield return) over a large collection (an infinite sequence of integers). If we cut the sequence short by breaking iteration, will that using block exit and, hence, will Dispose be called?
Well, let's see:
    using System;

    // The test program class
    public class IteratorTest
    {
        // The main test method.
        public static void Main()
        {
            Console.WriteLine("Going to consume 10 of infinite items");
            Console.Out.Flush();

            foreach (var i in 0.GetSequence())
            {
                // could use TakeWhile, but wanted to output right at break...
                if (i >= 10)
                {
                    Console.WriteLine("Breaking now!");
                    Console.Out.Flush();
                    break;
                }

                Console.WriteLine(i);
                Console.Out.Flush();
            }

            Console.WriteLine("Done with loop.");
            Console.Out.Flush();
        }
    }
So, what do we see? Do we see the "Disposed" message from our dispose, or did the Dispose get skipped because from an "eyeball" perspective we should be locked in that infinite generator loop?
Here are the results:

    Going to consume 10 of infinite items
    0
    1
    2
    3
    4
    5
    6
    7
    8
    9
    Breaking now!
    Disposed with last sequence number of 11!
    Done with loop.
Yes indeed, when we break the loop, the state machine that C# generates for a yield return iterator exits the iteration through the using blocks and auto-disposes the IDisposable correctly. The mechanism is that foreach calls Dispose() on the enumerator when the loop exits, whether normally, by break, or by exception, and the generated enumerator's Dispose() runs any finally blocks pending at the current yield return, which includes the ones the using statements compile to. I must admit, though, the first time I wrote one, I began to wonder, and that led to this test.

If you've never seen iterators before (I wrote a previous entry here) the infinite loop may throw you, but keep in mind that it is not a linear piece of code: every time you hit a "yield return" it cedes control back to the state machine generated for the iterator. And this state machine, I'm happy to say, is smart enough to clean up the using blocks correctly. I suspected those wily guys and gals at Microsoft engineered it well, and I wasn't disappointed. But I've been bitten by assumptions before, so it's good to test and see.
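To see where the cleanup actually comes from, here is a small sketch comparing a few ways of consuming an iterator. The Resource class and Numbers method are hypothetical, made up for this illustration in the same spirit as Sequencer and GetSequence above. The second case is only roughly what the compiler emits for foreach, and the last case shows the one way I know of to defeat the cleanup: enumerating by hand and never disposing the enumerator.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DisposeDemo
{
    // Hypothetical resource that reports when it is disposed.
    private sealed class Resource : IDisposable
    {
        private readonly string _label;
        public Resource(string label) { _label = label; }
        public void Dispose() { Console.WriteLine(_label + ": disposed"); }
    }

    // Infinite generator wrapped in a using block, like GetSequence.
    private static IEnumerable<int> Numbers(string label)
    {
        using (new Resource(label))
        {
            var i = 0;
            while (true)
            {
                yield return i++;
            }
        }
    }

    public static void Main()
    {
        // 1. foreach with break: foreach disposes the enumerator on exit.
        foreach (var n in Numbers("foreach"))
        {
            if (n >= 2) break;
        }

        // 2. Roughly what that foreach expands to.
        using (var e = Numbers("expanded").GetEnumerator())
        {
            while (e.MoveNext())
            {
                if (e.Current >= 2) break;
            }
        }

        // 3. LINQ operators such as Take dispose the source enumerator too.
        var firstThree = Numbers("take").Take(3).ToList();
        Console.WriteLine(firstThree.Count);

        // 4. Manual enumeration with no Dispose: the using block never exits,
        //    so "manual: disposed" is never printed.
        var leaked = Numbers("manual").GetEnumerator();
        leaked.MoveNext();

        Console.WriteLine("done");
    }
}
```

The first three cases each print their "disposed" line; the fourth never does. So the guarantee really belongs to foreach (and to well-behaved consumers like LINQ), not to the iterator on its own. If you ever call GetEnumerator() yourself, wrap the enumerator in a using block.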
Yes, maybe you knew it would or figured it would, but isn't it nice to know? And as those campy 80s G.I. Joe cartoon public service reminders always taught us, "Knowing is half the battle...".