optimistic concurrency - Page 25

How solve consumer/producer task using semaphores

- by user1074896

I have SimpleProducerConsumer class that illustrate consumer/producer problem (I am not sure that it's correct). public class SimpleProducerConsumer { private Stack<Object> stack = new Stack<Object>(); private static final int STACK_MAX_SIZE = 10; public static void main(String[] args) { SimpleProducerConsumer pc = new SimpleProducerConsumer(); new Thread(pc.new Producer(), "p1").start(); new Thread(pc.new Producer(), "p2").start(); new Thread(pc.new Consumer(), "c1").start(); new Thread(pc.new Consumer(), "c2").start(); new Thread(pc.new Consumer(), "c3").start(); } public synchronized void push(Object d) { while (stack.size() >= STACK_MAX_SIZE) try { wait(); } catch (InterruptedException e) { e.printStackTrace(); } try { Thread.sleep(1000); } catch (InterruptedException e) { e.printStackTrace(); } stack.push(new Object()); System.out.println("push " + Thread.currentThread().getName() + " " + stack.size()); notify(); } public synchronized Object pop() { while (stack.size() == 0) try { wait(); } catch (InterruptedException e) { e.printStackTrace(); } try { Thread.sleep(50); } catch (InterruptedException e) { e.printStackTrace(); } stack.pop(); System.out.println("pop " + Thread.currentThread().getName() + " " + stack.size()); notify(); return null; } class Consumer implements Runnable { @Override public void run() { while (true) { pop(); } } } class Producer implements Runnable { @Override public void run() { while (true) { push(new Object()); } } } } I found simple realization of semaphore(here:http://docs.oracle.com/javase/tutorial/essential/concurrency/guardmeth.html I know that there is concurrency package) How I need to change code to exchange java objects monitors to my custom semaphore. (To illustrate C/P problem using semaphores) Semaphore: class Semaphore { private int counter; public Semaphore() { this(0); } public Semaphore(int i) { if (i < 0) throw new IllegalArgumentException(i + " < 0"); counter = i; } public synchronized void release() { if (counter == 0) { notify(); } counter++; } public synchronized void acquire() throws InterruptedException { while (counter == 0) { wait(); } counter--; } }

Read the article

Differences in the different ways to make concurrent programs

- by dotnetdev

Hi, What is the difference between: Starting a new thread Using TPL Using BackgroundWorker All of these create concurrency but what are the low-level differences between these? Do all 3 make threads anyway? Thanks

Read the article

ConcurrenctBag(Of MyType) Vs List(Of MyType)

- by Ben

What is the advantage of using a ConcurrentBag(Of MyType) against just using a List(Of MyType)? The MSDN page on the CB (http://msdn.microsoft.com/en-us/library/dd381779(v=VS.100).aspx) states that ConcurrentBag(Of T) is a thread-safe bag implementation, optimized for scenarios where the same thread will be both producing and consuming data stored in the bag So what is the advantage? I can understand the advantage of the other collection types in the Concurrency namespace, but this one puzzled me. Thanks.

Read the article

Benchmarks used to test a C and C++ allocator?

- by Viet

Please kindly advise on benchmarks used to test a C and C++ allocator? Benchmarks satisfying any of the following aspects are considered: Speed Fragmentation Concurrency Thanks!

Read the article

Upload File to Windows Azure Blob in Chunks through ASP.NET MVC, JavaScript and HTML5

- by Shaun

Originally posted on: http://geekswithblogs.net/shaunxu/archive/2013/07/01/upload-file-to-windows-azure-blob-in-chunks-through-asp.net.aspxMany people are using Windows Azure Blob Storage to store their data in the cloud. Blob storage provides 99.9% availability with easy-to-use API through .NET SDK and HTTP REST. For example, we can store JavaScript files, images, documents in blob storage when we are building an ASP.NET web application on a Web Role in Windows Azure. Or we can store our VHD files in blob and mount it as a hard drive in our cloud service. If you are familiar with Windows Azure, you should know that there are two kinds of blob: page blob and block blob. The page blob is optimized for random read and write, which is very useful when you need to store VHD files. The block blob is optimized for sequential/chunk read and write, which has more common usage. Since we can upload block blob in blocks through BlockBlob.PutBlock, and them commit them as a whole blob with invoking the BlockBlob.PutBlockList, it is very powerful to upload large files, as we can upload blocks in parallel, and provide pause-resume feature. There are many documents, articles and blog posts described on how to upload a block blob. Most of them are focus on the server side, which means when you had received a big file, stream or binaries, how to upload them into blob storage in blocks through .NET SDK. But the problem is, how can we upload these large files from client side, for example, a browser. This questioned to me when I was working with a Chinese customer to help them build a network disk production on top of azure. The end users upload their files from the web portal, and then the files will be stored in blob storage from the Web Role. My goal is to find the best way to transform the file from client (end user’s machine) to the server (Web Role) through browser. In this post I will demonstrate and describe what I had done, to upload large file in chunks with high speed, and save them as blocks into Windows Azure Blob Storage. Traditional Upload, Works with Limitation The simplest way to implement this requirement is to create a web page with a form that contains a file input element and a submit button. 1: @using (Html.BeginForm("About", "Index", FormMethod.Post, new { enctype = "multipart/form-data" })) 2: { 3: <input type="file" name="file" /> 4: <input type="submit" value="upload" /> 5: } And then in the backend controller, we retrieve the whole content of this file and upload it in to the blob storage through .NET SDK. We can split the file in blocks and upload them in parallel and commit. The code had been well blogged in the community. 1: [HttpPost] 2: public ActionResult About(HttpPostedFileBase file) 3: { 4: var container = _client.GetContainerReference("test"); 5: container.CreateIfNotExists(); 6: var blob = container.GetBlockBlobReference(file.FileName); 7: var blockDataList = new Dictionary<string, byte[]>(); 8: using (var stream = file.InputStream) 9: { 10: var blockSizeInKB = 1024; 11: var offset = 0; 12: var index = 0; 13: while (offset < stream.Length) 14: { 15: var readLength = Math.Min(1024 * blockSizeInKB, (int)stream.Length - offset); 16: var blockData = new byte[readLength]; 17: offset += stream.Read(blockData, 0, readLength); 18: blockDataList.Add(Convert.ToBase64String(BitConverter.GetBytes(index)), blockData); 19: 20: index++; 21: } 22: } 23: 24: Parallel.ForEach(blockDataList, (bi) => 25: { 26: blob.PutBlock(bi.Key, new MemoryStream(bi.Value), null); 27: }); 28: blob.PutBlockList(blockDataList.Select(b => b.Key).ToArray()); 29: 30: return RedirectToAction("About"); 31: } This works perfect if we selected an image, a music or a small video to upload. But if I selected a large file, let’s say a 6GB HD-movie, after upload for about few minutes the page will be shown as below and the upload will be terminated. In ASP.NET there is a limitation of request length and the maximized request length is defined in the web.config file. It’s a number which less than about 4GB. So if we want to upload a really big file, we cannot simply implement in this way. Also, in Windows Azure, a cloud service network load balancer will terminate the connection if exceed the timeout period. From my test the timeout looks like 2 - 3 minutes. Hence, when we need to upload a large file we cannot just use the basic HTML elements. Besides the limitation mentioned above, the simple HTML file upload cannot provide rich upload experience such as chunk upload, pause and pause-resume. So we need to find a better way to upload large file from the client to the server. Upload in Chunks through HTML5 and JavaScript In order to break those limitation mentioned above we will try to upload the large file in chunks. This takes some benefit to us such as - No request size limitation: Since we upload in chunks, we can define the request size for each chunks regardless how big the entire file is. - No timeout problem: The size of chunks are controlled by us, which means we should be able to make sure request for each chunk upload will not exceed the timeout period of both ASP.NET and Windows Azure load balancer. It was a big challenge to upload big file in chunks until we have HTML5. There are some new features and improvements introduced in HTML5 and we will use them to implement our solution. In HTML5, the File interface had been improved with a new method called “slice”. It can be used to read part of the file by specifying the start byte index and the end byte index. For example if the entire file was 1024 bytes, file.slice(512, 768) will read the part of this file from the 512nd byte to 768th byte, and return a new object of interface called "Blob”, which you can treat as an array of bytes. In fact, a Blob object represents a file-like object of immutable, raw data. The File interface is based on Blob, inheriting blob functionality and expanding it to support files on the user's system. For more information about the Blob please refer here. File and Blob is very useful to implement the chunk upload. We will use File interface to represent the file the user selected from the browser and then use File.slice to read the file in chunks in the size we wanted. For example, if we wanted to upload a 10MB file with 512KB chunks, then we can read it in 512KB blobs by using File.slice in a loop. Assuming we have a web page as below. User can select a file, an input box to specify the block size in KB and a button to start upload. 1: <div> 2: <input type="file" id="upload_files" name="files[]" /><br /> 3: Block Size: <input type="number" id="block_size" value="512" name="block_size" />KB<br /> 4: <input type="button" id="upload_button_blob" name="upload" value="upload (blob)" /> 5: </div> Then we can have the JavaScript function to upload the file in chunks when user clicked the button. 1: <script type="text/javascript"> 1: 2: $(function () { 3: $("#upload_button_blob").click(function () { 4: }); 5: });</script> Firstly we need to ensure the client browser supports the interfaces we are going to use. Just try to invoke the File, Blob and FormData from the “window” object. If any of them is “undefined” the condition result will be “false” which means your browser doesn’t support these premium feature and it’s time for you to get your browser updated. FormData is another new feature we are going to use in the future. It could generate a temporary form for us. We will use this interface to create a form with chunk and associated metadata when invoked the service through ajax. 1: $("#upload_button_blob").click(function () { 2: // assert the browser support html5 3: if (window.File && window.Blob && window.FormData) { 4: alert("Your brwoser is awesome, let's rock!"); 5: } 6: else { 7: alert("Oh man plz update to a modern browser before try is cool stuff out."); 8: return; 9: } 10: }); Each browser supports these interfaces by their own implementation and currently the Blob, File and File.slice are supported by Chrome 21, FireFox 13, IE 10, Opera 12 and Safari 5.1 or higher. After that we worked on the files the user selected one by one since in HTML5, user can select multiple files in one file input box. 1: var files = $("#upload_files")[0].files; 2: for (var i = 0; i < files.length; i++) { 3: var file = files[i]; 4: var fileSize = file.size; 5: var fileName = file.name; 6: } Next, we calculated the start index and end index for each chunks based on the size the user specified from the browser. We put them into an array with the file name and the index, which will be used when we upload chunks into Windows Azure Blob Storage as blocks since we need to specify the target blob name and the block index. At the same time we will store the list of all indexes into another variant which will be used to commit blocks into blob in Azure Storage once all chunks had been uploaded successfully. 1: $("#upload_button_blob").click(function () { 2: // assert the browser support html5 3: ... ... 4: // start to upload each files in chunks 5: var files = $("#upload_files")[0].files; 6: for (var i = 0; i < files.length; i++) { 7: var file = files[i]; 8: var fileSize = file.size; 9: var fileName = file.name; 10: 11: // calculate the start and end byte index for each blocks(chunks) 12: // with the index, file name and index list for future using 13: var blockSizeInKB = $("#block_size").val(); 14: var blockSize = blockSizeInKB * 1024; 15: var blocks = []; 16: var offset = 0; 17: var index = 0; 18: var list = ""; 19: while (offset < fileSize) { 20: var start = offset; 21: var end = Math.min(offset + blockSize, fileSize); 22: 23: blocks.push({ 24: name: fileName, 25: index: index, 26: start: start, 27: end: end 28: }); 29: list += index + ","; 30: 31: offset = end; 32: index++; 33: } 34: } 35: }); Now we have all chunks’ information ready. The next step should be upload them one by one to the server side, and at the server side when received a chunk it will upload as a block into Blob Storage, and finally commit them with the index list through BlockBlobClient.PutBlockList. But since all these invokes are ajax calling, which means not synchronized call. So we need to introduce a new JavaScript library to help us coordinate the asynchronize operation, which named “async.js”. You can download this JavaScript library here, and you can find the document here. I will not explain this library too much in this post. We will put all procedures we want to execute as a function array, and pass into the proper function defined in async.js to let it help us to control the execution sequence, in series or in parallel. Hence we will define an array and put the function for chunk upload into this array. 1: $("#upload_button_blob").click(function () { 2: // assert the browser support html5 3: ... ... 4: 5: // start to upload each files in chunks 6: var files = $("#upload_files")[0].files; 7: for (var i = 0; i < files.length; i++) { 8: var file = files[i]; 9: var fileSize = file.size; 10: var fileName = file.name; 11: // calculate the start and end byte index for each blocks(chunks) 12: // with the index, file name and index list for future using 13: ... ... 14: 15: // define the function array and push all chunk upload operation into this array 16: blocks.forEach(function (block) { 17: putBlocks.push(function (callback) { 18: }); 19: }); 20: } 21: }); 22: }); As you can see, I used File.slice method to read each chunks based on the start and end byte index we calculated previously, and constructed a temporary HTML form with the file name, chunk index and chunk data through another new feature in HTML5 named FormData. Then post this form to the backend server through jQuery.ajax. This is the key part of our solution. 1: $("#upload_button_blob").click(function () { 2: // assert the browser support html5 3: ... ... 4: // start to upload each files in chunks 5: var files = $("#upload_files")[0].files; 6: for (var i = 0; i < files.length; i++) { 7: var file = files[i]; 8: var fileSize = file.size; 9: var fileName = file.name; 10: // calculate the start and end byte index for each blocks(chunks) 11: // with the index, file name and index list for future using 12: ... ... 13: // define the function array and push all chunk upload operation into this array 14: blocks.forEach(function (block) { 15: putBlocks.push(function (callback) { 16: // load blob based on the start and end index for each chunks 17: var blob = file.slice(block.start, block.end); 18: // put the file name, index and blob into a temporary from 19: var fd = new FormData(); 20: fd.append("name", block.name); 21: fd.append("index", block.index); 22: fd.append("file", blob); 23: // post the form to backend service (asp.net mvc controller action) 24: $.ajax({ 25: url: "/Home/UploadInFormData", 26: data: fd, 27: processData: false, 28: contentType: "multipart/form-data", 29: type: "POST", 30: success: function (result) { 31: if (!result.success) { 32: alert(result.error); 33: } 34: callback(null, block.index); 35: } 36: }); 37: }); 38: }); 39: } 40: }); Then we will invoke these functions one by one by using the async.js. And once all functions had been executed successfully I invoked another ajax call to the backend service to commit all these chunks (blocks) as the blob in Windows Azure Storage. 1: $("#upload_button_blob").click(function () { 2: // assert the browser support html5 3: ... ... 4: // start to upload each files in chunks 5: var files = $("#upload_files")[0].files; 6: for (var i = 0; i < files.length; i++) { 7: var file = files[i]; 8: var fileSize = file.size; 9: var fileName = file.name; 10: // calculate the start and end byte index for each blocks(chunks) 11: // with the index, file name and index list for future using 12: ... ... 13: // define the function array and push all chunk upload operation into this array 14: ... ... 15: // invoke the functions one by one 16: // then invoke the commit ajax call to put blocks into blob in azure storage 17: async.series(putBlocks, function (error, result) { 18: var data = { 19: name: fileName, 20: list: list 21: }; 22: $.post("/Home/Commit", data, function (result) { 23: if (!result.success) { 24: alert(result.error); 25: } 26: else { 27: alert("done!"); 28: } 29: }); 30: }); 31: } 32: }); That’s all in the client side. The outline of our logic would be - Calculate the start and end byte index for each chunks based on the block size. - Defined the functions of reading the chunk form file and upload the content to the backend service through ajax. - Execute the functions defined in previous step with “async.js”. - Commit the chunks by invoking the backend service in Windows Azure Storage finally. Save Chunks as Blocks into Blob Storage In above we finished the client size JavaScript code. It uploaded the file in chunks to the backend service which we are going to implement in this step. We will use ASP.NET MVC as our backend service, and it will receive the chunks, upload into Windows Azure Bob Storage in blocks, then finally commit as one blob. As in the client side we uploaded chunks by invoking the ajax call to the URL "/Home/UploadInFormData", I created a new action under the Index controller and it only accepts HTTP POST request. 1: [HttpPost] 2: public JsonResult UploadInFormData() 3: { 4: var error = string.Empty; 5: try 6: { 7: } 8: catch (Exception e) 9: { 10: error = e.ToString(); 11: } 12: 13: return new JsonResult() 14: { 15: Data = new 16: { 17: success = string.IsNullOrWhiteSpace(error), 18: error = error 19: } 20: }; 21: } Then I retrieved the file name, index and the chunk content from the Request.Form object, which was passed from our client side. And then, used the Windows Azure SDK to create a blob container (in this case we will use the container named “test”.) and create a blob reference with the blob name (same as the file name). Then uploaded the chunk as a block of this blob with the index, since in Blob Storage each block must have an index (ID) associated with so that finally we can put all blocks as one blob by specifying their block ID list. 1: [HttpPost] 2: public JsonResult UploadInFormData() 3: { 4: var error = string.Empty; 5: try 6: { 7: var name = Request.Form["name"]; 8: var index = int.Parse(Request.Form["index"]); 9: var file = Request.Files[0]; 10: var id = Convert.ToBase64String(BitConverter.GetBytes(index)); 11: 12: var container = _client.GetContainerReference("test"); 13: container.CreateIfNotExists(); 14: var blob = container.GetBlockBlobReference(name); 15: blob.PutBlock(id, file.InputStream, null); 16: } 17: catch (Exception e) 18: { 19: error = e.ToString(); 20: } 21: 22: return new JsonResult() 23: { 24: Data = new 25: { 26: success = string.IsNullOrWhiteSpace(error), 27: error = error 28: } 29: }; 30: } Next, I created another action to commit the blocks into blob once all chunks had been uploaded. Similarly, I retrieved the blob name from the Request.Form. I also retrieved the chunks ID list, which is the block ID list from the Request.Form in a string format, split them as a list, then invoked the BlockBlob.PutBlockList method. After that our blob will be shown in the container and ready to be download. 1: [HttpPost] 2: public JsonResult Commit() 3: { 4: var error = string.Empty; 5: try 6: { 7: var name = Request.Form["name"]; 8: var list = Request.Form["list"]; 9: var ids = list 10: .Split(',') 11: .Where(id => !string.IsNullOrWhiteSpace(id)) 12: .Select(id => Convert.ToBase64String(BitConverter.GetBytes(int.Parse(id)))) 13: .ToArray(); 14: 15: var container = _client.GetContainerReference("test"); 16: container.CreateIfNotExists(); 17: var blob = container.GetBlockBlobReference(name); 18: blob.PutBlockList(ids); 19: } 20: catch (Exception e) 21: { 22: error = e.ToString(); 23: } 24: 25: return new JsonResult() 26: { 27: Data = new 28: { 29: success = string.IsNullOrWhiteSpace(error), 30: error = error 31: } 32: }; 33: } Now we finished all code we need. The whole process of uploading would be like this below. Below is the full client side JavaScript code. 1: <script type="text/javascript" src="~/Scripts/async.js"></script> 2: <script type="text/javascript"> 3: $(function () { 4: $("#upload_button_blob").click(function () { 5: // assert the browser support html5 6: if (window.File && window.Blob && window.FormData) { 7: alert("Your brwoser is awesome, let's rock!"); 8: } 9: else { 10: alert("Oh man plz update to a modern browser before try is cool stuff out."); 11: return; 12: } 13: 14: // start to upload each files in chunks 15: var files = $("#upload_files")[0].files; 16: for (var i = 0; i < files.length; i++) { 17: var file = files[i]; 18: var fileSize = file.size; 19: var fileName = file.name; 20: 21: // calculate the start and end byte index for each blocks(chunks) 22: // with the index, file name and index list for future using 23: var blockSizeInKB = $("#block_size").val(); 24: var blockSize = blockSizeInKB * 1024; 25: var blocks = []; 26: var offset = 0; 27: var index = 0; 28: var list = ""; 29: while (offset < fileSize) { 30: var start = offset; 31: var end = Math.min(offset + blockSize, fileSize); 32: 33: blocks.push({ 34: name: fileName, 35: index: index, 36: start: start, 37: end: end 38: }); 39: list += index + ","; 40: 41: offset = end; 42: index++; 43: } 44: 45: // define the function array and push all chunk upload operation into this array 46: var putBlocks = []; 47: blocks.forEach(function (block) { 48: putBlocks.push(function (callback) { 49: // load blob based on the start and end index for each chunks 50: var blob = file.slice(block.start, block.end); 51: // put the file name, index and blob into a temporary from 52: var fd = new FormData(); 53: fd.append("name", block.name); 54: fd.append("index", block.index); 55: fd.append("file", blob); 56: // post the form to backend service (asp.net mvc controller action) 57: $.ajax({ 58: url: "/Home/UploadInFormData", 59: data: fd, 60: processData: false, 61: contentType: "multipart/form-data", 62: type: "POST", 63: success: function (result) { 64: if (!result.success) { 65: alert(result.error); 66: } 67: callback(null, block.index); 68: } 69: }); 70: }); 71: }); 72: 73: // invoke the functions one by one 74: // then invoke the commit ajax call to put blocks into blob in azure storage 75: async.series(putBlocks, function (error, result) { 76: var data = { 77: name: fileName, 78: list: list 79: }; 80: $.post("/Home/Commit", data, function (result) { 81: if (!result.success) { 82: alert(result.error); 83: } 84: else { 85: alert("done!"); 86: } 87: }); 88: }); 89: } 90: }); 91: }); 92: </script> And below is the full ASP.NET MVC controller code. 1: public class HomeController : Controller 2: { 3: private CloudStorageAccount _account; 4: private CloudBlobClient _client; 5: 6: public HomeController() 7: : base() 8: { 9: _account = CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("DataConnectionString")); 10: _client = _account.CreateCloudBlobClient(); 11: } 12: 13: public ActionResult Index() 14: { 15: ViewBag.Message = "Modify this template to jump-start your ASP.NET MVC application."; 16: 17: return View(); 18: } 19: 20: [HttpPost] 21: public JsonResult UploadInFormData() 22: { 23: var error = string.Empty; 24: try 25: { 26: var name = Request.Form["name"]; 27: var index = int.Parse(Request.Form["index"]); 28: var file = Request.Files[0]; 29: var id = Convert.ToBase64String(BitConverter.GetBytes(index)); 30: 31: var container = _client.GetContainerReference("test"); 32: container.CreateIfNotExists(); 33: var blob = container.GetBlockBlobReference(name); 34: blob.PutBlock(id, file.InputStream, null); 35: } 36: catch (Exception e) 37: { 38: error = e.ToString(); 39: } 40: 41: return new JsonResult() 42: { 43: Data = new 44: { 45: success = string.IsNullOrWhiteSpace(error), 46: error = error 47: } 48: }; 49: } 50: 51: [HttpPost] 52: public JsonResult Commit() 53: { 54: var error = string.Empty; 55: try 56: { 57: var name = Request.Form["name"]; 58: var list = Request.Form["list"]; 59: var ids = list 60: .Split(',') 61: .Where(id => !string.IsNullOrWhiteSpace(id)) 62: .Select(id => Convert.ToBase64String(BitConverter.GetBytes(int.Parse(id)))) 63: .ToArray(); 64: 65: var container = _client.GetContainerReference("test"); 66: container.CreateIfNotExists(); 67: var blob = container.GetBlockBlobReference(name); 68: blob.PutBlockList(ids); 69: } 70: catch (Exception e) 71: { 72: error = e.ToString(); 73: } 74: 75: return new JsonResult() 76: { 77: Data = new 78: { 79: success = string.IsNullOrWhiteSpace(error), 80: error = error 81: } 82: }; 83: } 84: } And if we selected a file from the browser we will see our application will upload chunks in the size we specified to the server through ajax call in background, and then commit all chunks in one blob. Then we can find the blob in our Windows Azure Blob Storage. Optimized by Parallel Upload In previous example we just uploaded our file in chunks. This solved the problem that ASP.NET MVC request content size limitation as well as the Windows Azure load balancer timeout. But it might introduce the performance problem since we uploaded chunks in sequence. In order to improve the upload performance we could modify our client side code a bit to make the upload operation invoked in parallel. The good news is that, “async.js” library provides the parallel execution function. If you remembered the code we invoke the service to upload chunks, it utilized “async.series” which means all functions will be executed in sequence. Now we will change this code to “async.parallel”. This will invoke all functions in parallel. 1: $("#upload_button_blob").click(function () { 2: // assert the browser support html5 3: ... ... 4: // start to upload each files in chunks 5: var files = $("#upload_files")[0].files; 6: for (var i = 0; i < files.length; i++) { 7: var file = files[i]; 8: var fileSize = file.size; 9: var fileName = file.name; 10: // calculate the start and end byte index for each blocks(chunks) 11: // with the index, file name and index list for future using 12: ... ... 13: // define the function array and push all chunk upload operation into this array 14: ... ... 15: // invoke the functions one by one 16: // then invoke the commit ajax call to put blocks into blob in azure storage 17: async.parallel(putBlocks, function (error, result) { 18: var data = { 19: name: fileName, 20: list: list 21: }; 22: $.post("/Home/Commit", data, function (result) { 23: if (!result.success) { 24: alert(result.error); 25: } 26: else { 27: alert("done!"); 28: } 29: }); 30: }); 31: } 32: }); In this way all chunks will be uploaded to the server side at the same time to maximize the bandwidth usage. This should work if the file was not very large and the chunk size was not very small. But for large file this might introduce another problem that too many ajax calls are sent to the server at the same time. So the best solution should be, upload the chunks in parallel with maximum concurrency limitation. The code below specified the concurrency limitation to 4, which means at the most only 4 ajax calls could be invoked at the same time. 1: $("#upload_button_blob").click(function () { 2: // assert the browser support html5 3: ... ... 4: // start to upload each files in chunks 5: var files = $("#upload_files")[0].files; 6: for (var i = 0; i < files.length; i++) { 7: var file = files[i]; 8: var fileSize = file.size; 9: var fileName = file.name; 10: // calculate the start and end byte index for each blocks(chunks) 11: // with the index, file name and index list for future using 12: ... ... 13: // define the function array and push all chunk upload operation into this array 14: ... ... 15: // invoke the functions one by one 16: // then invoke the commit ajax call to put blocks into blob in azure storage 17: async.parallelLimit(putBlocks, 4, function (error, result) { 18: var data = { 19: name: fileName, 20: list: list 21: }; 22: $.post("/Home/Commit", data, function (result) { 23: if (!result.success) { 24: alert(result.error); 25: } 26: else { 27: alert("done!"); 28: } 29: }); 30: }); 31: } 32: }); Summary In this post we discussed how to upload files in chunks to the backend service and then upload them into Windows Azure Blob Storage in blocks. We focused on the frontend side and leverage three new feature introduced in HTML 5 which are - File.slice: Read part of the file by specifying the start and end byte index. - Blob: File-like interface which contains the part of the file content. - FormData: Temporary form element that we can pass the chunk alone with some metadata to the backend service. Then we discussed the performance consideration of chunk uploading. Sequence upload cannot provide maximized upload speed, but the unlimited parallel upload might crash the browser and server if too many chunks. So we finally came up with the solution to upload chunks in parallel with the concurrency limitation. We also demonstrated how to utilize “async.js” JavaScript library to help us control the asynchronize call and the parallel limitation. Regarding the chunk size and the parallel limitation value there is no “best” value. You need to test vary composition and find out the best one for your particular scenario. It depends on the local bandwidth, client machine cores and the server side (Windows Azure Cloud Service Virtual Machine) cores, memory and bandwidth. Below is one of my performance test result. The client machine was Windows 8 IE 10 with 4 cores. I was using Microsoft Cooperation Network. The web site was hosted on Windows Azure China North data center (in Beijing) with one small web role (1.7GB 1 core CPU, 1.75GB memory with 100Mbps bandwidth). The test cases were - Chunk size: 512KB, 1MB, 2MB, 4MB. - Upload Mode: Sequence, parallel (unlimited), parallel with limit (4 threads, 8 threads). - Chunk Format: base64 string, binaries. - Target file: 100MB. - Each case was tested 3 times. Below is the test result chart. Some thoughts, but not guidance or best practice: - Parallel gets better performance than series. - No significant performance improvement between parallel 4 threads and 8 threads. - Transform with binaries provides better performance than base64. - In all cases, chunk size in 1MB - 2MB gets better performance. Hope this helps, Shaun All documents and related graphics, codes are provided "AS IS" without warranty of any kind. Copyright © Shaun Ziyan Xu. This work is licensed under the Creative Commons License.

Read the article

ODI 12c - Parallel Table Load

- by David Allan

In this post we will look at the ODI 12c capability of parallel table load from the aspect of the mapping developer and the knowledge module developer - two quite different viewpoints. This is about parallel table loading which isn't to be confused with loading multiple targets per se. It supports the ability for ODI mappings to be executed concurrently especially if there is an overlap of the datastores that they access, so any temporary resources created may be uniquely constructed by ODI. Temporary objects can be anything basically - common examples are staging tables, indexes, views, directories - anything in the ETL to help the data integration flow do its job. In ODI 11g users found a few workarounds (such as changing the technology prefixes - see here) to build unique temporary names but it was more of a challenge in error cases. ODI 12c mappings by default operate exactly as they did in ODI 11g with respect to these temporary names (this is also true for upgraded interfaces and scenarios) but can be configured to support the uniqueness capabilities. We will look at this feature from two aspects; that of a mapping developer and that of a developer (of procedures or KMs). 1. Firstly as a Mapping Developer..... 1.1 Control when uniqueness is enabled A new property is available to set unique name generation on/off. When unique names have been enabled for a mapping, all temporary names used by the collection and integration objects will be generated using unique names. This property is presented as a check-box in the Property Inspector for a deployment specification. 1.2 Handle cleanup after successful execution Provided that all temporary objects that are created have a corresponding drop statement then all of the temporary objects should be removed during a successful execution. This should be the case with the KMs developed by Oracle. 1.3 Handle cleanup after unsuccessful execution If an execution failed in ODI 11g then temporary tables would have been left around and cleaned up in the subsequent run. In ODI 12c, KM tasks can now have a cleanup-type task which is executed even after a failure in the main tasks. These cleanup tasks will be executed even on failure if the property 'Remove Temporary Objects on Error' is set. If the agent was to crash and not be able to execute this task, then there is an ODI tool (OdiRemoveTemporaryObjects here) you can invoke to cleanup the tables - it supports date ranges and the like. That's all there is to it from the aspect of the mapping developer it's much, much simpler and straightforward. You can now execute the same mapping concurrently or execute many mappings using the same resource concurrently without worrying about conflict. 2. Secondly as a Procedure or KM Developer..... In the ODI Operator the executed code shows the actual name that is generated - you can also see the runtime code prior to execution (introduced in 11.1.1.7), for example below in the code type I selected 'Pre-executed Code' this lets you see the code about to be processed and you can also see the executed code (which is the default view). References to the collection (C$) and integration (I$) names will be automatically made unique by using the odiRef APIs - these objects will have unique names whenever concurrency has been enabled for a particular mapping deployment specification. It's also possible to use name uniqueness functions in procedures and your own KMs. 2.1 New uniqueness tags You can also make your own temporary objects have unique names by explicitly including either %UNIQUE_STEP_TAG or %UNIQUE_SESSION_TAG in the name passed to calls to the odiRef APIs. Such names would always include the unique tag regardless of the concurrency setting. To illustrate, let's look at the getObjectName() method. At <% expansion time, this API will append %UNIQUE_STEP_TAG to the object name for collection and integration tables. The name parameter passed to this API may contain %UNIQUE_STEP_TAG or %UNIQUE_SESSION_TAG. This API always generates to the <? version of getObjectName() At execution time this API will replace the unique tag macros with a string that is unique to the current execution scope. The returned name will conform to the name-length restriction for the target technology, and its pattern for the unique tag. Any necessary truncation will be performed against the initial name for the object and any other fixed text that may have been specified. Examples are:- <?=odiRef.getObjectName("L", "%COL_PRFEMP%UNIQUE_STEP_TAG", "D")?> SCOTT.C$_EABH7QI1BR1EQI3M76PG9SIMBQQ <?=odiRef.getObjectName("L", "EMP%UNIQUE_STEP_TAG_AE", "D")?> SCOTT.EMPAO96Q2JEKO0FTHQP77TMSAIOSR_ Methods which have this kind of support include getFrom, getTableName, getTable, getObjectShortName and getTemporaryIndex. There are APIs for retrieving this tag info also, the getInfo API has been extended with the following properties (the UNIQUE* properties can also be used in ODI procedures); UNIQUE_STEP_TAG - Returns the unique value for the current step scope, e.g. 5rvmd8hOIy7OU2o1FhsF61 Note that this will be a different value for each loop-iteration when the step is in a loop. UNIQUE_SESSION_TAG - Returns the unique value for the current session scope, e.g. 6N38vXLrgjwUwT5MseHHY9 IS_CONCURRENT - Returns info about the current mapping, will return 0 or 1 (only in % phase) GUID_SRC_SET - Returns the UUID for the current source set/execution unit (only in % phase) The getPop API has been extended with the IS_CONCURRENT property which returns info about an mapping, will return 0 or 1. 2.2 Additional APIs Some new APIs are provided including getFormattedName which will allow KM developers to construct a name from fixed-text or ODI symbols that can be optionally truncate to a max length and use a specific encoding for the unique tag. It has syntax getFormattedName(String pName[, String pTechnologyCode]) This API is available at both the % and the ? phase. The format string can contain the ODI prefixes that are available for getObjectName(), e.g. %INT_PRF, %COL_PRF, %ERR_PRF, %IDX_PRF alongwith %UNIQUE_STEP_TAG or %UNIQUE_SESSION_TAG. The latter tags will be expanded into a unique string according to the specified technology. Calls to this API within the same execution context are guaranteed to return the same unique name provided that the same parameters are passed to the call. e.g. <%=odiRef.getFormattedName("%COL_PRFMY_TABLE%UNIQUE_STEP_TAG_AE", "ORACLE")%> <?=odiRef.getFormattedName("%COL_PRFMY_TABLE%UNIQUE_STEP_TAG_AE", "ORACLE")?> C$_MY_TAB7wDiBe80vBog1auacS1xB_AE <?=odiRef.getFormattedName("%COL_PRFMY_TABLE%UNIQUE_STEP_TAG.log", "FILE")?> C2_MY_TAB7wDiBe80vBog1auacS1xB.log 2.3 Name length generation As part of name generation, the length of the generated name will be compared with the maximum length for the target technology and truncation may need to be applied. When a unique tag is included in the generated string it is important that uniqueness is not compromised by truncation of the unique tag. When a unique tag is NOT part of the generated name, the name will be truncated by removing characters from the end - this is the existing 11g algorithm. When a unique tag is included, the algorithm will first truncate the <postfix> and if necessary the <prefix>. It is recommended that users will ensure there is sufficient uniqueness in the <prefix> section to ensure uniqueness of the final resultant name. SUMMARY To summarize, ODI 12c make it much simpler to utilize mappings in concurrent cases and provides APIs for helping developing any procedures or custom knowledge modules in such a way they can be used in highly concurrent, parallel scenarios.

Read the article

SQL SERVER – Introduction to SQL Server 2014 In-Memory OLTP

- by Pinal Dave

In SQL Server 2014 Microsoft has introduced a new database engine component called In-Memory OLTP aka project “Hekaton” which is fully integrated into the SQL Server Database Engine. It is optimized for OLTP workloads accessing memory resident data. In-memory OLTP helps us create memory optimized tables which in turn offer significant performance improvement for our typical OLTP workload. The main objective of memory optimized table is to ensure that highly transactional tables could live in memory and remain in memory forever without even losing out a single record. The most significant part is that it still supports majority of our Transact-SQL statement. Transact-SQL stored procedures can be compiled to machine code for further performance improvements on memory-optimized tables. This engine is designed to ensure higher concurrency and minimal blocking. In-Memory OLTP alleviates the issue of locking, using a new type of multi-version optimistic concurrency control. It also substantially reduces waiting for log writes by generating far less log data and needing fewer log writes. Points to remember Memory-optimized tables refer to tables using the new data structures and key words added as part of In-Memory OLTP. Disk-based tables refer to your normal tables which we used to create in SQL Server since its inception. These tables use a fixed size 8 KB pages that need to be read from and written to disk as a unit. Natively compiled stored procedures refer to an object Type which is new and is supported by in-memory OLTP engine which convert it into machine code, which can further improve the data access performance for memory –optimized tables. Natively compiled stored procedures can only reference memory-optimized tables, they can’t be used to reference any disk –based table. Interpreted Transact-SQL stored procedures, which is what SQL Server has always used. Cross-container transactions refer to transactions that reference both memory-optimized tables and disk-based tables. Interop refers to interpreted Transact-SQL that references memory-optimized tables. Using In-Memory OLTP In-Memory OLTP engine has been available as part of SQL Server 2014 since June 2013 CTPs. Installation of In-Memory OLTP is part of the SQL Server setup application. The In-Memory OLTP components can only be installed with a 64-bit edition of SQL Server 2014 hence they are not available with 32-bit editions. Creating Databases Any database that will store memory-optimized tables must have a MEMORY_OPTIMIZED_DATA filegroup. This filegroup is specifically designed to store the checkpoint files needed by SQL Server to recover the memory-optimized tables, and although the syntax for creating the filegroup is almost the same as for creating a regular filestream filegroup, it must also specify the option CONTAINS MEMORY_OPTIMIZED_DATA. Here is an example of a CREATE DATABASE statement for a database that can support memory-optimized tables: CREATE DATABASE InMemoryDB ON PRIMARY(NAME = [InMemoryDB_data], FILENAME = 'D:\data\InMemoryDB_data.mdf', size=500MB), FILEGROUP [SampleDB_mod_fg] CONTAINS MEMORY_OPTIMIZED_DATA (NAME = [InMemoryDB_mod_dir], FILENAME = 'S:\data\InMemoryDB_mod_dir'), (NAME = [InMemoryDB_mod_dir], FILENAME = 'R:\data\InMemoryDB_mod_dir') LOG ON (name = [SampleDB_log], Filename='L:\log\InMemoryDB_log.ldf', size=500MB) COLLATE Latin1_General_100_BIN2; Above example code creates files on three different drives (D: S: and R:) for the data files and in memory storage so if you would like to run this code kindly change the drive and folder locations as per your convenience. Also notice that binary collation was specified as Windows (non-SQL). BIN2 collation is the only collation support at this point for any indexes on memory optimized tables. It is also possible to add a MEMORY_OPTIMIZED_DATA file group to an existing database, use the below command to achieve the same. ALTER DATABASE AdventureWorks2012 ADD FILEGROUP hekaton_mod CONTAINS MEMORY_OPTIMIZED_DATA; GO ALTER DATABASE AdventureWorks2012 ADD FILE (NAME='hekaton_mod', FILENAME='S:\data\hekaton_mod') TO FILEGROUP hekaton_mod; GO Creating Tables There is no major syntactical difference between creating a disk based table or a memory –optimized table but yes there are a few restrictions and a few new essential extensions. Essentially any memory-optimized table should use the MEMORY_OPTIMIZED = ON clause as shown in the Create Table query example. DURABILITY clause (SCHEMA_AND_DATA or SCHEMA_ONLY) Memory-optimized table should always be defined with a DURABILITY value which can be either SCHEMA_AND_DATA or SCHEMA_ONLY the former being the default. A memory-optimized table defined with DURABILITY=SCHEMA_ONLY will not persist the data to disk which means the data durability is compromised whereas DURABILITY= SCHEMA_AND_DATA ensures that data is also persisted along with the schema. Indexing Memory Optimized Table A memory-optimized table must always have an index for all tables created with DURABILITY= SCHEMA_AND_DATA and this can be achieved by declaring a PRIMARY KEY Constraint at the time of creating a table. The following example shows a PRIMARY KEY index created as a HASH index, for which a bucket count must also be specified. CREATE TABLE Mem_Table ( [Name] VARCHAR(32) NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000), [City] VARCHAR(32) NULL, [State_Province] VARCHAR(32) NULL, [LastModified] DATETIME NOT NULL, ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA); Now as you can see in the above query example we have used the clause MEMORY_OPTIMIZED = ON to make sure that it is considered as a memory optimized table and not just a normal table and also used the DURABILITY Clause= SCHEMA_AND_DATA which means it will persist data along with metadata and also you can notice this table has a PRIMARY KEY mentioned upfront which is also a mandatory clause for memory-optimized tables. We will talk more about HASH Indexes and BUCKET_COUNT in later articles on this topic which will be focusing more on Row and Index storage on Memory-Optimized tables. So stay tuned for that as well. Now as we covered the basics of Memory Optimized tables and understood the key things to remember while using memory optimized tables, let’s explore more using examples to understand the Performance gains using memory-optimized tables. I will be using the database which i created earlier in this article i.e. InMemoryDB in the below Demo Exercise. USE InMemoryDB GO -- Creating a disk based table CREATE TABLE dbo.Disktable ( Id INT IDENTITY, Name CHAR(40) ) GO CREATE NONCLUSTERED INDEX IX_ID ON dbo.Disktable (Id) GO -- Creating a memory optimized table with similar structure and DURABILITY = SCHEMA_AND_DATA CREATE TABLE dbo.Memorytable_durable ( Id INT NOT NULL PRIMARY KEY NONCLUSTERED Hash WITH (bucket_count =1000000), Name CHAR(40) ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA) GO -- Creating an another memory optimized table with similar structure but DURABILITY = SCHEMA_Only CREATE TABLE dbo.Memorytable_nondurable ( Id INT NOT NULL PRIMARY KEY NONCLUSTERED Hash WITH (bucket_count =1000000), Name CHAR(40) ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_only) GO -- Now insert 100000 records in dbo.Disktable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Disktable(Name) VALUES('sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END -- Do the same inserts for Memory table dbo.Memorytable_durable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Memorytable_durable VALUES(@i_t, 'sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END -- Now finally do the same inserts for Memory table dbo.Memorytable_nondurable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Memorytable_nondurable VALUES(@i_t, 'sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END The above 3 Inserts took 1.20 minutes, 54 secs, and 2 secs respectively to insert 100000 records on my machine with 8 Gb RAM. This proves the point that memory-optimized tables can definitely help businesses achieve better performance for their highly transactional business table and memory- optimized tables with Durability SCHEMA_ONLY is even faster as it does not bother persisting its data to disk which makes it supremely fast. Koenig Solutions is one of the few organizations which offer IT training on SQL Server 2014 and all its updates. Now, I leave the decision on using memory_Optimized tables on you, I hope you like this article and it helped you understand the fundamentals of IN-Memory OLTP . Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Koenig

Read the article

Scheduling thread tiles with C++ AMP

- by Daniel Moth

This post assumes you are totally comfortable with, what some of us call, the simple model of C++ AMP, i.e. you could write your own matrix multiplication. We are now ready to explore the tiled model, which builds on top of the non-tiled one. Tiling the extent We know that when we pass a grid (which is just an extent under the covers) to the parallel_for_each call, it determines the number of threads to schedule and their index values (including dimensionality). For the single-, two-, and three- dimensional cases you can go a step further and subdivide the threads into what we call tiles of threads (others may call them thread groups). So here is a single-dimensional example: extent<1> e(20); // 20 units in a single dimension with indices from 0-19 grid<1> g(e); // same as extent tiled_grid<4> tg = g.tile<4>(); …on the 3rd line we subdivided the single-dimensional space into 5 single-dimensional tiles each having 4 elements, and we captured that result in a concurrency::tiled_grid (a new class in amp.h). Let's move on swiftly to another example, in pictures, this time 2-dimensional: So we start on the left with a grid of a 2-dimensional extent which has 8*6=48 threads. We then have two different examples of tiling. In the first case, in the middle, we subdivide the 48 threads into tiles where each has 4*3=12 threads, hence we have 2*2=4 tiles. In the second example, on the right, we subdivide the original input into tiles where each has 2*2=4 threads, hence we have 4*3=12 tiles. Notice how you can play with the tile size and achieve different number of tiles. The numbers you pick must be such that the original total number of threads (in our example 48), remains the same, and every tile must have the same size. Of course, you still have no clue why you would do that, but stick with me. First, we should see how we can use this tiled_grid, since the parallel_for_each function that we know expects a grid. Tiled parallel_for_each and tiled_index It turns out that we have additional overloads of parallel_for_each that accept a tiled_grid instead of a grid. However, those overloads, also expect that the lambda you pass in accepts a concurrency::tiled_index (new in amp.h), not an index<N>. So how is a tiled_index different to an index? A tiled_index object, can have only 1 or 2 or 3 dimensions (matching exactly the tiled_grid), and consists of 4 index objects that are accessible via properties: global, local, tile_origin, and tile. The global index is the same as the index we know and love: the global thread ID. The local index is the local thread ID within the tile. The tile_origin index returns the global index of the thread that is at position 0,0 of this tile, and the tile index is the position of the tile in relation to the overall grid. Confused? Here is an example accompanied by a picture that hopefully clarifies things: array_view<int, 2> data(8, 6, p_my_data); parallel_for_each(data.grid.tile<2,2>(), [=] (tiled_index<2,2> t_idx) restrict(direct3d) { /* todo */ }); Given the code above and the picture on the right, what are the values of each of the 4 index objects that the t_idx variables exposes, when the lambda is executed by T (highlighted in the picture on the right)? If you can't work it out yourselves, the solution follows: t_idx.global = index<2> (6,3) t_idx.local = index<2> (0,1) t_idx.tile_origin = index<2> (6,2) t_idx.tile = index<2> (3,1) Don't move on until you are comfortable with this… the picture really helps, so use it. Tiled Matrix Multiplication Example – part 1 Let's paste here the C++ AMP matrix multiplication example, bolding the lines we are going to change (can you guess what the changes will be?) 01: void MatrixMultiplyTiled_Part1(vector<float>& vC, const vector<float>& vA, const vector<float>& vB, int M, int N, int W) 02: { 03: 04: array_view<const float,2> a(M, W, vA); 05: array_view<const float,2> b(W, N, vB); 06: array_view<writeonly<float>,2> c(M, N, vC); 07: parallel_for_each(c.grid, 08: [=](index<2> idx) restrict(direct3d) { 09: 10: int row = idx[0]; int col = idx[1]; 11: float sum = 0.0f; 12: for(int i = 0; i < W; i++) 13: sum += a(row, i) * b(i, col); 14: c[idx] = sum; 15: }); 16: } To turn this into a tiled example, first we need to decide our tile size. Let's say we want each tile to be 16*16 (which assumes that we'll have at least 256 threads to process, and that c.grid.extent.size() is divisible by 256, and moreover that c.grid.extent[0] and c.grid.extent[1] are divisible by 16). So we insert at line 03 the tile size (which must be a compile time constant). 03: static const int TS = 16; ...then we need to tile the grid to have tiles where each one has 16*16 threads, so we change line 07 to be as follows 07: parallel_for_each(c.grid.tile<TS,TS>(), ...that means that our index now has to be a tiled_index with the same characteristics as the tiled_grid, so we change line 08 08: [=](tiled_index<TS, TS> t_idx) restrict(direct3d) { ...which means, without changing our core algorithm, we need to be using the global index that the tiled_index gives us access to, so we insert line 09 as follows 09: index<2> idx = t_idx.global; ...and now this code just works and it is tiled! Closing thoughts on part 1 The process we followed just shows the mechanical transformation that can take place from the simple model to the tiled model (think of this as step 1). In fact, when we wrote the matrix multiplication example originally, the compiler was doing this mechanical transformation under the covers for us (and it has additional smarts to deal with the cases where the total number of threads scheduled cannot be divisible by the tile size). The point is that the thread scheduling is always tiled, even when you use the non-tiled model. But with this mechanical transformation, we haven't gained anything… Hint: our goal with explicitly using the tiled model is to gain even more performance. In the next post, we'll evolve this further (beyond what the compiler can automatically do for us, in this first release), so you can see the full usage of the tiled model and its benefits… Comments about this post by Daniel Moth welcome at the original blog.

Read the article

Master-slave vs. peer-to-peer archictecture: benefits and problems

- by Ashok_Ora

Normal 0 false false false EN-US X-NONE X-NONE Almost two decades ago, I was a member of a database development team that introduced adaptive locking. Locking, the most popular concurrency control technique in database systems, is pessimistic. Locking ensures that two or more conflicting operations on the same data item don’t “trample” on each other’s toes, resulting in data corruption. In a nutshell, here’s the issue we were trying to address. In everyday life, traffic lights serve the same purpose. They ensure that traffic flows smoothly and when everyone follows the rules, there are no accidents at intersections. As I mentioned earlier, the problem with typical locking protocols is that they are pessimistic. Regardless of whether there is another conflicting operation in the system or not, you have to hold a lock! Acquiring and releasing locks can be quite expensive, depending on how many objects the transaction touches. Every transaction has to pay this penalty. To use the earlier traffic light analogy, if you have ever waited at a red light in the middle of nowhere with no one on the road, wondering why you need to wait when there’s clearly no danger of a collision, you know what I mean. The adaptive locking scheme that we invented was able to minimize the number of locks that a transaction held, by detecting whether there were one or more transactions that needed conflicting eyou could get by without holding any lock at all. In many “well-behaved” workloads, there are few conflicts, so this optimization is a huge win. If, on the other hand, there are many concurrent, conflicting requests, the algorithm gracefully degrades to the “normal” behavior with minimal cost. We were able to reduce the number of lock requests per TPC-B transaction from 178 requests down to 2! Wow! This is a dramatic improvement in concurrency as well as transaction latency. The lesson from this exercise was that if you can identify the common scenario and optimize for that case so that only the uncommon scenarios are more expensive, you can make dramatic improvements in performance without sacrificing correctness. So how does this relate to the architecture and design of some of the modern NoSQL systems? NoSQL systems can be broadly classified as master-slave sharded, or peer-to-peer sharded systems. NoSQL systems with a peer-to-peer architecture have an interesting way of handling changes. Whenever an item is changed, the client (or an intermediary) propagates the changes synchronously or asynchronously to multiple copies (for availability) of the data. Since the change can be propagated asynchronously, during some interval in time, it will be the case that some copies have received the update, and others haven’t. What happens if someone tries to read the item during this interval? The client in a peer-to-peer system will fetch the same item from multiple copies and compare them to each other. If they’re all the same, then every copy that was queried has the same (and up-to-date) value of the data item, so all’s good. If not, then the system provides a mechanism to reconcile the discrepancy and to update stale copies. So what’s the problem with this? There are two major issues: First, IT’S HORRIBLY PESSIMISTIC because, in the common case, it is unlikely that the same data item will be updated and read from different locations at around the same time! For every read operation, you have to read from multiple copies. That’s a pretty expensive, especially if the data are stored in multiple geographically separate locations and network latencies are high. Second, if the copies are not all the same, the application has to reconcile the differences and propagate the correct value to the out-dated copies. This means that the application program has to handle discrepancies in the different versions of the data item and resolve the issue (which can further add to cost and operation latency). Resolving discrepancies is only one part of the problem. What if the same data item was updated independently on two different nodes (copies)? In that case, due to the asynchronous nature of change propagation, you might land up with different versions of the data item in different copies. In this case, the application program also has to resolve conflicts and then propagate the correct value to the copies that are out-dated or have incorrect versions. This can get really complicated. My hunch is that there are many peer-to-peer-based applications that don’t handle this correctly, and worse, don’t even know it. Imagine have 100s of millions of records in your database – how can you tell whether a particular data item is incorrect or out of date? And what price are you willing to pay for ensuring that the data can be trusted? Multiple network messages per read request? Discrepancy and conflict resolution logic in the application, and potentially, additional messages? All this overhead, when all you were trying to do was to read a data item. Wouldn’t it be simpler to avoid this problem in the first place? Master-slave architectures like the Oracle NoSQL Database handles this very elegantly. A change to a data item is always sent to the master copy. Consequently, the master copy always has the most current and authoritative version of the data item. The master is also responsible for propagating the change to the other copies (for availability and read scalability). Client drivers are aware of master copies and replicas, and client drivers are also aware of the “currency” of a replica. In other words, each NoSQL Database client knows how stale a replica is. This vastly simplifies the job of the application developer. If the application needs the most current version of the data item, the client driver will automatically route the request to the master copy. If the application is willing to tolerate some staleness of data (e.g. a version that is no more than 1 second out of date), the client can easily determine which replica (or set of replicas) can satisfy the request, and route the request to the most efficient copy. This results in a dramatic simplification in application logic and also minimizes network requests (the driver will only send the request to exactl the right replica, not many). So, back to my original point. A well designed and well architected system minimizes or eliminates unnecessary overhead and avoids pessimistic algorithms wherever possible in order to deliver a highly efficient and high performance system. If you’ve every programmed an Oracle NoSQL Database application, you’ll know the difference! /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;}

Read the article

Python: Random is barely random at all?

- by orokusaki

I did this to test the randomness of randint: >>> from random import randint >>> >>> uniques = [] >>> for i in range(4500): # You can see I optimistic. ... x = randint(500, 5000) ... if x in uniques: ... raise Exception('We duped ' + str(x) + ' at iteration number ' + str(i)) ... uniques.append(x) ... Traceback (most recent call last): File "(stdin)", line 4, in (module) Exception: 'We duped 4061 at iteration number 67 I tried about 10 times more and the best result I got was 121 iterations before a repeater. Is this the best sort of result you can get from the standard library?

Read the article

Passing a SAFEARRAY from C# to COM

- by SlavaGu

I use 3rd party COM to find faces in a picture. One of the methods has the following signature, from SDK: long FindMultipleFaces( IUnknown* pIDibImage, VARIANTARG* FacePositionArray ); Parameters: pIDibImage[in] - The image to search. FacePositionArray[out]- The array of FacePosition2 objects into which face information is placed. This array is in a safe array (VARIANT) of type VT_UNKNOWN. The size of the array dictates the maximum number of faces for which to search. which translates into the following C# method signature (from metadata): int FindMultipleFaces(object pIDibImage, ref object pIFacePositions); Being optimistic I call it the following way but get an exception that the memory is corrupt. The exception is thrown only when a face is present in the image. FacePosition2[] facePositions = new FacePosition2[10]; object positions = facePositions; int faceCount = FaceLocator.FindMultipleFaces(dibImage, ref positions); What's the right way to pass SAFEARRAY to unmanaged code?

Read the article

What are jQuery best practices regarding Ajax convenience methods and error handling?

- by JonathanHayward

Let's suppose, for an example, that I want to partly clone Gmail's interface with jQuery Ajax and implement periodic auto-saving as well as sending. And in particular, let us suppose that I care about error handling, expecting network and other errors, and instead of just being optimistic I want sensible handling of different errors. If I use the "low-level" feature of $.ajax() then it's clear how to specify an error callback, but the convenience methods of $.get(), $.post(), and .load() do not allow an error callback to be specified. What are the best practices for pessimistic error handling? Is it by registering a .ajaxError() with certain wrapped sets, or an introspection-style global error handler in $.ajaxSetup()? What would the relevant portions of code look like to initiate an autosave so that a "could not autosave" type warning is displayed if an attempted autosave fails, and perhaps a message that is customized to the type of error? Thanks,

Read the article

Linq2SQL using Update StoredProcedure

- by PeterTheNiceGuy

We use queries generated by Linq for data retrieval but for INSERT and UPDATE we do not allow generated SQL, but restrict to the use of stored procedures. I connected the Update and the Insert behaviour in the DBML to the stored procedures. The procedures are called, the data gets inserted/updated = all if fine, except in the case of optimistic concurrency. If a record was changed between retrieval and update, the update should fail. When Linq generates the Update statement itself, it throws a ChangeConflictException as expected, but using the stored procedure no Exception is thrown. Thanks a lot for any help on this!

Read the article

Getting consecutive version numbers from Hibernate's @Version usage once per transaction

- by Cheradenine

We use Hibernate with the following version definition for optimistic locking et. al: <version name="version" access="field" column="VERSION" type="long" unsaved-value="negative"/> This is fine and dandy; however, there is one small problem, which is that the first version for some entities is '0', and for others, it is '1'. Why this is happening, is that for some object graphs, an entity will be subject to both onSave and flushDirty - this is reasonable, such as if two object are circular dependencies. However, the version number gets incremented on both occasions, leading to the above '0' / '1' discrepancy. I'd really like the version number only to ever increment once per transaction. However, I can't see a simple way to do this in the hibernate versioning implementation, without hacking about with an Interceptor (which was how I generated a column value for version before, but I wanted hibernate to do it itself)..

Read the article

Using property file in hibernate mapping

- by Zoltan Hamori

Hi, I have a two nodes environment using the same database. In the database there is a resource table like RESOURCE_ID, CODE, NODE The content of the NODE column can be 1 or 2 depending on which node can use it. As I need to deploy the same ear to the two nodes, I would like to map this table like this: <hibernate-mapping> <class name="ResourceVO" table="RESOURCE" dynamic-update="true" optimistic-lock="dirty" where="NODE=${node.value}" > I would like to store the node.value property on the file system, so the instances could identify which resource to use. Is it possible in hibernate?

Read the article

Articles about replication schemes/algorithms?

- by jkff

I'm designing an hierarchical distributed system (every node has zero or more "master" nodes to which it propagates its current data). The data gets continuously updated and I'd like to guarantee that at least N nodes have almost-current data at any given time. I do not need complete consistency, only eventual consistency (t.i. for any time instant, the current snapshot of data should eventually appear on at least N nodes. It is tricky to define the term "current" here, but still). Nodes may fail and go back up at any moment, and there is no single "central" node. O overflowers! Point me to some good papers describing replication schemes. I've so far found one: Consistency Management in Optimistic Replication Algorithms

Read the article

When and why are you planning to upgrade to Python 3.0?

- by Tomislav Mutak

Python 3.0 (aka Python 3000, Py3k, etc) is now available. When and why are you planning on porting your project or code to the new Python? edit: I'm particularly interested in any features that don't exist in 2.6 that make porting worth it. Right now seems like a lot of negatives (x hasn't been ported yet), but I don't know what people see as the positives. Regarding "when", I'm interested in people's thoughts that the first step to porting is to have "excellent test coverage" which seems a bit optimistic for some projects.

Read the article

What is the preferred sql server column definition to use for a LINQ to SQL Version property?

- by Mike Two

We are using the IsVersion property on the ColumnAttribute on a property in a LINQ to SQL class for optimistic concurrency checks. What should the column definition be in the database? Currently we are using version_number int NOT NULL IDENTITY (1, 1) Do we need Identity? Can we get LINQ to SQL to update the version number for us? The only issue with Identity is that every row has a different number. We'd like to see the number increment by 1 when the row is updated.

Read the article

hibernate versioning parent entity

- by Priit

Consider two entities Parent and Child. Child is part of Parent's transient collection Child has a ManyToOne mapping to parent with FetchType.LAZY Both are displayed on the same form to a user. When user saves the data we first update Parent instance and then Child collection (both using merge). Now comes the tricky part. When user modifies only Child property on the form then hibernate dirty checking does not update Parent instance and thus does not increase optimistic locking version number for that entity. I would like to see situation where only Parent is versioned and every time I call merge for Parent then version is always updated even if actual update is not executed in db.

Read the article

Does Python Django support custom SQL and denormalized databases with no Foreign Key relationships?

- by Jay

I've just started learning Python Django and have a lot of experience building high traffic websites using PHP and MySQL. What worries me so far is Python's overly optimistic approach that you will never need to write custom SQL and that it automatically creates all these Foreign Key relationships in your database. The one thing I've learned in the last few years of building Chess.com is that its impossible to NOT write custom SQL when you're dealing with something like MySQL that frequently needs to be told what indexes it should use (or avoid), and that Foreign Keys are a death sentence. Percona's strongest recommendation was for us to remove all FKs for optimal performance. Is there a way in Django to do this in the models file? create relationships without creating actual DB FKs? Or is there a way to start at the database level, design/create my database, and then have Django reverse engineer the models file?

Read the article

Is it possible to hide the cursor in a webpage using CSS or Javascript?

- by yeyeyerman

I want to hide the cursor when showing a webpage that is meant to display information in a building hall. It doesn't have to be interactive at all. I tried with the cursor property and a transparent cursor image but I didn't make it work. Does anybody know if this can be done? I suppose this can be thought as a security threat for a user that can't know where he is clicking on, so I'm not very optimistic... Thank you!

Read the article

How can this be done with (N)Hibernate?

- by Vilx-

I'm creating a windows forms application with NHibernate. It's an MDI application, so there is no limit to how many forms the user can have open at the same time (probably many). For most forms I want to have an "OK" and a "Cancel" button. Both close the form, but "OK" also saves the modified data to the DB. The forms can be pretty complex, and the modifications are likely to touch a whole graph of objects, adding some, deleting some, and changing some more. It would be good if the changes could be automatically detected and persisted as needed, without the need to manually keep track of each of them. What would be a good way to do this? Extra information: I can make whatever DB schema I want. I'm using MSSQL 2008 and currently have decided for GUID primary keys (with guid.comb generator) and a TIMESTAMP column for optimistic concurrency. I tried to simply set FlushMode of a NHibernate ISession to Never, doing all modifications as needed, and then calling Flush() if the user clicked OK. But that didn't work.

Read the article

How is Core Data detecting the conflicts, actually?

- by brainfrog

Apple says about -detectConflictsForObject: If on the next invocation of save: object has been modified in its persistent store, the save fails. This allows optimistic locking for unchanged objects. Conflict detection is always performed on changed or deleted objects. So what does this mean? If I simply modify an managed object and then save the context, there is always a conflict detection happening? Does this conflict detection simply compare the timestamps of the "records" to see if the "new" data is actually "old"? Is that a conflict?

Read the article

VC++ to C# migration guidelines/approaches/Issues

- by KSH

Hi all, We are planning to move few of our VC++ Legacy products to C# with .NET platform.. I am in the process of collecting the relavent information before making the proposal to give optimistic and effective approach to clients. Am looking for the following details. Any general guidelines in migration of VC++ to C#.NET What are the issues that a team can face when we take up this activity Are there any existing approaches available ? I believe many might have tried but may not have detailed information, but consolidating this under this would help not only me but anyone who look for these information. Any good / valid resources available on internet? Any suggestions from Microsoft team if any Microsoft people in this group? Architecture, components design approaches, etc. Please help me in getting these information, each penny would help me to gain good understanding.. Thanks in advance to those who will share their wisdom thorough this query.

Read the article

CodePlex Daily Summary for Friday, May 14, 2010

CodePlex Daily Summary for Friday, May 14, 2010New ProjectsCampfire#: Campfire# is a campfire client written in .NET 4.0 using WPF, which uses the Campfire API.CHESS: Systematic Concurrency Testing: CHESS is a tool for systematic and disciplined concurrency testing. Given a concurrent test, CHESS systematically enumerates the possible thread sc...cmpp: cmppcycloid: Arcanoid gameDotNetNuke® C#: The DotNetNuke® project is developed and maintained on a Visual Basic codebase, however a C# version has always been a popular request. This is a ...EasyBuildingCMS.NET: EasyBuildingCMS is an easy use content management system.fluidCMS: Provide for flexible management of web content that is not tightly integrated with the layout and rendering of sites that consume the content.Golem: An automation tool oriented to localization engineering environmentHB Batch Encoder Mk 2: HandBrake Batch Encoder Mk II This Program was adapted from an original project downloaded from codeplex by the name of "Handbrake Batch Encoder"...Integrating Social Media Networks: This is part of my pos graduation project.Ketonic: The Ketonic project aims to improve development of websites based on the Kentico CMS. LinkSharp: LinkSharp is a short-URL provider that can be used to generate short static non changing URL's. The web interface allows you to easily add / edit /...PUC NET (C++ Network Library - PUC Minas): This is an Academic Library for an Easy Development of Applications and Games based on Network Communication.Regular Expression Tester: Small utility for testing regular expressionsSharePoint User Management WebPart: SharePoint User Management WebPartSharpBox: SharpBox makes it easier for .NET developers to interact with existing cloud storage service, e.g. DropBox or Amazon S3Snipivit: Snipivit is a snippet manager service and VS2010 plugin that allows small development teams to store all their code snippets on a central database,...Software Factories Applied: Software Factories Applied is a project collecting the companion bits for the eponymous book to be published by Wiley & Sons in 2011. The authors ...The Ping Master: A service that periodically pings network addresses and allows the running of command line type utilities in response to success or failure.Title Safe Region Checker: A simple utility for XNA developers to check screenshots from games intended for release on the LIVE Marketplace for "title safe" region compliance...Trial project: sky is blueUyghur Named Date: Generate Uyghur named date string. ئۇيغۇرچە ئاي ناملىق چىسلا ھاسىل قىلىشWildcard Search Web Part for SharePoint 2010: The Wildcard Search web part for MOSS 2007 was wildly successful. Although, SharePoint 2010 has built-in wildcard searching functionality, the out...在线Office控件 Online Offical Control: 在线Office控件软件作品发布平台: SoftwarePublishPlatform 软件作品发布平台New ReleasesDemina: Demina Binaries version 0.1: Demina binaries are now available. This release (version 0.1) is an alpha version. Please report any bugs for extermination.EasyTFS: EasyTfs 1.0 Beta 2: Added cache refreshing when contents are updated rather than just every 10 minutes. Added window title based on currently-open case. Added attachme...Extending C# editor - Outlining, classification: Initial release: Initial releaseHB Batch Encoder Mk 2: HB Batch Encoder Mk2 v1.01: Binary release files.HB Batch Encoder Mk 2: Source Code: Source CodeHobbyBrew Mobile: Beta 2: Corretti numerosi bug, data un implementazione "approssimativa" del riscaldamento per Infusione. Aggiornamento consigliato!HouseFly controls: HouseFly controls beta 1.0.2.0: HouseFly controls relase 1.0.2.0 betaHtml Reader: Beta 2: I fixed a bug in HtmlElementCollection, Which exposed an integer enumerator, instead of enumerating through HtmlElements. I added a WPF Window tha...Html to OpenXml: HtmlToOpenXml 1.2: Fix some reported bug. See change set for description. The dll library to include in your project. The dll is signed for GAC support. Compiled wi...Infection Protection: Infection Protection 0.1: This is the final version of Infection Protection that was entered into the 2010 OGPC game competition.Jobping Url Shortener: Deploy Code 0.5.1: Deployment code for Version 0.5 This version includes our Jobping style.Jobping Url Shortener: Source Code 0.5.1: Source code for the 0.5 release. This release includes our Jobping style skin.Kooboo HTML form: Kooboo HTML form module 2.1.0.1: HTML form module contributed by member aledelgo. Add SMTP user and password authentication.KooBoo Image Galery: Beta 2: This new version corrects some issues pointed by Guoqi Zheng Some schema and folders were renamed, so it's better to uninstall the module and remo...MFCMAPI: May 2010 Release: If you just want to run the tool, get the executable. If you want to debug it, get the symbol file and the source. Build: 6.0.0.1020 The 64 bit bu...MVC Turbine: Release 2.1 for MVC2: This RTM contains the same features as v2.0 RTM plus these features: Instance Registration to IServiceLocator You can now add an instance of a typ...NazTek.Extension.Clr4: NazTek.Extension.Clr4 Binary: Binary releaseNazTek.Extension.Clr4: NazTek.Extension.Clr4 Source: Cab with source codeNSIS Autorun: NSIS Autorun 0.1.8: This release includes source code, executable binaries and example materials.Ottawa IT Day: 2010 Source Code and Presentations: During the Ottawa IT Day 2010, some of the presenters shared their code (and some presentations). This release is the culmination of all those effo...PHPWord: PHPWord 0.6.1 Beta: Changelog: Fixed Error when adding a JPEG image and opening in office 2007 Issue #1 Fixed Already defined constant PHPWORD_BASE_PATH Issue #2 F...Rapid Dictionary: Rapid Dictionary Alpha 2.0: Release Notes * Try auto updatable version: http://install.rapiddict.com/index.html Rapid Dictionary Alpha 2.0 includes such functionality: ...Shake - C# Make: Shake v0.1.18: Core changes. Process wrapper class, console logger, etc.SharpBox: SharpBox-Trunk: This is the SharpBox build from the trunk source branch!SharpBox: SharpBox-Trunk-Initial-Source: The initial source code, will be updated from time to timeSpackle.NET: 4.0.0.0 Release: This new drop contains the following A CreateBigInteger() method on SecureRandom to create random BigInteger values. An extension method to prop...StreamInsight example queries, input adapters and output adapters: StreamInsight Examples for V1.0 RTM: Zipped source code.The Ping Master: v0.1.0.0: Early release of The Ping Master for test purposes. Configuration tool is unfinished and does not include an installer.Title Safe Region Checker: Title Safe Region Checker v1.0.0.1: Release 1.0 of Title Safe Region Checker. No known bugs or problems. File is a zipped directory containing the necessary installation files.TortoiseHg: TortoiseHg 1.0.3: This is a bug fix release, we recommend all users upgrade to 1.0.3Usa*Usa Libraly: Smart.Windows.Navigation 0.4: Smart.Windows.Navigation simple navigation library ver 0.4.0. Include Windows Forms & Compact Framework samples. Information - Smart.Windows.Mvc ...VCC: Latest build, v2.1.30513.0: Automatic drop of latest buildWabbitStudio Z80 Software Tools: Wabbitcode: Wabbitcode is an Z80 Assembly IDE for Windows, OS X, and Linux. Built to take full advantage of the features of SPASM and Wabbitemu, Wabbitcode has...white: Release 0.20: Source Code: https://white-project.googlecode.com/svn/tags/0.20 Add few more keyboard keys like windows button and F13-F24. Fixed bugs for keyboar...Wildcard Search Web Part for SharePoint 2010: Version 1.0 Release 1: This is the initial release of the Wildcard Search Web Part for SharePoint 2010. All queries will be issued as wildcards unless disabled with the ...Windows Azure Command-line Tools for PHP Developers: Windows Azure Command-line Tools May 2010 Update: May 2010 Update – May 13, 2010 We are pleased to announced the May 2010 update of Windows Azure Command-Line Tools. In addition to bug fixes and i...WinXmlCook: WinXmlCook 2.1: Version 2.1 released!Xrns2XMod: Xrns2XMod 1.1: some source code optimization在线Office控件 Online Offical Control: SPOffice2.0Release: 该版本在MS Office2003/2007，WPS2009，WPS2010下测试通过Most Popular ProjectsRawrWBFS ManagerAJAX Control ToolkitMicrosoft SQL Server Product Samples: DatabaseSilverlight ToolkitWindows Presentation Foundation (WPF)patterns & practices – Enterprise LibraryMicrosoft SQL Server Community & SamplesPHPExcelASP.NETMost Active Projectspatterns & practices – Enterprise LibraryMirror Testing SystemRawrBlogEngine.NETPHPExcelMicrosoft Biology FoundationwhiteWindows Azure Command-line Tools for PHP DevelopersStyleCopShake - C# Make

Search Results

Search found 908 results on 37 pages for 'optimistic concurrency'.

Page 25/37 | < Previous Page | 21 22 23 24 25 26 27 28 29 30 31 32 | Next Page >

- by user1074896

- by dotnetdev

- by Ben

- by Viet

- by Shaun

- by David Allan

- by Pinal Dave

- by Daniel Moth

- by Ashok_Ora

- by orokusaki

- by SlavaGu

- by JonathanHayward

- by PeterTheNiceGuy

- by Cheradenine

- by Zoltan Hamori

- by jkff

- by Tomislav Mutak

- by Mike Two

- by Priit

- by Jay

- by yeyeyerman

- by Vilx-

- by brainfrog

- by KSH

< Previous Page | 21 22 23 24 25 26 27 28 29 30 31 32 | Next Page >