cores - Page 24 - Developer IT

Hardware-specific questions

- by overflow

I'm good at programming yet I feel like I don't know enough about the architecture of the hardware I'm working on. What does the Northbridge on the mainboard do? What does the L2 cache of my processor do? Can Windows XP use multiple processors? Not in terms of concrete multitasking in all programs but using the capacity of all cores if needed instead of always only one core. How can my processor/mainboard interact with multiple kinds of graphics/sound cards?

Read the article

Are indivisible operations still indivisible on multiprocessor and multicore systems?

- by Steve314

As per the title, plus what are the limitations and gotchas. For example, on x86 processors, alignment for most data types is optional - an optimisation rather than a requirement. That means that a pointer may be stored at an unaligned address, which in turn means that pointer might be split over a cache page boundary. Obviously this could be done if you work hard enough on any processor (picking out particular bytes etc), but not in a way where you'd still expect the write operation to be indivisible. I seriously doubt that a multicore processor can ensure that other cores can guarantee a consistent all-before or all-after view of a written pointer in this unaligned-write-crossing-a-page-boundary situation. Am I right? And are there any similar gotchas I haven't thought of?

Read the article

How does one modify the thread scheduling behavior when using Threading Building Blocks (TBB)?

- by J Teller

Does anyone know how to modify the thread scheduling (specifically affinity) when using TBB? Doing a high level analysis on a simple parallel-for application, it seems like TBB is specifying the underlying threads' affinity in a way that reduces performance. Specifically, the cores I'm running on have hyper-threading enabled, and it looks like TBB is affinitizing threads to the same core even if there is a different core left completely unloaded. FWIW, I realize it's likely that TBB is doing the "right thing" and that changing the threads' affinity will only reduce performance. I'd just like to experiment with it to see if that's really the case.

Read the article

Are mathematical algorithms protected by copyright?

- by analogy

I wish to implement, in my commercial software, an algorithm that I read about in a journal paper, and I want to know whether this is allowed. The algorithm is described in http://arxiv.org/abs/0709.2938. It is a very simple algorithm, and a number of implementations exist in Python (http://igraph.sourceforge.net/) and Java. One of them is under the GPL; another, which I got from a different researcher, has no license attached. There are significant differences between the two implementations, e.g. the second one uses threads and multiple cores. It is possible to rewrite (not translate) the algorithm. So can I use it in my software, or on a server, for commercial purposes? Thanks

Read the article

GPU Computing - # of GPUs supported

- by TehTypoKing

I currently have a desktop with 6 GPUs (3x HD 5970s) in non-CrossFire mode. Unfortunately, it seems that Windows 7 64-bit only supports up to 4 GPUs. I have not been able to find a reliable source to confirm or deny this. If Windows 7 has this limitation, is there a Linux flavor that supports more than 4 GPUs? In case you are wondering, this is not for gaming but for high-speed single-precision computing. With this setup (if I can find 6-GPU support) I am looking to reach 13.8 teraflops. Also, my motherboard does support 3 x16 PCI Express Gen2 slots, and I have a 1500 W power supply plugged into a 20 A outlet. Windows is able to detect all 6 GPU cores, although 2 of them display the warning "Drivers failed to load". To recap: Can Windows support 6 GPUs? If not, does Linux? Thank you.

Read the article

Huge page buffer vs. multiple simultaneous processes

- by Andrei K.

One of our customers has a 35 GB database with an average of 70-80 active connections. Some tables in the database have more than 10M records. They have now bought a new server: 4 x 6 cores = 24 CPU cores, 48 GB RAM, 2 RAID controllers with 256 MB cache and 8 SAS 15K HDDs each, and a 64-bit OS. I'm wondering which would be the fastest configuration: 1) FB 2.5 SuperServer with a huge buffer of 8192 * 3500000 pages = 29 GB, or 2) FB 2.5 Classic with a small buffer of 1000 pages. Maybe someone has tested such a case before and will save me days of work :) Thanks in advance.

Read the article

Killing the mysqld process

- by Josh K

I have a table with ~800k rows. I ran update users set hash = SHA1(CONCAT({about eight fields})) where 1; Now I have a hung Sequel Pro process and I'm not sure about the mysqld process. This is two questions:

1. What harm can possibly come from killing these programs? I'm working on a separate database, so no damage should come to other databases on the system, right?
2. Assume you had to update a table like this. What would be a quicker / more reliable method of updating without writing a separate script?

I just checked with phpMyAdmin and it appears as though the query is complete. I still have Sequel Pro using 100% of both my cores though...

Read the article

Is code clearness killing application performance?

- by Jorge Córdoba

As today's code is getting more complex by the minute, code needs to be designed to be maintainable, meaning easy to read and easy to understand. That being said, I can't help but remember the programs that ran years ago, such as Winamp or some games, where you needed a high-performance program because your 486 at 100 MHz wouldn't play mp3s with that beautiful mp3 player which consumed all of your CPU cycles. Now I run Media Player (or whatever), start playing an mp3, and it eats up 25-30% of one of my four cores. Come on!! If a 486 can do it, how can playback take up so much processor time to do the same thing? I'm a developer myself, and I always used to advise: keep your code simple, don't prematurely optimize for performance. It seems that we've gone from "trying to get it to use as little CPU as possible" to "if it doesn't take too much CPU, it's all right". So, do you think we are killing performance by ignoring optimizations?

Read the article

How to evaluate "enterprise" platforms?

- by Ran Biron

Hi all, I'm tasked with evaluating an "enterprise" platform for the next-gen version of a product. We're currently considering two "types" of platforms - RAD (workflow engine, integrated UI, small cores of "technology plugins" to the workflows, automatic persisting of state...) like SalesForce.com / Service-Now.com and "cloud based" (EC2 / AppEngine...). While I have a few ideas on where to start, I'd like your opinions - how would you evaluate platforms for an enterprise suite of products? What factors would you consider? How would you eliminate weak options quickly enough to be able to concentrate on the few strong ones? Also interesting is how would you compare enterprise RAD (proven technology, quick to develop - but tends to look "the same as the competition") to cloud-based technology (lots of "buzz", not that many competitors - easy to justify to management, but probably lacking (?) enterprise tools and experience). As said before - I have a few ideas, but would like to see some answers before I post mine so I wouldn't drive the discussion to a specific place. RB.

Read the article

Running applications via ruby and multi-core support? (OSX)

- by Nick Faraday

Hi All, I'm looking for some tutorials/resources/tips that will show me how to run applications via a ruby script. I have several small tools that we use in our day-to-day operations, and I want to manage their tasks from one ruby script. Basically what I'm trying to do is:

1. run the app via the ruby script (wait for the result)
2. get the result code (success, or error msg)
3. if ok, start the app on its next task

Also, each of the tasks is independent, so I'd like to take advantage of the 8 cores on my MacPro and run 8 instances at a time. Any resources you could point me towards would be greatly appreciated!

Read the article

What's your development setup? (Talking right now to my boss)

- by Flinkman

How do I tell my boss that I need endless CPU power to automate my daily job? By the way, what's your setup, now in Sep 2008? How fast are your disks? How much memory? How many cores? How big a screen? (OK, what the hell are you doing, you may ask. I'm working in multiple environments in VMware. I have a couple of build systems running for compatibility tests. These build systems are automated, and so is their setup. Is there another way?) Thanks!

Read the article

Inserting asynchronously into Oracle, any benefits?

- by Karl Trumstedt

I am using ODP.NET for loading data into Oracle. I am bulk inserting in groups of 1000 rows per call. Are there any performance benefits in calling my load method asynchronously? Say I want to insert 10000 rows: instead of making 10 calls synchronously, I make 10 calls asynchronously. My database is using ASSM right now, but otherwise plenty of freelists are used, of course. The database server has several cores as well. My initial tests seem to point to a performance increase, but maybe there is something I cannot see? Potential deadlock or contention issues? Of course, there is added complexity in handling transactions and such when doing my load this way.

Read the article

Why do debug symbols so adversely affect the performance of threaded applications on Linux?

- by fluffels

Hi. I'm writing a ray tracer. Recently, I added threading to the program to exploit the additional cores on my quad-core i5. In a weird turn of events, the debug version of the application is now running slower, but the optimized build is running faster than before I added threading. I'm passing the "-g -pg" flags to gcc for the debug build and the "-O3" flag for the optimized build. Host system: Ubuntu Linux 10.04 AMD64. I know that debug symbols add significant overhead to the program, but the relative performance has always been maintained, i.e. a faster algorithm will always run faster in both debug and optimized builds. Any idea why I'm seeing this behavior?

Read the article

Multiple things at once (Threads?)

- by Jonathan

All, What is a really simple way of having a program do more than one thing at once, even if the computer does not necessarily have multiple 'cores'. Can I do this by creating more than one Thread? My goal is to be able to have two computers networked (through Sockets) to respond to each-other's requests, while my program will at the same time be able to be managing a UI. I want the server to potentially handle more than one client at the same time as well. My understanding is that the communication is done with BufferedReader.readLine() and PrintWriter.println(). My problem is that I want the server to be waiting on multiple readLine() requests, and also be doing other things. How do I handle this? Many thanks, Jonathan

Read the article

Which parallel sorting algorithm has the best average case performance?

- by Craig P. Motlin

Sorting takes O(n log n) in the serial case. If we have O(n) processors we would hope for a linear speedup. O(log n) parallel algorithms exist but they have a very high constant. They also aren't applicable on commodity hardware which doesn't have anywhere near O(n) processors. With p processors, reasonable algorithms should take O(n/p log n/p) time. In the serial case, quick sort has the best runtime complexity on average. A parallel quick sort algorithm is easy to implement (see here and here). However it doesn't perform well since the very first step is to partition the whole collection on a single core. I have found information on many parallel sort algorithms but so far I have not seen anything pointing to a clear winner. I'm looking to sort lists of 1 million to 100 million elements in a JVM language running on 8 to 32 cores.
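To make the O(n/p log n/p) reasoning above concrete, here is a minimal sketch in Python (not a JVM language, and not a claim about which algorithm is the winner). It avoids the single-core partitioning step by splitting the input into p equal chunks, sorting each chunk in its own worker, and then merging the sorted runs. The names `split` and `parallel_sort` and the `executor_cls` parameter are made up for this illustration; the final merge is still serial, so this only shows the per-worker cost, not an optimal design.

```python
import heapq
import random
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def split(data, parts):
    """Split data into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sort(data, workers, executor_cls=ProcessPoolExecutor):
    """Sort each chunk in its own worker, then k-way merge the sorted runs."""
    chunks = split(data, workers)
    with executor_cls(max_workers=workers) as pool:
        # Each worker sorts about n/p elements: O((n/p) log(n/p)).
        runs = list(pool.map(sorted, chunks))
    # The merge runs on one core: O(n log p).
    return list(heapq.merge(*runs))


if __name__ == "__main__":
    values = [random.randrange(1_000_000) for _ in range(200_000)]
    assert parallel_sort(values, workers=4) == sorted(values)
```

Processes are the default because CPython threads do not sort in parallel; passing `ThreadPoolExecutor` is only useful for quick checks of the logic.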

Read the article

Creating a sub site in SharePoint takes a very long time

- by denni

Hi, I am working on a MOSS 2007 project and have customized many parts of it. There is a problem on the production server, where it takes a very long time (more than 15 minutes, and it sometimes fails due to timeouts) to create a sub site, even with the built-in site templates. On the development server it only takes 1 to 2 minutes. Both servers have the same configuration, with 8 CPU cores and 8 GB RAM. Both use separate database servers with the same configuration. The content DB size is around 100 GB, and there are more than a hundred subsites. What could be the reason why it takes so much longer on one server than on the other? Is there any configuration or something else I need to take care of? Thanks a lot, all help is appreciated.

Read the article

Is there a way to force CoreImage to use the GPU?

- by NSSplendid

We are having the following problem: a series of Core Image filters runs constantly in our program. When evaluating on my Macbook Pro, Core Image decides to schedule all graphics computation on the GPU, as expected. When using a MacPro, however, CI uses the CPUs! This is a problem, as we need them for other processing. [1] The question now is: Can one tell CI to run exclusively on the GPU? [1] Both hardware sets are of the newest kind. The MacPro has 8 Cores.

Read the article

scikit-learn ExtraTreesClassifier hanging

- by denson

I'm running scikit-learn on some rather large training datasets, ~1,600,000,000 rows with ~500 features. The platform is Ubuntu Server 14.04, and the hardware has 100 GB of RAM and 20 CPU cores. The test datasets have about half as many rows. I set n_jobs = 10 and forest_size = 3*number_of_features, so about 1700 trees. If I reduce the number of features to about 350 it works fine, but it never completes the training phase with the full feature set of 500+. The process is still executing and using about 20 GB of RAM, but is using 0% of CPU. I have also successfully completed runs on datasets with ~400,000 rows but twice as many features, which finish after only about 1 hour. I am being careful to delete any arrays/objects that are not in use. Does anyone have any ideas I might try?

Read the article

Does SetThreadPriority cause thread rescheduling?

- by Suma

Consider the following situation, assuming a single-CPU system:

- thread A is running with priority THREAD_PRIORITY_NORMAL and signals event E
- thread B, with priority THREAD_PRIORITY_LOWEST, is waiting for event E (note: at this point thread B is runnable but not scheduled, because A has higher priority and is runnable as well)
- thread A calls SetThreadPriority(B, THREAD_PRIORITY_ABOVE_NORMAL)

Is thread B re-scheduled immediately to run, or is thread A allowed to continue until the current time-slice is over, with B scheduled only once a new time-slice has begun? I would be interested to know the answer for WinXP, Vista and Win7, if possible. Note: the scenario above is simplified from my real-world code, where multiple threads are running on multiple cores, but the main point of the question stays: does SetThreadPriority cause thread scheduling to happen?

Read the article

SSRS Performance Mystery

- by user101654

I have a stored procedure that returns about 50000 records in 10 seconds, using at most 2 cores, in SSMS. The SSRS report using the stored procedure was taking 20 minutes and would max out the processor on an 8-core server for the entire time. The report was relatively simple (i.e. no graphs or calculations). The report did not appear to be the issue, as I wrote the 50K rows to a temp table and the report could display the data in a few seconds. I tried many different ideas for testing, altering the stored procedure each time but keeping the original code in a separate window to revert back to. After one alter of the stored procedure, followed by going back to the original code, the report and server utilization started running fast, comparable to the performance of the stored procedure alone. Everything is fine for now, but I would like to get to the bottom of what caused this in case it happens again. Any ideas?

Read the article

HPC Server Dynamic Job Scheduling: when jobs spawn jobs

- by JoshReuben

HPC Job Types

HPC has 3 types of jobs (http://technet.microsoft.com/en-us/library/cc972750(v=ws.10).aspx):

· Task Flow – vanilla sequence
· Parametric Sweep – concurrently run multiple instances of the same program, each with a different work unit input
· MPI – message passing between master & slave tasks

But when you try to go outside the box (job tasks that spawn jobs, blocking the parent task) you run the risk of resource starvation, deadlocks, and recursive, non-converging or exponential blow-up. The solution to this is to write some performance monitoring and job scheduling code. You can do this in 2 ways:

· Manually control scheduling: allocate / de-allocate resources, change job priorities, pause & resume tasks, restrict long running tasks to specific compute clusters
· Semi-automatically: set threshold params for scheduling

How – Control Job Scheduling

In order to manage the tasks and resources that are associated with a job, you will need to access the ISchedulerJob interface - http://msdn.microsoft.com/en-us/library/microsoft.hpc.scheduler.ischedulerjob_members(v=vs.85).aspx

This really allows you to control how a job is run. You can access & tweak the following features:

· max / min resource values
· whether job resources can grow / shrink, whether jobs can be pre-empted, and whether the job is exclusive per node
· the creator process id & the job pool
· timestamp of job creation & completion
· job priority, hold time & run time limit
· re-queue count
· job progress
· max / min number of cores, nodes, sockets, RAM
· dynamic task list – can add / cancel jobs on the fly
· job counters

When – poll perf counters

Tweaking the job scheduler should be done on the basis of resource utilization according to PerfMon counters. HPC exposes 2 Perf objects: Compute Clusters and Compute Nodes (http://technet.microsoft.com/en-us/library/cc720058(v=ws.10).aspx). You can monitor running jobs according to dynamic thresholds; use your own discretion:

· Percentage processor time
· Number of running jobs
· Number of running tasks
· Total number of processors
· Number of processors in use
· Number of processors idle
· Number of serial tasks
· Number of parallel tasks

Design your algorithms correctly

Finally, don’t assume you have unlimited compute resources in your cluster. Design your algorithms with the following factors in mind:

· Branching factor - http://en.wikipedia.org/wiki/Branching_factor - dynamically optimize the number of children per node
· Cutoffs to prevent explosions - http://en.wikipedia.org/wiki/Limit_of_a_sequence - not all functions converge after n attempts. You also need a threshold of good enough, diminishing returns
· Heuristic shortcuts - http://en.wikipedia.org/wiki/Heuristic - sometimes an exhaustive search is impractical and shortcuts are suitable
· Pruning - http://en.wikipedia.org/wiki/Pruning_(algorithm) - remove / de-prioritize unnecessary tree branches
· Avoid local minima / maxima - http://en.wikipedia.org/wiki/Local_minima - sometimes an algorithm can’t converge because it gets stuck in a local saddle – try simulated annealing, hill climbing or genetic algorithms to get out of these ruts
· Watch out for rounding errors - http://en.wikipedia.org/wiki/Round-off_error - multiple iterations in parallel can quickly amplify & blow up your algo! Use an epsilon, avoid floating point errors, truncations, approximations

Happy Coding!
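The "cutoffs" and "use an epsilon" advice above can be sketched in a few lines. This is a generic illustration in Python, not HPC scheduler code: the function name `iterate_until_converged` and its `epsilon` and `max_iterations` parameters are hypothetical. The loop stops either when two successive values agree to within epsilon (good enough) or when the iteration cutoff is hit (the function is not converging).

```python
import math


def iterate_until_converged(step, x0, epsilon=1e-9, max_iterations=100):
    """Apply `step` repeatedly.

    Returns (value, iterations used, converged flag). Stops when two
    successive values differ by at most `epsilon`, or at the cutoff.
    """
    x = x0
    for i in range(1, max_iterations + 1):
        nxt = step(x)
        # Compare with a tolerance rather than ==, to absorb rounding error.
        if math.isclose(nxt, x, rel_tol=0.0, abs_tol=epsilon):
            return nxt, i, True
        x = nxt
    return x, max_iterations, False


# Converging example: Newton's iteration for sqrt(2).
root, used, ok = iterate_until_converged(lambda x: 0.5 * (x + 2.0 / x), 1.0)

# Non-converging example: x -> -x oscillates forever, so the cutoff fires.
_, used_osc, ok_osc = iterate_until_converged(lambda x: -x, 1.0, max_iterations=50)

print(root, used, ok)
print(used_osc, ok_osc)
```

A job that spawns child jobs could apply the same pattern: stop spawning when the improvement drops below the threshold, and give up after a fixed number of rounds.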

Read the article

Microsoft Technical Computing

- by Daniel Moth

In the past I have described the team I belong to here at Microsoft (Parallel Computing Platform) in terms of contributing to Visual Studio and related products, e.g. .NET Framework. To be more precise, our team is part of the Technical Computing group, which is still part of the Developer Division. This was officially announced externally earlier this month in an exec email (from Bob Muglia, the president of STB, to which DevDiv belongs). Here is an extract:

"… As we build the Technical Computing initiative, we will invest in three core areas:

1. Technical computing to the cloud: Microsoft will play a leading role in bringing technical computing power to scientists, engineers and analysts through the cloud. Existing high-performance computing users will benefit from the ability to augment their on-premises systems with cloud resources that enable ‘just-in-time’ processing. This platform will help ensure processing resources are available whenever they are needed: reliably, consistently and quickly.

2. Simplify parallel development: Today, computers are shipping with more processing power than ever, including multiple cores, but most modern software only uses a small amount of the available processing power. Parallel programs are extremely difficult to write, test and troubleshoot. However, a consistent model for parallel programming can help more developers unlock the tremendous power in today’s modern computers and enable a new generation of technical computing. We are delivering new tools to automate and simplify writing software through parallel processing from the desktop… to the cluster… to the cloud.

3. Develop powerful new technical computing tools and applications: We know scientists, engineers and analysts are pushing common tools (i.e., spreadsheets and databases) to the limits with complex, data-intensive models. They need easy access to more computing power and simplified tools to increase the speed of their work. We are building a platform to do this. Our development efforts will yield new, easy-to-use tools and applications that automate data acquisition, modeling, simulation, visualization, workflow and collaboration. This will allow them to spend more time on their work and less time wrestling with complicated technology. …"

Our Parallel Computing Platform team is directly responsible for item #2, and we work very closely with the teams delivering items #1 and #3. At the same time as the exec email, our marketing team unveiled a website with interviews that I invite you to check out: Modeling the World. Comments about this post welcome at the original blog.

Read the article

You do not need a separate SQL Server license for a Standby or Passive server - this Microsoft White Paper explains all

- by tonyrogerson

If you were in any doubt about whether you need to license standby / passive failover servers, the white paper “Do Not Pay Too Much for Your Database Licensing” will settle it. I’ve had debates before with people who think you can only have a single instance as a standby machine. That’s just wrong: you could have a 2-node active/passive cluster with database mirroring and log shipping (a total of 4 SQL Server instances), and in that setup you only need to buy one physical license, so long as the standby nodes have the same number of physical processors or fewer (cores are irrelevant). So next time your supplier suggests you need a license for your standby box, tell them you don’t and educate them by pointing them to the white paper. For clarity I’ve copied the extract below from the white paper.

Extract from “Do Not Pay Too Much for Your Database Licensing”

Standby Server

Customers often implement standby server to make sure the application continues to function in case primary server fails. Standby server continuously receives updates from the primary server and will take over the role of primary server in case of failure in the primary server. Following are comparisons of how each vendor supports standby server licensing.

SQL Server

Customers does not need to license standby (or passive) server provided that the number of processors in the standby server is equal or less than those in the active server.

Oracle DB

Oracle requires customer to fully license both active and standby servers even though the standby server is essentially idle most of the time.

IBM DB2

IBM licensing on standby server is quite complicated and is different for every editions of DB2. For Enterprise Edition, a minimum of 100 PVUs or 25 Authorized User is needed to license standby server.

The following graph compares prices based on a database application with two processors (dual-core) and 25 users with one standby server.

[chart snipped]

Note: All prices are based on newest Intel Xeon Nehalem processor database pricing for purchases within the United States and are in United States dollars. Pricing is based on information available on vendor Web sites for Enterprise Edition.

Microsoft SQL Server Enterprise Edition: 25 users (CALs) x $164 / CAL + $8,592 / Server = $12,692 (no need to license standby server)

Oracle Enterprise Edition (base license without options): Named User Plus minimum (25 Named Users Plus per Core) = 25 x 2 = 50 Named Users Plus x $950 / Named Users Plus x 2 servers = $95,000

IBM DB2 Enterprise Edition (base license without feature pack): Need to purchase 125 Authorized User (400 PVUs/100 PVUs = 4 X 25 = 100 Authorized User + 25 Authorized Users for standby server) = 125 Authorized Users x $1,040 / Authorized Users = $130,000
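The three totals in the extract can be re-derived with a few lines of arithmetic. The sketch below is in Python and uses only the figures quoted above. The function and parameter names are made up for this illustration, and the prices are the ones in the white paper extract, not current list prices.

```python
def sql_server_total(users, cal_price, server_price):
    """CALs plus one server license; the passive standby server adds nothing."""
    return users * cal_price + server_price


def oracle_total(users_per_core, multiplier, price_per_user, servers):
    """Named User Plus minimum, paid in full for every server.

    `multiplier` is the "x 2" used in the extract's calculation.
    """
    return users_per_core * multiplier * price_per_user * servers


def db2_total(active_users, standby_users, price_per_user):
    """Authorized Users for the active server plus the standby minimum."""
    return (active_users + standby_users) * price_per_user


totals = {
    "SQL Server": sql_server_total(users=25, cal_price=164, server_price=8592),
    "Oracle": oracle_total(users_per_core=25, multiplier=2,
                           price_per_user=950, servers=2),
    "IBM DB2": db2_total(active_users=100, standby_users=25,
                         price_per_user=1040),
}

for vendor, total in totals.items():
    print(f"{vendor:<12} ${total:,}")
```

The only structural difference between the vendors is whether the standby server enters the formula: not at all for SQL Server, as a full second server for Oracle, and as 25 extra Authorized Users for DB2.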

Read the article

Using TPL and PLINQ to raise performance of feed aggregator

- by DigiMortal

In this posting I will show you how to use the Task Parallel Library (TPL) and PLINQ to boost the performance of a simple RSS feed aggregator. I will use only very basic .NET classes, the ones almost every developer starts with when learning parallel programming. Of course, we will also measure how each optimization affects the performance of the feed aggregator.

Feed aggregator

Our feed aggregator works as follows:

1. Load the list of blogs
2. Download the RSS feed
3. Parse the feed XML
4. Add new posts to the database

The aggregator is run by a task scheduler, for example every 15 minutes. We will start our journey with a serial implementation of the feed aggregator. The second step is to use task parallelism to parallelize feed downloading and parsing. The last step is to use data parallelism to parallelize the database operations. We will use the Stopwatch class to measure how much time it takes for the aggregator to download and insert all posts from all registered blogs. After every run we empty the posts table in the database.

Serial aggregation

Before doing the parallel stuff, let’s take a look at the serial implementation of the feed aggregator. All tasks happen one after another.

    using System;

    // SaveRssFeedItem and SaveAtomFeedEntry are not shown in the original listing.
    internal class FeedClient
    {
        private readonly INewsService _newsService;
        private const int FeedItemContentMaxLength = 255;

        public FeedClient()
        {
            ObjectFactory.Initialize(container =>
            {
                container.PullConfigurationFromAppConfig = true;
            });

            _newsService = ObjectFactory.GetInstance<INewsService>();
        }

        public void Execute()
        {
            var blogs = _newsService.ListPublishedBlogs();

            // Import the feeds one after another.
            for (var index = 0; index < blogs.Count; index++)
            {
                ImportFeed(blogs[index]);
            }
        }

        private void ImportFeed(BlogDto blog)
        {
            if (blog == null)
                return;
            if (string.IsNullOrEmpty(blog.RssUrl))
                return;

            var uri = new Uri(blog.RssUrl);
            SyndicationContentFormat feedFormat;

            feedFormat = SyndicationDiscoveryUtility.SyndicationContentFormatGet(uri);

            if (feedFormat == SyndicationContentFormat.Rss)
                ImportRssFeed(blog);
            if (feedFormat == SyndicationContentFormat.Atom)
                ImportAtomFeed(blog);
        }

        private void ImportRssFeed(BlogDto blog)
        {
            var uri = new Uri(blog.RssUrl);
            var feed = RssFeed.Create(uri);

            foreach (var item in feed.Channel.Items)
            {
                SaveRssFeedItem(item, blog.Id, blog.CreatedById);
            }
        }

        private void ImportAtomFeed(BlogDto blog)
        {
            var uri = new Uri(blog.RssUrl);
            var feed = AtomFeed.Create(uri);

            foreach (var item in feed.Entries)
            {
                SaveAtomFeedEntry(item, blog.Id, blog.CreatedById);
            }
        }
    }

The serial implementation of the feed aggregator downloads and inserts all posts in 25.46 seconds.

Task parallelism

Task parallelism means that separate tasks are run in parallel. You can find out more about task parallelism on the MSDN page Task Parallelism (Task Parallel Library) and the Wikipedia page Task parallelism. Although finding parts of code that can run safely in parallel without synchronization issues is not an easy task, we are lucky this time. Feed import and parsing is a perfect candidate for parallel tasks. We can safely parallelize the feed import because the import tasks don’t share any resources and therefore don’t need any synchronization.
After getting the list of blogs, we iterate through the collection and start a new TPL task for each blog’s feed aggregation.

    using System;
    using System.Threading.Tasks;

    // SaveRssFeedItem and SaveAtomFeedEntry are not shown in the original listing.
    internal class FeedClient
    {
        private readonly INewsService _newsService;
        private const int FeedItemContentMaxLength = 255;

        public FeedClient()
        {
            ObjectFactory.Initialize(container =>
            {
                container.PullConfigurationFromAppConfig = true;
            });

            _newsService = ObjectFactory.GetInstance<INewsService>();
        }

        public void Execute()
        {
            var blogs = _newsService.ListPublishedBlogs();
            var tasks = new Task[blogs.Count];

            // Start one task per blog, then wait for all of them to finish.
            for (var index = 0; index < blogs.Count; index++)
            {
                tasks[index] = new Task(ImportFeed, blogs[index]);
                tasks[index].Start();
            }

            Task.WaitAll(tasks);
        }

        // Takes object so that it can be used as the Action<object> of a Task.
        private void ImportFeed(object blogObject)
        {
            if (blogObject == null)
                return;

            var blog = (BlogDto)blogObject;
            if (string.IsNullOrEmpty(blog.RssUrl))
                return;

            var uri = new Uri(blog.RssUrl);
            SyndicationContentFormat feedFormat;

            feedFormat = SyndicationDiscoveryUtility.SyndicationContentFormatGet(uri);

            if (feedFormat == SyndicationContentFormat.Rss)
                ImportRssFeed(blog);
            if (feedFormat == SyndicationContentFormat.Atom)
                ImportAtomFeed(blog);
        }

        private void ImportRssFeed(BlogDto blog)
        {
            var uri = new Uri(blog.RssUrl);
            var feed = RssFeed.Create(uri);

            foreach (var item in feed.Channel.Items)
            {
                SaveRssFeedItem(item, blog.Id, blog.CreatedById);
            }
        }

        private void ImportAtomFeed(BlogDto blog)
        {
            var uri = new Uri(blog.RssUrl);
            var feed = AtomFeed.Create(uri);

            foreach (var item in feed.Entries)
            {
                SaveAtomFeedEntry(item, blog.Id, blog.CreatedById);
            }
        }
    }

You should notice the first signs of the power of TPL. We made only minor changes to our code to parallelize the aggregation of blog feeds. On my machine this modification gives some performance boost: the time is now 17.57 seconds.

Data parallelism

There is one more way to parallelize activities. The previous section introduced task-based (or operation-based) parallelism; this section introduces data-based parallelism.
According to the MSDN page Data Parallelism (Task Parallel Library), data parallelism refers to scenarios in which the same operation is performed concurrently on elements in a source collection or array. In our code we have independent collections we can process in parallel: the imported feed entries. Because checking whether a feed entry exists, and inserting it if it is missing from the database, doesn’t affect other entries, the collection of imported feed entries is an ideal candidate for parallelization.

    using System;
    using System.Linq;
    using System.Threading.Tasks;

    // SaveRssFeedItem and SaveAtomFeedEntry are not shown in the original listing.
    internal class FeedClient
    {
        private readonly INewsService _newsService;
        private const int FeedItemContentMaxLength = 255;

        public FeedClient()
        {
            ObjectFactory.Initialize(container =>
            {
                container.PullConfigurationFromAppConfig = true;
            });

            _newsService = ObjectFactory.GetInstance<INewsService>();
        }

        public void Execute()
        {
            var blogs = _newsService.ListPublishedBlogs();
            var tasks = new Task[blogs.Count];

            for (var index = 0; index < blogs.Count; index++)
            {
                tasks[index] = new Task(ImportFeed, blogs[index]);
                tasks[index].Start();
            }

            Task.WaitAll(tasks);
        }

        private void ImportFeed(object blogObject)
        {
            if (blogObject == null)
                return;

            var blog = (BlogDto)blogObject;
            if (string.IsNullOrEmpty(blog.RssUrl))
                return;

            var uri = new Uri(blog.RssUrl);
            SyndicationContentFormat feedFormat;

            feedFormat = SyndicationDiscoveryUtility.SyndicationContentFormatGet(uri);

            if (feedFormat == SyndicationContentFormat.Rss)
                ImportRssFeed(blog);
            if (feedFormat == SyndicationContentFormat.Atom)
                ImportAtomFeed(blog);
        }

        private void ImportRssFeed(BlogDto blog)
        {
            var uri = new Uri(blog.RssUrl);
            var feed = RssFeed.Create(uri);

            // PLINQ: check and save the feed items in parallel.
            feed.Channel.Items.AsParallel().ForAll(a =>
            {
                SaveRssFeedItem(a, blog.Id, blog.CreatedById);
            });
        }

        private void ImportAtomFeed(BlogDto blog)
        {
            var uri = new Uri(blog.RssUrl);
            var feed = AtomFeed.Create(uri);

            // PLINQ: check and save the feed entries in parallel.
            feed.Entries.AsParallel().ForAll(a =>
            {
                SaveAtomFeedEntry(a, blog.Id, blog.CreatedById);
            });
        }
    }

We made a small change again, and as a result we parallelized the checking and saving of feed items.
This change was data-centric, as we applied the same operation to all elements in a collection. On my machine I got better performance again: the time is now 11.22 seconds.

Results

Let's visualize our measurement results (numbers are given in seconds). As we can see, with task parallelism feed aggregation takes about 25% less time than in the original case. When data parallelism is added to task parallelism, our aggregation takes about 2.3 times less time than in the original case.

More about TPL and PLINQ

Adding parallelism to your application can be a very challenging task. You have to carefully find the parts of your code where you can safely switch to parallel processing, and even then you have to measure the effects to find out whether the parallel code really performs better. If you are not careful, the troubles you will face later are worse than the ones you have seen before (imagine an error that occurs on average only once per 10,000 code runs).

Parallel programming is something that is hard to ignore. Effective programs are able to use multiple processor cores. Using TPL you can also set the degree of parallelism so that your application doesn't use all computing cores and leaves one or more of them free for the host system and other processes. And there are many more things in TPL that make it easier for you to get started and keep going with parallel programming.

In the next major version all .NET languages will have built-in support for parallel programming. There will also be new language constructs that support parallel programming. Currently you can download Visual Studio Async to get some idea of what is coming.

Conclusion

Parallel programming is very challenging, but the good tools offered by Visual Studio and the .NET Framework make it much easier for us. In this posting we started with a feed aggregator that imports feed items in serial mode. In two steps we parallelized feed importing and entry inserting, gaining a 2.3-fold increase in performance. Although this number is specific to my test environment, it clearly shows that parallel programming may raise the performance of your application significantly.
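The posting mentions that the degree of parallelism can be limited so that some cores stay free for the host system. Here is a minimal sketch of how that might look with PLINQ. The ThrottledImport class, the ImportAll method and the coresToLeaveFree parameter are hypothetical names made up for this illustration, and the lambda only records each item instead of calling a real save method such as SaveRssFeedItem.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public static class ThrottledImport
{
    // Hypothetical helper: processes items in parallel while leaving
    // some logical cores free for other processes.
    public static string[] ImportAll(IEnumerable<string> items, int coresToLeaveFree)
    {
        // Use the available logical cores minus the reserved ones, but never fewer than one.
        var degree = Math.Max(1, Environment.ProcessorCount - coresToLeaveFree);

        // ConcurrentBag is safe to add to from several threads at once.
        var saved = new ConcurrentBag<string>();

        items.AsParallel()
             .WithDegreeOfParallelism(degree)
             // Stand-in for the real save operation (assumption, not the article's code).
             .ForAll(item => saved.Add(item.ToUpperInvariant()));

        // ForAll gives no ordering guarantee, so the result order may vary between runs.
        return saved.ToArray();
    }
}
```

A call such as ThrottledImport.ImportAll(titles, 1) would then use at most all but one logical core. Whether this helps depends on the workload, so it is worth measuring in the same way as the earlier steps.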

Read the article

Customer Support Spotlight: Clemson University

- by cwarticki

I've begun a Customer Support Spotlight series that highlights our wonderful customers and Oracle loyalists. A week ago I visited Clemson University. As I travel to visit and educate our customers, I provide many useful tips/tricks and support best practices (as found on my blog and Twitter). Most of all, I always discover an Oracle gem who deserves recognition for their hard work and advocacy.

Meet George Manley. George is a Storage Engineer who has worked in Clemson's Data Center all through college, partially in the Hardware Architecture group and partially in the Storage group. George and the rest of the Storage Team work with almost all of the storage technologies that they have here at Clemson. This includes a wide array of different vendors' disk arrays, most of them being Oracle/Sun 2540s. He also works with SAM/QFS, ACSLS, and our SL8500 Tape Libraries (all three Oracle/Sun products).

(pictured L to R: Matt Schoger (Oracle), Mark Flores (Oracle) and George Manley)

George was kind enough to take us on a data center tour. It was amazing. I rarely get to see the inside of data centers, and this one was massive. Clemson Computing and Information Technology's physical resources include the main data center located in the Information Technology Center at the Innovation Campus and Technology Park. The core of Clemson's computing infrastructure, the data center has 21,000 sq ft of raised floor and is powered by a 14MW substation. The ITC power capacity is 4.5MW. The data center is the home of both enterprise and HPC systems, and is staffed by CCIT staff on a 24-hour basis from a state-of-the-art network operations center within the ITC. A smaller business continuance data center is located on the main campus. The data center serves a wide variety of purposes, including HPC (supercomputing) resources which are shared with other universities throughout the state, the state's Medicaid processing system, and nearly all other needs for Clemson University.

Yes, that's no typo (14,256 cores and 37TB of memory)!!!

Thanks for the tour, George, and thank you very much for your time. The tour was fantastic. I enjoyed getting to know your team and I look forward to many successes from Clemson using Oracle products.

-Chris Warticki, Global Customer Management

Search Results

Search found 854 results on 35 pages for 'cores'.
