Search Results

Search found 3521 results on 141 pages for 'parallel computing'.

Page 46/141 | < Previous Page | 42 43 44 45 46 47 48 49 50 51 52 53  | Next Page >

  • ArchBeat Link-o-Rama for 2012-07-10

    - by Bob Rhubart
    Free Event Today: Virtual Developer Day: Oracle Fusion Development This free event—another in the ongoing series of OTN Virtual Developer Days—focuses on Oracle Fusion development, and features three session tracks plus hands-on labs. Agenda and session abstracts are available now so you can be ready for the live event when it kicks off today, July 10, 9am to 1pm PST / 12pm to 4pm EST / 1pm to 5pm BRT. Podcast: The Role of the Cloud Architect - Part 1/3 In part one of this three-part conversation, cloud architects Ron Batra (AT&T) and James Baty (Oracle) talk about how cloud computing is driving the supply-chaining of IT and the "democratization of the activity of architecture." Middleware and Cloud Computing Book | Tom Laszewski Cloud migration expert Tom Laszewski describes Middleware and Cloud Computing by Frank Munz as "one of only a couple books that really discuss AWS and Oracle in depth." Cloud computing moves from fad to foundation | David Linthicum "When enterprises make cloud computing work, they view the application of the technology as a trade secret of sorts, so there are no press releases or white papers," says David Linthicum. "Indeed, if you see one presentation around a successful cloud computing case study, you can bet you're not hearing about 100 more." Oracle Real-Time Decisions: Combined Likelihood Models | Lukas Vermeer Lukas Vermeer concludes his extensive series of posts on decision models with a look "an advanced approach to amalgamate models, taking us to a whole new level of predictive modeling and analytical insights; combination models predicting likelihoods using multiple child models." Running Oracle BPM 11g PS5 Worklist Task Flow and Human Task Form on Non-SOA Domain | Andrejus Baranovskis "With a standard setup, both the BPM worklist application and the Human task form run on the same SOA domain, where the BPM process is running," says Oracle ACE Director Andrejus Baranovskis. "While this work fine, this is not what we want in the development, test and production environment." BAM design pointers | Kavitha Srinivasan "When using EMS (Enterprise Message Source) as a BAM feed, the best practice is to use one EMS to write to one Data Object," says Oracle Fusion Middleware A-Team blogger Kavitha Srinivasan. "There is a possibility of collisions and duplicates when multiple EMS write to the same row of a DO at the same time." Changes in SOA Human Task Flow (Run-Time) for Fusion Applications | Jack Desai Oracle Fusion Middleware A-Team blogger Jack Desai shares a troubleshooting tip. Thought for the Day "A program which perfectly meets a lousy specification is a lousy program." — Cem Kaner Source: SoftwareQuotes.com

    Read the article

  • Updated Agenda for OTN Architect Day Los Angeles (Oct 25)

    - by Bob Rhubart
    Here's the latest information on the session schedule and content for Oracle Technology Network Architect Day in Los Angeles on October 25, 2012. Registration is open, but seating is limited. When: Thursday October 25 12, 2012 8:30am – 5:00pm Where: Sofitel Los Angeles 8555 Beverly Boulevard Los Angeles, CA 90048 Agenda Time Session Title Room 8:30 am - 9:00 am Registration and Continental Breakfast 9:00 am - 9:15 am Welcome and Opening Comments | Bob Rhubart Beverly Ballroom 9:15 am - 10:00 am Engineered Systems: Oracle's Vision for the Future | Ralf Dossmann Oracle's Exadata and Exalogic are impressive products in their own right. But working in combination they deliver unparalleled transaction processing performance with up to a 30x increase over existing legacy systems, with the lowest cost of ownership over a 3 or 5 year basis than any other hardware. In this session you'll learn how to leverage Oracle's Engineered Systems within your enterprise to deliver record-breaking performance at the lowest TCO. Beverly Ballroom 10:00 am - 10:30 am Monitoring and Managing Applications in the Cloud | Basheer Khan Oracle offers a broad portfolio of software and hardware products and services to enable public, private and hybrid clouds to power the enterprise. However, enterprise cloud computing presents new management challenges, that need to be addressed to realize the economic benefits of cloud computing. In this session you will learn about the methods and tools you can use to proactively monitor your end-to-end Oracle Applications environment in the cloud, define service-level objectives, gain insight into your end users, and troubleshoot performance problems from a single console. Beverly Ballroom 10:30 am - 10:45 am Break 10:45 am - 11:30 am Breakout Sessions (pick one) Cloud Computing - Making IT Simple | Dr. James Baty The road to Cloud Computing is not without a few bumps. This session will help to smooth out your journey by tackling some of the potential complications. We'll examine whether standardization is a prerequisite for the Cloud. We'll look at why refactoring isn't just for application code. We'll check out deployable entities and their simplification via higher levels of abstraction. And we'll close out the session with a look at engineered systems and modular clouds. Beverly Ballroom Innovations in Grid Computing with Oracle Coherence | Ashok Aletty Learn how Oracle Coherence can increase the availability, scalability and performance of your existing applications with its advanced low-latency data-grid technologies. Also hear some interesting industry-specific use cases that customers had implemented and how Oracle is integrating Coherence into its Enterprise Java stack. Hollywood Room 11:30 am - 12:15 pm Breakout Sessions (pick one) Enterprise Strategy for Cloud Security | Dave Chappelle Security is high on the list of concerns for many organizations as they evaluate their cloud computing options. This session will examine security in the context of the various forms of cloud computing. We'll consider technical and non-technical aspects of security, and discuss several strategies for cloud computing, from both the consumer and producer perspectives. Beverly Ballroom Oracle Enterprise Manager | Perren Walker This session examines new Oracle Enterprise Manager monitoring, administration, and management features for Oracle Exalogic. It focuses on two management themes: cloud management related to virtualization and applications-to-disk management. For private cloud management, it discusses virtualization management features providing an enhanced set of application deployment capabilities enabling IaaS as well as PaaS interactions. Then from an end-to-end perspective, it covers the specific capabilities and—where applicable—best practices for machine, cloud, middleware, and application administration. Hollywood Room 12:15 pm - 1:15 pm Lunch Beverly Ballroom Lounge 1:15 pm - 2:00 pm Panel Discussion - Q&A with session speakers Beverly Ballroom 2:00 pm - 2:45 pm Breakout Sessions (pick one) Oracle Cloud Reference Architecture | Anbu Krishnaswamy Cloud initiatives are beginning to dominate enterprise IT roadmaps. Successful adoption of Cloud and the subsequent governance challenges warrant a Cloud reference architecture that is applied consistently across the enterprise. This presentation will answer the important questions: What exactly is a Cloud, why you need it, what changes it will bring to the enterprise, and what are the key capabilities of a Cloud infrastructure are - using Oracle's Cloud Reference Architecture, which is part of the IT Strategies from Oracle (ITSO) Cloud Enterprise Technology Strategy ETS). Beverly Ballroom 21st Century SOA | Jeff Davies Service Oriented Architecture has evolved from concept to reality in the last decade. The right methodology coupled with mature SOA technologies has helped customers demonstrate success in both innovation and ROI. In this session you will learn how Oracle SOA Suite's orchestration, virtualization, and governance capabilities provide the infrastructure to run mission critical business and system applications. We'll also take a special look at the convergence of SOA & BPM using Oracle's Unified technology stack. Hollywood Room 2:45 pm - 3:00 pm Break 3:00 pm - 4:00 pm Roundtable Discussion Beverly Ballroom 4:00 pm - 4:15 pm Closing Comments & Readouts from Roundtables Beverly Ballroom 4:15 pm - 5:00 pm Networking / Reception Beverly Ballroom Lounge Note: Session schedule and content subject to change.

    Read the article

  • Rant - Why is Windows Azure not available in Africa?

    - by Allan Rwakatungu
    Yesterday at the .NET user group meeting in Kampala Uganda  I gave a talk on cloud computing with Windows Azure  (details will be in my next blog post). The guys where excited. Without owning they own inftrastucture and at low cost they can build scalable , highly available applications. Not quite. Azure accounts are only available to people in particular countries - none from Africa. I attended PDC in 2008 when Microsoft unleashed Windows Azure. One of the case studies to show the benefits ofr cloud computing was a project in Africa for an education service in Ethiopia. The point they where making was that the cloud was perfect for scenarios where computing infrastructure is not sophiscated, like Ethiopia. Perfect , i thought. So i got my beta account from PDC and started playing around in the cloud. Then Azure goes live , my beta account does not work any more and I cant pay because am from Uganda. Microsoft , this sucks. I dont know the reasons for Microsoft doing this, but am sure we can work out something. We in Africa need the cloud more than anybody else in the world. Setting up data centers that are higly scalable and available for our startups is not an option we have. But we also cant pay for cloud computing with Microsoft. Microsoft, we know we are a tiny insigficant market for a company your size, but your excluding us only continues to widen the digital divide. Microsoft , how about you have a reseller model for cloud computing. Instead of trying to deal direclty with each client you have local partners who help you sell and bill your cloud services. I think that would lead to Windows Azure being available in Africa. I can help you resell in Uganda.

    Read the article

  • Windows Azure Platform for Enterprises

    Cloud computing has already proven worthy of attention from established enterprises and start-ups alike. Most businesses are looking at cloud computing with more than just idle curiosity. As of this writing, IT market research suggests that most enterprise IT managers have enough resources to adopt cloud computing in combination with on-premises IT capabilities.

    Read the article

  • Windows Azure Platform for Enterprises

    Cloud computing has already proven worthy of attention from established enterprises and start-ups alike. Most businesses are looking at cloud computing with more than just idle curiosity. As of this writing, IT market research suggests that most enterprise IT managers have enough resources to adopt cloud computing in combination with on-premises IT capabilities.

    Read the article

  • How to Load Oracle Tables From Hadoop Tutorial (Part 5 - Leveraging Parallelism in OSCH)

    - by Bob Hanckel
    Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 Using OSCH: Beyond Hello World In the previous post we discussed a “Hello World” example for OSCH focusing on the mechanics of getting a toy end-to-end example working. In this post we are going to talk about how to make it work for big data loads. We will explain how to optimize an OSCH external table for load, paying particular attention to Oracle’s DOP (degree of parallelism), the number of external table location files we use, and the number of HDFS files that make up the payload. We will provide some rules that serve as best practices when using OSCH. The assumption is that you have read the previous post and have some end to end OSCH external tables working and now you want to ramp up the size of the loads. Using OSCH External Tables for Access and Loading OSCH external tables are no different from any other Oracle external tables.  They can be used to access HDFS content using Oracle SQL: SELECT * FROM my_hdfs_external_table; or use the same SQL access to load a table in Oracle. INSERT INTO my_oracle_table SELECT * FROM my_hdfs_external_table; To speed up the load time, you will want to control the degree of parallelism (i.e. DOP) and add two SQL hints. ALTER SESSION FORCE PARALLEL DML PARALLEL  8; ALTER SESSION FORCE PARALLEL QUERY PARALLEL 8; INSERT /*+ append pq_distribute(my_oracle_table, none) */ INTO my_oracle_table SELECT * FROM my_hdfs_external_table; There are various ways of either hinting at what level of DOP you want to use.  The ALTER SESSION statements above force the issue assuming you (the user of the session) are allowed to assert the DOP (more on that in the next section).  Alternatively you could embed additional parallel hints directly into the INSERT and SELECT clause respectively. /*+ parallel(my_oracle_table,8) *//*+ parallel(my_hdfs_external_table,8) */ Note that the "append" hint lets you load a target table by reserving space above a given "high watermark" in storage and uses Direct Path load.  In other doesn't try to fill blocks that are already allocated and partially filled. It uses unallocated blocks.  It is an optimized way of loading a table without incurring the typical resource overhead associated with run-of-the-mill inserts.  The "pq_distribute" hint in this context unifies the INSERT and SELECT operators to make data flow during a load more efficient. Finally your target Oracle table should be defined with "NOLOGGING" and "PARALLEL" attributes.   The combination of the "NOLOGGING" and use of the "append" hint disables REDO logging, and its overhead.  The "PARALLEL" clause tells Oracle to try to use parallel execution when operating on the target table. Determine Your DOP It might feel natural to build your datasets in Hadoop, then afterwards figure out how to tune the OSCH external table definition, but you should start backwards. You should focus on Oracle database, specifically the DOP you want to use when loading (or accessing) HDFS content using external tables. The DOP in Oracle controls how many PQ slaves are launched in parallel when executing an external table. Typically the DOP is something you want to Oracle to control transparently, but for loading content from Hadoop with OSCH, it's something that you will want to control. Oracle computes the maximum DOP that can be used by an Oracle user. The maximum value that can be assigned is an integer value typically equal to the number of CPUs on your Oracle instances, times the number of cores per CPU, times the number of Oracle instances. For example, suppose you have a RAC environment with 2 Oracle instances. And suppose that each system has 2 CPUs with 32 cores. The maximum DOP would be 128 (i.e. 2*2*32). In point of fact if you are running on a production system, the maximum DOP you are allowed to use will be restricted by the Oracle DBA. This is because using a system maximum DOP can subsume all system resources on Oracle and starve anything else that is executing. Obviously on a production system where resources need to be shared 24x7, this can’t be allowed to happen. The use cases for being able to run OSCH with a maximum DOP are when you have exclusive access to all the resources on an Oracle system. This can be in situations when your are first seeding tables in a new Oracle database, or there is a time where normal activity in the production database can be safely taken off-line for a few hours to free up resources for a big incremental load. Using OSCH on high end machines (specifically Oracle Exadata and Oracle BDA cabled with Infiniband), this mode of operation can load up to 15TB per hour. The bottom line is that you should first figure out what DOP you will be allowed to run with by talking to the DBAs who manage the production system. You then use that number to derive the number of location files, and (optionally) the number of HDFS data files that you want to generate, assuming that is flexible. Rule 1: Find out the maximum DOP you will be allowed to use with OSCH on the target Oracle system Determining the Number of Location Files Let’s assume that the DBA told you that your maximum DOP was 8. You want the number of location files in your external table to be big enough to utilize all 8 PQ slaves, and you want them to represent equally balanced workloads. Remember location files in OSCH are metadata lists of HDFS files and are created using OSCH’s External Table tool. They also represent the workload size given to an individual Oracle PQ slave (i.e. a PQ slave is given one location file to process at a time, and only it will process the contents of the location file.) Rule 2: The size of the workload of a single location file (and the PQ slave that processes it) is the sum of the content size of the HDFS files it lists For example, if a location file lists 5 HDFS files which are each 100GB in size, the workload size for that location file is 500GB. The number of location files that you generate is something you control by providing a number as input to OSCH’s External Table tool. Rule 3: The number of location files chosen should be a small multiple of the DOP Each location file represents one workload for one PQ slave. So the goal is to keep all slaves busy and try to give them equivalent workloads. Obviously if you run with a DOP of 8 but have 5 location files, only five PQ slaves will have something to do and the other three will have nothing to do and will quietly exit. If you run with 9 location files, then the PQ slaves will pick up the first 8 location files, and assuming they have equal work loads, will finish up about the same time. But the first PQ slave to finish its job will then be rescheduled to process the ninth location file, potentially doubling the end to end processing time. So for this DOP using 8, 16, or 32 location files would be a good idea. Determining the Number of HDFS Files Let’s start with the next rule and then explain it: Rule 4: The number of HDFS files should try to be a multiple of the number of location files and try to be relatively the same size In our running example, the DOP is 8. This means that the number of location files should be a small multiple of 8. Remember that each location file represents a list of unique HDFS files to load, and that the sum of the files listed in each location file is a workload for one Oracle PQ slave. The OSCH External Table tool will look in an HDFS directory for a set of HDFS files to load.  It will generate N number of location files (where N is the value you gave to the tool). It will then try to divvy up the HDFS files and do its best to make sure the workload across location files is as balanced as possible. (The tool uses a greedy algorithm that grabs the biggest HDFS file and delegates it to a particular location file. It then looks for the next biggest file and puts in some other location file, and so on). The tools ability to balance is reduced if HDFS file sizes are grossly out of balance or are too few. For example suppose my DOP is 8 and the number of location files is 8. Suppose I have only 8 HDFS files, where one file is 900GB and the others are 100GB. When the tool tries to balance the load it will be forced to put the singleton 900GB into one location file, and put each of the 100GB files in the 7 remaining location files. The load balance skew is 9 to 1. One PQ slave will be working overtime, while the slacker PQ slaves are off enjoying happy hour. If however the total payload (1600 GB) were broken up into smaller HDFS files, the OSCH External Table tool would have an easier time generating a list where each workload for each location file is relatively the same.  Applying Rule 4 above to our DOP of 8, we could divide the workload into160 files that were approximately 10 GB in size.  For this scenario the OSCH External Table tool would populate each location file with 20 HDFS file references, and all location files would have similar workloads (approximately 200GB per location file.) As a rule, when the OSCH External Table tool has to deal with more and smaller files it will be able to create more balanced loads. How small should HDFS files get? Not so small that the HDFS open and close file overhead starts having a substantial impact. For our performance test system (Exadata/BDA with Infiniband), I compared three OSCH loads of 1 TiB. One load had 128 HDFS files living in 64 location files where each HDFS file was about 8GB. I then did the same load with 12800 files where each HDFS file was about 80MB size. The end to end load time was virtually the same. However when I got ridiculously small (i.e. 128000 files at about 8MB per file), it started to make an impact and slow down the load time. What happens if you break rules 3 or 4 above? Nothing draconian, everything will still function. You just won’t be taking full advantage of the generous DOP that was allocated to you by your friendly DBA. The key point of the rules articulated above is this: if you know that HDFS content is ultimately going to be loaded into Oracle using OSCH, it makes sense to chop them up into the right number of files roughly the same size, derived from the DOP that you expect to use for loading. Next Steps So far we have talked about OLH and OSCH as alternative models for loading. That’s not quite the whole story. They can be used together in a way that provides for more efficient OSCH loads and allows one to be more flexible about scheduling on a Hadoop cluster and an Oracle Database to perform load operations. The next lesson will talk about Oracle Data Pump files generated by OLH, and loaded using OSCH. It will also outline the pros and cons of using various load methods.  This will be followed up with a final tutorial lesson focusing on how to optimize OLH and OSCH for use on Oracle's engineered systems: specifically Exadata and the BDA. /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;}

    Read the article

  • What is the cloud?

    - by llaszews
    Everyone has their own definition cloud computing. This is a real conversation overheard at small cafe in NH between two general contractors: Contractor One: I can't get document I need because it is on my home PC. Contractor Two: You need cloud computing! Contractor One: What the hell is that? Contractor Two: You log into one computer and all your information for all your other computers is available. The NH live free or die definition of cloud computing!

    Read the article

  • MathWorks offre une nouvelle fonctionnalité de calculs parallèles pour une simulation plus rapide et une génération de code améliorée

    MathWorks propose une nouvelle fonctionnalité de calculs parallèles Pour une simulation plus rapide et une génération de code améliorée grâce à Parallel Computing Toolbox MathWorks a annoncé aujourd'hui une nouvelle fonctionnalité qui permet d'accélérer la génération de code de système utilisant le référencement de modèles. Cette amélioration est rendue possible par Real-Time Workshop, un outil de génération de code qui tire désormais parti des outils d'amélioration de performance de la Parallel Computing Toolbox et du MATLAB Distributed Computing Server (MDCS). Cette fonction élargit également la prise en charge des calculs parallèles dans d'autres outils MathWorks pour améliorer...

    Read the article

  • Which one to select for my future career; Java, C#, Azure or Apex?

    - by user636195
    Hi folks, This time, I am going to studying Masters in Computer Science in U.S.A after a week. I have been doing my B.Sc for the past three years and after my freshman year I started working on projects (in C# and very rarely in Java) for the past two years.(i.e while I was a second and third year student). Now I am in a college where all of the programming courses are going to be taken in Java only (using Eclipse) and I am going to stay in this college for 8 months on campus and then fully employed for two years in other companies as a CPT. I really love to work on Microsoft products because, for me, they are simple and easy to use and understand. My future plan is to work in Cloud computing and be a Cloud based business owner in the near future. Since the college is going to teach us and let us do every project in Java, I was confused which programming language to use that will help me and enhance me in my career, and of course I wanted to select the one I liked to do everytime. I also heard a lot about Azure (Microsoft’s ) and Apex (Salesforce.com’s cloud computing programming language). Would you please give me your advice and recommendation based on my situation? Should I have to study only Java, or should I have to study C# or Azure beside Java on my own? The reason I asked this is because, since I have no clue how Azure works and how long it will take me to know the language, I am really confused which one to select (Java Vs C# and Azure Vs Apex or if there is any popular and mostly used Cloud Computing langauge). Do you think I can get a job in cloud computing if I study Azure or Apex by my own without experience? There is also one issue I want to consider which is a short term issue is. i.e Salary. Since I have to pay my student loan, I also need to get a good job which will let me pay my loan within two years. But, as I said, my long term plan is, get experience in Cloud Computing (from programming to administrative part,i.e every area of cloud computing) and then have my own business may be within 5-10 years. What do you think? Thank you for your time.

    Read the article

  • VMware présente vCenter Operations pour rationaliser la gestion des infrastructures virtuelles intelligentes

    VMware présente vCenter Operations Pour rationaliser la gestion des infrastructures virtuelles intelligentes Après VMware View 4.6, VMware, annonce une nouvelle solution pour simplifier et automatiser la gestion des services IT dans les environnements virtuels et dynamiques de cloud computing. VMware vCenter perations est destiné à optimiser l'exploitation opérationnelle des systèmes d'information des entreprises de manière à bénéficier de l'agilité et des économies financières inhérentes au cloud computing « Le cloud computing repose sur une approche fondamentalement nouvelle de l'informatique caractérisée par la mise en commun des différents types de ressources, par une gestion extr...

    Read the article

  • Which technologies will most affect Financial Services over the next decade? [on hold]

    - by opposite of you
    I couldn't quite think of a proper description, so as Wikipedia puts it so beautifully: Financial services are the economic services provided by the finance industry, which encompasses a broad range of organizations that manage money, including credit unions, banks, credit card companies, insurance companies, consumer finance companies, stock brokerages, investment funds and some government sponsored enterprises. These are quite a range of industries. I've already thought about how banks specifically have gotten involved with app markets allowing users to make transfers on the go, and this goes with cloud computing, but what else could there be apart from mobile technologies and cloud computing? Or how else could they be used? I feel like I'm thinking about this wrong.. Apart from mobile computing and cloud computing, what other examples will influence the sector either positively or negatively?

    Read the article

  • Does "for" in .Net Framework 4.0 execute loops in parallel? Or why is the total not the sum of the p

    - by Shiraz Bhaiji
    I am writing code to performance test a web site. I have the following code: string url = "http://xxxxxx"; System.Diagnostics.Stopwatch stopwatch = new System.Diagnostics.Stopwatch(); System.Diagnostics.Stopwatch totalTime = new System.Diagnostics.Stopwatch(); totalTime.Start(); for (int i = 0; i < 10; i++) { stopwatch.Start(); WebRequest request = HttpWebRequest.Create(url); WebResponse webResponse = request.GetResponse(); webResponse.Close(); stopwatch.Stop(); textBox1.Text += "Time Taken " + i.ToString() + " = " + stopwatch.Elapsed.Milliseconds.ToString() + Environment.NewLine; stopwatch.Reset(); } totalTime.Stop(); textBox1.Text += "Total Time Taken = " + totalTime.Elapsed.Milliseconds.ToString() + Environment.NewLine; Which is giving the following result: Time Taken 0 = 88 Time Taken 1 = 161 Time Taken 2 = 218 Time Taken 3 = 417 Time Taken 4 = 236 Time Taken 5 = 217 Time Taken 6 = 217 Time Taken 7 = 218 Time Taken 8 = 409 Time Taken 9 = 48 Total Time Taken = 257 I had expected the total time to be the sum of the individual times. Can anybody see why it is not?

    Read the article

  • Fast JSON serialization (and comparison with Pickle) for cluster computing in Python?

    - by user248237
    I have a set of data points, each described by a dictionary. The processing of each data point is independent and I submit each one as a separate job to a cluster. Each data point has a unique name, and my cluster submission wrapper simply calls a script that takes a data point's name and a file describing all the data points. That script then accesses the data point from the file and performs the computation. Since each job has to load the set of all points only to retrieve the point to be run, I wanted to optimize this step by serializing the file describing the set of points into an easily retrievable format. I tried using JSONpickle, using the following method, to serialize a dictionary describing all the data points to file: def json_serialize(obj, filename, use_jsonpickle=True): f = open(filename, 'w') if use_jsonpickle: import jsonpickle json_obj = jsonpickle.encode(obj) f.write(json_obj) else: simplejson.dump(obj, f, indent=1) f.close() The dictionary contains very simple objects (lists, strings, floats, etc.) and has a total of 54,000 keys. The json file is ~20 Megabytes in size. It takes ~20 seconds to load this file into memory, which seems very slow to me. I switched to using pickle with the same exact object, and found that it generates a file that's about 7.8 megabytes in size, and can be loaded in ~1-2 seconds. This is a significant improvement, but it still seems like loading of a small object (less than 100,000 entries) should be faster. Aside from that, pickle is not human readable, which was the big advantage of JSON for me. Is there a way to use JSON to get similar or better speed ups? If not, do you have other ideas on structuring this? (Is the right solution to simply "slice" the file describing each event into a separate file and pass that on to the script that runs a data point in a cluster job? It seems like that could lead to a proliferation of files). thanks.

    Read the article

  • O(log n) algorithm for computing rank of union of two sorted lists?

    - by Eternal Learner
    Given two sorted lists, each containing n real numbers, is there a O(log?n) time algorithm to compute the element of rank i (where i coresponds to index in increasing order) in the union of the two lists, assuming the elements of the two lists are distinct? I can think of using a Merge procedure to merge the 2 lists and then find the A[i] element in constant time. But the Merge would take O(n) time. How do we solve it in O(log n) time?

    Read the article

  • How to execute a program on PostBuild event in parallel?

    - by John
    I managed to set the compiler to execute another program when the project is built/ran with the following directive in project options: call program.exe param1 param2 The problem is that the compiler executes "program.exe" and waits for it to terminate and THEN the project executable is ran. What I ask: How to set the compiler to run both executables in paralel without waiting for the one in PostBuild event to terminate? Thanks in advance

    Read the article

  • SSIS - Parallel Execution of Tasks - How efficient is it?

    - by Randy Minder
    I am building an SSIS package that will contain dozens of Sequence tasks. Each Sequence task will contain three tasks. One to truncate a destination table and remove indexes on the table, another to import data from a source table, and a third to add back indexes to the destination table. My question is this. I currently have nine of these Sequences tasks built, and none are dependent on any of the others. When I execute the package, SSIS seems to do a pretty good job of determining which tasks in which Sequence to execute, which, by the way, appears to be quite random. As I continue adding more Sequences, should I attempt to be smarter about how SSIS should execute these Sequences, or is SSIS smart enough to do it itself? Thanks.

    Read the article

< Previous Page | 42 43 44 45 46 47 48 49 50 51 52 53  | Next Page >